Notes on Stochastic Calculus for Finance
General Probability theory
Probability can be thought of as a set function (measure?) that satisfies the following conditions:
- \(\P(A) \in [0, 1]\) for every \(A\) in the sigma algebra \(\F\).
- \(\P(\Omega) = 1\), here \(\Omega\) is the entire space.
- Let \(\{A_i\}\) be a countable collection of pairwise disjoint sets in \(\F\); then \(\P(\cup A_i) = \sum \P(A_i)\).
Compare the above definition with that of a measure \(\mu\) on a measurable space \(\Omega\).
- Positive, i.e., \(\mu(A) \ge 0\).
- Countable additivity.
Thus the probability is indeed a measure!
In a probability space, a random variable is a measurable function. (There are other definitions for this. For example, a real valued function \(X\) on the sample space \(\Omega\) is said to be a random variable if for every Borel set \(B\), the set \(\{X \in B\}\) is in the sigma algebra \(\F\).)
The expectation of the random variable \(X\) is the Lebesgue integral \(\int X d\P\).
The conditional probability \(\P(A \vert B)\) is given by the fraction \(\frac{\P(A \cap B)}{\P(B)}\).
If \(\alpha\) and \(\beta\) are constants, then \(\E(\alpha X + \beta Y) = \alpha \E X + \beta \E Y\).
Information and Conditioning
We say that two sets \(A\) and \(B\) in \(\F\) are independent if \(P(A\cap B) = P(A) P(B)\). Clearly, the independence depends on the probability measure too.
Similarly, let \(\F\) be a sigma algebra and let \(\G\) and \(\mathcal H\) be two sub-sigma-algebras of \(\F\). Then \(\G\) and \(\mathcal H\) are said to be independent if for every \(A \in \G\) and every \(B \in \mathcal H\), \(P(A \cap B) = P(A) P(B)\).
Sigma algebra generated by \(X\): If \(X\) is a random variable, then the sigma algebra generated by \(X\), denoted by \(\sigma(X)\) is the collection of all subsets of the form \(X^{-1}(B)\) where \(B\) ranges over the Borel sigma algebra.
\(\G\)-measurable: Let \(\G\) be a sigma algebra on \(\Omega\). If every set of \(\sigma(X)\) is an element of \(\G\), then \(X\) is said to be \(\G\)-measurable.
Conditional expectation. In a probability space, let \(\G\) be a sub-sigma-algebra of \(\F\) and let \(X\) be a random variable that is either non-negative or integrable. Then the conditional expectation of \(X\) given \(\G\), denoted by \(\E[X \vert \G]\), is any random variable that satisfies:
- (Measurability): \(\E[X\vert \G]\) is \(\G\)-measurable, and
- (Partial Averaging): \[\int_A \E[X\vert \G]\,d\P = \int_A X\,d\P \, \text{ for all } A \in \G.\]
If \(\G\) is the \(\sigma\) algebra generated by a random variable \(W\), we generally write \(\E[X\vert W]\) rather than \(\E[X \vert\sigma(W)]\) or \(\E[X\vert \G]\).
Define \(\tP\) by \(\tP(A) = \int_A \frac{X + 1}{\E[X + 1]} d\P\). Define \(\Q\) and \(\tQ\) by restricting \(\P\) and \(\tP\) on the sigma algebra \(\G\). Let \(Z\) denote the random variable \(\frac{d\tQ}{d\Q}\).
Then \(\E[X\vert \G] = Z \cdot \E[X + 1] - 1\).
When \(X\) and \(Y\) are discrete random variables, the conditional expectation \(\E[X\vert Y]\) is the function given by
\[\E[X\vert Y = y] = \sum_{x} x\, \P(X = x \vert Y = y) = \sum_{x} x \cdot \frac{\P(X = x, Y = y)}{\P(Y = y)},\]
where the sums range over the values taken by \(X\).
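The formula above can be computed mechanically from a joint pmf. A minimal sketch, using a made-up joint distribution on \(\{0,1\}^2\) (the numbers are assumptions, not from the text):

```python
# Conditional expectation E[X | Y = y] of a discrete X, computed directly
# from the formula sum_x x * P(X=x, Y=y) / P(Y=y).
from fractions import Fraction as F

# hypothetical joint pmf P(X = x, Y = y); the four values sum to 1
joint = {(0, 0): F(1, 8), (0, 1): F(1, 4),
         (1, 0): F(1, 4), (1, 1): F(3, 8)}

def cond_exp(joint, y):
    """E[X | Y = y] = sum_x x * P(X=x, Y=y) / P(Y=y)."""
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y

print(cond_exp(joint, 0))  # (1/4) / (3/8) = 2/3
print(cond_exp(joint, 1))  # (3/8) / (5/8) = 3/5
```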
Theorem
Let \((\Omega, \F, \P)\) be a probability space and let \(\G\) be a sub sigma-algebra of \(\F\).
(Linearity of Conditional Expectation.) If \(X\) and \(Y\) are integrable random variables and \(c_1\) and \(c_2\) are constants, then
\[\E[(c_1X + c_2Y)\vert \G] = c_1\E[X\vert \G] + c_2\E[Y\vert \G].\]
(Taking out what is known.) If \(X\), \(Y\), and \(XY\) are integrable random variables and \(X\) is \(\G\)-measurable, then
\[\E[XY\vert \G] = X\E[Y\vert \G].\]
(Iterated Conditioning or Tower property) If \(\mathcal H\) is a sub-sigma-algebra of \(\G\) and \(X\) is an integrable random variable, then
\[ \E[\E[X\vert \G]\vert {\mathcal H}] = \E[ X\vert\mathcal H].\]
(Independence.) If \(X\) is integrable and independent of \(\G\), then
\[\E[X\vert \G] = \E X.\]
(Conditional Jensen’s inequality.) If \(\varphi(x)\) is a convex function and if \(X\) is integrable, then
\[\E[\varphi(X)\vert \G] \ge \varphi(\E[X\vert \G]).\]
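The tower property can be verified by hand on a finite space. A sketch under an assumed setup: two fair coin tosses, \(X\) = number of heads, \(\G\) = sigma algebra of the first toss, and \(\mathcal H\) the trivial sigma algebra, so iterated conditioning reduces to \(\E[\E[X\vert\G]] = \E X\):

```python
# Finite-space check of the tower property: E[E[X|G]] = E[X].
from itertools import product

omega = list(product("HT", repeat=2))   # sample space of 2 tosses
prob = {w: 0.25 for w in omega}         # fair, independent tosses
X = {w: w.count("H") for w in omega}    # number of heads

def cond_exp_first_toss(w):
    """E[X | G] evaluated at w, where G = sigma(first toss):
    average X over the atom of G containing w."""
    atom = [v for v in omega if v[0] == w[0]]
    return sum(X[v] * prob[v] for v in atom) / sum(prob[v] for v in atom)

EX = sum(X[w] * prob[w] for w in omega)                         # E[X] = 1.0
E_tower = sum(cond_exp_first_toss(w) * prob[w] for w in omega)  # E[E[X|G]]
assert abs(EX - E_tower) < 1e-12
```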
Martingales
Definition: Let \(\{X_n\}\) be a stochastic process and \(\{\F_n\}\) a filtration such that \(X_n\) is adapted to \(\F_n\) (this means that \(X_n\) is \(\F_n\)-measurable). Then we say that \(X_n\) is a martingale when
- \(\E\vert X_n\vert < \infty\) for all \(n\), and
- \(\E[X_{n+1}\vert\F_n] = X_n\) for all values of \(n\).
If the equality in the second condition is replaced by \(\le\), then the process is called a supermartingale.
Similarly, if the equality is replaced by \(\ge\), then the process is called a submartingale.
It should be noted that in our definition the indexing set is countable, but this need not be the case in general. The definition extends naturally to arbitrary indexing sets.
Examples: symmetric random walks, a gambler's fortune (assuming that all the games he plays are fair.)
In the case of a binomial model, we can think of two probability measures on the same sample space. One of them is the natural probability measure, i.e., the one that assigns probability \(p\) to an uptick scenario and \(q\) to a down-tick scenario. The second one is more of a constructed probability measure and is called the "risk-neutral probability". This has uptick probability \(\tilde{p}\) and down-tick probability \(\tilde{q}\).
Here \[\tilde{p} = \frac{1 + r - d}{u - d},\] and \[\tilde{q} = \frac{u - 1 - r}{u-d}.\]
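A minimal numerical sketch of these formulas, with the illustrative parameters \(u = 2\), \(d = \frac12\), \(r = \frac14\) (these values are assumptions, not from the text):

```python
# Risk-neutral probabilities for a one-period binomial model.
u, d, r = 2.0, 0.5, 0.25          # hypothetical up factor, down factor, rate
p_t = (1 + r - d) / (u - d)       # uptick risk-neutral probability
q_t = (u - 1 - r) / (u - d)       # down-tick risk-neutral probability
print(p_t, q_t)                   # 0.5 0.5; note p_t + q_t == 1 always
```

Note that \(\tilde{p}\) and \(\tilde{q}\) depend only on \(u\), \(d\), and \(r\), never on the "natural" probabilities \(p\) and \(q\).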
If the value of the asset at time \(n\) is given by \(S_n\), then we can see that \[ \frac{S_n}{(1+r)^n} = \tilde{\mathbb{E}}_n\left[\frac{S_{n+1}}{(1+r)^{n+1}}\right].\]
The last conclusion shows that the discounted price \(\frac{S_n}{(1+r)^n}\) is a martingale under the risk-neutral measure. (Note: What Shreve means by \(\tE_n[X]\) is the conditional expectation \(\E[X\vert \F_n]\), where \(\F_n\) is the sigma algebra corresponding to the first \(n\) coin tosses.)
Theorem Consider a binomial model with \(N\) periods. Let \(\Delta_0, \cdots, \Delta_{N-1}\) be an adapted portfolio process, let \(X_0\) be a real number, and let the wealth process \(X_1, \cdots, X_N\) be generated by the equation:
\[X_{n+1} = \Delta_n S_{n+1} + (1 + r)(X_n - \Delta_n S_n), n = 0, 1, \cdots, N-1.\]
Then the discounted wealth process \(\frac{X_n}{(1+r)^n}\) is a martingale under the risk-neutral measure, i.e.,
\[\frac{X_n}{(1+r)^n} = \tilde{\mathbb{E}}_n\left[\frac{X_{n+1}}{(1+r)^{n+1}}\right].\]
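A one-period check of this theorem, with assumed parameters \(u = 2\), \(d = \frac12\), \(r = \frac14\), \(S_0 = 4\), and an arbitrary initial wealth and position (all numbers are illustrative, not from the text):

```python
# One-period check: the discounted wealth process is a risk-neutral martingale.
u, d, r, S0 = 2.0, 0.5, 0.25, 4.0       # hypothetical model parameters
p_t = (1 + r - d) / (u - d)             # risk-neutral uptick probability
q_t = (u - 1 - r) / (u - d)             # risk-neutral down-tick probability

X0, delta0 = 10.0, 3.0                  # arbitrary initial wealth and position

def X1(S1):
    """Wealth after one period: X_1 = Delta_0 S_1 + (1+r)(X_0 - Delta_0 S_0)."""
    return delta0 * S1 + (1 + r) * (X0 - delta0 * S0)

# risk-neutral expectation of discounted next-period wealth
disc = (p_t * X1(u * S0) + q_t * X1(d * S0)) / (1 + r)
assert abs(disc - X0) < 1e-12           # equals X_0, i.e., a martingale
```

The assertion holds for any choice of \(X_0\) and \(\Delta_0\): the martingale property does not depend on the portfolio chosen.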
Relevant Measure theory
Signed measures are countably additive, real valued set functions; unlike measures, they may take negative values.
We say that a set \(A^{+}\) is positive w.r.t. a signed measure \(v\) if \(v(A^{+} \cap E) \ge 0\) for all \(E \in \chi\).
Similarly, we can define what it means to call a set negative.
Hahn Decomposition theorem. One can decompose the measurable space into disjoint positive and negative sets.
It can be shown that this decomposition is, in some sense, unique. More precisely, if \(\{A_1^{+}, A_1^{-}\}\) and \(\{A_2^{+}, A_2^{-}\}\) are two such decompositions, then for all \(E \in \chi\), \(v(A_1^{+} \cap E) = v(A_2^{+} \cap E)\), and similarly for the negative parts.
For a signed measure \(v\), we can define measures \(v^{+}\) and \(v^{-}\) by:
\[\begin{array}{cl} v^{+}(E) &= v(A^{+} \cap E)\\ v^{-}(E) &= -v(A^{-} \cap E)\\\end{array}\]
Jordan decomposition theorem. For a signed measure \(v\), \(v = v^{+} - v^{-}\), and if \(v = \lambda - \mu\), then \(v^{+} \le \lambda\) and \(v^{-} \le \mu\). (Here \(\lambda\) and \(\mu\) are positive measures.)
The total variation of a signed measure \(v\colon \chi \rightarrow \R\) is the finite measure \(\vert v \vert \colon \chi \rightarrow \R\) defined by \(\vert v \vert = v^{+} + v^{-}\).
If \((X, \chi, \mu)\) is a measure space, \(f \in L(X, \chi, \mu)\) and \(v\colon \chi \rightarrow \R\) is the signed measure defined by
\[v(E) = \int_{E} f d \mu.\]
Then the positive part, the negative part, and the total variation are given by
\[v^{+}(E) = \int_{E} f^{+} d\mu,\]
\[v^{-}(E) = \int_{E} f^{-} d\mu,\]
\[\vert v \vert(E) = \int_{E} \vert f \vert d \mu.\]
Let \(\lambda\) and \(\mu\) be measures on \(\chi\). We say that \(\lambda\) is absolutely continuous with respect to \(\mu\) if for every set \(E\) such that \(\mu(E) = 0\), the value of \(\lambda(E)\) is also zero. We use the notation \(\lambda \ll \mu\) to denote this.
Theorem. Let \(\lambda\) be a finite measure and \(\mu\) a measure on \(\chi\); then the following two statements are equivalent.
For every \(\varepsilon > 0\), there exists a \(\delta > 0\) such that, for \(E \in \chi\), \(\mu(E) < \delta\) implies \(\lambda(E) < \varepsilon\).
\(\lambda\) is absolutely continuous with respect to \(\mu\) (\(\lambda \ll \mu\).)
Radon Nikodym Theorem.
Let \((X, \chi)\) be a measurable space and let \(\lambda\) and \(\mu\) be \(\sigma\)-finite measures on \(\chi\). If \(\lambda\) is absolutely continuous with respect to \(\mu\), then there exists a (unique \(\mu\)-a.e.) nonnegative measurable function \(f\) such that
\[\lambda(E) = \int_{E} f d\mu.\]
The function \(f\) is called the Radon-Nikodym derivative of \(\lambda\) with respect to \(\mu\) and is denoted by \(f = \frac{d \lambda}{d\mu}\).
The Radon Nikodym theorem is vital in proving the existence of the conditional expectation. (For a proof, refer to Appendix B: Existence of Conditional Expectation in Shreve Vol II.)
State Prices
Consider a finite sample space \(\Omega\) on which we have two probability measures \(\P\) and \(\tP\). If \(\P\) and \(\tP\) both give positive probability to every element of \(\Omega\), observe that as per our definition \(\P\) and \(\tP\) are equivalent.
In this case, one can see that the Radon-Nikodym derivative \(\frac{d\tP}{d\P}\) is the random variable \(Z\) defined by \(Z(\omega) = \frac{\tP(\omega)}{\P(\omega)}\).
Theorem. Let \(\P\) and \(\tP\) be probability measures on a finite sample space \(\Omega\), assume that \(\P(\omega) > 0\) and \(\tP(\omega) > 0\) for every \(\omega \in \Omega\), and define the random variable \(Z\) to be the Radon-Nikodym derivative defined above. Then we have the following:
\(\P(Z > 0) = 1\).
\(\E Z = 1\)
for any random variable \(Y\),
\[\tE Y = \E[ZY].\]
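On a finite sample space all three conclusions can be checked directly. A sketch with a made-up three-point space and made-up measures (all numbers are assumptions):

```python
# Finite-sample-space check: Z = dtP/dP, E[Z] = 1, and tE[Y] = E[Z*Y].
P  = {"a": 0.5, "b": 0.3, "c": 0.2}   # both measures positive everywhere
tP = {"a": 0.2, "b": 0.5, "c": 0.3}
Y  = {"a": 1.0, "b": -2.0, "c": 4.0}  # an arbitrary random variable

Z = {w: tP[w] / P[w] for w in P}      # Radon-Nikodym derivative dtP/dP
EZ = sum(Z[w] * P[w] for w in P)
assert abs(EZ - 1.0) < 1e-12          # E[Z] = 1

tEY = sum(Y[w] * tP[w] for w in tP)   # expectation of Y under tP
EZY = sum(Z[w] * Y[w] * P[w] for w in P)
assert abs(tEY - EZY) < 1e-12         # tE[Y] = E[Z*Y]
```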
Change of Measure
In stochastic calculus, we say that two measures \(\lambda\) and \(\mu\) are equivalent if
\[ \lambda(E) = 0 \iff \mu(E) = 0.\]
Notice that this is absolutely continuous in “both directions”.
Theorem. Let \((\Omega, \F, \P)\) be a probability space and let \(Z\) be an almost surely nonnegative random variable with \(\E Z = 1\). For \(A \in \F\), define
\[\tP(A) = \int_A Z d\P.\]
Then \(\tP\) is a probability measure. Furthermore, if \(X\) is a nonnegative random variable, then
\[\tE(X) = \E[XZ].\]
If \(Z\) is almost surely strictly positive, we also have
\[\E Y = \tE \left[ \frac{Y}{Z}\right].\]
It should be noted that for two equivalent probability measures \(\P\) and \(\tP\), the Radon-Nikodym theorem guarantees the existence of such a random variable \(Z\).
Independence
Let \((\Omega, \F, \P)\) be a probability space, we say that two sets \(A\) and \(B\) are independent if
\[\P(A \cap B) = \P(A) \P(B).\]
We say that two sigma algebras \(\F\) and \(\G\) are independent if, for every \(A \in \F\) and \(B \in \G\), the sets \(A\) and \(B\) are independent.
We say that two random variables \(X\) and \(Y\) are independent if \(\sigma(X)\) and \(\sigma(Y)\) are independent.
We use the notation \(\E[Y\vert X]\), where \(X\) and \(Y\) are random variables, to denote \(\E[Y\vert \sigma(X)]\). Also, we use the notation \(f(X)\), where \(f\) is usually a Borel measurable function, to denote the function \(f \circ X\colon \Omega \rightarrow \R\).
Theorem: Let \(X\) and \(Y\) be independent random variables, and let \(f\) and \(g\) be Borel measurable functions on \(\R\). Then \(f(X)\) and \(g(Y)\) are independent random variables.
Proof: Notice that \(f(X)\) and \(g(Y)\) are measurable. The theorem follows from the fact that the sigma algebras generated by \(f(X)\) and \(g(Y)\) are sub-sigma-algebras of \(\sigma(X)\) and \(\sigma(Y)\), respectively.
We can define a Borel sigma algebra on \(\R^2\) by taking the sigma algebra generated by closed rectangles in \(\R^2\).
Let \(X\) and \(Y\) be random variables. The pair of random variables \((X, Y)\) takes values in the plane \(\R^2\), and the joint distribution measure of \((X, Y)\) is given by
\[\mu_{X, Y}(C) = \P\{(X, Y) \in C\} \text{ for all Borel sets } C \subset \R^2.\]
One can see that this (\(\mu_{X, Y}\)) is a probability measure.
The joint cumulative distribution function of \((X, Y)\) is:
\[F_{X, Y}(a, b) = \mu_{X, Y}((-\infty, a] \times (-\infty, b]) = \P\{X \le a, Y\le b\}, \quad a \in \R, b\in \R.\]
We say that a nonnegative, Borel-measurable function \(f_{X, Y}(x, y)\) is a joint density for a pair of random variables if
\[\mu_{X, Y}(C) = \iint \chi_C(x, y) f_{X, Y}(x, y)\, dy\,dx.\]
The marginal distribution function \(\mu_X\) can be defined by
\[\mu_X(A) = \mu_{X, Y}(A \times \R) = \P\{X \in A \}.\]
and \(\mu_Y\) can be similarly defined. Similarly, we can think of marginal densities.
Some observations:
- Essentially, we are defining a probability measure on \(\R^2\) with the help of the two random variables \(X\) and \(Y\) defined on \(\Omega\).
Theorem. Let \(X\) and \(Y\) be random variables. The following conditions are equivalent.
\(X\) and \(Y\) are independent.
The joint distribution measure factors (for all Borel sets \(A\) and \(B\))
\[\mu_{X, Y}(A \times B) = \mu_X(A) \cdot \mu_Y(B).\]
The joint cumulative distribution function factors:
\[F_{X, Y}(a, b) = F_X(a) \cdot F_Y(b).\]
The joint moment-generating function factors:
\[\E e^{uX + vY} = \E e^{uX} \cdot \E e^{vY}\]
for all \(u, v \in \R\) for which the expectations are finite.
If there is a joint density, each of the conditions above is equivalent to the following (in the general case, the above conditions imply the following):
The joint density factors:
\[f_{X, Y}(x, y) = f_X(x) \cdot f_Y(y).\]
for almost every \(x\in \R\) and \(y\in \R\).
The expectation factors
\[\E[XY] = \E X \cdot\E Y,\]
provided that \(\E[XY] < \infty\).
The variance of a random variable \(X\) whose expected value is defined, denoted by \(\Var(X)\), is
\[\Var(X) = \E[(X - \E X)^2] = \E[X^2] - \E[X]^2.\]
The standard deviation is defined as \(\sqrt{\Var(X)}\).
The covariance of \(X\) and \(Y\) is:
\[\Cov(X, Y) = \E[(X - \E X)(Y - \E Y)] = \E[XY] - \E X \cdot \E Y.\]
In particular, \(\Cov(X, Y) = 0 \iff \E[XY] = \E X \cdot \E Y\). The correlation coefficient of \(X\) and \(Y\) is
\[\rho(X, Y) = \frac{\Cov(X, Y)}{\sqrt{\Var(X) \Var(Y)}}.\]
Example of an uncorrelated dependent random variable.
Let \(X\) be a standard normal random variable and choose a random variable \(Z\) that is independent of \(X\) and satisfies \(\P\{Z = 1\} = \frac12\) and \(\P\{Z = -1\} = \frac12\).
Consider the random variable \(Y = XZ\). It can be shown that \(Y\) is standard normal and \(\Cov(X, Y) = 0\), but \(X\) and \(Y\) are not independent. (One can also verify that the pair \((X, Y)\), as expected, does not have a joint density function.)
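The example can be illustrated by simulation; a Monte Carlo sketch (sample size and seed are arbitrary choices):

```python
# Y = X*Z is standard normal and uncorrelated with X, yet dependent on X
# (indeed |X| == |Y| always, which rules out independence).
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.standard_normal(n)
Z = rng.choice([-1.0, 1.0], size=n)    # independent fair sign flip
Y = X * Z

print(Y.mean(), Y.var())               # both close to the N(0,1) values 0 and 1
print(np.cov(X, Y)[0, 1])              # close to 0: uncorrelated
print(np.all(np.abs(X) == np.abs(Y)))  # True: |X| = |Y|, so not independent
```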
(Independence lemma) Let \((\Omega, \F, \P)\) be a probability space, and let \(\G\) be a sub-sigma-algebra of \(\F\). Suppose that the random variables \(X_1, \cdots, X_K\) are \(\G\)-measurable and the random variables \(Y_1, \cdots, Y_L\) are independent of \(\G\). Let \(f(x_1, \cdots, x_K, y_1, \cdots, y_L)\) be a function of the dummy variables \(x_1, \cdots, x_K\) and \(y_1, \cdots, y_L\), and define
\[g(x_1, \cdots, x_K) = \E f(x_1, \cdots, x_K, Y_1, \cdots, Y_L).\]
Then
\[\E[f(X_1, \cdots, X_K, Y_1, \cdots, Y_L)\vert \G] = g(X_1, \cdots, X_K).\]
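The independence lemma can be checked on a finite space. A sketch with an assumed setup: two independent fair \(\{0,1\}\)-coins, \(\G = \sigma(\text{first coin})\), \(X\) the first coin (so \(\G\)-measurable), \(Y\) the second (independent of \(\G\)), and the made-up function \(f(x, y) = xy + y\):

```python
# Check E[f(X, Y) | G] = g(X) atom by atom, where g(x) = E[f(x, Y)].
from itertools import product

omega = list(product([0, 1], repeat=2))  # two independent fair coins
prob = {w: 0.25 for w in omega}
f = lambda x, y: x * y + y               # hypothetical f

def g(x):
    """g(x) = E[f(x, Y)], freezing the G-measurable argument."""
    return sum(f(x, y) * 0.5 for y in [0, 1])

for a in [0, 1]:                         # atoms {X = a} of G
    atom = [w for w in omega if w[0] == a]
    cond = sum(f(w[0], w[1]) * prob[w] for w in atom) / sum(prob[w] for w in atom)
    assert abs(cond - g(a)) < 1e-12      # E[f(X,Y)|G] agrees with g(X) on the atom
```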
Brownian Motion
Symmetric Random walks.
Let \(\omega\) be an infinite sequence of tosses, where \(\omega_j\) is the outcome of the \(j\)th toss. Let
\[X_j = \left\{\begin{array}{rl} 1 & \textup{if } \omega_j = H, \\ -1 & \textup{if } \omega_j = T, \end{array} \right. \]
and define \(M_0 = 0\),
\[M_k = \sum_{j=1}^{k} X_j,\ k = 1, 2, \cdots\]
(Note that in symmetric random walks, the probability of getting a head and that of getting a tail are equal; in general random walks, this need not be true.)
It can be observed that, for \(0 = k_0 < k_1 < \cdots < k_m\), the random variables \((M_{k_1} - M_{k_0}), (M_{k_2} - M_{k_1}), \cdots, (M_{k_m} - M_{k_{m-1}})\) are independent.
Moreover, \(\Var(M_{k_{i+1}} - M_{k_i}) = k_{i+1} - k_i\).
The symmetric random walk is a martingale, i.e., \(\E[M_l\vert \F_k] = M_k\) for \(k < l\).
The quadratic variation of the symmetric random walk is \([M, M]_k = \sum_{j=1}^k (M_j - M_{j-1})^2 = k\). Note that this property also holds for a general random walk (where \(p\) and \(q\) are not necessarily equal.)
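The quadratic variation identity is exact, not just in expectation, since every step is \(\pm 1\). A small simulation sketch (path length and seed are arbitrary):

```python
# Quadratic variation of a random walk: sum of squared increments equals k.
import numpy as np

rng = np.random.default_rng(1)
k = 1000
steps = rng.choice([-1, 1], size=k)          # X_1, ..., X_k
M = np.concatenate([[0], steps.cumsum()])    # M_0 = 0, then partial sums

quad_var = np.sum(np.diff(M) ** 2)           # [M, M]_k
assert quad_var == k                         # each squared increment is 1
```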
Brownian motion: Let \((\Omega, \F, \P)\) be a probability space. For each \(\omega \in \Omega\), suppose that there is a continuous function \(W(t)\) of \(t \ge 0\) that satisfies \(W(0) = 0\) and that depends on \(\omega\). Then \(W(t)\), \(t \ge 0\), is a Brownian motion if for all \(0 = t_0 < t_1 < \cdots < t_m\), the increments
\[W(t_1), W(t_2) - W(t_1), W(t_3) - W(t_2), \cdots, W(t_m) - W(t_{m-1})\]
are independent and each of these increments is normally distributed with
\[\E[W(t_{i+1}) - W(t_i)] = 0,\] \[\Var[W(t_{i+1}) - W(t_{i})] = t_{i+1} - t_i.\]
Interestingly, it turns out that \(W(t)\) is almost surely nowhere differentiable (this is closely related to the self-similar scaling of Brownian motion.)
Stochastic Calculus
Ito integral. At first sight this looks like the Riemann-Stieltjes integral, but it is not the same: Brownian paths have unbounded variation, so the Riemann-Stieltjes integral with respect to \(W\) does not exist in general, and the defining limit must instead be taken in \(L^2\), with the integrand sampled at left endpoints. Let \(W(t)\) be a Brownian motion, \(\F(t)\) a filtration for it, and \(\Delta(t)\) an adapted process. The integral
\[\int \Delta(t)\, dW(t)\]
is called the Ito integral. (Math.SE link that offers something more.)
Theorem. The Ito integral is a martingale.
Theorem. (Ito Isometry). The Ito integral satisfies
\[\E I^2(t) = \E \int_{0}^{t} \Delta^2 (u)\, du.\]
Theorem. The quadratic variation accumulated up to time \(t\) by the Ito integral is
\[[I, I](t) = \int_{0}^{t} \Delta^2(u)\, du.\]
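The Ito isometry can be checked by Monte Carlo. A sketch with the assumed integrand \(\Delta(u) = W(u)\), for which \(\E I^2(t) = \int_0^t \E[W^2(u)]\,du = t^2/2\) (grid size, path count, and seed are arbitrary):

```python
# Monte Carlo check of the Ito isometry for Delta(u) = W(u) on [0, 1]:
# E[I(1)^2] should be close to 1^2 / 2 = 0.5.
import numpy as np

rng = np.random.default_rng(2)
t, n, paths = 1.0, 1000, 20_000
dt = t / n
dW = rng.standard_normal((paths, n)) * np.sqrt(dt)  # Brownian increments
W = np.cumsum(dW, axis=1) - dW                      # left endpoints W(u_j)

I = np.sum(W * dW, axis=1)         # Ito sums approximating int_0^1 W dW
lhs = np.mean(I ** 2)              # estimate of E[I(1)^2]
print(lhs, t ** 2 / 2)             # both close to 0.5
```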
To sum things up, \(I(t)\) has the following properties.
- Paths of \(I(t)\) are continuous (yes, \(I\) is a process.)
- \(I(t)\) is \(\F(t)\)-measurable.
- Linearity is obeyed.
- \(I(t)\) is a martingale.
- Ito isometry.
- Quadratic variation.
Remark.
\[\int_0^T W(t)\, dW(t) = \frac12 \cdot W^2(T) - \frac12\cdot T.\]
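The remark can be verified pathwise: the Ito sums converge to \(\frac12 W^2(T) - \frac12 T\) as the partition is refined. A simulation sketch (grid size and seed are arbitrary):

```python
# Pathwise check: left-endpoint Ito sums of int_0^T W dW versus the
# closed form W(T)^2/2 - T/2, on a fine partition of [0, T].
import numpy as np

rng = np.random.default_rng(3)
T, n = 1.0, 100_000
dt = T / n
dW = rng.standard_normal(n) * np.sqrt(dt)   # Brownian increments
W = np.cumsum(dW)                           # W at right endpoints; W[-1] = W(T)

ito_sum = np.sum((W - dW) * dW)             # integrand sampled at left endpoints
closed_form = 0.5 * W[-1] ** 2 - 0.5 * T
print(ito_sum, closed_form)                 # agree up to discretization error
```

The extra \(-\frac12 T\) (absent in ordinary calculus) is exactly the quadratic variation term \(\frac12 \sum (\Delta W)^2 \to \frac12 T\).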
A shorthand representation of things (these identities do not possess rigorous meaning.)
\(dt\,dt = 0\).
\(dW(t)\, dt = 0\).
\(dW(t)\, dW(t) = dt\).
Ito-Doeblin Formula
Here \(f(x)\) is a twice continuously differentiable function and \(W(t)\) is a Brownian motion.
Differential form
\[df(W(t)) = f'(W(t))\,dW(t) + \frac12 f''(W(t))\, dt.\]
Integral form
\[f(W(t)) - f(W(0)) = \int_0^t f'(W(u))\, dW(u) + \frac12 \int_0^t f''(W(u))\, du.\]
Theorem (Ito-Doeblin formula for Brownian motion). Let \(f(t, x)\) be a function for which the partial derivatives \(f_t(t, x), f_x(t, x)\), and \(f_{xx}(t, x)\) are defined and continuous, and let \(W(t)\) be a Brownian motion. Then for every \(T \ge 0\),
\[f(T, W(T)) = f(0, W(0)) + \int_0^T f_t(t, W(t))\, dt + \int_0^T f_x(t, W(t))\,dW(t) + \frac12 \int_0^T f_{xx}(t, W(t)) dt.\]
Ito Process. Let \(W(t)\) be a Brownian motion and let \(\F(t)\) be an associated filtration. An Ito process is a stochastic process of the form
\[X(t) = X(0) + \int_0^t \Delta(u)\, dW(u) + \int_0^t \Theta(u)\, du,\]
where \(X(0)\) is nonrandom and \(\Delta(u)\) and \(\Theta(u)\) are adapted stochastic processes.
In differential form, this can be thought of as
\[dX(t) = \Delta(t)\,dW(t) + \Theta(t)\, dt.\]
Lemma. The quadratic variation of the Ito process is
\[[X, X](t) = \int_0^t \Delta^2(u)\, du.\]
Let \(X(t)\) be an Ito process and let \(\Gamma(t)\) be an adapted process. We define the integral with respect to an Ito process as
\[\int_0^t \Gamma(u)\, dX(u) = \int_0^t \Gamma(u)\Delta(u)\, dW(u) + \int_0^t \Gamma(u)\Theta(u)\, du.\]
Theorem (Ito-Doeblin formula for an Ito process). Let \(X(t)\) be an Ito process, \(f(t,x)\) be a function for which the partial derivatives \(f_t(t,x)\), \(f_x(t, x)\), and \(f_{xx}(t, x)\) are defined and continuous. Then for every \(T \ge 0\),
\begin{align*} f(T, X(T)) &= f(0, X(0)) + \int_0^T f_t(t, X(t))\, dt + \int_0^T f_x(t, X(t))\, dX(t)+ \frac12 \int_0^T f_{xx}(t, X(t))\, d[X, X](t)\\ &= f(0, X(0)) + \int_0^T f_t(t, X(t))\, dt + \int_0^T f_x(t, X(t))\Delta(t)\, dW(t)\\&\quad + \int_0^Tf_x(t, X(t)) \Theta(t)\, dt + \frac12 \int_0^T f_{xx}(t, X(t))\Delta^2(t)\,dt \end{align*}
Theorem (Ito integral of a deterministic integrand). Let \(\Delta(s)\) be a nonrandom function of time and let \(I(t) = \int_0^{t} \Delta(s)\, dW(s)\). For each \(t \ge 0\), the random variable \(I(t)\) is normally distributed with expected value zero and variance \(\int_0^t\Delta^2(s)\, ds\).
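The normality claim for deterministic integrands can be illustrated by simulation. A sketch with the assumed integrand \(\Delta(s) = s\), for which the variance should be \(\int_0^1 s^2\, ds = \frac13\) (grid size, path count, and seed are arbitrary):

```python
# Monte Carlo sketch: I(1) = int_0^1 s dW(s) should be ~ N(0, 1/3).
import numpy as np

rng = np.random.default_rng(4)
t, n, paths = 1.0, 1000, 50_000
s = np.linspace(0, t, n, endpoint=False)        # left endpoints of the grid
dW = rng.standard_normal((paths, n)) * np.sqrt(t / n)  # Brownian increments

I = (s * dW).sum(axis=1)                        # one value of I(1) per path
print(I.mean(), I.var())                        # close to 0 and 1/3
```

Since each \(I(1)\) here is a finite sum of independent normals, the discretized version is exactly Gaussian; only the variance carries discretization error.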