# Maths for Data Science MATH1057

Let $A$ be the set of people, and $P, F, M, S, B$ the relations over $A$ of ‘parent of’, ‘father of’, ‘mother of’, ‘sister of’ and ‘brother of’ respectively. Describe exactly the following relative products: (a) $P \circ P$, (b) $M \circ F$, (c) $S \circ P$, (d) $B \circ B$. Warnings: (1) Be careful about order. (2) In some cases there will be a handy word in English for just the relation, but in others it will have to be described in a more roundabout (but still precise) way.

$P \circ P=$ ‘grandparent of’. Reason: $a$ is a grandparent of $c$ iff there is an $x$ such that $a$ is a parent of $x$ and $x$ is a parent of $c$.
$M \circ F=$ ‘maternal grandfather of’. Reason: $a$ is a maternal grandfather of $c$ iff there is an $x$ such that $a$ is a father of $x$ and $x$ is a mother of $c$. Comments: Two common errors here. (1) Getting the order wrong. (2) Rushing to the answer ‘grandfather of’, since a father of
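The order convention used in these answers can be checked on a tiny example. The following is only a sketch: the function name `relative_product` and the three-person family are hypothetical, and relations are modelled as Python sets of ordered pairs, with $(a, x) \in F$ read as "$a$ is a father of $x$".

```python
def relative_product(S, R):
    """Return S ∘ R = {(a, c) : there is an x with (a, x) in R and (x, c) in S}."""
    return {(a, c) for (a, x) in R for (y, c) in S if x == y}

# Hypothetical family: Tom is the father of Ann, and Ann is the mother of Bob.
F = {("Tom", "Ann")}  # 'father of'
M = {("Ann", "Bob")}  # 'mother of'

print(relative_product(M, F))  # M ∘ F: {('Tom', 'Bob')}
print(relative_product(F, M))  # F ∘ M: set()
```

Swapping the arguments gives the empty relation on this data, which illustrates why getting the order wrong changes the answer.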

## MATH1057 COURSE NOTES:

Let $f: A \rightarrow B$, i.e. let $f$ be a function from $A$ to $B$, and let $X \subseteq A$. The image under $f$ of $X$ is the set $\{b \in B: \exists a \in X, b=f(a)\}$, which can also be written more briefly as $\{f(a): a \in X\}$. To take limiting cases as examples, the image $f(A)$ of $A$ itself is $\operatorname{range}(f)$, and the image $f(\varnothing)$ of $\varnothing$ is $\varnothing$, while the image of a singleton subset $\{a\} \subseteq A$ is the singleton $\{f(a)\}$. Thus image is not quite the same thing as value: the value of $a \in A$ under $f$ is $f(a)$, while the image of $\{a\} \subseteq A$ under $f$ is $\{f(a)\}$. However, many texts also use the term ‘image’ rather loosely as a synonym of ‘value’. Always check what your author means.
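The distinction between image and value can be made concrete in a few lines. This is a minimal sketch: the helper `image`, the function `square` and the set `A` are hypothetical choices for illustration only.

```python
def image(f, X):
    """Image of the set X under f, i.e. {f(a) : a in X}."""
    return {f(a) for a in X}

def square(a):
    return a * a

A = {-2, -1, 0, 1, 2}

print(sorted(image(square, A)))  # [0, 1, 4], the range of square on A
print(image(square, set()))      # set(), the image of the empty set
print(image(square, {-2}))       # {4}, the image of a singleton is a set
print(square(-2))                # 4, the value is an element
```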

# Statistics MATH1054

For instance, suppose the goal is, as before, to estimate a regression
$$Y=x \beta+e$$
where the joint dependence between errors might be described by a multivariate Normal prior
$$e \sim N_{n}(0, \Sigma)$$

such that the dispersion matrix $\Sigma$ reflects the spatial interdependencies within the data. Outcomes may also be discrete, and then one might have, for binomial data, say,
$$\begin{gathered} Y_{i} \sim \operatorname{Bin}\left(\pi_{i}, N_{i}\right) \quad i=1, \ldots, n \\ \operatorname{logit}\left(\pi_{i}\right)=x_{i} \beta+e_{i} \end{gathered}$$
where again, the errors may be spatially dependent. Let the $n \times n$ covariance matrix for $e$ be
$$\Sigma=\sigma^{2} R(d)$$

## MATH1054 COURSE NOTES:

Among the most commonly used functions meeting these requirements are the exponential model
$$r_{i j}=\exp \left(-3 d_{i j} / h\right)$$
where $h$ is the range, or inter-point distance at which spatial correlation ceases to be important. The Gaussian correlation function has
$$r_{i j}=\exp \left(-3 d_{i j}^{2} / h^{2}\right)$$
and the spherical (Mardia and Marshall, 1984) has $r_{i j}=0$ for $d_{i j}>h$ and
$$r_{i j}=1-\frac{3 d_{i j}}{2 h}+\frac{d_{i j}^{3}}{2 h^{3}}$$
for $d_{i j}<h$; see Example 7.5 for an illustration. In each of these functions, $h$ is analogous to the bandwidth of kernel smoothing models. If $\Sigma=\sigma^{2} R(d)$, then the covariance tends
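The three correlation functions, and the covariance matrix $\Sigma=\sigma^{2} R(d)$ built from them, can be sketched as follows. The function names, the point coordinates and the values of $\sigma^{2}$ and $h$ are all hypothetical, and Euclidean distance is assumed for $d_{i j}$.

```python
import math

def exponential_corr(d, h):
    """r = exp(-3 d / h)."""
    return math.exp(-3.0 * d / h)

def gaussian_corr(d, h):
    """r = exp(-3 d^2 / h^2)."""
    return math.exp(-3.0 * d**2 / h**2)

def spherical_corr(d, h):
    """r = 1 - 3d/(2h) + d^3/(2h^3) inside the range h, and 0 beyond it."""
    if d >= h:
        return 0.0
    return 1.0 - 3.0 * d / (2.0 * h) + d**3 / (2.0 * h**3)

def covariance_matrix(coords, sigma2, h, corr):
    """Sigma = sigma2 * R(d), with d_ij the Euclidean distance between points i and j."""
    return [[sigma2 * corr(math.dist(p, q), h) for q in coords] for p in coords]

# Hypothetical site coordinates and parameters, chosen only for illustration.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
Sigma = covariance_matrix(coords, sigma2=2.0, h=3.0, corr=exponential_corr)

for row in Sigma:
    print([round(v, 4) for v in row])
```

With the factor of 3 in the exponent, the exponential and Gaussian correlations have fallen to $e^{-3}$ (about 0.05) at distance $d = h$, which is one way to read "ceases to be important".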

# Linear Mathematics MATH1007

$$\hat{L} \hat{U}=\hat{P}(A+\delta A) \quad \text { with }|\delta A| \leq \frac{2 n \epsilon}{1-n \epsilon}|\hat{L}||\hat{U}|$$
and for the particular case that $m=n$ and $A$ is nonsingular, if an approximate solution, $\hat{\mathbf{x}}$, to $A \mathbf{x}=\mathbf{b}$ is computed by solving the two triangular linear systems, $\hat{L} \mathbf{y}=\hat{P} \mathbf{b}$ and $\hat{U} \hat{\mathbf{x}}=\mathbf{y}$, then $\hat{\mathbf{x}}$ is the exact solution to a perturbed linear system:

$$(A+\delta A) \hat{\mathbf{x}}=\mathbf{b} \quad \text { with } \quad|\delta A| \leq \frac{2 n \epsilon}{1-n \epsilon} \hat{P}^{T}|\hat{L}||\hat{U}|$$
Furthermore, $\left|L_{i, j}\right| \leq 1$ and $\left|U_{i, j}\right| \leq 2^{i-1} \max_{k \leq i}\left|A_{k, j}\right|$, so
$$\|\delta A\|_{\infty} \leq \frac{2^{n} n^{2} \epsilon}{1-n \epsilon}\|A\|_{\infty}$$
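The two triangular solves and the bound $\left|L_{i, j}\right| \leq 1$ can be illustrated with a small pure-Python sketch of Gaussian elimination with partial pivoting. The function names and the $3 \times 3$ test system are hypothetical, and the code assumes $A$ is square and nonsingular. The quantity $\|\mathbf{b}-A \hat{\mathbf{x}}\|_{\infty} /\left(\|A\|_{\infty}\|\hat{\mathbf{x}}\|_{\infty}\right)$ is used here as a normwise measure of how small a perturbation $\delta A$ can be; it is not the componentwise bound stated above.

```python
def lu_partial_pivoting(A):
    """Return (perm, L, U): L unit lower triangular, U upper triangular,
    with row perm[i] of A equal to row i of L*U up to rounding."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        U[k], U[p] = U[p], U[k]
        L[k], L[p] = L[p], L[k]
        perm[k], perm[p] = perm[p], perm[k]
        L[k][k] = 1.0
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]  # multiplier, at most 1 in magnitude
            for j in range(k + 1, n):
                U[i][j] -= L[i][k] * U[k][j]
            U[i][k] = 0.0
    return perm, L, U

def solve_with_lu(A, b):
    """Solve A x = b via L y = P b (forward) and U x = y (backward)."""
    n = len(A)
    perm, L, U = lu_partial_pivoting(A)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x, L, U

def relative_residual(A, x, b):
    """||b - A x||_inf / (||A||_inf ||x||_inf)."""
    n = len(A)
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    norm_A = max(sum(abs(v) for v in row) for row in A)
    return max(abs(v) for v in r) / (norm_A * max(abs(v) for v in x))

# Hypothetical test system whose exact solution is (1, 2, 3).
A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
b = [7.0, 19.0, 49.0]
x_hat, L, U = solve_with_lu(A, b)
print(x_hat)
print(relative_residual(A, x_hat, b))
```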

## MATH1007 COURSE NOTES:

$$\hat{G} \hat{G}^{T}=A+\delta A \quad \text { with } \quad|\delta A| \leq \frac{(n+1) \epsilon}{1-(n+1) \epsilon}|\hat{G}|\left|\hat{G}^{T}\right| .$$
Furthermore, if an approximate solution, $\hat{\mathbf{x}}$, to $A \mathbf{x}=\mathbf{b}$ is computed by solving the two triangular linear systems $\hat{G} \mathbf{y}=\mathbf{b}$ and $\hat{G}^{T} \hat{\mathbf{x}}=\mathbf{y}$, and a scaling matrix is defined as $\Delta=\operatorname{diag}\left(\sqrt{a_{i i}}\right)$, then the scaled error $\Delta(\mathbf{x}-\hat{\mathbf{x}})$ satisfies
$$\frac{\|\Delta(\mathbf{x}-\hat{\mathbf{x}})\|_{2}}{\|\Delta \mathbf{x}\|_{2}} \leq \frac{\kappa_{2}(H) \epsilon}{1-\kappa_{2}(H) \epsilon}$$

# Calculus MATH1006

Given a series $\sum_{n=1}^{\infty} a_{n}=a_{1}+a_{2}+a_{3}+\cdots$, let $s_{n}$ denote its $n$th partial sum:
$$s_{n}=\sum_{i=1}^{n} a_{i}=a_{1}+a_{2}+\cdots+a_{n}$$

If the sequence $\left\{s_{n}\right\}$ is convergent and $\lim_{n \rightarrow \infty} s_{n}=s$ exists as a real number, then the series $\sum a_{n}$ is called convergent and we write
$$a_{1}+a_{2}+\cdots+a_{n}+\cdots=s \quad \text { or } \quad \sum_{n=1}^{\infty} a_{n}=s$$
The number $s$ is called the sum of the series. Otherwise, the series is called divergent.

## MATH1006 COURSE NOTES:

If $r=1$, then $s_{n}=a+a+\cdots+a=n a \rightarrow \pm \infty$. Since $\lim_{n \rightarrow \infty} s_{n}$ doesn’t exist, the geometric series diverges in this case.
If $r \neq 1$, we have
$$\begin{aligned} &s_{n}=a+a r+a r^{2}+\cdots+a r^{n-1} \\ &r s_{n}=\quad a r+a r^{2}+\cdots+a r^{n-1}+a r^{n} \end{aligned}$$
Subtracting these equations, we get
$$\begin{array}{r} s_{n}-r s_{n}=a-a r^{n} \\ s_{n}=\frac{a\left(1-r^{n}\right)}{1-r} \end{array}$$
If $-1<r<1$, we know from (11.1.9) that $r^{n} \rightarrow 0$ as $n \rightarrow \infty$, so
$$\lim_{n \rightarrow \infty} s_{n}=\lim_{n \rightarrow \infty} \frac{a\left(1-r^{n}\right)}{1-r}=\frac{a}{1-r}-\frac{a}{1-r} \lim_{n \rightarrow \infty} r^{n}=\frac{a}{1-r}$$
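The closed form and the limit can be checked numerically. The sketch below compares the term-by-term partial sum with $a\left(1-r^{n}\right) /(1-r)$; the function names and the values $a=3$, $r=0.5$ are hypothetical choices for illustration.

```python
def geometric_partial_sum(a, r, n):
    """s_n = a + a r + ... + a r^(n-1), summed term by term."""
    return sum(a * r**i for i in range(n))

def closed_form(a, r, n):
    """s_n = a (1 - r^n) / (1 - r), valid only for r != 1."""
    return a * (1 - r**n) / (1 - r)

a, r = 3.0, 0.5
for n in (1, 5, 10, 50):
    print(n, geometric_partial_sum(a, r, n), closed_form(a, r, n))

print("limit a / (1 - r) =", a / (1 - r))  # 6.0
```

For $r=1$ the closed form divides by zero, which matches the separate treatment of that case: the partial sums are simply $n a$.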

# Analytical & Comput Foundation MATH1005

As this is not very precise because of problems in evaluating $E_{s}$, one may also measure the compression wave (P wave), which, being the fastest, gets to the pickup unit first. If this value is obtained, the ratio of the two velocities is
$$M=\frac{v_{c}}{v_{s}} \approx \frac{v_{c}}{v_{r}}$$

and
$$\mu=\frac{M^{2}-2}{2\left(M^{2}-1\right)}$$
We can now obtain the soil modulus of elasticity using Eqs. (2-19) and (2-21) as
$$E_{s}=2(1+\mu) G$$
Table lists typical values of Poisson’s ratio $\mu$ as compiled from several sources.
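The chain from measured wave velocities to $E_{s}$ can be sketched numerically. The function names and the input values below (velocities in m/s, $G$ in MPa) are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the table.

```python
def poisson_ratio(v_c, v_s):
    """mu = (M^2 - 2) / (2 (M^2 - 1)), with M = v_c / v_s."""
    M = v_c / v_s
    return (M**2 - 2.0) / (2.0 * (M**2 - 1.0))

def soil_modulus(mu, G):
    """E_s = 2 (1 + mu) G."""
    return 2.0 * (1.0 + mu) * G

# Hypothetical measurements: compression wave twice as fast as the shear wave.
v_c, v_s = 400.0, 200.0  # m/s
G = 50.0                 # MPa

mu = poisson_ratio(v_c, v_s)
E_s = soil_modulus(mu, G)
print(round(mu, 4))   # 0.3333
print(round(E_s, 2))  # 133.33
```

With $M=2$ the formula gives $\mu=2 / 6=1 / 3$, and $M=\sqrt{2}$ gives $\mu=0$, so velocity ratios below $\sqrt{2}$ would imply a negative Poisson's ratio.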

$$\begin{aligned} &s_{c}=1+0.2 i_{c} \frac{B}{L} \\ &s_{q}=1+\frac{B i_{c}}{L} \sin \phi \\ &s_{\gamma}=1-\frac{0.4 i_{\gamma} B}{L} \geq 0.6 \end{aligned}$$
$$\begin{array}{ll} d_{c} & = \begin{cases}1+0.35 \frac{D}{B} & D \leq B \\ 1.4 \tan ^{-1} \frac{D}{B} & D>B\end{cases} \\ d_{q} & = \begin{cases}1+2 \tan \phi(1-\sin \phi)^{2} \frac{D}{B} & D \leq B \\ 1+2 \tan \phi(1-\sin \phi)^{2} \tan ^{-1} \frac{D}{B} & D>B\end{cases} \\ d_{\gamma} & =1.00 \quad \text { all cases } \end{array}$$