# Advanced Probability Theory | Math 425/Math 340/Stat 231/ST 8553/Math 551/MA 485/585 | Assignment Help

Suppose $\{X_{n}\}$ are Bernoulli random variables with $P\left[X_{n}=1\right]=p_{n}$. Then
$$P\left[\lim_{n \rightarrow \infty} X_{n}=0\right]=1$$
if
$$\sum_{n} p_{n}<\infty$$
To verify this, observe that if
$$\sum_{n} p_{n}=\sum_{n} P\left[X_{n}=1\right]<\infty,$$
then by the Borel-Cantelli Lemma
$$P\left(\left[X_{n}=1\right] \text { i.o. }\right)=0$$
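Taking complements, we find
$$1=P\left(\left(\limsup_{n \rightarrow \infty}\left[X_{n}=1\right]\right)^{c}\right)=P\left(\liminf_{n \rightarrow \infty}\left[X_{n}=0\right]\right) .$$
Since each $X_{n}$ takes only the values 0 and 1, on the event $\liminf_{n \rightarrow \infty}\left[X_{n}=0\right]$ we have $X_{n}=0$ for all sufficiently large $n$, and hence $X_{n} \rightarrow 0$.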

## Math 425/Math 340/Stat 231/ST 8553/Math 551/MA 485/585 COURSE NOTES:

Suppose $\left\{X_{n}, n \geq 1\right\}$ are independent Bernoulli random variables with
$$P\left[X_{k}=1\right]=p_{k}=1-P\left[X_{k}=0\right] .$$
Then we assert that
$$P\left[X_{n} \rightarrow 0\right]=1 \text { iff } \sum_{n} p_{n}<\infty .$$
To verify this assertion, we merely need to observe that
$$P\left\{\left[X_{n}=1\right] \text { i.o. }\right\}=0$$
iff
$$\sum_{n} P\left[X_{n}=1\right]=\sum_{n} p_{n}<\infty ,$$
which is the Borel zero-one law for independent events.
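The dichotomy above can be illustrated numerically. The following is a minimal simulation sketch, not part of the original notes: the two sequences $p_n = 1/n^2$ (summable) and $p_n = 1/n$ (not summable), the function names, and the truncation at a finite `n_max` are all illustrative assumptions. A finite simulation can only suggest the limiting behavior, not prove it.

```python
import random

def partial_sum(p, n_max):
    """Return sum_{n=1}^{n_max} p(n)."""
    return sum(p(n) for n in range(1, n_max + 1))

def simulate_ones(p, n_max, rng):
    """Simulate independent X_n ~ Bernoulli(p(n)) for n = 1..n_max.

    Returns (number of ones, index of the last one, or 0 if there is none).
    """
    count, last = 0, 0
    for n in range(1, n_max + 1):
        if rng.random() < p(n):
            count += 1
            last = n
    return count, last

def summable(n):
    # p_n = 1/n^2: the series converges (to pi^2 / 6)
    return 1.0 / (n * n)

def divergent(n):
    # p_n = 1/n: the harmonic series diverges
    return 1.0 / n

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for name, p in [("1/n^2", summable), ("1/n", divergent)]:
        total = partial_sum(p, 100_000)
        count, last = simulate_ones(p, 100_000, rng)
        print(f"p_n = {name}: partial sum = {total:.3f}, "
              f"ones = {count}, last one at n = {last}")
```

In the summable case one would expect the ones to stop early, while in the divergent case they keep appearing (ever more sparsely) as `n_max` grows.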

# Statistical Computing | STAT206/STA 518/STA 511/STA 6106/STAT 8070/STAT151 | Assignment Help

Consider maximization of the function $L(\mathbf{W}, \mathbf{H})$, written here without matrix notation:
$$L(\mathbf{W}, \mathbf{H})=\sum_{i=1}^{N} \sum_{j=1}^{p}\left[x_{i j} \log \left(\sum_{k=1}^{r} w_{i k} h_{k j}\right)-\sum_{k=1}^{r} w_{i k} h_{k j}\right] .$$
Using the concavity of $\log (x)$, show that for any set of $r$ values $y_{k} \geq 0$ and $0 \leq c_{k} \leq 1$ with $\sum_{k=1}^{r} c_{k}=1$,
$$\log \left(\sum_{k=1}^{r} y_{k}\right) \geq \sum_{k=1}^{r} c_{k} \log \left(y_{k} / c_{k}\right)$$
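One way to see the inequality is to write $\sum_k y_k = \sum_k c_k (y_k / c_k)$ and apply concavity of the logarithm (Jensen's inequality) to this weighted average. The sketch below checks the bound numerically; it is only a sanity check, not a proof, and it assumes strictly positive $y_k$ and $c_k$ so that every logarithm is defined. The function names are illustrative.

```python
import math
import random

def log_of_sum(y):
    """Left-hand side: log(sum_k y_k)."""
    return math.log(sum(y))

def jensen_lower_bound(y, c):
    """Right-hand side: sum_k c_k * log(y_k / c_k).

    Assumes y_k > 0, c_k > 0 and sum(c) == 1.
    """
    return sum(ck * math.log(yk / ck) for yk, ck in zip(y, c))

def random_weights(r, rng):
    """Draw r strictly positive weights that sum to 1."""
    raw = [rng.random() + 1e-9 for _ in range(r)]
    total = sum(raw)
    return [v / total for v in raw]

if __name__ == "__main__":
    rng = random.Random(0)
    y = [1.0, 2.0, 3.0]
    c = random_weights(len(y), rng)
    print(log_of_sum(y), ">=", jensen_lower_bound(y, c))
```

Equality holds when $c_k = y_k / \sum_l y_l$, since then every ratio $y_k / c_k$ equals $\sum_l y_l$.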

## MSTAT 502/STAT 316/MTH 513A/MATH 321/STAT210/STA 106 COURSE NOTES:

For $m=1$ to $M$:
(a) Fit a classifier $G_{m}(x)$ to the training data using weights $w_{i}$.
(b) Compute
$$\operatorname{err}_{m}=\frac{\sum_{i=1}^{N} w_{i} I\left(y_{i} \neq G_{m}\left(x_{i}\right)\right)}{\sum_{i=1}^{N} w_{i}} .$$
(c) Compute $\alpha_{m}=\log \left(\left(1-\operatorname{err}_{m}\right) / \operatorname{err}_{m}\right)$.
(d) Set $w_{i} \leftarrow w_{i} \cdot \exp \left[\alpha_{m} \cdot I\left(y_{i} \neq G_{m}\left(x_{i}\right)\right)\right], i=1,2, \ldots, N$.
Output $G(x)=\operatorname{sign}\left[\sum_{m=1}^{M} \alpha_{m} G_{m}(x)\right]$.
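The loop above can be sketched in a few lines of Python. This is an illustrative sketch under several assumptions that the excerpt does not state: the weights start at $w_i = 1/N$, the base classifier is a one-dimensional decision stump, labels are coded as $\pm 1$, and the error is clipped away from 0 and 1 to avoid dividing by zero. All function names are hypothetical.

```python
import math

def fit_stump(x, y, w):
    """Weighted decision stump on 1-D inputs.

    Returns (threshold, sign) minimizing the weighted error of
    G(x) = sign if x > threshold else -sign.
    """
    best = None
    xs = sorted(set(x))
    thresholds = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    for t in thresholds:
        for s in (1, -1):
            err = sum(wi for xi, yi, wi in zip(x, y, w)
                      if (s if xi > t else -s) != yi)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best[1], best[2]

def stump_predict(stump, xi):
    t, s = stump
    return s if xi > t else -s

def adaboost(x, y, M):
    n = len(x)
    w = [1.0 / n] * n  # assumed initialization (not shown in the excerpt)
    ensemble = []
    for _ in range(M):
        stump = fit_stump(x, y, w)                                   # step (a)
        miss = [stump_predict(stump, xi) != yi for xi, yi in zip(x, y)]
        err = sum(wi for wi, m in zip(w, miss) if m) / sum(w)        # step (b)
        err = min(max(err, 1e-12), 1 - 1e-12)  # added guard against log(0)
        alpha = math.log((1 - err) / err)                            # step (c)
        w = [wi * math.exp(alpha) if m else wi
             for wi, m in zip(w, miss)]                              # step (d)
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, xi):
    """Sign of the alpha-weighted vote; ties are broken towards +1."""
    score = sum(alpha * stump_predict(stump, xi) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single stump separates the classes.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

if __name__ == "__main__":
    ens = adaboost(x, y, M=3)
    print([predict(ens, xi) for xi in x])
```

On this toy data a single stump misclassifies at least three points, whereas the weighted vote of three stumps fits the training labels.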

# Experimental Design and Analysis of Variance | STAT 502/STAT 316/MTH 513A/MATH 321/STAT210/STA 106 | Assignment Help

In this equation $c_{i 1}$ is the coefficient of $\bar{X}_{i . .}$ when testing for a linear trend, and $c_{i 2}$ is the coefficient for the quadratic trend.

The value of $a_{1}^{}$ is estimated from $C_{1}$ for the $A$ main effect just as in Chapter 10:
$$\hat{a}_{1}^{}=C_{1} / \Sigma_{i} c_{i 1}^{2}=100 / 20=5 .$$
To estimate $a^{}\left(b_{j}\right)_{2}$, we use the same formula (substituting $C_{2}$ for $C_{1}$ and $c_{i 2}$ for $c_{i 1}$), but we use a different $C_{2}$ for each level of $B$. The values we use are in the quadratic column in Table 11.3:
$$\begin{aligned} &\hat{a}^{}\left(b_{1}\right)_{2}=-2 / \Sigma_{i} c_{i 2}^{2}=-2 / 4=-.50 \\ &\hat{a}^{}\left(b_{2}\right)_{2}=-13 / \Sigma_{i} c_{i 2}^{2}=-13 / 4=-3.25 \\ &\hat{a}^{}\left(b_{3}\right)_{2}=-19 / \Sigma_{i} c_{i 2}^{2}=-19 / 4=-4.75 . \end{aligned}$$
The estimate of $\mu_{11}$ would then be $\hat{\mu}_{11}=45.00+5(-3)-.50(1)=29.50$. The estimate of $\mu_{12}$ would be $\hat{\mu}_{12}=52.75+5(-3)-3.25(1)=34.50$. The other estimates are obtained similarly.
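The arithmetic above can be reproduced with a short script. This is a sketch under assumptions: the trend coefficients are taken to be the usual orthogonal polynomial values for four equally spaced levels, $(-3,-1,1,3)$ and $(1,-1,-1,1)$, which agree with $\Sigma_i c_{i1}^2 = 20$, $\Sigma_i c_{i2}^2 = 4$ and the values $c_{11}=-3$, $c_{12}=1$ used in the text. The `base` argument is simply the leading term of each estimate as given in the text (45.00 and 52.75); the function names are illustrative.

```python
# Assumed trend coefficients for four equally spaced levels of A.
linear = [-3, -1, 1, 3]      # sum of squares = 20
quadratic = [1, -1, -1, 1]   # sum of squares = 4

def trend_coefficient(contrast, coeffs):
    """Estimate a trend coefficient as C / sum_i c_i^2."""
    return contrast / sum(c * c for c in coeffs)

# Linear coefficient from the A main effect contrast C_1 = 100.
a1 = trend_coefficient(100, linear)

# Quadratic coefficients, one per level of B, from the C_2 values in the text.
a2 = {j: trend_coefficient(c2, quadratic)
      for j, c2 in {1: -2, 2: -13, 3: -19}.items()}

def cell_estimate(base, i, j):
    """base + linear part + quadratic part for level i of A and level j of B."""
    return base + a1 * linear[i - 1] + a2[j] * quadratic[i - 1]

if __name__ == "__main__":
    print(a1, a2)
    print(cell_estimate(45.00, 1, 1))   # estimate of mu_11
    print(cell_estimate(52.75, 1, 2))   # estimate of mu_12
```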

## MSTAT 502/STAT 316/MTH 513A/MATH 321/STAT210/STA 106 COURSE NOTES:

Sometimes addition, subtraction, or multiplication of matrices is possible after one or both matrices have been transposed. To transpose a matrix, we simply exchange rows and columns. For example, the transpose of $C$ is
$$C^{t}=\left|\begin{array}{rr} 3 & 4 \\ -2 & 2 \\ 1 & 0 \end{array}\right|$$
One important use of transposing is to enable the multiplication of a matrix by itself. We cannot write $A A$ unless $A$ is a square matrix, but we can always write $A^{t} A$ and $A A^{t}$. For the matrix above,
$$C^{t} C=\left|\begin{array}{rr} 3 & 4 \\ -2 & 2 \\ 1 & 0 \end{array}\right|\left|\begin{array}{rrr} 3 & -2 & 1 \\ 4 & 2 & 0 \end{array}\right|=\left|\begin{array}{rrr} 25 & 2 & 3 \\ 2 & 8 & -2 \\ 3 & -2 & 1 \end{array}\right| .$$
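The transpose and the two products can be checked with plain Python lists. This is a minimal sketch: the helper names `transpose` and `matmul` are illustrative, and $C$ is read off from the second factor of the product above.

```python
def transpose(m):
    """Exchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product a @ b; the inner dimensions must agree."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

C = [[3, -2, 1],
     [4, 2, 0]]          # 2 x 3, so C @ C is not defined

Ct = transpose(C)        # 3 x 2
CtC = matmul(Ct, C)      # 3 x 3
CCt = matmul(C, Ct)      # 2 x 2

if __name__ == "__main__":
    print(CtC)
    print(CCt)
```

Both products are symmetric, which is a quick check on the hand computation.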

# Applied Regression Analysis | STAT 423/BUS 41100/MATH 3113/STAT 4230/6230/STAT 415/615/NHM 726 | Assignment Help

$$\hat{\boldsymbol{\beta}}_{O L S}=\frac{n}{n-1} \hat{\boldsymbol{\Sigma}}_{\boldsymbol{x}}^{-1} \hat{\boldsymbol{\Sigma}}_{\boldsymbol{x} Y} \stackrel{D}{\rightarrow} \boldsymbol{\beta}_{O L S} \text { as } n \rightarrow \infty$$
and
$$\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x} Y}=\frac{1}{n} \sum_{i=1}^{n} \boldsymbol{x}_{i} Y_{i}-\overline{\boldsymbol{x}} \bar{Y} .$$
Thus
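$$\hat{\boldsymbol{\Sigma}}_{\boldsymbol{x} Y}=\frac{1}{n}\left[\sum_{j: Y_{j}=1} \boldsymbol{x}_{j}(1)+\sum_{j: Y_{j}=0} \boldsymbol{x}_{j}(0)\right]-\overline{\boldsymbol{x}} \hat{\pi}_{1}=\frac{1}{n} \sum_{j: Y_{j}=1} \boldsymbol{x}_{j}-\overline{\boldsymbol{x}} \hat{\pi}_{1} .$$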
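For a binary response, carrying the algebra one step further (with $\hat{\pi}_1 = \bar{Y}$ the sample proportion of ones and $\hat{\pi}_0 = 1 - \hat{\pi}_1$) gives $\hat{\pi}_1 \hat{\pi}_0 (\overline{\boldsymbol{x}}_1 - \overline{\boldsymbol{x}}_0)$, where $\overline{\boldsymbol{x}}_1$ and $\overline{\boldsymbol{x}}_0$ are the group means. This continuation is not shown in the excerpt; the sketch below checks it numerically for a single scalar predictor, with made-up data and illustrative function names.

```python
def sigma_xy(x, y):
    """(1/n) * sum_i x_i * y_i - xbar * ybar for a scalar predictor x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    return sum(xi * yi for xi, yi in zip(x, y)) / n - xbar * ybar

def group_form(x, y):
    """pi1_hat * pi0_hat * (xbar_1 - xbar_0) for binary y in {0, 1}.

    Assumes both groups are non-empty.
    """
    n = len(x)
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    pi1, pi0 = len(x1) / n, len(x0) / n
    return pi1 * pi0 * (sum(x1) / len(x1) - sum(x0) / len(x0))

# Made-up data for illustration only.
x = [1.0, 2.0, 4.0, 5.0, 7.0, 9.0]
y = [0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print(sigma_xy(x, y), group_form(x, y))
```

The two functions agree on any binary data set with both groups present, which is why the OLS coefficient ends up proportional to the difference in group means.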

## MTH 412/NURS 629/STAT 4530/STA 111L/QTM 100/STA 290 COURSE NOTES:

The discriminant function estimator is
$$\hat{\boldsymbol{\beta}}_{D}=\frac{n(n-1)}{N_{0} N_{1}} \hat{\Sigma}^{-1} \hat{\Sigma}_{\boldsymbol{x}} \hat{\boldsymbol{\beta}}_{O L S} .$$
Now when the conditions of Definition 10.3 are met and if $\boldsymbol{\mu}_{1}-\boldsymbol{\mu}_{0}$ is small enough so that there is not perfect classification, then
$$\boldsymbol{\beta}_{L R}=\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu}_{1}-\boldsymbol{\mu}_{0}\right) .$$
Empirically, the OLS ESP and LR ESP are highly correlated for many LR data sets where the conditions are not met, e.g., when some of the predictors are factors. This suggests that $\boldsymbol{\beta}_{L R} \approx d \boldsymbol{\Sigma}_{\boldsymbol{x}}^{-1}\left(\boldsymbol{\mu}_{1}-\boldsymbol{\mu}_{0}\right)$ for many LR data sets, where $d$ is some constant depending on the data. Results from Haggstrom (1983) suggest that if a binary regression model is fit using OLS software for MLR, then a rough approximation is $\hat{\boldsymbol{\beta}}_{L R} \approx \hat{\boldsymbol{\beta}}_{O L S} / M S E$, so that LR ESP $\approx$ (OLS ESP) $/ M S E$.

# Statistical Inference | MTH 412/NURS 629/STAT 4530/STA 111L/QTM 100/STA 290 | Assignment Help

$$Y_{i}=\beta_{0}+\sum_{j=1}^{p-1} \beta_{j} x_{i j}+e_{i}, \quad i=1, \ldots, n$$
where the $e_{i}$ are random errors with
$$\begin{aligned} E\left(e_{i}\right) &=0 \\ \operatorname{Var}\left(e_{i}\right) &=\sigma^{2} \\ \operatorname{Cov}\left(e_{i}, e_{j}\right) &=0, \quad i \neq j \end{aligned}$$
In matrix notation, we have
$$\underset{n \times 1}{\mathbf{Y}}=\underset{n \times p}{\mathbf{X}} \underset{p \times 1}{\boldsymbol{\beta}}+\underset{n \times 1}{\mathbf{e}}$$
and
$$\begin{aligned} E(\mathbf{e}) &=\mathbf{0} \\ \boldsymbol{\Sigma}_{e e} &=\sigma^{2} \mathbf{I} \end{aligned}$$

## MTH 412/NURS 629/STAT 4530/STA 111L/QTM 100/STA 290 COURSE NOTES:

Under the normality assumption, it can be shown that
$$\frac{\hat{\beta}_{i}-\beta_{i}}{s_{\hat{\beta}_{i}}} \sim t_{n-p}$$
although we will not derive this result. It follows that a $100(1-\alpha) \%$ confidence interval for $\beta_{i}$ is
$$\hat{\beta}_{i} \pm t_{n-p}(\alpha / 2) s_{\hat{\beta}_{i}} .$$
To test the null hypothesis $H_{0}: \beta_{i}=\beta_{i 0}$, where $\beta_{i 0}$ is a fixed number, we can use the test statistic
$$t=\frac{\hat{\beta}_{i}-\beta_{i 0}}{s_{\hat{\beta}_{i}}} .$$
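As a concrete illustration, the sketch below computes the slope estimate, its standard error, the confidence interval, and the test statistic for simple linear regression ($p = 2$). It makes several assumptions not in the notes: the data are invented, the standard error of the slope is taken as $s / \sqrt{\sum_i (x_i - \bar{x})^2}$ with $s^2 = \mathrm{SSE}/(n-2)$, and the critical value $t_{n-p}(\alpha/2)$ is supplied by the user from a table (for example, about 3.182 for 3 degrees of freedom and $\alpha = 0.05$) rather than computed.

```python
import math

def simple_ols(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1, se_b1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)            # n - p degrees of freedom with p = 2
    se_b1 = math.sqrt(s2 / sxx)   # standard error of the slope
    return b0, b1, se_b1

def t_statistic(beta_hat, beta_null, se):
    """(beta_hat - beta_null) / se, the test statistic from the text."""
    return (beta_hat - beta_null) / se

def confidence_interval(beta_hat, se, t_crit):
    """beta_hat +/- t_crit * se; t_crit is t_{n-p}(alpha/2) from a table."""
    return beta_hat - t_crit * se, beta_hat + t_crit * se

# Invented data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

if __name__ == "__main__":
    b0, b1, se = simple_ols(x, y)
    print("slope:", b1, "se:", se)
    print("t for H0: slope = 2:", t_statistic(b1, 2.0, se))
    print("95% CI:", confidence_interval(b1, se, 3.182))
```

The null hypothesis is rejected at level $\alpha$ when $|t|$ exceeds the same critical value, which is equivalent to $\beta_{i0}$ falling outside the confidence interval.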