# Statistics | MA20227 Statistics 2B


Multivariate distributions: expectation and variance-covariance matrix of a random vector; statement of properties of the bivariate and multivariate normal distribution.

$$G^{2}\left(M_{0}\right)=2 \sum_{i \in I} n_{i} \log \left(\frac{n_{i}}{\hat{m}_{i}^{0}}\right)$$ where, for a cell $i$ belonging to the index set $I$, $n_{i}$ is the frequency of observations in the $i$th cell and $\hat{m}_{i}^{0}$ are the expected frequencies under the considered model $M_{0}$. For model comparison, two nested models $M_{0}$ and $M_{1}$ can be compared using the difference between their deviances:
$$D=G_{0}^{2}-G_{1}^{2}=2 \sum_{i \in I} n_{i} \log \left(\frac{n_{i}}{\hat{m}_{i}^{0}}\right)-2 \sum_{i \in I} n_{i} \log \left(\frac{n_{i}}{\hat{m}_{i}^{1}}\right)=2 \sum_{i \in I} n_{i} \log \left(\frac{\hat{m}_{i}^{1}}{\hat{m}_{i}^{0}}\right)$$
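The deviance and the nested-model comparison above can be sketched numerically; the cell counts and fitted values below are purely hypothetical (both sets of fitted counts share the observed total, as maximum-likelihood fits for log-linear models would):

```python
import numpy as np

# Hypothetical observed cell counts and fitted counts under two nested models
n = np.array([25.0, 40.0, 35.0, 20.0])    # observed frequencies n_i
m0 = np.array([30.0, 30.0, 30.0, 30.0])   # fitted counts under M0 (simpler model)
m1 = np.array([26.0, 38.0, 34.0, 22.0])   # fitted counts under M1 (richer model)

def deviance(n, m):
    """G^2 = 2 * sum_i n_i * log(n_i / m_i)."""
    return 2.0 * np.sum(n * np.log(n / m))

G2_0 = deviance(n, m0)
G2_1 = deviance(n, m1)

# Difference of deviances, which collapses to 2 * sum_i n_i * log(m1_i / m0_i)
D = G2_0 - G2_1
```

The closed-form collapse of `D` holds because the `n_i * log(n_i)` terms cancel between the two deviances, exactly as in the displayed identity.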

## MA20227 COURSE NOTES:

If we knew $f$, the real model, we would be able to determine which of the approximating statistical models, different choices for $g$, will minimise the discrepancy. Therefore the discrepancy of $g$ (due to the parametric approximation) can be obtained as the discrepancy between the unknown probabilistic model and the best parametric statistical model, $p_{\theta_{0}}^{(I)}$ :
$$\Delta\left(f, p_{\theta_{0}}^{(I)}\right)=\sum_{i=1}^{n}\left(f\left(x_{i}\right)-p_{\theta_{0}}^{(I)}\left(x_{i}\right)\right)^{2}$$
However, since $f$ is unknown we cannot identify the best parametric statistical model. We therefore substitute for it a sample estimate, denoted by $p_{\hat{\theta}}^{(I)}(x)$, whose $I$ parameters are estimated from the data. The discrepancy between this sample estimate of $f(x)$ and the best statistical model is called the discrepancy of $g$ (due to the estimation process):
$$\Delta\left(p_{\hat{\theta}}^{(I)}, p_{\theta_{0}}^{(I)}\right)=\sum_{i=1}^{n}\left(p_{\hat{\theta}}^{(I)}\left(x_{i}\right)-p_{\theta_{0}}^{(I)}\left(x_{i}\right)\right)^{2}$$
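A minimal numeric sketch of the two sum-of-squares discrepancies, with an invented "true" $f$ and invented parametric approximations standing in for $p_{\theta_0}^{(I)}$ and $p_{\hat{\theta}}^{(I)}$ (all values are illustrative, not from the notes):

```python
import numpy as np

# Hypothetical sample points and model values, for illustration only
x = np.linspace(0.0, 1.0, 6)
f_true   = np.exp(-x)                    # stand-in for the unknown model f
p_theta0 = 1.0 - x + 0.5 * x**2          # best parametric model p_{theta_0}
p_hat    = 1.02 - 0.95 * x + 0.4 * x**2  # sample-estimated model p_{theta-hat}

def discrepancy(a, b):
    """Sum-of-squares discrepancy between two models on the sample points."""
    return np.sum((a - b) ** 2)

approx_disc = discrepancy(f_true, p_theta0)  # discrepancy due to approximation
estim_disc  = discrepancy(p_hat, p_theta0)   # discrepancy due to estimation
```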

# Statistics MATH1054


For instance, suppose the goal is, as before, to estimate a regression
$$Y=x \beta+e$$
where the joint dependence between errors might be described by a multivariate Normal prior
$$e \sim N_{n}(0, \Sigma)$$

such that the dispersion matrix $\Sigma$ reflects the spatial interdependencies within the data. Outcomes may also be discrete, and then one might have, for binomial data, say,
$$\begin{gathered} Y_{i} \sim \operatorname{Bin}\left(\pi_{i}, N_{i}\right) \quad i=1, \ldots, n \\ \operatorname{logit}\left(\pi_{i}\right)=x_{i} \beta+e_{i} \end{gathered}$$
where again, the errors may be spatially dependent. Let the $n \times n$ covariance matrix for $e$ be
$$\Sigma=\sigma^{2} R(d)$$

## MATH1054 COURSE NOTES:

Among the most commonly used functions meeting these requirements are the exponential model
$$r_{i j}=\exp \left(-3 d_{i j} / h\right)$$
where $h$ is the range, or inter-point distance at which spatial correlation ceases to be important. The Gaussian correlation function has
$$r_{i j}=\exp \left(-3 d_{i j}^{2} / h^{2}\right)$$
and the spherical (Mardia and Marshall, 1984) has $r_{i j}=0$ for $d_{i j}>h$ and
$$r_{i j}=1-\frac{3 d_{i j}}{2 h}+\frac{d_{i j}^{3}}{2 h^{3}}$$
for $d_{i j}<h$; see Example $7.5$ for an illustration. In each of these functions, $h$ is analogous to the bandwidth of kernel smoothing models. If $\Sigma=\sigma^{2} R(d)$, then the covariance tends
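The three correlation functions can be sketched directly; the site locations, range $h$, and variance $\sigma^2$ below are hypothetical, and the dispersion matrix is assembled as $\Sigma=\sigma^{2} R(d)$:

```python
import numpy as np

def exp_corr(d, h):
    """Exponential correlation: r_ij = exp(-3 d_ij / h)."""
    return np.exp(-3.0 * d / h)

def gauss_corr(d, h):
    """Gaussian correlation: r_ij = exp(-3 d_ij^2 / h^2)."""
    return np.exp(-3.0 * d**2 / h**2)

def spher_corr(d, h):
    """Spherical correlation: 1 - 3d/(2h) + d^3/(2h^3) for d < h, else 0."""
    r = 1.0 - 1.5 * d / h + 0.5 * (d / h) ** 3
    return np.where(d < h, r, 0.0)

# Pairwise distances for a few hypothetical one-dimensional site locations
sites = np.array([0.0, 1.0, 2.5, 4.0])
d = np.abs(sites[:, None] - sites[None, :])

h, sigma2 = 3.0, 2.0
Sigma = sigma2 * exp_corr(d, h)   # dispersion matrix sigma^2 * R(d)
```

All three functions equal 1 at $d_{ij}=0$, so the diagonal of $\Sigma$ is $\sigma^2$; the spherical form cuts correlation to exactly zero beyond the range $h$.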

# Statistics 1Z STATS1003_1


An appropriate model for a series with a single cycle is then
$$y_{t}=A \cos (2 \pi f t+P)+u_{t}$$
where $A$ is the amplitude, $P$ the phase of the cycle, and $1 / f$ the period, namely the number of time units from peak to peak. To allow for several $(r)$ frequencies operating simultaneously in the same data, the preceding may be generalised to
$$y_{t}=\sum_{j=1}^{r} A_{j} \cos \left(2 \pi f_{j} t+P_{j}\right)+u_{t}$$

For stationarity to apply, the $A_{j}$ may be taken as uncorrelated with mean 0 and the $P_{j}$ as uniform on $(0,2 \pi)$. Because
$$\cos \left(2 \pi f_{j} t+P_{j}\right)=\cos \left(2 \pi f_{j} t\right) \cos \left(P_{j}\right)-\sin \left(2 \pi f_{j} t\right) \sin \left(P_{j}\right)$$
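A short simulation of the multi-frequency model, with invented frequencies and noise level; it also verifies the angle-addition expansion numerically at one frequency:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
r = 3
freqs = np.array([0.05, 0.12, 0.30])   # hypothetical frequencies f_j

# A_j uncorrelated with mean 0; P_j uniform on (0, 2*pi), as stationarity requires
A = rng.normal(0.0, 1.0, size=r)
P = rng.uniform(0.0, 2.0 * np.pi, size=r)

# y_t = sum_j A_j cos(2 pi f_j t + P_j) + u_t, with white-noise u_t
signal = sum(A[j] * np.cos(2 * np.pi * freqs[j] * t + P[j]) for j in range(r))
y = signal + rng.normal(0.0, 0.5, size=t.size)

# Check cos(a + b) = cos(a)cos(b) - sin(a)sin(b) at the first frequency
a = 2 * np.pi * freqs[0] * t
assert np.allclose(np.cos(a + P[0]),
                   np.cos(a) * np.cos(P[0]) - np.sin(a) * np.sin(P[0]))
```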

## STATS1003_1 COURSE NOTES:

For discrete outcomes, dependence on past observations and predictors may be handled by adapting metric variable methods within the appropriate regression link. Thus for Poisson outcomes
$$y_{t} \sim \operatorname{Poi}\left(\mu_{t}\right)$$
an $\operatorname{AR}(1)$ dependence on previous values in the series could be specified
$$\log \left(\mu_{t}\right)=\rho y_{t-1}+\beta x_{t}$$
Here, non-stationarity or ‘explosive’ behaviour would be implied by $\rho>0$ (Fahrmeir and Tutz, 2001, p. 244), and in an MCMC framework stationarity would be assessed by the proportion of iterations for which $\rho$ was positive. Autoregressive errors lead to specification such as
$$\log \left(\mu_{t}\right)=\beta x_{t}+\varepsilon_{t}$$
with
$$\varepsilon_{t}=\gamma \varepsilon_{t-1}+u_{t}$$
for $t>1$, and $u_{t}$ being white noise.
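The observation-driven Poisson AR(1) specification can be simulated as follows; the parameter values and covariate series are hypothetical, with $\rho$ taken negative so the series stays stable (the text notes $\rho>0$ implies explosive behaviour):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 100
beta, rho = 0.5, -0.05          # rho <= 0 avoids the explosive regime
x = rng.normal(size=T)          # hypothetical covariate series

y = np.zeros(T, dtype=int)
mu = np.zeros(T)
mu[0] = np.exp(beta * x[0])     # no lagged value at t = 0
y[0] = rng.poisson(mu[0])
for t in range(1, T):
    # log(mu_t) = rho * y_{t-1} + beta * x_t
    mu[t] = np.exp(rho * y[t - 1] + beta * x[t])
    y[t] = rng.poisson(mu[t])
```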

# Statistics STAT 516


We are being asked to construct a $100(1-\alpha) \%$ confidence interval estimate, with $\alpha=0.10$ in part (a) and $\alpha=0.01$ in part (b). Now
$$z_{0.05}=1.645 \text { and } z_{0.005}=2.576$$
and so the 90 percent confidence interval estimator is
$$\bar{X} \pm 1.645 \frac{\sigma}{\sqrt{n}}$$

and the 99 percent confidence interval estimator is
$$\bar{X} \pm 2.576 \frac{\sigma}{\sqrt{n}}$$
For the data of Example 8.5, $n=10, \bar{X}=19.3$, and $\sigma=3$. Therefore, the 90 and 99 percent confidence interval estimates for $\mu$ are, respectively,
$$19.3 \pm 1.645 \frac{3}{\sqrt{10}}=19.3 \pm 1.56$$
and
$$19.3 \pm 2.576 \frac{3}{\sqrt{10}}=19.3 \pm 2.44$$
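The two interval computations above can be reproduced directly from the stated data ($n=10$, $\bar{X}=19.3$, $\sigma=3$):

```python
import math

n, xbar, sigma = 10, 19.3, 3.0
z90, z99 = 1.645, 2.576   # z_{0.05} and z_{0.005}

half90 = z90 * sigma / math.sqrt(n)   # half-width of the 90% interval
half99 = z99 * sigma / math.sqrt(n)   # half-width of the 99% interval

ci90 = (xbar - half90, xbar + half90)   # 19.3 +/- 1.56
ci99 = (xbar - half99, xbar + half99)   # 19.3 +/- 2.44
```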

## STAT516 COURSE NOTES:

$$\bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$$
Since the length of this interval is
$$\text { Length of interval }=2(1.96) \frac{\sigma}{\sqrt{n}}=3.92 \frac{\sigma}{\sqrt{n}}$$
we must choose $n$ so that
$$\frac{3.92 \sigma}{\sqrt{n}} \leq b$$
or, equivalently,
$$\sqrt{n} \geq \frac{3.92 \sigma}{b}$$
Upon squaring both sides we see that the sample size $n$ must be chosen so that
$$n \geq\left(\frac{3.92 \sigma}{b}\right)^{2}$$
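The sample-size rule can be packaged as a small helper; the example values of $\sigma$ and the desired length $b$ are hypothetical:

```python
import math

def min_sample_size(sigma, b, z=1.96):
    """Smallest integer n with a 95% CI of length at most b: n >= (2*z*sigma/b)^2."""
    return math.ceil((2.0 * z * sigma / b) ** 2)

# e.g. sigma = 2, desired interval length b = 1:
# (3.92 * 2 / 1)^2 = 61.4656, so n must be at least 62
n_needed = min_sample_size(2.0, 1.0)
```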