# Statistics | MATH1005/MATH1905 Statistics Assignment Help (Sydney)


$$P\left(\beta_{j} \mid \delta_{j}\right)=\delta_{j} P\left(\beta_{j} \mid \delta_{j}=1\right)+\left(1-\delta_{j}\right) P\left(\beta_{j} \mid \delta_{j}=0\right),$$
whereby $\beta_{j}$ has a relatively diffuse prior when $\delta_{j}=1$ and $X_{j}$ is included in the usual way, but for $\delta_{j}=0$ the prior is centred at zero with high precision, so that while $X_{j}$ is still in the regression, it is essentially irrelevant to that regression. For instance, if
$$\left(\beta_{j} \mid \delta_{j}=1\right) \sim N\left(0, V_{j}\right),$$
one might assume $V_{j}$ large, leading to a prior that allows a search among values that reflect the predictor’s possible effect, whereas
$$\left(\beta_{j} \mid \delta_{j}=0\right) \sim N\left(0, c_{j} V_{j}\right),$$
where $c_{j}$ is small and chosen so that the range of $\beta_{j}$ under $P\left(\beta_{j} \mid \delta_{j}=0\right)$ is confined to substantively insignificant values. So the above prior becomes
$$P\left(\beta_{j} \mid \delta_{j}\right)=\delta_{j} N\left(0, V_{j}\right)+\left(1-\delta_{j}\right) N\left(0, c_{j} V_{j}\right) .$$
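As a sketch of this spike-and-slab mixture prior, the density can be evaluated directly; the values $V_j = 10$ and $c_j = 0.01$ below are illustrative assumptions, not values fixed by the text, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import norm

def spike_slab_pdf(beta, delta, V=10.0, c=0.01):
    """Prior density P(beta_j | delta_j): a diffuse N(0, V) slab when
    delta_j = 1, and a tightly concentrated N(0, c*V) spike at zero when
    delta_j = 0.  V and c are illustrative choices only."""
    var = V if delta == 1 else c * V
    return norm.pdf(beta, loc=0.0, scale=np.sqrt(var))
```

Near $\beta_j = 0$ the spike component dominates, while for substantively large $|\beta_j|$ the slab does, which is what lets the indicator $\delta_j$ act as a variable-selection switch.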

## MATH1005/MATH1905 COURSE NOTES

The ridge regression approach is closely related to a version of the standard posterior Bayes regression estimate, but with an exchangeable prior distribution on the elements of the regression vector. Thus in $y=X \beta+\varepsilon$, with $\varepsilon \sim N\left(0, \sigma^{2} I\right)$, assume that the elements of $\beta$ are drawn from a common normal density
$$\beta_{j} \sim N\left(0, \sigma^{2} / k\right) \quad j=2, \ldots, p,$$
where a preliminary standardisation of the variables $x_{2}, \ldots, x_{p}$ may be needed to make this prior assumption more plausible. The mean of the posterior distribution of $\beta$ given $y$ is then
$$\beta=\left(X^{\prime} X+k I\right)^{-1} X^{\prime} y .$$
If the prior on $\beta$ specifies a location, as in
$$\beta \sim N\left(\gamma, \sigma^{2} / k\right)$$
then the posterior mean of $\beta$ becomes
$$\beta=\left(k / \sigma^{2}+X^{\prime} X / \sigma^{2}\right)^{-1}\left(k \gamma / \sigma^{2}+X^{\prime} y / \sigma^{2}\right) .$$
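A minimal numerical sketch of these two posterior-mean formulas, assuming NumPy (the function name is illustrative):

```python
import numpy as np

def ridge_posterior_mean(X, y, k, gamma=None, sigma2=1.0):
    """Posterior mean of beta under the prior beta ~ N(gamma, (sigma^2/k) I).
    With gamma = 0 this reduces to the ridge estimate (X'X + kI)^{-1} X'y."""
    p = X.shape[1]
    if gamma is None:
        gamma = np.zeros(p)
    # Solve (k I/sigma^2 + X'X/sigma^2) beta = (k gamma + X'y)/sigma^2
    A = (k * np.eye(p) + X.T @ X) / sigma2
    b = (k * gamma + X.T @ y) / sigma2
    return np.linalg.solve(A, b)
```

With $\gamma = 0$ the factor $\sigma^{2}$ cancels, recovering the ridge formula above; a nonzero $\gamma$ shrinks the estimate toward the prior location instead of toward zero.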

# Statistics MATH1054


For instance, suppose the goal is, as before, to estimate a regression
$$Y=x \beta+e,$$
where the joint dependence between errors might be described by a multivariate Normal prior
$$e \sim N_{n}(0, \Sigma)$$

such that the dispersion matrix $\Sigma$ reflects the spatial interdependencies within the data. Outcomes may also be discrete, and then one might have, for binomial data, say,
$$\begin{gathered} Y_{i} \sim \operatorname{Bin}\left(\pi_{i}, N_{i}\right), \quad i=1, \ldots, n \\ \operatorname{logit}\left(\pi_{i}\right)=x_{i} \beta+e_{i}, \end{gathered}$$
where again, the errors may be spatially dependent. Let the $n \times n$ covariance matrix for $e$ be
$$\Sigma=\sigma^{2} R(d)$$

## MATH1054 COURSE NOTES

Among the most commonly used functions meeting these requirements is the exponential model
$$r_{i j}=\exp \left(-3 d_{i j} / h\right)$$
where $h$ is the range, i.e. the inter-point distance beyond which spatial correlation ceases to be important. The Gaussian correlation function has
$$r_{i j}=\exp \left(-3 d_{i j}^{2} / h^{2}\right)$$
and the spherical (Mardia and Marshall, 1984) has $r_{i j}=0$ for $d_{i j}>h$ and
$$r_{i j}=1-\frac{3 d_{i j}}{2 h}+\frac{d_{i j}^{3}}{2 h^{3}}$$
for $d_{i j}<h$; see Example $7.5$ for an illustration. In each of these functions, $h$ is analogous to the bandwidth of kernel smoothing models. If $\Sigma=\sigma^{2} R(d)$, then the covariance tends
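The three correlation functions above can be sketched as follows; the helper name and the `kind` argument are illustrative, not part of the text.

```python
import numpy as np

def spatial_correlation(d, h, kind="exponential"):
    """Correlation r_ij at inter-point distance d, with range h.
    All three models equal 1 at d = 0 and decay as d grows; the
    spherical model is exactly 0 for d beyond the range h."""
    d = np.asarray(d, dtype=float)
    if kind == "exponential":
        return np.exp(-3.0 * d / h)
    if kind == "gaussian":
        return np.exp(-3.0 * d**2 / h**2)
    if kind == "spherical":
        r = 1.0 - 3.0 * d / (2.0 * h) + d**3 / (2.0 * h**3)
        return np.where(d < h, r, 0.0)
    raise ValueError(f"unknown correlation model: {kind}")
```

The covariance matrix is then $\Sigma=\sigma^{2} R(d)$, with $R$ built elementwise from the pairwise distances $d_{ij}$.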

# Statistics MATH08051


$$\begin{aligned} &H_{0}: \mu=0.86 \\ &H_{a}: \mu \neq 0.86 \end{aligned}$$
at the $1 \%$ level of significance. What is the power of this test against the specific alternative $\mu=0.845$ ?
The test rejects $H_{0}$ when $|z| \geq 2.576$. The test statistic is
$$z=\frac{\bar{x}-0.86}{0.0068 / \sqrt{3}}$$

Some arithmetic shows that the test rejects when either of the following is true:
$$\begin{array}{ll} z \geq 2.576 & \text { (in other words, } \bar{x} \geq 0.870 \text { ) } \\ z \leq-2.576 & \text { (in other words, } \bar{x} \leq 0.850 \text { ) } \end{array}$$
These are disjoint events, so the power is the sum of their probabilities, computed assuming that the alternative $\mu=0.845$ is true. We find that
$$\begin{aligned} P(\bar{x} \geq 0.87) &=P\left(\frac{\bar{x}-\mu}{\sigma / \sqrt{n}} \geq \frac{0.87-0.845}{0.0068 / \sqrt{3}}\right) \\ &=P(Z \geq 6.37) \doteq 0 \end{aligned}$$
and
$$\begin{aligned} P(\bar{x} \leq 0.85) &=P\left(\frac{\bar{x}-\mu}{\sigma / \sqrt{n}} \leq \frac{0.85-0.845}{0.0068 / \sqrt{3}}\right) \\ &=P(Z \leq 1.27) \doteq 0.898 . \end{aligned}$$
The power against the alternative $\mu=0.845$ is therefore approximately $0+0.898=0.898$.
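The power calculation can be reproduced numerically; a short sketch assuming SciPy:

```python
import math
from scipy.stats import norm

mu0, mu_alt, sigma, n, alpha = 0.86, 0.845, 0.0068, 3, 0.01
se = sigma / math.sqrt(n)
z_crit = norm.ppf(1 - alpha / 2)          # about 2.576
upper = mu0 + z_crit * se                 # reject H0 if xbar >= upper
lower = mu0 - z_crit * se                 # reject H0 if xbar <= lower
# Power: probability of rejection when the true mean is mu_alt
power = (1 - norm.cdf((upper - mu_alt) / se)) + norm.cdf((lower - mu_alt) / se)
```

Using the unrounded cutoffs gives a power of roughly 0.89, in line with the hand calculation above.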

## MATH08051 COURSE NOTES

The sample mean is $\bar{x}=5$ and the standard deviation is $s=3.63$ with degrees of freedom $n-1=7$. The standard error is
$$\mathrm{SE}_{\bar{x}}=s / \sqrt{n}=3.63 / \sqrt{8}=1.28$$
From Table D we find $t^{*}=2.365$. The $95 \%$ confidence interval is
$$\begin{aligned} \bar{x} \pm t^{*} \frac{s}{\sqrt{n}} &=5.0 \pm 2.365 \frac{3.63}{\sqrt{8}} \\ &=5.0 \pm(2.365)(1.28) \\ &=5.0 \pm 3.0 \\ &=(2.0,8.0) \end{aligned}$$
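The same interval can be computed directly rather than read from Table D; a sketch assuming SciPy's $t$ distribution:

```python
import math
from scipy.stats import t

xbar, s, n = 5.0, 3.63, 8
se = s / math.sqrt(n)                # about 1.28
t_star = t.ppf(0.975, df=n - 1)      # about 2.365 for df = 7
ci = (xbar - t_star * se, xbar + t_star * se)
```

The quantile `t.ppf(0.975, df=7)` plays the role of the tabulated $t^{*}$ for a two-sided 95% interval.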

# Statistics | STAT 516 Assignment Help


We are being asked to construct a $100(1-\alpha)$ percent confidence interval estimate, with $\alpha=0.10$ in part (a) and $\alpha=0.01$ in part (b). Now
$$z_{0.05}=1.645 \text { and } z_{0.005}=2.576$$
and so the 90 percent confidence interval estimator is
$$\bar{X} \pm 1.645 \frac{\sigma}{\sqrt{n}}$$

and the 99 percent confidence interval estimator is
$$\bar{X} \pm 2.576 \frac{\sigma}{\sqrt{n}}$$
For the data of Example 8.5, $n=10, \bar{X}=19.3$, and $\sigma=3$. Therefore, the 90 and 99 percent confidence interval estimates for $\mu$ are, respectively,
$$19.3 \pm 1.645 \frac{3}{\sqrt{10}}=19.3 \pm 1.56$$
and
$$19.3 \pm 2.576 \frac{3}{\sqrt{10}}=19.3 \pm 2.44$$
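Both intervals come from the same formula with different quantiles; a short check in Python (SciPy assumed, helper name illustrative):

```python
import math
from scipy.stats import norm

xbar, sigma, n = 19.3, 3.0, 10

def z_interval_halfwidth(conf):
    """Half-width z_{alpha/2} * sigma / sqrt(n) of the CI at level conf."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return z * sigma / math.sqrt(n)

half90 = z_interval_halfwidth(0.90)   # about 1.56
half99 = z_interval_halfwidth(0.99)   # about 2.44
```

The intervals are then `(xbar - half, xbar + half)` for each level, matching $19.3 \pm 1.56$ and $19.3 \pm 2.44$ above.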

Since the 95 percent confidence interval estimator is
$$\bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}},$$
its length is
$$\text { Length of interval }=2(1.96) \frac{\sigma}{\sqrt{n}}=3.92 \frac{\sigma}{\sqrt{n}} .$$
To ensure that the interval has length at most $b$, we must choose $n$ so that
$$\frac{3.92 \sigma}{\sqrt{n}} \leq b,$$
or, equivalently,
$$\sqrt{n} \geq \frac{3.92 \sigma}{b} .$$
Upon squaring both sides we see that the sample size $n$ must be chosen so that
$$n \geq\left(\frac{3.92 \sigma}{b}\right)^{2}$$
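This sample-size rule is easy to apply in code; a minimal sketch (the function name is illustrative, and `z = 1.96` assumes a 95 percent interval as in the text):

```python
import math

def min_sample_size(sigma, b, z=1.96):
    """Smallest n for which the CI length 2*z*sigma/sqrt(n) is at most b."""
    return math.ceil((2 * z * sigma / b) ** 2)
```

For example, with $\sigma=3$ and a desired length $b=1$, the bound is $(3.92 \cdot 3)^{2} \approx 138.3$, so $n = 139$ observations suffice.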