Class Notes, Chapter 9, 9.3 - 9.5, 9.14

Estimation (of population parameters)

Reading: 9.3-9.5, 9.14

Probability theory
Parent processes, parameters
Expectation
Sampling schemes, statistics, sampling distributions
Estimation

Today: Estimation - from sample to parameter

The difference between an estimator and an estimate: the formula to get the number vs. the number.
(The "hat" is used to indicate an estimator)
The estimator for a population parameter doesn't have to be the sample analog of the parameter.
Example: the sample median is an estimator of the population mean
Some estimators are better than others.
The sample mean is a good estimator of the population mean.
Twice the sample mode is not a good estimator of the population mean.
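
The estimator/estimate distinction can be sketched in a few lines of Python. The function is the estimator (the formula); the number it returns for one particular sample is the estimate. The sample values and the function name below are made up for illustration.

```python
import statistics

def sample_mean(sample):
    """Estimator: a rule that turns any sample into a number."""
    return sum(sample) / len(sample)

# Hypothetical sample, invented for this sketch.
sample = [3.0, 4.0, 5.0, 8.0, 10.0]

# Estimates: the numbers two different estimators give for this one sample.
mean_estimate = sample_mean(sample)
median_estimate = statistics.median(sample)

print("estimate from the sample mean:  ", mean_estimate)
print("estimate from the sample median:", median_estimate)
```

Both numbers are estimates of the same population mean; they differ because the estimators differ.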
   

Properties of estimators

(1) Unbiased

Definition: $E(\hat{\theta}) = \theta$
$\hat{\theta}$ is the "formula" used with the sample data to get a number;
$\theta$ is the actual population value being estimated

Factoids:
(A) $\bar{X}$ (the formula for the sample mean) is an unbiased estimator of $\mu$ for any parent distribution.

Use the Algebra of Expectations
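
A sketch of that argument, assuming $X_1, \dots, X_n$ are draws from a parent distribution with mean $\mu$ (the notation is an assumption, since the original equation is not reproduced here):

```latex
\begin{aligned}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i) \\
           &= \frac{1}{n}\,(n\mu) = \mu
\end{aligned}
```

Only linearity of expectation is used, which is why the result does not depend on the shape of the parent distribution.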

(B) $s^2$ (the sample variance, with divisor $n-1$) is an unbiased estimator of $\sigma^2$ for any parent distribution.
Here is the derivation.
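
One common way to write the derivation out, assuming independent draws $X_1, \dots, X_n$ with mean $\mu$ and variance $\sigma^2$, and $s^2 = \frac{1}{n-1}\sum_i (X_i - \bar{X})^2$:

```latex
\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2 \\
E\!\left[\sum_{i=1}^{n} (X_i - \mu)^2\right] &= n\sigma^2 \\
E\!\left[n(\bar{X} - \mu)^2\right] &= n\,\mathrm{Var}(\bar{X})
  = n \cdot \frac{\sigma^2}{n} = \sigma^2 \\
E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] &= n\sigma^2 - \sigma^2 = (n-1)\sigma^2 \\
E(s^2) &= \frac{(n-1)\sigma^2}{n-1} = \sigma^2
\end{aligned}
```

Dividing by $n$ instead of $n-1$ would give expected value $\frac{n-1}{n}\sigma^2$, which is too small.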

(C) $s$ is not an unbiased estimator of sigma (!!)

Bias is the difference between the expected value of an estimator and the value of the parameter it estimates: bias $= E(\hat{\theta}) - \theta$. An unbiased estimator has bias zero.

Here is an example.
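
One way to see bias in practice is a small simulation. The sketch below assumes a Normal parent with made-up values ($\mu = 10$, $\sigma = 2$), a small sample size ($n = 5$), and 20000 repeated samples. It averages $s^2$ and $s$ over the samples and compares each average with the true value.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed values for illustration only.
mu, sigma, n, reps = 10.0, 2.0, 5, 20000

def sample_variance(xs):
    """Sample variance s^2 with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

s2_values = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    s2_values.append(sample_variance(xs))

mean_s2 = sum(s2_values) / reps                       # approximates E(s^2)
mean_s = sum(math.sqrt(v) for v in s2_values) / reps  # approximates E(s)

bias_s2 = mean_s2 - sigma ** 2  # should be near zero
bias_s = mean_s - sigma         # should be clearly negative

print("estimated bias of s^2:", round(bias_s2, 3))
print("estimated bias of s:  ", round(bias_s, 3))
```

In runs like this, the average of $s^2$ lands close to $\sigma^2$, while the average of $s$ falls below $\sigma$: taking the square root of an unbiased estimator does not give an unbiased estimator.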

(2) Efficient, i.e. small variance for the sampling distribution

We would like the estimate (the number derived from the sample using $\hat{\theta}$) not only to equal $\theta$ on average, but also to be close to $\theta$ every time we take a sample.
An unbiased estimator that has less variance than any other unbiased estimator is called the Minimum Variance Unbiased Estimator, MVUE. Such an estimator is especially good.

Factoid:
(A) $\bar{X}$ is the MVUE for $\mu$ for a Normal parent distribution

But an estimator may be biased. We can still compare 2 estimators to see which is better (more efficient, sampling distribution has smaller variance); in that case we use relative efficiency.
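
A simulation can make the comparison concrete. The sketch below assumes a standard Normal parent, $n = 25$, and 5000 repeated samples, and takes relative efficiency to be the ratio of the two sampling variances (one common convention, assumed here). It compares the sample mean with the sample median as estimators of the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Assumed values for illustration only.
mu, sigma, n, reps = 0.0, 1.0, 25, 5000

means, medians = [], []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(xs) / n)
    medians.append(statistics.median(xs))

# Variance of each estimator's simulated sampling distribution.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# Values below 1 mean the median is less efficient than the mean.
relative_efficiency = var_mean / var_median

print("variance of sample mean:  ", round(var_mean, 4))
print("variance of sample median:", round(var_median, 4))
print("relative efficiency of median vs. mean:", round(relative_efficiency, 2))
```

For a Normal parent the median's sampling distribution is noticeably wider than the mean's, so the median needs a larger sample to reach the same precision.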

(3) Consistent, i.e. as n, the sample size, gets larger, the sampling distribution of the estimator concentrates more and more tightly around the parameter.

Factoid:
(A) $\bar{X}$ is a consistent estimator of $\mu$.
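
A short sketch of why the sample mean is consistent, assuming independent draws with finite variance $\sigma^2$ (Chebyshev's inequality is used in the last step):

```latex
\begin{aligned}
E(\bar{X}) &= \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \\
P\!\left(|\bar{X} - \mu| \ge \epsilon\right)
  &\le \frac{\mathrm{Var}(\bar{X})}{\epsilon^2}
   = \frac{\sigma^2}{n\,\epsilon^2}
   \;\longrightarrow\; 0 \quad \text{as } n \to \infty
\end{aligned}
```

For any fixed tolerance $\epsilon$, the chance that $\bar{X}$ misses $\mu$ by more than $\epsilon$ shrinks to zero as the sample grows.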

We look for estimators that are unbiased, efficient, and consistent.

Methods for finding estimators

The best estimator for a population parameter may not be the sample analog of the population value, and we would like to have general methods for finding estimators that are unbiased, efficient, consistent, etc. There are several such methods commonly used, for example:

the method of Moments
the method of Maximum Likelihood
the method of Least Squares
Bayesian methods

(1) Method of Moments

(2) Maximum Likelihood

Example: lambda of a Poisson. The maximum likelihood estimator of lambda is the sample mean.
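
The maximum likelihood steps for the Poisson case can be written out as follows, assuming independent observations $x_1, \dots, x_n$ from a Poisson parent with parameter $\lambda$ (notation assumed for this sketch):

```latex
\begin{aligned}
L(\lambda) &= \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} \\
\ln L(\lambda) &= -n\lambda + \left(\sum_{i=1}^{n} x_i\right)\ln\lambda
                 - \sum_{i=1}^{n} \ln(x_i!) \\
\frac{d}{d\lambda}\ln L(\lambda) &= -n + \frac{1}{\lambda}\sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
\end{aligned}
```

The second derivative, $-\sum_i x_i / \lambda^2$, is negative whenever at least one $x_i$ is positive, so the stationary point is a maximum.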

(3) Least Squares

Strictly empirical approach to estimation
Parent model includes predictor variables
Used for curve fitting (regression)
Find curve with minimal deviation from all data points
Define best fit as minimizing the sum of squared errors
Related to the mean, which minimizes SSE about itself

Examples from linear regression

Linear relationship, parameters are slope and intercept
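Empirical model (Ch. 1): $y = ax + b$
Statistical model: $y = ax + b + \epsilon$, where $\epsilon$ is "noise"
$x_i$ predictors, $y_i$ outcomes
$Y = aX + b + e$: random variable
$y_i = a x_i + b + e_i$: one observation
$e_i = y_i - (a x_i + b)$: error for one observation
SSE $= \sum_i e_i^2 = \sum_i \left(y_i - (a x_i + b)\right)^2$: sum of squared errors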
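
A minimal sketch of the least squares fit for the straight-line model, using the standard closed-form solution for the slope and intercept that minimize SSE. The data points and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b that minimize the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    return a, b

def sse(xs, ys, a, b):
    """Sum of squared errors e_i = y_i - (a * x_i + b)."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this sketch.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best_sse = sse(xs, ys, a, b)

print("slope a:", a)
print("intercept b:", b)
print("SSE at the fitted line:", best_sse)
print("SSE with a slightly different slope:", sse(xs, ys, a + 0.1, b))
```

Any other choice of slope or intercept gives a larger SSE, which is what "best fit" means under this criterion.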


Interval estimates

I. Mean of a normal distribution with "known" variance, as an example:

Where do we put $\mu$ (two places) so that $\bar{x}$ is not too extreme?
If $\mu$ is actually at point x, will it fall within the Confidence Interval? Find the limits of where $\mu$ can be:
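
For the known-variance case the limits are usually written as $\bar{x} \pm z\,\sigma/\sqrt{n}$. A minimal sketch of that calculation follows; the function name and the example numbers are assumptions for illustration, and the critical value $z$ comes from the standard library's Normal distribution.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is treated as known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # z times the standard error of the mean
    return xbar - margin, xbar + margin

# Hypothetical numbers: sample mean 10, known sigma 2, sample size 16.
lower, upper = z_interval(xbar=10.0, sigma=2.0, n=16)
print("95% interval for mu:", round(lower, 2), "to", round(upper, 2))
```

The interval is centered on the sample mean; its half-width shrinks as n grows and widens as the confidence level is raised.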

II. Mean of a normal distribution with unknown variance; use t

If n is large, use degrees of freedom "infinity", i.e. approximate by the normal distribution z.
$s/\sqrt{n}$ is the estimated standard error of the mean
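
A minimal sketch of the unknown-variance case: the interval has the same shape as before, with $s$ in place of $\sigma$ and a t critical value in place of z. The sample is made up, and the critical value 2.262 is the usual table entry for 95% two-sided confidence with 9 degrees of freedom, entered by hand because the Python standard library has no t quantile function.

```python
import math
import statistics

def t_interval(sample, t_crit):
    """Confidence interval for the mean using the estimated standard error."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
    se = s / math.sqrt(n)         # estimated standard error of the mean
    return xbar - t_crit * se, xbar + t_crit * se

# Hypothetical sample of n = 10 observations, so df = n - 1 = 9.
sample = [9.1, 10.4, 11.2, 9.8, 10.5, 8.9, 10.1, 10.8, 9.6, 9.6]

# Assumed table value: t for 95% two-sided confidence, 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262

lower, upper = t_interval(sample, T_CRIT_95_DF9)
print("95% interval for mu:", round(lower, 2), "to", round(upper, 2))
```

With a large n the table value approaches the Normal value of about 1.96, which is the "degrees of freedom infinity" approximation.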




