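Statistics (I), Chapter 6: Distributions of Sample Statistics
Instructor: Prof. 唐麗英, Department of Industrial Engineering and Management, National Chiao Tung University
Tel: (03)5731896, e-mail: litong@cc.nctu.edu.tw
2013. These lecture notes may not be reproduced without permission.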
References for this course
Textbook:
Newbold, P., Carlson, W. L., and Thorne, B. (2007). Statistics for Business and Economics, 7th Edition, Pearson.
Further reading:
Berenson, M. L., Levine, D. M., and Krehbiel, T. C. (2009). Basic Business Statistics: Concepts and Applications, 11th Edition, Prentice Hall.
Larson, H. J. (1982). Introduction to Probability Theory and Statistical Inference, 3rd Edition, Wiley.
Miller, I., Freund, J. E., and Johnson, R. A. (2000). Miller and Freund's Probability and Statistics for Engineers, 6th Edition, Prentice Hall.
Montgomery, D. C., and Runger, G. C. (2011). Applied Statistics and Probability for Engineers, 5th Edition, Wiley.
Watson, C. J. (1997). Statistics for Management and Economics, 5th Edition, Prentice Hall.
唐麗英, 王春和 (2013). 從範例學 MINITAB 統計分析與應用, 博碩文化公司.
唐麗英, 王春和 (2008). SPSS 統計分析, 儒林圖書公司.
唐麗英, 王春和 (2007). Excel 統計分析, 2nd Edition, 儒林圖書公司.
唐麗英, 王春和 (2005). STATISTICA 與基礎統計分析, 儒林圖書公司.
Statistics (I), lecture notes by Prof. 唐麗英
Sampling Distributions
Sampling Distributions
Recall: parameter and statistic
A quantity computed from the observations in a population is called a parameter.
A quantity computed from the observations in a sample is called a statistic.
Example:
1) $\mu$, $\sigma$ and $P$ are parameters.
2) $\bar{X}$, $S$ and $\hat{p}$ are statistics.
Sampling Distributions
Sampling distribution
The probability distribution of a statistic that results when random samples of size n are repeatedly drawn from a given population is called the sampling distribution of the statistic.
Example:
The distribution of the sample mean $\bar{X}$ is one type of sampling distribution.
The distribution of the sample proportion $\hat{p}$ is another type of sampling distribution.
The Sampling Distribution of the Sample Mean
Sampling Distribution of the Sample Mean
1) What is the sampling distribution of the sample mean $\bar{X}$ when $\sigma$ is known? That is, how does the statistic $\bar{X}$ behave in repeated sampling?
Sampling Distribution of the Sample Mean
Example 1: Suppose a population consists of four numbers (N = 4): 1, 2, 3, 4.
Since the four values are distinct, the population probability distribution assigns an equal probability of 1/4 to each value of x in the population.

x     P(x)    xP(x)    x^2    x^2 P(x)
1     1/4     1/4      1      1/4
2     1/4     2/4      4      4/4
3     1/4     3/4      9      9/4
4     1/4     4/4      16     16/4

[Figure: bar chart of the population distribution, P(x) = 1/4 at x = 1, 2, 3, 4 (uniform distribution)]
The population mean and variance are
$\mu = \sum_{i=1}^{N} x_i P(x_i) = 2.5$, $\sigma^2 = \sum_{i=1}^{N} x_i^2 P(x_i) - \mu^2 = 7.5 - 2.5^2 = 1.25$
Sampling Distribution of the Sample Mean
Step 1: Take a random sample of size 2 with replacement from the population.
How many possible samples are there? 4 * 4 = 16 possible samples.

Sample    Sample mean    Sample    Sample mean
(1, 1)    1              (3, 1)    2
(1, 2)    1.5            (3, 2)    2.5
(1, 3)    2              (3, 3)    3
(1, 4)    2.5            (3, 4)    3.5
(2, 1)    1.5            (4, 1)    2.5
(2, 2)    2              (4, 2)    3
(2, 3)    2.5            (4, 3)    3.5
(2, 4)    3              (4, 4)    4
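Sampling Distribution of the Sample Mean
Step 2: Construct the probability distribution for the sample mean $\bar{X}$.

Sample mean    1       1.5     2       2.5     3       3.5     4
Probability    1/16    2/16    3/16    4/16    3/16    2/16    1/16

[Figure: bar chart of the distribution of the sample mean; its shape is close to a Normal distribution]
For this probability distribution of $\bar{X}$, the mean and the variance of the sample mean are:
$\mu_{\bar{X}} = \sum \bar{x} P(\bar{x}) = 2.5$, $\sigma_{\bar{X}}^2 = \sum \bar{x}^2 P(\bar{x}) - \mu_{\bar{X}}^2 = 6.875 - 2.5^2 = 0.625$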
Sampling Distribution of the Sample Mean
Conclusions:
1) $\mu_{\bar{X}} = \mu$
2) $\sigma_{\bar{X}}^2 = \sigma^2 / n$, where n is the sample size.
3) Whether the distribution of the original population is normal or not, the distribution of the sample mean is close to Normal.
Note: As n gets larger, the distribution of $\bar{X}$ gets closer to the Normal distribution.
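The first two conclusions can be checked by brute force for the four-number population of Example 1. The sketch below is only an illustration (the variable names are chosen here, not taken from the slides): it enumerates all 16 ordered samples of size 2 drawn with replacement and compares the mean and variance of the sample mean with $\mu$ and $\sigma^2/n$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2  # sample size, drawn with replacement

# Population mean and variance (each value has probability 1/N).
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

# Enumerate all N**n equally likely ordered samples and tabulate the sample mean.
samples = list(product(population, repeat=n))
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {xbar: Fraction(c, len(samples)) for xbar, c in sorted(counts.items())}

# Mean and variance of the sampling distribution of the sample mean.
mean_xbar = sum(xbar * p for xbar, p in dist.items())
var_xbar = sum((xbar - mean_xbar) ** 2 * p for xbar, p in dist.items())

for xbar, p in dist.items():
    print(float(xbar), p)
print("mean of sample mean:", mean_xbar, "| population mean:", mu)
print("variance of sample mean:", var_xbar, "| sigma^2 / n:", sigma2 / n)
```

Exact fractions are used so that the equalities hold without rounding error.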
Central Limit Theorem
The following figures give the sampling distributions of $\bar{X}$ for four different population probability distributions with n = 2, 6, and 30, respectively.
Central Limit Theorem
The Central Limit Theorem (C.L.T.)
If random samples of n observations are drawn from a population with mean $\mu$ and standard deviation $\sigma$, then when n is large ($n \ge 30$), the sampling distribution of $\bar{X}$ is approximately normal with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.
That is, $\bar{X} \sim N(\mu, \sigma^2 / n)$ approximately, if $n \ge 30$.
The approximation becomes more and more accurate as n becomes large.
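A small simulation can make the theorem concrete. The sketch below is an illustration under assumptions made here, not part of the slides: the population is taken to be Exponential with mean 1 (so $\mu = \sigma = 1$ and the population is clearly non-normal), the sample size is n = 30, and the seed and number of repetitions are arbitrary.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

n = 30          # sample size
reps = 20_000   # number of repeated samples
mu = 1.0        # mean of the assumed Exponential(1) population
sigma = 1.0     # standard deviation of the same population

# Draw many samples of size n and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = sqrt(statistics.fmean([(m - mean_of_means) ** 2 for m in sample_means]))
se = sigma / sqrt(n)  # standard deviation of the sample mean predicted by the theorem

# Under normality about 95% of the sample means fall within 1.96 standard errors of mu.
inside = sum(abs(m - mu) < 1.96 * se for m in sample_means) / reps

print(f"mean of sample means: {mean_of_means:.4f} (theory: {mu})")
print(f"s.d. of sample means: {sd_of_means:.4f} (theory: {se:.4f})")
print(f"share within mu +/- 1.96 s.e.: {inside:.3f} (normal theory: 0.95)")
```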
Central Limit Theorem
Remark:
1. If the population is normal, then the distribution of the sample mean $\bar{X}$ will always be normal, regardless of the sample size n.
2. If $\bar{X} \sim N(\mu, \sigma^2 / n)$, then $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$.
Central Limit Theorem
Source: Newbold, P., Carlson, W. L., and Thorne, B. (2007). Statistics for Business and Economics, 7th Edition, Pearson.
Sampling Distribution of the Sample Mean
Example 1: Suppose that X follows a distribution with mean $\mu = 10$ and variance $\sigma^2 = 4$. A sample of size 5 is drawn from this population. What is the probability distribution of $\bar{X}$?
Application of the Central Limit Theorem
Example 2: The average vitamin B- content of a certain brand of vitamins is 30 mg with a standard deviation of mg. A quality control inspector selects 36 pills for testing. What is the probability that the average vitamin B- content of these 36 pills is less than 8 mg?
Application of the Central Limit Theorem
Example 3: If a 1-gallon can of a certain kind of paint covers on average 513.3 square feet with a standard deviation of 31.5 square feet, what is the probability that the mean area covered by a sample of 40 of these 1-gallon cans will be anywhere from 510.0 to 520.0 square feet?
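One way to work the paint example is to apply the Central Limit Theorem directly, treating $\bar{X}$ as normal with mean 513.3 and standard deviation $31.5/\sqrt{40}$. The sketch below assumes the interval of interest is 510.0 to 520.0 square feet and uses Python's `statistics.NormalDist` instead of a Z table, so the result differs slightly from a table lookup with z rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 513.3, 31.5, 40
lower, upper = 510.0, 520.0  # assumed interval for the sample mean

se = sigma / sqrt(n)          # standard deviation of the sample mean
xbar = NormalDist(mu, se)     # approximate sampling distribution of the sample mean

z_lower = (lower - mu) / se
z_upper = (upper - mu) / se
prob = xbar.cdf(upper) - xbar.cdf(lower)

print(f"standard error = {se:.3f}")
print(f"z-values: {z_lower:.2f} and {z_upper:.2f}")
print(f"P({lower} < sample mean < {upper}) = {prob:.4f}")
```

The probability comes out at roughly 0.66.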
Application of the Central Limit Theorem
Note: When the sample size n is not a small fraction of a finite population of size N, the sampling distribution of $\bar{X}$ is approximately Normal with
mean $\mu_{\bar{X}} = E(\bar{X}) = \mu$
variance $\sigma_{\bar{X}}^2 = Var(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$
Note: $\frac{N-n}{N-1}$ is called the finite population correction (fpc) factor.
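The variance formula with the finite population correction can be verified exactly on a tiny population. The sketch below reuses the population 1, 2, 3, 4 from the earlier example (a choice made here for illustration) and draws samples of size 2 without replacement, which is the setting in which the correction applies.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4]
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All C(N, n) equally likely samples drawn without replacement.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
mean_xbar = sum(means) / len(means)
var_xbar = sum((m - mean_xbar) ** 2 for m in means) / len(means)

fpc = Fraction(N - n, N - 1)  # finite population correction factor

print("mean of sample mean:", mean_xbar)
print("variance of sample mean:", var_xbar)
print("(sigma^2 / n) * fpc:", sigma2 / n * fpc)
```

Here half of the population is sampled, so the correction factor 2/3 reduces the variance noticeably compared with sampling with replacement.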
Sampling Distribution of the Sample Proportion
What is the sampling distribution of the sample proportion $\hat{p}$?
$P$: population proportion
$\hat{p}$: sample proportion = x/n = number of successes / number of trials
Theorem: Sampling distribution of $\hat{p}$
When the sample size n is large, the sampling distribution of $\hat{p}$ is approximately normal with mean $P$ and standard deviation $\sqrt{PQ/n}$, where $Q = 1 - P$.
$\hat{p} \sim N(P, PQ/n)$
Sampling Distribution of the Sample Proportion
Example 1: A production line at a manufacturing company produces 10% defective items. If a sample of n = 64 items is taken, what is the probability that the sample defective rate is less than 8%?
Sampling Distribution of the Sample Proportion
Example 2 (see Example 6.7 on p. 89 of the textbook): A random sample of 270 homes was taken from a large population of older homes to estimate the proportion of homes with unsafe wiring. If, in fact, 20% of the homes have unsafe wiring, what is the probability that the sample proportion will be between 16% and 24% of homes with unsafe wiring?
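Sampling Distribution of the Sample Proportion
[Ans]
$P = 0.20$, $n = 270$
$\sigma_{\hat{p}} = \sqrt{\frac{P(1-P)}{n}} = \sqrt{\frac{0.20(1-0.20)}{270}} = 0.024$
$P(0.16 < \hat{p} < 0.24) = P\left(\frac{0.16 - P}{\sigma_{\hat{p}}} < \frac{\hat{p} - P}{\sigma_{\hat{p}}} < \frac{0.24 - P}{\sigma_{\hat{p}}}\right) = P(-1.67 < Z < 1.67) = 0.9050$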
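The same calculation can be reproduced numerically. The sketch below takes P = 0.20 and n = 270, the values that give the standard deviation 0.024 used in the answer, and shows two results: one with the rounded intermediate values used for a table lookup, and one without rounding. The small gap between them comes only from rounding.

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.20, 270
lower, upper = 0.16, 0.24

se = sqrt(P * (1 - P) / n)  # standard deviation of the sample proportion
std_normal = NormalDist()   # standard normal Z

# Without rounding any intermediate value.
z = (upper - P) / se
prob_exact = std_normal.cdf(z) - std_normal.cdf(-z)

# With the rounding used for a table lookup: s.e. to 3 decimals, z to 2 decimals.
z_table = round((upper - P) / round(se, 3), 2)
prob_table = std_normal.cdf(z_table) - std_normal.cdf(-z_table)

print(f"standard error = {se:.4f}")
print(f"unrounded:  z = {z:.3f}, probability = {prob_exact:.4f}")
print(f"table-like: z = {z_table:.2f}, probability = {prob_table:.4f}")
```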
Sampling Distribution of the Sample Mean
2) What is the sampling distribution of the sample mean $\bar{X}$ when $\sigma$ is unknown?
Sampling Distribution of the Sample Mean
Theorem
If $\bar{X}$ is the mean of a random sample of size n taken from a normal population having mean $\mu$ and variance $\sigma^2$, the sample statistic
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
has a t-distribution with degrees of freedom (d.f.) $\nu = n - 1$.
Note: The t-distribution is also called Student's t-distribution.
Student's t Distribution
What is Student's t-distribution?
The probability distribution of the t statistic was first published in 1908 in a paper by W. S. Gosset. At the time, Gosset was employed by an Irish brewery that disallowed publication of research by members of its staff. To circumvent this restriction, he published his work under the pen name "Student". Consequently, the distribution of t is usually called Student's t-distribution, or simply the t-distribution.
Source: http://en.wikipedia.org/wiki/william_sealy_gosset
Student's t Distribution
Properties of the t-distribution
The t-distribution is very much like the Z-distribution.
[Figure: density curves of the Z-distribution and the t-distribution]
Comparison of the t-distribution and the Z-distribution:
1) Both are symmetric and bell-shaped.
2) Both have a mean of 0.
3) t is more variable than Z in repeated sampling. (There is more area in the tails of the t-distribution, and the Z-distribution is higher in the middle.)
4) As the number of d.f. increases without limit (i.e., as n increases), the t-distribution approaches the Z-distribution.
Degrees of Freedom
What are degrees of freedom?
We use the degrees of freedom as a measure of sample information. For example, we say that the t statistic has n-1 degrees of freedom. Why?
There are n degrees of freedom, or independent pieces of information, in a random sample of size n from the normal distribution. In calculating $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$, we do not know $\sigma$ and need to use the sample data to estimate it. Because the data (the values in the sample) are first used to compute the mean $\bar{X}$ needed for $S^2 = \sum_{i=1}^{n} (x_i - \bar{X})^2 / (n - 1)$, there is 1 less degree of freedom in the information used to estimate $\sigma$.
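The loss of one degree of freedom can be seen on a small data set. In the sketch below the four sample values are hypothetical, chosen only for illustration: once the sample mean is fixed, the deviations from it must sum to zero, so the last deviation is determined by the other n-1.

```python
from statistics import fmean, variance

sample = [4.0, 7.0, 1.0, 8.0]  # hypothetical sample, n = 4
n = len(sample)

xbar = fmean(sample)
deviations = [x - xbar for x in sample]

# The deviations always sum to zero, so only n - 1 of them are free to vary:
# the last one can be recovered from the others.
last_from_others = -sum(deviations[:-1])

# Sample variance with the n - 1 divisor.
s2 = sum(d * d for d in deviations) / (n - 1)

print("deviations from the mean:", deviations)
print("last deviation recovered from the rest:", last_from_others)
print("sample variance S^2:", s2)
```

The standard-library `statistics.variance` uses the same n-1 divisor, so it returns the same value as `s2` here.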
Student's t Distribution
t-distribution table
Table 8 in the Appendix Tables of the textbook (page 866) gives the value of $t_\alpha$ which locates an area of $\alpha$ in the upper tail of the t-distribution, for various values of $\alpha$ and for d.f. ranging from 1 to $\infty$.
Note: When d.f. $\ge 29$ (or $n \ge 30$), the t-distribution is very close to the Z-distribution.
[Figure: t-distribution density with the upper-tail area $\alpha$ to the right of $t_\alpha$]
Student's t Distribution
Example 1: Find $t_\alpha$ and $t_{\alpha/2}$ when $\alpha = 0.05$ and n = 6.
[Ans] $t_{.05, 5} = 2.015$, $t_{.025, 5} = 2.571$.
Student's t Distribution
Example 2: Find $t_\alpha$ and $t_{\alpha/2}$ when $\alpha = 0.01$ and n = 20.
[Ans] $t_{.01, 19} = 2.539$, $t_{.005, 19} = 2.861$.
Student's t Distribution
Example 3: Find $t_\alpha$ and $t_{\alpha/2}$ when $\alpha = 0.10$ and n=4.
[Ans] $t_{.10, \infty} = 1.282$, $t_{.05, \infty} = 1.645$.
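Table values such as $t_{.05, 5}$ can also be approximated numerically. The sketch below is one possible approach, not the method behind the textbook table: it integrates the t density with Simpson's rule and finds the critical value by bisection. The helper names `t_pdf`, `t_upper_tail` and `t_critical` are made up for this illustration, and only the standard library is used.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t > 0: one half minus the Simpson's-rule integral over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Find t_alpha with P(T > t_alpha) = alpha by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # tail still too large, so the critical value lies further right
        else:
            hi = mid
    return (lo + hi) / 2

print(f"t(.05, 5)   = {t_critical(0.05, 5):.3f}")
print(f"t(.025, 5)  = {t_critical(0.025, 5):.3f}")
print(f"t(.01, 19)  = {t_critical(0.01, 19):.3f}")
print(f"t(.005, 19) = {t_critical(0.005, 19):.3f}")
```

The printed values should agree with the answers to Examples 1 and 2 to about three decimals.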
Sampling Distribution of the Sample Variance, $S^2$
$\chi^2$-Distribution
If $S^2$ is the variance of a random sample of size n taken from a Normal population having variance $\sigma^2$, then
$\chi^2 = \frac{(n-1) S^2}{\sigma^2}$
has a $\chi^2$ (Greek letter chi) distribution with d.f. $\nu = n - 1$.
Table 7 on page 865 of the Appendix Tables gives the value of $\chi^2_\alpha$ which locates an area of $\alpha$ in the upper tail of the $\chi^2$-distribution, for various values of $\alpha$ and d.f.
$\chi^2$-Distribution Table
Example 1: If n = 20, use Table 7 (p. 865) to determine $\chi^2_{0.05}$.
[Figure: $\chi^2$ density with the upper-tail area $\alpha$ to the right of $\chi^2_\alpha$]
$\chi^2$-Distribution
Example 2: Consider a cannery that produces 8-ounce cans of processed corn. Quality control engineers have determined that the process is operating properly when the true variance $\sigma^2$ of the fill amount per can is less than 0.0025. A random sample of n = 10 cans is selected from a day's production, and the fill amount (in ounces) is recorded for each. Of interest is the sample variance, $S^2$. If, in fact, $\sigma^2 = 0.001$, find the probability that $S^2$ exceeds 0.0025. Assume that the fill amounts are normally distributed.
$\chi^2$-Distribution
[Ans] (1/2)
We want to calculate $P(S^2 > 0.0025)$. Assume the sample of 10 fill amounts is selected from a normal distribution. Then
$\chi^2 = \frac{(n-1) S^2}{\sigma^2}$
has a chi-square probability distribution with $\nu = n - 1$ degrees of freedom. Consequently, the probability we seek can be written
$P(S^2 > 0.0025) = P\left(\frac{(n-1) S^2}{\sigma^2} > \frac{(n-1)(0.0025)}{\sigma^2}\right) = P\left(\chi^2 > \frac{(n-1)(0.0025)}{\sigma^2}\right)$
Substituting n = 10 and $\sigma^2 = 0.001$, we have
$P(S^2 > 0.0025) = P\left(\chi^2 > \frac{9(0.0025)}{0.001}\right) = P(\chi^2 > 22.5)$.
$\chi^2$-Distribution
[Ans] (2/2)
We want to find the probability $\alpha$ such that $\chi^2_\alpha = 22.5$. For n = 10 ($\nu = 9$), the table gives $\chi^2_{0.01} = 21.666$ and $\chi^2_{0.005} = 23.589$, i.e.,
$0.005 < P(\chi^2 > 22.5) < 0.01$
[Figure: $\chi^2$ density with the upper-tail area $\alpha$ to the right of 22.5, where $0.005 < \alpha < 0.01$]
Thus, the probability that the variance of the sample fill amounts exceeds 0.0025 is small (between 0.005 and 0.01) when the true population variance $\sigma^2$ equals 0.001.
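The table only brackets this probability; a numerical evaluation narrows it down. The sketch below takes the cut-off as 22.5 with 9 degrees of freedom and computes the upper-tail area from the series expansion of the regularized lower incomplete gamma function. The function name `chi2_upper_tail` is made up for this illustration, and the approach is one possible choice rather than the textbook's method.

```python
from math import exp, lgamma, log

def chi2_upper_tail(x, df, terms=500):
    """P(chi-square with df degrees of freedom > x) for x > 0.

    Uses P(a, z) = z^a e^(-z) / Gamma(a + 1) * sum_k z^k / ((a+1)...(a+k)),
    the regularized lower incomplete gamma function, with a = df/2 and z = x/2.
    """
    a, z = df / 2, x / 2
    term = total = 1.0
    for k in range(1, terms):
        term *= z / (a + k)
        total += term
    lower = total * exp(a * log(z) - z - lgamma(a + 1))
    return 1.0 - lower

n, sigma2, threshold = 10, 0.001, 0.0025
cutoff = (n - 1) * threshold / sigma2  # (n - 1) * 0.0025 / 0.001, i.e. about 22.5
prob = chi2_upper_tail(cutoff, n - 1)

print(f"cut-off = {cutoff:.1f}")
print(f"P(chi-square_9 > {cutoff:.1f}) = {prob:.4f}")
# Check against the two table values that bracket the cut-off.
print(f"tail beyond 21.666: {chi2_upper_tail(21.666, 9):.4f}")
print(f"tail beyond 23.589: {chi2_upper_tail(23.589, 9):.4f}")
```

The computed probability lies between 0.005 and 0.01, in line with the table-based bound.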
$\chi^2$-Distribution
Example 3 (see Example 6.10 on p. 97 of the textbook): Shirley Mendez is the manager of quality assurance for Green Valley Foods Inc., a packer of frozen vegetable products. Shirley wants to be sure that the variation of package weight is small so that the company does not produce a large proportion of packages that are under the stated package weight. She has asked you to obtain upper and lower limits for the ratio of the sample variance divided by the population variance for a random sample of n = 20 observations. The limits are such that the probability that the ratio is below the lower limit is 0.025 and the probability that the ratio is above the upper limit is 0.025. Thus, 95% of the ratios will be between these limits. The population distribution can be assumed to be normal.
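$\chi^2$-Distribution
[Ans] (1/2)
We need limits $K_L$ and $K_U$ such that
$P\left(\frac{s^2}{\sigma^2} < K_L\right) = 0.025$ and $P\left(\frac{s^2}{\sigma^2} > K_U\right) = 0.025$
Multiplying both sides of each inequality by $n - 1 = 19$:
$0.025 = P\left(\frac{(n-1) s^2}{\sigma^2} < (n-1) K_L\right) = P[\chi^2_{19} < (n-1) K_L]$
$0.025 = P\left(\frac{(n-1) s^2}{\sigma^2} > (n-1) K_U\right) = P[\chi^2_{19} > (n-1) K_U]$
From the table, $\chi^2_{19, L} = 8.91$ and $\chi^2_{19, U} = 32.85$.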
$\chi^2$-Distribution
[Ans] (2/2)
Setting $(n-1) K_L = 8.91$ and $(n-1) K_U = 32.85$ with $n - 1 = 19$ gives
$K_L = 8.91 / 19 = 0.469$, $K_U = 32.85 / 19 = 1.729$
The 95% acceptance interval for the ratio of the sample variance divided by the population variance is as follows:
$P\left(0.469 \le \frac{s^2}{\sigma^2} \le 1.729\right) = 0.95$
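The limits can be recomputed without a table by inverting the chi-square distribution function numerically. The sketch below assumes n = 20 (19 degrees of freedom) and tail areas of 0.025 on each side; `chi2_cdf` and `chi2_quantile` are helper names made up for this illustration, built on the series for the regularized lower incomplete gamma function plus bisection.

```python
from math import exp, lgamma, log

def chi2_cdf(x, df, terms=500):
    """P(chi-square with df degrees of freedom <= x) for x > 0, via a series expansion."""
    a, z = df / 2, x / 2
    term = total = 1.0
    for k in range(1, terms):
        term *= z / (a + k)
        total += term
    return total * exp(a * log(z) - z - lgamma(a + 1))

def chi2_quantile(p, df):
    """Value q with P(chi-square <= q) = p, found by bisection on [0, 200]."""
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 20
df = n - 1
chi2_lower = chi2_quantile(0.025, df)  # area 0.025 to the left
chi2_upper = chi2_quantile(0.975, df)  # area 0.025 to the right

k_lower = chi2_lower / df
k_upper = chi2_upper / df

print(f"chi-square limits: {chi2_lower:.2f} and {chi2_upper:.2f}")
print(f"K_L = {k_lower:.3f}, K_U = {k_upper:.3f}")
```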
Sampling Distribution of the Ratio of Sample Variances, $s_1^2 / s_2^2$
F-Distribution
Let $\chi_1^2$ and $\chi_2^2$ be two independent chi-square random variables with $\nu_1$ and $\nu_2$ degrees of freedom, respectively. Then
$F = \frac{\chi_1^2 / \nu_1}{\chi_2^2 / \nu_2}$
has an F distribution with $\nu_1$ numerator d.f. and $\nu_2$ denominator d.f.
Theorem: If $s_1^2$ and $s_2^2$ are the variances of independent random samples of sizes $n_1$ and $n_2$ taken from two normal populations having the same variance, then
$F = \frac{s_1^2}{s_2^2}$
has an F distribution with d.f. $(\nu_1, \nu_2) = (n_1 - 1, n_2 - 1)$.
F-Distribution Table
Table 9 on pages 867-869 of the Appendix Tables gives the value of $F_\alpha$ which locates an area of $\alpha$ in the upper tail of the F-distribution, for various values of $\alpha$ and d.f.
Example 4: If $n_1 = 7$ and $n_2 = 13$, use Table 9 to determine $F_{.01}$.
[Figure: F density with the upper-tail area $\alpha$ to the right of $F_\alpha$]
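For $n_1 = 7$ and $n_2 = 13$ the degrees of freedom are (6, 12). The sketch below approximates the upper 1% point numerically, as a cross-check on a table lookup. It integrates the F density with Simpson's rule and locates the critical value by bisection; the helper names are made up for this illustration, and the density helper assumes a numerator d.f. greater than 2 so that the density is 0 at x = 0.

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """Density of the F-distribution with (d1, d2) d.f.; assumes d1 > 2."""
    if x <= 0:
        return 0.0
    log_c = (lgamma((d1 + d2) / 2) - lgamma(d1 / 2) - lgamma(d2 / 2)
             + (d1 / 2) * log(d1 / d2))
    return exp(log_c + (d1 / 2 - 1) * log(x)
               - (d1 + d2) / 2 * log(1 + d1 * x / d2))

def f_upper_tail(x, d1, d2, steps=2000):
    """P(F > x): one minus the Simpson's-rule integral of the density over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

def f_critical(alpha, d1, d2):
    """Find F_alpha with P(F > F_alpha) = alpha by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_upper_tail(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n1, n2 = 7, 13
f_01 = f_critical(0.01, n1 - 1, n2 - 1)
print(f"F(.01; {n1 - 1}, {n2 - 1}) = {f_01:.2f}")
```

The result is close to 4.82.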
End of this unit