1
00:00:00,160 --> 00:00:02,340
The weak law of large numbers tells us

2
00:00:02,340 --> 00:00:03,640
that the sample mean--

3
00:00:03,640 --> 00:00:06,480
that is, the average of independent identically

4
00:00:06,480 --> 00:00:09,260
distributed random variables, Xi--

5
00:00:09,260 --> 00:00:12,960
converges, in a certain sense, to a number, namely the

6
00:00:12,960 --> 00:00:16,700
expected value of the random variables, Xi.

7
00:00:16,700 --> 00:00:19,750
But it does not tell us much about the details of the

8
00:00:19,750 --> 00:00:23,260
distribution of the sample mean.

9
00:00:23,260 --> 00:00:26,210
The central limit theorem provides us exactly with this

10
00:00:26,210 --> 00:00:27,590
kind of detail.

11
00:00:27,590 --> 00:00:31,470
It tells us that the sum of many independent identically

12
00:00:31,470 --> 00:00:35,260
distributed random variables has approximately a normal

13
00:00:35,260 --> 00:00:36,840
distribution.

14
00:00:36,840 --> 00:00:40,300
The mean and variance of this normal are easy to find if we

15
00:00:40,300 --> 00:00:43,820
know the mean and variance of the original random variables.

16
00:00:43,820 --> 00:00:48,330
This enables us to carry out approximate calculations

17
00:00:48,330 --> 00:00:52,650
rather quickly by using the normal tables.

18
00:00:52,650 --> 00:00:55,630
We will start with a precise statement of the central limit

19
00:00:55,630 --> 00:01:00,290
theorem, and we will emphasize that it is a universal result.

20
00:01:00,290 --> 00:01:03,870
It holds no matter what the distribution of the original

21
00:01:03,870 --> 00:01:08,970
random variables, and for this reason, it is very useful.

22
00:01:08,970 --> 00:01:12,300
We will work through several examples of the typical ways

23
00:01:12,300 --> 00:01:15,200
that the central limit theorem is used.

24
00:01:15,200 --> 00:01:18,730
We will develop a refinement that can be used when we are

25
00:01:18,730 --> 00:01:21,940
dealing with discrete distributions, which provides

26
00:01:21,940 --> 00:01:25,300
us with even more accurate approximations.

27
00:01:25,300 --> 00:01:29,190
And finally we will revisit the polling problem, and

28
00:01:29,190 --> 00:01:32,800
inquire again about the number of samples that are needed to

29
00:01:32,800 --> 00:01:37,160
obtain a certain accuracy with a certain confidence.

30
00:01:37,160 --> 00:01:40,140
We will see that the central limit theorem is much more

31
00:01:40,140 --> 00:01:44,520
informative, much less conservative, compared to the

32
00:01:44,520 --> 00:01:48,050
conclusions that we had gotten before based on the Chebyshev

33
00:01:48,050 --> 00:01:49,300
inequality.
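
Editor's illustration (not part of the spoken transcript): the closing remark about the polling problem can be made concrete with a small sketch. It assumes the usual polling setup, in which each sample is a Bernoulli random variable with unknown parameter p, so its variance is at most 1/4, and we want the sample mean to be within `eps` of p with probability at least `1 - delta`. The function names and the values `eps = 0.01` and `delta = 0.05` are hypothetical choices for illustration and do not come from the lecture.

```python
import math
from statistics import NormalDist


def chebyshev_sample_size(eps: float, delta: float) -> int:
    """Samples sufficient under the Chebyshev bound (assumed polling setup).

    Chebyshev gives P(|M_n - p| >= eps) <= p(1-p) / (n eps^2) <= 1 / (4 n eps^2),
    so requiring the right-hand side to be at most delta yields n >= 1 / (4 eps^2 delta).
    """
    return math.ceil(1.0 / (4.0 * eps**2 * delta))


def clt_sample_size(eps: float, delta: float) -> int:
    """Samples suggested by the normal approximation (assumed polling setup).

    The CLT approximation is P(|M_n - p| >= eps) ~ 2 (1 - Phi(eps sqrt(n) / sigma)),
    with sigma <= 1/2 for a Bernoulli sample, so n >= (z / (2 eps))^2,
    where z is the standard normal quantile at 1 - delta/2.
    """
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.ceil((z / (2.0 * eps)) ** 2)


# Hypothetical accuracy and confidence targets, chosen only for illustration.
eps, delta = 0.01, 0.05
print("Chebyshev bound:     n =", chebyshev_sample_size(eps, delta))
print("Normal approximation: n =", clt_sample_size(eps, delta))
```

Under these assumed targets the Chebyshev-based count comes out several times larger than the count from the normal approximation, which is the sense in which the Chebyshev conclusion is more conservative. The normal-approximation figure is only an approximation, not a guaranteed bound.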