## General Teaching – Understanding Confidence Intervals in Statistics (Wednesday 11/4/09)

Note – This is the first in a series of General Teaching articles that will appear each Wednesday.

This article is about an activity I used to help my students understand confidence intervals. I’ll begin with a little background information on my class. It’s been a trying semester, to say the least. A large number of students have struggled on the exams, and I’ve been perplexed by this. We just switched to a new textbook, so I’m unsure whether the book or its approach is to blame. It could be me, but I leave every day thinking that the lecture was good. I’ve tried collecting written homework and giving daily quizzes, with little to no effect.

I decided to make a last-ditch attempt to save the semester by incorporating more in-class activities. We had spent three days constructing confidence intervals. In this course we create z intervals for the population mean, t intervals for the population mean, and z intervals for the population proportion.

Problem 1

I broke the class into 10 groups of size 4. For the first problem, I generated 10 sets of 57 data values and told the class that the data was generated from a normally distributed population whose standard deviation is 15. On the computer screen I displayed the sample mean for each group, and told them to generate a 90% confidence interval for the mean. As each group finished they came to the board to write down the confidence interval. With some quick Excel work I generated the 10 intervals and put them on the screen, and each group was able to verify their result.
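
For readers who would rather check the arithmetic outside of Excel, here is a minimal Python sketch of the z interval each group computed. The function name and the sample mean of 98.7 are made up for illustration; only the standard deviation of 15, the sample size of 57, and the 90% level come from the activity.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.90):
    """Return (lower, upper) for a z interval for the population mean."""
    # The critical value z* leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample mean for one group; sigma = 15 and n = 57 are from the activity.
lower, upper = z_interval(sample_mean=98.7, sigma=15, n=57)
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```

Because the standard deviation and sample size are the same for every group, all ten intervals have the same width; only the center moves with each group's sample mean.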

At this point I announced to the class that the mean of the population was 100, which was the value that I used to generate the random data in Excel. I then called on each group to declare whether their interval contained the true population mean – 9 of the groups said yes and 1 said no. We talked about how, if we repeated the process over and over, approximately 90% of the intervals created this way would contain the population mean. It was a happy accident that exactly 90% of our intervals were successful, but the important point was that there is no guarantee that a confidence interval contains the population parameter. They were also able to see firsthand that samples, and the confidence intervals generated by those samples, can vary.
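
The "repeat the process over and over" idea is easy to simulate. The sketch below is a rough Python stand-in for the Excel worksheet, not the worksheet itself: the function name, the number of trials, and the use of Python's random number generator are all my assumptions. It draws many samples of 57 values from a normal population with mean 100 and standard deviation 15 and counts how often the 90% z interval captures the mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(trials, n=57, mu=100, sigma=15, confidence=0.90, seed=None):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One simulated group: n values from the normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


print(coverage_rate(trials=10))    # ten groups: the rate can easily miss 0.9
print(coverage_rate(trials=5000))  # many repetitions: the rate settles near 0.9
```

With only ten intervals, results such as 8 out of 10 or 10 out of 10 are common, which is why hitting exactly 9 in class was luck rather than a guarantee.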

Problem 2

For the second problem, I generated 10 sets of 23 data values and told the class that the data was generated from a population that was approximately normally distributed, and that the population standard deviation was unknown. In this case, the students understood that they needed to construct a t interval. On the computer screen I displayed the sample mean and sample standard deviation for each group, and told them to generate a 90% confidence interval for the mean. Again, as each group finished they came to the board to write down the confidence interval. Using Excel I generated the 10 intervals and put them on the screen, and each group was able to verify their result.
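
Here is a small Python sketch of the t interval calculation, under a couple of assumptions. The critical value 1.717 is what a standard t table gives for 22 degrees of freedom with 0.05 in the upper tail, and the sample mean and sample standard deviation below are invented for illustration. Only the sample size of 23 and the 90% level come from the activity.

```python
from math import sqrt

# Critical value read from a t table: 0.05 in the upper tail, df = 23 - 1 = 22.
T_STAR_90_DF22 = 1.717


def t_interval(sample_mean, sample_sd, n, t_star):
    """Return (lower, upper) for a t interval for the population mean."""
    margin = t_star * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical summary statistics for one group; n = 23 is from the activity.
lower, upper = t_interval(sample_mean=103.2, sample_sd=14.1, n=23,
                          t_star=T_STAR_90_DF22)
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```

Unlike the z intervals in Problem 1, these intervals differ in width from group to group, because each group plugs in its own sample standard deviation.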

At this point I announced to the class that the mean of the population was 100, although I actually used 105 as the population mean when I generated the random data in Excel. I then called on each group to declare whether their interval contained the true population mean – 4 of the groups said yes and 6 said no. I then revealed that the actual mean was 105, and they saw that most of their intervals did contain this value. We then talked about what sort of conclusions we can draw when the claimed population mean is outside of our confidence interval.

I got the feeling that they were not only enjoying this active format, but also that they were gaining a much more intuitive understanding of confidence intervals and their interpretation.

Problem 3

For the last problem, I generated 10 sets of 500 random values between 0 and 1. I converted each of these decimal values to a 1 (if the random number was less than 0.4) or a 0 (otherwise). A quick sum of each column gave the groups the number of males in a survey of 500 students at our college, and I informed them that they were to construct a 90% confidence interval for the proportion of male students at our school. On the computer screen I displayed the sample proportion (p-hat), the margin of error, and the lower and upper bounds of the confidence intervals for each group.
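
The same steps can be sketched in Python. This is an approximation of what the spreadsheet does, not a copy of it: the function names are hypothetical, and Python's random number generator stands in for Excel's. Each simulated response counts as a 1 when the random value falls below 0.4, and the interval is the usual z interval for a proportion.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_survey(n=500, p=0.4, rng=random):
    """Count of 1s ("male") among n simulated responses."""
    return sum(1 for _ in range(n) if rng.random() < p)


def proportion_interval(successes, n, confidence=0.90):
    """Return (p_hat, margin, lower, upper) for a z interval for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


males = simulate_survey()
p_hat, margin, lower, upper = proportion_interval(males, 500)
print(f"p-hat = {p_hat:.3f}, margin = {margin:.3f}, "
      f"90% CI: ({lower:.3f}, {upper:.3f})")
```

With 500 responses the margin of error comes out to roughly 3.6 percentage points, so each interval is narrow enough to separate nearby claims about the true proportion.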

At this point I announced to the class that the actual percentage of males at our college was 50%, and then asked them if they thought I was lying. Since only one interval contained 50%, they realized that it was unlikely that 50% was the actual percentage of males. I then asked if they thought 42% was possible, and as a class they agreed that the percentage was indeed possible. I informed them that the actual percentage used to generate the data was 40%, which was contained in all but 1 interval.

Reflection

The class session was one of the more enjoyable sessions I’ve had in a long time. The students were active and involved, and they gained valuable insight through their participation. I’m encouraged, but we will have to wait and see how things go when I present new material (rather than review material) in this format.

If you would like a copy of the Excel worksheet I used, send me a message through the contact page on my website (georgewoodbury.com) or leave me a comment.

I am a math instructor at College of the Sequoias in Visalia, CA. If there are topics you’d like me to address in future General Teaching articles, send in your requests through the contact page on my website. Be sure to check out next Wednesday’s article. – George