Statisticians often want to know how a group would respond to a certain question (their opinion). But for a large group, it isn’t possible, cost-effective or realistic to ask the opinion of the whole group. So they collect the responses of a smaller sample of the larger group, then use statistical methods to reasonably estimate how the whole population would have responded.
So, for example, if you wanted to know how all the people in a large city felt about the appearance of the golden tropical fluffy feefee fish [no, this is not a real fish; but it sounds cool], it wouldn’t be practical to call a million people and ask their opinion. But you might be able to call several hundred or a thousand, taken from a randomized list of the whole population, and get their opinions. Then you would analyze the data using the science of statistics and come up with results. With the results of the analysis, you can then make statements about what the response of the whole population of the city would be, with a reasonable degree of certainty.
For example, let’s say that we get the following results, already analyzed, from our random sample regarding the appearance of the golden tropical fluffy feefee fish: 48% love the appearance of the fish, 45% like the appearance of the fish, 4% dislike the appearance, and 3% have no opinion at all [possibly because they ate too much sugar and their brain is taking a break.]
Now, when we look at the 48% and the 45%, they seem to say that more people love the appearance of the fish (48%) than like the appearance of the fish (45%). But is it correct to say that more people love the appearance of the fish than like it? The answer is that we can’t tell. Why?
When the data from the sample is analyzed, there is a degree of uncertainty (error) that is part of the analysis. This uncertainty occurs because we are not asking the opinion of every person in the city, but rather taking a small sample and applying the results to make statements about how the whole population would respond. For example, a statistician might say that we are 95% certain that if we asked the whole population, the people who love the appearance of the golden tropical fluffy feefee fish would be 48% +/- 3%. The +/- 3% is the margin of error. This means that the 48% could actually be anywhere from 45% to 51% (48-3 and 48+3). The 45% who like the appearance could really be between 42% and 48%. So, given this range of error, it is possible that the people who like the appearance of the fish (45% +/- 3%) could outnumber the people who love the appearance of the fish (48% +/- 3%). We just don’t know.
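To make the arithmetic concrete, here is a rough sketch in Python. It rests on assumptions that are mine and not part of the survey story: a sample of 1,000 people (the article only says "several hundred or a thousand"), and the common textbook approximation for a 95% margin of error on a percentage. The function names are made up for illustration. A real statistician might well use a different method.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple random
    sample of size n. z=1.96 corresponds to roughly 95% confidence.
    This is the normal-approximation formula, an assumption of this sketch."""
    return z * math.sqrt(p * (1 - p) / n)

def interval(p, n, z=1.96):
    """Return (low, high): the sample proportion minus and plus the margin of error."""
    moe = margin_of_error(p, n, z)
    return (p - moe, p + moe)

def intervals_overlap(a, b):
    """True if the two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

n = 1000                   # assumed sample size, not stated in the article
love = interval(0.48, n)   # the 48% who love the fish
like = interval(0.45, n)   # the 45% who like the fish

print(f"margin of error: about {margin_of_error(0.48, n) * 100:.1f} points")
print(f"love: {love[0] * 100:.1f}% to {love[1] * 100:.1f}%")
print(f"like: {like[0] * 100:.1f}% to {like[1] * 100:.1f}%")
print("ranges overlap:", intervals_overlap(love, like))
```

With a sample of about 1,000, the margin of error comes out close to the 3% used above, and the two ranges overlap, which is why we can’t say for sure which group is bigger.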
I wrote this article because, again and again, when statistics are reported (on TV, in print, etc.), the results of a study are mentioned and the margin of error is disregarded. For example, they will say that in a study of the opinions of people living on Mars, 48% support XYZ and 45% are against XYZ. Once you allow for the margin of error, the two numbers may really be different or they may be the same; we just don’t know. Additionally, they report that the 48% is up from 45% when the poll was last taken. But this too may not be true. We really don’t know because of the range of error.
Read the fine print; check out the range of error. Sometimes it does matter.
-Glenn
founder,













