I just finished reading *How to Lie with Statistics* by Darrell Huff, first printed in 1954. It is an enjoyable book with jokes, satire and cartoons. It talks about the blunders, misinterpretations, misrepresentations and lies made with statistics, and it is still very much applicable today.

I have been looking at statistical values (mean, median, etc.) from a science perspective. In science and engineering, the aim is to be as correct and as close to reality as possible. After all, things can crash and burn because of wrong numbers. But in social science, statistics can be a work of art.

So it is fascinating to find out how easily numbers can legally give different, unrepresentative views, and how easily we accept them without question and, worse, pass them around. Some of these common ‘errors’ were already mentioned in Gary’s class, but I will summarize them anyway.

**1. Sample with built-in bias**

The sample is not representative of the general situation. For example, only certain groups in the entire population are taken into account. Or the question itself is misleading.

**2. The well-chosen average**

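People can choose between the mean, the median and the mode to suit what they want to portray. All three can honestly be called an ‘average’, yet they can differ a lot for the same data.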
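To see how far the three averages can drift apart, here is a small sketch. The salary list is made up by me for illustration and is not taken from the book.

```python
from statistics import mean, median, mode

# Hypothetical yearly salaries at a small company (made-up numbers).
# One very large salary pulls the mean far above the others.
salaries = [20_000, 20_000, 20_000, 25_000, 30_000, 35_000, 200_000]

avg_mean = mean(salaries)      # sum divided by count
avg_median = median(salaries)  # middle value after sorting
avg_mode = mode(salaries)      # most frequent value

print("mean:  ", avg_mean)
print("median:", avg_median)
print("mode:  ", avg_mode)
```

With these numbers, someone who wants the pay to look generous would quote the mean, while someone who wants it to look low would quote the mode, and neither is technically lying.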

**3. The little figures that are not there**

The sample size is so small that it is just not representative. For example, a report claims that only one-third of female college lecturers are married. But the sample involves only 3 lecturers, and one of them is married. Well??
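A rough way to see why 3 lecturers tell us almost nothing: the result "1 of 3 is married" is quite likely under very different true situations. The sketch below uses the plain binomial formula. The "true share" values are my own assumptions, and it assumes the lecturers are picked independently at random.

```python
from math import comb

def prob_k_of_n(k, n, p):
    """Binomial probability of exactly k 'yes' results in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical true shares of married lecturers (assumed values, not data).
for true_share in (0.2, 0.5, 0.7):
    p = prob_k_of_n(1, 3, true_share)
    print(f"true share {true_share:.0%}: P(1 of 3 married) = {p:.3f}")
```

Whether the true share is 20% or 50%, seeing exactly one married lecturer out of three happens more than a third of the time, so the sample cannot tell those cases apart.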

**4. Much ado about practically nothing**

Exaggerating small differences through unwise classifications or clever drawing of charts. Sometimes the difference is so small that it is actually within the (often unmentioned) margin of error.
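As an illustration of a difference hiding inside the margin of error, here is a minimal sketch for a made-up poll. It assumes a simple random sample and the usual normal approximation at roughly 95% confidence (z = 1.96). The poll numbers are hypothetical.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p with sample size n."""
    return z * sqrt(p * (1 - p) / n)

# Hypothetical poll: 400 people, option A gets 52%, option B gets 48%.
n = 400
share_a, share_b = 0.52, 0.48

moe_a = margin_of_error(share_a, n)
moe_b = margin_of_error(share_b, n)

# The two ranges overlap if A's lower bound is below B's upper bound.
ranges_overlap = (share_a - moe_a) < (share_b + moe_b)

print(f"A: {share_a:.0%} +/- {moe_a:.1%}")
print(f"B: {share_b:.0%} +/- {moe_b:.1%}")
print("ranges overlap:", ranges_overlap)
```

Here each share carries a margin of almost 5 percentage points, so a headline saying "A leads B" would be making much ado about a 4-point gap that the sample cannot really support.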

**5. Exaggeration using a one-dimensional picture**

For example: to say that people 10 years later earn twice what people earn now, we can use a picture twice the size of the initial picture.

now ==> 10 years later

But this is misleading because the second picture is not only twice the height of the first, but also twice the width. So overall, it covers four times the area of the first picture, and it will make readers think that the increase is greater than it actually is.
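The arithmetic behind this is simple enough to check in a few lines. The helper name `area_ratio` is my own, and the "honest" scale factor is just one way to keep the picture's area proportional to the value.

```python
from math import sqrt

def area_ratio(scale):
    """Area ratio when both height and width are multiplied by `scale`."""
    return scale * scale

value_ratio = 2  # the income doubles

# Scaling height and width by the value ratio inflates the area.
misleading_area = area_ratio(value_ratio)

# To make the area (not the height) double, scale each side by sqrt(2).
honest_scale = sqrt(value_ratio)
honest_area = area_ratio(honest_scale)

print("area when both sides are doubled:", misleading_area)
print(f"side scale that doubles the area: {honest_scale:.3f}")
```

So a picture meant to show "twice as much" should be only about 1.41 times as tall and as wide, or it should simply be a bar that grows in one direction.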

**6. The semiattached figure**

This means proving something by way of something else that is actually not so relevant.

The example given: a report said that the number of deaths chargeable to railroads was 4712, which may scare people away from taking the train. But actually nearly half of those victims were people in cars that collided with trains at crossings. Others were riding on the rods. Only 132 out of 4712 were passengers on the trains.
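Using only the two figures quoted above, a quick calculation shows how small the relevant share really is. The variable names are mine.

```python
# Figures quoted in the example above.
total_deaths = 4712      # all deaths chargeable to railroads
passenger_deaths = 132   # deaths of passengers on the trains

passenger_share = passenger_deaths / total_deaths

print(f"passengers: {passenger_share:.1%} of all railroad deaths")
```

Fewer than 3 in 100 of the reported deaths were train passengers, which is the number that actually matters to someone deciding whether to take the train.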

**7. Post hoc rides again**

Two clocks are perfectly in sync with each other. Only clock B has a bell. So when clock A shows 12 o’clock, clock B rings. People then assume that clock B rang because of clock A.

**8. Statisticulate**

Basically, ‘lie legally with statistics’. Use the wrong base value, use weird reasoning, use unrealistic estimates.

How to be careful with statistics then?

1. Who says so? Does he have an obvious bias?

2. How can he know?

3. What’s missing?

4. Did somebody change the subject?

5. (most important of all) Does it make sense?

I would say, ‘if you have time to spare, read the book.’ I had a good laugh.
