Sunday, November 21, 2010

Statistics 101 for the Common Man

"Facts are stubborn, but statistics are more pliable."
Mark Twain.

Nothing is more misused and misunderstood than statistics. I include myself in this, even though I am trained to analyze data and create statistics to tell a story. I hope to help you understand how statistics work, and how to make your own.

Statistics depend on two very important things...the DATA...and the QUESTION. In fact, the Data and the Question work together in developing the story you want to tell. We will jump right in to developing our statistical data.

SITUATION: You have been hired by a high school football team to develop a database of its players. At first, all they want to know is each player's name, position, grade level, and grade point average (GPA).

Your Database looks like this.

    PlayerName
    Position
    GradeLevel
    GPA
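
If you wanted to sketch that record in code, it might look something like the Python below. The class name, the field spellings, and the sample row are my own assumptions for illustration; the school's real database could be built any number of ways.

```python
from dataclasses import dataclass


@dataclass
class Player:
    """One row of the team database: the four fields the school asked for."""
    player_name: str
    position: str
    grade_level: int
    gpa: float


# Made-up sample row, for illustration only (not a real player).
sample = Player(
    player_name="Example Player",
    position="Quarterback",
    grade_level=11,
    gpa=3.0,
)

print(sample)
```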

The school makes things easy for you...they only have 30 players.

THE QUESTION: The school asks you to create a report that shows the GPA for each player. They make it easy...1.0 is a D...2.0 is a C...3.0 is a B...and 4.0 is an A. No fractions.

THE STORY: You run a report that shows 10 players have a 4.0...10 players have a 3.0...and 10 players have a 2.0.

THE NEXT MORNING

You open the local paper and see the headline "Our Football Players Are The Smartest!" The report accurately describes your data and says that "33.33% of our players are A students". The article goes on to say that only 25% of your rival's players are A students.

You tell the team's coach how proud you are. That is when the coach bursts your bubble. He tells you that the other team has 100 players...25 had A's, 50 had B's, and 25 had C's. So in reality 75% of the other team are B students or better, and only 66.67% of our team is B or better.
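
The coach's arithmetic is easy to check for yourself. Here is a minimal Python sketch using the head counts from the story; the function name share_at_or_above and the dictionary layout are assumptions I made for this example.

```python
def share_at_or_above(counts, threshold):
    """Percent of players whose GPA is at or above the threshold.

    counts maps a GPA value to the number of players who have it.
    """
    total = sum(counts.values())
    hits = sum(n for gpa, n in counts.items() if gpa >= threshold)
    return 100 * hits / total


# Head counts taken from the story above.
our_team = {4.0: 10, 3.0: 10, 2.0: 10}    # 30 players
rival_team = {4.0: 25, 3.0: 50, 2.0: 25}  # 100 players

# The headline's question: who has more A students?
our_a = share_at_or_above(our_team, 4.0)
rival_a = share_at_or_above(rival_team, 4.0)

# The coach's question: who has more students at B or better?
our_b = share_at_or_above(our_team, 3.0)
rival_b = share_at_or_above(rival_team, 3.0)

print(f"A students:  us {our_a:.2f}%  rival {rival_a:.2f}%")
print(f"B or better: us {our_b:.2f}%  rival {rival_b:.2f}%")
```

Same data, two questions, two different winners.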

DATA MANIPULATION

The Coach is feeling the heat from the Principal. The team, according to the Principal, is lagging academically. He wants more students rated at B or better. They decide to add fractions to the GPA. A 4.0 is still an A, but they want you to round up every B that is 3.6 or better to an A, and every C that is 2.6 or better to a B. If a student has a 2.6 GPA, it will become a B.

Data Manipulation is one way to change the story without changing the players. It is easier than making kids smarter. Let's see how that has changed your report.

Data Report:
  Total Players: 30
    GPA >= 3.6 = A
    GPA >= 2.6 = B
    GPA >= 1.6 = C
    GPA >= 0.6 = D

  10 players have a 4.0, 10 players have a 3.6, 5 players have a 2.6, and 5 have a 1.6.

  Under the new cutoffs that is 20 A's and 5 B's, which gives you 25 players, or 83.33%, with a B or better.
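
To see the relabeling at work, here is a rough Python sketch that applies the new cutoffs to the 30 GPAs in the report. The function name letter_grade is my own, and returning None for anything under 0.6 is an assumption, since the report does not define a band below D.

```python
from collections import Counter


def letter_grade(gpa):
    """Map a fractional GPA to a letter using the report's new cutoffs."""
    if gpa >= 3.6:
        return "A"
    if gpa >= 2.6:
        return "B"
    if gpa >= 1.6:
        return "C"
    if gpa >= 0.6:
        return "D"
    return None  # Assumption: the report defines nothing below 0.6.


# The 30 players from the data report above.
gpas = [4.0] * 10 + [3.6] * 10 + [2.6] * 5 + [1.6] * 5

# Count how many players land in each letter grade.
counts = Counter(letter_grade(g) for g in gpas)

b_or_better = counts["A"] + counts["B"]
pct_b_or_better = 100 * b_or_better / len(gpas)

print(dict(counts))
print(f"{b_or_better} of {len(gpas)} players ({pct_b_or_better:.2f}%) are B or better")
```

Nobody's schoolwork changed. Only the cutoffs inside letter_grade did.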

Overnight you have made the kids smarter...and you didn't do anything "wrong". Of course, 5 are still in the box-of-rocks category.

THE ARGUMENT BEGINS.

Your rival school says you manipulated the data to create a false story. Your Principal counters that we are "better defining" the academic level of the kids. Technically both statements are true. Academically your kids are still the same as they were, but you have better defined, through the use of fractions, how you rate that performance.

WHAT YOU HAVE LEARNED.

You have learned how to build a simple database. Keep in mind DATA is very important...and you will learn how important in the next lesson.

You have learned how to manipulate data by changing how you define it. Data manipulation is very important when telling your story.

Finally, you have learned how to build the story with the results you want.

You are well on your way to being a professional statistician. In the next lesson you will learn advanced Statistics for Demographics.

Enjoy.

4 comments:

  1. As the CFO of a failing company told me during the S&L Recession..."I can make the numbers appear to say anything I want"...He was fairly good at it...kept the creditors fooled long enough to get everything of value out of the company's name.

  2. When I graduated college my first job was as the IT manager for a Community Action Agency. They were responsible for acting as a clearing house for welfare, SSI, and other benefits.

    They had a huge database that was built in house. Other than being slow, it was full of datasets. The more datasets, the more accurate your information will be.

    The old IT manager had left under other than friendly terms, so the historical data reports were essentially destroyed. My initial job was to rebuild the old reports, and create new ones for a Community Block Grant they were applying for.

    The problem was the numbers did not reflect the past years' reports. The old IT manager was a pleaser...he wanted the director of the agency to be happy with the data. The director thought he was getting real information...what he was getting was "enhanced" information. The director was not pleased.

    I told him, "My job is to report what your data says. How you choose to present that data to the world is up to you."

  3. I make my report match yours on a daily basis.
