Author Topic: help me teach binomial distribution  (Read 2897 times)
kohelet
Senior member
****
Posts: 613


« on: February 21, 2008, 09:46:37 AM »

Here’s another* very specific question for any stats/OR/mgt sci/otherwise geeky friends out there.  I cannot wrap my brain around something I’m supposed to be teaching.  I’ve read every textbook I can get my hands on, and no one explains what I need explaining.

Here’s a typical binomial distribution problem**: 

“The state Department of Human Services has been charged with sex discrimination in hiring.  Last year, they hired 120 new case workers, yet only 10 of these were men.  DHS claims that this ratio does not represent discrimination, only that far fewer men than women apply for these positions.  The state’s Civil Service Commission found that 30 percent of the applicants deemed to be qualified for the case worker positions were men.  What is the probability that DHS would hire 10 or fewer men if they do not discriminate?”

I know how to get the right answer.  Use the binomial distribution*** to find the probability of 10 or fewer being selected if the probability of any one being selected is .3.  In Excelspeak, it’s BINOMDIST(10,120,0.3,TRUE).  What I get, barely and only intuitively, but cannot teach:  Why is it 10 or less?  Why not 10 or more?  Or more than 10?  Or less than 10?  I get why it’s not just 10 (the probability of selecting any one value, even the expected value, 36, would be pretty small).  I’ve tried explaining this five different ways and, to be honest, they all stink.

How do you teach this?  What’s the “rule” we’re following here?

* Thanks, again, to everyone who helped me here: http://chronicle.com/forums/index.php/topic,30047.0.html and http://chronicle.com/forums/index.php/topic,30469.0.html.
** I adapted this from a textbook by Meier, Brudney, and Bohte.
*** I’m aware there are some cases when the hypergeometric distribution would be appropriate.  I read a great back-and-forth between two statisticians fighting over which would be appropriate for some high-profile discrimination court case.  Fun stuff.
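
For reference, the cumulative probability that BINOMDIST(10,120,0.3,TRUE) returns can be written out as below. This is only a sketch of the setup, assuming each of the 120 hires is an independent draw with a 0.3 chance of being a man; X is a label introduced here for the number of men hired.

```latex
P(X \le 10) \;=\; \sum_{k=0}^{10} \binom{120}{k}\,(0.3)^{k}\,(0.7)^{120-k},
\qquad X \sim \mathrm{Binomial}(n = 120,\ p = 0.3)
```

The sum runs over every outcome at least as lopsided as the one observed (0 through 10 men), which is why the last argument to BINOMDIST is TRUE rather than FALSE.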
yatchie
Member
***
Posts: 198


« Reply #1 on: February 21, 2008, 10:22:23 AM »

Quote
Why is it 10 or less?  Why not 10 or more?  Or more than 10?  Or less than 10?  I get why it’s not just 10 (the probability of selecting any one value, even the expected value, 36, would be pretty small).  I’ve tried explaining this five different ways and, to be honest, they all stink.

I think your answer to this question lies in the wording of the problem itself.

Quote
“The state Department of Human Services has been charged with sex discrimination in hiring.  Last year, they hired 120 new case workers, yet only 10 of these were men.  DHS claims that this ratio does not represent discrimination, only that far fewer men than women apply for these positions.  The state’s Civil Service Commission found that 30 percent of the applicants deemed to be qualified for the case worker positions were men.  What is the probability that DHS would hire 10 or fewer men if they do not discriminate?”

From this problem, I understand that fewer men than women are generally hired.  If DHS is discriminating against men, then they are hiring fewer than x number of men.  And in this case x = 10. 

But why 10 or fewer?  Well, if it were more than 10 men, it wouldn't be discrimination.  You can also think of the normal distribution.  Since n is pretty large, you can use the normal approximation to the binomial, which would probably be well suited for this problem.  You'd want to find P(x ≤ 10) for that.

Another thing I thought of is that in hypothesis testing (what you probably would do to test for discrimination) you are trying to find the probability that an observed value is that extreme or more extreme.  That is, find a tail probability.  P(x > 10) isn't going to give you that.
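
A rough sketch of the normal approximation mentioned above, next to the exact binomial sum. The continuity correction (evaluating the normal curve at 10.5 rather than 10) and all variable names are assumptions added for illustration, not something stated in the thread.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, observed = 120, 0.3, 10

# Mean and standard deviation of a Binomial(n, p) count.
mean = n * p                      # expected number of men hired
sd = sqrt(n * p * (1 - p))

# Normal approximation to P(X <= 10), with a continuity correction (assumed).
z = (observed + 0.5 - mean) / sd
approx = NormalDist(mean, sd).cdf(observed + 0.5)

# Exact lower-tail probability for comparison.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(observed + 1))

print(f"z-score of the cutoff: {z:.2f}")
print(f"normal approximation:  {approx:.3g}")
print(f"exact binomial:        {exact:.3g}")
```

Both numbers come out tiny. This far out in the tail the approximation and the exact value need not agree closely, so the exact sum is the safer one to report.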
« Last Edit: February 21, 2008, 10:28:55 AM by yatchie »
mccfan
Senior member
****
Posts: 576


« Reply #2 on: February 21, 2008, 01:05:28 PM »

Maybe this is way off base, but couldn't you figure the confidence interval for the proportion here?  You see if 8% falls within the 95% interval.  If not, it is unlikely that this low a percentage would occur by chance, rather than through some (potentially sex-based) design.
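
One possible reading of the interval idea, sketched in Python: build a 95% range around the 30% applicant share for a sample of 120 hires, then check whether the observed share of 10/120 lands inside it. Centering the range on 0.30 and using the normal-based multiplier 1.96 are assumptions made here for illustration.

```python
from math import sqrt

n = 120                 # hires
p0 = 0.30               # share of qualified applicants who are men
observed_share = 10 / n # about 8%

# Standard error of a sample proportion if hiring mirrors the applicant pool.
se = sqrt(p0 * (1 - p0) / n)

# Normal-based 95% range around p0 (1.96 is the usual two-sided multiplier).
z = 1.96
low, high = p0 - z * se, p0 + z * se

inside = low <= observed_share <= high
print(f"95% range: {low:.3f} to {high:.3f}")
print(f"observed share: {observed_share:.3f}, inside range: {inside}")
```

Under these assumptions the range runs from roughly 22% to 38%, so 8% falls well outside it.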
kissa_mau
Frequently Napping
Distinguished Senior Member
*****
Posts: 1,212

Purrrvocative Posing


« Reply #3 on: February 21, 2008, 06:51:33 PM »

Have you thought about starting out by simulating the binomial distribution in Excel? Then the students can get perhaps a more intuitive idea for how it works and what it looks like.

Once you've actually built the distribution, you can look at a specific problem and then that plus the confidence interval idea could give you a type of hypothesis test.

I hope that makes sense.
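
The same simulation idea can be sketched in Python instead of Excel. The function name, the number of simulated years, and the seed are all arbitrary choices made for this illustration.

```python
import random
from collections import Counter

def simulate_hires(n_hires=120, p_male=0.3, n_trials=10_000, seed=2008):
    """Simulate n_trials hiring years; return the number of men hired in each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = []
    for _ in range(n_trials):
        # Each hire is a man with probability p_male, independently.
        men = sum(1 for _ in range(n_hires) if rng.random() < p_male)
        counts.append(men)
    return counts

counts = simulate_hires()
histogram = Counter(counts)
mean_men = sum(counts) / len(counts)
at_most_10 = sum(1 for c in counts if c <= 10)

# Crude text histogram: one '#' per 20 simulated years.
for men in range(min(counts), max(counts) + 1):
    print(f"{men:3d} {'#' * (histogram[men] // 20)}")

print(f"average men hired: {mean_men:.1f}")
print(f"simulated years with 10 or fewer men: {at_most_10}")
```

Students can see the pile of outcomes centered near 36 and then look for how often a year with 10 or fewer men turns up.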
conjugate
Compulsive punster and insatiable reader, and
Member-Moderator
Distinguished Senior Member
*****
Posts: 16,690

Tends to have warped sense of humor


« Reply #4 on: February 21, 2008, 08:04:23 PM »

Okay, speaking as a guy with some combinatorics under his belt, I will take a shot at this.  If you figure out the probability that "more than 10" will be hired, you are asking for one minus the probability that ten or fewer will be hired, so they are basically the same calculation.  The only difference is that there are something like 110 terms to do it the "more than 10" way, and only 11 (0 through 10) to do it the "ten or fewer" way.

The reason it's 10 or fewer is because they hired exactly 10.  Now, did they do it because they tend not to like men, or was it just random chance?  To find out, we're assuming that it's just random chance, and asking what the probability is that we wouldn't get more than 10 by random chance.  Does that help your puzzlement any?  I may be misunderstanding the question, but it seems that you aren't asking about probability or statistics, but about the way the problem is addressed.
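
A small sketch of the point about the two calculations being complements: summing the 11 terms for 0 through 10 and the 110 terms for 11 through 120 gives two numbers that add up to 1. The helper name binom_pmf is made up for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, observed = 120, 0.3, 10

# "Ten or fewer": 11 terms, k = 0..10.
lower_terms = [binom_pmf(k, n, p) for k in range(0, observed + 1)]
# "More than ten": 110 terms, k = 11..120.
upper_terms = [binom_pmf(k, n, p) for k in range(observed + 1, n + 1)]

lower_tail = sum(lower_terms)
upper_tail = sum(upper_terms)

print(f"{len(lower_terms)} terms, P(X <= 10) = {lower_tail:.3g}")
print(f"{len(upper_terms)} terms, P(X > 10)  = {upper_tail:.12f}")
print(f"sum of both = {lower_tail + upper_tail:.12f}")
```

Either sum answers the question; the short one is just less work, and it is the one that measures outcomes as extreme as what DHS actually did.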
kohelet
Senior member
****
Posts: 613


« Reply #5 on: February 21, 2008, 08:46:27 PM »

You guys are great.  In some ways, this is sort of affirming--I'm glad there's not some big conceptual bullet that I've been an idiot to miss.  Conjugate, your second paragraph is basically how I've tried to conceptualize the problem for the students, but your wording is helpful.  Yatchie, your emphasis on keeping the context of the question itself in view was helpful to some of my students--thank you!  And your suggestion to think about the problem as a normal distribution problem helped me clarify my own thinking . . .

If I start thinking how a vertical line at the expected value would fall in the middle of a nice normal curve, with 50% of the cases to the left and 50% to the right, I can then picture sliding that vertical line to the left--40/60, 30/70, 20/80 . . . I can definitely get my students to understand this (they already do), so then transitioning to the binomial distribution is a very small step.  I think these types of questions are written poorly (not my example, but the one I adapted it from to make it easier for my students!).  The cumulative probability (e.g., here, the p that x <= 10) isn't the "probability that they discriminate."  It is what it is--the probability that they'd hire 10 or fewer if they don't discriminate.  This certainly sheds light on whether or not they discriminate, but it certainly doesn't answer the question "what is the probability that they discriminate?" (which, come to think of it, would be a goofy question to ask in the first place, yet some of these textbook problems do).

I know that last bit doesn't make a lot of sense, but trust me, it's much clearer in my head, and I'm looking forward to trying to teach this one more time next week.  Typing this out has been helpful.  I'll check back for more responses, but thanks very much for the help so far!
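
The sliding-line picture described above can be sketched with the binomial itself: start the cutoff at the expected value of 36 and move it left, printing the share of outcomes at or below each cutoff. The particular cutoffs and the helper name binom_cdf are arbitrary choices for this illustration.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, p = 120, 0.3

# Slide the "vertical line" from the expected value (36) down to the observed 10.
cutoffs = [36, 32, 28, 24, 20, 15, 10]
tail = {c: binom_cdf(c, n, p) for c in cutoffs}

for c in cutoffs:
    # Each value is the chance of c or fewer men IF hiring mirrors the pool.
    print(f"P(X <= {c:2d}) = {tail[c]:.3g}")
```

Each printed value is a probability of the data given no discrimination, not a probability that DHS discriminates, which is the distinction drawn in the post.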