Inference for Two Proportions in Excel

We will use the heart attack data as an example. If you have not worked with this data before you can find a description here. This is a very large data set and so is provided as zip files. (You may need a program such as winzip to unzip them). Available are plain text (with tabs separating entries), Excel. If possible, download and unzip in Excel of the heart attack data and open it in Excel.

We can find 2 by 2 table in Excel that gives a count of male and female who died in hospital or not. For this select Data > PivotTable and PivotChart Report… > Next > Range: select the columns SEX, DRG, and DIED > Next> > (tick on) New worksheet > Layout… > Drag the variable DIED and place it into Row > Similarly drag the variable SEX and place it into Column > Now in data select one of the variable DIED or SEX and place it into Data > Now double click on the button which immediately above the Data > Count > OK > Finish. We get table as follow:

Count of DIED	     SEX		
DIED		 F	 M	Grand Total
0		4298	7136	11434
1		 767	 643	 1410
Grand Total	5065	7779	12844

We can not directly find proportion test in Excel. Here we want to test the proportions of male and female who died in hospital. From the above table we get the number of male and female. We will make one more table as follow:

X is number of patient who died in hospital & N is total number of patient. First make a column Sample proportion next to N column. To find proportion of female died in hospital, suppose we have number of female died in hospital 767 is in cell B2 and total number of female 5065 in C2. First click on the first cell of Sample proportion and type "=B2/C2" and press ‘Enter’. We get p0 the proportion of females died in hospital. Similarly find the proportion of males (p1).

  A B C D
1 SEX X N Proportion
2 F 767(X0) 5065 (N0) 0.1514 (p0)
3 M 643(X1) 7779 (N1) 0.0826 (p1)

To calculate the test statistic we have formula:
z = (p0) - (p1) / sqrt ( (P*(1-P)/(N0))+ (P*(1-P)/(N1)))
where P is pooled proportion
P = [(X0) + (X0)]/[(N0)+(N1)]
Now we have to find difference in the proportion (p0 - p1), for this we have to select one cell (say cell (D4)) and
type "=D2-D3" and press ‘Enter’. For pooled proportion P we have to write "=(B2+B3)/(C2+C3)" and press ‘Enter’.Suppose we have this value in cell (D5).

Now we will find z-value in cell say D6, we have formula for z-value "=D4/SQRT((D5*(1-D5)/C2)+(D5*(1-D5)/C3))" and press ‘Enter’.

So finally we get : Difference = 0.0688
P = 0.1097
Z = 12.1845

  A B C D
1 SEX X N Proportion
2 F 767(X0) 5065 (N0) 0.1514 (p0)
3 M 643(X1) 7779 (N1) 0.0826 (p1)
4       0.0688
5       0.1097
6       12.1845

Next step is to calculate P-value:

Since our alternative hypothesis is two tailed, P-value = 2*P(z > 12.1845).

Note that using excel we can calculate left tailed P-value directly.
For that select Insert > Functions > (Function category:) Statistical > (Function name:) NORMSDIST > select D6 in z > OK. The left tailed P-value is 1.

Thus, two tailed P-value = 2 * P(z > 12.1845) =2 * (1 - P (z < 12.1845)) = 0.0000

The small P-value indicates that, we cannot regard this as a random sample from a population with an average death rate of 10%.

P-Value = 0.0000

Since p-value = 0.0000 which is less than 0.05 so we reject H0 at 5% level of significance.
This implies that there is difference between proportion of males and females died in hospital.