Running for Excel: Quantitative Data

This page uses the PULSE data. If you are not already familiar with it, you should read a description of the data. You can find the entire dataset at our site as a plain text file or as an Excel spreadsheet.Once it opens you should work through the examples below. Some of the examples below will use the command line because it is simple and self-documenting.

In Excel 2007, click the tab "Formulas". You will find a "Functional Library" with various statistical and mathematical formulae. Choose the respective function you need to perform and the remaining procedure will be the same as described below for Excel 2003.

To find a descriptive statistics, first select Tools>Add-Ins>tick Analysis Tool Pak > click Ok. Again go to Tools select Data Analysis then select Descriptive statistics, in data range select the range for which you want to find descriptive statistics click the Summary statistics, we will get the summary statistics as below,

Descriptive statistics for PULSE rate before and after :

 
	   PuBefore				   PuAfter	

Mean			72.8696		Mean			80
Standard Error		1.1477		Standard Error		1.7822
Median			71		Median			76
Mode			68		Mode			76
Standard Deviation	11.0087		Standard Deviation	17.0938
Sample Variance		121.1916	Sample Variance		292.1978
Kurtosis		-0.4424		Kurtosis		1.3887
Skewness		0.3974		Skewness		1.1281
Range			52		Range			90
Minimum			48 		Minimum			50
Maximum			100		Maximum			140
Sum			6704		Sum			7360
Count			92		Count			92

We can compare the before and after pulse rates with parallel boxplots. To draw a Box-plot in excel we need to find out Quartiles.

Quartiles:

To find Quartiles using excel, first select Select Insert>functions>Statistical>QUARTILE. Next select the range for which you want to find Quartile in Array, then type 1,2 or 3 to get Q1, Q2 and Q3 respectively.

Hence we get,

Quartile 	PuBefore	PuAfter
Q1     		64		68
Q2    		71		76
Q3     		80		85

Statistic	PuBefore	PuAfter
Q1		64		68
Minimum		48		50
Median		71		76
Maximum		100		140
Q3		80		85

Boxplot :

In Excel 2003:

In Excel 2007 :

{short description of image}

You can see that the "after" data show a higher median and more variability along with high outliers not present in the "before" data. The fact that both center and variability have changed makes it hard to give a simple comparison here. If we looked only at the mean, we might say that pulse rates went up by about seven points, and they did on average. However, the lowest rates have not gone up much at all (The minimum went up by 2.) while the highest have changed considerably (The maximum went up by 40!). A simple average change does not describe what happened here very accurately, which is one reason we always need to make a picture!

Because the class contained both men and women, we might expect some bimodality in the heights and weights. Stem and leaf plots are a good tool for assessing the shape of a distribution. (Boxplots tend to hide bimodality.)

To draw a box plot for Male and Female, first we have to make two different columns for male and female

 

Statistic	Male	Female	
Q1		69	63			
Minimum		66	61				
Median		71	65.5				
Maximum		75	70				
Q3		73	68				

Boxplot :

{short description of image}

Both genders show reasonably compact, symmetric distributions with no outliers but the men are consistently about 5 inches taller than the women. Because the two distributions have similar shapes and variabilities, we can reasonably say that the men as a group are about 5 inches taller than the women. Compare this to the situation with the pulse rates above, where such a simple description was an oversimplification. Note that although the two sexes are clearly different here, there is enough overlap that the stem and leaf does not clearly show the two groups.