This page uses the PULSE data. If you are not already familiar with it, you should read a description of the data. You can find the entire dataset at our site as a plain text file or as an Excel spreadsheet.Once it opens you should work through the examples below. Some of the examples below will use the command line because it is simple and self-documenting.
In Excel 2007, click the tab "Formulas". You will find a "Functional Library" with various statistical and mathematical formulae. Choose the respective function you need to perform and the remaining procedure will be the same as described below for Excel 2003.
To find a descriptive statistics, first select Tools>Add-Ins>tick Analysis Tool Pak > click Ok. Again go to Tools select Data Analysis then select Descriptive statistics, in data range select the range for which you want to find descriptive statistics click the Summary statistics, we will get the summary statistics as below,
PuBefore PuAfter Mean 72.8696 Mean 80 Standard Error 1.1477 Standard Error 1.7822 Median 71 Median 76 Mode 68 Mode 76 Standard Deviation 11.0087 Standard Deviation 17.0938 Sample Variance 121.1916 Sample Variance 292.1978 Kurtosis -0.4424 Kurtosis 1.3887 Skewness 0.3974 Skewness 1.1281 Range 52 Range 90 Minimum 48 Minimum 50 Maximum 100 Maximum 140 Sum 6704 Sum 7360 Count 92 Count 92
We can compare the before and after pulse rates with parallel boxplots. To draw a Box-plot in excel we need to find out Quartiles.
To find Quartiles using excel, first select Select Insert>functions>Statistical>QUARTILE. Next select the range for which you want to find Quartile in Array, then type 1,2 or 3 to get Q1, Q2 and Q3 respectively.Hence we get,
Quartile PuBefore PuAfter Q1 64 68 Q2 71 76 Q3 80 85 Statistic PuBefore PuAfter Q1 64 68 Minimum 48 50 Median 71 76 Maximum 100 140 Q3 80 85
You can see that the "after" data show a higher median and more variability along with high outliers not present in the "before" data. The fact that both center and variability have changed makes it hard to give a simple comparison here. If we looked only at the mean, we might say that pulse rates went up by about seven points, and they did on average. However, the lowest rates have not gone up much at all (The minimum went up by 2.) while the highest have changed considerably (The maximum went up by 40!). A simple average change does not describe what happened here very accurately, which is one reason we always need to make a picture!
Because the class contained both men and women, we might expect some bimodality in the heights and weights. Stem and leaf plots are a good tool for assessing the shape of a distribution. (Boxplots tend to hide bimodality.)
To draw a box plot for Male and Female, first we have to make two different columns for male and female
Statistic Male Female Q1 69 63 Minimum 66 61 Median 71 65.5 Maximum 75 70 Q3 73 68
Both genders show reasonably compact, symmetric distributions with no outliers but the men are consistently about 5 inches taller than the women. Because the two distributions have similar shapes and variabilities, we can reasonably say that the men as a group are about 5 inches taller than the women. Compare this to the situation with the pulse rates above, where such a simple description was an oversimplification. Note that although the two sexes are clearly different here, there is enough overlap that the stem and leaf does not clearly show the two groups.