InterQuartile Range (IQR)

The interquartile range (IQR) of a variable is the difference between its upper and lower quartiles. It measures how widely the middle 50% of the data is spread.

When a data set has outliers or extreme values, we summarize a typical value using the median rather than the mean. In the same situation, variability is often summarized by the interquartile range, which is the difference between the third and first quartiles. The first quartile, denoted Q1, is the value in the data set that has 25% of the values below it. The third quartile, denoted Q3, is the value in the data set that has 25% of the values above it. The quartiles are found with the same approach used to determine the median, applied to the lower and upper halves of the data set separately.

The interquartile range is defined as:

Interquartile Range = Q3-Q1
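
The median-of-halves rule described above can be sketched in R roughly as follows. The function name `quartiles_by_halves` and the sample vector are hypothetical, chosen only for illustration, and the sketch assumes that for an odd sample size the median itself is left out of both halves.

```r
# Quartiles by the median-of-halves rule (a sketch, not R's built-in method).
# Assumption: for odd n, the median is excluded from both halves.
quartiles_by_halves <- function(x) {
  x <- sort(x)
  n <- length(x)
  half <- n %/% 2            # number of values in each half
  lower <- head(x, half)     # smallest `half` values
  upper <- tail(x, half)     # largest `half` values
  q1 <- median(lower)
  q3 <- median(upper)
  c(Q1 = q1, Q3 = q3, IQR = q3 - q1)
}

# Hypothetical values, for illustration only
res <- quartiles_by_halves(c(2, 4, 4, 5, 7, 8, 9, 10))
print(res)
```

Here the lower half is 2, 4, 4, 5 and the upper half is 7, 8, 9, 10, so Q1 = 4, Q3 = 8.5 and the interquartile range is 4.5.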

 

Interquartile Range with Even Sample Size:

For the sample (n=10), the median diastolic blood pressure is 71: 50% of the values are above 71 and 50% are below. The quartiles are determined in the same way as the median, except that we consider each half of the data set separately.

[Figure: Median with even number of observations]

There are 5 values below the median (the lower half); their middle value, 64, is the first quartile. There are 5 values above the median (the upper half); their middle value, 77, is the third quartile. The interquartile range is 77 – 64 = 13, which is the range of the middle 50% of the data.
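
A minimal sketch of this calculation in R. The ten values below are illustrative, chosen to agree with the median and quartiles quoted in the text rather than taken from a data table in this post; the variable names are likewise assumptions.

```r
# Illustrative diastolic blood pressures (n = 10), assumed for this sketch
dbp <- sort(c(62, 63, 64, 64, 70, 72, 76, 77, 81, 81))

med   <- median(dbp)     # mean of the 5th and 6th sorted values
lower <- dbp[1:5]        # the 5 values below the median
upper <- dbp[6:10]       # the 5 values above the median

q1  <- median(lower)     # middle value of the lower half
q3  <- median(upper)     # middle value of the upper half
iqr <- q3 - q1

print(c(median = med, Q1 = q1, Q3 = q3, IQR = iqr))
```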

 

Interquartile Range with Odd Sample Size:

When the sample size is odd, the median and quartiles are determined in the same way. Suppose that, in the previous example, the lowest value (62) were excluded, so that the sample size is n=9. The median and quartiles are indicated below.

[Figure: Median value when sample size is odd]

When the sample size is 9, the median is the middle number, 72. The quartiles are determined in the same way, by looking at the lower and upper halves, respectively. There are 4 values in the lower half, so the first quartile is the mean of the 2 middle values in the lower half ((64+64)/2=64). The same approach is used in the upper half to determine the third quartile ((77+81)/2=79). The interquartile range is therefore 79 – 64 = 15.
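
The odd-sample case can be sketched the same way. The nine values below are again illustrative, assumed so that they reproduce the numbers quoted in the text; the median is left out of both halves, as described above.

```r
# Illustrative values (n = 9), assumed for this sketch; the lowest value
# from the n = 10 example (62) is left out
dbp9 <- sort(c(63, 64, 64, 70, 72, 76, 77, 81, 81))

n   <- length(dbp9)
mid <- (n + 1) / 2                # position of the median (5th value)

med   <- dbp9[mid]
lower <- dbp9[seq_len(mid - 1)]   # 4 values below the median
upper <- dbp9[(mid + 1):n]        # 4 values above the median

q1  <- median(lower)              # (64 + 64) / 2
q3  <- median(upper)              # (77 + 81) / 2
iqr <- q3 - q1

print(c(median = med, Q1 = q1, Q3 = q3, IQR = iqr))
```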

Interquartile Range Using R:

> values = dataSet$field       # one numeric column of the data frame (a vector)
> IQR(values)                  # interquartile range of that column, Q3 - Q1
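
One caveat: `IQR()` is computed from `quantile()`, whose default method (type 7) interpolates between sorted values, so its result need not equal the value obtained by hand with the median-of-halves rule. A short sketch of the difference, using illustrative values assumed to match the n=10 example:

```r
# Illustrative values (n = 10), assumed for this sketch (already sorted)
dbp <- c(62, 63, 64, 64, 70, 72, 76, 77, 81, 81)

print(quantile(dbp, c(0.25, 0.75)))   # default (type 7) quartiles
builtin <- IQR(dbp)                   # 76.75 - 64 = 12.75

# Median-of-halves rule used in the worked example
by_hand <- median(dbp[6:10]) - median(dbp[1:5])   # 77 - 64 = 13

print(c(builtin = builtin, by_hand = by_hand))
```

Both answers are legitimate; they simply follow different quartile conventions. `IQR()` accepts a `type` argument that is passed on to `quantile()` if another convention is wanted.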
