Section 1
Shapes of Distributions (Symmetric vs. Skewed)
Property
The shape of a data distribution reveals its "personality." When viewing histograms or box plots, we classify the shape into three main categories:
- Symmetric: Data is evenly spread around the center. On a box plot, the median sits near the middle of the box, and the whiskers are roughly equal in length.
- Skewed Right (Positively Skewed): Most data clusters on the left, with a long "tail" stretching to the right. On a box plot, the median is pushed toward the left side of the box (closer to Q1 than to Q3), and the right whisker is much longer.
- Skewed Left (Negatively Skewed): Most data clusters on the right, with a long "tail" stretching to the left. On a box plot, the median is pushed toward the right side of the box (closer to Q3 than to Q1), and the left whisker is much longer.
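The box-plot cue described above (where the median sits inside the box) can be sketched in Python. This is a minimal illustration, not a standard statistical test: the function name `box_plot_shape`, the `tolerance` cutoff of 0.1, and the sample data sets are assumptions made for this example. Quartiles come from the standard library's `statistics.quantiles`, whose default method may give slightly different values than a particular textbook or calculator.

```python
import statistics

def box_plot_shape(data, tolerance=0.1):
    """Guess the skew direction from where the median sits inside the box.

    `tolerance` is an arbitrary cutoff (assumed for this sketch) for how
    lopsided the box must be before it is called skewed.
    """
    q1, median, q3 = statistics.quantiles(data, n=4)
    left_half = median - q1    # box width to the left of the median
    right_half = q3 - median   # box width to the right of the median
    box_width = q3 - q1
    if box_width == 0:
        return "undetermined"  # the box has no width to compare
    imbalance = (right_half - left_half) / box_width
    if imbalance > tolerance:
        return "skewed right"  # median pushed toward Q1
    if imbalance < -tolerance:
        return "skewed left"   # median pushed toward Q3
    return "roughly symmetric"

# Hypothetical data sets, made up for illustration only.
print(box_plot_shape([1, 2, 3, 4, 5, 6, 7]))         # prints: roughly symmetric
print(box_plot_shape([1, 1, 2, 2, 3, 6, 12]))        # prints: skewed right
print(box_plot_shape([1, 7, 11, 12, 12, 13, 13]))    # prints: skewed left
```

The sketch looks only at the box, not the whiskers, so it should be read alongside the graph rather than in place of it.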
Note on Bin Width: When using technology to graph a histogram, a poorly chosen bin width can hide the true shape. Bins that are too wide can make a skewed distribution look deceptively symmetric, while bins that are too narrow can create a jagged graph with artificial gaps.
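The effect of bin width can be seen by counting the same data with different widths. The sketch below uses a hypothetical helper `bin_counts` and a small made-up, right-skewed data set; both are assumptions for illustration, and graphing software may place bin edges differently.

```python
from collections import Counter

def bin_counts(data, bin_width, start=0):
    """Count values per bin [start + k*w, start + (k+1)*w).

    Assumes every value is at least `start`.
    """
    counts = Counter(int((x - start) // bin_width) for x in data)
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]

# Hypothetical right-skewed data, made up for illustration only.
values = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 9, 14]

print(bin_counts(values, 5))   # prints: [8, 3, 1]
print(bin_counts(values, 20))  # prints: [12]
print(bin_counts(values, 1))   # prints: [0, 2, 3, 2, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1]
```

With a width of 5 the tall left peak and the right tail are both visible. A width of 20 collapses everything into a single bar, so the tail disappears, while a width of 1 produces several empty bins that look like gaps in the data.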
Examples
- Symmetric: A dot plot of daily temperatures shows values clustered evenly around 72°F. The left and right sides look like mirror images.
- Skewed Right: A histogram of house prices shows most homes cost between 300k (a tall peak on the left), but a few $1M+ mansions create a long tail dragging to the right.
- Skewed Left: A box plot of retirement ages has Q1 = 58, Median = 65, and Q3 = 68. The left side of the box (58 to 65) is much wider than the right side (65 to 68), and the left whisker stretches far out to early retirees at age 45.
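The arithmetic behind the retirement-age example can be checked in a few lines. The values are read off the box edges quoted in the example (58, 65, 68, and the whisker end at 45); the variable names are chosen here for illustration.

```python
# Values read from the retirement-age example above.
q1, median, q3 = 58, 65, 68
youngest = 45  # end of the left whisker

left_box = median - q1        # width of the box to the left of the median
right_box = q3 - median       # width of the box to the right of the median
left_whisker = q1 - youngest  # length of the left whisker

print(left_box, right_box, left_whisker)  # prints: 7 3 13
```

The left part of the box spans 7 years against 3 years on the right, and the left whisker adds another 13 years, which is the pattern expected for a left-skewed distribution.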