Quiz: Scatterplots and Association
Test your understanding of scatterplots, correlation, and association with these review questions.
1. In a scatterplot investigating whether hours of exercise per week predicts resting heart rate, which variable should be placed on the x-axis?
- A. Resting heart rate, because it's the response variable
- B. Hours of exercise, because it's the explanatory variable
- C. Either variable, because correlation is symmetric
- D. Whichever variable has larger values
Show Answer
The correct answer is B. In a scatterplot, the explanatory (independent) variable goes on the x-axis and the response (dependent) variable goes on the y-axis. Since we're investigating whether exercise predicts heart rate, exercise is the explanatory variable. While correlation itself is symmetric, proper scatterplot construction follows this convention.
Concept Tested: Scatterplot Construction
2. A scatterplot shows points that slope downward from left to right with moderate scatter around the pattern. How would you describe this association?
- A. Strong positive linear association
- B. Moderate negative linear association
- C. Weak positive linear association
- D. No association
Show Answer
The correct answer is B. When points slope downward from left to right, the association is negative (as x increases, y decreases). "Moderate scatter around the pattern" indicates the relationship is neither very strong (tight clustering) nor very weak (loose scatter), making it a moderate association. The overall pattern being roughly linear completes the description.
Concept Tested: Describing Scatterplots
3. Which correlation coefficient indicates the strongest linear relationship?
- A. r = -0.85
- B. r = 0.72
- C. r = -0.45
- D. r = 0.80
Show Answer
The correct answer is A. The strength of a linear relationship is determined by the absolute value of r, not its sign. Here, |-0.85| = 0.85 is the largest absolute value, indicating the strongest linear relationship. The negative sign only tells us the direction (negative association), not the strength.
Concept Tested: Correlation Coefficient
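The comparison in question 3 can be checked in a couple of lines. This is only a small sketch; the names `candidates`, `strongest`, and `weakest` are illustrative and not part of the quiz.

```python
# Candidate correlation coefficients from question 3.
candidates = [-0.85, 0.72, -0.45, 0.80]

# Strength depends on |r|; the sign only gives the direction.
strongest = max(candidates, key=abs)
weakest = min(candidates, key=abs)

print(strongest)  # -0.85
print(weakest)    # -0.45
```

Ranking by `abs` rather than by raw value is the whole point: sorting the raw values would wrongly put 0.80 on top.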
4. A researcher finds r = 0 for the relationship between study time and test scores. What can we conclude?
- A. There is no relationship between study time and test scores
- B. There is no linear relationship, but there could be a nonlinear one
- C. The data contains errors
- D. Study time definitely does not affect test scores
Show Answer
The correct answer is B. A correlation of 0 means there is no linear relationship between the variables. However, there could still be a curved (nonlinear) relationship. For example, if very low and very high study times both lead to lower scores (inverted U-shape), r could be near 0 despite a clear pattern. Always examine the scatterplot.
Concept Tested: Correlation Limitations
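A minimal sketch of the inverted-U situation from question 4, using made-up data in which scores peak at 5 hours of study and fall off symmetrically on either side. The helper `pearson_r` and the data values are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect inverted U centered at 5 hours.
hours = list(range(0, 11))                      # 0, 1, ..., 10
scores = [90 - 2 * (h - 5) ** 2 for h in hours]

r_curved = pearson_r(hours, scores)
print(round(r_curved, 3))
```

Here the scores are completely determined by study time, yet the positive and negative products on either side of the peak cancel, so r comes out at zero. Only the scatterplot reveals the pattern.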
5. Which of the following is true about the correlation coefficient r?
- A. It has units that match the data being measured
- B. It can take any value on the number line
- C. It changes when you swap the x and y variables
- D. It is always between -1 and 1, inclusive
Show Answer
The correct answer is D. The correlation coefficient is always bounded between -1 and 1. It is unitless because z-scores remove the original units. Correlation is symmetric, meaning the correlation between x and y equals the correlation between y and x. A value outside [-1, 1] indicates a calculation error.
Concept Tested: Properties of Correlation
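The three properties in question 5 (bounded, symmetric, unitless) can be checked numerically. The sketch below uses invented height and weight values, and `pearson_r` is an illustrative helper rather than anything defined in the quiz.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements: heights in inches, weights in pounds.
height_in = [60, 62, 65, 68, 70, 73]
weight_lb = [115, 120, 140, 155, 160, 190]

r_xy = pearson_r(height_in, weight_lb)
r_yx = pearson_r(weight_lb, height_in)          # swap the roles of x and y

# Change units: inches -> centimeters, pounds -> kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]
r_metric = pearson_r(height_cm, weight_kg)

print(r_xy, r_yx, r_metric)
```

All three printed values agree up to floating-point rounding, and each lies between -1 and 1.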
6. Data shows a strong positive correlation between ice cream sales and shark attacks. Which statement best explains this correlation?
- A. Eating ice cream causes people to be attacked by sharks
- B. Shark attacks cause people to buy more ice cream
- C. A third variable (summer/warm weather) causes both to increase
- D. This is proof that correlation equals causation
Show Answer
The correct answer is C. This is a classic example of correlation not implying causation. During summer months, both ice cream consumption and beach swimming increase due to warm weather. More swimming means more shark encounters, and more heat means more ice cream sales. Temperature is the lurking variable connecting both.
Concept Tested: Correlation vs. Causation (Correlation Limitations)
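The lurking-variable explanation in question 6 can be illustrated with a small simulation. Everything here is assumed for the sake of the sketch: the model, the coefficients, the noise levels, and the names `temps`, `ice_cream`, and `shark_index`. None of it is real data.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the run is reproducible

# Assumed model: temperature drives both outcomes; neither affects the other.
n = 2000
temps = [random.uniform(10, 35) for _ in range(n)]            # degrees Celsius
ice_cream = [10 * t + random.gauss(0, 20) for t in temps]     # daily sales
shark_index = [0.3 * t + random.gauss(0, 2) for t in temps]   # encounter index

r_all = pearson_r(ice_cream, shark_index)

# Hold the lurking variable roughly fixed: keep only days near 25 degrees.
band = [(i, s) for t, i, s in zip(temps, ice_cream, shark_index) if 24 <= t <= 26]
r_band = pearson_r([i for i, _ in band], [s for _, s in band])

print(round(r_all, 2), round(r_band, 2))
```

Across all simulated days the two outcomes are clearly positively correlated, but among days with nearly the same temperature the correlation largely disappears, which is what one would expect if temperature alone links them.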
7. A scatterplot of car age (years) versus resale value ($) shows points that follow a curved pattern, decreasing rapidly at first then leveling off. What is the form of this association?
- A. Linear
- B. Nonlinear
- C. No form
- D. Positive linear
Show Answer
The correct answer is B. A curved pattern that decreases rapidly and then levels off is nonlinear; it resembles exponential decay. Car values typically depreciate quickly in the early years and then stabilize. Fitting a straight line to such data would miss the true pattern.
Concept Tested: Nonlinear Form
8. When calculating the correlation coefficient, what is the purpose of converting both variables to z-scores?
- A. To make the calculations easier
- B. To standardize variables so they can be compared on the same scale
- C. To change the shape of the distribution
- D. To eliminate outliers from the data
Show Answer
The correct answer is B. Converting to z-scores puts both variables on a standard scale (mean 0, standard deviation 1), allowing us to compare how they move together regardless of their original units. This is why correlation is unitless. The shape of the distribution remains unchanged by standardization.
Concept Tested: Calculating Correlation
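The z-score calculation described in question 8 can be sketched as follows, assuming the common textbook form in which r is the sum of the products of paired z-scores divided by n - 1. The function names and the five data points are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]

def correlation_from_z(xs, ys):
    """r = sum of products of paired z-scores, divided by n - 1."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data: weekly exercise hours and resting heart rate (bpm).
exercise_hours = [1, 2, 3, 4, 5]
resting_hr = [80, 76, 74, 70, 65]

r = correlation_from_z(exercise_hours, resting_hr)
print(round(r, 3))
```

After standardizing, each variable has mean 0 and standard deviation 1, so hours and beats per minute no longer appear in the result. For this invented data r is strongly negative, close to -1.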
9. A single point is added to a dataset that previously showed r = 0.90. The new point has an extreme x-value and falls far from the line. What is most likely to happen to the correlation?
- A. It will stay exactly at 0.90
- B. It will increase toward 1.0
- C. It will decrease substantially
- D. It will become negative
Show Answer
The correct answer is C. Correlation is sensitive to outliers, especially those with extreme x-values (high leverage points). A point far from the established pattern will pull the correlation toward 0, weakening the apparent relationship. Such influential points can dramatically change the correlation coefficient.
Concept Tested: Properties of Correlation (Outlier Sensitivity)
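A small sketch of the effect in question 9, using invented data. The ten base points follow y ≈ x closely, and the added point (20, 5) has an extreme x-value and sits far below the pattern. The helper `pearson_r` and all values are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical base data with a strong positive linear pattern.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_before = pearson_r(xs, ys)

# Add one high-leverage point that falls far from the pattern.
r_after = pearson_r(xs + [20], ys + [5])

print(round(r_before, 2), round(r_after, 2))
```

With these numbers a single added point roughly halves the correlation. A more extreme point could push it even further, which is why it is worth checking the scatterplot for influential points before trusting r.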
10. A study finds a correlation of r = 0.65 between height and basketball skill among NBA players. What issue might affect this correlation?
- A. The correlation is too strong to be meaningful
- B. The range of heights is restricted, potentially understating the true correlation
- C. Height and basketball skill are not quantitative variables
- D. Professional players are not a valid sample
Show Answer
The correct answer is B. NBA players cover a restricted range of heights, because nearly all of them are far taller than the general population. When the range of one variable is restricted, the correlation tends to be weaker than it would be across the full range. The correlation between height and basketball skill across all people could therefore be higher than the 0.65 observed here.
Concept Tested: Correlation Limitations (Restricted Range)
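The restricted-range effect in question 10 can be illustrated with simulated data. This is a sketch under stated assumptions, not real player data: x and y are standardized stand-ins for height and skill, the linear model and noise level are invented, and `pearson_r` is an illustrative helper.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is reproducible

# Assumed population: y depends linearly on x plus independent noise.
n = 5000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [0.8 * x + random.gauss(0, 0.6) for x in xs]
r_full = pearson_r(xs, ys)

# Keep only the top 10% of x values, mimicking a sample of very tall players.
cutoff = sorted(xs)[int(0.9 * n)]
kept = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

The relationship between x and y is the same in both groups, but within the restricted sample the spread in x is small compared with the noise, so the correlation comes out noticeably lower than in the full population.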