Big Ideas Math, Course 3, Chapter 9: Data Analysis and Displays

Lesson 2: Lines of Fit

In this Grade 8 lesson from Big Ideas Math, Course 3, students learn how to draw a line of fit on a scatter plot and write its linear equation to model real-world data. They practice interpreting the slope and y-intercept of the line and use the equation to make predictions, applying Common Core standards 8.SP.1, 8.SP.2, and 8.SP.3. Contexts such as river depth and animal growth illustrate how lines of fit reveal linear relationships in data sets.

Section 1

Writing Equations for Lines of Fit

Property

A line of fit drawn through a scatter plot can be modeled with a linear equation by selecting two points that the line passes through or nearly passes through.
Use these points $(x_1, y_1)$ and $(x_2, y_2)$ to calculate the slope $m = \frac{y_2 - y_1}{x_2 - x_1}$, then substitute the slope and either point into the point-slope form $y - y_1 = m(x - x_1)$ to write the equation of the line of fit.

Examples
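As a minimal sketch of the two-point method, the Python snippet below computes the slope and then rearranges the point-slope form into $y = mx + b$. The function name `line_through` and the sample points (2, 5) and (6, 13) are illustrative assumptions, not data from the lesson.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2:
        raise ValueError("the two points must have different x-values")
    m = (y2 - y1) / (x2 - x1)  # slope: change in y over change in x
    b = y1 - m * x1            # rearranged from y - y1 = m(x - x1)
    return m, b


# Hypothetical points read off a hand-drawn line of fit.
m, b = line_through((2, 5), (6, 13))
print(f"y = {m}x + {b}")  # prints: y = 2.0x + 1.0
```

Picking a different pair of points on the same hand-drawn line should give nearly the same equation, which is a quick way to check the work.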

Section 2

The Line of Best Fit (Regression Line)

Property

The line of best fit, also known as the regression line, is the specific line that most accurately represents the data in a scatter plot. It is calculated mathematically to minimize the sum of the squared vertical distances from each data point to the line. The equation is typically in the form $y = mx + b$.

Examples

  • For a data set relating hours of sunshine ($x$) to plant growth in cm ($y$), the line of best fit might be $y = 0.5x + 2$. This line is the single best summary of the linear trend.
  • A scatter plot shows the relationship between a car's age in years ($x$) and its value in dollars ($y$). The calculated line of best fit could be $y = -2500x + 30000$.
  • Unlike a line of fit, which can be estimated by eye, the line of best fit is found using a precise formula, often with a calculator or computer software.
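To show how such an equation is interpreted and used for predictions, here is a short Python sketch built on the car example $y = -2500x + 30000$. The function name `predicted_car_value` is an assumption made for illustration.

```python
def predicted_car_value(age_years):
    """Predicted value in dollars from the example line y = -2500x + 30000."""
    slope = -2500      # value drops by about 2,500 dollars per year of age
    intercept = 30000  # predicted value of a brand-new car (age 0)
    return slope * age_years + intercept


print(predicted_car_value(0))  # prints: 30000
print(predicted_car_value(4))  # prints: 20000
```

The slope is the predicted change in value per year, and the y-intercept is the predicted value at age 0. Predictions are most trustworthy for ages close to the original data. For example, this line reaches 0 dollars at age 12 and goes negative after that, which does not make sense for a real car.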

Explanation

While a line of fit drawn by eye is a good visual estimate of the trend in a scatter plot, the line of best fit, also called the regression line, is the most precise version. It is a unique, mathematically determined line that best models the linear relationship between two variables: it makes the sum of the squared vertical distances from the data points to the line as small as possible. We often use technology to find its exact equation.
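A calculator or spreadsheet normally does this work, but the calculation can be sketched in plain Python using the standard least-squares formulas: the slope is the sum of $(x - \bar{x})(y - \bar{y})$ divided by the sum of $(x - \bar{x})^2$, and the intercept is $\bar{y} - m\bar{x}$. The function name `least_squares_line` and the sunshine data below are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are the same, so the slope is undefined")
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b


# Hypothetical data: hours of sunshine and plant growth in cm.
hours = [1, 2, 3, 4, 5]
growth = [2.4, 3.1, 3.4, 4.1, 4.5]
m, b = least_squares_line(hours, growth)
print(f"y = {m:.2f}x + {b:.2f}")  # prints: y = 0.52x + 1.94
```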

Section 3

Correlation Coefficient

Property

The correlation coefficient is a value, $r$, between $-1$ and $1$.

  • $r > 0$ suggests a positive (increasing) relationship.
  • $r < 0$ suggests a negative (decreasing) relationship.
  • The closer the value is to 0, the more scattered the data.
  • The closer the value is to 1 or $-1$, the less scattered the data.

Examples

  • The relationship between the distance a car is driven and the amount of fuel used would have a strong positive correlation, with an $r$ value close to 1, like $r = 0.98$.
  • Data comparing the number of hours a candle has been burning to its remaining height would show a strong negative correlation, with an $r$ value close to $-1$, like $r = -0.99$.
  • A scatter plot of a person's height versus the last digit of their phone number would have a correlation coefficient very close to 0.

Explanation

The correlation coefficient, $r$, is a score from $-1$ to $1$ that measures the direction and strength of a linear relationship. A score near 1 or $-1$ means the data points lie almost exactly on a line, while a score near 0 means there is little or no linear relationship.
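The lesson does not give a formula for $r$, so the Python sketch below uses the standard Pearson formula as an assumption about what a calculator computes. The function name `correlation_coefficient` and the candle measurements are made up for illustration.

```python
import math


def correlation_coefficient(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when all x-values or all y-values are equal")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: hours a candle has burned and its remaining height in cm.
hours_burned = [0, 1, 2, 3, 4]
height_cm = [20.0, 18.1, 15.9, 14.2, 12.0]
r = correlation_coefficient(hours_burned, height_cm)
print(round(r, 3))  # a value very close to -1: strong negative correlation
```

Because the height drops by nearly the same amount each hour, the points lie almost on a line with negative slope, so $r$ comes out close to $-1$.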
