Equation of the best fit line
Examples
- Determining the Equation for a Best Fit Line
Given the following bivariate data, give the equation for the best fit line and plot it on the given graph.
| x | y |
|---|---|
| 1 | 2 |
| 2 | 2 |
| 3 | 4 |
| 4 | 5 |
- Determining the Equation for a Best Fit Line using Calculator Commands
For the following bivariate data:
| x | y |
|---|---|
| 1 | 9 |
| 2 | 7 |
| 3 | 8 |
| 4 | 5 |
| 5 | 5 |
| 6 | 3 |
| 7 | 2 |
- Interpreting Graphical Data
In Skyrim (a video game) I plotted what level I was when I killed my first 5 dragons. The graphical data is given below:
| # of dragons killed | Corresponding level |
|---|---|
| 1 | Level 4 |
| 2 | Level 5 |
| 3 | Level 6 |
| 4 | Level 6 |
| 5 | Level 7 |
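The questions that go with this data are not shown here, so the following is only a sketch of one way to read it: fit the least-squares line to the five points and interpret the slope as the average number of levels gained per additional dragon. The variable names are illustrative.

```python
# Data from the table above: dragons killed vs. player level
dragons = [1, 2, 3, 4, 5]
levels = [4, 5, 6, 6, 7]

n = len(dragons)
sum_x = sum(dragons)
sum_y = sum(levels)
sum_xy = sum(x * y for x, y in zip(dragons, levels))
sum_x2 = sum(x * x for x in dragons)

# Least-squares slope and y-intercept
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - a * sum_x / n

print(f"level = {a:.2f} * dragons + {b:.2f}")
```

For this data the slope comes out to about 0.7, so on average the level rose by a little under one per dragon killed.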
Topic Notes
Table of Contents:
- Equation of the best fit line
- What is a line of best fit
- How to find line of best fit
- How to draw a line of best fit
- Example 1
Equation of the best fit line
This lesson continues our past two lessons, in which we talked about bivariate data, scatter plots and correlation, and then learnt about regression analysis. We will therefore build on the concepts from those two lessons to study the definition and characteristics of the line of best fit.
What is a line of best fit
As we saw in our past lesson, a line of best fit (or best fit line) is simply a straight line that represents the data points in a scatter plot as well as possible. This doesn't mean that the line will touch every single data point in the plot; in fact, a line of best fit may touch a few, all or none of the data points. For that reason, the line of best fit is also called the trend line: instead of exactly representing each point of the data set, it presents the overall trend that the data points follow, giving a view of their behaviour and of how the variables are correlated with each other.
How to find line of best fit
Since the line of best fit is simply a straight line, it can be mathematically defined through the equation for a straight line (equation 1):
$$y = ax + b$$
Where we know that:
y = dependent variable
x = independent variable
m = a = slope of the line (the name can differ depending on the textbook you are using)
b = y-intercept (the point in the graph where the line crosses the y-axis)
Notice that the slope can go by either of two names, m or a, depending on the textbook you are using in class or to study. For this lesson we will keep the name a; just remember that it refers to the slope of the best fit line.
When we look at a linear regression graph in which a bivariate data set has been plotted, we always have the values of the variables x_i and y_i (since these are the values given in the data set), so we will usually have to solve for the slope and the y-intercept in the equation for the line of best fit.
In other words, when a bivariate data set is given, x_i and y_i are provided, so a and b have to be calculated. This is not always the case: when the slope and the y-intercept are given instead, the line of best fit equation can be used to solve for the values of the variables themselves. But if the data table is provided, we will be solving for a and b.
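The formulas for the slope and the y-intercept are as follows (equation 2):
$$a = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
$$b = \bar{y} - a\bar{x}$$
Where:
n = number of data points
y_i = dependent variable data value
x_i = independent variable data value
a = slope of the best fit line
b = y-intercept
x̄ = mean for the sample of x values
ȳ = mean for the sample of y values
$\sum_{i=1}^{n}$ is the symbol for summation
therefore: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$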
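The slope and y-intercept formulas can be sketched in a few lines of Python. The function name `best_fit_line` and the sample points are illustrative assumptions, not part of the lesson; the points were chosen to lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def best_fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # The four sums that appear in the slope formula
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are equal, so the slope is undefined")

    # Solve for the slope first, since the intercept depends on it
    a = (n * sum_xy - sum_x * sum_y) / denominator
    # b = mean(y) - a * mean(x)
    b = sum_y / n - a * sum_x / n
    return a, b


# Hypothetical points lying exactly on y = 2x + 1
a, b = best_fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

The slope is computed before the intercept because the intercept formula uses it.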
In equation 2, notice that b is defined in terms of a; therefore, you will always solve for a first. b is also defined in terms of the means x̄ and ȳ, which brings us to an important realization: the data points shown in a regression analysis scatter plot count as a sample, not as a whole population. If you think about it, this makes sense, since a regression analysis scatter plot is usually used to estimate points that have not been graphed but can be inferred from the relationship shown by the given data points.
Therefore, when obtaining the mean of the values for each of the variables used in the analysis, we are taking the mean of sample data points, and so we use the notation for the mean of a sample: x̄ (and likewise ȳ).
After solving for a and b, we can substitute these values into the best fit line equation shown in equation 1 and plot the best fit line in the scatter plot.
How to draw a line of best fit
Let us use the method described above to obtain the best fit line of the bivariate data scatter plot shown in figure 2. We start by producing its corresponding data table so we know the values of x_i and y_i.
So let us solve for a by making the calculations in pieces:
Now we solve for b:
And so, we can obtain the points for our trend line using the line of best fit formula from equation 1:
And now we can graph the two points found above: (0, 8.9) and (13, 1.23); we connect them with a straight line and we find the line of best fit!
In the scatter plot with the line of best fit seen in figure 4, the points (0, 8.9) and (13, 1.23) are shown in green, and the best fit line is shown in blue.
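The data table and intermediate calculations for this example are not reproduced in the text, but the two plotted points are enough to recover the line they lie on. The sketch below is only a consistency check on those two points, not the original calculation; the variable names are illustrative.

```python
# The two trend-line points quoted in the text
x1, y1 = 0, 8.9
x2, y2 = 13, 1.23

# Slope from rise over run between the two points
a = (y2 - y1) / (x2 - x1)
# The first point sits at x = 0, so its y value is the y-intercept
b = y1 - a * x1

print(f"y = {a:.2f}x + {b:.2f}")
```

The slope comes out to about -0.59 with a y-intercept of 8.9, which matches a downward-sloping trend line.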
Let us work through another example so you can get more practice:
Example 1
Given the following bivariate data, what is the line of best fit? Use the equation for the line of best fit and plot it in the diagram provided.
We start by doing the calculation for the slope of the line of best fit:
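The worked figures are not reproduced in this text. Assuming the data set is the four points (1, 2), (2, 2), (3, 4), (4, 5) from the first example listed at the top of the page (an assumption, though these points do reproduce the trend-line points quoted further down), the sums are n = 4, ∑x_i = 10, ∑y_i = 13, ∑x_i y_i = 38 and ∑x_i² = 30, and the slope calculation would read:

```latex
a = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}
  = \frac{4(38) - (10)(13)}{4(30) - (10)^2}
  = \frac{22}{20}
  = 1.1
```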
Now we solve for b:
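Under the same assumption about the data, namely the points (1, 2), (2, 2), (3, 4), (4, 5), the sample means are x̄ = 10/4 = 2.5 and ȳ = 13/4 = 3.25. With a = 1.1 (the value the slope formula gives for these four points), the intercept would be:

```latex
b = \bar{y} - a\bar{x} = 3.25 - (1.1)(2.5) = 3.25 - 2.75 = 0.5
```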
And so, we can obtain the points for our trend line using the line of best fit formula from equation 1:
And now we can graph the two points found above: (0, 0.5) and (4, 4.9); we connect them with a straight line and we obtain the line of best fit:
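The whole example can be checked with a short script. It assumes the data set is the four points from the first example listed at the top of the page, which is not stated explicitly here; the variable names are illustrative.

```python
# Assumed data for Example 1 (taken from the first example in the list above)
xs = [1, 2, 3, 4]
ys = [2, 2, 4, 5]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope first, then the y-intercept, as in the lesson
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - a * sum_x / n

# Two points on the line, at x = 0 and x = 4, to draw it through
endpoints = [(x, round(a * x + b, 2)) for x in (0, 4)]
print(endpoints)
```

With this data the two endpoints agree with the points (0, 0.5) and (4, 4.9) given in the text.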
Now we end this lesson with a few recommendations: this lesson on the equation of the line of best fit provides many more examples that you can work through to keep practising what you learned today. And for even more practice on your own, this lines of best fit worksheet can be printed out and worked through!
That is it for today's lesson; see you in the next one!
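• $a = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$
• $b = \bar{y} - a\bar{x}$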