Application to linear models

In this section, we will apply least-squares problems to economics.

Instead of finding the least-squares solution of $Ax = b$, we will find it for $X\beta = y$, where

$X$ → design matrix
$\beta$ → parameter vector
$y$ → observation vector

Least-Squares Line
Suppose we are given data points, and we want to find a line that best fits them. Let the best-fit line be the linear equation
$$y = \beta_0 + \beta_1 x$$

And let the data points be $(x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)$. The graph should look something like this:

[Figure: the best-fit line $y = \beta_0 + \beta_1 x$ drawn through the data points]

Our goal is to determine the parameters $\beta_0$ and $\beta_1$. Suppose for a moment that each data point lies on the line. Then
$$\begin{aligned} \beta_0 + \beta_1 x_1 &= y_1 \\ \beta_0 + \beta_1 x_2 &= y_2 \\ &\ \ \vdots \\ \beta_0 + \beta_1 x_n &= y_n \end{aligned}$$

This is a linear system, which we can write as $X\beta = y$, where
$$X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

The least-squares solution of $X\beta = y$ then satisfies the normal equation $X^T X \beta = X^T y$.
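As a quick illustration of this step (not from the original lesson; the data values are invented), the normal equation can be solved directly in numpy:

```python
import numpy as np

# Minimal sketch: fit y = beta_0 + beta_1*x by solving the normal
# equation X^T X beta = X^T y directly. Data values are invented.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])

# Design matrix: a column of ones (for beta_0) and the x-values (for beta_1)
X = np.column_stack([np.ones_like(x), x])

beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [beta_0, beta_1]
```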

General Linear Model
Since the data points are not actually on the line, there are residual values, also known as errors. So we introduce a vector called the residual vector $\epsilon$, where
$$\epsilon = y - X\beta, \qquad y = X\beta + \epsilon$$

Our goal is to minimize the length of $\epsilon$ (the error), so that $X\beta$ is approximately equal to $y$. This means we are finding a least-squares solution of $y = X\beta$ using $X^T X \beta = X^T y$.
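A short sketch of the residual computation (same invented data as above); `np.linalg.lstsq` returns the same least-squares solution without forming $X^T X$ by hand:

```python
import numpy as np

# Sketch (invented data): compute the residual vector eps = y - X @ beta_hat
# and its length, the quantity that least squares minimizes.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares solution
eps = y - X @ beta_hat                            # residual vector
print(np.linalg.norm(eps))                        # length of the error
```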

Least-Squares of Other Curves
Let the data points be $(x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)$, and suppose we want to find the best fit using the function $y = \beta_0 + \beta_1 x + \beta_2 x^2$, where $\beta_0, \beta_1, \beta_2$ are parameters. Technically, we are now using a best-fit quadratic function instead of a line.

[Figure: a best-fit quadratic function drawn through the data points]

Again, the data points don't actually lie on the curve, so we add residual values $\epsilon_1, \epsilon_2, \dots, \epsilon_n$, where
$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i, \qquad i = 1, \dots, n$$

Since we are minimizing the length of $\epsilon$, we can find the least-squares solution $\beta$ using $X^T X \beta = X^T y$. This can also be applied to other functions.
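Here is a minimal sketch of the quadratic case (invented data). The fit is still found by the same normal equation; only the design matrix gains a column:

```python
import numpy as np

# Sketch (invented data): fit y = beta_0 + beta_1*x + beta_2*x^2.
# The model is nonlinear in x but linear in the parameters, so only
# the design matrix changes.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 1.8, 3.5, 6.9, 12.1])

X = np.column_stack([np.ones_like(x), x, x**2])  # columns: 1, x, x^2
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [beta_0, beta_1, beta_2]
```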

Multiple Regression
Let the data points be $(u_1,v_1,y_1), (u_2,v_2,y_2), \dots, (u_n,v_n,y_n)$, and suppose we want to use the best-fit function $y = \beta_0 + \beta_1 u + \beta_2 v$, where $\beta_0, \beta_1, \beta_2$ are parameters.

[Figure: a best-fit plane for the multiple-regression data points]

Again, the data points don't actually lie on the plane, so we add residual values $\epsilon_1, \epsilon_2, \dots, \epsilon_n$, where
$$y_i = \beta_0 + \beta_1 u_i + \beta_2 v_i + \epsilon_i, \qquad i = 1, \dots, n$$

Since we are minimizing the length of $\epsilon$, we can find the least-squares solution $\beta$ using $X^T X \beta = X^T y$. This can also be applied to other multivariable functions.
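And a matching sketch for multiple regression (invented data), where each predictor supplies its own column:

```python
import numpy as np

# Sketch (invented data): multiple regression y = beta_0 + beta_1*u + beta_2*v.
# Each predictor contributes one column of the design matrix.
u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([2.0, 1.0, 4.0, 3.0])
y = np.array([5.0, 4.0, 11.0, 10.0])

X = np.column_stack([np.ones_like(u), u, v])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [beta_0, beta_1, beta_2]
```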
  • Introduction
    Applications to Linear Models Overview:
    a)
    Applying the Least-Squares Problem to Economics
    • Go from $Ax = b$ to $X\beta = y$
    $X$ → design matrix
    $\beta$ → parameter vector
    $y$ → observation vector

    b)
    Least-Squares Line
    • Finding the best fit line
    • Turning a system of equations into $X\beta = y$
    • Using the normal equation $X^T X \beta = X^T y$
    • Introduction of the residual vector

    c)
    Least-Squares of Other Curves
    • Finding the Best Fit Curve (not a line)
    • Using the normal equation $X^T X \beta = X^T y$

    d)
    Least-Squares to Multiple Regressions
    • Multiple Regression → multivariable function
    • Finding a Best Fit Plane
    • Using the normal equation $X^T X \beta = X^T y$


  • 1.
    Finding the Least-Squares Line
    Find the equation $y = \beta_0 + \beta_1 x$ of the least-squares line that best fits the given data points:
    $(0,1), (1,2), (2,3), (3,3)$
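If you want to verify your hand computation (this check is not part of the exercise), `np.linalg.lstsq` gives the least-squares line directly:

```python
import numpy as np

# Numerical check for Problem 1: least-squares line through the given points.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0, 3.0])
X = np.column_stack([np.ones_like(x), x])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [beta_0, beta_1]
```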

  • 2.
    Finding the Least-Squares of Other Curves
    Suppose the monthly costs of a product depend on seasonal fluctuations. A curve that approximates the cost is
    $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 \cos\left(\frac{2\pi x}{12}\right)$

    Suppose you want to find a better approximation in the future by evaluating the residual errors at each data point. Assume the errors for the data points are $\epsilon_1, \epsilon_2, \dots, \epsilon_n$.
    Give the design matrix, parameter vector, and residual vector for the model that leads to a least-squares fit of the equation above. Assume the data are $(x_1,y_1), \dots, (x_n,y_n)$.
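As a supplement (not part of the exercise), here is one way the design matrix for this model could be assembled in code; the cosine column is just another known function of $x$:

```python
import numpy as np

# Sketch: design matrix for y = b0 + b1*x + b2*x^2 + b3*cos(2*pi*x/12).
# The cosine term is a known function of x, so the model is still linear
# in the parameters. Months 1..12 are hypothetical sample data.
x = np.arange(1.0, 13.0)

X = np.column_stack([
    np.ones_like(x),             # beta_0 column
    x,                           # beta_1 column
    x**2,                        # beta_2 column
    np.cos(2 * np.pi * x / 12),  # beta_3 column
])
print(X.shape)  # (12, 4): one row per data point, one column per parameter
```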

  • 3.
    An experiment gives the data points $(0,1), (1,3), (2,4), (3,5)$. Suppose we wish to approximate the data using the equation
    $y = A + Bx^2$

    First find the design matrix, observation vector, and unknown parameter vector. No need to find the residual vector. Then find the least-squares curve for the data.
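A quick numerical check for this problem (not a substitute for the hand work):

```python
import numpy as np

# Numerical check for Problem 3: design matrix for y = A + B*x^2.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x**2])  # columns: 1, x^2

params, *_ = np.linalg.lstsq(X, y, rcond=None)
print(params)  # [A, B]
```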

  • 4.
    Finding the Least-Squares of Multiple Regressions
    When examining a local model of terrain, we take the data points to be $(1,1,3)$, $(2,2,5)$, and $(3,1,3)$. Suppose we wish to approximate the data using the equation
    $y = \beta_0 u + \beta_1 v$

    First find the design matrix, observation vector, and unknown parameter vector. No need to find the residual vector. Then find the least-squares plane for the data.
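A similar numerical check for this problem (again, not a substitute for the hand work):

```python
import numpy as np

# Numerical check for Problem 4: the model y = beta_0*u + beta_1*v has
# no intercept, so the design matrix has no column of ones.
u = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 2.0, 1.0])
y = np.array([3.0, 5.0, 3.0])
X = np.column_stack([u, v])

params, *_ = np.linalg.lstsq(X, y, rcond=None)
print(params)  # [beta_0, beta_1]
```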

  • 5.
    Proof Question Relating to Linear Models
    Show that
    $\lVert X\hat{\beta} \rVert^2 = \hat{\beta}^T X^T y$, where $\hat{\beta}$ is a least-squares solution of $X\beta = y$.