
Least Squares Problem

Implementing a least squares problem solver in R, along with some examples of it running.

The .rmd, knitted with execution results, is available for download here

Define a function to solve the least squares problem

The function takes in a list of (x, y) points.

It creates a matrix 'A', with the first column being all 1's and the second column being all the 'x' values.

It creates a vector 'b', with its values being all of the 'y' values.

It then solves the equation $\tilde{x} = (A^\top A)^{-1} A^\top b$ …

… to get $\tilde{x}$, the 'a' and 'b' coefficients of the line of best fit $y = a + bx$.
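For context (this step isn't spelled out in the original write-up), the formula comes from minimizing the squared residual norm $\|A\tilde{x} - b\|^2$; setting its gradient to zero gives the normal equations:

$$A^\top A \, \tilde{x} = A^\top b \quad \Longrightarrow \quad \tilde{x} = (A^\top A)^{-1} A^\top b,$$

provided $A^\top A$ is invertible. The function below computes exactly this.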

least_squares_linear = function(points){
  # create 'A'
  A = matrix(c(rep(1, length(points$x)), points$x), nrow=length(points$x), ncol=2)
  # create 'b'
  b = points$y
  # calculate 't(A)A'
  AtA = t(A) %*% A
  # find the inverse of 'AtA'
  invAtA = solve(AtA)
  # use the previous values to solve for 'x~'
  solution_linear = invAtA %*% t(A) %*% b
  return(solution_linear)
}
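As an aside, explicitly inverting 't(A)A' is fine for toy examples like these, but solving the normal equations directly is the numerically safer habit. A minimal sketch of that variant (the function name is mine, not from the original write-up):

least_squares_linear_stable = function(points){
  # same design matrix, built with cbind
  A = cbind(1, points$x)
  # solve t(A)A x = t(A)b without forming the inverse explicitly
  solve(t(A) %*% A, t(A) %*% points$y)
}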

The only difference for the quadratic version is an additional column in 'A' containing the 'x' values squared.

Realistically, this could be worked into a general function that works for polynomials of any degree 'n', but I can't be bothered; a rough sketch of that idea follows the quadratic version below.

least_squares_quad = function(points){
  # create 'A'
  A = matrix(c(rep(1, length(points$x)), points$x, points$x ^ 2), nrow=length(points$x), ncol=3)
  # create 'b'
  b = points$y
  # calculate 't(A)A'
  AtA = t(A) %*% A
  # find the inverse of 'AtA'
  invAtA = solve(AtA)
  # use the previous values to solve for 'x~'
  solution_quad = invAtA %*% t(A) %*% b
  return(solution_quad)
}
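For what it's worth, here is a rough sketch of the generalisation mentioned above (least_squares_poly is my name for it, not part of the original code; it assumes the same points-list input as the functions above):

least_squares_poly = function(points, degree){
  # design matrix with columns x^0, x^1, ..., x^degree
  A = sapply(0:degree, function(p) points$x ^ p)
  # solve the normal equations t(A)A x = t(A)b directly
  solve(t(A) %*% A, t(A) %*% points$y)
}

Degree 1 should reproduce least_squares_linear, and degree 2 should reproduce least_squares_quad.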

Problem 1:

Find the least squares approximating line for the points

  • (1,0)
  • (2,2)
  • (4,5)

and compute the corresponding least squares error.

# the least squares error: take the residuals (actual y minus predicted y),
# square them, sum them, and take the square root (the Euclidean norm of the residual vector)
error = function(a, b, points){
  e_i = points$y - (a + b*points$x)
  return(sqrt(sum(e_i^2)))
}
points = list()
points$x = c(1,2,4)
points$y = c(0,2,5)
sol = least_squares_linear(points)
# least squares solution is:
sol
## [,1]
## [1,] -1.500000
## [2,] 1.642857
plot(x=points$x, y=points$y)
lines(x=points$x, y=sol[1] + sol[2] * points$x)
e = error(sol[1], sol[2], points)
# least squares error is:
e
## [1] 0.2672612
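As a sanity check (not part of the original solution), R's built-in lm() fits the same model, and its coefficients and residual norm should match the values above:

fit = lm(y ~ x, data = points)
# should match sol: -1.5 and 1.642857
coef(fit)
# should match the least squares error e: 0.2672612
sqrt(sum(resid(fit)^2))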

Problem 2:

Find the least squares approximating parabola for the points

  • (1,1)
  • (2,-2)
  • (3,3)
  • (4,4)

Since parabolas are polynomials of degree 2, use the quadratic least squares function.

points = list()
points$x = c(1,2,3,4)
points$y = c(1,-2,3,4)
sol = least_squares_quad(points)
# solution is:
sol
## [,1]
## [1,] 3.0
## [2,] -3.6
## [3,] 1.0
plot(x=points$x, y=points$y)
# add=TRUE overlays the fitted parabola on the existing scatter plot
curve(sol[1] + sol[2]*x + sol[3]*x^2, from=min(points$x), to=max(points$x), add=TRUE)
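If the hypothetical least_squares_poly sketch from earlier is in scope, degree 2 should reproduce these coefficients; base R's lm() agrees as well:

# both should match sol: 3.0, -3.6, 1.0
least_squares_poly(points, 2)
coef(lm(y ~ x + I(x^2), data = points))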

Problem 3:

Construct and solve the normal equations to find a least squares solution of Ax = b, where

A = matrix(c(c(3,1,1), c(1,2,1)), ncol=2, nrow=3)
A
## [,1] [,2]
## [1,] 3 1
## [2,] 1 2
## [3,] 1 1
b = c(1,1,1)
b
## [1] 1 1 1

This is just the linear least squares setup again, except we are given 'A' and 'b' directly instead of points.

So, reuse the linear least squares function's code to solve for 'x'.

# calculate 't(A)A'
AtA = t(A) %*% A
# find the inverse of 'AtA'
invAtA = solve(AtA)
# use the previous values to solve for 'x~'
sol = invAtA %*% t(A) %*% b
sol
## [,1]
## [1,] 0.2000000
## [2,] 0.4666667
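As an aside, base R can do this in one step: qr.solve() computes the least squares solution of an overdetermined system via the QR decomposition, and should agree with sol:

# expect 0.2000000 and 0.4666667
qr.solve(A, b)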

Problem 4:

Consider

A = matrix(c(c(1,1,0,0),c(1,0,-1,-1),c(0,1,1,1),c(0,1,1,0)), ncol=4, nrow=4)
A
## [,1] [,2] [,3] [,4]
## [1,] 1 1 0 0
## [2,] 1 0 1 1
## [3,] 0 -1 1 1
## [4,] 0 -1 1 0
b = c(1,-3,2,4)
b
## [1] 1 -3 2 4

Show that the least squares solution of Ax = b is not unique. Solve the normal equations to find all least squares solutions.

[Hint: The RREF might be helpful.]

if (!require(pracma)) {
  install.packages("pracma")
}
## Loading required package: pracma
library(pracma)
rref(A)
## [,1] [,2] [,3] [,4]
## [1,] 1 0 1 0
## [2,] 0 1 -1 0
## [3,] 0 0 0 1
## [4,] 0 0 0 0

Theorem 14 in the course notes states that the least squares solution is unique if 'A' has linearly independent columns. That is not the case here: the RREF of 'A' has a zero row (only 3 pivots for 4 columns), so the columns of 'A' are linearly dependent.

Therefore the solution to the least squares problem is not unique.
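The dependence can also be confirmed numerically from the rank reported by base R's QR decomposition:

qr(A)$rank   # 3, but 'A' has 4 columns, so the columns are linearly dependent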

# solving for a general solution
# find t(A)A
AtA = t(A) %*% A
AtA
## [,1] [,2] [,3] [,4]
## [1,] 2 1 1 1
## [2,] 1 3 -2 -1
## [3,] 1 -2 3 2
## [4,] 1 -1 2 2
# find t(A)b
Atb = t(A) %*% b
Atb
## [,1]
## [1,] -2
## [2,] -5
## [3,] 3
## [4,] -1
# form augmented matrix, and row reduce
aug = matrix(c(AtA, Atb), nrow=4, ncol=5)
rref(aug)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 0 1 0 3
## [2,] 0 1 -1 0 -4
## [3,] 0 0 0 1 -4
## [4,] 0 0 0 0 0
# formulate the general solution using solved matrix
# shuffling the equations around and solving for each 'xi'...
# the free variable is 'x3', meaning it can be anything, for now it's just set to 0
x3 = 0
x1 = 3 - x3
x2 = -4 + x3
x4 = -4
# so the solution is:
x_tilde = function(x3){
  return(c(3,-4,0,-4) + x3 * c(-1, 1, 1, 0))
}
sol1 = x_tilde(0)
sol1
## [1] 3 -4 0 -4
sol2 = x_tilde(1)
sol2
## [1] 2 -3 1 -4
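As a final check (my addition, not in the original), every least squares solution must satisfy the normal equations, and they all share the same fitted vector A %*% x, hence the same residual norm:

# both should give the zero vector
AtA %*% sol1 - Atb
AtA %*% sol2 - Atb
# the residual norms should be identical
sqrt(sum((A %*% sol1 - b)^2))
sqrt(sum((A %*% sol2 - b)^2))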