Least Squares Problem
Implementing a least squares problem solver, along with some examples of it running.
The .Rmd, knitted with execution results, is available for download here.
Define a function to solve the least squares problem
The function:

- Takes in a list of (x, y) points.
- Creates a matrix 'A', with the first column being all 1's and the second column being all the 'x' values.
- Creates a vector 'b', with its values being all of the 'y' values.
- Then solves the normal equations: x~ = (AᵀA)⁻¹ Aᵀ b ...
- ... to get x~, the 'a' and 'b' values of the y = a + bx line of best fit.
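For instance, for the three points used in Problem 1 below, 'A' and 'b' would look like this (shown purely as an illustration, not as part of the solver itself):

```r
# illustration only: 'A' and 'b' for the points (1,0), (2,2), (4,5)
A = matrix(c(1, 1, 1,   # first column: all 1's
             1, 2, 4),  # second column: the x values
           nrow = 3, ncol = 2)
b = c(0, 2, 5)          # the y values
```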
```r
least_squares_linear = function(points){
  # create 'A'
  A = matrix(c(rep(1, length(points$x)), points$x), nr=length(points$x), nc=2)
  # create 'b'
  b = points$y
  # calculate 't(A)A'
  AtA = t(A) %*% A
  # find the inverse of 'AtA'
  invAtA = solve(AtA)
  # use the previous values to solve for 'x~'
  solution_linear = invAtA %*% t(A) %*% b
  return(solution_linear)
}
```

The only difference for the quadratic version is the additional column in 'A' containing all the 'x' values, squared.
Realistically, this could be worked into a general function that works for polynomials to the nth degree, but I can't be bothered (a rough sketch of what that might look like follows the quadratic version below).
```r
least_squares_quad = function(points){
  # create 'A'
  A = matrix(c(rep(1, length(points$x)), points$x, points$x ^ 2), nr=length(points$x), nc=3)
  # create 'b'
  b = points$y
  # calculate 't(A)A'
  AtA = t(A) %*% A
  # find the inverse of 'AtA'
  invAtA = solve(AtA)
  # use the previous values to solve for 'x~'
  solution_quad = invAtA %*% t(A) %*% b
  return(solution_quad)
}
```
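For what it's worth, here is a rough sketch of how such a general degree-n version might look. It is not part of the original solver and is not used in the problems below; least_squares_poly and its argument n are made up for illustration.

```r
# hypothetical generalisation: least squares fit of a degree-n polynomial
least_squares_poly = function(points, n){
  # columns of 'A' are x^0, x^1, ..., x^n
  A = sapply(0:n, function(k) points$x ^ k)
  b = points$y
  # same normal equations as before: (t(A) A) x~ = t(A) b
  solve(t(A) %*% A) %*% t(A) %*% b
}

# e.g. least_squares_poly(points, 1) should match least_squares_linear(points)
```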
Problem 1:
Find the least squares approximating line for the points:

- (1,0)
- (2,2)
- (4,5)

and compute the corresponding least squares error.
```r
# least squares error: the differences between the actual and predicted
# y values, squared, summed, and then square rooted
error = function(a, b, points){
  e_i = points$y - (a + b*points$x)
  return(sqrt(sum(e_i^2)))
}
```
```r
points = list()
points$x = c(1,2,4)
points$y = c(0,2,5)

sol = least_squares_linear(points)
# least squares solution is:
sol
##           [,1]
## [1,] -1.500000
## [2,]  1.642857

plot(x=points$x, y=points$y)
lines(x=points$x, y=sol[1] + sol[2] * points$x)
```
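The error() function defined above can then be used to get the corresponding least squares error. This call was not part of the executed output above, but with a = -1.5 and b ≈ 1.643 it should come out to sqrt(1/14) ≈ 0.267.

```r
# least squares error for the fitted line y = -1.5 + 1.642857x
error(sol[1], sol[2], points)
```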
Problem 2:
Find the least squares approximating parabola for the points:
- (1,1)
- (2,-2)
- (3,3)
- (4,4)
Since parabolas are of degree 2, use the quadratic least squares function.
```r
points = list()
points$x = c(1,2,3,4)
points$y = c(1,-2,3,4)

sol = least_squares_quad(points)
# solution is:
sol
##      [,1]
## [1,]  3.0
## [2,] -3.6
## [3,]  1.0

plot(x=points$x, y=points$y)
# add=TRUE draws the fitted parabola on top of the scatter plot
curve(sol[1] + sol[2]*x + sol[3]*x^2, from=min(points$x), to=max(points$x), add=TRUE)
```
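As a sanity check (not run as part of the original output), R's built-in lm() should recover the same coefficients, since it solves exactly this least squares problem:

```r
# intercept, x and x^2 coefficients should match sol above
lm(y ~ x + I(x^2), data = as.data.frame(points))
```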
Problem 3:
Construct and solve the normal equations to find a least squares solution of Ax = b, where
```r
A = matrix(c(c(3,1,1), c(1,2,1)), nc=2, nr=3)
A
##      [,1] [,2]
## [1,]    3    1
## [2,]    1    2
## [3,]    1    1
b = c(1,1,1)
b
## [1] 1 1 1
```

This is just the linear least squares problem again, but we are given 'A' and 'b' directly instead of points.
So, just use the linear least squares function code to solve for ‘x’
```r
# calculate 't(A)A'
AtA = t(A) %*% A
# find the inverse of 'AtA'
invAtA = solve(AtA)
# use the previous values to solve for 'x~'
sol = invAtA %*% t(A) %*% b
sol
##           [,1]
## [1,] 0.2000000
## [2,] 0.4666667
```
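A side note, not from the original solution: forming and inverting t(A) %*% A explicitly is fine for these small examples, but it is numerically less stable than a QR-based solve. Base R's qr.solve() should, as far as I know, return the same least squares solution here:

```r
# least squares solution via QR decomposition (should match sol above)
qr.solve(A, b)
```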
Problem 4:
Consider
```r
A = matrix(c(c(1,1,0,0),c(1,0,-1,-1),c(0,1,1,1),c(0,1,1,0)), nc=4, nr=4)
A
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    0    0
## [2,]    1    0    1    1
## [3,]    0   -1    1    1
## [4,]    0   -1    1    0
b = c(1,-3,2,4)
b
## [1]  1 -3  2  4
```

Show that the least squares solution of Ax = b is not unique. Solve the normal equations to find all least squares solutions.
[Hint: The RREF might be helpful.]
```r
if (!require(pracma)) {
  install.packages("pracma")
}
## Loading required package: pracma
library(pracma)
```
```r
rref(A)
##      [,1] [,2] [,3] [,4]
## [1,]    1    0    1    0
## [2,]    0    1   -1    0
## [3,]    0    0    0    1
## [4,]    0    0    0    0
```

Theorem 14 in the course notes states that the least squares solution is unique if 'A' has linearly independent columns, but that is not the case here: rref(A) has a zero row, so 'A' does not have full column rank and its columns are linearly dependent (column 3 has no pivot; in fact column 3 = column 1 - column 2).
Therefore the solution to the least squares problem is not unique.
```r
# solving for a general solution
# find t(A)A
AtA = t(A) %*% A
AtA
##      [,1] [,2] [,3] [,4]
## [1,]    2    1    1    1
## [2,]    1    3   -2   -1
## [3,]    1   -2    3    2
## [4,]    1   -1    2    2
# find t(A)b
Atb = t(A) %*% b
Atb
##      [,1]
## [1,]   -2
## [2,]   -5
## [3,]    3
## [4,]   -1

# form augmented matrix, and row reduce
aug = matrix(c(AtA, Atb), nr=4, nc=5)
rref(aug)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    0    1    0    3
## [2,]    0    1   -1    0   -4
## [3,]    0    0    0    1   -4
## [4,]    0    0    0    0    0

# formulate the general solution using the row-reduced matrix
# shuffling the equations around and solving for each 'xi'...
# the free variable is 'x3', meaning it can be anything; for now it's just set to 0
x3 = 0
x1 = 3 - x3
x2 = -4 + x3
x4 = -4
```
```r
# so the solution is:
x_tilde = function(x3){
  return(c(3,-4,0,-4) + x3 * c(-1, 1, 1, 0))
}

sol1 = x_tilde(0)
sol1
## [1]  3 -4  0 -4

sol2 = x_tilde(1)
sol2
## [1]  2 -3  1 -4
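```

One last observation, added here rather than taken from the original write-up: every choice of x3 gives the same fitted vector A %*% x_tilde(x3), because the direction c(-1, 1, 1, 0) lies in the null space of 'A'. A quick check of that claim (not part of the executed output above):

```r
# both least squares solutions should give the same fitted values ...
A %*% sol1
A %*% sol2

# ... and therefore the same residual norm
sqrt(sum((b - A %*% sol1)^2))
sqrt(sum((b - A %*% sol2)^2))
```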