Gram-Schmidt and QR Factorization
Performing QR Factorization using the Gram-Schmidt process, along with some examples.
The .rmd, knitted with execution results, is available for download here.
Consider the system of equations Ax=b, where:
```r
A = matrix(c(c(2,1,3,0,5), c(3,0,-2,3,0), c(2,-27,6,2,1)), nr=5, nc=3)
A
##      [,1] [,2] [,3]
## [1,]    2    3    2
## [2,]    1    0  -27
## [3,]    3   -2    6
## [4,]    0    3    2
## [5,]    5    0    1
b = c(-1,2,0,1,-2)
b
## [1] -1  2  0  1 -2
```

Find a QR factorization of A
A QR factorization of ‘A’ consists of two matrices, ‘Q’ and ‘R’. ‘Q’ is a matrix with orthonormal columns, built from the vectors generated by the Gram-Schmidt process, and ‘R’ is an upper triangular matrix whose entry ‘r_ij’ is the dot product of the vector ‘e_i’ from the Gram-Schmidt process with the column ‘a_j’ of ‘A’.
First, apply the Gram-Schmidt process to the columns of ‘A’.
```r
# function for finding the norm of a vector (the built-in 'norm' was giving errors)
normal = function(vec){
  return(sqrt(sum(vec^2)))
}
```
```r
# function for projecting 'x' onto 'v'
project = function(x, v){
  dot = sum(x * v)  # scalar dot product; 'x %*% v' returns a 1x1 matrix,
                    # which triggers recycling warnings in the '* v' step
  nsqr = normal(v)^2
  return((dot / nsqr) * v)
}
```
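As a quick sanity check (my own toy example, not part of the assignment), the helpers are restated compactly below so the snippet runs on its own; projecting onto the x-axis keeps only the first coordinate, and projecting onto an orthogonal vector gives the zero vector:

```r
# compact restatement of the helpers, plus two sanity checks
normal = function(vec) sqrt(sum(vec^2))
project = function(x, v) (sum(x * v) / normal(v)^2) * v

project(c(1,1), c(1,0))   # projection onto the x-axis
## [1] 1 0
project(c(1,1), c(1,-1))  # (1,1) is orthogonal to (1,-1)
## [1] 0 0
```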
```r
# the matrix 'Q', same size as 'A', filled with 'NA' for now
Q = matrix(nr=nrow(A), nc=ncol(A))
```
```r
# v1 is the first column of 'A'
v1 = A[,1]
# v2 is the second column of 'A', minus its projection onto v1
v2 = A[,2] - project(A[,2], v1)
# v3 is the third column of 'A', minus its projections onto v1 and v2
v3 = A[,3] - project(A[,3], v1) - project(A[,3], v2)

# now normalize the vectors, and place them into 'Q'
Q[,1] = v1 / normal(v1)
Q[,2] = v2 / normal(v2)
Q[,3] = v3 / normal(v3)
Q
##           [,1]       [,2]        [,3]
## [1,] 0.3202563  0.6396021  0.07188852
## [2,] 0.1601282  0.0000000 -0.97049496
## [3,] 0.4803845 -0.4264014  0.21566555
## [4,] 0.0000000  0.6396021  0.07188852
## [5,] 0.8006408  0.0000000  0.03594426

# from there, R = Q^T A, so
R = round(t(Q) %*% A, 15)
R
##          [,1]     [,2]     [,3]
## [1,] 6.244998 0.000000  0.00000
## [2,] 0.000000 4.690416  0.00000
## [3,] 0.000000 0.000000 27.82086

# and you can recover A by multiplying Q and R
round(Q %*% R, 15)
##      [,1] [,2] [,3]
## [1,]    2    3    2
## [2,]    1    0  -27
## [3,]    3   -2    6
## [4,]    0    3    2
## [5,]    5    0    1

# using the built-in 'qr' method...
QR = qr(A)
# gives the same result
round(qr.Q(QR) %*% qr.R(QR), 15)
##      [,1] [,2] [,3]
## [1,]    2    3    2
## [2,]    1    0  -27
## [3,]    3   -2    6
## [4,]    0    3    2
## [5,]    5    0    1
```

Note: above I rounded R and Q %*% R to 15 decimal places because the entries that should be zero came out as ±1e-16 instead. Since all the non-zero entries were correct, this is just floating-point round-off, so rounding the results is safe. (In this particular example R happens to be diagonal rather than merely upper triangular, because the columns of ‘A’ are already mutually orthogonal.)
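Another quick check (my own addition): since the columns of ‘Q’ are orthonormal, t(Q) %*% Q should be the identity matrix. A self-contained sketch that rebuilds the factor with base R's qr:

```r
# verify that the Q factor has orthonormal columns, i.e. t(Q) %*% Q = I
A = matrix(c(2,1,3,0,5, 3,0,-2,3,0, 2,-27,6,2,1), nrow=5, ncol=3)
Q = qr.Q(qr(A))
round(t(Q) %*% Q, 10)
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
```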
Now find the least squares solution
The least squares solution using QR factorization is: x~ = R^-1 Q^T b (substitute A = QR into the normal equations t(A) A x~ = t(A) b; since t(Q) Q = I, they reduce to R x~ = t(Q) b).
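As an independent cross-check (my own aside), base R's qr.solve computes the least-squares solution of an overdetermined system in one call, and should agree with the manual computation:

```r
# least-squares solution straight from base R's QR machinery
A = matrix(c(2,1,3,0,5, 3,0,-2,3,0, 2,-27,6,2,1), nrow=5, ncol=3)
b = c(-1,2,0,1,-2)
qr.solve(A, b)   # matches x~ up to floating-point noise
```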
```r
# using my 'Q' and 'R'
xtilde1 = solve(R) %*% t(Q) %*% b
xtilde1
##             [,1]
## [1,] -0.25641026
## [2,]  0.00000000
## [3,] -0.07235142
A %*% xtilde1
##            [,1]
## [1,] -0.6575234
## [2,]  1.6970781
## [3,] -1.2033393
## [4,] -0.1447028
## [5,] -1.3544027

# using the 'qr' method
xtilde2 = solve(qr.R(QR)) %*% t(qr.Q(QR)) %*% b
xtilde2
##               [,1]
## [1,] -2.564103e-01
## [2,]  5.142561e-17
## [3,] -7.235142e-02
A %*% xtilde2
##            [,1]
## [1,] -0.6575234
## [2,]  1.6970781
## [3,] -1.2033393
## [4,] -0.1447028
## [5,] -1.3544027
```

Q2
Find the best approximation to a solution to the following system:
- x + y - z = 2
- -y + 2z = 6
- 3x + 2y - z = 11
- -x + z = 0
This is pretty standard at this point: treat it as a least squares problem and solve for ‘x~’ using the normal equations: x~ = inv(t(A) A) t(A) b
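One useful property for sanity-checking (my own aside): the least-squares residual A x~ - b is orthogonal to every column of A. A minimal self-contained sketch on a toy system:

```r
# toy least-squares problem: fit y = c0 + c1*t to three points
A2 = matrix(c(1,1,1, 1,2,3), nrow=3, ncol=2)   # columns: intercept, t
b2 = c(1,2,2)
xhat = solve(t(A2) %*% A2) %*% t(A2) %*% b2    # normal equations
round(t(A2) %*% (A2 %*% xhat - b2), 10)        # residual is orthogonal to the columns of A2
##      [,1]
## [1,]    0
## [2,]    0
```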
```r
# the system of equations can be represented by a matrix and a vector:
# each column of 'A' holds the coefficients of one variable (x, y, z)
A = matrix(c(c(1,0,3,-1), c(1,-1,2,0), c(-1,2,-1,1)), nc=3, nr=4)
A
##      [,1] [,2] [,3]
## [1,]    1    1   -1
## [2,]    0   -1    2
## [3,]    3    2   -1
## [4,]   -1    0    1
# 'b' holds the right-hand side of each equation
b = c(2,6,11,0)
b
## [1]  2  6 11  0

# solve for x~
xtilde = solve(t(A) %*% A) %*% t(A) %*% b
xtilde
##          [,1]
## [1,] 3.818182
## [2,] 1.727273
## [3,] 3.818182

# see how close the solution is; if it were exact this would give the zero vector
A %*% xtilde - b
##               [,1]
## [1,] -2.727273e-01
## [2,] -9.090909e-02
## [3,]  9.090909e-02
## [4,] -3.996803e-15
```

Q5 (GS computation only)
Apply the Gram-Schmidt process to the basis of the subspace ‘W’ to obtain an orthonormal basis of ‘W’
The subspace ‘W’ is described as the span of the following vectors:
```r
x1 = c(1,2,-2,1)
x1
## [1]  1  2 -2  1
x2 = c(1,1,0,2)
x2
## [1] 1 1 0 2
x3 = c(1,8,1,0)
x3
## [1] 1 8 1 0
```

Now to calculate the orthonormal basis of ‘W’:
```r
# u1 is just x1 normalized
u1 = x1 / normal(x1)

# v2 is x2 minus the projection of x2 onto x1
v2 = x2 - project(x2, x1)
# normalize, and round b/c there's a floating point error again
u2 = round(v2 / normal(v2), 15)

# v3 is x3 minus its projections onto x1 and v2
# (note: project onto v2, the already-orthogonalized vector, not onto x2 --
# x1 and x2 are not orthogonal here, so projecting onto x2 would not work)
v3 = x3 - project(x3, x1) - project(x3, v2)
# normalize
u3 = v3 / normal(v3)

# so the orthonormal basis of 'W' is the following three vectors:
u1
## [1]  0.3162278  0.6324555 -0.6324555  0.3162278
u2
## [1] 0.2672612 0.0000000 0.5345225 0.8017837
u3
## [1] -0.1091089  0.7637626  0.5455447 -0.3273268
```

Q6
Find an orthonormal basis for Real^3 that contains a vector parallel to:
```r
v1 = c(1,2,3)
```

Since the orthonormal basis must contain a vector parallel to ‘v1’, that basis vector is a scalar multiple of ‘v1’; it follows that we can use the normalized version of ‘v1’:
```r
u1 = v1 / normal(v1)
u1
## [1] 0.2672612 0.5345225 0.8017837
```

Next we need to find any other vector that is orthogonal to ‘v1’, such as:
```r
v2 = c(1,1,-1)
v1 %*% v2
##      [,1]
## [1,]    0
```

Since the dot product is zero, these two vectors are orthogonal; consequently the projection of ‘v2’ onto ‘v1’ is the zero vector:
```r
project(v2, v1)
## [1] 0 0 0
```

So applying the Gram-Schmidt process to ‘v2’ just returns ‘v2’ unchanged.
Now we can normalize ‘v2’
```r
u2 = v2 / normal(v2)
```

Now we apply the Gram-Schmidt process to any third vector to get one that is orthogonal to both ‘v1’ and ‘v2’:
```r
x3 = c(1,0,0)
v3 = x3 - project(x3, v1) - project(x3, v2)
v3
## [1]  0.5952381 -0.4761905  0.1190476
# then normalize
u3 = v3 / normal(v3)
```

Now we have an orthonormal basis for Real^3 that contains a vector parallel to ‘v1’:

```r
u1
## [1] 0.2672612 0.5345225 0.8017837
u2
## [1]  0.5773503  0.5773503 -0.5773503
u3
## [1]  0.7715167 -0.6172134  0.1543033
```
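As a final check (my own addition), stacking u1, u2, u3 into a matrix U, the product t(U) %*% U should be the identity; a self-contained sketch repeating the construction:

```r
# verify {u1, u2, u3} is an orthonormal basis of R^3
normal = function(vec) sqrt(sum(vec^2))
project = function(x, v) (sum(x * v) / normal(v)^2) * v
v1 = c(1,2,3); v2 = c(1,1,-1); x3 = c(1,0,0)
u1 = v1 / normal(v1)
u2 = v2 / normal(v2)
v3 = x3 - project(x3, v1) - project(x3, v2)
u3 = v3 / normal(v3)
U = cbind(u1, u2, u3)
round(t(U) %*% U, 10)   # should print the 3x3 identity
```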