Feineigle.com - Basics of Linear Algebra for Machine Learning - Discover the Mathematical Language of Data in Python


Published: May 29, 2018
Tags:  Math · Programming · Python



The book in...
One sentence:
A straightforward, simple (maybe too simple) introduction to linear algebra and how to carry out common operations using python and numpy.

Five sentences:
Slow to start, but it picks up the pace as it moves along. It is not proof heavy; if you want a full-blown linear algebra book, this is not it. There are plenty of examples of how to implement various linear algebra operations in Python/NumPy, and they are extremely well explained. For someone already familiar with either Python or linear algebra, it may seem a little remedial. All in all it was a little too simple for my tastes (though an enjoyable, breezy, casual read), but as an introduction before moving on to more advanced books, or for someone who wants to dive right into machine learning and is OK with a practical rather than nitty-gritty approach, it is ideal.

- designates my notes. / designates important.


Thoughts

Overall I’d say it is far too simple, but it would be a great way to whet your appetite if you didn’t know anything.

The pace eventually picks up and it becomes an extremely enjoyable read, although still too simple. It is, nevertheless, straight to the point and not proof heavy.

It would certainly make a good introduction if you want to go further, but it also offers enough to let you start using machine learning with more understanding (although not a deep understanding, which might not be necessary).

It includes lots of clear code examples for very basic calculations, but no mention of any actual applications.

It does have a lot of repetition at the beginning and end of each chapter, restating goals. It could probably have been done in 150 pages instead of 200. The links are mostly to Wikipedia articles, and the links to books on Amazon are redundant; the same half dozen books are linked over and over. Still, it is easy enough to glide past them.

Books


Table of Contents


Part 1: Introduction

· 01: Introduction to Linear Algebra

Part 2: Foundations

· 02: Linear Algebra and Machine Learning

· 03: Examples of Linear Algebra in Machine Learning

page 14:

Part 3: Numpy

· 04: Introduction to NumPy Arrays

· 05: Index, Slice and Reshape NumPy Arrays

· 06: NumPy Array Broadcasting

page 36:
page 40:
A.shape = (2 x 3)
b.shape = (3)
A.shape = (2 x 3)
b.shape = (1 x 3)
A.shape = (2 x 3)
b.shape = (1)
A.shape = (2 x 3)
b.shape = (1 x 1)
A.shape = (2 x 3)
b.shape = (1 x 2)
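
These shape pairs can be checked directly in NumPy; a minimal sketch of my own (not one of the book's listings) showing one shape that broadcasts across a 2 x 3 matrix and one that does not:

>>> # numpy broadcasting check
>>> from numpy import array
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6]])
>>> # b.shape = (3) broadcasts across each row of A
>>> b = array([10, 20, 30])
>>> print(A + b)
[[11 22 33]
 [14 25 36]]
>>> # a vector of length 2 does not match A's last axis
>>> c = array([10, 20])
>>> # A + c  would raise a ValueError (shapes cannot be broadcast together)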

Part 4: Matrices

· 07: Vectors and Vector Arithmetic

· 08: Vector Norms

page 53:
page 54:
L1(v) = ||v||_1  (8.1)
||v||_1 = |a_1| + |a_2| + |a_3|  (8.2)
>>> # vector L1 norm
>>> from numpy import array
>>> from numpy.linalg import norm
>>> # define vector
>>> a = array([1, 2, 3])
>>> print(a)
[1 2 3]
>>> # calculate norm
>>> l1 = norm(a, 1)
>>> print(l1)
6.0
page 55:
>>> # vector L2 norm
>>> from numpy import array
>>> from numpy.linalg import norm
>>> # define vector
>>> a = array([1, 2, 3])
>>> print(a)
[1 2 3]
>>> # calculate norm
>>> l2 = norm(a) #default param for L2
>>> print(l2)
3.74165738677
page 56:
L^inf (v) = ||v||_inf  (8.5)
||v||_inf = max(|a_1|, |a_2|, |a_3|)  (8.6)
>>> # vector max norm
>>> from math import inf
>>> from numpy import array
>>> from numpy.linalg import norm
>>> # define vector
>>> a = array([1, 2, 3])
>>> print(a)
[1 2 3]
>>> # calculate norm
>>> maxnorm = norm(a, inf)
>>> print(maxnorm)
3.0

· 09: Matrices and Matrix Arithmetic

page 62:
page 64:
C(m, k) = A(m, n) · B(n, k)
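
A quick check of that rule (my own sketch, not one of the book's listings): a 2 x 3 matrix times a 3 x 2 matrix gives a 2 x 2 result.

>>> # matrix-matrix multiplication shapes
>>> from numpy import array
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6]])
>>> B = array([
>>> [1, 2],
>>> [3, 4],
>>> [5, 6]])
>>> # C(2, 2) = A(2, 3) . B(3, 2)
>>> C = A.dot(B)
>>> print(C)
[[22 28]
 [49 64]]
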
page 66:

· 10: Types of Matrices

page 72:
page 73:
page 74:
page 75:
page 76:
page 77:
Q · Q^T = I
page 78:

- Orthogonal matrices are useful because their inverse is simply their transpose, which makes the inverse computationally cheap and numerically stable to calculate.
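
A small check of that claim, using a 90-degree rotation matrix as the orthogonal matrix (my own sketch, not from the book):

>>> # orthogonal matrix: inverse equals transpose
>>> from numpy import array
>>> from numpy import allclose
>>> from numpy.linalg import inv
>>> # define a 90-degree rotation matrix (orthogonal)
>>> Q = array([
>>> [0, -1],
>>> [1, 0]])
>>> # the inverse is just the transpose
>>> print(allclose(inv(Q), Q.T))
True
>>> # Q . Q^T recovers the identity
>>> print(Q.dot(Q.T))
[[1 0]
 [0 1]]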

· 11: Matrix Operations

page 82:
page 83:
page 84:
page 85:

· 12: Sparse Matrices

page 90:
page 91:
sparsity = count of zero elements / total elements
page 93:
page 94:
>>> # sparse matrix
>>> from numpy import array
>>> from scipy.sparse import csr_matrix
>>> # create dense matrix
>>> A = array([
>>> [1, 0, 0, 1, 0, 0],
>>> [0, 0, 2, 0, 0, 1],
>>> [0, 0, 0, 2, 0, 0]])

>>> print(A)
[[1 0 0 1 0 0]
[0 0 2 0 0 1]
[0 0 0 2 0 0]]

>>> # convert to sparse matrix (CSR method)
>>> S = csr_matrix(A)
>>> print(S)
(0, 0) 1
(0, 3) 1
(1, 2) 2
(1, 5) 1
(2, 3) 2

>>> # reconstruct dense matrix
>>> B = S.todense()
>>> print(B)
[[1 0 0 1 0 0]
[0 0 2 0 0 1]
[0 0 0 2 0 0]]
page 95:
sparsity = 1.0 - count_nonzero(A) / A.size
>>> # sparsity calculation
>>> from numpy import array
>>> from numpy import count_nonzero
>>> # create dense matrix
>>> A = array([
>>> [1, 0, 0, 1, 0, 0],
>>> [0, 0, 2, 0, 0, 1],
>>> [0, 0, 0, 2, 0, 0]])
>>> print(A)
[[1 0 0 1 0 0]
[0 0 2 0 0 1]
[0 0 0 2 0 0]]
>>> # calculate sparsity
>>> sparsity = 1.0 - (count_nonzero(A) / A.size)
>>> print(sparsity)
0.7222222222222222

· 13: Tensors and Tensor Arithmetic

page 99:
page 104:
page 105:

Part 5: Factorization

· 14: Matrix Decompositions

page 110:
A = LU  (14.2)
>>> # LU decomposition
>>> from numpy import array
>>> from scipy.linalg import lu
>>> # define a square matrix
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6],
>>> [7, 8, 9]])
>>> print(A)
[[1 2 3]
[4 5 6]
[7 8 9]]


>>> # factorize
>>> P, L, U = lu(A)
>>> print(P)
[[ 0. 1. 0.]
 [ 0. 0. 1.]
 [ 1. 0. 0.]]

>>> print(L)
[[ 1.         0.  0. ]
 [ 0.14285714 1.  0. ]
 [ 0.57142857 0.5 1. ]]

>>> print(U)
[[ 7.00000000e+00 8.00000000e+00  9.00000000e+00]
 [ 0.00000000e+00 8.57142857e-01  1.71428571e+00]
 [ 0.00000000e+00 0.00000000e+00 -1.58603289e-16]]

>>> # reconstruct
>>> B = P.dot(L).dot(U)
>>> print(B)
[[ 1. 2. 3.]
 [ 4. 5. 6.]
 [ 7. 8. 9.]]
page 111:
A = QR  (14.5)
page 112:
>>> # QR decomposition
>>> from numpy import array
>>> from numpy.linalg import qr
>>> # define rectangular matrix
>>> A = array([
>>> [1, 2],
>>> [3, 4],
>>> [5, 6]])
>>> print(A)
[[1 2]
 [3 4]
 [5 6]]

>>> # factorize
>>> Q, R = qr(A, 'complete')
>>> print(Q)
[[-0.16903085 0.89708523 0.40824829]
[-0.50709255 0.27602622 -0.81649658]
[-0.84515425 -0.34503278 0.40824829]]

>>> print(R)
[[-5.91607978 -7.43735744]
 [ 0.          0.82807867]
 [ 0.          0. ]]

>>> # reconstruct
>>> B = Q.dot(R)
>>> print(B)
[[ 1. 2.]
 [ 3. 4.]
 [ 5. 6.]]
A = L L^T  (14.7)
page 113:
A = U^T U  (14.8)
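
The notes don't include a listing for the Cholesky decomposition, so here is a minimal sketch of my own (not the book's listing) using numpy.linalg.cholesky, which returns the lower-triangular L of A = L · L^T for a symmetric positive definite matrix:

>>> # Cholesky decomposition
>>> from numpy import array
>>> from numpy.linalg import cholesky
>>> # define a symmetric positive definite matrix
>>> A = array([
>>> [2, 1, 1],
>>> [1, 2, 1],
>>> [1, 1, 2]])
>>> # factorize into the lower-triangular L
>>> L = cholesky(A)
>>> # reconstruct
>>> B = L.dot(L.T)
>>> print(B)
[[2. 1. 1.]
 [1. 2. 1.]
 [1. 1. 2.]]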

· 15: Eigendecomposition

page 117:
Av = λv
A = QΛQ^T  (15.4)
page 118:
page 119:
>>> # eigendecomposition
>>> from numpy import array
>>> from numpy.linalg import eig
>>> # define matrix
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6],
>>> [7, 8, 9]])
>>> print(A)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

>>> # factorize
>>> values, vectors = eig(A)
>>> print(values)
[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]

>>> print(vectors)
[[-0.23197069 -0.78583024 0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735 0.61232756 0.40824829]]
>>> # confirm eigenvector
>>> from numpy import array
>>> from numpy.linalg import eig
>>> # define matrix
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6],
>>> [7, 8, 9]])
>>> # factorize
>>> values, vectors = eig(A)
>>> # confirm first eigenvector
>>> B = A.dot(vectors[:, 0])
>>> print(B)
[ -3.73863537 -8.46653421 -13.19443305]

>>> C = vectors[:, 0] * values[0]
>>> print(C)
[ -3.73863537 -8.46653421 -13.19443305]
page 120:
>>> # reconstruct matrix
>>> from numpy import diag
>>> from numpy.linalg import inv
>>> from numpy import array
>>> from numpy.linalg import eig
>>> # define matrix
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6],
>>> [7, 8, 9]])
>>> print(A)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

>>> # factorize
>>> values, vectors = eig(A)
>>> # create matrix from eigenvectors
>>> Q = vectors
>>> # create inverse of eigenvectors matrix
>>> R = inv(Q)
>>> # create diagonal matrix from eigenvalues
>>> L = diag(values)
>>> # reconstruct the original matrix
>>> B = Q.dot(L).dot(R)
>>> print(B)
[[ 1. 2. 3.]
 [ 4. 5. 6.]
 [ 7. 8. 9.]]

· 16: Singular Value Decomposition

page 124:
A = U Σ V^T  (16.1)
page 125:
>>> # singular-value decomposition
>>> from numpy import array
>>> from scipy.linalg import svd
>>> # define a matrix
>>> A = array([
>>> [1, 2],
>>> [3, 4],
>>> [5, 6]])
>>> print(A)
[[1 2]
 [3 4]
 [5 6]]

>>> # factorize
>>> U, s, V = svd(A)
>>> print(U)
[[-0.2298477   0.88346102  0.40824829]
 [-0.52474482  0.24078249 -0.81649658]
 [-0.81964194 -0.40189603  0.40824829]]

>>> print(s)
[ 9.52551809 0.51430058]

>>> print(V)
[[-0.61962948 -0.78489445]
 [-0.78489445  0.61962948]]
U (m × m) · Σ (m × m) · V^T (n × n)  (16.2)
U (m × m) · Σ (m × n) · V^T (n × n)  (16.3)
>>> # reconstruct rectangular matrix from svd
>>> from numpy import array
>>> from numpy import diag
>>> from numpy import zeros
>>> from scipy.linalg import svd
>>> # define matrix
>>> A = array([
>>> [1, 2],
>>> [3, 4],
>>> [5, 6]])
>>> print(A)
[[1 2]
 [3 4]
 [5 6]]

>>> # factorize
>>> U, s, V = svd(A)
>>> # create m x n Sigma matrix
>>> Sigma = zeros((A.shape[0], A.shape[1]))
>>> # populate Sigma with n x n diagonal matrix
>>> Sigma[:A.shape[1], :A.shape[1]] = diag(s)
>>> # reconstruct matrix
>>> B = U.dot(Sigma.dot(V))
>>> print(B)
[[ 1. 2.]
 [ 3. 4.]
 [ 5. 6.]]
page 126:
>>> # reconstruct square matrix from svd
>>> from numpy import array
>>> from numpy import diag
>>> from scipy.linalg import svd
>>> # define matrix
>>> A = array([
>>> [1, 2, 3],
>>> [4, 5, 6],
>>> [7, 8, 9]])
>>> print(A)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

>>> # factorize
>>> U, s, V = svd(A)
>>> # create n x n Sigma matrix
>>> Sigma = diag(s)
>>> # reconstruct matrix
>>> B = U.dot(Sigma.dot(V))
>>> print(B)
[[ 1. 2. 3.]
 [ 4. 5. 6.]
 [ 7. 8. 9.]]
page 127:
A^+ = V D^+ U^T  (16.5)
A = U Σ V^T  (16.6)
page 128:
>>> # pseudoinverse
>>> from numpy import array
>>> from numpy.linalg import pinv
>>> # define matrix
>>> A = array([
>>> [0.1, 0.2],
>>> [0.3, 0.4],
>>> [0.5, 0.6],
>>> [0.7, 0.8]])
>>> print(A)
[[0.1 0.2]
 [0.3 0.4]
 [0.5 0.6]
 [0.7 0.8]]

>>> # calculate pseudoinverse
>>> B = pinv(A)
>>> print(B)
[[ -1.00000000e+01 -5.00000000e+00 9.04289323e-15  5.00000000e+00]
 [  8.50000000e+00  4.50000000e+00 5.00000000e-01 -3.50000000e+00]]
page 129:
A^+ = V^T D^T U^T
>>> # pseudoinverse via svd
>>> from numpy import array
>>> from numpy.linalg import svd
>>> from numpy import zeros
>>> from numpy import diag
>>> # define matrix
>>> A = array([
>>> [0.1, 0.2],
>>> [0.3, 0.4],
>>> [0.5, 0.6],
>>> [0.7, 0.8]])
>>> print(A)
[[0.1 0.2]
 [0.3 0.4]
 [0.5 0.6]
 [0.7 0.8]]

>>> # factorize
>>> U, s, V = svd(A)
>>> # reciprocals of s
>>> d = 1.0 / s
>>> # create m x n D matrix
>>> D = zeros(A.shape)
>>> # populate D with n x n diagonal matrix
>>> D[:A.shape[1], :A.shape[1]] = diag(d)
>>> # calculate pseudoinverse
>>> B = V.T.dot(D.T).dot(U.T)
>>> print(B)
[[ -1.00000000e+01 -5.00000000e+00 9.04831765e-15  5.00000000e+00]
[   8.50000000e+00  4.50000000e+00 5.00000000e-01 -3.50000000e+00]]
B = U · Σ_k · V_k^T  (16.10)
page 130:
>>> # data reduction with svd
>>> from numpy import array
>>> from numpy import diag
>>> from numpy import zeros
>>> from scipy.linalg import svd
>>> # define matrix
>>> A = array([
>>> [1,2,3,4,5,6,7,8,9,10],
>>> [11,12,13,14,15,16,17,18,19,20],
>>> [21,22,23,24,25,26,27,28,29,30]])
>>> print(A)
[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]

>>> # factorize
>>> U, s, V = svd(A)
>>> # create m x n Sigma matrix
>>> Sigma = zeros((A.shape[0], A.shape[1]))
>>> # populate Sigma with n x n diagonal matrix
>>> Sigma[:A.shape[0], :A.shape[0]] = diag(s)
>>> # select
>>> n_elements = 2
>>> Sigma = Sigma[:, :n_elements]
>>> V = V[:n_elements, :]
>>> # reconstruct
>>> B = U.dot(Sigma.dot(V))
>>> print(B)
[[ 1.   2.  3.  4.  5.  6.  7.  8.  9. 10.]
 [ 11. 12. 13. 14. 15. 16. 17. 18. 19. 20.]
 [ 21. 22. 23. 24. 25. 26. 27. 28. 29. 30.]]

>>> # transform
>>> T = U.dot(Sigma)
>>> print(T)
[[-18.52157747  6.47697214]
 [-49.81310011  1.91182038]
 [-81.10462276 -2.65333138]]

>>> T = A.dot(V.T)
>>> print(T)
[[-18.52157747  6.47697214]
 [-49.81310011  1.91182038]
 [-81.10462276 -2.65333138]]
page 131:
>>> # svd data reduction in scikit-learn
>>> from numpy import array
>>> from sklearn.decomposition import TruncatedSVD
>>> # define matrix
>>> A = array([
>>> [1,2,3,4,5,6,7,8,9,10],
>>> [11,12,13,14,15,16,17,18,19,20],
>>> [21,22,23,24,25,26,27,28,29,30]])
>>> print(A)
[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]

>>> # create transform
>>> svd = TruncatedSVD(n_components=2)
>>> # fit transform
>>> svd.fit(A)
>>> # apply transform
>>> result = svd.transform(A)
>>> print(result)
[[ 18.52157747  6.47697214]
 [ 49.81310011  1.91182038]
 [ 81.10462276 -2.65333138]]

Part 6: Statistics

· 17: Introduction to Multivariate Statistics

page 138:
>>> # matrix means
>>> from numpy import array
>>> from numpy import mean
>>> # define matrix
>>> M = array([
>>> [1,2,3,4,5,6],
>>> [1,2,3,4,5,6]])
>>> print(M)
[[1 2 3 4 5 6]
 [1 2 3 4 5 6]]

>>> # column means
>>> col_mean = mean(M, axis=0)
>>> print(col_mean)
[1. 2. 3. 4. 5. 6.]

>>> # row means
>>> row_mean = mean(M, axis=1)
>>> print(row_mean)
[3.5 3.5]
page 139:
>>> # vector variance
>>> from numpy import array
>>> from numpy import var
>>> # define vector
>>> v = array([1,2,3,4,5,6])
>>> print(v)
[1 2 3 4 5 6]

>>> # calculate variance
>>> result = var(v, ddof=1)
>>> print(result)
3.5
page 140:
>>> # matrix variances
>>> from numpy import array
>>> from numpy import var
>>> # define matrix
>>> M = array([
>>> [1,2,3,4,5,6],
>>> [1,2,3,4,5,6]])
>>> print(M)
[[1 2 3 4 5 6]
 [1 2 3 4 5 6]]

>>> # column variances
>>> col_var = var(M, ddof=1, axis=0)
>>> print(col_var)
[ 0. 0. 0. 0. 0. 0.]

>>> # row variances
>>> row_var = var(M, ddof=1, axis=1)
>>> print(row_var)
[ 3.5 3.5]
s = √(σ^2)
>>> # matrix standard deviation
>>> from numpy import array
>>> from numpy import std
>>> # define matrix
>>> M = array([
>>> [1,2,3,4,5,6],
>>> [1,2,3,4,5,6]])
>>> print(M)
[[1 2 3 4 5 6]
 [1 2 3 4 5 6]]

>>> # column standard deviations
>>> col_std = std(M, ddof=1, axis=0)
>>> print(col_std)
[0. 0. 0. 0. 0. 0.]

>>> # row standard deviations
>>> row_std = std(M, ddof=1, axis=1)
>>> print(row_std)
[1.87082869 1.87082869]
page 141:
page 142:
r = cov(X, Y) / (s_X × s_Y)  (17.17)
>>> # vector correlation
>>> from numpy import array
>>> from numpy import corrcoef
>>> # define first vector
>>> x = array([1,2,3,4,5,6,7,8,9])
>>> print(x)
[1 2 3 4 5 6 7 8 9]

>>> # define second vector
>>> y = array([9,8,7,6,5,4,3,2,1])
>>> print(y)
[9 8 7 6 5 4 3 2 1]

>>> # calculate correlation
>>> corr = corrcoef(x,y)[0,1]
>>> print(corr)
-1.0
page 143:
>>> # covariance matrix
>>> from numpy import array
>>> from numpy import cov
>>> # define matrix of observations
>>> X = array([
>>> [1, 5, 8],
>>> [3, 5, 11],
>>> [2, 4, 9],
>>> [3, 6, 10],
>>> [1, 5, 10]])
>>> print(X)
[[ 1  5  8]
 [ 3  5 11]
 [ 2  4  9]
 [ 3  6 10]
 [ 1  5 10]]

>>> # calculate covariance matrix
>>> Sigma = cov(X.T)
>>> print(Sigma)
[[ 1.   0.25 0.75]
 [ 0.25 0.5  0.25]
 [ 0.75 0.25 1.3 ]]

· 18: Principal Component Analysis

page 148:
page 149:
# principal component analysis
from numpy import array
from numpy import mean
from numpy import cov
from numpy.linalg import eig
# define matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
[[1 2]
 [3 4]
 [5 6]]

# column means
M = mean(A.T, axis=1)
# center columns by subtracting column means
C = A - M
# calculate covariance matrix of centered matrix
V = cov(C.T)
# factorize covariance matrix
values, vectors = eig(V)
print(vectors)
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]

print(values)
[8. 0.]

# project data
P = vectors.T.dot(C.T)
print(P.T)
[[-2.82842712 0. ]
 [ 0.         0. ]
 [ 2.82842712 0. ]]
page 150:
>>> # principal component analysis with scikit-learn
>>> from numpy import array
>>> from sklearn.decomposition import PCA
>>> # define matrix
>>> A = array([
>>> [1, 2],
>>> [3, 4],
>>> [5, 6]])
>>> print(A)
[[1 2]
 [3 4]
 [5 6]]

>>> # create the transform
>>> pca = PCA(2)
>>> # fit transform
>>> pca.fit(A)
>>> # access values and vectors
>>> print(pca.components_)
[[ 0.70710678  0.70710678]
 [ 0.70710678 -0.70710678]]

>>> print(pca.explained_variance_)
[ 8.00000000e+00 2.25080839e-33]

>>> # transform data
>>> B = pca.transform(A)
>>> print(B)
[[ -2.82842712e+00  2.22044605e-16]
 [  0.00000000e+00  0.00000000e+00]
 [  2.82842712e+00 -2.22044605e-16]]

· 19: Linear Regression

page 155:
>>> # linear regression dataset
>>> from numpy import array
>>> from matplotlib import pyplot
>>> # define dataset
>>> data = array([
>>> [0.05, 0.12],
>>> [0.18, 0.22],
>>> [0.31, 0.35],
>>> [0.42, 0.38],
>>> [0.5, 0.49]])
>>> print(data)
[[0.05 0.12]
 [0.18 0.22]
 [0.31 0.35]
 [0.42 0.38]
 [0.5  0.49]]

>>> # split into inputs and outputs
>>> X, y = data[:,0], data[:,1]
>>> X = X.reshape((len(X), 1))
>>> # scatter plot
>>> pyplot.scatter(X, y)
>>> pyplot.show()
page 156:
page 157:
>>> # direct solution to linear least squares
>>> from numpy import array
>>> from numpy.linalg import inv
>>> from matplotlib import pyplot
>>> # define dataset
>>> data = array([
>>> [0.05, 0.12],
>>> [0.18, 0.22],
>>> [0.31, 0.35],
>>> [0.42, 0.38],
>>> [0.5, 0.49]])

>>> # split into inputs and outputs
>>> X, y = data[:,0], data[:,1]
>>> X = X.reshape((len(X), 1))
>>> # linear least squares
>>> b = inv(X.T.dot(X)).dot(X.T).dot(y)
>>> print(b)
[1.00233226]

>>> # predict using coefficients
>>> yhat = X.dot(b)

>>> # plot data and predictions
>>> pyplot.scatter(X, y)
>>> pyplot.plot(X, yhat, color='red')
>>> pyplot.show()
page 158:
A = Q · R  (19.13)
b = R^−1 · Q^T · y  (19.14)
>>> # QR decomposition solution to linear least squares
>>> from numpy import array
>>> from numpy.linalg import inv
>>> from numpy.linalg import qr
>>> from matplotlib import pyplot
>>> # define dataset
>>> data = array([
>>> [0.05, 0.12],
>>> [0.18, 0.22],
>>> [0.31, 0.35],
>>> [0.42, 0.38],
>>> [0.5, 0.49]])
>>> # split into inputs and outputs

>>> X, y = data[:,0], data[:,1]
>>> X = X.reshape((len(X), 1))

>>> # factorize
>>> Q, R = qr(X)
>>> b = inv(R).dot(Q.T).dot(y)
>>> print(b)
[1.00233226]

>>> # predict using coefficients
>>> yhat = X.dot(b)

>>> # plot data and predictions
>>> pyplot.scatter(X, y)
>>> pyplot.plot(X, yhat, color='red')
>>> pyplot.show()
page 159:
page 160:
X = U Σ V^T  (19.15)
b = X^+ y  (19.16)

Where the pseudoinverse X^+ is calculated as follows:

X^+ = V D^+ U^T  (19.17)
>>> # SVD solution via pseudoinverse to linear least squares
>>> from numpy import array
>>> from numpy.linalg import pinv
>>> from matplotlib import pyplot
>>> # define dataset
>>> data = array([
>>> [0.05, 0.12],
>>> [0.18, 0.22],
>>> [0.31, 0.35],
>>> [0.42, 0.38],
>>> [0.5, 0.49]])

>>> # split into inputs and outputs
>>> X, y = data[:,0], data[:,1]
>>> X = X.reshape((len(X), 1))
>>> # calculate coefficients
>>> b = pinv(X).dot(y)
>>> print(b)
[ 1.00233226]

>>> # predict using coefficients
>>> yhat = X.dot(b)
>>> # plot data and predictions
>>> pyplot.scatter(X, y)
>>> pyplot.plot(X, yhat, color='red')
>>> pyplot.show()
page 162:
>>> # least squares via convenience function
>>> from numpy import array
>>> from numpy.linalg import lstsq
>>> from matplotlib import pyplot
>>> # define dataset
>>> data = array([
>>> [0.05, 0.12],
>>> [0.18, 0.22],
>>> [0.31, 0.35],
>>> [0.42, 0.38],
>>> [0.5, 0.49]])

>>> # split into inputs and outputs
>>> X, y = data[:,0], data[:,1]
>>> X = X.reshape((len(X), 1))
>>> # calculate coefficients
>>> b, residuals, rank, s = lstsq(X, y)
>>> print(b)
[1.00233226]

>>> # predict using coefficients
>>> yhat = X.dot(b)
>>> # plot data and predictions
>>> pyplot.scatter(X, y)
>>> pyplot.plot(X, yhat, color='red')
>>> pyplot.show()

· Appendix

page 190: