Graph Regularized Linear Regression
Document
Description
Linear-regression estimators have become widely accepted as a reliable statistical tool in predicting outcomes. Because linear regression is a long-established procedure, the properties of linear-regression estimators are well understood and can be trained very quickly. Many estimators exist for modeling linear relationships, each having ideal conditions for optimal performance. The differences stem from the introduction of a bias into the parameter estimation through the use of various regularization strategies. One of the more popular ones is ridge regression which uses ℓ2-penalization of the parameter vector. In this work, the proposed graph regularized linear estimator is pitted against the popular ridge regression when the parameter vector is known to be dense. When additional knowledge that parameters are smooth with respect to a graph is available, it can be used to improve the parameter estimates. To achieve this goal an additional smoothing penalty is introduced into the traditional loss function of ridge regression. The mean squared error(m.s.e) is used as a performance metric and the analysis is presented for fixed design matrices having a unit covariance matrix. The specific problem setup enables us to study the theoretical conditions where the graph regularized estimator out-performs the ridge estimator. The eigenvectors of the laplacian matrix indicating the graph of connections between the various dimensions of the parameter vector form an integral part of the analysis. Experiments have been conducted on simulated data to compare the performance of the two estimators for laplacian matrices of several types of graphs – complete, star, line and 4-regular. The experimental results indicate that the theory can possibly be extended to more general settings taking smoothness, a concept defined in this work, into consideration.