Linear regression with one variable through LMS algorithm

Rodrigo Barbosa de Santis May 21, 2019

1 Introduction

One of the most explored problems in the machine learning literature is linear regression, which consists in approximating the straight line (or function) that best fits a given set of points. The fitted line can then be used to predict outcomes for unknown values. Several methods have been proposed for linear regression, including geometric, mathematical and computational approaches. In this study, the Least Mean Square (LMS) algorithm is applied to this problem.

2 Materials and methods

2.1 Linear Regression

Linear regression is one of the oldest and most widely used predictive models in the field of machine learning. It fits a linear function that represents the expected results for unknown data (Principe et al., 1999). In the present work, a simple linear model with one variable is adopted for fitting the data shown in Table 1 and plotted in Fig. 1.

Table 1: Regression data

2.13 1.66 2.05 2.23 2.89 3.04 2.72 3.18

Figure 1: Plot of x versus d

The model is expressed by (Principe et al., 1999)

$$d_i = w x_i + b + \epsilon_i = y_i + \epsilon_i \qquad (1)$$

where $d_i$, $x_i$, $y_i$ and $\epsilon_i$ are the desired, predictor, linearly fitted and error values, respectively, for each $i = 1, 2, \ldots, N$, $w$ is the line slope and $b$ is the bias.
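The model and its error term can be sketched in a few lines. This is a minimal illustration written for Python 3 with NumPy, not the author's code; the function names `predict` and `residuals` and the four data points are hypothetical and are not the values of Table 1.

```python
import numpy as np

def predict(x, w, b):
    """Linearly fitted values y_i = w * x_i + b."""
    return w * np.asarray(x, dtype=float) + b

def residuals(x, d, w, b):
    """Error values eps_i = d_i - y_i."""
    return np.asarray(d, dtype=float) - predict(x, w, b)

# Hypothetical data, chosen only for illustration (not Table 1).
x = np.array([1.0, 2.0, 3.0, 4.0])
d = np.array([1.4, 1.5, 1.7, 1.8])

# Errors of a candidate line with slope 0.15 and bias 1.2.
eps = residuals(x, d, w=0.15, b=1.2)
```

By construction, adding `eps` back to the fitted values recovers the desired values `d`, which is exactly the decomposition $d_i = y_i + \epsilon_i$.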


In most instances, it is not possible to find a straight line that fits all values. Therefore, a criterion is needed to estimate which parameters perform best on this task. The mean square error (MSE) is one of the most commonly adopted criteria, calculated by (Principe et al., 1999)

$$J = \frac{1}{2N} \sum_{i=1}^{N} \epsilon_i^2 \qquad (2)$$

where $J$ is the average sum of squared errors and $N$ is the number of data samples to be fitted.
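As a rough sketch of this criterion, the function below computes the MSE for a candidate slope and bias. It assumes a 1/(2N) normalization, which is consistent with the gradient approximation $-\epsilon(k)x(k)$ used for LMS later in the text; drop the factor 2 if a plain mean is preferred. The name `mse` and the sample data are hypothetical.

```python
import numpy as np

def mse(x, d, w, b):
    """J = sum(eps_i^2) / (2N), with eps_i = d_i - (w * x_i + b).

    The 1/(2N) normalization is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    eps = d - (w * x + b)
    return np.sum(eps ** 2) / (2.0 * x.size)

# Hypothetical data, chosen only for illustration (not Table 1).
x = np.array([1.0, 2.0, 3.0, 4.0])
d = np.array([1.4, 1.5, 1.7, 1.8])

J = mse(x, d, w=0.15, b=1.2)
```

A line that passes through every point gives J = 0; any other line gives a strictly positive value.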

2.2 Least Mean Square (LMS)

The Least Mean Square (LMS) algorithm is a parameter search algorithm that minimizes the difference between the linear system output y and the desired response d. The function J(w), called the performance surface (see Fig. 2), is an important tool that helps to visualize how the adaptation of the weights affects the MSE (Principe et al., 1999).

Figure 2: Performance surface for the regression problem
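A performance surface of this kind can be approximated by evaluating J over a grid of candidate slopes. The sketch below holds the bias fixed and assumes a 1/(2N) normalization of the MSE; the function name `performance_surface`, the grid and the noiseless data are hypothetical and do not reproduce Fig. 2.

```python
import numpy as np

def performance_surface(x, d, w_values, b):
    """Evaluate J(w) for each candidate slope, with the bias b held fixed."""
    w_values = np.asarray(w_values, dtype=float)
    # Rows index candidate slopes, columns index samples.
    eps = d[None, :] - (w_values[:, None] * x[None, :] + b)
    return np.sum(eps ** 2, axis=1) / (2.0 * x.size)

# Hypothetical noiseless data lying on the line d = 0.5 * x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
d = 0.5 * x + 1.0

w_grid = np.linspace(-1.0, 2.0, 61)   # candidate slopes, step 0.05
J = performance_surface(x, d, w_grid, b=1.0)
w_best = w_grid[np.argmin(J)]         # slope at the bottom of the bowl
```

Because J is quadratic in w, the curve is a parabola whose minimum sits at the slope that generated the data.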

As our main goal is finding the w that minimizes the function J, each weight is iteratively updated by Eq. (3),

$$w(k + 1) = w(k) - \eta \nabla_w J(k) \qquad (3)$$

where $k = 1, 2, \ldots$ indexes the training iterations (or epochs) and $\eta$ is the step size (or learning rate).

The gradient of the performance surface $\nabla_w J(k)$ is a vector that points in the direction of maximum increase of J, given by

$$\nabla_w J = \frac{\partial J}{\partial w} \approx -\epsilon(k) x(k) \qquad (4)$$

and $\epsilon(k)$ is the residual error calculated by $\epsilon_i = d_i - (b + w x_i)$. Substituting Eq. (4) in Eq. (3), we obtain the final equation of the LMS algorithm (Principe et al., 1999):

$$w(k + 1) = w(k) + \eta \epsilon(k) x(k) \qquad (5)$$

The procedure is executed for n epochs with a fixed learning rate η to obtain an approximate result for the weights w. Depending on the initial values and the learning rate, the solution may or may not converge. One common phenomenon when the value of η is too large is known as rattling, where the algorithm reaches an unstable, non-optimal solution (Principe et al., 1999).
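The update rule can be sketched as a short training loop. This is a minimal illustration under stated assumptions, not the author's implementation: the text gives the update for w only, so the bias update here assumes b is treated as a weight with a constant input of 1. The function name `lms_fit`, its defaults and the data are hypothetical, and the code is written for Python 3 with NumPy.

```python
import numpy as np

def lms_fit(x, d, eta=0.01, epochs=1000, w0=0.0, b0=0.0):
    """Sample-by-sample LMS for the model y = w * x + b."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w, b = w0, b0
    for _ in range(epochs):
        for x_k, d_k in zip(x, d):
            eps = d_k - (b + w * x_k)   # residual error for this sample
            w += eta * eps * x_k        # w(k+1) = w(k) + eta * eps(k) * x(k)
            b += eta * eps              # assumption: bias as weight with input 1
    return w, b

# Hypothetical noiseless data lying on the line d = 0.5 * x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
d = 0.5 * x + 1.0

w, b = lms_fit(x, d, eta=0.05, epochs=2000)
```

On noiseless data such as this, the weights approach the generating slope and bias when the learning rate is small enough; with noisy data and a fixed η they instead settle near the best-fit values.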

3 Development

The algorithm is implemented using Python 2.7.8 (Van Rossum, 1998), including the following libraries:

1. NumPy – A large set of functions that allow array manipulation;

2. PyLab – A scientific library that provides a group of graphic and chart functions.

The parameters set for the method are summarized in Table 2. All weights w were initialized as 0.00.

Table 2: Parameters set for LMS algorithm

Learning rate   0.1
Epochs          1,000   10,000


4 Results

The first result, obtained using epochs = 1,000 and learning rate = 0.01, is shown in Figure 3. The model is described by the linear function f(x) = 0.1568x + 1.1918, with error J = 0.0360.

Figure 3: Linear model adjusted by the algorithm

Varying the learning rate to 0.1 resulted in an execution error that returned no values for the weights, whilst adopting 0.001 produced a worse solution than the one previously found, with J = 0.1550. The same sensitivity analysis was performed for the number of epochs: increasing it to 10,000, the method found a slightly better model with J = 0.0336, while lowering it to 100 yielded a substantially higher error, J = 0.1550.
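A sensitivity sweep of this kind can be sketched as follows. The text does not state the cause of the execution error; one plausible explanation, assumed here, is that the weights diverge until they overflow and become non-finite. The data, the helper names `lms_fit` and `mse`, the bias update and the 1/(2N) normalization are all hypothetical choices for illustration, so the numbers will not match the reported values of J.

```python
import numpy as np

def lms_fit(x, d, eta, epochs):
    """Sample-by-sample LMS for y = w * x + b, starting from zero weights."""
    w, b = 0.0, 0.0
    # Silence overflow warnings so a diverging run ends with inf/nan weights.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(epochs):
            for x_k, d_k in zip(x, d):
                eps = d_k - (b + w * x_k)
                w += eta * eps * x_k
                b += eta * eps          # assumption: bias as weight with input 1
    return w, b

def mse(x, d, w, b):
    """J = sum(eps^2) / (2N); the normalization is an assumption."""
    eps = d - (b + w * x)
    return np.sum(eps ** 2) / (2.0 * x.size)

# Hypothetical noiseless data (not Table 1).
x = np.linspace(1.0, 10.0, 10)
d = 0.15 * x + 1.2

results = {}
for eta in (0.1, 0.01, 0.001):
    w, b = lms_fit(x, d, eta, epochs=1000)
    results[eta] = (w, b, mse(x, d, w, b))
```

On this synthetic data the largest rate produces non-finite weights, the middle rate converges, and the smallest rate is still far from the minimum after the same number of epochs, which mirrors the qualitative pattern reported above.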

5 Conclusion

Although the LMS algorithm does not provide an analytical solution for the linear regression model, it yields a satisfactory approximation that can be found iteratively in many different applications. The method has some limitations: depending on the initial weights, it can converge to a local rather than a global solution, or start rattling and not converge at all. However, these are known issues treated by others, and the method is also quite important for introducing concepts developed in other machine learning methods.
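To give a feel for how close the iterative approximation can get, the sketch below compares LMS weights with the closed-form least-squares line returned by NumPy's `polyfit`. This comparison is not part of the original study; the data, the function name `lms_fit` and the bias update (bias treated as a weight with constant input 1) are hypothetical.

```python
import numpy as np

def lms_fit(x, d, eta=0.01, epochs=1000):
    """Sample-by-sample LMS for y = w * x + b, starting from zero weights."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x_k, d_k in zip(x, d):
            eps = d_k - (b + w * x_k)
            w += eta * eps * x_k
            b += eta * eps              # assumption: bias as weight with input 1
    return w, b

# Hypothetical noisy data (not Table 1).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
d = np.array([1.3, 1.6, 1.6, 1.9, 1.9, 2.2])

w_lms, b_lms = lms_fit(x, d, eta=0.01, epochs=5000)

# Closed-form least-squares line: polyfit returns [slope, intercept] for deg=1.
w_ls, b_ls = np.polyfit(x, d, deg=1)
```

With a fixed learning rate and noisy data, the LMS weights do not land exactly on the least-squares values but stay in a small neighborhood of them, and that neighborhood shrinks as η decreases.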


Principe, J. C., Euliano, N. R., & Lefebvre, W. C. (1999). Neural and adaptive systems: fundamentals through simulations with CD-ROM. John Wiley & Sons, Inc.

Van Rossum, G. (1998). Python: a computer language. Version 2.7.8. Amsterdam, Stichting Mathematisch Centrum.