
How to read a regression table?

A regression is a statistical model of the relationship between a variable y and one or more independent variables x1, x2, ..., xn. The regression table is a common feature of empirical articles in OB, psychology, and economics journals, where the hypothesized relationships are tested through interpretation of the regression. A good way to learn how to read regression tables is to read the text and relate it to the numbers in the table.

1. What is the dependent variable (DV)? This is the variable that we are trying to predict/explain. What is desirable in real terms: an increasing DV or a decreasing DV?

2. What are the independent variables (IVs)? How are they measured? What does a positive or negative sign on a variable represent? (A positive sign means that an increase in the IV is associated with an increase in the DV.)

3. How many observations have been included in the sample?

4. A regression is an optimization procedure that derives an equation predicting/explaining the dependent variable in terms of the independent variables. Its general form is DV = f(IV1, IV2, ..., IVn) + c.

5. One regression table may contain the output of several regressions, with the output of each regression listed as a separate column. Often these columns are numbered 1, 2, 3, and so on. The first thing to find out is how the regression columns differ. If the column headings differ, each column may be predicting a different DV. If the columns are simply numbered, each column may represent a more complete explanation of the same DV, typically incorporating more IVs in a stepwise fashion. If you are confused or short of time, a safe option is to look at the rightmost column, which usually contains the most IVs and is often the most complete description of the DV.

6. Under each column, the numbers in the regression table represent the following:

a. Coefficients of IVs: The values listed against each IV (the IVs are usually named at the extreme left of the table) are its coefficients in the regression equation. The coefficients may be either standardized or non-standardized. Economics journals usually report non-standardized values, whereas OB/psychology journals usually report standardized values. If you are not sure, this is stated in the table or in the text.

i. Non-standardized coefficients: These are produced when the analysis variables retain their units (so they remain sensitive to units of measurement). They can be used directly to plug in values of an IV and predict the value of the DV.

ii. Standardized coefficients: These are produced when the analysis variables are standardized (i.e., the mean is subtracted from each value and the difference is divided by the standard deviation). They are unit-free and cannot be used to predict the value of the DV in meaningful terms. However, they can be compared with each other: the larger the coefficient, the stronger the relationship of that IV with the DV.

b. Statistical significance of the coefficients of IVs: Significance is usually marked on the coefficients with one, two, or three asterisks (*), whose meaning is explained at the bottom of the table. The more asterisks, the more statistically significant the coefficient. Statistical significance represents the confidence with which we can say that a particular coefficient is non-zero. If a coefficient is not statistically significant, then irrespective of its numerical value we can treat it as zero; a coefficient of zero means that that independent variable is not related to the dependent variable. If asterisks are absent, one of three other indications of significance may be used: (i) a footnote in the table states that all reported values (above a particular value) are significant; (ii) the p value is given in parentheses, with p values less than .10 considered significant; (iii) the t statistic is given in parentheses (larger t values imply higher statistical significance).

c. Adjusted R2: The adjusted R square is a value between zero and one that represents the proportion of variance in the dependent variable explained by the set of independent variables in that column. In a series of regressions displayed in successive columns, you will often find the adjusted R square increasing, which shows that as variables are added to the regression, the explanatory power of the model improves.
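The three quantities in point 6 can be reproduced with a short sketch in Python using NumPy. The variables (pay, autonomy, satisfaction) and all effect sizes are hypothetical, simulated purely for illustration; real tables come from real data.

```python
import numpy as np

# Hypothetical example: predict job satisfaction (DV) from pay and
# autonomy (IVs). The data are simulated, not from any real study.
rng = np.random.default_rng(0)
n = 200
pay = rng.normal(50, 10, n)       # salary in $1000s
autonomy = rng.normal(3, 1, n)    # 1-5 survey scale
satisfaction = 0.08 * pay + 0.5 * autonomy + rng.normal(0, 0.5, n)

# Design matrix with an intercept column; fit by ordinary least squares.
X = np.column_stack([np.ones(n), pay, autonomy])
y = satisfaction
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # 6a-i: non-standardized coefficients

# 6b: t statistic = coefficient / standard error. A larger |t| means the
# coefficient is more clearly non-zero, i.e. more statistically significant.
resid = y - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = beta / se

# 6a-ii: standardized coefficients come from z-scoring every variable first;
# they are unit-free, so their magnitudes can be compared with each other.
def zscore(v):
    return (v - v.mean()) / v.std()

Xz = np.column_stack([zscore(pay), zscore(autonomy)])
beta_std, *_ = np.linalg.lstsq(Xz, zscore(y), rcond=None)

# 6c: adjusted R^2 penalizes plain R^2 for the number of predictors k.
k = 2
r2 = 1 - resid.var() / y.var()
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

With these simulated numbers, autonomy has the larger raw coefficient but pay has the larger standardized coefficient (because pay varies over a much wider range of units), which is exactly why only standardized coefficients should be compared with each other.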

7. Log terms and power terms: If an independent variable enters as a logarithm or a power term (a square or a cube), the relationship between the IV and the DV has to be interpreted accordingly, as y = k*ln(x) + constant or y = k*x^2 + constant.

8. Mediation: Regression tries to fit the most likely relationship between two or more variables. Suppose a variable A has an impact on B, which in turn leads to C (i.e., B is a mediator), and suppose we initially include only A and C in the regression; the algorithm is likely to show a strong relationship between A and C. However, once B is entered into the regression, the algorithm identifies that B is more strongly related to C than A is, and the earlier coefficient of A will shrink in size or significance. This shows that B mediates (partially or fully) the relationship between A and C.

9. Moderation: Moderation is tested by entering the cross product of two variables as an additional independent variable in the regression. Suppose A is related to C, the interaction term A*B is also entered into the regression equation, and we find that A*B is significant. This means that B moderates the relationship between A and C. If the coefficient of the interaction term is positive, the strength of the relationship between A and C increases as B increases; if it is negative, the strength of the relationship between A and C reduces as B increases.

10. Statistical and practical significance: The statistical significance of a relationship can be established from the statistical significance of the coefficients of the independent variables and the value of adjusted R square. Statistical significance simply means that the independent variable(s) have some relationship with the DV and can predict it with some certainty. However, this does not necessarily mean that the relationship is practically meaningful. Practical significance can be tested by plugging high and low values of an IV into the (non-standardized) regression equation and observing the impact on the values of the DV. If the change in the DV is meaningful enough, we can conclude that the relationship is practically significant.

11. Residuals: Residuals are the differences between the actual and the estimated values of the dependent variable (after the regression), with one residual corresponding to each data point. A residual is also known as an error, and it represents the unexplained portion of the regression.
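The mediation and moderation patterns described in points 8 and 9 can be seen directly in simulated data. This is a minimal sketch: the variables A, B, C and every effect size are hypothetical, chosen only so that the coefficient shifts are visible.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

def ols(ivs, y):
    """Least-squares coefficients; an intercept is added as the first column."""
    Xc = np.column_stack([np.ones(len(y))] + list(ivs))
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return beta

# --- Mediation (point 8): A -> B -> C ------------------------------------
A = rng.normal(0, 1, n)
B = 0.8 * A + rng.normal(0, 0.5, n)   # A drives B
C = 0.9 * B + rng.normal(0, 0.5, n)   # B drives C; A affects C only via B

coef_A_alone = ols([A], C)[1]      # looks strong: it absorbs the A->B->C path
coef_A_with_B = ols([A, B], C)[1]  # shrinks toward zero once B is entered

# --- Moderation (point 9): B2 changes the strength of A2 -> C2 ------------
A2 = rng.normal(0, 1, n)
B2 = rng.normal(0, 1, n)
C2 = (0.5 + 0.4 * B2) * A2 + rng.normal(0, 0.5, n)

beta_mod = ols([A2, B2, A2 * B2], C2)
interaction = beta_mod[3]  # positive: the A2->C2 slope strengthens as B2 rises
```

Printing `coef_A_alone` versus `coef_A_with_B` shows the mediation signature (a once-strong coefficient collapsing when the mediator is added), and the positive `interaction` coefficient is the moderation signature.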
