Linear regression is one of the econometric methods of empirical research and is applied in almost all sciences. It comprises a set of methods for estimating the statistical causality between two or more factors (variables of interest). A central assumption of linear regression is the ceteris paribus condition, which means “all other things being equal” or “if other factors are kept constant”. Linear regression is covered in almost every textbook on statistics and econometrics, as well as in the research literature on methods of empirical economic research.
- For methodological issues, you can consult the Journal of Econometrics, which is published by Elsevier and available on ScienceDirect.
Simple Linear Model
In a simple linear model, it is assumed that only one variable $x_{1i}$ has a significant impact on the variable $y_i$, and all unexplained variation in $y_i$ is captured by the error term $e_i$.
y_i=\beta_0 + \beta_1 \cdot x_{1i} + e_i \, \text{or} \, y_i=\beta_0+\sum_{j=1}^{k} \beta_j \cdot x_{ji} + e_i, \quad k=1
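As an illustration of how $\beta_0$ and $\beta_1$ might be estimated from a sample, the following is a minimal sketch. It assumes ordinary least squares (OLS) as the estimator, which the text above does not name explicitly. The data are simulated, and the helper name `fit_simple_ols` and the chosen "true" coefficients are hypothetical.

```python
import numpy as np


def fit_simple_ols(x, y):
    """Return (beta0_hat, beta1_hat) for y_i = beta0 + beta1 * x_i + e_i.

    Assumption: ordinary least squares is used as the estimator.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


# Simulated sample (illustrative only): true beta0 = 2.0, true beta1 = 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
e = rng.normal(0.0, 1.0, size=200)  # error term e_i
y = 2.0 + 0.5 * x + e

beta0_hat, beta1_hat = fit_simple_ols(x, y)
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

Because of the error term, the estimates should lie close to, but not exactly on, the values used to generate the data.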
Multi-Regression model
y_i=\beta_0 + \beta_1 \cdot x_{1i} + \beta_2 \cdot x_{2i} + ... + \beta_{k-1} \cdot x_{(k-1)i} + \beta_k \cdot x_{ki} + e_i
y_i=\beta_0+\sum_{j=1}^{k} \beta_j \cdot x_{ji} + e_i
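The multi-regression model can be estimated in much the same way once the $k$ variables are collected in a matrix. The sketch below again assumes OLS as the estimator and uses simulated data. The function name `fit_multiple_ols` and the coefficient values are hypothetical and serve only as an illustration.

```python
import numpy as np


def fit_multiple_ols(X, y):
    """Return [beta0_hat, beta1_hat, ..., betak_hat] for the multi-regression model.

    X has one row per observation i and one column per variable x_j.
    Assumption: ordinary least squares is used as the estimator.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the first coefficient is beta_0.
    design = np.column_stack([np.ones(X.shape[0]), X])
    # Least-squares solution of design @ beta = y.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Simulated sample (illustrative only) with k = 3 explanatory variables.
rng = np.random.default_rng(1)
n, k = 500, 3
X = rng.normal(size=(n, k))
true_beta = np.array([1.0, 2.0, -1.0, 0.5])  # beta_0, beta_1, beta_2, beta_3
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.5, size=n)

beta_hat = fit_multiple_ols(X, y)
print(np.round(beta_hat, 3))
```

Using `np.linalg.lstsq` rather than inverting the matrix by hand is a design choice made here for numerical stability. Other solvers would serve equally well for a sketch of this size.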
Why is the linear regression method used in science?
Scientists use regression to explain the statistical causality between two or more factors and thereby identify potential statistical correlations relevant to their research question. A researcher also wants to test whether the statistical estimates reflect reality, or at least whether the relationship observed in the sample also holds in the population. Regression methods can therefore be used both to calculate and to analyze the relevant factors in a research question: regression calculation aims to estimate the relevant coefficients (causality estimators), whereas regression analysis aims to test relevant empirical hypotheses (inferential statistics).
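The distinction between regression calculation and regression analysis can be sketched in code. The example below first estimates the coefficients and then computes a t-statistic for each one, which is a common basis for testing the hypothesis that a coefficient is zero. It assumes OLS and classical standard errors (errors with constant variance), neither of which is specified in the text. The function `ols_with_t_stats` and the simulated data are hypothetical.

```python
import numpy as np


def ols_with_t_stats(X, y):
    """Return (coefficients, standard errors, t-statistics), intercept first.

    Assumptions: OLS estimator, classical (homoskedastic) standard errors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(X.shape[0]), X])
    n, p = design.shape
    # Regression calculation: estimate the coefficients.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Regression analysis: quantify the uncertainty of the estimates.
    residuals = y - design @ beta
    sigma2 = residuals @ residuals / (n - p)  # estimated error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Simulated sample (illustrative only): y depends on x1 but not on x2.
rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + rng.normal(size=n)

beta, se, t_stats = ols_with_t_stats(np.column_stack([x1, x2]), y)
for name, b, s, t in zip(["const", "x1", "x2"], beta, se, t_stats):
    print(f"{name:>5}: coef = {b: .3f}, se = {s:.3f}, t = {t: .2f}")
```

In large samples, an absolute t-statistic above roughly 1.96 is conventionally read as evidence against a zero coefficient at the 5% level.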
Where is linear regression applied in economics and business administration?
Consider the following example. As an economics student, you are used to reading statements such as the following in almost all economics and business administration textbooks: “The law of demand says that when prices rise, the demand for a normal good falls”. Where does the law of demand come from? Can this assertion be proven empirically? When is a good a normal good? Although answers to these questions can be found in any textbook, the justification behind them, which textbooks sometimes explain only incompletely, is more likely to be found in empirical research using econometric methods.
Economists work with theoretical models that can be tested empirically to determine the extent to which they reflect the research question in reality. In the case of the law of demand, the Cobb-Douglas model can be applied, which assumes constant elasticity of demand. This is where the first problem arises: the theoretical Cobb-Douglas model is not linear, as linear regression methods require, but non-linear (multiplicative). By taking logarithms of both sides, however, the Cobb-Douglas model can be transformed into a linear (log-log) model, a step known as linearization. With a sufficiently large sample, a regression model can then be estimated to verify the claims statistically. This example is one of many applications of empirical analysis to test economic theories.
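The log-log transformation can be illustrated with a small sketch. It assumes a constant-elasticity demand function of the form $Q = A \cdot P^{\beta_1}$ with a multiplicative error, which is one possible specification and not taken from the text. Taking logarithms gives $\ln Q = \ln A + \beta_1 \ln P + e$, which is linear in the coefficients. The function `fit_log_log`, the parameter values, and the data are hypothetical.

```python
import numpy as np


def fit_log_log(price, quantity):
    """Estimate (A_hat, elasticity_hat) for Q = A * P**elasticity.

    Assumption: the multiplicative model is linearized by taking logs,
    ln Q = ln A + elasticity * ln P + e, and fitted by least squares.
    """
    log_p = np.log(np.asarray(price, dtype=float))
    log_q = np.log(np.asarray(quantity, dtype=float))
    # np.polyfit returns the highest-degree coefficient first: [slope, intercept].
    slope, intercept = np.polyfit(log_p, log_q, deg=1)
    return np.exp(intercept), slope


# Simulated sample (illustrative only): A = 100, price elasticity = -1.2.
rng = np.random.default_rng(2)
n = 400
price = rng.uniform(1.0, 20.0, size=n)
noise = np.exp(rng.normal(scale=0.1, size=n))  # multiplicative error term
quantity = 100.0 * price ** (-1.2) * noise

A_hat, elasticity_hat = fit_log_log(price, quantity)
print(f"A_hat = {A_hat:.2f}, elasticity_hat = {elasticity_hat:.3f}")
```

In the log-log form, the slope coefficient can be read directly as the elasticity. A negative estimate is consistent with the law of demand in this simulated setting.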
Simple and multi-regression analysis
In econometric regression analysis, a distinction is made between simple regression analysis and multi-regression analysis. In simple regression analysis, two factors are examined. For example, a macroeconomic hypothesis could be that domestic consumption (C) has a positive influence on domestic income (Y). The postulated causality is that domestic income depends on domestic consumption, Y(C).
Econometric models start with a simple regression between two variables.
The aim of econometrics is to estimate empirical models that confirm or refute the postulated causality between the domestic consumption and domestic income of a country in the given population. The reverse causality is also possible, however: domestic consumption may depend on domestic income, C(Y). Both analyses, Y(C) and C(Y), are carried out under the ceteris paribus condition.
The simple regression is then extended by further variables – forming the multi-regression model.
Due to the ceteris paribus condition, the explanatory power of the simple model is limited to the potential causality between the domestic consumption and domestic income of a country. It makes no reference to potential causalities involving other observable or unobservable factors, e.g. domestic and foreign investment, exports, imports, government expenditure, savings, taxes, etc. For this reason, the simple regression model is extended. Extending the postulated causality to other potential causalities results in the multi-regression model. For example, domestic consumption (C) is influenced by disposable income $Y_V$ (income (Y) minus taxes (T)) and other factors, as in the classical macroeconomic theory of consumption according to Keynes.
C(Y_V)=c_1 Y_V+c_0, \quad Y_V = Y - T
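The following sketch illustrates why it can matter which factors enter the model. It simulates consumption from the Keynesian function above and compares a regression of C on income Y alone with a regression of C on disposable income $Y_V = Y - T$. All numbers (the values of $c_0$ and $c_1$, the tax rule, the noise levels) and the helper `ols` are hypothetical, and OLS is assumed as the estimator.

```python
import numpy as np


def ols(columns, y):
    """Least-squares coefficients, intercept first (OLS is an assumption)."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Simulated economy (illustrative only): C = c0 + c1 * (Y - T) + e.
rng = np.random.default_rng(3)
n = 1000
c0, c1 = 50.0, 0.8
income = rng.uniform(500.0, 2000.0, size=n)                  # Y
taxes = 0.25 * income + rng.normal(scale=30.0, size=n)       # T, hypothetical rule
disposable = income - taxes                                  # Y_V = Y - T
consumption = c0 + c1 * disposable + rng.normal(scale=20.0, size=n)

simple = ols([income], consumption)       # C explained by Y only
keynes = ols([disposable], consumption)   # C explained by Y_V

print(f"C on Y:   slope = {simple[1]:.3f}")
print(f"C on Y_V: c0_hat = {keynes[0]:.2f}, c1_hat = {keynes[1]:.3f}")
```

In this simulated setting, the slope on Y alone understates $c_1$ because taxes rise with income, whereas the regression on $Y_V$ recovers a value close to the $c_1$ used to generate the data.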