Linear Regression

Linear regression belongs to the econometric methods of empirical research, which are applied in almost all sciences. It is a set of methods for estimating the statistical causality between two or more factors (variables of interest). A central assumption of linear regression is the ceteris paribus condition, which means “all other things being equal” or “if other factors are kept constant”. Linear regression is covered in almost every textbook on statistics and econometrics, and in the research literature on methods of empirical economic research.

Simple Linear Model

In a simple linear model, it is assumed that only one variable $x_{1i}$ has a significant impact on the variable $y_i$, and all unexplained variance is captured by the error term $e_i$.

y_i=\beta_0 + \beta_1 \cdot x_{1i} + e_i \quad \text{or} \quad y_i=\beta_0+\sum_{j=1}^{k} \beta_j \cdot x_{ji} + e_i \quad \text{with} \quad k=1

Multi-Regression model

y_i=\beta_0 + \beta_1 \cdot x_{1i} + \beta_2 \cdot x_{2i} + \ldots + \beta_{k-1} \cdot x_{(k-1)i} + \beta_k \cdot x_{ki} + e_i
y_i=\beta_0+\sum_{j=1}^{k} \beta_j \cdot x_{ji} + e_i
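
A minimal sketch of how the coefficients of such a model could be estimated. The text above does not name an estimator, so ordinary least squares (OLS) via the normal equations is an assumption here, as are the helper name `ols`, the simulated data, and the "true" coefficients 1.0, 2.0 and -0.5 used to generate it.

```python
import random

def ols(X, y):
    """Return OLS coefficients [b0, b1, ..., bk] for rows X (no constant) and responses y."""
    # Prepend a constant 1.0 to each row so the first coefficient is the intercept.
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    # Build the normal equations (Z'Z) b = Z'y as an augmented matrix.
    A = [[sum(z[r] * z[c] for z in Z) for c in range(p)]
         + [sum(z[r] * yi for z, yi in zip(Z, y))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Simulated sample (hypothetical): y_i = 1.0 + 2.0*x_1i - 0.5*x_2i + e_i
rng = random.Random(42)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a - 0.5 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]

b0, b1, b2 = ols(list(zip(x1, x2)), y)
print(f"b0={b0:.3f}  b1={b1:.3f}  b2={b2:.3f}")
```

With a sample of this size the estimates should land close to the values used to generate the data. Passing only one column, e.g. `ols([[a] for a in x1], y)`, gives the simple linear model as the special case $k=1$.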

Why is the linear regression method used in science?

Scientists use the regression method to explain the statistical causality between two or more factors so that they can identify the potential statistical correlations in their research question. However, a researcher also wants to test whether the statistical estimates reflect reality, or at least whether what is observed in the sample reflects the population. Regression methods can be used to calculate or analyze relevant factors in a research question. While regression calculation aims to estimate the relevant coefficients (causality estimators), regression analysis aims to test relevant empirical hypotheses (inferential statistics).

Where is linear regression applied in economics and business administration?

Consider the following example. As an economics student, you are used to reading the following statements in almost all economics and business administration textbooks: “The law of demand says that when prices rise, the demand for a normal good falls”. Where does the statement of the law of demand come from? Can this assertion be proven empirically? When is a good a normal good? Although the answers to these questions can be found in any textbook, the background to their justifications and sometimes incomplete explanations are more likely to be found in empirical research using econometric methods.

Economists work with theoretical models that can be empirically tested to determine the extent to which they reflect the research question in reality. In the case of the law of demand, the Cobb-Douglas model can be applied, which assumes constant elasticity of demand. Here is where the first problem arises. The theoretical Cobb-Douglas model is not linear, as required by linear regression methods, but non-linear (multiplicative). By taking logarithms, however, the Cobb-Douglas model can be transformed into a log-log model that is linear in its parameters (linearization). With a sufficiently large sample, a regression model can then be estimated to test the claims statistically. This example is one of many applications of empirical analysis to test economic theories.
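
As a minimal sketch of this transformation, assume a constant-elasticity demand function in which the quantity demanded $Q_i$ depends multiplicatively on the price $P_i$ and income $Y_i$. The symbols $Q_i$, $P_i$, $A$ and $u_i$ are illustrative assumptions and are not taken from a specific textbook model:

Q_i = A \cdot P_i^{\beta_1} \cdot Y_i^{\beta_2} \cdot \exp(u_i)

Taking the natural logarithm on both sides gives a model that is linear in the parameters, with $\beta_0 = \ln A$:

\ln Q_i = \beta_0 + \beta_1 \cdot \ln P_i + \beta_2 \cdot \ln Y_i + u_i

Under these assumptions, $\beta_1$ is the price elasticity of demand and $\beta_2$ the income elasticity. The law of demand then corresponds to the testable hypothesis $\beta_1 < 0$, and a normal good corresponds to $\beta_2 > 0$.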

Simple and multi-regression analysis

In econometric regression analysis, a distinction is made between simple regression analysis and multi-regression analysis. In simple regression analysis, two factors are examined. For example, a macroeconomic hypothesis could be that domestic consumption (C) has a positive influence on domestic income (Y). The propagated causality is then that domestic income depends on domestic consumption – Y(C).

Econometric models start with a simple regression between two variables.

The aim of econometrics is to estimate empirical models that confirm or refute the propagated causality between domestic consumption and domestic income of a country in the given population. The reverse causality is also possible, however: domestic consumption may depend on domestic income – C(Y). Both analyses, Y(C) and C(Y), take place under the ceteris paribus condition.
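
The two directions can be compared in a small sketch. The data below are simulated and purely hypothetical (the numbers 20, 1.2 and the noise level are assumptions), and `simple_ols` is an illustrative helper, not a library function. The example estimates both Y(C) and C(Y) on the same sample.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) of a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)

# Hypothetical sample in which income is generated from consumption plus noise.
rng = random.Random(1)
consumption = [rng.uniform(50, 150) for _ in range(200)]
income = [20 + 1.2 * c + rng.gauss(0, 10) for c in consumption]

a_yc, b_yc, r2_yc = simple_ols(consumption, income)   # Y(C): income on consumption
a_cy, b_cy, r2_cy = simple_ols(income, consumption)   # C(Y): consumption on income

print(f"Y(C): slope={b_yc:.3f}, R^2={r2_yc:.3f}")
print(f"C(Y): slope={b_cy:.3f}, R^2={r2_cy:.3f}")
```

Both regressions have the same R^2, and the product of the two slopes equals that R^2. The fit alone therefore cannot tell the two directions apart; the direction of causality has to come from economic theory.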

The simple regression is then extended by further variables – forming the multi-regression model.

Due to the ceteris paribus condition, the explanatory power of the simple model is limited to explaining the potential causality between domestic consumption and domestic income of a country, but without reference to other potential causalities between other (non-) observable factors, e.g. domestic and foreign investment, exports, imports, government expenditure, savings, taxes, etc. For this reason, the simple regression model is extended. If we now extend the propagated causality to other potential causalities, this results in the multi-regression model, e.g. domestic consumption (C) is influenced by disposable income (income (Y) minus taxes (T)) and other factors, the classical macroeconomic theory of consumption according to Keynes.


What is Econometrics?

Econometrics is part of economics as a science. It deals with the statistical (empirical) modelling of economic theories (hypotheses) in order to explain, confirm or disprove economic theory empirically. In economic theory, causalities between two (or more) relevant measures are assumed, e.g. the relationship between income (Y) and consumption (C) of a household. Two causalities can be suspected: (1) the consumption of a household depends on its income, C(Y), or (2) the income of a household depends on its consumption, Y(C). The two statements (theories/hypotheses) do not contradict each other; they describe opposite directions of causality (reverse causality). Using a sample or the total population of households as the statistical unit of interest, together with econometric methods, it is possible to test both hypotheses for their internal and external validity.


Scope of Econometrics

Econometrics quantifies the theoretical hypotheses (economic theory) by testing corresponding empirical statements (empirical model). If we hypothesize that the consumption of individual households has a positive relationship with household income, the resulting empirical model should confirm or disprove the positive relationship between consumption and household income in the sample and in the household population.


How to explain the Omitted Variable Bias

In regression analysis, the omitted-variable bias is the error incurred on the partial-effect coefficients of the remaining explanatory variables in a restricted regression model, i.e. a model from which a relevant variable has been left out. Assume a simple regression model in which the variable $y_i$ is explained by the variable $x_{1i}$ and the error term $e_i$ for the observations $i = 1, 2, 3, \ldots, n$:

y_i=\beta_0+\beta_1\cdot x_{1i}+ e_i \,, \quad \forall i \in \{1, 2, 3, \ldots, n\}

Then consider the hypothesis that a variable $x_{2i}$ also explains the dependent variable $y_i$, which can be depicted by the following extended regression model:

y_i=\tilde\beta_0 + \tilde\beta_1 \cdot x_{1i} + \tilde\beta_2 \cdot x_{2i} + v_i

Setting both equations equal and solving for the error term of the simple model gives:

e_i=(\tilde\beta_0-\beta_0)+(\tilde\beta_1-\beta_1)\cdot x_{1i} +\tilde\beta_2 \cdot x_{2i} + v_i

The error term $e_i$ in the simple regression model includes the deviations of $\tilde\beta_0$ and $\tilde\beta_1$ of the extended regression model from the corresponding coefficients $\beta_0$ and $\beta_1$ of the simple model. The partial effect $\tilde\beta_2$ of the omitted variable and the error term $v_i$ of the extended regression model are also included in the error term of the simple regression model. Two factors play a role in the quantification of the omitted-variable bias:

  1. Partial effect of the omitted variable on the explained variable.
  2. Correlation (covariance) of the omitted variable with the remaining explanatory variables.
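
To make these two factors concrete, here is a sketch of the standard textbook derivation for the case of one included and one omitted variable. It assumes an auxiliary regression of the omitted variable on the included one and that $v_i$ is uncorrelated with $x_{1i}$; the symbols $\delta_0$, $\delta_1$ and $r_i$ are introduced only for this illustration:

x_{2i}=\delta_0 + \delta_1 \cdot x_{1i} + r_i \,, \quad \delta_1 = \frac{Cov(x_{1i}, x_{2i})}{Var(x_{1i})}

Substituting this into the extended regression model gives:

y_i=(\tilde\beta_0 + \tilde\beta_2 \cdot \delta_0) + (\tilde\beta_1 + \tilde\beta_2 \cdot \delta_1) \cdot x_{1i} + (\tilde\beta_2 \cdot r_i + v_i)

Comparing with the simple model, its slope coefficient is $\beta_1 = \tilde\beta_1 + \tilde\beta_2 \cdot \delta_1$, so the bias is:

\beta_1 - \tilde\beta_1 = \tilde\beta_2 \cdot \delta_1

The bias is the product of the two factors: it is zero when either factor is zero, and its sign is the product of their signs.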

Partial effects of Omitted Variable and Correlation with Other Explanatory Variables

Two outcomes are possible: either there is no bias, or there is a bias, positive or negative, on the partial effects of the other explanatory variables in the restricted model.

A. No Bias Scenario

If the omitted variable has a zero partial effect in the unrestricted model, or zero correlation/covariance with the other explanatory variables (as is the case when the explanatory variables are independent), no bias is incurred on the other partial effects in the restricted model.

B. Negative Bias Scenario

A negative (positive) partial effect of the omitted variable combined with a positive (negative) correlation with the other explanatory variables leads to a negative bias on the partial effects of the other explanatory variables in the restricted model. In this case the signs are opposite (+ and –).

C. Positive Bias Scenario

A positive partial effect of the omitted variable combined with a positive correlation with the other explanatory variables leads to a positive bias on the partial effects of the other explanatory variables in the restricted model. The same holds if both signs are negative. In this case there are two possible constellations: (+ and +) or (– and –).
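
The three scenarios can be checked with a small simulation. This is a sketch under stated assumptions: the data are simulated, the included variable has a true partial effect of 2.0, and the names `slope`, `short_regression_bias`, `beta2` (partial effect of the omitted variable) and `delta1` (its association with the included variable) are hypothetical. The function regresses $y_i$ on $x_{1i}$ alone and reports how far the estimated slope is from the true partial effect.

```python
import random

def slope(x, y):
    """OLS slope of a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def short_regression_bias(beta2, delta1, beta1=2.0, n=5000, seed=0):
    """Estimated slope of the restricted model minus the true partial effect beta1."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # The omitted variable co-moves with x1 according to delta1.
    x2 = [delta1 * a + rng.gauss(0, 1) for a in x1]
    # Data are generated by the extended (unrestricted) model.
    y = [1.0 + beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    # Only x1 is used in the restricted regression; x2 is omitted.
    return slope(x1, y) - beta1

scenarios = [
    ("A: zero partial effect", 0.0, 0.8),
    ("A: zero correlation", 1.0, 0.0),
    ("B: (+ and -)", 1.0, -0.8),
    ("B: (- and +)", -1.0, 0.8),
    ("C: (+ and +)", 1.0, 0.8),
    ("C: (- and -)", -1.0, -0.8),
]
for label, beta2, delta1 in scenarios:
    print(f"{label:24s} bias = {short_regression_bias(beta2, delta1):+.3f}")
```

Up to sampling noise, the bias should be near zero in both A cases, negative in the B cases, and positive in the C cases.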
