Log Transformations in Linear Regression

Note

The following supplemental notes were created by Dr. Maria Tackett for STA 210. They are provided for students who want to dive deeper into the mathematics behind regression and reflect some of the material covered in STA 211: Mathematics of Regression. Additional supplemental notes will be added throughout the semester.

This document provides details about the model interpretation when the predictor and/or response variables are log-transformed. For simplicity, we will discuss transformations for the simple linear regression model as shown in Equation 1.

$$ y = \beta_0 + \beta_1 x \qquad(1) $$

All results and interpretations can be easily extended to transformations in multiple regression models.

Note: log refers to the natural logarithm.

Log-transformation on the response variable

Suppose we fit a linear regression model with $\log(y)$, the log-transformed $y$, as the response variable. Under this model, we assume a linear relationship exists between $x$ and $\log(y)$, such that $\log(y) \sim N(\beta_0 + \beta_1 x, \sigma^2)$ for some $\beta_0$, $\beta_1$, and $\sigma^2$. In other words, we can model the relationship between $x$ and $\log(y)$ using the model in Equation 2.

$$ \log(y) = \beta_0 + \beta_1 x \qquad(2) $$

If we interpret the model in terms of $\log(y)$, then we can use the usual interpretations for the slope and intercept. When reporting results, however, it is best to give all interpretations in terms of the original response variable $y$, since interpretations using log-transformed variables are often more difficult to truly understand.

In order to get back to the original scale, we need to use the exponential function (also known as the anti-log), $\exp\{x\} = e^x$. Therefore, while we use the model in Equation 2 for interpretations and predictions, we will use Equation 3 to state our conclusions in terms of $y$.

$$
\begin{aligned}
&\exp\{\log(y)\} = \exp\{\beta_0 + \beta_1 x\} \\[10pt]
\Rightarrow\ &y = \exp\{\beta_0 + \beta_1 x\} \\[10pt]
\Rightarrow\ &y = \exp\{\beta_0\}\exp\{\beta_1 x\}
\end{aligned}
\qquad(3)
$$
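As a quick numeric check of Equation 3, the following R snippet (a minimal sketch with arbitrarily chosen values for $\beta_0$, $\beta_1$, and $x$, not estimates from any data) verifies that exponentiating undoes the log and that $\exp\{\beta_0 + \beta_1 x\}$ factors into $\exp\{\beta_0\}\exp\{\beta_1 x\}$:

```r
# arbitrary illustrative values (not estimated from data)
b0 <- 1.2
b1 <- 0.5
x  <- 3

log_y <- b0 + b1 * x   # right-hand side of Equation 2
y     <- exp(log_y)    # back-transform to the original scale

# exp{b0 + b1 * x} = exp{b0} * exp{b1 * x}, as in Equation 3
all.equal(y, exp(b0) * exp(b1 * x))   # TRUE
```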

In order to interpret the slope and intercept, we need to first understand the relationship between the mean, median and log transformations.

Mean, Median, and Log Transformations

Suppose we have a dataset y that contains the following observations:

[1] 3 5 6 7 8

If we log-transform the values of y and then calculate the mean and median, we have

mean_log_y median_log_y
1.70503 1.79176

If we calculate the mean and median of y, then log-transform the mean and median, we have

log_mean log_median
1.75786 1.79176

This is a simple illustration to show

  1. $\text{Mean}[\log(y)] \neq \log[\text{Mean}(y)]$ - the mean and log are not commutable

  2. $\text{Median}[\log(y)] = \log[\text{Median}(y)]$ - the median and log are commutable
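The illustration above can be reproduced with a few lines of R (a minimal sketch; y is the vector of five observations listed earlier):

```r
y <- c(3, 5, 6, 7, 8)

# log-transform first, then summarize
mean(log(y))     # 1.70503
median(log(y))   # 1.79176 (= log(6))

# summarize first, then log-transform
log(mean(y))     # 1.75786
log(median(y))   # 1.79176 (= log(6))
```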

Interpretation of model coefficients

Using Equation 2, the mean of $\log(y)$ for any given value of $x$ is $\beta_0 + \beta_1 x$; however, this does not imply that the mean of $y$ is $\exp\{\beta_0 + \beta_1 x\}$ (see the previous section). From the assumptions of linear regression, we assume that for any given value of $x$, the distribution of $\log(y)$ is Normal, and therefore symmetric. Thus the median of $\log(y)$ is equal to the mean of $\log(y)$, i.e. $\text{Median}(\log(y)) = \beta_0 + \beta_1 x$.

Since the log and the median are commutable, $\text{Median}(\log(y)) = \beta_0 + \beta_1 x \Rightarrow \text{Median}(y) = \exp\{\beta_0 + \beta_1 x\}$. Thus, when we log-transform the response variable, the interpretations of the intercept and slope are in terms of the effect on the median of $y$.
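This result can also be checked empirically. The sketch below uses hypothetical simulated data (with $\beta_0 = 1$, $\beta_1 = 0.5$, and $\sigma = 0.8$ chosen arbitrarily) to generate $y$ at a fixed value of $x$ and compares the sample median of $y$ to $\exp\{\beta_0 + \beta_1 x\}$:

```r
set.seed(210)
b0 <- 1; b1 <- 0.5; x <- 2; sigma <- 0.8

# log(y) ~ N(b0 + b1 * x, sigma^2) at a fixed value of x
log_y <- rnorm(1e5, mean = b0 + b1 * x, sd = sigma)
y     <- exp(log_y)

median(y)          # close to exp(b0 + b1 * x)
mean(y)            # larger (roughly exp(b0 + b1 * x + sigma^2 / 2))
exp(b0 + b1 * x)   # 7.389056
```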

Intercept: The intercept is the expected median of $y$ when the predictor variable equals 0. Therefore, when $x = 0$,

$$
\begin{aligned}
&\log(y) = \beta_0 + \beta_1 \times 0 = \beta_0 \\[10pt]
\Rightarrow\ &y = \exp\{\beta_0\}
\end{aligned}
$$

Interpretation: When $x = 0$, the median of $y$ is expected to be $\exp\{\beta_0\}$.

Slope: The slope is interpreted in terms of the change in the median of $y$ when $x$ increases by 1 unit. Because the median of $y$ is $\exp\{\beta_0 + \beta_1 x\}$, this change is multiplicative:

$$
\exp\{[\beta_0 + \beta_1 (x+1)] - [\beta_0 + \beta_1 x]\} = \frac{\exp\{\beta_0 + \beta_1 (x+1)\}}{\exp\{\beta_0 + \beta_1 x\}} = \frac{\exp\{\beta_0\}\exp\{\beta_1 x\}\exp\{\beta_1\}}{\exp\{\beta_0\}\exp\{\beta_1 x\}} = \exp\{\beta_1\}
$$

Thus, the median of $y$ for $x + 1$ is $\exp\{\beta_1\}$ times the median of $y$ for $x$.

Interpretation: When $x$ increases by one unit, the median of $y$ is expected to multiply by a factor of $\exp\{\beta_1\}$.
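As a concrete illustration, the sketch below fits a model with a log-transformed response to simulated data (the true values $\beta_0 = 1$ and $\beta_1 = 0.3$ are chosen arbitrarily) and exponentiates the estimated coefficients to obtain the interpretations above:

```r
set.seed(210)
x <- runif(500, 0, 10)
y <- exp(1 + 0.3 * x + rnorm(500, sd = 0.5))   # true beta0 = 1, beta1 = 0.3

model <- lm(log(y) ~ x)
exp(coef(model))
# (Intercept): estimated median of y when x = 0, close to exp(1)
# x: estimated multiplicative change in the median of y per one-unit
#    increase in x, close to exp(0.3)
```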

Log-transformation on the predictor variable

Suppose we fit a linear regression model with $\log(x)$, the log-transformed $x$, as the predictor variable. Under this model, we assume a linear relationship exists between $\log(x)$ and $y$, such that $y \sim N(\beta_0 + \beta_1 \log(x), \sigma^2)$ for some $\beta_0$, $\beta_1$, and $\sigma^2$. In other words, we can model the relationship between $\log(x)$ and $y$ using the model in Equation 4.

$$ y = \beta_0 + \beta_1 \log(x) \qquad(4) $$

Intercept: The intercept is the mean of $y$ when $\log(x) = 0$, i.e. $x = 1$.

Interpretation: When $x = 1$ $(\log(x) = 0)$, the mean of $y$ is expected to be $\beta_0$.

Slope: The slope is interpreted in terms of the change in the mean of $y$ when $x$ is multiplied by a factor of $C$, since $\log(Cx) = \log(x) + \log(C)$. Thus, when $x$ is multiplied by a factor of $C$, the change in the mean of $y$ is

$$
\begin{aligned}
[\beta_0 + \beta_1 \log(Cx)] - [\beta_0 + \beta_1 \log(x)] &= \beta_1 [\log(Cx) - \log(x)] \\[10pt]
&= \beta_1[\log(C) + \log(x) - \log(x)] \\[10pt]
&= \beta_1 \log(C)
\end{aligned}
$$

Thus, the mean of $y$ changes by $\beta_1 \log(C)$ units.

Interpretation: When $x$ is multiplied by a factor of $C$, the mean of $y$ is expected to change by $\beta_1 \log(C)$ units. For example, if $x$ is doubled, then the mean of $y$ is expected to change by $\beta_1 \log(2)$ units.
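The sketch below illustrates this interpretation with simulated data (true values $\beta_0 = 2$ and $\beta_1 = 5$ chosen arbitrarily): the estimated change in the mean of $y$ when $x$ is doubled is $\hat{\beta}_1 \log(2)$.

```r
set.seed(210)
x <- runif(500, 1, 100)
y <- 2 + 5 * log(x) + rnorm(500)   # true beta0 = 2, beta1 = 5

model <- lm(y ~ log(x))
b1 <- coef(model)["log(x)"]

b1 * log(2)   # estimated change in the mean of y when x doubles, close to 5 * log(2)
```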

Log-transformation on the response and predictor variables

Suppose we fit a linear regression model with $\log(x)$, the log-transformed $x$, as the predictor variable and $\log(y)$, the log-transformed $y$, as the response variable. Under this model, we assume a linear relationship exists between $\log(x)$ and $\log(y)$, such that $\log(y) \sim N(\beta_0 + \beta_1 \log(x), \sigma^2)$ for some $\beta_0$, $\beta_1$, and $\sigma^2$. In other words, we can model the relationship between $\log(x)$ and $\log(y)$ using the model in Equation 5.

$$ \log(y) = \beta_0 + \beta_1 \log(x) \qquad(5) $$

Because the response variable is log-transformed, the interpretations on the original scale will be in terms of the median of $y$ (see the section on the log-transformed response variable for more detail).

Intercept: The intercept is interpreted in terms of the median of $y$ when $\log(x) = 0$, i.e. $x = 1$. Therefore, when $\log(x) = 0$,

$$
\begin{aligned}
&\log(y) = \beta_0 + \beta_1 \times 0 = \beta_0 \\[10pt]
\Rightarrow\ &y = \exp\{\beta_0\}
\end{aligned}
$$

Interpretation: When $x = 1$ $(\log(x) = 0)$, the median of $y$ is expected to be $\exp\{\beta_0\}$.

Slope: The slope is interpreted in terms of the change in the median of $y$ when $x$ is multiplied by a factor of $C$, since $\log(Cx) = \log(x) + \log(C)$. Thus, when $x$ is multiplied by a factor of $C$, the multiplicative change in the median of $y$ is

$$
\begin{aligned}
\exp\{[\beta_0 + \beta_1 \log(Cx)] - [\beta_0 + \beta_1 \log(x)]\} &= \exp\{\beta_1 [\log(Cx) - \log(x)]\} \\[10pt]
&= \exp\{\beta_1[\log(C) + \log(x) - \log(x)]\} \\[10pt]
&= \exp\{\beta_1 \log(C)\} = C^{\beta_1}
\end{aligned}
$$

Thus, the median of $y$ for $Cx$ is $C^{\beta_1}$ times the median of $y$ for $x$.

Interpretation: When $x$ is multiplied by a factor of $C$, the median of $y$ is expected to multiply by a factor of $C^{\beta_1}$. For example, if $x$ is doubled, then the median of $y$ is expected to multiply by a factor of $2^{\beta_1}$.
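Finally, a sketch for the log-log model with simulated data (true values $\beta_0 = 0.5$ and $\beta_1 = 0.8$ chosen arbitrarily): the estimated multiplicative change in the median of $y$ when $x$ doubles is $2^{\hat{\beta}_1}$.

```r
set.seed(210)
x <- runif(500, 1, 100)
y <- exp(0.5 + 0.8 * log(x) + rnorm(500, sd = 0.3))   # true beta0 = 0.5, beta1 = 0.8

model <- lm(log(y) ~ log(x))
b1 <- coef(model)["log(x)"]

2^b1   # the median of y is estimated to multiply by about 2^0.8 when x doubles
```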