Drew Barker

September 13, 2018

Calculating a regression equation

Calculating a regression equation in R is actually pretty easy with the lm() function. The lm() function takes two main inputs, a formula and the dataset we wish to use. To demonstrate I will use a dataset that is pre-built into R called “mtcars”. This dataset contains data on Motor Trend cars. For the purpose of this demonstration, we wish to regress average miles per gallon of a car on weight of the car. First let’s call the dataset (no need to import, since it is already built in) and look at a summary of it.

data(mtcars) #This calls the dataset into the global environment
summary(mtcars) #summary of the dataset with sample averages, median, min, etc.

##       mpg             cyl             disp             hp
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
##       drat             wt             qsec             vs
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
##        am              gear            carb
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
##  Median :0.0000   Median :4.000   Median :2.000
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
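Since only two of the eleven columns matter for this demonstration, it can help to look at them on their own. The following is a small sketch of one way to do that; the object name `vars` is just an illustrative choice and is not used elsewhere in this document.

```r
# Load the built-in dataset
data(mtcars)

# Keep only the two columns used in the regression
# ("vars" is an illustrative name, not part of the original example)
vars <- mtcars[, c("mpg", "wt")]

# Summary statistics for just these two variables
summary(vars)

# Number of observations (rows) and variables (columns)
dim(vars)
```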

The two variables we are interested in are named “mpg” and “wt” respectively. So let’s run the regression. Take special note of how we input the formula for the regression. The correct syntax for a formula that specifies a regression of Y on X is Y~X. Pay special attention to the “~”. This is a “tilde” character, not a hyphen or dash.

lm(mpg~wt, data=mtcars) #this is how we use the lm function to run a regression of miles per gallon on car weight.

##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Coefficients:
## (Intercept)           wt
##      37.285       -5.344
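With a single regressor, these two numbers can be cross-checked by hand: the OLS slope is the sample covariance of X and Y divided by the sample variance of X, and the intercept is the mean of Y minus the slope times the mean of X. The sketch below does this with base R functions; the names `x`, `y`, `slope`, `intercept`, `by_hand`, and `by_lm` are illustrative only.

```r
data(mtcars)

x <- mtcars$wt   # regressor: weight in 1,000s of lbs
y <- mtcars$mpg  # outcome: miles per gallon

# Simple-regression OLS formulas
slope     <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)

by_hand <- c(intercept, slope)

# Same estimates from lm(), with the names stripped for comparison
by_lm <- unname(coef(lm(mpg ~ wt, data = mtcars)))

by_hand
all.equal(by_hand, by_lm)  # checks agreement up to numerical tolerance
```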

We see that if we run lm() (which stands for “linear model”, by the way) as it is, it calculates the regression and prints the estimates. In this case we have an estimated intercept of 37.285 and an estimated coefficient on wt of -5.344. We can also save the linear model to a new object in the global environment exactly as we did before:

reg1 <- lm(mpg ~ wt, data = mtcars) #"saving" the regression output

The linear model output contains much more than just the parameter estimates, such as residuals and fitted values. You will often be interested in revisiting the regression, so saving the output is handy (especially for regressions with much longer formulas). With the regression output saved, we can retrieve the coefficient estimates the same way we extract variables from a dataset, with the $ operator:

reg1$coefficients #extracts the coefficient estimates from the linear model

## (Intercept)          wt
##   37.285126   -5.344472
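The other pieces stored in the saved model can be pulled out in a similar way. As a minimal sketch, the base R accessor functions coef(), fitted(), and residuals() return the coefficient estimates, fitted values, and residuals. The last line checks that fitted values plus residuals reproduce the observed mpg.

```r
data(mtcars)
reg1 <- lm(mpg ~ wt, data = mtcars)

# List the components stored in the lm object
names(reg1)

# Accessor function equivalent to reg1$coefficients
coef(reg1)

# First few fitted values and residuals
head(fitted(reg1))
head(residuals(reg1))

# Observed value = fitted value + residual, for every car
all.equal(unname(fitted(reg1) + residuals(reg1)), mtcars$mpg)
```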

Take care when interpreting regression coefficients. Remember that we only have an unbiased estimate of the true causal effect of a treatment variable if the true model is in fact linear (and the zero conditional mean assumption holds). With a coefficient estimate of -5.344, we are tempted to interpret this as: “for every additional 1,000 lbs of weight (since wt is measured in 1,000s of lbs), miles per gallon decreases by 5.344”. However, we should be careful to remember that this is only true if our assumptions hold (and even then this is only an estimate of the true causal effect).
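To see what the slope means mechanically (setting aside whether it is causal), we can compare the predicted mpg for two hypothetical cars that differ by one unit of wt, that is, by 1,000 lbs. This is only a sketch: the data frame `new_cars` and the weights 3 and 4 are made-up values chosen for illustration.

```r
data(mtcars)
reg1 <- lm(mpg ~ wt, data = mtcars)

# Two hypothetical cars weighing 3,000 and 4,000 lbs (illustrative values)
new_cars <- data.frame(wt = c(3, 4))

# Predicted miles per gallon for each hypothetical car
pred <- predict(reg1, newdata = new_cars)
pred

# The difference between the two predictions equals the slope on wt
diff(pred)
coef(reg1)["wt"]
```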

Getting the R-squared statistic and standard errors for the regression is easily done with the summary function. In fact, the summary function returns much of what econometricians normally look for in a regression.

summary(reg1) #self-explanatory, summarizes the linear model output

##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.5432 -2.3647 -0.1252  1.4096  6.8727
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

We see that this returns the coefficient estimates, the standard errors, the R-squared (listed under “Multiple R-squared”), and other things which we will discuss later in the class. Remember that these standard errors assume homoskedasticity. In the presence of heteroskedasticity, these standard error estimates will be invalid. We will discuss how to estimate heteroskedasticity-robust standard errors when we discuss hypothesis testing.
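The summary can also be saved to an object, so that individual statistics can be extracted rather than read off the printout. A minimal sketch, assuming the illustrative name `s` for the saved summary:

```r
data(mtcars)
reg1 <- lm(mpg ~ wt, data = mtcars)

# Save the summary ("s" is an illustrative name)
s <- summary(reg1)

# R-squared and adjusted R-squared
s$r.squared
s$adj.r.squared

# Full coefficient table: estimates, standard errors, t values, p-values
s$coefficients

# Just the (homoskedasticity-based) standard errors
s$coefficients[, "Std. Error"]

# Residual standard error
s$sigma
```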
