Calculating a regression equation

(R)egression

Drew Barker

September 13, 2018


The lm() function

Calculating a regression equation in R is straightforward with the lm() function. The lm() function takes two main inputs: a formula and the dataset we wish to use. To demonstrate, I will use a dataset that is built into R called “mtcars”, which contains data on Motor Trend cars. For this demonstration, we wish to regress a car's miles per gallon on its weight. First, let’s load the dataset (no need to import it, since it is already built in) and look at a summary of it.

data(mtcars) #This loads the dataset into the global environment
summary(mtcars) #summary of the dataset with sample averages, median, min, etc.

##       mpg             cyl             disp             hp
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
##       drat             wt             qsec             vs
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
##        am              gear            carb
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
##  Median :0.0000   Median :4.000   Median :2.000
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
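Since only two of these columns matter for the regression, it can help to pull them out and confirm that they agree with the summary. The following is a minimal sketch; the object names (mpg, wt, n_cars, mean_mpg, mean_wt) are chosen here for illustration and are not part of the original walkthrough.

```r
# mtcars ships with base R (datasets package), so no import is needed.
data(mtcars)

# Pull out the two columns of interest with the $ operator.
mpg <- mtcars$mpg   # miles per gallon
wt  <- mtcars$wt    # weight, in 1,000s of lbs

# Number of cars (rows) and the sample means of both variables.
n_cars   <- nrow(mtcars)
mean_mpg <- mean(mpg)   # compare with the Mean row for mpg in summary(mtcars)
mean_wt  <- mean(wt)    # compare with the Mean row for wt in summary(mtcars)

print(n_cars)
print(round(c(mpg = mean_mpg, wt = mean_wt), 3))
```

The rounded means should line up with the "Mean" entries for mpg and wt in the summary output.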

The two variables we are interested in are named “mpg” and “wt”, respectively, so let’s run the regression. Take special note of how we write the formula: the correct syntax for a regression of Y on X is Y~X. The “~” is a tilde character, not a hyphen or dash.

lm(mpg~wt, data=mtcars) #this is how we use the lm function to run a regression of miles per gallon on car weight.

##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Coefficients:
## (Intercept)           wt
##      37.285       -5.344
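As a rough cross-check of where these two numbers come from: in a regression with a single regressor, the OLS slope equals the sample covariance of X and Y divided by the sample variance of X, and the intercept equals the mean of Y minus the slope times the mean of X. The sketch below applies those textbook formulas; the names x, y, slope_hand, intercept_hand and fit are illustrative only.

```r
data(mtcars)

x <- mtcars$wt    # regressor: weight in 1,000s of lbs
y <- mtcars$mpg   # outcome: miles per gallon

# Textbook simple-regression formulas.
slope_hand     <- cov(x, y) / var(x)
intercept_hand <- mean(y) - slope_hand * mean(x)

# The same regression through lm(), for comparison.
fit <- lm(mpg ~ wt, data = mtcars)

print(c(intercept = intercept_hand, slope = slope_hand))
# TRUE if the hand-computed values match lm() up to numerical tolerance.
print(all.equal(unname(coef(fit)), c(intercept_hand, slope_hand)))
```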

We see that if we run lm() (which stands for “linear model”, by the way) as is, it calculates the regression and prints the estimates. In this case we have an estimated intercept of 37.285 and an estimated coefficient on wt of -5.344. We can also save the linear model to a new object in the global environment, exactly as we did before:

reg1 <- lm(mpg ~ wt, data = mtcars) #"saving" the regression output

The linear model output contains much more than just the parameter estimates, such as residuals and fitted values. You will often be interested in revisiting the regression, so saving the output is handy (especially for regressions with much longer formulas). With the regression output saved, we can retrieve the coefficient estimates with the $ operator, exactly as we extracted variables from a dataset:

reg1$coefficients #extracts the coefficient estimates from the linear model

## (Intercept)          wt
##   37.285126   -5.344472
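The residuals and fitted values mentioned above can be pulled out of the saved object in a similar way. Base R also provides the accessor functions coef(), residuals() and fitted(), which are an alternative to the $ operator. A minimal sketch follows; b, res, fit_vals and recon_ok are names made up for this example.

```r
reg1 <- lm(mpg ~ wt, data = mtcars)

b        <- coef(reg1)        # same values as reg1$coefficients
res      <- residuals(reg1)   # observed mpg minus fitted mpg, one per car
fit_vals <- fitted(reg1)      # predicted mpg for each car in the sample

# Fitted value plus residual should reconstruct the observed outcome.
recon_ok <- isTRUE(all.equal(unname(fit_vals + res), mtcars$mpg))

print(b)
print(recon_ok)
print(head(res, 3))   # residuals for the first three cars
```

Because the model includes an intercept, the residuals also sum to zero up to floating-point error, which is a quick sanity check on any saved regression.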

Interpreting regression coefficients

Take care when interpreting regression coefficients. Remember that we only have an unbiased estimate of the true causal effect of a treatment variable if the true model is in fact linear (and the zero conditional mean assumption holds). With a coefficient estimate of -5.344, we are tempted to interpret this as: “for every additional 1,000 lbs of weight (since wt is measured in 1,000s of lbs), miles per gallon decreases by 5.344”. However, we should remember that this causal reading is only justified if our assumptions hold, and even then -5.344 is only an estimate of the true causal effect.
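One way to see what the slope means mechanically (setting aside whether the causal reading is justified) is to compare predicted mpg for two cars that differ by exactly one unit of wt. The sketch below uses base R's predict() with two hypothetical cars weighing 3,000 and 4,000 lbs; new_cars, pred and diff_pred are illustrative names.

```r
reg1 <- lm(mpg ~ wt, data = mtcars)

# Two hypothetical cars: wt = 3 (3,000 lbs) and wt = 4 (4,000 lbs).
new_cars <- data.frame(wt = c(3, 4))

# Predicted mpg from the fitted line: intercept + slope * wt.
pred <- predict(reg1, newdata = new_cars)

# The gap between the two predictions is the slope itself.
diff_pred <- unname(pred[2] - pred[1])

print(pred)
print(diff_pred)
print(coef(reg1)["wt"])
```

The difference in predictions equals the coefficient on wt, which is all the estimate says by itself: a one-unit difference in wt is associated with that difference in predicted mpg.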

R-squared and standard errors

Getting the R-squared statistic and standard errors for the regression is easily done with the summary function. In fact, the summary function returns much of what econometricians normally look for in a regression.

summary(reg1) #self-explanatory, summarizes the linear model output

##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.5432 -2.3647 -0.1252  1.4096  6.8727
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

We see that this returns the coefficient estimates, the standard errors, the R-squared (listed under “Multiple R-squared”), and other things which we will discuss later in the class. Remember that these standard errors assume homoskedasticity. In the presence of heteroskedasticity, these standard error estimates will be invalid. We will discuss how to estimate heteroskedasticity-robust standard errors when we discuss hypothesis testing.
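The numbers in the printed summary can also be extracted programmatically rather than read off the screen. The sketch below saves the summary to an object, pulls out the standard errors and the R-squared, and recomputes the R-squared as one minus the ratio of the residual sum of squares to the total sum of squares. The names s, coef_table, se, r2, rss, tss and r2_hand are chosen for this example only, and the standard errors are the same homoskedasticity-based ones discussed above.

```r
reg1 <- lm(mpg ~ wt, data = mtcars)
s    <- summary(reg1)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- coef(s)
se <- coef_table[, "Std. Error"]   # homoskedasticity-based standard errors

# R-squared as stored in the summary object.
r2 <- s$r.squared

# R-squared by hand: 1 - RSS / TSS.
rss     <- sum(residuals(reg1)^2)
tss     <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_hand <- 1 - rss / tss

print(se)
print(c(from_summary = r2, by_hand = r2_hand))
```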
