Prompt: Last week’s discussion involved developing a multiple regression model that used miles per gallon as the response variable. Weight and horsepower were the predictor variables. You performed an overall F-test to evaluate the significance of your model. This week, you will evaluate the significance of individual predictors. You will use the output of the Python script from Module Six to perform individual t-tests for each predictor variable. Specifically, you will look at Step 5 of the Python script to answer all questions in the discussion this week.

Assignment:

Is at least one of the two variables (weight and horsepower) significant in the model? Run the overall F-test and provide your interpretation at the 5% level of significance. See Step 5 in the Python script. Include the following in your analysis:

Define the null and alternative hypotheses in mathematical terms and in words.

Report the level of significance.

Include the test statistic and the P-value. (Hint: F-Statistic and Prob (F-Statistic) in the output).

Provide your conclusion and interpretation of the test. Should the null hypothesis be rejected? Why or why not?

What is the slope coefficient for the weight variable? Is this coefficient significant at the 5% level of significance (alpha=0.05)? (Hint: Check the P-value, P>|t|, for weight in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.

What is the slope coefficient for the horsepower variable? Is this coefficient significant at the 5% level of significance (alpha=0.05)? (Hint: Check the P-value, P>|t|, for horsepower in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.

What is the purpose of performing individual t-tests after carrying out the overall F-test? What are the differences in the interpretation of the two tests?

What is the coefficient of determination of your multiple regression model from Module Six? Provide appropriate interpretation of this statistic.
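
The hypotheses requested above can be written compactly. This is a sketch in standard notation, assuming the population model mpg = β0 + β1·wt + β2·hp + ε. The symbols β1 and β2 (and the sample estimates b_j with standard errors SE(b_j)) are labels chosen here; the prompt itself only refers to "the beta parameter".

```latex
% Overall F-test: is at least one predictor useful?
\[
H_0:\ \beta_1 = \beta_2 = 0
\qquad\text{vs.}\qquad
H_a:\ \text{at least one of } \beta_1,\ \beta_2 \text{ is nonzero}
\]

% Individual t-test for predictor j (j = 1 for weight, j = 2 for horsepower)
\[
H_0:\ \beta_j = 0
\qquad\text{vs.}\qquad
H_a:\ \beta_j \neq 0,
\qquad
t_j = \frac{b_j}{\operatorname{SE}(b_j)}
\]
```

In words: the F-test asks whether the model as a whole explains any variation in miles per gallon, while each t-test asks whether one predictor contributes once the other is already in the model.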

My Discussion Post:

According to my Python output, the coefficients of my regression equation are 37.2363 for the intercept, -3.7564 for weight, and -0.0332 for horsepower. The standard form of a multiple linear regression equation with two predictors is Ŷ = a + b1X1 + b2X2. In this case, the fitted equation is Ŷ = 37.236 - 3.756 X1 - 0.033 X2, rounded to three decimal places. X1 represents the independent variable weight (in 1000s of lbs), and X2 represents the independent variable horsepower. Weight and horsepower are the predictor variables, while miles per gallon is the response variable, Y. In other words, the model predicts a vehicle's miles per gallon from its weight and its horsepower. The p-values of the individual t-tests (0.000 for weight and 0.001 for horsepower) are both less than the 0.05 level of significance, so both slope coefficients are statistically significant. The slope coefficients are -3.756 for weight and -0.033 for horsepower. Each one represents the estimated mean change in miles per gallon for a one-unit increase in that predictor with the other predictor held constant: about 3.756 mpg lower per additional 1000 lbs of weight, and about 0.033 mpg lower per additional unit of horsepower.

My Python Script:

In [2]:

import pandas as pd

from IPython.display import display, HTML

# read data from mtcars.csv data set.

cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

# randomly pick 30 observations from the data set to make the data set unique to you.

cars_df = cars_df_orig.sample(n=30, replace=False)

# print only the first five observations in the dataset.

print("Cars data frame (showing only the first five observations)\n")

display(HTML(cars_df.head().to_html()))

Cars data frame (showing only the first five observations)

    Unnamed: 0           mpg   cyl  disp   hp   drat  wt     qsec   vs  am  gear  carb
4   Hornet Sportabout    18.7  8    360.0  175  3.15  3.440  17.02  0   0   3     2
14  Cadillac Fleetwood   10.4  8    472.0  205  2.93  5.250  17.98  0   0   3     4
15  Lincoln Continental  10.4  8    460.0  215  3.00  5.424  17.82  0   0   3     4
28  Ford Pantera L       15.8  8    351.0  264  4.22  3.170  14.50  0   1   5     4
11  Merc 450SE           16.4  8    275.8  180  3.07  4.070  17.40  0   0   3     3

Step 2: Scatterplot of miles per gallon against weight

The block of code below will create a scatterplot of the variables “miles per gallon” (coded as mpg in the data set) and “weight” of the car (coded as wt).

In [4]:

import matplotlib.pyplot as plt

# create scatterplot of variables mpg against wt.

plt.plot(cars_df["wt"], cars_df["mpg"], 'o', color='red')

# set a title for the plot, x-axis, and y-axis.

plt.title('MPG against Weight')

plt.xlabel('Weight (1000s lbs)')

plt.ylabel('MPG')

# show the plot.

plt.show()

Step 3: Scatterplot of miles per gallon against horsepower

The block of code below will create a scatterplot of the variables “miles per gallon” (coded as mpg in the data set) and “horsepower” of the car (coded as hp).

In [6]:

import matplotlib.pyplot as plt

# create scatterplot of variables mpg against hp.

plt.plot(cars_df["hp"], cars_df["mpg"], 'o', color='blue')

# set a title for the plot, x-axis, and y-axis.

plt.title('MPG against Horsepower')

plt.xlabel('Horsepower')

plt.ylabel('MPG')

# show the plot.

plt.show()

Step 4: Correlation matrix for miles per gallon, weight and horsepower

Now you will calculate the correlation coefficient between the variables “miles per gallon” and “weight”. You will also calculate the correlation coefficient between the variables “miles per gallon” and “horsepower”. The corr method of a dataframe returns the correlation matrix with the correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the three variables.

In [8]:

# create correlation matrix for mpg, wt, and hp.

# The correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column).

# The correlation coefficient between mpg and hp is contained in the cell for mpg row and hp column (or hp row and mpg column).

mpg_wt_corr = cars_df[['mpg','wt','hp']].corr()

print(mpg_wt_corr)

          mpg        wt        hp
mpg  1.000000 -0.869627 -0.791014
wt  -0.869627  1.000000  0.663993
hp  -0.791014  0.663993  1.000000
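
To make the entries of the matrix concrete, here is a minimal sketch of how a single Pearson correlation coefficient is computed. It uses only the five rows shown in the data-frame preview above, so its values will not match the 30-observation matrix. The helper name `pearson_r` is hypothetical and is not part of the course script, which relies on the pandas `corr` method instead.

```python
import math

# The five observations shown in the preview above (not the full 30-row sample).
mpg = [18.7, 10.4, 10.4, 15.8, 16.4]
wt = [3.440, 5.250, 5.424, 3.170, 4.070]
hp = [175, 205, 215, 264, 180]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


r_mpg_wt = pearson_r(mpg, wt)
r_wt_mpg = pearson_r(wt, mpg)  # same value: the matrix is symmetric
r_mpg_hp = pearson_r(mpg, hp)

print(f"r(mpg, wt) = {r_mpg_wt:.3f}")
print(f"r(wt, mpg) = {r_wt_mpg:.3f}")
print(f"r(mpg, hp) = {r_mpg_hp:.3f}")
```

The symmetry is why the comments in the script say the coefficient can be read from either the mpg row and wt column or the wt row and mpg column.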

Step 5: Multiple regression model to predict miles per gallon using weight and horsepower

This block of code produces a multiple regression model with “miles per gallon” as the response variable, and “weight” and “horsepower” as predictor variables. The ols method in statsmodels.formula.api submodule returns all statistics for this multiple regression model.

In [10]:

from statsmodels.formula.api import ols

# create the multiple regression model with mpg as the response variable; weight and horsepower as predictor variables.

model = ols('mpg ~ wt+hp', data=cars_df).fit()

print(model.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                    mpg   R-squared:                       0.838
Model:                            OLS   Adj. R-squared:                  0.826
Method:                 Least Squares   F-statistic:                     69.75
Date:                Wed, 07 Oct 2020   Prob (F-statistic):           2.16e-11
Time:                        00:12:13   Log-Likelihood:                -69.283
No. Observations:                  30   AIC:                             144.6
Df Residuals:                      27   BIC:                             148.8
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     37.2363      1.583     23.518      0.000      33.988      40.485
wt            -3.7564      0.632     -5.943      0.000      -5.053      -2.460
hp            -0.0332      0.009     -3.686      0.001      -0.052      -0.015
==============================================================================
Omnibus:                        4.804   Durbin-Watson:                   2.679
Prob(Omnibus):                  0.091   Jarque-Bera (JB):                3.519
Skew:                           0.825   Prob(JB):                        0.172
Kurtosis:                       3.307   Cond. No.                         574.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
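
As a rough illustration of how the Step 5 output maps onto the discussion questions, the sketch below hard-codes the numbers printed in the summary above and applies the usual decision rule (reject the null hypothesis when the P-value is below alpha). The helper names `is_significant` and `predict_mpg` are hypothetical and not part of the course script. The printed values are rounded, so the recomputed t-ratios and F-statistic only approximately match the summary.

```python
# Values copied from the OLS summary above (rounded as printed there).
ALPHA = 0.05
N_OBS = 30            # No. Observations
N_PREDICTORS = 2      # Df Model
R_SQUARED = 0.838     # R-squared
F_PVALUE = 2.16e-11   # Prob (F-statistic)

# term name -> (coef, std err, P>|t|); a printed 0.000 means "below 0.0005".
terms = {
    "Intercept": (37.2363, 1.583, 0.000),
    "wt": (-3.7564, 0.632, 0.000),
    "hp": (-0.0332, 0.009, 0.001),
}


def is_significant(p_value, alpha=ALPHA):
    """Decision rule: reject H0 when the P-value is below alpha (hypothetical helper)."""
    return p_value < alpha


def predict_mpg(wt, hp):
    """Fitted equation: mpg-hat = intercept + b_wt * wt + b_hp * hp (hypothetical helper)."""
    return terms["Intercept"][0] + terms["wt"][0] * wt + terms["hp"][0] * hp


# Overall F-test: is at least one slope nonzero?
overall_significant = is_significant(F_PVALUE)
print(f"Overall F-test: reject H0 = {overall_significant}")

# Individual t-tests for the two slopes; t is the coefficient over its standard error.
decisions = {}
for name in ("wt", "hp"):
    coef, std_err, p_value = terms[name]
    t_ratio = coef / std_err
    decisions[name] = is_significant(p_value)
    print(f"{name}: coef = {coef}, t = {t_ratio:.3f}, reject H0 = {decisions[name]}")

# Adjusted R-squared and the F-statistic can both be recovered from R-squared.
df_resid = N_OBS - N_PREDICTORS - 1
adj_r_squared = 1 - (1 - R_SQUARED) * (N_OBS - 1) / df_resid
f_from_r2 = (R_SQUARED / N_PREDICTORS) / ((1 - R_SQUARED) / df_resid)
print(f"Adjusted R-squared = {adj_r_squared:.3f}, F from R-squared = {f_from_r2:.2f}")

# Example prediction for a car weighing 3,440 lbs with 175 horsepower.
example_prediction = predict_mpg(wt=3.440, hp=175)
print(f"Predicted mpg = {example_prediction:.2f}")
```

With a fitted statsmodels result, the same quantities can be read directly from attributes such as `model.params`, `model.pvalues`, `model.fvalue`, `model.f_pvalue`, `model.rsquared`, and `model.rsquared_adj` instead of being typed in by hand. An R-squared of 0.838 means that about 83.8% of the variation in miles per gallon in this sample is explained by weight and horsepower together.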
