

Four problems from the Stat2 (Cannon, A.R., Cobb, G.W., et al.) textbook and other sources are provided below. Produce appropriate SAS code to assist you in completing the tasks described in these problems. For each problem, a data set is provided in Canvas; use an appropriate SAS data step or appropriate import functionality to get the data into SAS. You may copy and paste the data from the text files posted in Canvas into datalines in your SAS code if you choose.

Please follow the Guidelines for Submitting Assignments document for submitting your work. You will be uploading two files: one with your SAS code (.sas) and one with your output, including your interpretations and explanations (.docx). If one procedure can be used to get results for multiple questions, then that code needs to be run only once. You do not need to include all of your SAS output in the .docx document; copy only the relevant graphs or tables from your initial output, then paste them in sequential order into your document for submission. Place your explanations/answers/interpretations directly below each corresponding piece of output. The structure should be like the following:

1(a) Relevant answer to 1(a)
1(b) Relevant answer to 1(b)
…
2(a) Relevant answer to 2(a)
…

1. A study involving 9 species of animals was conducted to predict an animal’s expected lifespan from its gestation period. The gestation period (in days) and life expectancy (in years) for each of the animals are contained in the AnimalGestation.txt file. Read these data lines into SAS.
   a. Use Proc Sgplot to produce a scatterplot of gestation period vs. life expectancy (determine the logical choices for the X and Y variables). Comment on any patterns.
   b. What is the equation of the regression line? Use the appropriate SAS tools to determine the parameters and write the equation.
   c. Produce a plot of the regression line on the scatterplot of the two variables. Does the regression line appear to be a good fit? Explain.
   d. Analyze appropriate residual plots for the linear model relating gestation period to life expectancy. Are the conditions for the regression model met? Explain.

2. In a water distillation process, the temperature (Temp) and vapor pressure (Pressure) were recorded at various points in time. The data were placed in the file VaporPressure.txt.
   a. Produce the relevant scatterplot to investigate how the temperature impacts the vapor pressure. Comment on what the scatterplot reveals about the relationship.
   b. Determine the equation of the regression line for predicting vapor pressure from temperature.
   c. Produce and examine relevant residual plots. Comment on what they reveal about whether the conditions for regression model inference are met by this model.

3. Passer rating is a complex calculation that combines various NFL quarterback statistics. The rating is meant to measure the effectiveness of the quarterback: the higher the rating, the better. Simpler statistics are desired to measure quarterback effectiveness. For a recent season, the passer rating (Rating), completion percentage (Pct), and number of touchdowns completed (TD) were recorded for several starting quarterbacks. The raw data are in the text file called Passer.txt. Create a SAS data step that will produce a data set for this information.
   a. Create two scatter diagrams, one plotting Pct vs. Rating and the other plotting TD vs. Rating. Assume that Rating is the response.
   b. Based on visual inspection of the pattern in each plot, which predictor (Pct or TD) would have a stronger fit with the response, Rating? Explain.
   c. Find the least squares regression line for predicting Rating from the predictor you selected in (b) above. Give the regression equation.
   d. Comment on the appropriate residual plots to assess the adequacy of the model in (c).

4. Average monthly gasoline prices for the states of Nebraska and New York were collected over a 12-month period. Prices tend to track one another, with gasoline typically being more expensive in New York. However, from time to time, regional supply disturbances cause the two prices to track differently. The gas prices from each state are provided in the file Gas_NE_NY.txt.
   a. Fit a simple linear regression model for predicting the average gasoline price in New York (Price_NY) using the Nebraska gasoline price (Price_NE). Provide the regression equation.
   b. Provide a fit plot (regression line on the scatterplot). Comment on how well you believe the line fits the data.
   c. Produce appropriate residual plots to check assumptions. Do the assumptions seem to hold? Explain.
   d. Regional supply disturbances occur infrequently and are viewed as disruptions to the gas price relationship. Based on your residual plots, identify what you believe to be the two months when regional supply disturbances occurred. Provide graphical and numerical evidence to support your conclusion.
   e. Since the two months referenced above do not have the same characteristics as the others, one can conclude that they do not belong with the others. Remove both months from the data set and refit the regression line. Produce appropriate residual plots. Comment on how the model fits the data now.
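
The workflow the four problems share (read the text file, fit the line, draw the fit plot, inspect residuals) can be sketched in SAS roughly as follows, using Problem 4 because it also involves listing residuals and refitting. This is a minimal sketch under stated assumptions, not a worked solution: the file path, the header row, the space-delimited layout, the column order, and the Month variable are all hypothetical and need to be checked against the actual Gas_NE_NY.txt file. Only Price_NE and Price_NY are names given in the problem. Dropping the two rows with the largest absolute studentized residuals is one possible way to carry out 4(e); whether those rows really are the disturbance months has to be confirmed from the plots.

```sas
/* Sketch only. Assumptions (not stated in the assignment): the file path,
   a header row, space-delimited columns, the column order, and a Month
   variable. Price_NE and Price_NY are the names given in Problem 4. */
data gas;
    length Month $ 12;
    infile "/path/to/Gas_NE_NY.txt" firstobs=2 truncover;
    input Month $ Price_NE Price_NY;
run;

ods graphics on;

/* 4(a)-(c): fit the line, draw the fit plot, and request residual plots */
proc reg data=gas plots(only)=(fitplot residuals diagnostics);
    model Price_NY = Price_NE;
    /* 4(d): save residuals so unusual months can be listed numerically */
    output out=gas_diag predicted=Fitted residual=Resid rstudent=RStud;
run;
quit;

/* Rank the months by the size of the studentized residual */
data gas_diag;
    set gas_diag;
    AbsRStud = abs(RStud);
run;

proc sort data=gas_diag;
    by descending AbsRStud;
run;

proc print data=gas_diag;
    var Month Price_NE Price_NY Fitted Resid RStud;
run;

/* 4(e): drop the two rows at the top of the sorted list and refit.
   This assumes those two rows are the disturbance months; confirm that
   against the plots before relying on it. */
data gas_trimmed;
    set gas_diag(firstobs=3);
run;

proc reg data=gas_trimmed plots(only)=(fitplot residuals diagnostics);
    model Price_NY = Price_NE;
run;
quit;
```

The same pattern should carry over to Problems 1 to 3 by swapping in the relevant file and variable names (for example Temp and Pressure in Problem 2). Where a standalone scatterplot is requested, a Proc Sgplot step with a scatter statement, with the predictor on X and the response on Y, can precede the Proc Reg step.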
