Oct 24, 2024
Exam corrections (optional) due TODAY at 11:59pm on Canvas
Lab 04 due TODAY at 11:59pm
Team Feedback (from TEAMMATES) due TODAY at 11:59pm
Mid-semester survey (strongly encouraged!) by TODAY at 11:59pm
HW 03 due Thursday October 31 at 11:59pm (released after class)
Looking ahead
Project: Exploratory data analysis due October 31
Statistics experience due Tuesday, November 26
No curves on individual exam grades
Exams will be weighted to reflect significant progress throughout the semester. There are 2 scenarios:
If Exam 02 score is at least 5 (out of 50) points greater than the Exam 01 score (before corrections), Exam 01 is 13% and Exam 02 is 27% of the final course grade
Otherwise, the exams are 20% each as stated in the syllabus.
A high respiratory rate can potentially indicate a respiratory infection in children. In order to determine what indicates a “high” rate, we first want to understand the relationship between a child’s age and their respiratory rate.
The data contain the respiratory rate for 618 children ages 15 days to 3 years. They were obtained from the Sleuth3 R package and are originally from a 1994 publication, “Reference Values for Respiratory Rate in the First 3 Years of Life”.
Variables:
Age: age in months
Rate: respiratory rate (breaths per minute)
Typically, a “fan-shaped” residual plot indicates the need for a transformation of the response variable Y
There are multiple ways to transform a variable, e.g., $\sqrt{Y}$, 1/Y, log(Y)
log(Y) is the most straightforward to interpret, so we use that transformation when possible
When building a model:
Choose a transformation and build the model on the transformed data
Reassess the residual plots
If the residual plots did not sufficiently improve, try a new transformation!
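The steps above can be sketched numerically. This is a minimal illustration only: it uses synthetic data (not the respiratory data), and the helper names `fit_line` and `spread_ratio` are hypothetical. Instead of drawing a residual plot, it compares the spread of the residuals for low versus high values of the predictor, a rough stand-in for checking whether the plot is fan-shaped.

```python
import math
import random
import statistics

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def spread_ratio(x, y):
    """Fit a line, then return SD(residuals | low x) / SD(residuals | high x).

    A ratio far from 1 suggests non-constant variance (a fan shape).
    """
    b0, b1 = fit_line(x, y)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    cutoff = statistics.median(x)
    low = [r for xi, r in zip(x, resid) if xi <= cutoff]
    high = [r for xi, r in zip(x, resid) if xi > cutoff]
    return statistics.pstdev(low) / statistics.pstdev(high)

# Synthetic data (an assumption for illustration): multiplicative noise around
# an exponential trend, which gives non-constant variance on the original scale.
rng = random.Random(210)
x = [rng.uniform(0, 36) for _ in range(2000)]
y = [math.exp(3.8 - 0.02 * xi + rng.gauss(0, 0.2)) for xi in x]

# Step 1: model on the original response, then assess the residuals.
raw_ratio = spread_ratio(x, y)
# Step 2: model on the transformed response, then reassess the residuals.
log_ratio = spread_ratio(x, [math.log(v) for v in y])

print(f"spread ratio, Y model:      {raw_ratio:.2f}")
print(f"spread ratio, log(Y) model: {log_ratio:.2f}")
```

In this synthetic setting the ratio for the log(Y) model should sit close to 1, while the ratio for the untransformed model should not. In practice the residual plot itself remains the primary diagnostic.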
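We want to interpret the model in terms of the original variable Y, not log(Y). Given the fitted model $\widehat{\log(Y)} = \hat{\beta}_0 + \hat{\beta}_1 X$, we write the regression equation in terms of Y: $\hat{Y} = \exp\{\hat{\beta}_0 + \hat{\beta}_1 X\} = \exp\{\hat{\beta}_0\}\exp\{\hat{\beta}_1 X\}$
Note
The predicted value $\hat{Y}$ is the predicted median of Y
Intercept: When $X = 0$, the median of Y is expected to be $\exp\{\hat{\beta}_0\}$
Slope: For every one unit increase in $X$, the median of Y is expected to multiply by a factor of $\exp\{\hat{\beta}_1\}$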
Why is the interpretation in terms of a multiplicative change?
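One way to see this, as a short sketch assuming the fitted model is $\widehat{\log(Y)} = \hat{\beta}_0 + \hat{\beta}_1 X$ with the natural log: compare the predictions at $X = x$ and $X = x + 1$.

$$
\frac{\hat{Y}(x+1)}{\hat{Y}(x)}
= \frac{\exp\{\hat{\beta}_0 + \hat{\beta}_1 (x+1)\}}{\exp\{\hat{\beta}_0 + \hat{\beta}_1 x\}}
= \exp\{\hat{\beta}_1\}
$$

A one unit increase in $X$ adds $\hat{\beta}_1$ on the log scale, and a difference of logs is the log of a ratio, so on the original scale the prediction is multiplied by $\exp\{\hat{\beta}_1\}$ regardless of the starting value $x$.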
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 3.845 | 0.013 | 304.500 | 0 |
Age | -0.019 | 0.001 | -25.839 | 0 |
Interpret the intercept in terms of (1) log(Rate) and (2) Rate.
Interpret the effect of Age in terms of (1) log(Rate) and (2) Rate.
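The arithmetic behind the Rate-scale interpretations can be checked with a few lines. This sketch uses only the two estimates from the table above and assumes the model was fit with the natural log; the variable names are illustrative.

```python
import math

# Estimates copied from the coefficient table for the log(Rate) vs. Age model.
b0 = 3.845   # intercept, on the log(Rate) scale
b1 = -0.019  # slope for Age (months), on the log(Rate) scale

# Back-transform to the original Rate scale (assumes natural log).
median_rate_at_age_0 = math.exp(b0)          # predicted median Rate when Age = 0
monthly_factor = math.exp(b1)                # multiplicative change per extra month
monthly_pct_change = (monthly_factor - 1) * 100

print(f"Predicted median Rate at Age 0: {median_rate_at_age_0:.2f} breaths per minute")
print(f"Multiplicative factor per month: {monthly_factor:.3f}")
print(f"Percent change per month: {monthly_pct_change:.2f}%")
```

On the Rate scale, the predicted median respiratory rate at Age 0 is about 46.8 breaths per minute, and each additional month of age multiplies the predicted median by about 0.981, a decrease of roughly 1.9%. Note that Age 0 lies just below the youngest age in the data (15 days).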
Try a transformation on the predictor variable X
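Suppose we have the following regression equation: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 \log(X)$
Intercept: When $X = 1$ (so $\log(X) = 0$), the mean of Y is expected to be $\hat{\beta}_0$
Slope: When $X$ is multiplied by a factor of $C$, the mean of Y is expected to change by $\hat{\beta}_1 \log(C)$ units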
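To see where the slope interpretation comes from, here is a short sketch assuming the fitted equation is $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 \log(X)$ with the natural log. Compare the prediction at $X = x$ with the prediction after multiplying $x$ by a factor $C > 0$:

$$
\hat{Y}(Cx) - \hat{Y}(x)
= \hat{\beta}_1 \log(Cx) - \hat{\beta}_1 \log(x)
= \hat{\beta}_1 \log(C)
$$

The change depends only on the factor $C$, not on the starting value $x$, which is why the slope is interpreted in terms of multiplying $X$ rather than adding to it.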
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 50.135 | 0.632 | 79.330 | 0 |
log_age | -5.982 | 0.263 | -22.781 | 0 |
Interpret the slope and intercept in the context of the data.
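As a numeric aid for the interpretation, the sketch below plugs the table estimates into the Rate vs. log(Age) equation. It assumes `log_age` is the natural log of Age in months (an assumption, since the slides do not show how the variable was created); the function name `change_in_rate` is illustrative.

```python
import math

# Estimates copied from the coefficient table for the Rate vs. log(Age) model.
b0 = 50.135  # intercept: expected Rate when log(Age) = 0
b1 = -5.982  # slope for log_age (assumed natural log of Age in months)

def change_in_rate(factor):
    """Expected change in mean Rate when Age is multiplied by `factor`."""
    return b1 * math.log(factor)

rate_at_1_month = b0 + b1 * math.log(1)      # log(1) = 0, so this equals b0
change_if_age_doubles = change_in_rate(2)
change_if_age_up_10pct = change_in_rate(1.10)

print(f"Expected Rate at Age = 1 month: {rate_at_1_month:.3f}")
print(f"Change in expected Rate when Age doubles: {change_if_age_doubles:.3f}")
print(f"Change in expected Rate when Age increases 10%: {change_if_age_up_10pct:.3f}")
```

Under these assumptions, a 1-month-old is expected to have a respiratory rate of about 50.1 breaths per minute, and doubling a child's age is associated with an expected decrease of about 4.1 breaths per minute.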
Recall the goal of the analysis:
In order to determine what indicates a “high” rate, we first want to understand the relationship between a child’s age and their respiratory rate.
Which is the preferred metric to compare the models -
Rate vs. Age | log(Rate) vs. Age | Rate vs. log(Age) |
---|---|---|
0.477 | 0.52 | 0.457 |
Which model would you choose?
See Log Transformations in Linear Regression for more details about interpreting regression models with log-transformed variables.