You are an analyst for a large company consisting of regional sales teams across the country. Twice every year, this company promotes some of its salespeople. Promotion is at the discretion of the head of each regional sales team, taking into consideration financial performance, customer satisfaction ratings, recent performance ratings and personal judgment.
You are asked by the management of the company to conduct an analysis to determine how the factors of financial performance, customer ratings and performance ratings influence the likelihood of a given salesperson being promoted. You are provided with a data set containing data for the last three years of salespeople considered for promotion.1
For this analysis, you will focus on understanding the association between performance rating and promotion using the following variables:
promoted: A binary value indicating 1 if the individual was promoted and 0 if not
performance: the most recent performance rating prior to promotion, from 1 (lowest) to 4 (highest)
What is the probability a randomly selected employee got promoted in the last three years?
What is the probability a randomly selected employee got the highest rating on the performance review in the last three years?
What are the odds a randomly selected employee got the highest rating on the performance review in the last three years?
# add code here
Exercise 2
What is the probability a randomly selected employee who got the lowest performance rating was promoted in the last three years?
What are the odds a randomly selected employee who got the lowest performance rating was promoted in the last three years?
# add code here
Exercise 3
Make a plot to visualize the relationship between being promoted and performance review. Use the plot to describe the relationship between the two variables.
# add code here
Exercise 4
How do the odds a randomly selected employee with the highest performance rating got promoted in the last three years compare to the odds for an employee with the lowest performance rating?
How do the odds a randomly selected employee with a performance rating of 2 got promoted in the last three years compare to the odds for an employee with the lowest performance rating?
# add code here
Exercise 5
We can use a logistic regression model to understand the relationship between performance rating and promotion. (We will discuss this in detail next class, but will get a preview for now.)
Let \(\pi\) be the probability a randomly selected employee was promoted in the last three years. The statistical model is
The code and output to fit this model is shown below:
promotion_fit <-glm(promoted ~ performance, data = employees, family ="binomial")tidy(promotion_fit) |>kable(digits =3)
term
estimate
std.error
statistic
p.value
(Intercept)
-1.609
0.346
-4.646
0.000
performance2
0.386
0.414
0.931
0.352
performance3
1.137
0.392
2.899
0.004
performance4
1.792
0.440
4.075
0.000
Interpret the intercept in the context of the data in terms of the log-odds of being promoted.
Interpret the coefficient of performance2 in the context of the data in terms of the log-odds of being promoted.
Interpret the coefficient of performance2 in the context of the data in terms of the odds of being promoted. How does this compare to your response to Exercise 4.
[add response here]
Submission
To submit the AE:
Render the document to produce the PDF with all of your work from today’s class.
Push all your work to your AE repo on GitHub. You’re done! 🎉