FINAL PROJECT: Abstract and Reader's Response > An Exploration of UMD Faculty Salary Data
Looks like a good plan, A. You may want to approach this project with a "classic" observer stance: "I want to have a conversation with the data, to see what patterns are there." When economists study salaries, they typically divide them into quintiles (five bands) from low to high. You might start there.
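A minimal sketch of the quintile step, assuming Python and a made-up list of salaries (none of these figures come from the UMD data, and the helper name `quintile_of` is hypothetical):

```python
import statistics

# Hypothetical salaries in dollars; illustrative values only, not real UMD data.
salaries = [18_000, 24_000, 35_000, 52_000, 61_000,
            75_000, 88_000, 104_000, 150_000, 240_000]

# With n=5, statistics.quantiles returns the four cut points
# that separate the five bands.
cut_points = statistics.quantiles(salaries, n=5)


def quintile_of(salary, cuts):
    """Return the band number for a salary (1 = lowest fifth, 5 = highest fifth)."""
    return 1 + sum(salary > cut for cut in cuts)


# Map each salary to its band.
bands = {salary: quintile_of(salary, cut_points) for salary in salaries}

for salary, band in bands.items():
    print(f"${salary:>7,} -> quintile {band}")
```

Once every salary has a band, counting who falls in each band (by role or by college) is a natural next question.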
Then look at central tendencies: mean, median, and mode.
Then, compare across colleges/departments.
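One way these two steps might look together, sketched in Python with invented rows. The college labels and salaries are placeholders, and a real file would need to be loaded and cleaned first:

```python
import statistics
from collections import defaultdict

# Hypothetical (college, salary) rows; labels and figures are invented.
rows = [
    ("College A", 48_000), ("College A", 62_000), ("College A", 62_000),
    ("College B", 95_000), ("College B", 110_000),
    ("College B", 95_000), ("College B", 180_000),
]

# Group salaries by college.
by_college = defaultdict(list)
for college, salary in rows:
    by_college[college].append(salary)

# Mean, median, and mode(s) for each college.
# multimode returns a list, so ties for most common value are all kept.
summary = {
    college: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),
    }
    for college, values in by_college.items()
}

for college, stats in summary.items():
    print(college, stats)
```

Salary data is usually skewed by a few very high earners, so the median is often a steadier point of comparison than the mean.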
In this way, you run a number of associations to get a feel for what is there and then you can decide to array information according to some questions. Here are a few ideas.
Identify administrator roles, research roles, instructional roles.
Look at graduate students (they will appear as TAs or sometimes as Junior Lecturers).
The new category of Professional Track Faculty (PTK), compared to tenured/tenure-track faculty, is a very promising occasion for you to see large differences in salary as well as in numbers. The Bannister article gets at this difference and is an interesting case to emulate.
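A rough sketch of the PTK comparison in Python, assuming each record has already been tagged with a track. The "TTK" label for tenured/tenure-track, the `summarize` helper, and every salary are invented for illustration:

```python
import statistics

# Hypothetical (track, salary) records; not real UMD data.
# "TTK" is shorthand assumed here for tenured/tenure-track faculty.
records = [
    ("PTK", 52_000), ("PTK", 58_000), ("PTK", 61_000), ("PTK", 70_000),
    ("TTK", 98_000), ("TTK", 120_000), ("TTK", 145_000),
]


def summarize(track):
    """Return the headcount and median salary for one track."""
    values = [salary for t, salary in records if t == track]
    return {"count": len(values), "median": statistics.median(values)}


ptk = summarize("PTK")
ttk = summarize("TTK")

# Difference in median salary between the two tracks.
median_gap = ttk["median"] - ptk["median"]

print("PTK:", ptk)
print("TTK:", ttk)
print("Median gap:", median_gap)
```

Reporting headcount next to the median keeps both differences in view: how many people are in each track and how much they earn.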
Does this help you?
Then, you need to pick your visualizations: normal distribution, comparing distributions, and the quintiles approach (classic for income analysis).
December 10, 2017 | Marybeth Shea
Given a dataset, there is only so much you can observe by eye. Perhaps you can sort by certain columns, or even scan the data and remember certain values, but this relies only on the human eye and memory. Oftentimes, we can use computational techniques to supplement our analyses. Even better, sometimes we are curious about some claim and would like to validate it. The best way to support (or refute) a claim is with data, because a result that others can reproduce is hard to dispute. When presented with a dataset, though, it can be unclear how to break down the next steps of the analysis. Thus, this notebook will serve as a guide to getting started with analyzing datasets using data science. In this specific case, we will use UMD Faculty Salary Data as an example. The techniques shown here can be generalized to any dataset and are only a small subset of what is possible through computational techniques.
READER’S PROFILE:
A skeptical reader might be the University administration, or anyone from UMD who does not want an in-depth analysis of the salary data because it could reveal (unfavorable) trends that would otherwise go unseen by the human eye.
READER’S RESPONSE:
Hmm… where are the good trends? It seems like the writer may have something against the university, given all these negative trends being pointed out. Are these visualizations doctored? Could I easily replicate these results as well? If so, these statistics and this analysis would be more compelling.
REVISED PROBLEM STATEMENT:
The Diamondback’s yearly salary guides are an often-browsed dataset. After each guide is released, many basic observations can be made about the highest-paid faculty, your favorite professors’ salaries, etc. However, more complicated observations require computational analysis. More generally, given a dataset, it’s not always clear what next steps to take when moving from initial observations to more complex conclusions and data visualizations. Therefore, I aim to leave the audience with a better understanding of data science protocols and a better idea of trends in UMD faculty and staff salaries.
VOICE:
I will use first person throughout the project because this is both a tutorial and an analysis.
CITATION:
I will cite all my references at the end. These will mostly be links for further reading.