Professor: Joe Roith
Office: Usually in the upstairs room, or at the kitchen table
Email: roith1@stolaf.edu
Note: I will respond to emails as quickly as possible during the week before 5 PM. I may respond to emails in the evenings or on weekends, but do not rely on it.
Drop-in office hours: You can book appointment slots with me on M/T/Th 2 - 3 PM or Wed 1 - 2 PM. Each appointment slot is 15 min.
You may also schedule an appointment with me outside of these times (check my calendar for availability).
Class meetings: M/T/Th 10:00 - 11:30 AM and 12:30 - 2:00 PM
Course computing: R and RStudio
Textbook: Introductory Statistics with Randomization and Simulation (2014), David M. Diez, Christopher D. Barr, Mine Çetinkaya-Rundel
Available free online or from Amazon for under $10
Companion website: https://www.openintro.org/stat/textbook.php
Link to the “real” syllabus: Shadow Syllabus for all your classes
Statistics is the science of learning from data. By now you are aware that vast amounts of data are collected every moment in a variety of settings, such as political polls, clinical trials, stock markets, and social media user metrics, to name just a few. Statistical methods are especially critical to the sciences, as they are our only real way to test theories, quantify natural phenomena, create accurate predictions, and make evidence-based decisions.
Some of the practical ways in which statistics is used in the sciences include:
Medicine (monitoring patient information and history for more accurate diagnosis and prognosis)
Biostatistics (designing and analyzing clinical trials and epidemiological studies)
Bioinformatics (studying DNA data and human genetics as related to disease)
Actuarial science (performing analysis in the insurance and superannuation industry)
Ecology (environmental monitoring, species management, land surveying)
Climatology (weather forecasting using historical data)
Demography (studying the dynamics of human populations)
Psychometrics (constructing instruments for educational or psychological measurement)
Image processing (aiding in computer vision, facial detection, and remote sensing)
Statistics is a unique field of study: it lends itself equally to any of the areas mentioned above, but is built on its own theories and rules. We constantly evolve approaches to data collection, analysis, and interpretation. Although statistics uses mathematics, it is quite different from mathematics. Be prepared to think and read critically in this class. In addition, there is a language and vocabulary of statistics that is important to use properly.
To learn ways of investigating questions involving statistical concepts.
To develop basic skills in three key areas of statistics:
data collection – methods for obtaining meaningful data
data analysis – methods for exploring, organizing, describing, and modeling data
statistical inference – making decisions with data
To develop an understanding of statistical concepts and an ability to interpret the results of statistical analyses and to communicate those results using clear and precise statistical language.
To gain exposure to statistical problems from a wide variety of sources (medical studies, newspaper surveys, sociology studies, etc.), encompassing a variety of data types and collection methodologies.
To obtain practical experience in study planning, data collection, and the written communication of statistical concepts through individual homework assignments and group projects.
This course is centered on the idea that you will better understand and retain important statistical concepts if you build your own knowledge and practice using it, rather than by memorizing and regurgitating a set of facts. In order to actively construct knowledge in statistics, you must:
Engage with the material and think carefully about it; there are rarely rote, black-and-white solutions in statistics.
Become skillful at using R, a software package for exploring, modeling, and making decisions with data. You’ll have opportunities to use R inside and outside of class.
Expect small amounts of daily homework, decent-sized weekly homework sets, and longer projects which allow you to pull your knowledge together.
This is a 6-week course, so there will be a lot of work over a short period of time. You will have to be self-motivated to stay on task, and you will need to ask me questions when you have them.
My goal for this course is to have it be both synchronous and asynchronous. We will meet for every morning session (10:00 - 11:30 AM) unless I inform you otherwise. Afternoon sessions will sometimes be optional. I will send a weekly email with my plan for each class.
My expectations of you before each class:
Read the appropriate sections in the textbook (reading guides are available on Moodle, but I will not collect/grade them).
Complete any online tutorials assigned for that day.
Read any supplemental material posted.
Review the worksheet if it is your turn to lead discussion in your group.
I expect your participation in our classes. They will often include small and large group work. Occasionally you will be asked to lead a small group on a worksheet or task. This is not meant to intimidate you, put you on the spot, or force you to be the only contributor that day. It is meant to develop your leadership skills, promote interaction with classmates, and give me a chance to hear everyone’s voice. This will be a safe environment where mistakes and uncertainty are welcome!
Zoom is terrible… I’d much rather see you in person and watch your eyes roll at my terrible jokes. But, we’re doing the right thing this summer by not meeting in person. The hardest thing for me as a teacher is the mute button. Class is so much better when I can see and hear you. So I’m not requiring you to have your video on during class, but I highly encourage it (feel free to use a background if you’d like). And please unmute yourself when you have something to say.
This course will use R extensively. Course datasets and code are easily available on the R server which can be accessed by the link on Moodle. Supplemental materials will be provided on Moodle for learning to use R. Further instructions will be provided during class.
Learning R is necessary to do statistics. There is a learning curve and you will make mistakes; it just happens (and it still happens to me). It is important that you take a breath, step back, and remember to seek help to resolve your issue. Come to my office hours, ask classmates for help, and ask me questions in class. Past students have found learning R to be ultimately very useful and even fun. R is freely downloadable for both Mac and PC at https://cran.r-project.org/, and it’s available on a St. Olaf server!
Often during our classes, I will ask you to share your screen and code. That means you may have to split your screen to watch and type at the same time. Keep up with the class or your group and be prepared to talk through your code or output.
Your course grade will be determined as follows:
| Grade Subgroup | Weight |
|---|---|
| Homework Assignments | 15% |
| Quizzes | 15% |
| Midterm Exam | 20% |
| Data Analysis Reports (2) | 10% each |
| Participation and Class Project | 10% |
| Final Exam | 20% |
College-wide grading benchmarks can be found at: http://catalog.stolaf.edu/academic-regulations-procedures/grades/
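To see how the weights in the grading table combine, here is a minimal sketch in R. It assumes each category is scored as a percent and that the course percent is a plain weighted average; the syllabus does not spell out rounding or any other adjustments, so treat this as an illustration and not the official calculation. The category scores and the names `weights`, `scores`, and `course_percent` are made up for the example.

```r
# Category weights taken from the grading table (they sum to 1).
weights <- c(
  homework      = 0.15,
  quizzes       = 0.15,
  midterm       = 0.20,
  dar1          = 0.10,  # Data Analysis Report 1
  dar2          = 0.10,  # Data Analysis Report 2
  participation = 0.10,  # Participation and Class Project
  final         = 0.20
)

# Hypothetical category scores in percent, invented for illustration only.
scores <- c(
  homework      = 90,
  quizzes       = 85,
  midterm       = 80,
  dar1          = 88,
  dar2          = 92,
  participation = 100,
  final         = 84
)

# Weighted average: multiply each score by its weight, then add them up.
# Indexing by names(weights) keeps scores and weights aligned by category.
course_percent <- sum(weights * scores[names(weights)])
print(course_percent)
```

Replacing the values in `scores` with your own category percentages gives a rough estimate of where you stand at any point in the course.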
There will be weekly homework assignments. Homework assignments are designed to give you practice applying new statistical concepts to new data contexts. Homework will be drawn from the exercises at the end of each chapter as well as additional questions. The homework assignments are long, so you should work through them as we go along. Many of the problems require computation. Code snippets in these cases will usually be provided.
You are encouraged to discuss problems together, but each person must hand in their own work.
You must show your work for full credit.
Homework is ALWAYS due by 11:59 PM each Friday. Anything after this will be assessed a late penalty. This includes late assignments due to technical issues. Plan ahead.
I expect that you will start soon after receiving the assignment. The assignments are definitely not designed to be one-night jobs.
Homework is completed using RMarkdown, and you are required to upload both the RMarkdown (.Rmd) file and the knitted PDF file to Moodle.
We will have several short Moodle quizzes each week. You will have 1 hour to complete each quiz, and you must complete it independently. You may use the textbook, notes, slides, worksheets, tutorials, or any other materials when taking the quizzes. All weekly quizzes must be completed by 11:59 PM each Friday.
The midterm and final exams will focus on your abilities to use the statistical software, to interpret results, to express an understanding of statistical concepts, and to engage in statistical thinking on open-ended questions. They will not focus on plug-and-chug mathematics or hairy mathematical proofs. Make-up exams will be granted only under very special circumstances, and only if arranged in advance.
Statistics is best learned by getting hands-on with real data. And while learning the correct analysis is a part of this course, communicating and reading statistical analysis is another large part. You will complete 2 Data Analysis Reports for this class. These are short (2-3 page) summaries of the entire statistical process for a research question of your choosing. More information will be provided around the 2nd week of class.
We have a special client during this semester. The Twin Cities Berry Company has asked for some help analyzing data for their farm. Throughout the semester we will use this data in examples and meet with the growers to eventually create a final statistical report (as an entire class) to present the results of some experiments. Your ideas and participation for this aspect of the class will be recorded in addition to your regular participation in class.
You can all be successful in this class! If you are struggling or if you’re feeling good about things but have some questions, there are several resources:
Come see me during my office hours, or make an appointment.
Connect with your classmates, make friends, and form study groups. Besides improving your learning, you’ll get to talk with someone who doesn’t live in the same home as you!
Visit the Academic Support Center if you want to improve your general study skills and habits.
I am committed to supporting the learning of all students in my class. If you have already registered with Disability and Access (DAC) and have your letter of accommodations, please meet with me as soon as possible to discuss, plan, and implement your accommodations in the course. If you have or think you have a disability (learning, sensory, physical, chronic health, mental health or attention), please contact Disability and Access staff at 507-786-3288 or by visiting wp.stolaf.edu/academic-support/dac.
In keeping with St. Olaf College’s mission statement, this class strives to be an inclusive learning community, respecting those of differing backgrounds and beliefs. As a community, we aim to be respectful to all citizens in this class, regardless of race, ethnicity, religion, gender, or sexual orientation.
I am committed to making course content accessible to all students. If English is not your first language and this causes you concern about the course, please speak with me.
Plagiarism, the unacknowledged appropriation of another person’s words or ideas, is a serious academic offense. It is imperative that you hand in work that is your own, and that cites or gives credit to others whenever you draw from their work. Please see St. Olaf’s statements on academic integrity and plagiarism at: https://wp.stolaf.edu/thebook/academic/integrity/. See also the description of St. Olaf’s honor system at: https://wp.stolaf.edu/honorcouncil/
St. Olaf’s Academic Integrity Policy, including the Honor System, is an integral part of your academic experience. I consider any violation of this code to be extremely serious and will handle each case appropriately. Here are some guidelines for this class. They do not cover all eventualities, so if you have any doubts about a course of action, ask me.
Homework assignments may be done in collaboration with other students (this is highly encouraged). However, the final product must be written by you, in your own words, unless group assignments have been specifically allowed.
In no event can you copy answers from another student, a website, solutions manuals, or elsewhere.
When you sign your pledge on an exam that you have “neither given nor received assistance, and seen no dishonest work,” I treat your signature as your solemn pledge that all your actions have been honorable. For example, if we have a take-home exam, you are assuring me that you shared no information with others, that you did not solicit or receive help from anyone besides me, etc.
Don’t treat the honor code lightly; if you’re in doubt about a possible violation, ask me.
Tentative Outline of topics: The following table provides a rough sketch of the topics we’ll cover during specific weeks, along with the associated reading assignments in our textbook:
| Week | Topics | Book Sections |
|---|---|---|
| Week 1 (6/1 - 6/5) | Introductions; Intro to R; Big picture of statistics; Basics of data; Exploratory Data Analysis; Start hypothesis tests | Ch. 1; Ch. 2.1 |
| Week 2 (6/8 - 6/12) | Randomization tests for categorical variables; symbols; p-values; Normal distribution and Central Limit Theorem | Ch. 2 |
| Week 3 (6/15 - 6/19) | Confidence intervals; Formal tests for categorical variables; Chi-square tests | Ch. 2.8; Ch. 3 |
| Exam #1 opens Thursday (6/18) closes Friday (6/19) | ||
| Week 4 (6/22 - 6/26) | Inference for single mean; Paired data; Difference in two means; ANOVA | Ch. 4 |
| DAR #1 due Friday (6/26) | ||
| Week 5 (6/29 - 7/3) | Correlation; Linear Regression; Transformations | Ch. 5 |
| Finalize Report for Twin Cities Berry Company | ||
| Week 6 (7/6 - 7/10) | Multiple regression; cushion for unfinished topics; Review | Ch. 6 |
| DAR #2 due Thursday (7/9) | ||
| Final Exam opens Thursday (7/9) closes Friday (7/10) | ||