## Statistical Linear Models

**Math 158, Spring 2018**

Jo Hardin

2351 Millikan

jo.hardin@pomona.edu

**Office Hours:** Tues 8:30-10:30am, Thurs 1:30-3:30pm, or by appointment

**Mentor Sessions:**

Bradley Druzinsky: Tues 6-8pm, Millikan 1021 (Emmy Noether Room)

**Texts:** I will follow ALSM (*Applied Linear Statistical Models*) to some extent. Although it is an old book, it is incredibly well written and thorough (check for yourself: read the Amazon reviews). The relevant sections are listed on the course schedule and in my course notes (available on Sakai). It would do you well to read the text before we cover the material in class.

*Applied Linear Statistical Models, 5th edition*; Kutner, Nachtsheim, Neter, Li (out of print; you can likely find the PDF online)

*An Introduction to Statistical Learning*; James, Witten, Hastie, Tibshirani (http://www-bcf.usc.edu/~gareth/ISL/)

Website for: *Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving*; Nolan and Temple Lang (http://rdatasciencecases.org/) (sample chapters will be posted on Sakai)

**Important Dates:**

- 2/5/18 Project Data due
- 2/12/18 Project SLR due
- 2/21/18 Exam 1
- 3/26/18 Project MLR due
- 4/11/18 Exam 2
- 4/30/18 (Monday) SENIORS Project Summary due
- 5/2/18 Exam 3
- 5/9/18 Project Summary due, 5pm

**Handouts:**

- Class notes posted on Sakai.
- YouTube videos on getting started with R and RStudio: Introduction to RStudio
- R documentation / help
- swirl package
- Google for R: http://www.rseek.org/
- R tutorial
- An Introduction to R, Venables & Smith
- R Language Definition, R Core Team
- Another tutorial, with exercises & solutions
- Mosaic Reference Guide, need to install the mosaic package
- A Student’s Guide to R; Horton, Pruim, Kaplan (click on “Raw” to download)
- Data Wrangling Cheatsheet: http://www.rstudio.com/resources/cheatsheets/

**Reflection Questions:**

- For each chapter, I will provide a list of questions (in course notes, posted to Sakai) designed to help you reflect on the big picture of what we’ve covered. They will help you get unstuck from the weeds of technical details, computing, and HW problems. You should use them regularly: both to check your understanding and to study for exams. If you don’t know the answer to any question on material we have covered, you should clarify the concept with the mentor or the instructor as quickly as possible.

**Homework:**

- Homework will be assigned from the text, with some additional problems. One homework grade will be dropped. Homework will be done using the statistical software package R, and all homework must be written in R Markdown (or R Sweave if you want to use LaTeX). Homework is due to Sakai by noon on Wednesday.
- HW is graded on a scale of 5/4/3/2/1. See the first HW assignment for more information. [Note: you will be graded down if your homework contains superfluous information (e.g., messages from library calls or stepwise regression printouts).]
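
One way to keep library messages out of a knitted homework file is sketched below. This is only an illustration, not a course requirement: it assumes the homework is knit with knitr (the default for R Markdown), and it uses the mosaic package purely as an example of a package that prints startup messages.

```r
# Place this in the first (setup) chunk of the .Rmd file.
# It hides messages and warnings in the output of every later chunk.
knitr::opts_chunk$set(message = FALSE, warning = FALSE)

# Alternatively, silence the startup messages of a single library call.
# (mosaic is only an example; substitute whichever package you load.)
suppressPackageStartupMessages(library(mosaic))
```

Setting the options once in a setup chunk keeps the rest of the document uncluttered, but it also hides warnings you may want to see while working, so consider turning it on only before the final knit.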

**Projects:**

- There will be a semester-long project done in pairs. Early in the semester, you will find data that you can work with for the entire semester. For each section we cover, you will analyze the data using the methods in that section. More information will be provided as the semester progresses. Because you will be working together, it will be good to work on GitHub. See below for how to set up GitHub.

**Computing:**

- R will be used for many homework assignments.
- We will be using R on the Pomona server: https://rstudio.campus.pomona.edu/ (All Pomona students will be able to log in immediately. Non-Pomona students need to go to ITS at Pomona to get Pomona login information.)
- http://swirlstats.com/ is a great way to walk through learning the basics of R.
- If you want to use R on your own machine, you may. Please make sure all components are up to date.
- R is freely available at http://www.r-project.org/ and is already installed on college computers. You must also install RStudio (http://rstudio.org/), and all R assignments should be turned in using R Markdown.

- Ideally, your paired project will be done using GitHub. To set up GitHub, follow the instructions at http://happygitwithr.com/.

**Participation:**

- This class will be interactive, and your participation is expected every day in class. Although notes will be posted, your participation is an integral part of the in-class learning process. We will regularly have warm-up activities, which will contribute to your participation grade. No laptop computers in class.

**Course Goals:**

- to understand the basic structure of a linear model.
- to know when a linear model is appropriate and what conclusions can be drawn given a particular dataset.
- to use graphical tools to investigate models associated with the data at hand.
- to communicate results effectively.

**Academic Honesty:**

You are encouraged to work together on homework assignments. Additionally, the group presentation will require close collaboration with a group of your peers. Everything you turn in must represent your own work. Copying and pasting code (or text) from your colleagues constitutes plagiarism and will not be tolerated. All exams (including take-home) will be closed person. You may not collaborate (discuss, complain, etc.) with other individuals about the exams. Pomona’s academic honesty policy is given below and will be taken seriously.

Pomona College is an academic community, all of whose members are expected to abide by ethical standards both in their conduct and in their exercise of responsibilities toward other members of the community. The college expects students to understand and adhere to basic standards of honesty and academic integrity. These standards include, but are not limited to, the following:

- In projects and assignments prepared independently, students never represent the ideas or the language of others as their own.
- Students do not destroy or alter either the work of other students or the educational resources and materials of the College.
- Students neither give nor receive assistance in examinations.
- Students do not take unfair advantage of fellow students by representing work completed for one course as original work for another or by deliberately disregarding course rules and regulations.
- In laboratory or research projects involving the collection of data, students accurately report data observed and do not alter these data for any reason.

**Advice:**

- Please feel free to stop by, email, or call if you have any questions about or difficulty with the material, the computing, the projects, or the course. Come see me as soon as possible if you find yourself struggling. The material will build on itself, so it will be much easier to catch up if the concepts get clarified earlier rather than later. Enjoy!

**Grading:**

- 15% Homework
- 20% Semester Project
- 25% Exam 1
- 25% Exam 2
- 10% Exam 3
- 5% Class Participation