Course Description

With a vast amount of information now collected on our online and offline actions — from what we buy, to where we travel, to who we interact with — we have an unprecedented opportunity to study complex social systems. This opportunity, however, comes with scientific, engineering, and ethical challenges. In this hands-on course, we develop ideas from computer science and statistics to address problems in sociology, economics, political science, and beyond. We cover techniques for collecting and parsing data, methods for large-scale machine learning, and principles for effectively communicating results. To see how these techniques are applied in practice, we discuss recent research findings in a variety of areas. This course was previously listed as MS&E 331.

Prerequisites: an introductory course in applied statistics and experience coding in R or Python.

There is a $25 course materials fee for running experiments on Mechanical Turk.

Sharad Goel
Imanol Arrieta Ibarra (TA)
Class: Mondays & Wednesdays @ 1:30 - 2:50 in Thornton 110
Lab Section: Wednesdays @ 12:30 - 1:20 in Thornton 110

Sharad's Office Hours
Mondays 3 - 5pm in Huang 356

Imanol's Office Hours
Wednesdays 3 - 5pm in Huang 314

During the first week of school, there is no lab section and office hours are by appointment only.

On Sunday, Oct. 2, we will hold optional (but highly recommended) crash courses on R (10am - 12pm) and Python (1 - 3pm) in Thornton 110. Both sessions are interactive, so please bring your computer and have R and Python 2.7 installed beforehand.

We use Piazza to manage course questions and discussion, and Canvas to submit assignments. Code examples are posted on GitHub.

Computing Environment

A Unix-like setup is required (e.g., Linux, OS X, or Cygwin). We primarily use R (RStudio is recommended), including ggplot2 for visualization and dplyr for data manipulation, and Python 2.7 (Anaconda Python is recommended). We also use Vowpal Wabbit (a fast online learning system) and Amazon Elastic MapReduce (a web service for distributed computing).

Grading

4 assignments (50%)
Project proposal (10%)
Final project (30%)
Scribe notes for one lecture (5%)
Participation (5%)