This course is all about variation, uncertainty, and randomness. Students will learn the vocabulary of uncertainty and the mathematical and computational tools to understand and describe it.
Elson Building, 400 Brandon Ave, Room 156
Graduate student in Data Science
Format of the class: In-class time will be a combination of lectures, group assignments, live coding, and student presentations. Please note: Circumstances may require the face-to-face portion of the class to be online.
Time: MWF, 9 - 9:50am, Dell 1 Room 105
Office Hours: MWF, 10am, Dell 1 Commons
TA Office Hours: Thursdays, 2pm, Dell 1 Commons
The following textbooks are freely available online via the UVA library.
Understanding Uncertainty by Dennis V. Lindley
Understanding Probability, 3rd edition by Henk Tijms
Introduction to Probability: Models and Applications by N. Balakrishnan, Markos V. Koutras, Konstadinos G. Politis
The following textbooks may also be helpful.
Probability and Statistics for Data Science by Norman Matloff
Introduction to Probability Models by Sheldon M. Ross
The course will be taught using R.
This course covers the fundamentals of probability theory and stochastic processes. The goal of the course is that students will become conversant in the tools of probability. At the end of the course, students should be able to clearly describe and implement concepts related to random variables, properties of probability, distributions, expectations, moments, transformations, model fit, basic inference, sampling distributions, discrete and continuous time Markov chains, and Brownian motion. The course will illustrate most topics with both analytic and computational solutions.
The final exam will be a 20-30 minute one-on-one oral exam with the instructor, recorded on Zoom. Prior to the exam, a set of practice questions will be provided, with the expectation that students will prepare for the oral exam by coding up solutions and writing explanations. During the oral exam, the instructor will ask a series of questions covering topics from the course and the practice questions. For example, the instructor may ask:
Students will be graded on both the accuracy of their responses and the clarity with which they explain course concepts and solutions to questions. The final exam will occur on 14 December 2023. Students will sign up for oral exam slots in early December.
Each class will have an assigned reading. Each reading is paired with a deliverable. Please read the assigned material and make a good-faith effort on the deliverable before class.
|Date||Reading or Topic||Deliverable|
|Fri, 2023-09-01||UU 4.1, 4.2||4|
|Mon, 2023-09-04||UU 4.3 to 4.7||5|
|Wed, 2023-09-06||UU 5.1 to 5.5||6|
|Fri, 2023-09-08||UU 5.6 to 5.13||7|
|Mon, 2023-09-11||UU 6.1 to 6.4||8|
|Wed, 2023-09-13||UU 6.5 to 6.9||Catch up|
|Fri, 2023-09-15||UU 6.10 to 6.12||Catch up|
|Fri, 2023-09-22||UU 7.1 to 7.3||11|
|Mon, 2023-09-25||UU 7.4 to 7.8||12|
|Wed, 2023-09-27||UU 8.1 to 8.3||13|
|Fri, 2023-09-29||UU 8.4 to 8.9||14|
|Mon, 2023-10-02||Fall Break|
|Wed, 2023-10-04||Binomial probabilities|
|Mon, 2023-10-23||Rock, paper, scissors|
|Wed, 2023-10-25||World Series||18|
|Fri, 2023-10-27||World Series (slides, slides 9 to 21)|
Some of the assignments will be traditional problem sets. Others will be more substantial projects requiring you to perform a simulation and summarize findings in a blog format. Each assignment will be graded on a pass/fail basis. Students will have opportunities to resubmit each assignment multiple times within a two-week window after initial feedback.
|Deliverable||First Submission Due Date||Resubmission Due Date|
|0. Getting started with GitHub (not graded)||None|
|2. Calculus of belief||2023-09-01||2023-10-02|
|3. Birthday problem||2023-09-04||2023-10-02|
|4. Two events||2023-09-06||2023-10-02|
|6. Basic rules||2023-09-11||2023-10-02|
|7. More rules||2023-09-13||2023-10-02|
|8. Bayes rule||2023-09-15||2023-10-03|
|9. Bayes rule, more practice||2023-09-18||2023-10-04|
|10. Diagnostic odds||2023-09-20||2023-10-09|
|12. Probability, Likelihood, Chance||2023-09-27||2023-10-18|
|13. Simpson’s paradox||2023-09-29||2023-10-18|
|15. Binomial practice||2023-10-06||2023-10-23|
|17. Simulation error||2023-10-23||2023-11-15|
|18. World Series||2023-10-30||2023-11-20|
|19. Birth weight||2023-11-08||2023-11-27|
|20. Birth weight with kernels||2023-11-10||2023-12-01|
|21. Birth weight with kernels (part 2)||2023-11-13||2023-12-01|
|22. Estimating the maternal age distribution with Maximum Likelihood||2023-11-27||2023-12-05|
|23. Bayesian updating||2023-11-27||2023-12-05|
|24. Estimating the weight distribution with method of moments||2023-11-29||2023-12-05|
|Definitions of Probability|
|Intro to R||Getting R|
|→ R Markdown||Example input|
|Simulation & Operating Characteristics||Slides|
|Basic Probability Ideas|
|→ Belief vs Frequency vs Information|
|→ Notebook / data.frame definition|
|→ And, Or||Slides|
|→ Conditional Probability||Slides|
|→ Law of Total Probability||Slides|
|→ Bayes Rule||Slides, PPV plot R code|
|→ Posterior, Prior, Likelihood, Chance|
|→ Random variable|
|Discrete Probability Models|
|→ Probability Mass Function|
|→ Bernoulli Random Variables||Slides, Hands and Sequences|
|→ Binomial Random Variables||”|
|→ Negative Binomial Random Variables||”|
|→ Poisson Random Variables||Slides|
|→ World Series Distribution||Hints|
|Continuous Probability Models||Slides|
|→ Cumulative Distribution Function|
|→ Probability Density Function|
|→ Uniform Random Variables|
|→ Normal Random Variables|
|→ Exponential Random Variables|
|→ Gamma Random Variables|
|→ Beta Random Variables|
|→ Mixture Distributions|
|Expectation and Variance|
|→ Data Types|
|→ Categorical, Ordinal, Interval, and Ratio Variables|
|Transformations of individual observations|
|Transformations of samples|
|→ Min and Max|
|→ Order Statistics|
|→ Sampling distributions|
|Methods of Fitting Models||Slides|
|→ Method of moments||deliverable 24|
|→ Maximum likelihood||deliverable 22|
|→ Bayesian||deliverable 23|
|→ Kernel Density Estimation||deliverables 19, 20, 21|
|Sampling Distributions from Fitted Models|
|→ Central Limit Theorem|
|→ Parallel Computing|
|→ Batch processing on the cloud|
|Brief introduction to inference|
|→ Sampling and Inference|
|→ Inference with CI|
|→ Inference with Hypothesis Testing|
|Multivariate Normal Distribution|
|→ Conditional Distribution|
|→ Marginal Distribution|
Grading Policies: Courses carrying a Data Science subject area use the following grading system: A, A-; B+, B, B-; C+, C, C-; D+, D, D-; F. The symbol W is used when a student officially drops a course before its completion or if the student withdraws from an academic program of the University.
Grades will be a weighted average of the final exam score (50%) and the deliverables score (50%). As deliverables are graded on a pass/fail basis, the deliverables score will be the percentage of deliverables that earn a pass. For example, a student who earns a 90 on the final and passes 8 of 10 deliverables will earn 90 × 0.5 + 80 × 0.5 = 85, which is a B.
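The weighted average described above can be sketched in a few lines. This is a minimal illustration written in Python (the course itself uses R). The function name `course_score` and its arguments are hypothetical and are not part of any course tooling, and the sketch assumes the final exam is scored on a 0 to 100 scale.

```python
def course_score(final_exam_score, deliverables_passed, deliverables_total):
    """Combine the final exam score and the deliverables score, weighted 50/50.

    Assumption: final_exam_score is on a 0-100 scale, and the deliverables
    score is the percentage of deliverables that earned a pass.
    """
    if deliverables_total <= 0:
        raise ValueError("deliverables_total must be positive")
    if not 0 <= deliverables_passed <= deliverables_total:
        raise ValueError("deliverables_passed must be between 0 and deliverables_total")

    # Percentage of deliverables passed, e.g. 8 of 10 -> 80.0
    deliverables_score = 100 * deliverables_passed / deliverables_total

    # Equal weights: 50% final exam, 50% deliverables
    return 0.5 * final_exam_score + 0.5 * deliverables_score


# The example from the grading policy: a 90 on the final, 8 of 10 deliverables passed
print(course_score(90, 8, 10))
```

With the figures from the policy's example (a 90 on the final and 8 of 10 deliverables passed), the function returns 85.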
The instructor may alter the course content and grading policies during the semester.
Students are encouraged to study together. The instructions for each assignment will indicate if and how students may work together on the deliverable. Students should not collaborate on the final exam. Students who violate the collaborative-work policy on an assignment will fail the assignment in question and forfeit the opportunity to retake or resubmit. Students who violate the collaborative-work policy on the final exam will fail all sections of the final exam and forfeit the opportunity to retake or resubmit. Students may be referred to the UVA Honor Committee.
University of Virginia Honor System: All work should be pledged in the spirit of the Honor System at the University of Virginia. The following pledge should be written out at the end of all quizzes, examinations, individual assignments, and papers: “I pledge that I have neither given nor received help on this examination (quiz, assignment, etc.)”. The pledge must be signed by the student. For more information, visit www.virginia.edu/honor.
UVA is committed to creating a learning environment that meets the needs of its diverse student body. If you anticipate or experience any barriers to learning in this course, please feel welcome to discuss your concerns with me. If you have a disability, or think you may have a disability, you may also want to meet with the Student Disability Access Center (SDAC), to request an official accommodation. You can find more information about SDAC, including how to apply online, through their website at www.studenthealth.virginia.edu/SDAC. If you have already been approved for accommodations through SDAC, please make sure to send me your accommodation letter and meet with me so we can develop an implementation plan together.