This course is all about variation, uncertainty, and randomness. Students will learn the vocabulary of uncertainty and the mathematical and computational tools to understand and describe it.

Thomas Stewart

Elson Building, 400 Brandon Ave, Room 156

thomas.stewart@virginia.edu

Github: thomasgstewart

Ethan Nelson

Graduate student in Data Science

ean8fr@virginia.edu

Github: eanelson01

**Format of the class:** In-class time will be a combination of lectures, group assignments, live coding, and student presentations. **Please note:** Circumstances may require the face-to-face portion of the class to be online.

**Time:** MWF, 10 - 10:50am, Dell 1 Room 105

**Instructor Office Hours:** MW, 11am, Dell 1 Commons (The instructor will leave if there are no questions after 15 minutes.)

**TA Office Hours:** Thursdays, 1pm, Dell 1 Commons

The following textbooks are freely available online via the UVA library.

Understanding Uncertainty

by Dennis V. Lindley

Understanding Probability, 3rd edition

by Henk Tijms

Introduction to Probability: Models and Applications

by N. Balakrishnan, Markos V. Koutras, Konstadinos G. Politis

The following textbooks may also be helpful.

Probability and Statistics for Data Science

by Norman Matloff

Introduction to Probability Models

by Sheldon M. Ross

The course will be taught using R.

The following are the four ideas that I hope will persist with students after the minutiae of the Poisson distribution have faded from memory. Expand each section to see the associated learning outcomes and topics.

| Learning outcomes | Topics |
|:------|:---|
| | long-run proportion<br>personal beliefs<br>combination of beliefs and data |
| express the rules of probability verbally, mathematically, and computationally | AND, OR, complement, total probability<br>simulation error (relative and absolute) |
| illustrate the rules of probability with examples | |
| using the long-run proportion definition of probability, derive the univariate rules of probability | |
| organize/express bivariate random variables in cross tables | |
| define joint, conditional, and marginal probabilities | |
| identify joint, conditional, and marginal probabilities in cross tables | |
| identify when a research question calls for a joint, conditional, or marginal probability | |
| describe the connection between conditional probabilities and prediction | |
| derive Bayes' rule from cross tables | |
| apply Bayes' rule to answer research questions | |
| determine if joint outcomes are independent | |
| calculate a measure of association between joint outcomes | |
| apply the cross table framework to the special case of binary outcomes | Sensitivity<br>Specificity<br>Positive predictive value<br>Negative predictive value<br>Prevalence<br>Incidence |
| define/describe confounding variables | Simpson's paradox<br>DAGs<br>causal pathway |
| list approaches for avoiding confounding | stratification<br>randomization |
</details>
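To make the binary-outcome topics concrete, here is a minimal sketch of how positive and negative predictive value follow from sensitivity, specificity, and prevalence through the cells of a 2x2 cross table. It is written in Python for illustration only (the course itself uses R); the function name `predictive_values` and the input numbers are hypothetical, not taken from course materials.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from the cells of a disease-by-test cross table."""
    # Joint probabilities: the four cells of the 2x2 cross table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    # Conditional probabilities: a joint cell divided by its marginal (Bayes' rule).
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# Hypothetical test characteristics, chosen only for illustration.
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.02)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With a prevalence this low, the positive predictive value comes out far below the sensitivity, which is the usual reason the distinction between the two conditional probabilities matters.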
## Probability models are a powerful framework for describing and simplifying real world phenomena as a means of answering research questions.

| Learning outcomes | Topics |
|:------|:---|
| list various data types | |
| match each data type with probability models that may describe it | Bernoulli<br>binomial<br>negative binomial<br>Poisson<br>Gaussian<br>gamma<br>mixture |
| discuss the degree to which models describe the underlying data | |
| tease apart model fit and model utility | |
| express probability models mathematically, computationally, and graphically | PMF/PDF<br>CMF/CDF<br>quantile function<br>histogram/eCDF |
| employ probability models (computationally and analytically) to answer research questions | |
| explain and implement different approaches for fitting probability models from data | Tuning<br>Method of Moments<br>Maximum likelihood<br>Bayesian posterior<br>kernel density estimation |
| visualize the uncertainty inherent in fitting probability models from data | sampling distribution<br>posterior distribution<br>bootstrap distribution |
| explore how to communicate uncertainty when constructing models and answering research questions | confidence intervals<br>support intervals<br>credible intervals<br>bootstrap intervals |
| propagate uncertainty in simulations | |
| explore the trade-offs of model complexity and generalizability | |
</details>
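As one illustration of fitting a model and showing the uncertainty in the fit, the sketch below estimates a Poisson rate from count data and builds a percentile bootstrap interval around it. It is a Python sketch under stated assumptions (the course itself uses R): the data in `counts` are made up, and the helper `bootstrap_means` is a hypothetical name.

```python
import random
import statistics

random.seed(2006)  # fixed seed so the sketch is reproducible

# Hypothetical count data, invented for illustration.
counts = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3, 1, 2]

# For a Poisson model, the method-of-moments and maximum likelihood
# estimates of the rate are both the sample mean.
lambda_hat = statistics.mean(counts)


def bootstrap_means(data, n_boot=2000):
    """Resample the data with replacement and return the sorted resample means."""
    return sorted(
        statistics.mean(random.choices(data, k=len(data))) for _ in range(n_boot)
    )


boot = bootstrap_means(counts)
# Percentile bootstrap interval: the middle 95% of the resampled estimates.
lower = boot[int(0.025 * len(boot))]
upper = boot[int(0.975 * len(boot)) - 1]
print(f"rate estimate = {lambda_hat}, 95% bootstrap interval = ({lower:.2f}, {upper:.2f})")
```

A histogram of `boot` would be the bootstrap distribution named in the topics column; the percentile interval is only one of several ways to summarize it.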
## Probability is a framework for coherently updating beliefs based on new information and data.

| Learning outcomes | Topics |
|:------|:---|
| select prior distributions which reflect personal belief | informative vs weakly informative priors |
| implement Bayesian updating | |
| manipulate the posterior distribution to answer research questions | |
</details>
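Bayesian updating can be sketched with the conjugate Beta-binomial pair, where the prior and posterior are both Beta distributions and updating amounts to adding observed counts. The example is in Python for illustration (the course itself uses R); the prior Beta(2, 2), the data (7 successes, 3 failures), and the function name `update_beta` are all assumptions chosen for the sketch.

```python
import random


def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


# Hypothetical weakly informative prior and hypothetical data.
post_a, post_b = update_beta(2, 2, successes=7, failures=3)
posterior_mean = post_a / (post_a + post_b)

# Manipulate the posterior by simulation: estimate P(p > 0.5 | data).
rng = random.Random(2006)
draws = [rng.betavariate(post_a, post_b) for _ in range(10_000)]
prob_above_half = sum(d > 0.5 for d in draws) / len(draws)

print(f"posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
print(f"estimated P(p > 0.5 | data) = {prob_above_half:.3f}")
```

Drawing from the posterior and counting is a general pattern: any research question phrased as a statement about the parameter can be answered by the proportion of draws for which the statement holds.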
## Probability models can be expressed and applied mathematically and computationally.

| Learning outcomes | Topics |
|:------|:---|
| use probability models to build simulations of complex real world processes to answer research questions | |
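A simulation answers a probability question by repeating a random process many times and reporting a long-run proportion. The sketch below, in Python for illustration (the course itself uses R), uses a deliberately simple process, rolling a fair die four times, so that the simulated answer can be checked against the exact one and the absolute and relative simulation error can be reported. The function name `estimate_at_least_one_six` and the number of replicates are assumptions.

```python
import random


def estimate_at_least_one_six(n_sims, rng):
    """Estimate P(at least one six in four rolls) as a long-run proportion."""
    hits = sum(
        any(rng.randint(1, 6) == 6 for _ in range(4)) for _ in range(n_sims)
    )
    return hits / n_sims


rng = random.Random(2006)  # fixed seed so the sketch is reproducible
estimate = estimate_at_least_one_six(100_000, rng)

# Exact answer from the complement rule: 1 - P(no six in four rolls).
exact = 1 - (5 / 6) ** 4

abs_error = abs(estimate - exact)
rel_error = abs_error / exact
print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
print(f"absolute error = {abs_error:.4f}, relative error = {rel_error:.4f}")
```

For processes too complex to solve analytically, the exact value is unavailable, and the number of replicates is chosen so that the expected simulation error is small enough for the question at hand.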

| Date | Agenda |
|:------|:---|
| Jan 17 | [Survey/Github setup](https://link.tgstewart.xyz/survey) |
| Jan 19 | [Intro R](https://tgstewart.cloud/into-r.pptx) |
| Jan 22 | **Reading:** [Get started guide](https://rmarkdown.rstudio.com/lesson-1.html) |
| Jan 24 | **Reading:** [Intro Markdown](https://markdownguide.offshoot.io/getting-started/)<br>[Markdown Cheatsheet](https://www.markdownguide.org/cheat-sheet)<br>[Tools](https://tgstewart.cloud/tools.pptx)<br>[Reproducible Reports](http://tgstewart.xyz/reproducible-research-tools/) |
| Jan 26 | DUE: [HW1](https://github.com/UVADS/DS-2006/blob/main/assignments/hw1-euler-problems.md) |
| Jan 29 | **Reading:** (optional) First 5 videos of [Learn R Programming](https://www.youtube.com/playlist?list=PLjgj6kdf_snYBkIsWQYcYtUZiDpam7ygg)<br>(optional) [Intro to VS Code](https://www.youtube.com/watch?v=B-s71n0dHUk)<br>(optional) [Using Git with Visual Studio Code](https://www.youtube.com/watch?v=i_23KUAEtUM) Note that you have already cloned your repo locally, whereas the video creates a fresh repo.<br>DUE: [HW2](https://github.com/UVADS/DS-2006/blob/main/assignments/hw2-euler-problems-rmarkdown.md)<br>[RStudio on Rivanna](https://tgstewart.cloud/rivanna-rstudio.pptx) |
| Jan 31 | |
| Feb 2 | DUE: [HW3](https://github.com/UVADS/DS-2006/blob/main/assignments/hw3-r-practice.md) |
| Feb 5 | **Reading:** [Understanding uncertainty](https://ebookcentral.proquest.com/lib/uva/reader.action?docID=1574353), CH 1<br>DUE: [HW4](https://github.com/UVADS/DS-2006/blob/main/assignments/hw4-uncertainty.md) |
| Feb 7 | DUE: [HW5](https://github.com/UVADS/DS-2006/blob/main/assignments/hw5-calculus-of-belief.md)<br>DUE: [HW1](https://github.com/UVADS/DS-2006/blob/main/assignments/hw1-euler-problems.md) Resubmission<br>[Operating Characteristics](https://tgstewart.cloud/01-probability-definition-slides.html) |
| Feb 9 | DUE: [HW6](https://github.com/UVADS/DS-2006/blob/main/assignments/hw6-birthday-problem.md)<br>DUE: [HW2](https://github.com/UVADS/DS-2006/blob/main/assignments/hw2-euler-problems-rmarkdown.md) Resubmission |
| Feb 12 | DUE: [HW7](https://github.com/UVADS/DS-2006/blob/main/assignments/hw7-two-events.md)<br>DUE: [HW3](https://github.com/UVADS/DS-2006/blob/main/assignments/hw3-r-practice.md) Resubmission<br>[Rules of prob 1](https://tgstewart.cloud/04-probability-bayes-rule.pdf)<br>[Rules of prob 2](https://tgstewart.cloud/04-more-bayes.pdf) |
| Feb 14 | Exam review<br>[Prep questions](https://tgstewart.cloud/midterm1-prep.html)<br>DUE: [HW8](https://github.com/UVADS/DS-2006/blob/main/assignments/hw8-independence.md) |
| Feb 16 | DUE: [HW4](https://github.com/UVADS/DS-2006/blob/main/assignments/hw4-uncertainty.md) Resubmission |
| Feb 19 | **Exam:** You will be given a set of prep questions on Feb 14. Generate solutions to the prep questions prior to the in-class exam. During the exam, you will be given test questions similar to the prep questions. You will be able to copy, paste, and tweak your solutions to the prep questions to solve the exam questions.<br>Read/Watch [Deliverable 1](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable1-roulette.md)<br>DUE: [HW5](https://github.com/UVADS/DS-2006/blob/main/assignments/hw5-calculus-of-belief.md) Resubmission |
| Feb 21 | Work on [Deliverable 1](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable1-roulette.md) |
| Feb 23 | DUE: [Deliverable 1](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable1-roulette.md)<br>[HW6](https://github.com/UVADS/DS-2006/blob/main/assignments/hw6-birthday-problem.md) Resubmission |
| Feb 26 | |
| Feb 28 | DUE: [HW9](https://github.com/UVADS/DS-2006/blob/main/assignments/hw9-basic-rules.md) |
| Mar 1 | DUE: [HW10](https://github.com/UVADS/DS-2006/blob/main/assignments/hw10-bayes-rule.md)<br>[Diagnostics](https://tgstewart.cloud/diagnostics.pptx) |
| Mar 4 | Spring break |
| Mar 6 | Spring break |
| Mar 8 | Spring break |
| Mar 11 | In class: [Deliverable 2](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable2-simulation-error.md) |
| Mar 13 | |
| Mar 14 | DUE: [Deliverable 2](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable2-simulation-error.md) |
| Mar 15 | |
| Mar 18 | [Data types](https://tgstewart.cloud/01-data-types.pptx)<br>DUE: [HW11](https://github.com/UVADS/DS-2006/blob/main/assignments/hw11-diagnostic-odds.md)<br>DUE: [HW 7 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw7-two-events.md) |
| Mar 20 | DUE: [HW 8 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw8-independence.md) |
| Mar 22 | [HW 12](https://github.com/UVADS/DS-2006/blob/main/assignments/hw12-data-types.md) |
| Mar 25 | [HW 13](https://github.com/UVADS/DS-2006/blob/main/assignments/hw13-confounding-and-randomization.md) |
| Mar 27 | Exam review<br>[Prep questions](https://tgstewart.cloud/midterm2-prep.pdf) |
| Mar 29 | |
| Apr 1 | **Exam:** You will be given a set of prep questions on Mar 27. Generate solutions to the prep questions prior to the in-class exam. During the exam, you will be given test questions similar to the prep questions. You will be able to copy, paste, and tweak your solutions to the prep questions to solve the exam questions.<br>[In class code (Prob tom)](https://tgstewart.cloud/prob-tom.R)<br>[Bernoulli (Binomial)](https://tgstewart.cloud/05-binomial-prob.html)<br>[Hands/Sequences](https://tgstewart.cloud/hands-and-sequences.pptx) |
| Apr 3 | |
| Apr 5 | [Bernoulli sequences](https://tgstewart.cloud/bernoulli-sequences.pptx) |
| Apr 8 | DUE: [HW 12 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw12-data-types.md) |
| Apr 10 | DUE: [Deliverable 1 Resubmission](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable1-roulette.md) |
| Apr 12 | DUE: [HW 14](https://github.com/UVADS/DS-2006/blob/main/assignments/hw14-world-series-distribution.md) |
| Apr 15 | DUE: [HW 11 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw11-diagnostic-odds.md) |
| Apr 17 | |
| Apr 19 | DUE: [HW 15](https://github.com/UVADS/DS-2006/blob/main/assignments/hw15-binomial-negbinomial-practice-problems.md) |
| Apr 22 | DUE: [Deliverable 2 Resubmission](https://github.com/UVADS/DS-2006/blob/main/deliverables/deliverable2-simulation-error.md) |
| Apr 24 | |
| Apr 26 | [KDE](https://tgstewart.cloud/cdf-pdf-kernels.html)<br>[KDE part 2](https://tgstewart.cloud/cdf-pdf-kernels-part2.html) |
| Apr 29 | Last class<br>[Exam review](https://tgstewart.cloud/final-exam-prep.html) |
| May 1 | DUE: [HW 13 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw13-confounding-and-randomization.md)<br>DUE: [HW 14 Resubmission](https://github.com/UVADS/DS-2006/blob/main/assignments/hw14-world-series-distribution.md) |
| May 3 | |
| May 6 | |
| May 8 | |
| May 9 | Final exam, 9:00am - 12:00pm |

## Adjustments

The instructor may alter the course content and grading policies during the semester.

## Collaborative learning

Students are encouraged to study together. The instructions for each assignment/deliverable will indicate if and how students may work together. Students should not collaborate on midterm or final exams. Students who violate the collaborative-work policy on an assignment, deliverable, or exam will receive a score of 0 on that assignment, deliverable, or exam. Students may be referred to the UVA Honor Committee.

**University of Virginia Honor System.** All work should be pledged in the spirit of the Honor System at the University of Virginia. The following pledge should be written out at the end of all quizzes, examinations, individual assignments, and papers: “I pledge that I have neither given nor received help on this examination (quiz, assignment, etc.)”. The pledge must be signed by the student. For more information, visit www.virginia.edu/honor.

## Accommodations

UVA is committed to creating a learning environment that meets the needs of its diverse student body. If you anticipate or experience any barriers to learning in this course, please feel welcome to discuss your concerns with me. If you have a disability, or think you may have a disability, you may also want to meet with the Student Disability Access Center (SDAC) to request an official accommodation. You can find more information about SDAC, including how to apply online, through their website at www.studenthealth.virginia.edu/SDAC. If you have already been approved for accommodations through SDAC, please make sure to send me your accommodation letter and meet with me so we can develop an implementation plan together.

</details>