Prerequisites: familiarity with Python; Jupyter installed on students' computers. Course description: this course introduces the fundamentals of statistical methods. The first part will be dedicated to introducing the language and principles of probability theory from both the frequentist and Bayesian points of view. We will review the standard probability distributions and describe their main properties. Statistical inference and hypothesis testing will be the key topics of the second part of the course, presenting the fundamental tools for the analysis of scientific data. Lectures will be complemented by two coding tutorials, allowing students to apply the acquired statistical methods to "real-life" examples.
Syllabus:
1. Review of Probability and Statistics: The language of probability. Random variables. Cumulative distribution function, probability density function. Conditional probability. Bayes' theorem. Expected value and variance. Common distributions. Types of convergence. Asymptotic theorems: Law of Large Numbers, Central Limit Theorem.
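As a taste of the kind of exercise the coding tutorials cover, here is a minimal NumPy sketch of the Central Limit Theorem (the seed, sample size, and number of repetitions are arbitrary choices, not part of the course material):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw 10,000 samples, each the mean of 50 uniform(0, 1) variables.
# By the Central Limit Theorem the sample means are approximately
# normal with mean 1/2 and variance 1/(12 * 50).
n, reps = 50, 10_000
means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)

print(f"mean of sample means: {means.mean():.4f}")   # close to 0.5
print(f"std of sample means:  {means.std():.4f}")    # close to sqrt(1/600) ≈ 0.0408
```

Plotting a histogram of `means` against the predicted normal density makes the convergence visible.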
2. Basics of Statistical Inference: Inference: parametric vs non-parametric, frequentist vs Bayesian. Estimators (consistency, bias, variance). Likelihood. Maximum likelihood estimation and maximum a posteriori probability. Confidence and credible intervals. Nuisance parameters.
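A small illustration of maximum likelihood estimation and estimator bias, in the spirit of topic 2 (the normal model and sample size are an assumed example, not prescribed by the syllabus): for Gaussian data the MLE of the variance divides by $n$ and is therefore biased, while the usual sample variance divides by $n-1$.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=3.0, size=1_000)

# Maximum likelihood estimates for a normal model: the sample mean,
# and the *biased* variance estimator (divides by n, not n - 1).
mu_hat = data.mean()
var_mle = ((data - mu_hat) ** 2).mean()   # MLE of the variance, biased
var_unbiased = data.var(ddof=1)           # bias-corrected sample variance

print(f"mu_hat = {mu_hat:.3f}")
print(f"var (MLE) = {var_mle:.3f}, var (unbiased) = {var_unbiased:.3f}")
```

Note that `var_mle` equals `var_unbiased * (n - 1) / n` exactly; the bias vanishes as $n \to \infty$, which is the consistency discussed in the lecture.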
3. Lab 1: hands-on Jupyter lab on probability and statistics.
4. Hypothesis Testing I: Significance, acceptance and Bayesian tests. Null hypothesis, p-value, Type I and II errors. Neyman-Pearson lemma. Multiple comparisons. Resampling methods.
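The resampling methods mentioned in topic 4 can be previewed with a permutation test for the null hypothesis, p-value, and Type I error ideas (the effect size, sample sizes, and seed below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, size=200)   # control sample
b = rng.normal(0.5, 1.0, size=200)   # shifted sample

observed = b.mean() - a.mean()

# Permutation test: under the null hypothesis both samples come from
# the same distribution, so the group labels are exchangeable and we
# can rebuild the null distribution of the mean difference by shuffling.
pooled = np.concatenate([a, b])
n_perm = 10_000
exceed = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[200:].mean() - perm[:200].mean()
    if abs(diff) >= abs(observed):
        exceed += 1

# The add-one correction keeps the estimated p-value strictly positive.
p_value = (exceed + 1) / (n_perm + 1)
print(f"observed difference = {observed:.3f}, p ≈ {p_value:.4f}")
```

A small p-value leads to rejecting the null hypothesis at the chosen significance level; rejecting a true null is exactly the Type I error controlled by that level.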
5. Hypothesis Testing II: Types of tests: location, independence, homogeneity. Order and rank statistics. Location: Z-test and Student's $t$-test. Independence: $\chi^2$ test. Homogeneity: Kolmogorov-Smirnov and Mann-Whitney tests. Generalization error and overfitting. Cross-Validation and complexity penalization.
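To illustrate why topic 5 distinguishes location tests from homogeneity tests, a short SciPy sketch (the two assumed distributions are chosen to have equal means but different shapes; `scipy` is assumed available, as in the lab environment):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, size=500)
y = rng.exponential(1.0, size=500) - 1.0   # same mean (0), different shape

# Student's t-test compares locations only: the means agree,
# so it typically fails to reject.
t_stat, t_p = stats.ttest_ind(x, y)

# Kolmogorov-Smirnov compares the full distributions and
# detects the difference in shape.
ks_stat, ks_p = stats.ks_2samp(x, y)

print(f"t-test p = {t_p:.3f}, KS p = {ks_p:.2e}")
```

This is the kind of comparison Lab 2 can explore: which test is appropriate depends on which alternative hypothesis one cares about.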
6. Lab 2: hands-on Jupyter lab on hypothesis testing.