Each topic requires 4 hours (20 hours in total); with 2 labs of 3 hours each, the course amounts to 26 hours (4 weeks: 05/10-30/10).
Prerequisites: familiarity with Python, and Jupyter installed on students' computers.
i. Review of Probability and Statistics: The language of probability. Random variables. Cumulative distribution function, probability density function. Conditional probability. Bayes' theorem. Expected value and variance. Common distributions. Types of convergence. Asymptotic theorems: Law of Large Numbers, Central Limit Theorem.
ii. Basics of Statistical Inference: Inference: parametric vs non-parametric, frequentist vs Bayesian. Estimators (consistency, bias, variance). Likelihood. Maximum likelihood estimation and maximum a posteriori probability. Confidence and credible intervals. Nuisance parameters.
iii. Lab 1: hands-on Jupyter lab on probability and statistics.
iv. Hypothesis Testing I: Significance, acceptance and Bayesian tests. Null hypothesis, p-value, Type I and II errors. Neyman-Pearson lemma. Multiple comparisons. Resampling methods.
v. Hypothesis Testing II: Types of tests: location, independence, homogeneity. Order and rank statistics. Location: Z-test and Student's $t$-test. Independence: $\chi^2$ test. Homogeneity: Kolmogorov-Smirnov and Mann-Whitney tests. Generalization error and overfitting. Cross-Validation and complexity penalization.
vi. Lab 2: hands-on Jupyter lab on hypothesis testing.
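To give a flavour of the topics above, the sketch below shows one of the resampling methods listed under Hypothesis Testing I: a two-sided permutation test for a difference in means, which produces a p-value under the null hypothesis that both groups come from the same distribution. It is not taken from the course material; the function name `permutation_test`, its parameters, and the synthetic data are illustrative assumptions, and only the Python standard library is used.

```python
import random
import statistics


def permutation_test(sample_a, sample_b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means (hypothetical helper).

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffling the pooled data simulates the null distribution.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Illustrative synthetic data (an assumption, not course data).
data_rng = random.Random(42)
group_a = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
group_b = [data_rng.gauss(1.0, 1.0) for _ in range(40)]

observed_diff, p = permutation_test(group_a, group_b)
print(f"observed difference in means: {observed_diff:.3f}")
print(f"permutation p-value: {p:.4f}")
```

A small p-value here would suggest that the observed difference is unlikely under the null hypothesis. The same loop can be adapted to other test statistics, such as a difference in medians, by replacing `statistics.fmean`.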
Please note that this course belongs to the Data Science Excellence Department programme. MAMA PhD students can plan 33% of their credits (i.e. 50 hrs) from this programme.