- Also known as Theoretical Machine Learning, this course (unit) was originally designed for elite-class Bachelor's and research students at several top Asia-Pacific universities, including Manipal Institute of Technology, University of Chinese Academy of Sciences, Nanjing University of Science and Technology, Vellore Institute of Technology, and SRM Institute of Science & Technology (since 2012).
- Materials in this course include resources collected from various open-source online repositories. You are free to use, change and distribute this package.
- If you find any issue or bug on this site, please submit an issue at tulip-lab/statistical-machine-learning.
- Pull requests are welcome.
- Preliminary unit 👉 :
- Subsequent unit 👉 :
- Point of Contact 👉 : Prof. Gang Li
Prepared by TULIP Lab
This course (aka unit) covers the foundations of statistical machine learning, which plays a pivotal role in areas such as deep learning, data science, and data privacy.
The primary focus is on the fundamental learning theories and frameworks of statistical machine learning, and on the mathematical derivations that turn these principles into practical algorithms. The unit first covers the statistical learning framework, PAC learnability, Empirical Risk Minimization (ERM), the No-Free-Lunch theorem, non-uniform learnability, and Structural Risk Minimization (SRM). It then turns to discriminative methods, including common convex optimization techniques, support vector machines, and kernel methods.
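As a rough illustration of two of the ideas named above (ERM and PAC sample complexity), here is a minimal sketch. It assumes a hypothetical finite class of 101 threshold classifiers on [0, 1], noise-free labels (the realizable case), and uniformly drawn inputs. The function names and the toy data are illustrative only and are not part of the course materials. The sample-size formula is the standard bound for finite hypothesis classes in the realizable case, m ≥ log(|H|/δ)/ε, as given in the course textbook.

```python
import math
import random


def sample_complexity_finite(h_size: int, epsilon: float, delta: float) -> int:
    """Realizable-case bound for a finite class H: ceil(log(|H| / delta) / epsilon)."""
    return math.ceil(math.log(h_size / delta) / epsilon)


def predict(threshold: float, x: float) -> int:
    """Threshold classifier: label 1 if x >= threshold, else 0."""
    return 1 if x >= threshold else 0


def empirical_risk(threshold: float, sample) -> float:
    """Fraction of labelled examples (x, y) that the classifier gets wrong."""
    errors = sum(1 for x, y in sample if predict(threshold, x) != y)
    return errors / len(sample)


def erm(thresholds, sample) -> float:
    """ERM rule: return a hypothesis with minimal empirical risk (first one on ties)."""
    return min(thresholds, key=lambda t: empirical_risk(t, sample))


# Hypothetical setup (assumption): H = {0.00, 0.01, ..., 1.00}, target threshold 0.37.
thresholds = [i / 100 for i in range(101)]
true_threshold = 0.37

# Number of examples sufficient for accuracy 0.05 with confidence 0.95 under the bound.
m = sample_complexity_finite(len(thresholds), epsilon=0.05, delta=0.05)

# Draw m uniform inputs and label them with the target hypothesis (no noise).
rng = random.Random(0)
sample = [(x, predict(true_threshold, x)) for x in (rng.random() for _ in range(m))]

best = erm(thresholds, sample)
print(f"m = {m}, ERM threshold = {best}, empirical risk = {empirical_risk(best, sample)}")
```

Because the target threshold belongs to the class, the ERM output always has zero empirical risk here. The PAC guarantee concerns its true risk, which in this toy setting is the distance between the returned threshold and 0.37.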
Students will have access to a comprehensive range of subject materials, comprising slide handouts, assessment documents, and relevant readings. Students are advised to begin each session by thoroughly reviewing the corresponding slide handouts and readings.
Additionally, students are encouraged to supplement their knowledge through independent research, using online resources or textbooks that cover the topics under study.
This unit requires a total of 48 class hours: 36 hours of teaching and 12 hours of student presentations and discussion. The unit plan is as follows:
| 🔬 Session | 🏷️ Category | 📒 Topic | 🎯 ULOs | 👨🏫 Activity |
|---|---|---|---|---|
| 0️⃣ | Preliminary | 📖 Induction | ULO1 | |
| 1️⃣ | Preliminary | 📖 Math Foundations | ULO1 | |
| 2️⃣ | Core | 📖 Statistical Learning Framework | ULO1 | |
| 3️⃣ | Core | 📖 PAC Learning | ULO1 ULO2 | |
| 4️⃣ | Core | 📖 VC Dimension | ULO1 ULO2 | |
| 5️⃣ | Core | 📖 Fundamental Theorem of PAC Learning | ULO1 ULO2 | |
| 6️⃣ | Core | 📖 Non-Uniform Learning | ULO1 ULO2 | |
| 7️⃣ | Core | 📖 Model Complexity | ULO1 ULO2 | |
| | Student Work | 📖 Selected Topics in SML | ULO3 | |
| 8️⃣ | Core | 📖 Convex Optimization and Learning | ULO1 ULO2 | |
| 9️⃣ | Core | 📖 Regularized Loss Minimization | ULO1 ULO2 ULO3 | |
| 🔟 | Advanced | 📖 Data Privacy | ULO1 ULO2 ULO3 | |
| | Student Work | 📖 Selected Topics in SML | ULO3 | |
| 🏆 | Advanced | 📖 [Invited Talk and Discussions] | ULO1 ULO2 | |
Each cohort may be assessed differently, depending on the specific requirements of its university.
The assessment of the unit mainly aims to measure students' achievement of the unit learning outcomes (ULOs, a.k.a. objectives) and to check their mastery of the theory and methods covered in the unit.
The detailed assessment specification and marking rubrics can be found at: S00D-Assessment. The relationship between each assessment task and the ULOs is shown below:
| 🔬 Task | 👨🏫 Category | 🎯 ULO1 | 🎯 ULO2 | 🎯 ULO3 | Percentage |
|---|---|---|---|---|---|
| 1️⃣ | Presentation | 50% | 25% | 25% | 25% |
| 2️⃣ | Project | 30% | 70% | 50% | |
| 2️⃣ | Report Presentation | 20% | 40% | 40% | 25% |
- SRM 2024: the final assessment files are due on 🗓️ Saturday, 18/05/2024 (tentative). All tasks are individual work (groups of one member only).
You are expected to submit each assessment component on time. Do not leave everything to the last moment: we will provide feedback on each component, and you are expected to use it in later assessments.
㊙️ If you find that you are having trouble meeting your deadlines, contact the Unit Chair.
This course uses several key references or textbooks, together with relevant publications from TULIP Lab:
- Understanding Machine Learning: From Theory to Algorithms, Shai Shalev-Shwartz and Shai Ben-David
- Research Publications, various resources and readings
Thanks go to these wonderful people 🌷
Made with contributors-img.