# CS3244 Module Review – Machine Learning

With machine learning the hottest topic in the AI world right now (2016), this module is hugely popular. In fact, it is oversubscribed this year, and I believe it will stay that way. For 16/17 Sem 1, Prof. Min-Yen Kan is taking over the module with a new syllabus and Python as the main programming language (try to guess which language was used before).

## Overview

- This course mixes theoretical knowledge with practical applications, with more focus on the theory behind machine learning.
- There are 3 homework assignments throughout the semester, each with coding tasks (in Python) and essay (theoretical) questions. The last homework is a Kaggle competition, where you can put whatever you have learnt into practice to solve a machine learning challenge.
- Both the midterm and the final exam are open-book, so whatever is tested won’t be found directly in the course materials.

## What you will learn

The content of the module can be *classified* into 3 categories: **Conceptual (C), Practice (P) and Theory (T)**, based on the goal of learning. **Conceptual** means you need to understand the idea, with no maths involved, whereas **Theory** involves maths, logic and proofs. **Practice** means you need to be able to code it in Python with the relevant machine learning libraries (mostly scikit-learn).

Each topic may fall into one or more categories:

- Perceptron learning algorithm and the linear model (linear regression, logistic regression, stochastic gradient descent) – P&T
- Support vector machine, kernelization – P&T
- Issues with Bias, Variance and Overfitting – C&P
- Regularization and validation – T
- Learning Theory (generalization, VC analysis) – T
- Decision Tree (again?), Ensemble Methods – C&P
- Unsupervised learning (Gaussian Mixture Model, Expectation Maximization) – C&P
- Neural Network and Deep Learning (Not as exciting as you might think) – C&P
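To give a feel for what a "Practice" topic looks like, here is a minimal sketch of the perceptron learning algorithm in plain Python. It is my own illustration, not code from the course: the function names `train_perceptron` and `predict` and the toy AND dataset are made up for this example.

```python
def train_perceptron(points, labels, max_epochs=100):
    """Perceptron learning algorithm for labels in {-1, +1}."""
    dim = len(points[0])
    weights = [0.0] * (dim + 1)  # weights[0] acts as the bias term
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            augmented = [1.0] + list(x)  # prepend a constant 1 for the bias
            score = sum(w * xi for w, xi in zip(weights, augmented))
            if y * score <= 0:  # misclassified, or exactly on the boundary
                # Nudge the weights towards the correct side.
                weights = [w + y * xi for w, xi in zip(weights, augmented)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: converged
            break
    return weights


def predict(weights, x):
    """Return +1 or -1 depending on the sign of the linear score."""
    score = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1 if score > 0 else -1


# Toy, linearly separable data: logical AND with -1/+1 labels.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]

weights = train_perceptron(points, labels)
predictions = [predict(weights, x) for x in points]
print(predictions)
```

The loop only stops early because the toy data is linearly separable; on non-separable data it simply runs for `max_epochs` passes.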

## Exams

Both the midterm and the final exam tend to focus on conceptual and theory questions, though there are algorithm tracing questions as well. Some questions in the tutorials are extremely difficult. However, the exam questions are generally doable as long as you have enough time (I did not have enough time to finish either the midterm or the final).

## Homework and Project

This module has 3 homework assignments. The first two are relatively straightforward, requiring you to implement a linear model and an SVM with some parameter tuning. There are also essay questions on the theory behind the linear model and SVM.
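For a rough idea of what "parameter tuning" means in practice, here is a small sketch using scikit-learn's `GridSearchCV` with an `SVC`. This is not the actual homework: the synthetic dataset, the parameter grid and the split sizes are all assumptions I picked for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data (a stand-in for a real dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate hyperparameters: regularization strength C and kernel type.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# 5-fold cross-validation on the training set for every combination.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
print(best_params, test_accuracy)
```

Holding out a test set that the grid search never sees is what keeps the final accuracy an honest estimate, which ties back to the validation topic in the syllabus.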

The last homework is actually a project where you can form teams of up to 3 people. You get to choose between a facial recognition problem and a natural language processing problem on Kaggle. For both problems you can use any machine learning framework and algorithm, so you can be as creative and try as hard as you like. Based on the feedback, deep learning seems to work well for both problems, but the problems are likely to be different for the next batch.

## Useful Resources

- Course website for 2016:
- Kaggle competitions as machine learning projects:
  - Set Question Answering – https://inclass.kaggle.com/c/nus-3244-setqa
  - Labeled Faces in the Wild – https://inclass.kaggle.com/c/labeled-faces-in-the-wild
- Nice Article on Learning Theory:

## Advice

- With such huge demand, it can be hard to get this module. CS students can set or switch their focus area to **Artificial Intelligence** in order to have a higher chance of getting it in MPE. Otherwise, save enough CORS points. Note that you can also enter this module as a **guest student**, which allows you to access course materials in addition to attending lectures and tutorials. However, you will NOT be officially enrolled in the module, meaning you WON’T be able to sit the official exams or get grades or MCs for the module.
- Prepare to learn theory and maths. This is not all about coding in Python and calling libraries. In fact, theory and concepts take up most of the time in lectures and exams. The coding part is mainly handled by the homework assignments and the project.
- This is one of the few modules where you will do just fine soloing. You can do all the homework assignments and the project alone. In fact, I think the top scorers in both Kaggle competitions are about half teams and half solo participants.
