Machine Learning Lecture 3: working with text + nearest neighbor classification

We continue our work on sentiment analysis from Lecture 2. I go over common ways of preprocessing text in machine learning: n-grams, stemming, stop words, WordNet, and part-of-speech tagging. In part 2, I introduce a common approach to k-nearest neighbor classification with text, which is very similar to the vector space model with tf-idf encoding and cosine distance. Code and other helpful links: