Thursday, July 14, 2016

Stanford Machine Learning Week 9 review

What is Anomaly Detection?

Given a set of training examples x (each x is a vector), fit a model p(x) and check whether a new example x (a new vector) has probability p(x) < ε (epsilon). If so, it is flagged as an anomaly; otherwise, it is considered normal.

What is Gaussian (Normal) distribution?

A random variable X that follows a Gaussian distribution is written X ~ N(μ, σ^2), where μ (pronounced mu) is the mean; σ (pronounced sigma) is the standard deviation; σ^2 is the variance.
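
To make the notation concrete, here is a minimal sketch of the Gaussian density and of estimating μ and σ^2 from data. The function names gaussian_pdf and fit_gaussian are made up for illustration, and the variance estimate divides by m (not m - 1), which is an assumption about the convention in use.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x, where sigma2 is the variance."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def fit_gaussian(samples):
    """Estimate the mean and variance from a list of numbers (dividing by m)."""
    m = len(samples)
    mu = sum(samples) / m
    sigma2 = sum((s - mu) ** 2 for s in samples) / m
    return mu, sigma2

# Hypothetical data, only to show how the two functions fit together.
mu, sigma2 = fit_gaussian([1.0, 2.0, 3.0])
density_at_mean = gaussian_pdf(mu, mu, sigma2)
```

The density is highest at x = μ and falls off as x moves away from the mean, which is what makes a small p(x) a reasonable signal for "unusual".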

Anomaly Detection Algorithm

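p(x; μ, σ^2) is the Gaussian density of x for fixed values of mu and sigma squared. The algorithm fits a μ and a σ^2 to each feature of the training set, computes p(x) as the product of the per-feature densities, and flags x as an anomaly when p(x) < ε.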
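
A rough sketch of this algorithm in Python, assuming each feature is modeled by its own independent Gaussian. The names fit_features, p and is_anomaly, the training data, and the threshold epsilon = 1e-3 are all invented for illustration; in practice ε would be tuned on labeled examples rather than picked by hand.

```python
import math

def fit_features(X):
    """Estimate one mean and one variance per feature from the rows of X."""
    m = len(X)
    n = len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2 = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, sigma2

def p(x, mu, sigma2):
    """p(x) as the product of the per-feature Gaussian densities."""
    prob = 1.0
    for xj, mj, s2j in zip(x, mu, sigma2):
        prob *= math.exp(-((xj - mj) ** 2) / (2.0 * s2j)) / math.sqrt(2.0 * math.pi * s2j)
    return prob

def is_anomaly(x, mu, sigma2, epsilon):
    """Flag x when its estimated density falls below the threshold epsilon."""
    return p(x, mu, sigma2) < epsilon

# Hypothetical training set with two features per example.
X_train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.1, 10.2], [0.9, 9.8]]
mu, sigma2 = fit_features(X_train)
epsilon = 1e-3  # assumed threshold, not taken from the notes above

typical = is_anomaly([1.0, 10.0], mu, sigma2, epsilon)   # close to the training data
unusual = is_anomaly([5.0, 2.0], mu, sigma2, epsilon)    # far from the training data
```

A point near the training data gets a high density and is treated as normal, while a point far away gets a density close to zero and is flagged.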

 

Recommender Systems: Collaborative Filtering Algorithm

Collaborative Filtering Algorithm is based on linear regression. We can think of each movie as having features x1, x2, ..., xn: x1 may represent how romantic the movie is, x2 may represent how action-packed it is, and so on. If a user has rated enough movies (with each rating y satisfying 1 <= y <= 5), we can fit a linear regression for that user and predict the rating hθ(x) of a movie from its features.
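
As a small illustration of the per-user regression, the sketch below fits one user's θ by batch gradient descent and predicts hθ(x) = θ^T x. The function names, the learning rate, the iteration count, and the ratings are all assumptions chosen for the example; each feature vector starts with a constant 1 as the intercept term.

```python
def predict(theta, x):
    """h_theta(x) = theta^T x, where x[0] = 1 is the intercept term."""
    return sum(t * xi for t, xi in zip(theta, x))

def fit_user(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent on the squared error over one user's rated movies."""
    m = len(X)
    n = len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        errors = [predict(theta, x) - yi for x, yi in zip(X, y)]
        grad = [sum(e * x[j] for e, x in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical movies: [intercept, romance, action], rated by one user.
X_rated = [[1.0, 0.9, 0.1], [1.0, 1.0, 0.0], [1.0, 0.1, 0.9], [1.0, 0.0, 1.0]]
y_rated = [5.0, 5.0, 1.0, 1.0]

theta = fit_user(X_rated, y_rated)
romantic_guess = predict(theta, [1.0, 0.8, 0.2])  # unseen, mostly romantic movie
action_guess = predict(theta, [1.0, 0.2, 0.8])    # unseen, mostly action movie
```

Because this user rated the romantic movies highly, the fitted θ gives a higher predicted rating to the unseen romantic movie than to the unseen action movie.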

Learning features

Not only can we learn the thetas of each user; the Collaborative Filtering Algorithm can also learn the feature values of each movie automatically.
Given a dataset that consists of a set of ratings produced by some users on some movies, we want to learn the parameter vectors x(1), ..., x(nm), θ(1), ..., θ(nu) that produce the best fit (that is, minimize the squared error), where nm is the number of movies and nu is the number of users.
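
One way to picture this joint fit is a plain gradient descent that updates every x(i) and every θ(j) at the same time, using only the ratings that exist. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names, the dict-based ratings format, the learning rate, the iteration count, and the optional regularization weight lam (which the text above does not mention and which defaults to 0) are all hypothetical.

```python
import random

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def collaborative_filtering(ratings, n_movies, n_users, n_features=2,
                            alpha=0.02, lam=0.0, iters=10000, seed=0):
    """ratings maps (movie i, user j) -> rating; only observed pairs are listed."""
    rng = random.Random(seed)
    # Small random starting values so the features do not all stay identical.
    X = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_movies)]
    Theta = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_users)]
    for _ in range(iters):
        grad_X = [[lam * v for v in row] for row in X]
        grad_T = [[lam * v for v in row] for row in Theta]
        for (i, j), y in ratings.items():
            err = dot(Theta[j], X[i]) - y
            for k in range(n_features):
                grad_X[i][k] += err * Theta[j][k]
                grad_T[j][k] += err * X[i][k]
        # Update movie features and user parameters simultaneously.
        X = [[v - alpha * g for v, g in zip(row, g_row)] for row, g_row in zip(X, grad_X)]
        Theta = [[v - alpha * g for v, g in zip(row, g_row)] for row, g_row in zip(Theta, grad_T)]
    return X, Theta

def squared_error(ratings, X, Theta):
    """Sum of squared errors over the observed ratings only."""
    return sum((dot(Theta[j], X[i]) - y) ** 2 for (i, j), y in ratings.items())

# Hypothetical ratings: 4 movies, 3 users, two entries left unrated.
ratings = {
    (0, 0): 5.0, (0, 1): 5.0, (0, 2): 1.0,
    (1, 0): 5.0, (1, 2): 1.0,
    (2, 0): 1.0, (2, 1): 1.0, (2, 2): 5.0,
    (3, 1): 1.0, (3, 2): 5.0,
}
X, Theta = collaborative_filtering(ratings, n_movies=4, n_users=3)
final_error = squared_error(ratings, X, Theta)
missing_guess = dot(Theta[1], X[1])  # predicted rating of movie 1 by user 1
```

The learned features have no preset meaning such as "romance" or "action"; they are whatever values make the observed ratings fit best, and the same vectors can then be used to predict the ratings that are missing.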
