Machine Learning
Watch short videos of Machine Learning and RSVP for discussion meeting at

If this is your first time, take a look at Logistics
Notes on Programming Exercises

Study Group Meeting
Discussion Summary
April 24, 2013
  • We discussed these questions:
  • We shared our ideas and experiences in machine learning:
    • Regression algorithms and clustering are used a lot. We need to master the basics well.
    • What's the main advantage of a distributed system? It has a lot of memory (terabytes!). Pick several initial values, pick several dimensions, and run parallel jobs: map, sort, reduce. It's easy to do quicksort with terabytes of memory across 250 machines.
    • Is it challenging to do distributed programming? No. You just change your thinking style a bit: where to put the logic, sort it, read it, implement it in another map. Put data in a grid, change data, and process data. One map, one reducer; another map, another reducer; give customers intermediate data. It's no longer a single thread. It's only limited by our imagination.
    • Reduce dimensionality. Tell customers the cost and ask them to decide: with 3 dimensions, each run takes 2 hours; with 4 dimensions, each run takes 18 hours.
    • Briefly talked about applying machine learning in these areas: detecting fraudulent insurance claims, information management, speech content summarization and coaching of speakers, real-time translation, neural networks, topic modeling.
  • We had 18 people and formed two discussion groups - one had 8 people and the other had 10 people.
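The map, sort, reduce pattern mentioned in the April 24 discussion can be sketched in a few lines. This is only a single-process illustration, assuming a word-count job; the function names (`map_phase`, `reduce_phase`, `map_sort_reduce`) and the sample data are hypothetical, and a real distributed system would run each phase across many machines.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Map: emit one (key, value) pair per word, with a count of 1."""
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)


def reduce_phase(sorted_pairs):
    """Reduce: sum the values of each key. Input must be sorted by key."""
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield (key, sum(value for _, value in group))


def map_sort_reduce(records):
    pairs = map_phase(records)                    # map
    shuffled = sorted(pairs, key=itemgetter(0))   # sort (stands in for the shuffle step)
    return dict(reduce_phase(shuffled))           # reduce


# Hypothetical input; in practice each machine would hold one slice of the data.
sample = ["map sort reduce", "map again then reduce"]
counts = map_sort_reduce(sample)
```

Chaining jobs ("one map, one reducer; another map, another reducer") amounts to feeding the output of one `map_sort_reduce`-style step into the map phase of the next.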
May 1
  • We discussed these questions:
    • How to reduce the number of variables: non-negative matrix factorization.
    • How was the normal equation $\theta = (X^T X)^{-1} X^T y$ derived?
    • The advantage of gradient descent over the normal equation: it can keep updating as the data changes over time, and it avoids too much CPU processing and disk access.
  • We shared our ideas and experiences in machine learning:
    • Applications: email marketing (emailing customers based on their previous purchase behavior) and vehicle fuel efficiency (firing, fuel, air pressure), especially when a car is on cruise control.
    • Options trading algorithms. Latent Dirichlet allocation (LDA). Providing recommendations.
    • Clustering models.
    • Related classes: probabilistic graphical models (Bayesian networks), computing for data analysis (by Johns Hopkins), data analysis (by Johns Hopkins).
  • 8 people joined the discussion.
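On the May 1 question about the normal equation: setting the gradient of the least-squares cost $J(\theta) = \frac{1}{2m}\lVert X\theta - y\rVert^2$ to zero gives $X^T X \theta = X^T y$, which yields the closed form when $X^T X$ is invertible. The sketch below compares that closed form with batch gradient descent on synthetic data. The data, the learning rate `alpha`, and the iteration count are assumptions chosen for illustration, not values from the discussion.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 4 + 3x + small noise.
rng = np.random.default_rng(0)
m = 200
x = rng.uniform(-1.0, 1.0, size=m)
X = np.column_stack([np.ones(m), x])   # first column of ones is the intercept
y = 4.0 + 3.0 * x + rng.normal(scale=0.1, size=m)

# Normal equation: solve (X^T X) theta = X^T y rather than forming the inverse.
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)


def gradient_descent(X, y, alpha=0.1, iterations=5000):
    """Batch gradient descent on the least-squares cost."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient
    return theta


theta_gd = gradient_descent(X, y)
```

Both approaches should land on nearly the same parameters here. The normal equation has to solve a system whose size grows with the number of features, while gradient descent only needs repeated passes over the data.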
May 8
  • Discussed cost function, gradient for classification, intuitive understanding of regularization.
  • 6 people joined the discussion.
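The cost function, gradient, and regularization discussed on May 8 can be illustrated for logistic regression. This is a minimal sketch assuming the usual L2-regularized cross-entropy cost with an unpenalized intercept; the function name `cost_and_gradient`, the parameter `lam`, and the tiny dataset are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost_and_gradient(theta, X, y, lam):
    """Regularized logistic regression cost and its gradient.

    Assumes the first column of X is all ones (the intercept), which is
    left out of the penalty term.
    """
    m = len(y)
    h = sigmoid(X @ theta)
    penalized = theta.copy()
    penalized[0] = 0.0  # do not regularize the intercept
    cost = (-(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
            + lam / (2 * m) * (penalized @ penalized))
    grad = X.T @ (h - y) / m + lam / m * penalized
    return cost, grad


# Tiny made-up dataset: intercept column plus one feature.
X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, -0.3]])
y = np.array([1.0, 0.0, 1.0, 0.0])
cost0, grad0 = cost_and_gradient(np.zeros(2), X, y, lam=1.0)
```

Intuitively, the penalty term grows with the size of the non-intercept parameters, so a larger `lam` pushes the fit toward smaller weights and a smoother decision boundary.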
May 15
  • Discussed why a neural network solves the challenge of too many features in linear regression.
  • Discussed using a neural network for regression instead of classification.
  • Discussed whether to add more units to a hidden layer or add more hidden layers. We should start with one hidden layer and a minimal number of units.
  • 4 people joined the discussion.
May 24
  • Discussed how to understand backward propagation and how to derive the formula.
  • 7 people joined the discussion.
May 31
  • Discussed precision, recall; brainstormed on how to solve several applications.
  • 12 people joined the discussion.
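As a reminder of the precision and recall definitions discussed on May 31, here is a small sketch for binary labels. The function name `precision_recall` and the sample labels are made up for illustration.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall, F1) for binary labels where 1 is positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall(actual, predicted)
```

Precision answers "of the cases we flagged, how many were right?", while recall answers "of the real positive cases, how many did we catch?". For rare events such as fraud, both usually matter more than plain accuracy.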
June 5
  • Discussed support vector machines, especially kernels.
  • 4 people joined the discussion.
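For the kernel discussion on June 5, a Gaussian (RBF) kernel is one common choice; whether it was the kernel the group focused on is an assumption. The sketch below shows it as a similarity measure, with `gaussian_kernel` and `sigma` as illustrative names.

```python
import numpy as np


def gaussian_kernel(x1, x2, sigma):
    """Gaussian (RBF) similarity: exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.exp(-(diff @ diff) / (2.0 * sigma ** 2)))


# Identical points give the maximum similarity of 1.0.
same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)

# Similarity falls toward 0 as the points move apart; a larger sigma
# makes it fall off more slowly.
near = gaussian_kernel([1.0, 2.0, 1.0], [0.0, 4.0, -1.0], sigma=2.0)
far = gaussian_kernel([1.0, 2.0, 1.0], [0.0, 4.0, -1.0], sigma=0.5)
```

An SVM with this kernel effectively uses each point's similarity to chosen landmarks as its features, which is what allows a non-linear decision boundary.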
June 18
  • Discussed how to understand SVM intuitively, clustering, and PCA.
  • 12 people joined the discussion.
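PCA, one of the June 18 topics, can be sketched with a singular value decomposition of the centered data. This is one standard way to compute it, not necessarily the formulation the group used; the function name `pca` and the synthetic data are assumptions for illustration.

```python
import numpy as np


def pca(X, k):
    """Project X (rows are examples) onto its top-k principal components.

    Returns the projected data, the components, and the fraction of
    variance retained.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]
    Z = X_centered @ components.T
    retained = (S[:k] ** 2).sum() / (S ** 2).sum()
    return Z, components, retained


# Synthetic 3-D data that mostly varies along a single direction.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([
    t,
    2.0 * t + rng.normal(scale=0.05, size=100),
    rng.normal(scale=0.05, size=100),
])
Z, components, retained = pca(X, k=1)
```

The `retained` value gives a way to choose `k`: keep the smallest number of components that still preserves an acceptable share of the variance.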
June 21
  • Discussed SVM.
  • Shared experience in using machine learning in the medical field and airspace.
  • 8 people joined the discussion.
June 27
  • Discussed recommender systems and which libraries are the best.
  • 13 people joined the discussion.