I Think Tech: ml-class

Saturday, October 15, 2011

Machine Learning online course - Class 1

As I mentioned in my previous blog post, I am going to use this blog as my course notebook. All posts related to this course will have the "ml-class" tag, just in case.

The first class was all introductory material, as expected. What I really liked about this class was the real-world examples. They were very useful in understanding what to expect from this course. Anyway, here are my notes for the class:

Initially there were formal definitions of Machine Learning, one of them with rhyming phrases. I think we can skip those parts.

There are two types of learning algorithms: supervised and unsupervised.

1) Supervised - A bunch of right answers are already provided to the machine. The machine has to try and get more of those right answers for the next set of questions.
The data provided is already labeled: each input value comes with its known output value, and we have to predict the output value for a new input based on the existing data. Here the output dimension is known and defined. We have to find a suitable function which, when applied to the given input values, best matches the corresponding output values. The same function is then used to predict the output values of new inputs.
- Eg :
    1) Predicting the price of a house of a particular size, given the prices of various houses of varying sizes.
    2) Predicting whether a tumor is malignant or not based on its size, given the answer for tumors of various sizes.
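
To make the "find a suitable function, then reuse it for new inputs" idea concrete, here is a minimal sketch of the house price example. The sizes and prices are made-up numbers, `fit_line` and `predict_price` are names I picked for illustration, and a straight line fitted by least squares is just one possible choice of function, not necessarily what the course will use.

```python
# Hypothetical training data: house sizes (sq ft) and prices (in thousands).
# These are the "right answers" already provided to the machine.
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200.0, 300.0, 400.0, 500.0]


def fit_line(xs, ys):
    """Fit price = slope * size + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Apply the fitted function to a new, unseen house size."""
    return slope * size + intercept


print(predict_price(1750.0))
```

The function is fitted once on the known pairs and then applied to a size that was not in the data, which is exactly the "predict the output for a new input" step.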

Different Types
    1) Regression - The machine tries to predict a continuous-valued output, i.e. the value we are trying to predict belongs to a continuous range. (The house price example)
    2) Classification - The machine tries to predict a discrete-valued output, i.e. the range of values is a small finite set of discrete values. (The tumor example)
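
Here is a small sketch of the classification case, where the output is one of a few discrete values (0 or 1) rather than a number on a continuous range. The tumor sizes and labels are invented, `fit_threshold` and `classify` are hypothetical names, and a single size cut-off is only a simple stand-in for whatever classifier the course actually teaches.

```python
# Hypothetical labeled data: tumor sizes (cm) and whether each one was
# malignant (1) or benign (0).
tumor_sizes = [1.0, 1.5, 2.0, 3.0, 3.5, 4.0]
malignant = [0, 0, 0, 1, 1, 1]


def fit_threshold(sizes, labels):
    """Pick the size cut-off that misclassifies the fewest training examples."""
    xs = sorted(sizes)
    # Candidate cut-offs: midpoints between neighbouring sizes.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(t):
        # Count examples where "size > t" disagrees with the known label.
        return sum((s > t) != bool(l) for s, l in zip(sizes, labels))

    return min(candidates, key=errors)


threshold = fit_threshold(tumor_sizes, malignant)


def classify(size):
    """Return a discrete label: 1 (malignant) or 0 (benign)."""
    return 1 if size > threshold else 0


print(threshold, classify(3.2), classify(1.2))
```

Regression would return something like a price; here the answer can only ever be 0 or 1, which is what makes it classification.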

2) Unsupervised learning - The data set given does not come with any right answers. It is just a data set, and we are expected to make sense of it and come up with the inferences ourselves. There is no expected or target domain defined; it has to be inferred by examining the data. Very likely several target domains will be defined over the course of analyzing the data.
- Types :
    1) Clustering of data -
      - Eg : Google News example. Several articles about the same topic are grouped/clustered together. The input data set for this is just a bunch of articles (which is just one dimension/attribute). The other dimension (the common topic) is not well defined, i.e. the topics are not known beforehand. We keep defining them as we go. So we have to infer that some of the articles belong to the same/similar topic and can be grouped together.
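
A toy sketch of the clustering idea: no topics are given up front, and groups are created as we go whenever a headline does not look like any existing group. The headlines, the word-overlap (Jaccard) similarity, the 0.3 threshold and the greedy grouping are all my own assumptions for illustration. This is certainly not how Google News does it.

```python
# Hypothetical headlines; note there are no topic labels anywhere.
headlines = [
    "oil spill spreads in gulf",
    "gulf oil spill cleanup begins",
    "team wins cricket world cup",
    "cricket world cup final tonight",
]


def tokens(text):
    """Represent a headline as the set of its lowercase words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word overlap between two sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)


def cluster(items, threshold=0.3):
    """Greedily group items; a new group is created when nothing matches."""
    clusters = []  # each cluster is a list of indices into items
    for i, item in enumerate(items):
        words = tokens(item)
        for members in clusters:
            # Compare against the first headline that founded the cluster.
            if jaccard(words, tokens(items[members[0]])) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster is similar enough: define a new "topic".
            clusters.append([i])
    return clusters


print(cluster(headlines))
```

The number of groups is not fixed in advance; it falls out of the data, which matches the "we keep defining topics as we go" point above.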

That's it. Done with the first class. Yay! I am yet to attempt the review exercises. I have decided to do the review exercises for this class and the next one together.

ಹರಿಃ ಓಂ.

Friday, October 14, 2011

Starting with the Stanford online Machine Learning class

Today I am starting with the Stanford online Machine Learning classes, taught by Andrew Ng. This is my second attempt at learning machine learning, via the same medium, with the same professor and under the same program. This course has been available online for about three years, although the current one is much more polished and very meticulously designed for online learning, unlike the old ones, which were just recordings of the actual classroom teaching. Two years ago (i.e. in 2009) I, along with some of my friends/colleagues (from different teams), decided to learn machine learning at our office. Quickly a team of interested folks was formed. There was a friend who had completed his post-graduation in the US, and he had studied and, AFAIK, also worked on machine learning stuff. Then there were two other friends who had completed their post-graduation at IISc. Then there was my boss, who had also done his post-graduate studies in the US. Apart from these folks, there were some smart folks not keen on a post-grad degree. And finally there was me. Yeah, me too. :) The idea was that all of us would watch the video lectures and one of us would present a session every week.

We started with a bang, with an initial class on the basics of statistics and probability, taught by that experienced and well-learned friend of mine. He called it "Statistics 101". It was good. There was no video lecture for this, so it was useful for me as my math needed a lot of dusting off. This was followed by the first video lecture, which I believe had an introduction to ML in general and also introduced linear regression with one variable. It was taught/presented by another friend, who had finished his post-graduation at the prestigious IISc. It went well too. (A side note: this friend seemed to have picked up the teaching traits/style of his IISc professors, and I got the feeling of actually attending a class at IISc.) More importantly, most of us had watched this video and read the lecture notes once beforehand. All in all, the plan was on track up to that point.

But then it all fell apart the following week. The meeting time clashed with an actual work-related meeting for some. An upcoming release caused a couple of us to give this a skip. Then, after two missed meetings, the interest had pretty much waned, and the ever-increasing workload did not help anyone. After the meeting had been postponed many times and the calendar invite had been declined several times, the "Statistics 101" friend, who had set up the meeting, removed it entirely from the calendar, and the machine learning studies officially ceased to exist too.

Now, a couple of months ago, when I found out about the ML class being offered again in an entirely new package tailor-made for online learning, I decided that this time I would take it seriously and learn ML for real. I signed up immediately, not just for ML but also for the DB and AI classes that are being offered simultaneously, and for the "Advanced Track" in all of them too (which now looks like a bad move; I don't think I will be able to take up the AI class). Although classes officially started on Sunday/Monday, I could not get to them until today. I just kept postponing it. Thanks to the review questions deadline, it came down to a now-or-never situation and I finally took the bold step of starting with the ML and DB classes. Luckily the DB class doesn't have any assignments due on 16th Oct, so I just watched the introductory video. Then I started with the ML video lectures, which I am going through right now. I hope to keep up with the course schedule and get to all assignments on time, although they allow two delayed submissions. More importantly, I hope to learn something that I can use at my work right away, because I know there is scope for that at work.

I intend to blog continuously as I go through the course. This will sort of act like my notebook, keep my blog alive and updated, and give it some meaningful content. :)

Good luck to myself!
Hari: Om.
ಹರಿಃ ಓಂ.
