What is this AI & Machine Learning? And what do I need to know to learn Machine Learning?
If most of the stuff you come across on AI falls into either of the two types below, then read on...
Artificial Intelligence is the broader concept of machines being able to carry out tasks in a way that we would consider “smart”. Machine Learning is a current application of AI based around the idea that we should just be able to give machines access to data and let them learn for themselves.
Okay, tell me more about this Machine Learning.
Meet Bharath - he is trying to buy a car and is saving up for it. To get an estimate of how much he should save, he goes through a ton of ads on the internet and calls up multiple dealers. Here is what he finds -
New cars in the segment he is looking at are around $15,000,
one-year-old used ones sell for around two-thirds of that, $10,000,
two-year-old used ones go for $9,000, and so on.
So, our brilliant car buyer figures out that, after the first-year drop, a used car loses about $1,000 for every additional year of use, and the price hits a lower bound at around $4,000.
Generally, people do this all the time: they try to estimate something given some data and context. In Machine Learning language, this is called Regression.
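To make Bharath's rule of thumb concrete, here is a minimal sketch that writes it down as a function. The function name and variable names are made up for illustration; the numbers are the ones from his notes above. A regression algorithm would try to work out a rule like this from the data by itself, instead of having a person write it.

```python
def estimate_price(years_used: int) -> int:
    """Bharath's hand-made rule of thumb for the car segment he is looking at."""
    new_price = 15_000       # price of a new car
    one_year_price = 10_000  # price after the first year of use
    yearly_drop = 1_000      # drop for every additional year of use
    floor_price = 4_000      # prices stop falling around here

    if years_used <= 0:
        return new_price
    price = one_year_price - yearly_drop * (years_used - 1)
    return max(price, floor_price)


# Print the estimate for a few ages of car.
for years in (0, 1, 2, 5, 10):
    print(years, estimate_price(years))
```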
In the same way, most tasks can essentially be broken down into the 4 categories below.
Given a new “thing”, find out what its class is, a.k.a. classification problems. Is it a dog or a cat, is the email legit or spam, etc. Think bins or classes.
Given a new “thing”, find out what its value could be, a.k.a. regression problems. Predicting the price of a car, predicting stress in a network: think continuous numbers.
Understand and then manipulate underlying patterns in given data.
Learning a task by understanding what is good or bad behavior for doing the task.
At a high level, a machine can learn to do any of the above 4 types of tasks by looking at data. Say hello to Machine Learning.
Okay, why do we need Machines to learn these things?
Let us get back to the car purchase example. Remember, Bharath started out with a specific category of car in his head and then arrived at a method to estimate the price of a car. Now he wants to generalize this method to accommodate multiple other parameters: different makes / brands, month of purchase (in India cars somehow get cheaper in July and then again in Nov - Dec), dealership, color of the car, etc. These are too many variables and too much data for Bharath to keep track of, let alone figure out the underlying patterns. Hence, machines. And it works!! And it works unreasonably well!!
A machine copes with this task much better than a real person can by carefully analyzing all the dependencies in their mind.
Okay, what do you mean by “LEARN” when you say machines can learn?
Let us dive a little deeper into the “learning” aspect by taking one of the classification problems I alluded to in the previous section. Imagine you have multiple things and you want a machine to learn to differentiate one kind of object from the others.
This is how you make a machine learn -
Take some data which gives information on objects and the attributes each object has. Example (sorry for the dark example): say we feed a machine a bunch of measurements from 10 biopsy reports, each already marked as malignant or benign.
In Machine Learning language, what we have done now is give the machine “labeled data”. Data here means a set of attributes, also known as “features”: in our case cell circumference, cell density, cell radius and cell reflectivity. The label is the cell type.
Something happens inside your machine, and now if you give it any new record (that is, a bunch of attributes), your machine should be able to correctly label the cell as malignant or benign.
So, if you give it a cell with circumference 0.232, radius 1.293 and so on, the computer should be able to tell you it’s a benign cell.
In real-life problems there might be hundreds of attributes (and with neural nets you can end up working with far more).
In general, given a bunch of features and labels, a machine learns to label the next set of features by itself.
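Here is a rough sketch of what “features in, label out” looks like in code. Everything in it is an assumption made for illustration: the numbers are made up (they are not real biopsy measurements), only two of the four features are used, and the “learning” is the simplest stand-in I could pick, a 1-nearest-neighbour lookup that copies the label of the closest known example. Real algorithms are smarter than this, but the shape of the problem is the same.

```python
import math

# Made-up, illustrative records: (cell density, cell radius) -> label.
# These are NOT real measurements.
training_data = [
    ((0.20, 1.10), "benign"),
    ((0.25, 1.30), "benign"),
    ((0.30, 1.20), "benign"),
    ((0.80, 2.40), "malignant"),
    ((0.90, 2.10), "malignant"),
    ((0.85, 2.60), "malignant"),
]


def predict(features):
    """Return the label of the closest labeled example (1-nearest-neighbour)."""
    _, nearest_label = min(
        training_data,
        key=lambda record: math.dist(record[0], features),
    )
    return nearest_label


# A new, unlabeled record: the machine assigns the label by itself.
print(predict((0.22, 1.25)))
print(predict((0.88, 2.30)))
```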
Super cool - but how does a machine do it? What is happening inside the machine?
Let's start by drawing up a simple graph with 2 axes -
the x-axis corresponds to the cell density feature and the y-axis to the cell radius feature from our toy example.
I have drawn only 2 axes, corresponding to two features from our toy dataset, but what a machine does is draw up one axis for each feature. We cannot visualize anything with more than 3 dimensions, but this is what machines do. Once done, each measurement appears as one point on our multidimensional graph, marked with its label.
Just as we have done in our 2D graph, the machine draws a decision boundary which separates one label from the other. Given any new set of features, our computer simply maps it onto the graph, sees which side of the decision boundary the new point lies on, and gives back the corresponding label.
In machine learning language, one algorithm that does this is called a SUPPORT VECTOR MACHINE - fancy name, right?
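To see a decision boundary being drawn, here is a small sketch in two dimensions. Note what it is and is not: a real Support Vector Machine looks for the boundary with the widest possible gap between the two groups, while this sketch uses a simpler, older rule (the perceptron) that just keeps nudging a straight line until every training point is on the correct side. The data points and all function names are made up for illustration.

```python
# Made-up, illustrative points: (cell density, cell radius), label.
# Label +1 stands for malignant, -1 for benign.
points = [
    ((0.20, 1.10), -1),
    ((0.25, 1.30), -1),
    ((0.30, 1.20), -1),
    ((0.80, 2.40), +1),
    ((0.90, 2.10), +1),
    ((0.85, 2.60), +1),
]


def train_perceptron(data, epochs=1000, learning_rate=0.1):
    """Find a line w1*x1 + w2*x2 + bias = 0 that separates the two labels."""
    w1 = w2 = bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), label in data:
            score = w1 * x1 + w2 * x2 + bias
            if label * score <= 0:  # point is on the wrong side of the line
                w1 += learning_rate * label * x1
                w2 += learning_rate * label * x2
                bias += learning_rate * label
                mistakes += 1
        if mistakes == 0:  # every point is on its correct side: done
            break
    return w1, w2, bias


def classify(boundary, point):
    """Check which side of the boundary a point falls on."""
    w1, w2, bias = boundary
    x1, x2 = point
    return "malignant" if w1 * x1 + w2 * x2 + bias > 0 else "benign"


boundary = train_perceptron(points)
print(classify(boundary, (0.25, 1.20)))
print(classify(boundary, (0.85, 2.35)))
```

This only works because the two groups of made-up points can be separated by a straight line; for messier data you would reach for one of the ready-made algorithms instead.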
Like this Support Vector Machine, there are multiple algorithms which a computer might use to draw these decision boundaries and arrive at a label for a given set of features. They too have fancy names like Logistic Regression, Naive Bayes, Decision Trees, Random Forests ... but in essence, all these algorithms try to do one thing: fit a pattern to the data.
With this understanding, I hope the diagram below makes sense now.
Nice, now what do I need to learn so I can start writing these algorithms?
Well, to start with, you do not need to write these algorithms. Good folks with tons of brains have already written a lot of algorithms which we can use on any dataset. However, we still need to understand -
what these algorithms are,
how to use these algorithms on any data
and, more importantly, how they work.
This is important because each algorithm suits a certain type of data and needs that data prepared in a specific way, so we still need to understand how these algorithms work.
And for us to know how they work, we need some amount of statistics and some amount of mathematics.
Do not worry, there is nothing that cannot be explained in plain English. (Well, at least statistics can be.)
Do check the next article on why you need to know statistics to understand and use machine learning.