Supervised Machine Learning is the process of determining the relationship between a given set of features (or variables) and a target value, which is also known as a label or a class. In practice, this means building ML models that take in certain input data and output a predicted value.
Let’s understand this with an example: “Should a bank give a House Loan to an applicant or not?”, decided on the basis of the information the applicant submitted to the bank. Let’s assume that in the House Loan Application, the applicant submitted Age, Sex, Education Level, Income Level, Marital Status, Demographics and Previous Loans Paid.
|Age||Sex||Education Level||Income Level||Marital Status||Demographics||Previous Loans Paid|
|30||Female||Undergraduate||98,000 USD||Yes||New York||Yes|
|29||Female||High School||67,000 USD||No||LA||Yes|
|64||Male||Undergraduate||180,000 USD||No||Bay Area||Yes|
Based on this data, a Supervised Machine Learning model can be trained to give a yes or no answer to the question “Give the loan to the applicant?” for new applicants.
Supervised Machine Learning problems can be further divided into Classification Tasks and Regression Tasks.
Classification Tasks in Supervised Machine Learning
Classification Tasks are used to build models out of data with discrete categories as labels. For example, a classification task can be used to predict whether a person will pay back a loan or not. You can have more than two discrete categories, such as when predicting the ranking of a horse in a race, but there must be a finite number of them.
In the above image, the Machine Learning model is classifying each observation in the dataset as yellow, blue, blackish or pink.
Most classification models output the prediction as the probability that an instance belongs to each output label. The assigned label is the one with the highest probability.
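The step from probabilities to an assigned label can be sketched in a few lines of Python. This is only an illustration: the label names and probability values are hypothetical, and `assign_label` is a helper made up for this example rather than part of any library.

```python
# Hypothetical class probabilities that some classifier produced for one applicant.
probabilities = {"pays back loan": 0.82, "does not pay back loan": 0.18}

def assign_label(class_probabilities):
    """Return the label whose predicted probability is the highest."""
    return max(class_probabilities, key=class_probabilities.get)

predicted_label = assign_label(probabilities)
print(predicted_label)  # pays back loan
```

The same helper works unchanged when there are more than two categories, since it only looks for the largest probability.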
Supervised Classification Machine Learning Algorithms
- Decision Trees – This algorithm follows a tree-like architecture that simulates a decision process as a series of decisions, considering one variable at a time.
- Naive Bayes Classifier – This algorithm relies on a group of probabilistic equations based on Bayes’ theorem and assumes independence among features. It is able to consider several attributes at once.
- Artificial Neural Networks (ANNs) – These replicate the structure and performance of a biological neural network to perform pattern recognition tasks. An ANN consists of interconnected neurons, laid out with a set architecture. They pass information to one another until a result is achieved.
Regression Tasks in Supervised Machine Learning
Regression Tasks are used for data with continuous quantities as labels. For example, a regression task can be used for predicting house prices. This means that the target value is represented by a quantity rather than chosen from a set of possible outputs. Output labels can be of integer or float type.
Some commonly used Supervised Machine Learning algorithms for Regression Tasks are Linear Regression, Regression Trees, Support Vector Regression and Artificial Neural Networks (ANNs).
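As a rough illustration of a regression task, the sketch below fits the simplest of these algorithms, Linear Regression with a single feature, using ordinary least squares. The training data (house area and price) is hypothetical and chosen only to keep the arithmetic easy to follow, and `fit_line` is a helper written for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: house area in square metres -> price in USD.
areas = [50, 80, 100, 120]
prices = [150_000, 240_000, 300_000, 360_000]

slope, intercept = fit_line(areas, prices)

# Predict a continuous value (a price) for a house that was not in the data.
predicted_price = slope * 90 + intercept
print(round(predicted_price))  # 270000
```

Note that the output is a quantity on a continuous scale, not one of a fixed set of categories.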
Evaluating Performance of a Supervised Machine Learning Model
Model evaluation is essential for developing successful models that perform well not just on the data used to train them, but also on data they have not seen yet. For supervised machine learning problems, assessing the model is particularly simple, because the ground truth is already known and can be compared with the model's predictions.
When a model is applied to unseen data that has no label to compare its predictions against, knowing the accuracy of the model beforehand is essential. For example, a model with an accuracy of 95% may lead you to believe that the chances of making an accurate prediction are high and that, as a result, the model should be considered trustworthy. That assumption can be wrong, however, because it depends on what the metric “accuracy of model” actually implies for the problem at hand. Moreover, the performance metric for a Supervised Machine Learning Model should be selected on a case-by-case basis, because a metric that suits one model can imply something quite different for another. So be careful when selecting the metric used to measure the performance of a model.
A model's performance should be evaluated on two types of datasets: a Validation DataSet, used to fine-tune the model, and a Testing DataSet, used to evaluate how well the model will function when applied to data it has not seen.
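One possible way to carve a dataset into training, validation and testing parts is sketched below. The 60/20/20 proportions, the fixed seed and the `split_dataset` helper are assumptions made for this example. They are not prescribed by the text, and other proportions are common.

```python
import random

def split_dataset(rows, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into training, validation and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test_set = shuffled[:n_test]
    validation_set = shuffled[n_test:n_test + n_validation]
    training_set = shuffled[n_test + n_validation:]
    return training_set, validation_set, test_set

# Stand-in for 100 labelled observations.
rows = list(range(100))
training_set, validation_set, test_set = split_dataset(rows)
print(len(training_set), len(validation_set), len(test_set))  # 60 20 20
```

The testing set is kept aside until the very end, so that it really does behave like data the model has never seen.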
Metrics used for Measuring Performance of a Supervised Classification Machine Learning Model
A Confusion Matrix is a table that contains information about the performance of a model. In a Confusion Matrix, columns represent the instances that belong to a predicted class, while rows represent the instances that actually belong to a class.
Let’s understand what exactly a Confusion Matrix is by taking the example of a model that identifies which images in a given dataset of 500 images are of dogs.
|Actual / Predicted||Dog Image (predicted)||Not Dog Image (predicted)|
|Not Dog Image (actual)||150||350|
Each cell in a Confusion Matrix can be classified as True Positives (TP), False Positives (FP), True Negatives (TN), False Negatives (FN).
|Cell of Confusion Matrix||Description||Example|
|True Positives (TP)||Instances that the model correctly classified as positive||Images of dogs classified as dog images by the model|
|False Positives (FP)||Instances that the model incorrectly classified as positive||Images of other animals classified as dog images by the model|
|True Negatives (TN)||Instances that the model correctly classified as negative||Images of other animals classified as not dog images by the model|
|False Negatives (FN)||Instances that the model incorrectly classified as negative||Images of dogs classified as not dog images by the model|
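The four cells can be counted directly from a list of actual labels and a list of predicted labels. The sketch below assumes that "dog" is the positive class. The six labels are hypothetical and are not taken from the 500-image example, and `confusion_counts` is a helper written for this illustration.

```python
def confusion_counts(actual, predicted, positive="dog"):
    """Count TP, FP, TN and FN, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for actual_label, predicted_label in zip(actual, predicted):
        if predicted_label == positive and actual_label == positive:
            tp += 1  # predicted dog, really a dog
        elif predicted_label == positive:
            fp += 1  # predicted dog, really another animal
        elif actual_label == positive:
            fn += 1  # predicted not dog, really a dog
        else:
            tn += 1  # predicted not dog, really another animal
    return tp, fp, tn, fn

# Hypothetical ground truth and model predictions for six images.
actual    = ["dog", "dog", "cat", "cat", "dog", "cat"]
predicted = ["dog", "cat", "cat", "dog", "dog", "cat"]

print(confusion_counts(actual, predicted))  # (2, 1, 2, 1) as (TP, FP, TN, FN)
```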
The Accuracy of a model measures its capability to correctly classify all instances. It can be calculated by summing the number of True Positives (TP) and True Negatives (TN) and then dividing by the total number of instances.
Accuracy = (TP + TN)/Total Number of Instances
Precision measures the model’s ability to correctly classify positive labels, by comparing the number of correctly predicted positives with the total number of instances predicted as positive.
Precision can be calculated as the ratio of True Positives (TP) to the sum of True Positives (TP) and False Positives (FP).
Precision = TP/(TP + FP)
Recall measures the number of correctly predicted positive labels against all instances that are actually positive.
Recall can be calculated as the ratio of True Positives (TP) to the sum of True Positives (TP) and False Negatives (FN).
Recall = TP/(TP + FN)
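The three formulas translate almost word for word into code. In the sketch below the counts (TP = 40, FP = 10, TN = 45, FN = 5) are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed for this example. The text does not specify what should happen in that case.

```python
def accuracy(tp, fp, tn, fn):
    """(TP + TN) divided by the total number of instances."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """TP divided by everything predicted as positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """TP divided by everything that is actually positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical confusion matrix cells for 100 instances.
tp, fp, tn, fn = 40, 10, 45, 5

print(accuracy(tp, fp, tn, fn))   # 0.85
print(precision(tp, fp))          # 0.8
print(round(recall(tp, fn), 3))   # 0.889
```

Here the same model scores differently on each metric, which is why the metric has to be chosen with the problem in mind.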
Metrics used for Measuring Performance of a Supervised Regression Machine Learning Model
Since regression tasks are those where the final output is continuous rather than categorical, the performance of a model can be measured by comparing the predicted value with the actual value.
For example, the performance of a Supervised Regression Machine Learning Model that predicts the price of a house in a locality can be measured by comparing the actual price of the house with the price predicted by the model.
Let’s say the actual price of a house in a locality is 700,000 USD and our model predicts a price of 699,999 USD, which is very close. We can then say that the model is accurate enough, as the difference between the predicted and actual values is quite low.
To measure this difference between predicted and actual values, Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE) can be used.
Mean Absolute Error (MAE)
It measures the average absolute difference between predicted values and actual values, without taking into account the direction of the error.
MAE = (1/m) × Σ |yi – ŷi|, summed over all m instances, where:
- m = Number of total instances
- yi = Actual Value
- ŷi = Predicted Value
|Actual House Price (yi)||Predicted House Price (ŷi)||Actual – Predicted (yi – ŷi)|
|500,000 USD||499,012 USD||988 USD|
|650,000 USD||590,918 USD||59,082 USD|
|839,193 USD||832,039 USD||7,154 USD|
|127,092 USD||120,043 USD||7,049 USD|
|983,028 USD||980,832 USD||2,196 USD|
Sum of absolute errors = 76,469 USD
MAE = 76,469/5 = 15,293.8 USD
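The calculation above can be reproduced with a few lines of Python. The prices are the five rows of the table, and `mean_absolute_error` is a small helper written for this example rather than a library function.

```python
# Actual and predicted house prices in USD, taken from the table above.
actual = [500_000, 650_000, 839_193, 127_092, 983_028]
predicted = [499_012, 590_918, 832_039, 120_043, 980_832]

def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all instances."""
    absolute_errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)

mae = mean_absolute_error(actual, predicted)
print(mae)  # 15293.8
```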
Root Mean Squared Error (RMSE)
It measures the average magnitude of the error between actual and predicted values. It is calculated by taking the square root of the average of the squared differences between actual and predicted values.
RMSE = Square Root ((1/m) × Σ (yi – ŷi)²), summed over all m instances, where:
- m = Number of total instances
- yi = Actual Value
- ŷi = Predicted Value
|Actual House Price (yi)||Predicted House Price (ŷi)||Actual – Predicted (yi – ŷi)||Squared Error (yi – ŷi)²|
|500,000 USD||499,012 USD||988 USD||976,144 USD²|
|650,000 USD||590,918 USD||59,082 USD||3,490,682,724 USD²|
|839,193 USD||832,039 USD||7,154 USD||51,179,716 USD²|
|127,092 USD||120,043 USD||7,049 USD||49,688,401 USD²|
|983,028 USD||980,832 USD||2,196 USD||4,822,416 USD²|
RMSE = Square Root (3,597,349,401/5) = Square Root (719,469,880.2) ≈ 26,822.94 USD
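The same five rows can be used to check the RMSE figure in code. As before, `root_mean_squared_error` is a helper written for this example, not a library function.

```python
import math

# Actual and predicted house prices in USD, taken from the table above.
actual = [500_000, 650_000, 839_193, 127_092, 983_028]
predicted = [499_012, 590_918, 832_039, 120_043, 980_832]

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

rmse = root_mean_squared_error(actual, predicted)
print(round(rmse, 2))  # 26822.94
```

The single large error of 59,082 USD dominates the sum of squares, which is why the RMSE is so much larger than the MAE for the same predictions.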
Which metric, MAE or RMSE, should be used to measure the performance of a Supervised Regression Machine Learning Model?
Both Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) express the average error in a range from 0 to infinity, where the lower the value, the better the performance of the model. The main difference between the two metrics is that MAE assigns the same weight to all errors, while RMSE squares the errors and so assigns a higher weight to larger errors.
The RMSE metric is therefore especially useful in cases where larger errors should be penalized more heavily, meaning that outliers have a strong influence on the measured performance. For example, RMSE can be used when a value that is off by 4 is more than twice as bad as a value that is off by 2. MAE, on the other hand, is used when a value that is off by 4 is just twice as bad as a value that is off by 2.
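A small experiment makes the difference concrete. The two sets of errors below are hypothetical: both add up to the same total error, but in the second set the whole error is concentrated in a single outlier.

```python
import math

def mae_from_errors(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse_from_errors(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

evenly_spread = [2, 2, 2, 2]  # four predictions, each off by 2
one_outlier = [0, 0, 0, 8]    # three perfect predictions, one off by 8

print(mae_from_errors(evenly_spread), rmse_from_errors(evenly_spread))  # 2.0 2.0
print(mae_from_errors(one_outlier), rmse_from_errors(one_outlier))      # 2.0 4.0
```

MAE rates both models the same, while RMSE doubles for the model with the outlier. Which behaviour is preferable depends on how costly large individual errors are in the application.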