Cross-Validation Methods

Hi Team,

Continuing from the previous blog, I have explained two cross-validation methods in this one; I will cover the remaining methods in an upcoming blog.

Types of cross-validation: There are several cross-validation methods in statistics. However, as mentioned above, in this blog I'll be explaining two of them.

  1. Leave-one-out cross-validation (LOOCV)

In this CV method, the test set consists of a single sample (observation) and the training set contains the remaining n − 1 samples. We repeat this process n times, holding out a different data point each time, so that every observation is used exactly once as the test set.

Disadvantage: When dealing with large and complex datasets, it requires n iterations to fit and validate the model, which is computationally expensive. Also, the resulting estimate can have high variance, since each iteration's test result depends on a single data point.
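Here is a minimal sketch of LOOCV using scikit-learn; the iris dataset and the logistic-regression classifier are just placeholder choices for illustration, not part of the method itself:

```python
# A minimal LOOCV sketch; load_iris and LogisticRegression are placeholders.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# LeaveOneOut yields n splits: each split tests on 1 sample, trains on n - 1.
loo = LeaveOneOut()
scores = cross_val_score(model, X, y, cv=loo)  # one accuracy per iteration

print(f"Iterations: {len(scores)}")          # equals n (150 for iris)
print(f"Mean accuracy: {scores.mean():.3f}")
```

Since each test set is a single observation, each individual score is either 0 or 1, and the mean over all n iterations gives the LOOCV accuracy estimate.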

 

  2. K-Fold cross-validation

This is the most commonly used CV method for assessing model performance and detecting overfitting. In this technique, the whole dataset is partitioned into k parts (folds) of equal size. It's known as k-fold since there are k parts, where k can be any number.

 

For instance, suppose there are 500 records and we choose K = 5, i.e., we perform 5 experiments on these 500 records.

Then each fold contains Records / K = 500 / 5 = 100 records.

 

In Experiment 1, the first 100 of the 500 records are held out as the test set, the model is trained on the remaining 400 records, and evaluating on the held-out fold gives ACCURACY1.

In Experiment 2, a different set of 100 records is held out as the test set (with the other 400 used for training) to find ACCURACY2.

The same approach continues for Experiments 3, 4, and 5 to find ACCURACY3, ACCURACY4, and ACCURACY5, so that every record appears in a test set exactly once.

 

Based on the 5 obtained accuracy values, we can report the MEAN, MIN, and MAX accuracy across the experiments; the mean is typically used as the overall cross-validated performance estimate.
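Here is a minimal sketch of this 5-fold workflow in scikit-learn; the synthetic 500-record dataset (make_classification) and the logistic-regression classifier are placeholder choices for illustration:

```python
# A minimal 5-fold CV sketch mirroring the walkthrough above:
# 500 records, K = 5, so each fold holds 500 / 5 = 100 records.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=42)  # placeholder data
model = LogisticRegression(max_iter=1000)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kf)  # ACCURACY1 ... ACCURACY5

for i, acc in enumerate(scores, start=1):
    print(f"Experiment {i}: accuracy = {acc:.3f}")
print(f"MEAN = {scores.mean():.3f}, "
      f"MIN = {scores.min():.3f}, MAX = {scores.max():.3f}")
```

Note that shuffle=True randomizes which records land in each fold; with shuffle=False the folds would be the first 100 records, the next 100, and so on, exactly as in the walkthrough above.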

 

Thank You!!
