-- Practice Midterm Solutions
Names: Madhujita Ambaskar, Rishabh Pandey. Solution for Q.10
When we have a limited amount of data, dividing it once into a training set and a test set can leave the test set small. A small test set makes the measured accuracy a noisy estimate, so the error in the test measurements is larger. With errors this large, it can be difficult to compare the performance of two algorithms.
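To make the effect of test set size concrete, here is a minimal sketch that assumes the test examples are independent, so that the measured accuracy follows a binomial distribution. The function name accuracy_standard_error and the example numbers (accuracy 0.9, test sets of 50 and 5000 examples) are illustrative assumptions, not part of the original question.

```python
import math

def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy measured on n_test independent examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)

# Same underlying accuracy, two hypothetical test set sizes.
for n_test in (50, 5000):
    print(n_test, round(accuracy_standard_error(0.9, n_test), 4))
```

Under these assumptions the standard error is roughly 0.04 with 50 test examples and roughly 0.004 with 5000, so two algorithms whose accuracies differ by a couple of percentage points cannot be told apart reliably on the small test set.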
Cross-validation addresses this by splitting the data set into training and test data several times and in several different ways, performing measurements for each of these trials, and combining the measurements with an appropriate aggregate such as an average. Cross-validation can be either exhaustive or non-exhaustive.
One example of exhaustive cross-validation is leave-p-out cross-validation: cycle over all possible subsets of size p of the data set and, for each one, train on the data excluding that subset and test on the subset.
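The leave-p-out procedure can be sketched as follows. This is only an illustration under stated assumptions: the "model" is a hypothetical majority-label predictor (majority_label), the data is a toy list of labels, and the aggregate is the mean test accuracy. None of these choices come from the original question.

```python
from itertools import combinations
from statistics import mean

def majority_label(labels):
    """Hypothetical stand-in model: predict the most common training label."""
    return max(sorted(set(labels)), key=labels.count)

def leave_p_out_accuracy(labels, p):
    """Mean test accuracy over every possible test subset of size p."""
    n = len(labels)
    scores = []
    for test_idx in combinations(range(n), p):  # all p-subsets of the indices
        held_out = set(test_idx)
        train = [labels[i] for i in range(n) if i not in held_out]
        prediction = majority_label(train)      # "train" on the rest
        correct = sum(labels[i] == prediction for i in test_idx)
        scores.append(correct / p)              # test on the p-subset
    return mean(scores), len(scores)

labels = ["a", "a", "a", "b", "b"]              # toy data, assumed for illustration
avg_acc, n_trials = leave_p_out_accuracy(labels, p=2)
print(n_trials, avg_acc)
```

The number of trials is the binomial coefficient "n choose p" (10 for this toy data), which grows very quickly with n. That growth is why exhaustive methods are usually practical only for small data sets or small p.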
One example of non-exhaustive cross-validation is repeated random sub-sampling validation, where the following is repeated m times: randomly choose a subset of size p as the test data set, train on the remaining data, and test on this subset.
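Repeated random sub-sampling can be sketched in the same style. As before, this is a minimal illustration rather than a definitive implementation: the majority-label "model", the toy labels, the seed parameter, and the use of the mean as the aggregate are all assumptions made for the example.

```python
import random
from statistics import mean

def majority_label(labels):
    """Hypothetical stand-in model: predict the most common training label."""
    return max(sorted(set(labels)), key=labels.count)

def repeated_random_subsampling_accuracy(labels, p, m, seed=0):
    """Mean test accuracy over m random test subsets of size p."""
    rng = random.Random(seed)                   # fixed seed for repeatability
    n = len(labels)
    scores = []
    for _ in range(m):
        test_idx = set(rng.sample(range(n), p)) # random test subset of size p
        train = [labels[i] for i in range(n) if i not in test_idx]
        prediction = majority_label(train)      # "train" on the remaining data
        correct = sum(labels[i] == prediction for i in test_idx)
        scores.append(correct / p)              # test on the chosen subset
    return mean(scores)

labels = ["a"] * 6 + ["b"] * 4                  # toy data, assumed for illustration
print(repeated_random_subsampling_accuracy(labels, p=3, m=20))
```

Unlike leave-p-out, the number of trials m is chosen freely, so the cost stays under control. The trade-off is that some examples may never land in a test subset while others are tested several times.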
(Edited: 2021-10-11)