-- Practice Midterm 2 Solutions
1.
from PIL import Image
import numpy as np
img = Image.open('bob.png')
img = img.convert('L')
img_array = np.frombuffer(img.tobytes(), dtype=np.uint8)  # fromstring is deprecated
img_array = img_array.reshape(img.size[1], img.size[0])   # size is (width, height)
2.
You split the data set into training and test data several times, in several different ways, perform your measurements for each of these trials, and combine the measurements by computing an appropriate aggregation (e.g. the mean).
For the exhaustive version, cycle over all possible p-subsets of the data set, train on the data excluding the p-subset, and test on the p-subset; this is called leave-p-out cross-validation. For the non-exhaustive version, repeat m times: randomly choose a subset of size p as the test set, train on the remaining data, and test on that subset; this is called repeated random sub-sampling validation.
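The non-exhaustive procedure above can be sketched in a few lines; the `train_and_score` callable here is a hypothetical placeholder for whatever model-fitting and measurement you use in each trial:

```python
import numpy as np

def repeated_random_subsampling(X, y, m, p, train_and_score, seed=0):
    """Repeated random sub-sampling validation (a sketch).

    m               -- number of random trials
    p               -- size of the held-out test set
    train_and_score -- hypothetical callable:
                       (X_train, y_train, X_test, y_test) -> score
    Returns the mean score over the m trials (the aggregation step).
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    scores = []
    for _ in range(m):
        idx = rng.permutation(n)              # a fresh random split each trial
        test_idx, train_idx = idx[:p], idx[p:]
        scores.append(train_and_score(X[train_idx], y[train_idx],
                                      X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```

The exhaustive (leave-p-out) variant would replace the random permutation with a loop over all p-subsets, e.g. via `itertools.combinations`, which is why it is rarely practical for large p.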
3
The Kullback-Leibler divergence is a measure of dissimilarity between two given probability distributions P and Q:
D(P || Q) = E_P[log(P(x)/Q(x))], i.e. the expectation of log(P(x)/Q(x)) under P.
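For discrete distributions the expectation becomes a sum, which can be computed directly; a minimal sketch, assuming p and q are arrays that each sum to 1 and q > 0 wherever p > 0:

```python
import numpy as np

def kl_divergence(p, q):
    """D(P || Q) = sum over x of P(x) * log(P(x) / Q(x)).

    Terms with P(x) = 0 contribute nothing (0 * log 0 is taken as 0),
    so they are masked out before taking the log.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

Note that D(P || Q) is zero iff P = Q, is always nonnegative, and is not symmetric in P and Q, so it is not a metric.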