-- Aug 25 In-Class Exercise
(Edited: 2021-08-30)
Let us consider the 2D space divided by the line y = 0, with the half-space being the region above it. A point is labeled ((x, y), 1) if it lies in the half-space, and ((x, y), 0) if it does not.
'''Non-Parametric Approach''':
Let us use the k-nearest neighbors (k-NN) method, a non-parametric classification method, to classify points. We take k = 3 for this example.
((resource:graph-knn.PNG|Resource Description for graph-knn.PNG))
Our k-NN classifier is trained on a dataset containing 11 points:
((3.5, 10.5), 1), ((5, 10), 1), ((7.5, 9), 1), ((8, 11), 1), ((9.4, 9.6), 1), ((10.3, 10.8), 1), ((6, -1), 0), ((7, -2), 0), ((8.5, -1.5), 0), ((8.5, -3), 0), ((10, -0.7), 0)
'''Correct classification''': The point P1 = (3, 7) will be classified as 1 because its three nearest neighbors are (3.5, 10.5), (5, 10), and (7.5, 9), all of which lie in the half-space and carry label 1.
'''Incorrect Classification (Misclassification)''': The point P2 = (8, 1) will be misclassified as 0 because its three nearest neighbors are (8.5, -1.5), (10, -0.7), and (6, -1), none of which lies in the half-space. Ideally it should have been classified as 1, since it lies above the line y = 0, but the k-NN majority vote misclassifies it.
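The two classifications above can be checked with a minimal 3-NN sketch. The function name `knn_classify` is a hypothetical helper, not from any library; it uses the 11 training points from the exercise and Euclidean distance:

```python
import math
from collections import Counter

# Training set from the exercise: ((x, y), label)
train = [
    ((3.5, 10.5), 1), ((5, 10), 1), ((7.5, 9), 1), ((8, 11), 1),
    ((9.4, 9.6), 1), ((10.3, 10.8), 1),
    ((6, -1), 0), ((7, -2), 0), ((8.5, -1.5), 0), ((8.5, -3), 0), ((10, -0.7), 0),
]

def knn_classify(query, data, k=3):
    """Return the majority label among the k nearest training points."""
    nearest = sorted(data, key=lambda p: math.dist(query, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify((3, 7), train))  # 1: all three neighbors lie above y = 0
print(knn_classify((8, 1), train))  # 0: all three neighbors lie below y = 0 (misclassified)
```

Note that P2 = (8, 1) genuinely lies above y = 0, yet the classifier returns 0, reproducing the misclassification described above.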
'''Parametric Approach''': Suppose the line that actually divides the 2D space is y = 10^-33 * x, but due to limitations in storing the bits we approximate it as y = 0. Because of this, the point (10^33, 0.5), which lies below the true line y = 10^-33 * x (since 10^-33 * 10^33 = 1 > 0.5), gets classified as 1 by the approximate boundary y = 0, when it should have been classified as 0.
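This mismatch between the true boundary and the stored approximation can be sketched as follows. To avoid floating-point rounding obscuring the point, the true boundary is evaluated with exact rationals; `true_label` and `approx_label` are hypothetical helpers for this illustration:

```python
from fractions import Fraction

# True boundary: y = 10^-33 * x, evaluated exactly with rationals.
def true_label(x, y):
    return 1 if Fraction(y) > Fraction(1, 10**33) * Fraction(x) else 0

# Approximate model actually stored: y = 0.
def approx_label(x, y):
    return 1 if y > 0 else 0

x, y = 10**33, Fraction(1, 2)  # the point (10^33, 0.5)
print(true_label(x, y))    # 0: 0.5 < 10^-33 * 10^33 = 1, so below the true line
print(approx_label(x, y))  # 1: 0.5 > 0, so above the approximate line
```

The two labels disagree on this point, showing how approximating the parameter 10^-33 by 0 produces the misclassification.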