-- Dec 2 In-Class Exercise Thread
We will need to create a terms list that resembles an inverted index.
Each term has a list mapped to it that contains all the skipgrams in which it is included.
Each skipgram includes the two words before and the two words after this term.
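As a rough sketch of that structure (not a definitive implementation), the index could be built in one pass over the tokenized text. The class name SkipgramIndex, the method build, and the choice to store each skipgram as a List<String> are all assumptions made for illustration; so is the decision to let skipgrams near the start or end of the text hold fewer than four words.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class SkipgramIndex {

    // Maps each term to the list of skipgrams it appears in the middle of.
    // A skipgram holds up to two tokens before and two tokens after the term.
    // Assumption: near the boundaries of the text a skipgram is simply shorter.
    static Map<String, List<List<String>>> build(List<String> tokens) {
        Map<String, List<List<String>>> index = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            List<String> context = new ArrayList<>();
            int start = Math.max(0, i - 2);
            int end = Math.min(tokens.size() - 1, i + 2);
            for (int j = start; j <= end; j++) {
                if (j != i) {
                    context.add(tokens.get(j)); // skip the center term itself
                }
            }
            // Append this occurrence's skipgram to the term's list.
            index.computeIfAbsent(tokens.get(i), k -> new ArrayList<>()).add(context);
        }
        return index;
    }
}
```

A term that occurs several times in the text ends up with one skipgram per occurrence, which is what makes the list sizes usable as counts later on.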
To find how many skipgrams two terms share, each skipgram can be a String or ArrayList object
(assuming we are doing this in Java) that contains the four context words. An ArrayList is the
safer choice, since String's "contains" matches substrings (so "cat" would also match "category").
Given two terms t1 and t2, we can use the "contains" method to check each of t1's skipgrams for t2,
increasing the tally with each match. The final tally is the number of skipgrams they share.
The other parameter (total number of skipgrams in t1 + total number of skipgrams in t2) can be
found easily by looking up the total sizes of the skipgram lists mapped to each of the two terms
and adding them together.
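Both quantities can be sketched in a few lines of Java, assuming the term-to-skipgrams map already exists and each skipgram is stored as a List<String>. The class name SkipgramOverlap and the methods sharedCount and totalCount are made up for this example, and a term missing from the map is treated as having no skipgrams, which is also an assumption.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class SkipgramOverlap {

    // Counts how many of t1's skipgrams contain t2.
    // List.contains compares whole words, so "cat" does not match "category".
    static int sharedCount(Map<String, List<List<String>>> index, String t1, String t2) {
        int tally = 0;
        List<List<String>> skipgrams =
                index.getOrDefault(t1, Collections.<List<String>>emptyList());
        for (List<String> skipgram : skipgrams) {
            if (skipgram.contains(t2)) {
                tally++;
            }
        }
        return tally;
    }

    // Total number of skipgrams for t1 plus total number for t2.
    // Assumption: a term that is not in the index contributes zero.
    static int totalCount(Map<String, List<List<String>>> index, String t1, String t2) {
        int n1 = index.getOrDefault(t1, Collections.<List<String>>emptyList()).size();
        int n2 = index.getOrDefault(t2, Collections.<List<String>>emptyList()).size();
        return n1 + n2;
    }
}
```

Note that sharedCount only scans t1's list, as described above, so it costs one pass over t1's skipgrams, while totalCount is just two map lookups.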