2022-03-02

Mar 2. In-Class Exercise Thread.

Please post your solution to the Mar. 2 In-Class Exercise to this thread.
Best,
Chris

-- Mar 2. In-Class Exercise Thread
-> Cars, difficulties, and motocrosses.
o/p: car difficulti and motocross 
 
Example: marketing is stemmed to market, so the IR system will also return results about markets.
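The normalize-then-stem steps in this answer can be sketched in a few lines. This is a minimal illustration only: `normalize` and `simple_stem` are hypothetical helper names, and the suffix rules are a small subset modelled on the plural and "-ing" rules of the Porter stemmer, not the full algorithm.

```python
import re

VOWELS = set("aeiou")


def normalize(text):
    """Lowercase the text and keep only runs of letters (drops punctuation)."""
    return re.findall(r"[a-z]+", text.lower())


def simple_stem(word):
    """Strip a few common suffixes. A simplified sketch, NOT the full Porter stemmer."""
    # Plural rules modelled on Porter step 1a.
    if word.endswith("sses"):
        word = word[:-2]          # motocrosses -> motocross
    elif word.endswith("ies"):
        word = word[:-2]          # difficulties -> difficulti
    elif word.endswith("ss"):
        pass                      # leave words like "class" alone
    elif word.endswith("s"):
        word = word[:-1]          # cars -> car
    # Simplified "-ing" removal: only if the remaining stem contains a vowel.
    if word.endswith("ing") and any(ch in VOWELS for ch in word[:-3]):
        word = word[:-3]          # marketing -> market
    return word


tokens = normalize("Cars, difficulties, and motocrosses.")
print([simple_stem(t) for t in tokens])
# ['car', 'difficulti', 'and', 'motocross']
```

With these rules both marketing and markets reduce to market, which is why a query for one also matches documents about the other.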

-- Mar 2. In-Class Exercise Thread
1)
Normalized - cars difficulties and motocrosses
Stemmed - car difficulti and motocross 
 
2)
"jumping jacks" is an example where stemming would increase recall but decrease precision: the IR system would search for "jump jack", which also matches documents unrelated to the original query.
 
(Edited: 2022-03-02)

-- Mar 2. In-Class Exercise Thread
1. [Cars -> car, difficulties -> difficulti, motocrosses -> motocross]
2. By stemming a user-entered term, more documents are matched, as the alternate word forms for a user-entered term are matched as well, increasing the total recall. This comes at the expense of reducing the precision.
ex: testing -> test
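The recall/precision trade-off described here can be made concrete with a toy collection. Everything below is assumed for illustration: the four documents, the relevance judgments, and the `stem` lookup table, which stands in for a real stemmer.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection: document id -> terms.
docs = {
    1: ["software", "testing", "guide"],
    2: ["unit", "tests", "tutorial"],
    3: ["blood", "test", "results"],
    4: ["testing", "frameworks"],
}
# Assumed judgments: the user querying "testing" wants software testing.
relevant = {1, 2, 4}

# Lookup table standing in for a stemmer (assumption, not a real stemmer).
stem = {"testing": "test", "tests": "test", "test": "test"}


def search(query, use_stemming):
    """Return ids of documents containing the (optionally stemmed) query term."""
    key = stem.get(query, query) if use_stemming else query
    result = set()
    for doc_id, terms in docs.items():
        if use_stemming:
            terms = [stem.get(t, t) for t in terms]
        if key in terms:
            result.add(doc_id)
    return result


print(precision_recall(search("testing", False), relevant))  # precision 1.0, recall 2/3
print(precision_recall(search("testing", True), relevant))   # precision 0.75, recall 1.0
```

Without stemming only documents 1 and 4 match, so document 2 is missed. With stemming all relevant documents are found, but the blood test document is retrieved as well.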
(Edited: 2022-03-02)

-- Mar 2. In-Class Exercise Thread
 1) ['car', 'difficulti', 'motocross'] 
 2) More documents will be retrieved by the system when we use a stemmer, which increases recall. However, stemming also brings in incorrect results and might rank them highly, which would hurt precision. An example is the query 'colonizer', which becomes 'colon' under the Porter stemmer, so the IR system would show results irrelevant to the original query.
(Edited: 2022-03-02)

-- Mar 2. In-Class Exercise Thread
1. "Cars, difficulties, and motocrosses." becomes:
 car
 difficulti
 motocross
2. An example: calling would be stemmed to call, so documents containing calling, calls, or called would all match call (the Porter stemmer itself leaves callers as caller). This makes the result set much larger, increasing recall and decreasing precision.

-- Mar 2. In-Class Exercise Thread
Resource Description for Screenshot 2022-03-02 at 2.26.33 PM.png
2) Stemming hurts precision when the stemmed word has multiple meanings, because different queries can then map to the same irrelevant results. Example: "a dog barking" is stemmed to "a dog bark", so it matches the query term "bark" even when the user means tree bark.

-- Mar 2. In-Class Exercise Thread
1. car difficulti and motocross
2. feeding will be stemmed to feed. The query then returns more results about animal feed, whereas the unstemmed feeding would mostly return charity and child-care results.
(Edited: 2022-03-02)

-- Mar 2. In-Class Exercise Thread
Resource Description for Screen Shot 2022-03-02 at 2.31.24 PM.png

-- Mar 2. In-Class Exercise Thread
1) "Cars, difficulties, and motocrosses"
After normalization: Cars difficulties and motocrosses
After stemming using the Porter stemmer: car difficulti and motocross
2) "boating book" -> means a book about boating
   "boat booking" -> means to do a booking for a boat
   Both of the above queries stem to 'boat book', so a search for either one also retrieves documents about the other. Recall increases because more documents match, but precision drops because some of the matches are not relevant to the intended query.
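The collision between the two queries can be checked with a short sketch. The `strip_ing` rule below is a deliberately crude assumption (drop "-ing" from words longer than five letters) and is not the Porter stemmer; `stem_query` is a hypothetical helper name.

```python
def strip_ing(word):
    """Crude '-ing' removal for illustration only; not the Porter stemmer."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    return word


def stem_query(query):
    """Lowercase a query and stem each whitespace-separated term."""
    return " ".join(strip_ing(w) for w in query.lower().split())


a = stem_query("boating book")
b = stem_query("boat booking")
print(a, "|", b)   # boat book | boat book
print(a == b)      # True: the two queries are indistinguishable after stemming
```

Once both queries reduce to the same string, the index cannot tell which meaning the user intended, so the results for both are merged.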
(Edited: 2022-03-02)