Run the crawler and produce pairs (i,j), each recording a link from i to j
Map: feed each pair (i,j) to mapper[i]. The mapper emits two records, (j, (i, in)) and (i, (j, out)), which are then routed to the reducers by key
Reduce: The reducer for key j receives records of the form (j, (i, in)) and (j, (i', out)). For every unique pair (i,i') formed from one in-record and one out-record, output (j,(i,i'))
Map: each mapper receives a tuple (j,(i,i')) and outputs (i,i')
Reduce: each reducer receives (i,i') and outputs (i,i')
(Edited: 2018-03-07)
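To make the first Map step and the routing by key concrete, here is a minimal Python sketch. It simulates the shuffle in memory rather than using a real MapReduce framework; the names `map_round1` and `shuffle` and the sample edge list are illustrative assumptions, not part of the answer above.

```python
from collections import defaultdict

def map_round1(i, j):
    """Emit both tagged records for one crawled link i -> j."""
    yield j, (i, "in")    # i is an in-neighbour of j
    yield i, (j, "out")   # j is an out-neighbour of i

def shuffle(records):
    """Group values by key, as a framework would do before the reduce phase."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return dict(groups)

# Hypothetical crawl output: a -> b, b -> c, b -> d
edges = [("a", "b"), ("b", "c"), ("b", "d")]
records = [rec for i, j in edges for rec in map_round1(i, j)]
grouped = shuffle(records)
```

In this sample the reducer for key `b` sees one `in` record (from `a`) and two `out` records (to `c` and `d`), which is exactly the information it needs to form the two-hop pairs through `b`.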
Map: For every pair <i,j>, output two records: <i,j> -> <<j,<i,in>>, <i,<j,out>>>
Reduce: All records with key j are sent to the same reducer, which pairs each in-record with each out-record: <j,<i,in>>,<j,<i',out>> -> <j,<i,i'>>
The output of the reduce is then given to a second map function, which strips the key: <j,<i,i'>> -> <i,i'>
Finally, the reducer outputs all the <i,i'>
(Edited: 2018-03-07)
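The pairing done by the first reducer can be sketched as a cross product of the in-records and out-records that share a key. This is a hedged illustration only: `reduce_round1` and the sample values are assumed names and data, and the sketch does not filter out pairs where i equals i', since the answer above does not say whether to do so.

```python
from itertools import product

def reduce_round1(j, values):
    """Pair every in-neighbour of j with every out-neighbour of j."""
    sources = [node for node, tag in values if tag == "in"]
    targets = [node for node, tag in values if tag == "out"]
    for i, i_prime in product(sources, targets):
        yield j, (i, i_prime)

# Hypothetical values shuffled to key "b": a -> b, e -> b, b -> c, b -> d
values_for_b = [("a", "in"), ("c", "out"), ("d", "out"), ("e", "in")]
result = list(reduce_round1("b", values_for_b))
```

With two in-neighbours and two out-neighbours, the reducer emits four records, one per path through `b`.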
Round 1
mapper - input: U, the set containing the pairs (i,j)
for i, j in U:
    output(<j,<i,in>>)
    output(<i,<j,out>>)
reducer - input: a key j and the list of values shuffled to it
ins = [i for <i,tag> in values if tag == in]
outs = [i' for <i',tag> in values if tag == out]
for i in ins:
    for i' in outs:
        two_hops_output.append(<j,<i,i'>>)
return two_hops_output
Round 2
mapper - input: two_hops_output
for <j,<i,i'>> in two_hops_output:
    output(<i,i'>)
reducer - input: the pairs <i,i'> received from all mappers
for pair in pairs:
    final_output.append(pair)
return final_output
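Round 2 can be sketched in Python as follows. One assumption goes beyond the pseudocode above: the whole pair is used as the shuffle key, so a pair reached through several intermediate nodes j is emitted once. The answers do not say how Round 2 is keyed, so treat this as one possible reading. The function names and sample records are hypothetical.

```python
from collections import defaultdict

def map_round2(j, pair):
    """Drop the intermediate node j; the pair itself becomes the key (an assumption)."""
    yield pair, None

def reduce_round2(pair, values):
    """Identity reduce: emit the key once, however many j produced it."""
    yield pair

# Hypothetical Round 1 output: a reaches d via both b and c.
round1_output = [("b", ("a", "d")), ("c", ("a", "d")), ("b", ("a", "e"))]

# In-memory stand-in for the shuffle between map and reduce.
groups = defaultdict(list)
for j, pair in round1_output:
    for key, value in map_round2(j, pair):
        groups[key].append(value)

two_hop_pairs = sorted(
    out for key, values in groups.items() for out in reduce_round2(key, values)
)
```

If duplicates should be kept instead, the reducer could yield the pair once per element of `values`.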
Map 1: For every pair <i,j>, output: <i,j> -> <<j,<i,in>>, <i,<j,out>>>
Reduce 1: All records with key j will be shuffled to the same reducer: <j,<i,in>>,<j,<i',out>> -> <j,<i,i'>>
Map 2: The key is removed: <j,<i,i'>> -> <i,i'>
Reduce 2: It simply outputs all the <i,i'> received from all mappers.
The pairs <i,j> are the input to Map.
Step 1 Map: feed each pair <i,j> to m[i]. m[i] produces two records of the form <<j, <i, in>>, <i,<j, out>>>
Reduce: the reducer groups the tuples by key j. Input: <j, <i, in>>, <j, <i', out>>. Output: <j,<i,i'>>
Step 2 Map: strip the key j. Input: each mapper receives a tuple <j,<i,i'>>. Output: <i,i'>
Reduce: does nothing. Input: <i,i'>. Output: <i,i'>
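As a sanity check on the two steps, the sketch below chains both rounds through a small in-memory driver and compares the result with a direct join of the edge list against itself. The driver `run_round`, the function names, and the sample graph are all assumptions made for illustration; a real job would run on a framework such as Hadoop rather than in a single process.

```python
from collections import defaultdict
from itertools import product

def run_round(records, mapper, reducer):
    """Simulate one MapReduce round: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(*record):
            groups[key].append(value)
    return [out for key, values in groups.items() for out in reducer(key, values)]

def map1(i, j):
    yield j, (i, "in")
    yield i, (j, "out")

def reduce1(j, values):
    ins = [n for n, tag in values if tag == "in"]
    outs = [n for n, tag in values if tag == "out"]
    for i, i_prime in product(ins, outs):
        yield j, (i, i_prime)

def map2(j, pair):
    yield pair, None          # strip the key j

def reduce2(pair, values):
    yield pair                # pass the pair through unchanged

def two_hops(edges):
    return set(run_round(run_round(edges, map1, reduce1), map2, reduce2))

def two_hops_brute_force(edges):
    """Direct join: i -> j and j -> k gives (i, k)."""
    return {(i, k) for i, j in edges for j2, k in edges if j == j2}

# Hypothetical crawl output
edges = [(1, 2), (2, 3), (2, 4), (3, 4), (4, 1)]
assert two_hops(edges) == two_hops_brute_force(edges)
```

The brute-force join is quadratic in the number of edges, so it is only useful as a check on small inputs.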