-- Apr 21 In-Class Exercise Thread
With each machine getting 1/6 of 10^9, the most any machine will have to process is 166,666,667 documents, which comes to 1666 writes with some documents left in memory. The generations follow a log2 structure, with each doubling of the number of writes corresponding to a new generation, so log2(1666) ≈ 10.70, which rounds up to 11 generations on disk plus 1 in memory, for 12 generations total. The max number of merges caused by a single write will be 10, which happens when we have generations 1 2 3 4 5 6 7 8 9 10 on disk and add another generation 1, making
1 1 2 3 4 5 6 7 8 9 10 -> 2 2 3 4 5 6 7 8 9 10
2 2 3 4 5 6 7 8 9 10 -> 3 3 4 5 6 7 8 9 10
3 3 4 5 6 7 8 9 10 -> 4 4 5 6 7 8 9 10
4 4 5 6 7 8 9 10 -> 5 5 6 7 8 9 10
5 5 6 7 8 9 10 -> 6 6 7 8 9 10
6 6 7 8 9 10 -> 7 7 8 9 10
7 7 8 9 10 -> 8 8 9 10
8 8 9 10 -> 9 9 10
9 9 10 -> 10 10
10 10 -> 11
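The counts above can be sanity-checked with a small simulation. This is only a sketch: the 100,000 documents per write is an assumption inferred from 166,666,667 documents giving 1666 writes, and the names `simulate` and `DOCS_PER_WRITE` are made up for illustration.

```python
def simulate(num_writes):
    """Simulate logarithmic merging.

    Returns (highest generation created, most merges caused by one write).
    """
    on_disk = set()  # generations currently on disk, at most one index each
    max_gen = 0
    max_merges = 0
    for _ in range(num_writes):
        gen = 1      # every write starts out as a generation-1 index
        merges = 0
        # Two indexes of the same generation merge into the next generation.
        while gen in on_disk:
            on_disk.remove(gen)
            gen += 1
            merges += 1
        on_disk.add(gen)
        max_gen = max(max_gen, gen)
        max_merges = max(max_merges, merges)
    return max_gen, max_merges


DOCS_PER_MACHINE = (10**9 + 5) // 6   # 1/6 of 10^9, rounded up: 166,666,667
DOCS_PER_WRITE = 100_000              # assumption, not stated in the post
NUM_WRITES = DOCS_PER_MACHINE // DOCS_PER_WRITE

max_gen, max_merges = simulate(NUM_WRITES)
print(NUM_WRITES, max_gen, max_merges)  # prints: 1666 11 10
```

The 10-merge cascade happens on write 1024, when generations 1 through 10 are all on disk; it would take 2048 writes to get a cascade of 11.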
(Edited: 2021-04-21)