1) n/10 = -1.44log2(0.01) = -1.44 * -6.6438561898 = 9.567152913
therefore n = 10 * 9.567152913 = 95.67152913 which is about 96 bits
and log2(0.01) is close to 7, so m = 7 hash functions.
2) Say our hash functions are as follows:
h1(k) = (3 * k + 7) mod 96 h2(k) = (113 * k + 2) mod 96 h3(k) = (227 * k + 11) mod 96 h4(k) = (7 * k + 3) mod 96 h5(k) = (53 * k + 59) mod 96 h6(k) = (89 * k + 33 mod 96 h7(k) = (97 * k + 221) mod 96
Let D = 00000.....00000 (96 zeroes)
Now inserting: k = 1000 h1(k) = 31 h2(k) = 10 h3(k) = 67 <----- collision!!! h4(k) = 91 h5(k) = 67 <----- collision!!! h6(k) = 41 h7(k) = 69
Collision means there will only be 6 1's in D
000000000100000000000000000000100000000010000000000000000000000000101000000000000000000000100000
Now inserting: k = 55 h1(k) = 76 h2(k) = 73 h3(k) = 16 h4(k) = 4 h5(k) = 94 h6(k) = 32 h7(k) = 84
000100000100000100000000000000110000000010000000000000000000000000101000100100000001000000100100
Now inserting: k = 99 h1(k) = 16 <--- already active! h2(k) = 53 h3(k) = 20 h4(k) = 24 h5(k) = 26 h6(k) = 12 h7(k) = 32 <--- already active!
000100000101000100010001010000110000000010000000000010000000000000101000100100000001000000100100
Looking up k = 55 returns the same values, so that renders a true positive. Looking up k = 75, however: h1(k) = 40 h2(k) = 29 h3(k) = 44 h4(k) = 48 h5(k) = 2 h6(k) = 84 <--- OUCH! </3 h7(k) = 8
84 was already active. This means that looking up k = 75 renders a false positive since k = 55 rendered that key active. Clearly, our hash functions need to be more collision resistant.(Edited: 2020-10-21)
n = 10 * (-1.44log(0.01)) = 10*(1.44*6.64385) = 95.6714 ~ 96 bits m = 6.64385 ~ 7
Hash Functions - 1. a = 5, b = 8 => h(k) = (5k + 8)mod96 2. a = 6, b = 9 => h(k) = (6k + 9)mod96 3. a = 8, b = 7 => h(k) = (8k + 7)mod96 4. a = 2, b = 1 => h(k) = (2k + 1)mod96 5. a = 3, b = 5 => h(k) = (3k + 5)mod96 6. a = 10, b = 5 => h(k) = (10k + 5)mod96 7. a = 9, b = 23 => h(k) = (9k + 23)mod96
Inserting 1000 hi's = 16, 57, 39, 81, 29, 21, 95 Choosing D (for simplicity) = 00000000000....00000000 (total 96 bits) Final look of bits - 000000000000000100001000000010000000001000000000000000001000000000000000000000001000000000000010(Edited: 2020-10-21)
1. m = log2(0.01) = 6.64 = 7 n = 10 * (-1.44log(0.01)) 95.6 = 96 2. h9,4(k) = 9 * k + 4 mod 96 h1,7(k) = 1 * k + 7 mod 96 h5,15(k) = 5 * k + 15 mod 96 h15,8(k) = 15 * k + 8 mod 96 h1,4(k) = 1 * k + 4 mod 96 h6,3(k) = 6 * k + 3 mod 96 h4,1(k) = 4 * k + 1 mod 96 3. k = 1000 h9,4(k) = 9 * k + 4 mod 96 = 76 bit index h1,7(k) = 1 * k + 7 mod 96 = 47 bit index h5,15(k) = 5 * k + 15 mod 96 = 23 bit index h15,8(k) = 15 * k + 8 mod 96 = 32 bit index h1,4(k) = 1 * k + 4 mod 96 = 44 bit index h6,3(k) = 6 * k + 3 mod 96 = 51 bit index h4,1(k) = 4 * k + 1 mod 96 = 65 bit index k = 55 h9,4(k) = 9 * k + 4 mod 96 = 19 bit index h1,7(k) = 1 * k + 7 mod 96 = 62 bit index h5,15(k) = 5 * k + 15 mod 96 = 2 bit index h15,8(k) = 15 * k + 8 mod 96 = 65 bit index h1,4(k) = 1 * k + 4 mod 96 = 59 bit index h6,3(k) = 6 * k + 3 mod 96 = 45 bit index h4,1(k) = 4 * k + 1 mod 96 = 29 bit index
k = 99 h9,4(k) = 9 * k + 4 mod 96 = 31 bit index h1,7(k) = 1 * k + 7 mod 96 = 10 bit index h5,15(k) = 5 * k + 15 mod 96 = 30 bit index h15,8(k) = 15 * k + 8 mod 96 = 53 bit index h1,4(k) = 1 * k + 4 mod 96 = 7 bit index h6,3(k) = 6 * k + 3 mod 96 = 21 bit index h4,1(k) = 4 * k + 1 mod 96 = 13 bit index
4. Lookup k = 75 h9,4(k) = 9 * k + 4 mod 96 = 7 bit index <- yes h1,7(k) = 1 * k + 7 mod 96 = 82 bit index <- no h5,15(k) = 5 * k + 15 mod 96 = 6 bit index <- no h15,8(k) = 15 * k + 8 mod 96 = 77 bit index <- no h1,4(k) = 1 * k + 4 mod 96 = 79 bit index <- no h6,3(k) = 6 * k + 3 mod 96 = 69 bit index <- no h4,1(k) = 4 * k + 1 mod 96 = 13 bit index <- no No false positives(Edited: 2020-10-21)
hash keys m = -(log_2(0.01)) = 6.64 ~ 7 hash keys n bit D vector = 10 * (-1.44log(0.01)) = 95.61 ~ 96 bit insert k = 1000 h5,7(k) = (5007)mod 96 = 15th bit h2,3(k) = (2003)mod 96 = 83rd bit h17,19(k) = (17019)mod 96 = 27th bit h31,37(k) = (31037)mod 96 = 29th bit h11,13(k) = (11013)mod 96 = 69th bit h23,29(k) = (23029)mod 96 = 85th bit h41,43(k) = (41043)mod 96 = 51st bit insert k = 55 h5,7(k) = 90th bit h2,3(k) = 17th bit h17,19(k) = 90th bit h31,37(k) = 14th bit h11,13(k) = 42nd bit h23,29(k) = 46th bit h41,43(k) = 90th bit insert k = 99 h5,7(k) = 22nd bit h2,3(k) = 9th bit h17,19(k) = 70th bit h31,37(k) = 34th bit h11,13(k) = 1102nd bit h23,29(k) = 2nd bit h41,43(k) = 70th bit lookup 55 and 75: 55 is present in the filter if k=75 h5,7(k) = 94th bit h2,3(k) = 57th bit h17,19(k) = 46th bit h31,37(k) = 58th bit h11,13(k) = 70th bit h23,29(k) = 26th bit h41,43(k) = 46th bit
Since 46 and 70 are the only bits already on none of other bit indexes of 75 are on so it does not show a false positive.(Edited: 2020-10-21)
n = "For a 1% error rate, we need about 9.6 bits/key" * 10keys = 96 bits m = log2(0.01) = 7 if(k===1000){ h1=(1*k)+11 mod 96=51 h2=(2*k)+12 mod 96=92 h3=(3*k)+13 mod 96=37 h4=(4*k)+14 mod 96=78 h5=(5*k)+15 mod 96=23 h6=(6*k)+16 mod 96=64 h7=(7*k)+17 mod 96=9 }
if(k===55){ h1=(1*k)+11 mod 96=96 h2=(2*k)+12 mod 96=26 h3=(3*k)+13 mod 96=82 h4=(4*k)+14 mod 96=42 h5=(5*k)+15 mod 96=2 h6=(6*k)+16 mod 96=58 h7=(7*k)+17 mod 96=18 }
if(k===99){ h1=(1*k)+11 mod 96=14 h2=(2*k)+12 mod 96=18 h3=(3*k)+13 mod 96=22 h4=(4*k)+14 mod 96=26 h5=(5*k)+15 mod 96=30 h6=(6*k)+16 mod 96=34 h7=(7*k)+17 mod 96=38 } if(k===75){ h1=(1*k)+11 mod 96=86 h2=(2*k)+12 mod 96=66 h3=(3*k)+13 mod 96=46 h4=(4*k)+14 mod 96=26 h5=(5*k)+15 mod 96=6 h6=(6*k)+16 mod 96=82 h7=(7*k)+17 mod 96=62 }
check 55: 02 18 26 42 58 82 96 check 75: 14 18 22 26 30 34 38only 18th bit have a collision, not all of them, so no positive false (Edited: 2020-10-21)
For 1% error rate, n = 9.6 * 10 = 96 bits m = log2(0.01) = 6.64 = 7 hash functions
Defining the hash functions as : h1(3,7) = 3k + 7 mod 96 h2(6,13) = 6k + 13 mod 96 h3(15,55) = 15k + 55 mod 96 h4(84,62) = 84k + 62 mod 96 h5(32,81) = 32k + 81 mod 96 h6(45,23) = 45k + 23 mod 96 h7(74,35) = 74k + 35 mod 96 Inserting k=1000 Setting bits : 31 61 79 62 17 95 19 Vector D [] = 00000000 00000000 01010000 00000001 00000000 00000000 00000000 00000110 00000000 00000001 00000000 00000001
Inserting k=55 Setting bits : 76 55 16 74 17 2 73 Vector D[] = 00100000 00000000 11010000 00000001 00000000 00000000 00000001 00000110 00000000 01101001 00000000 00000001 Inserting k=99 Setting bits : 16 31 4 26 81 62 65 Vector D[] = 00101000 00000000 11010000 00100001 00000000 00000000 00000001 00000110 01000000 01101001 01000000 00000001
Lookup 55 Search bit 76 = 1 Search bit 55 = 1 Search bit 16 = 1 Search bit 74 = 1 Search bit 17 = 1 Search bit 2 = 1 Search bit 73 = 1 Present
Lookup 75 Search bit 40 = 0 Not present
There are no false positives
m = -(log_2(0.01)) = ~ 7 hash keys n = 10 * 9.6 = ~ 96 bit Insert k = 1000 h1,2(k) = (1002)mod 96 = 42th bit h3,4(k) = (3004)mod 96 = 28rd bit h5,6(k) = (5006)mod 96 = 14th bit h7,8(k) = (7008)mod 96 = 0th bit h9,10(k) = (9010)mod 96 = 82th bit h11,12(k) = (11012)mod 96 = 68th bit h13,14(k) = (13014)mod 96 = 54st bit insert k = 55 h1,2(k) = 57th bit h3,4(k) = 73rd bit h5,6(k) = 89th bit h7,8(k) = 9th bit h9,10(k) = 25th bit h11,12(k) = 41st bit h13,14(k) = 57th bit insert k = 99 h1,2(k) = 5th bit h3,4(k) = 13th bit h5,6(k) = 21st bit h7,8(k) = 29th bit h9,10(k) = 37th bit h11,12(k) = 45th bit h13,14(k) = 53rd bit lookup 55 and 75: 55 is present k=75: h1,2(k) = 77th bit h3,4(k) = 37th bit h5,6(k) = 93rd bit h7,8(k) = 53rd bit h9,10(k) = 13th bit h11,12(k) = 69th bit h13,14(k) = 29th bit
only 37, 53, 13, and 29 are already on. Do not get a false positive