算法代寫-COMS31900

時間：2021-01-25

Exercise Sheet 1: Hashing and Bloom filters

COMS31900 Advanced Algorithms 2019/2020

Please feel free to discuss these problems on the unit discussion board. If you would like to

have your answers marked, please either hand them in in person at the lecture or email them

to me with the email subject ”Problem sheet 1” by the deadline stated.

1 Weakly-universal Hashing

A hash function family H = {h1, h2, . . . } is weakly-universal iff for randomly and uniformly

chosen h ∈ H, we have Pr(h[x] = h[y]) ≤ 1/m for any distinct x, y ∈ U . Consider the following

hash function families. For each one, prove that it is weakly universal or give a counter-example.

1. Let p be a prime number and m be an integer, p ≥ m. Consider the hash function

family where you pick at random a ∈ {1, . . . , p? 1} and then define ha : {0, . . . , p? 1} →

{0, . . . ,m? 1} as ha(x) = (ax mod p) mod m.

2. Let p be a prime and m be an integer such that p ≥ m. Consider the hash function

family where you pick at random b ∈ {0, . . . , p? 1} and then define hb : {0, . . . , p? 1} →

{0, . . . ,m? 1} as hb(x) = ((x + b) mod p) mod m.

3. Let p be a multiple of m. Consider the hash function family where you pick at random

a ∈ {1, . . . ,m ? 1} and b ∈ {0, . . . ,m ? 1}. Define ha,b : {0, . . . , p ? 1} → {0, . . . ,m ? 1}

as ha,b(x) = ((ax + b) mod p) mod m).

2 Cuckoo Hashing

1. This question is about cuckoo hashing. Consider a small variant of cuckoo hashing where

we use two tables T1 and T2 of the same size and hash function h1 and h2. When inserting

a new key x, we first try to put x at position h1(x) in T1. If this leads to a collision,

then the previously stored key y is moved to position h2(y) in T2. If this leads to another

collision, then the next key is again inserted at the appropriate position in T1, and so on.

In some cases, this procedure continues forever, i.e. the same configuration appears after

some steps of moving the keys around to dissolve collisions.

(a) Consider two tables of size 5 each and two hash functions h1(k) = k mod 5 and

h2(k) = bk5c mod 5. Insert the keys 27, 2, 32 in this order into initially empty hash

tables, and show the result.

(b) Find another key such that its insertion leads to an infinite sequence of key displace-

ments.

2. In order to use cuckoo hashing under an unbounded number of key insertions, we cannot

have a hash table of fixed size. The size of the hash table has to scale with the number

1

of keys inserted. Suppose that we never delete a key that has been inserted. Consider

the following approach with Cuckoo hashing. When the current hash table fills up to its

capacity, a new hash table of doubled size is created. All keys are then rehashed to the

new table. Argue that the average time it takes to resize and rebuild the hash table, if

spread out over all insertions, is constant in expectation. That is, the expected amortised

cost of rebuilding is constant.

3 Bloom Filters

1. Answer the following three questions about Bloom filters:

(a) What operations do we perform on Bloom filters?

(b) What is the difference between hash tables and Bloom filters in terms of which data

we can access?

(c) Why is there is a problem when deleting elements from a Bloom filter?

2. Suppose you have two Bloom filters A and B (each having the same number of cells and

the same hash functions) representing the two sets A and B. Let C = A&B be the Bloom

filter formed by computing the bitwise Boolean and of A and B.

(a) C may not always be the same as the Bloom filter that would be constructed by

adding the elements of the set (A intersect B) one at a time. Explain why not.

(b) Does C correctly represent the set (A intersect B), in the sense that it gives a positive

answer for membership queries of all elements in this set? Explain why or why not.

(c) Suppose that we want to store a set S of n = 20 elements, drawn from a universe

of U = 10000 possible keys, in a Bloom filter of exactly N = 100 cells, and that we

care only about the accuracy of the Bloom filter and not its speed. For this problem

size, what is the best choice of the number of hash functions (the parameter r in the

lecture)? (That is what value of r gives the smallest possible probability that a key

not in S is a false positive?) What is the probability of a false positive for this choice

of r?

4 Perfect Hashing

This question is about perfect hashing:

1. Our perfect hashing scheme assumed the set of keys stored in the table is static. Suppose

instead that we want to add a few new items to our table after the initial construction.

Suggest a way to modify our intitial construction so that we can insert these new items

using no new space and without making significant changes to our existing table (in

particular, we don’t want to change our initial hash function). Your scheme should still

do lookups of all items in O(1) time, but you may use a bit more initial space.

2. Suppose now that we want to delete some of our initial items. Describe a simple way to

support deletions in our perfect hashing scheme.

2

學霸聯盟:"http://aessay.org"

COMS31900 Advanced Algorithms 2019/2020

Please feel free to discuss these problems on the unit discussion board. If you would like to

have your answers marked, please either hand them in in person at the lecture or email them

to me with the email subject ”Problem sheet 1” by the deadline stated.

1 Weakly-universal Hashing

A hash function family H = {h1, h2, . . . } is weakly-universal iff for randomly and uniformly

chosen h ∈ H, we have Pr(h[x] = h[y]) ≤ 1/m for any distinct x, y ∈ U . Consider the following

hash function families. For each one, prove that it is weakly universal or give a counter-example.

1. Let p be a prime number and m be an integer, p ≥ m. Consider the hash function

family where you pick at random a ∈ {1, . . . , p? 1} and then define ha : {0, . . . , p? 1} →

{0, . . . ,m? 1} as ha(x) = (ax mod p) mod m.

2. Let p be a prime and m be an integer such that p ≥ m. Consider the hash function

family where you pick at random b ∈ {0, . . . , p? 1} and then define hb : {0, . . . , p? 1} →

{0, . . . ,m? 1} as hb(x) = ((x + b) mod p) mod m.

3. Let p be a multiple of m. Consider the hash function family where you pick at random

a ∈ {1, . . . ,m ? 1} and b ∈ {0, . . . ,m ? 1}. Define ha,b : {0, . . . , p ? 1} → {0, . . . ,m ? 1}

as ha,b(x) = ((ax + b) mod p) mod m).

2 Cuckoo Hashing

1. This question is about cuckoo hashing. Consider a small variant of cuckoo hashing where

we use two tables T1 and T2 of the same size and hash function h1 and h2. When inserting

a new key x, we first try to put x at position h1(x) in T1. If this leads to a collision,

then the previously stored key y is moved to position h2(y) in T2. If this leads to another

collision, then the next key is again inserted at the appropriate position in T1, and so on.

In some cases, this procedure continues forever, i.e. the same configuration appears after

some steps of moving the keys around to dissolve collisions.

(a) Consider two tables of size 5 each and two hash functions h1(k) = k mod 5 and

h2(k) = bk5c mod 5. Insert the keys 27, 2, 32 in this order into initially empty hash

tables, and show the result.

(b) Find another key such that its insertion leads to an infinite sequence of key displace-

ments.

2. In order to use cuckoo hashing under an unbounded number of key insertions, we cannot

have a hash table of fixed size. The size of the hash table has to scale with the number

1

of keys inserted. Suppose that we never delete a key that has been inserted. Consider

the following approach with Cuckoo hashing. When the current hash table fills up to its

capacity, a new hash table of doubled size is created. All keys are then rehashed to the

new table. Argue that the average time it takes to resize and rebuild the hash table, if

spread out over all insertions, is constant in expectation. That is, the expected amortised

cost of rebuilding is constant.

3 Bloom Filters

1. Answer the following three questions about Bloom filters:

(a) What operations do we perform on Bloom filters?

(b) What is the difference between hash tables and Bloom filters in terms of which data

we can access?

(c) Why is there is a problem when deleting elements from a Bloom filter?

2. Suppose you have two Bloom filters A and B (each having the same number of cells and

the same hash functions) representing the two sets A and B. Let C = A&B be the Bloom

filter formed by computing the bitwise Boolean and of A and B.

(a) C may not always be the same as the Bloom filter that would be constructed by

adding the elements of the set (A intersect B) one at a time. Explain why not.

(b) Does C correctly represent the set (A intersect B), in the sense that it gives a positive

answer for membership queries of all elements in this set? Explain why or why not.

(c) Suppose that we want to store a set S of n = 20 elements, drawn from a universe

of U = 10000 possible keys, in a Bloom filter of exactly N = 100 cells, and that we

care only about the accuracy of the Bloom filter and not its speed. For this problem

size, what is the best choice of the number of hash functions (the parameter r in the

lecture)? (That is what value of r gives the smallest possible probability that a key

not in S is a false positive?) What is the probability of a false positive for this choice

of r?

4 Perfect Hashing

This question is about perfect hashing:

1. Our perfect hashing scheme assumed the set of keys stored in the table is static. Suppose

instead that we want to add a few new items to our table after the initial construction.

Suggest a way to modify our intitial construction so that we can insert these new items

using no new space and without making significant changes to our existing table (in

particular, we don’t want to change our initial hash function). Your scheme should still

do lookups of all items in O(1) time, but you may use a bit more initial space.

2. Suppose now that we want to delete some of our initial items. Describe a simple way to

support deletions in our perfect hashing scheme.

2

學霸聯盟:"http://aessay.org"

- 留學生代寫
- Python代寫
- Java代寫
- c/c++代寫
- 數據庫代寫
- 算法代寫
- 機器學習代寫
- 數據挖掘代寫
- 數據分析代寫
- android/ios代寫
- web/html代寫
- 計算機網絡代寫
- 操作系統代寫
- 計算機體系結構代寫
- R代寫
- 數學代寫
- Finance 金融作業代寫
- Principles of Microeconomics 微觀經濟學代寫
- Accounting 會計代寫
- Statistics統計代寫
- 生物代寫
- 物理代寫
- 機械代寫
- Assignment代寫
- sql數據庫代寫
- analysis代寫
- Haskell代寫
- Linux代寫
- Shell代寫
- SPSS, SAS, R 數據分析代寫
- Principles of Macroeconomics 宏觀經濟學代寫
- Economics 經濟代寫
- Econometrics 計量經濟代寫
- Money and Banking 貨幣銀行學代寫
- Financial statistics 金融統計代寫
- Economic statistics 經濟統計代寫
- Probability theory 概率論代寫
- Algebra 代數代寫
- Engineering工程作業代寫
- Mechanical and Automation Engineering 機械與自動化工程代寫
- Actuarial Science 精算科學代寫
- JavaScript代寫
- Matlab代寫
- Unity代寫
- BigDate大數據代寫
- 匯編代寫
- stat代寫
- scala代寫
- OpenGL代寫
- CS代寫