CS 251, Spring 2023: Project 3, Part 2

Hashing: Applications and Experimental Analysis

Released: Wednesday, March 1

Due:    Part 1, Friday, March 10, 10:59 pm on Vocareum and Gradescope

Part 2, Wednesday, March 29, 10:59 pm on Vocareum and Gradescope


Introduction

Part 1 of the project explores the performance of different hash functions combined with two types of open addressing for collision management. You will explore the impact of the hash functions and the load factor on the number of collisions.

Part 2 introduces you to a new data structure, Bloom Filters. You are asked to implement basic operations of a Bloom filter and to experimentally analyze the impact of the size of the Bloom filter and the number of hash functions used on the false positive rate of lookup operations. Both parts of the project ask you to prepare a report on your experimental results.

Learning Objectives

  • Explain practical differences between the hash functions and the collision management strategies.
  • Identify the advantages and disadvantages of Bloom filters.
  • Demonstrate how the different parameters used by Bloom filters impact performance and false positive rate.
  • Give examples of applications that should use/should not use Bloom filters.
  • Visualize and explain results generated by experiments in a clear and meaningful way.

Part 1: HashTable

See the Part 1 file for a complete description. We repeat the hash-function-related material here.

Hash Functions

Provided with the template/starter code is the Hasher class, which implements the hash functions, listed below. Do not modify this class.

Each hash function is a static method that takes a String as input and returns a signed 32-bit integer hash value. Note that the string is converted to an integer inside the hash function.

public static int crc32(String str)

Returns the cyclic redundancy check (CRC) of the input string, using the Guava library implementation. The CRC is a checksum typically used to verify data integrity, but here we use it as a hash function.

public static int adler32(String str)

Returns the Adler-32 checksum of the input string, using the Guava library implementation.

public static int murmur3_32(String str, int seedIdx)

Returns the 32-bit MurmurHash3 of the input string, using the Guava library implementation. MurmurHash3 requires a random seed as input. The Hasher class provides a static array of seeds to use, and the seedIdx argument indexes into this array.

public static int polynomial(String str, int primeIdx)

Computes a polynomial rolling hash of the input string. It uses polynomial accumulation to generate an integer and then uses the division method to generate the hash value. Specifically, the function computes

c1·p^(k−1) + c2·p^(k−2) + ... + c(k−1)·p + ck,

where p is a prime number and c1, ..., ck are the characters in the string. Similar to MurmurHash3, the Hasher class provides a static array of prime numbers to use for p, and the primeIdx argument indexes into this array.
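To make the accumulation concrete, here is a hypothetical sketch of this style of polynomial hash using Horner's rule (the provided Hasher.polynomial may differ in its exact constants and reduction step; do not substitute it for the provided class):

// Illustrative sketch of a polynomial rolling hash; not the provided Hasher implementation.
static int polynomialSketch(String str, int p) {
    int h = 0;
    for (int i = 0; i < str.length(); i++) {
        // Horner's rule accumulates c1*p^(k-1) + c2*p^(k-2) + ... + ck
        h = h * p + str.charAt(i);
    }
    return h;   // the division method (taking the value mod the table size) is applied later
}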

public HashTable(int capacity, String hashType)

HashTable constructor. Allocate the String array with the given capacity. The hashType argument determines which hash functions from the Hasher class to use, as well as the collision resolution strategy.

Possible values for this argument are “crc32-lp”, “crc32-dh”, “adler32-lp”, “adler32-dh”, “murmur3-lp”, “murmur3-dh”, “poly-lp”, and “poly-dh”. The “-lp” suffixes indicate linear probing, whereas the “-dh” suffixes indicate double hashing.

Refer to the table below for the functions to use. The returned index must be a valid index into the String array, that is, the possible values should be in the range [0, m-1], where m is the array length. The hash2() function is only needed for double hashing.

hashType

hash1(str)

hash2(str)

“crc32-lp”

Hasher.crc32(str)

“crc32-dh”

Hasher.crc32(str)

Hasher.adler32(str)

“adler32-lp”

Hasher.adler32(str)

“adler32-dh”

Hasher.adler32(str)

Hasher.crc32(str)

“murmur3-lp”

Hasher.murmur3_32(str, 0)

“murmur3-dh”

Hasher.murmur3_32(str, 0)

Hasher.murmur3_32(str, 1)

“poly-lp”

Hasher.polynomial(str, 0)

“poly-dh”

Hasher.polynomial(str, 0)

Hasher.polynomial(str, 1)

Note that Java does not have an unsigned integer type. Thus all of the hash functions in Hasher return a signed 32-bit integer. In Java, the modulus operator % returns a negative value when the dividend is negative. Instead, use the Integer.remainderUnsigned() method provided by Java, which interprets a signed integer as though it were unsigned.
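For example, a minimal sketch of mapping a (possibly negative) hash value to a valid table index might look like the following (the method name is illustrative only):

// Map a signed 32-bit hash value to a table index in [0, m-1].
static int indexFor(String str, int m) {
    int hash = Hasher.crc32(str);               // may be negative
    return Integer.remainderUnsigned(hash, m);  // treats both operands as unsigned, result in [0, m-1]
}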

Part 2: Bloom Filters

Download the Part 2 files and add them to your project from Part 1. Utils.java has been modified, so make sure you update it. You can get the test data from Brightspace; add the input and expected files to your input and expected directories, respectively.

In Part 2, you will use the hash functions in a data structure called a Bloom filter. Assume you are creating a new account on CreativeMe:

  1. You choose your favorite username, enter it and get the message “Username already taken”.
  2. You append your favorite 4-digit number to the username and again get the message “Username already taken”.
  3. Next, you replace the 4-digit number with your grandfather’s birthdate and the characters !*! and enter: Napoleon.August.15.1944!*! Still, you get “Username already taken”.

You are frustrated and wonder why all these usernames are taken and how CreativeMe determines this so quickly. Do they use linear search, binary search, hashing? Your grandfather says, “I bet they are using Bloom filters, a clever and simple data structure from the 1970’s.” What is a Bloom filter? In Part 2 of this project you will find out.

A Bloom filter is a fast and space-efficient probabilistic data structure for the approximate set membership problem, supporting insert and lookup. However, the lookup operation is inexact: it determines whether a given element is possibly in the set or definitely not in the set. In other words, false positives are possible, but false negatives are not. The tradeoff for this lack of precision is a very small memory footprint and worst-case constant-time insertion and lookup operations (assuming hashing takes constant time). Bloom filters are well suited for applications such as the scenario above, where quick lookups in a very large set are desired and false positives are inconsequential (i.e., you can always choose a different username). Since false negatives are not possible with Bloom filters, you will never be able to pick a username that is already in use. A potential drawback of using Bloom filters is that elements can only be added, not deleted.

2.1 Description of Bloom Filters

A Bloom filter consists of an array B of m bits. It employs k independent hash functions h1, h2, ..., hk mapping to {0, ..., m-1}. The values of m and k are chosen based on the expected number of insertions n, and the desired false positive rate ε. The array B is initialized to zero and subsequently modified during insertions, as described below.

Insertion

Let X be the element to be inserted. The insertion computes hash values h1(X), h2(X), ..., hk(X) as indices and sets the bits B[h1(X)], B[h2(X)], ..., B[hk(X)] on those indices to 1. In other words, X is hashed with each of the k hash functions to compute k bit positions in B, which are all set to 1.

Note that:

  1. a bit in B may already be set to 1;
  2. two or more hash functions can map to the same index;
  3. bits set to 1 remain 1;

Lookup

Let S be the set of all elements already inserted. Given element Y, we want to determine whether Y is in set S or not. To do so, we compute the k hash values h1(Y), h2(Y), ..., hk(Y).

    • If there exists at least one hash value hi(Y) with B[hi(Y)] == 0, we claim the element Y is not in S.
    • If B[hi(Y)] == 1 for all 1 ≤ i ≤ k, we assume Y is in set S, although we could be wrong. When all k entries are set to 1 and Y is not in S, we have a false positive. Bloom filters are designed to minimize the false positive rate by choosing appropriate values of m and k.

Figure 1 illustrates the operations. Assume m = 18 and k = 3 (using hash functions h1, h2, and h3). The range of h1, h2, and h3 is the set of integers {0, ..., 17}, which corresponds to the indices in the array B. We insert elements X, Y, Z according to hash values as follows:

X ⇒  h1(X) = 4,   h2(X) = 6,   h3(X) = 7,

Y ⇒  h1(Y) = 4,   h2(Y) = 11,   h3(Y) = 14,

Z ⇒  h1(Z) = 2,   h2(Z) = 7,   h3(Z) = 16,

After the three insertions, we have B[i] == 1 for i ∈ {2, 4, 6, 7, 11, 14, 16}.

Figure 1: A Bloom filter with m = 18 and k = 3 and elements X, Y, Z inserted.

Next, consider elements U and V with the following hash values:

U ⇒  h1(U) = 2,   h2(U) = 4,   h3(U) = 6,

V ⇒  h1(V) = 10,   h2(V) = 13,   h3(V) = 14,

We perform three lookups:

    • lookup(Z): applying the hash functions to element Z yields indices 2, 7, and 16. On these indices, all three bits B[2], B[7], B[16] in array B are already set to 1. We claim (correctly) that element Z is in the input set. (This is an example of a true positive.)
    • lookup(V): applying the hash functions to element V yields indices 10, 13, and 14. Since B[10] == 0 and B[13] == 0, we claim that V is not an element of set S. (This is an example of a true negative.)
    • lookup(U): applying the hash functions to element U yields indices 2, 4, and 6. On these indices, all three bits B[2], B[4], B[6] in array B are already set to 1. We (incorrectly) claim that U is in the input set. (This is an example of a false positive.)

Note: Figure 1 represents the conceptual working of the Bloom filter. Refer to section 2.3 for the actual implementation details.

2.2 Hash Function Families

Bloom filter operations require k different hash functions. Part 1 introduced four hash functions, but Bloom filters need more than four hash functions. To produce an independent family of a specified number of hash functions, we use the approach of combining two independent hash functions to generate k hash functions. We assume k ≤ 100.

Let g1 and g2 be two independent hash functions. We construct k independent hash functions from g1 and g2 as follows:

hi(str) = (g1(str) + i · g2(str)) mod m,   for i = 1, 2, ..., k.

Some of the hash functions of Part 1 have an additional argument. For example, MurmurHash3 takes a random seed as input. By providing MurmurHash3 with k different seeds, we can also generate k independent hash functions.

We define two families of functions, F1 and F2, described as follows, each one generating k independent hash functions used in the Bloom filter implementations:

  • Family F1: Combine the CRC32 and Adler-32 checksums as described above.
  • Family F2: Use MurmurHash3 with k different random seeds. For each of the k hash functions, generate a table index in the range {0, 1, ..., m − 1} by taking the 32-bit hash value modulo m.

Recall that the Hasher class from Part 1 already provides static arrays of 100 random seeds and 100 prime numbers. The seedIdx and primeIdx arguments index into these arrays.
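As a concrete illustration, a minimal sketch of computing the i-th hash value for each family might look like the following (the method name and structure are illustrative, not the required template; the “checksum”/“murmur” strings follow section 2.3):

// Illustrative sketch: i-th hash of str for each family, mapped to a bit index in [0, m-1].
static int familyHash(String str, int i, int m, String hashType) {
    int h;
    if (hashType.equals("checksum")) {
        // Family F1: combine g1 = CRC32 and g2 = Adler-32 via g1 + i * g2.
        h = Hasher.crc32(str) + i * Hasher.adler32(str);
    } else {
        // Family F2: MurmurHash3 with the i-th random seed from Hasher's seed array.
        h = Hasher.murmur3_32(str, i);
    }
    return Integer.remainderUnsigned(h, m);   // handle negative values (see section 1.2)
}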

2.3 Bloom Filter Implementation

We have provided the BloomFilter class template for you to start with. Your implementation must use the methods and members provided, as the grading depends on them. You may need to add additional members to the class in order to complete it.

// BloomFilter class

public class BloomFilter {

public int m;         // Number of bits

public int k;         // Number of hash functions

public byte[] B; // Array of bits

// ...

}

Note that member B is an array of bytes, which means each element B[i] is an 8-bit binary sequence, as illustrated in Figure 2. You will need to implement logic to get and set individual bits within this array in the getBit() and setBit() methods, using bitwise operators (e.g., OR (|), AND (&), and left shift (<<)).

Figure 2: A Bloom filter using a byte array, with m = 18 and k = 3 and elements X, Y, Z inserted.

Array B should not contain more bytes than are needed to represent m bits. For example, if m = 18, then B.length should be 3, resulting in 6 unused bits (indexed from 18 to 23). Java provides the BitSet class, but you are not allowed to use it in your implementation.

Your implementation must map between byte indices (in array B) and bit indices (in the Bloom filter) using the following scheme: byte i in array B contains bits 8i through 8i + 7, indexed in increasing order from least significant to most significant.

The methods to implement are:

public BloomFilter(int m, int k, String hashType)

BloomFilter constructor. Allocate the bit array and set the internal state. The hashType argument determines which hash family to use for all k hash functions in this BloomFilter instance (see section 2.2). Possible values are “checksum” and “murmur”; you must implement both. Your Bloom filter will use the family specified by this parameter.

public boolean getBit(int b)

Return whether the bit at the given bit index b is set to 1 (true) or 0 (false).

public void setBit(int b)

Set the bit at the given bit index b to 1.

Note that there is no need to set any bit to 0 after initialization.
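Under the byte/bit mapping described above, one possible getBit()/setBit() sketch (assuming the B member from the template) is:

// Sketch consistent with the mapping above: bit b lives in byte b/8, at position b%8
// (least significant bit first). Not necessarily the only correct way to write it.
public boolean getBit(int b) {
    return ((B[b / 8] >> (b % 8)) & 1) == 1;
}

public void setBit(int b) {
    B[b / 8] |= (byte) (1 << (b % 8));   // bits already set to 1 remain 1
}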

public int hash(String str, int k)

Hash the given string with the kth hash function of the hash family given by the hashType argument to the constructor. The returned hash must be a valid bit index in the range [0, m-1]. See the notes under the hash methods in section 1.2 regarding signed values.

public void insert(String str)

Perform insertion into the Bloom filter according to the description in section 2.1.

public boolean lookup(String str)

Return whether the given string is possibly in the Bloom filter as described in section 2.1.
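For reference, a minimal sketch of insert() and lookup() in terms of hash(), setBit(), and getBit() might look like the following (illustrative only; it assumes the hash-function index passed to hash() runs from 0 to k − 1):

// Sketch of insert()/lookup() built on hash(), setBit(), and getBit().
public void insert(String str) {
    for (int i = 0; i < k; i++) {
        setBit(hash(str, i));            // set all k bit positions to 1
    }
}

public boolean lookup(String str) {
    for (int i = 0; i < k; i++) {
        if (!getBit(hash(str, i))) {
            return false;                // a 0 bit means definitely not in the set
        }
    }
    return true;                         // all k bits set: possibly in the set
}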

Test your implementation using the main() method in the BloomFilter class. By default, this method creates a BloomFilter instance with m = 40 and k = 3, using the polynomial rolling hash family. When you run the program, it should print the bit array as 1s and 0s, and then await keyboard input. Each line that is typed into the terminal will be read as a string and inserted into the Bloom filter, followed by another printout of the bit array. If a string is already in the Bloom filter or a false positive occurs, the program will print “May already contain the string”. Press the appropriate sequence to end the input stream (e.g., Ctrl+D on Linux).

Testing your code

We provide a set of white box unit tests for the BloomFilter class. We strongly recommend that you proceed to Section 2.4 only after passing all of these tests.

The overall test structure is similar to Part 1. We use the JUnit framework and provide test cases as JUnit methods in BloomFilterTest.java. We again use input and expected test files, although we no longer have distinct output and state files. The Utils.java file has been changed since Part 1; please ensure you are using the latest version which contains a checkArrayEquals method with byte[] arguments.

The white-box tests check both the internal Bloom filter state and the values returned by the Bloom filter's methods. Your implementation must map between byte indices (in array B) and bit indices (in the Bloom filter) using the following scheme: byte i in array B contains bits 8i through 8i + 7, indexed in increasing order from least significant to most significant.

The test case descriptions specify the method(s) being tested. Below, we provide some additional remarks:

  1. Small-scale test for getBit() and setBit(). This test case instantiates a Bloom filter with 32 bits, and then sets the bits in the order of indices specified in the input file. After setting each bit, the test verifies the contents of array B against the output file, and checks the return value of getBit() on each index.

  2. Medium-scale tests for getBit() and setBit(). Similar to test case 1, but with much larger Bloom filters: we take m to be 1024 (a power of two), 1009 (a prime number), and 1250 (a composite number that is not divisible by eight).

  3. Medium-scale tests for hash(). This test case instantiates several Bloom filters with varying capacities and hash types and uses them to hash 10,000 random strings, verifying the output of the 100 hash functions.

  4. Small-scale special-case tests for insert(). This test case inserts the input strings into a Bloom filter with 64 bits and verifies the contents of array B after each insertion.

  5. Medium-scale tests for insert(). Similar to test case 4, but with much larger Bloom filters and more input strings.

  6. Medium-scale tests for lookup(). This test case instantiates a Bloom filter and sets array B to the state in the first line of the input file. Then the test calls lookup() on each string in the subsequent lines of the input file and checks the return values against the output file.

2.4 Comparing Experimental and Analytical Performance

Designing a Bloom filter requires setting values for m and k, which are typically chosen based on a desired false positive rate ε and the expected number of insertions n. The probability of false positives can be expressed as a function P(n, m, k), which can then be used to determine the value of k that minimizes it.

Specifically, given n, m, and k, the probability of false positives is

P(n, m, k) = (1 − (1 − 1/m)^(kn))^k,

which is referred to as the theoretical false positive rate. The derivation and analysis of this formula is beyond the scope of CS 251. For those interested, more information can be found here. In the following subsections you will perform experiments and analyze your BloomFilter implementation.
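If you want to compare your measurements against this theoretical rate in your own code, a small helper along these lines could be used (a sketch, not part of the required template):

// Sketch: theoretical false positive rate P(n, m, k) from the formula above.
static double theoreticalFpr(long n, long m, int k) {
    return Math.pow(1.0 - Math.pow(1.0 - 1.0 / m, (double) k * n), k);
}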

Write code in main() to produce the data for your analysis, which will be plotted and discussed in your report.

2.4.1 Measuring False Positive Rate against the Number of Hash Functions

Given n and m, the value of k that minimizes the false positive rate is given by the formula

kopt = (m/n) ln 2.

The theoretical false positive rate for kopt is

P(n, m, kopt) ≈ e^(−(m/n)(ln 2)^2).

Thus, given n and m, one can construct a Bloom filter where the number of hash functions kopt gives the lowest false positive rate. We recommend you explore this relationship for various values of n and m.
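For example, a small helper for computing kopt (rounded to the nearest integer, as required below) might look like this sketch:

// Sketch: optimal number of hash functions for given n and m, rounded to the nearest integer.
static int optimalK(long n, long m) {
    return (int) Math.round(((double) m / n) * Math.log(2));
}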

Given two input files randstrs.txt and words_ordered.txt, and the number of insertions n (passed in as a parameter), construct a Bloom filter with values m and k given below in (*), and your choice of hash function family. Make sure that you are consistent over the whole report. Read the first n strings from the input file, then:

        A. Insert the n strings into your Bloom filter.
        B. Read and look up the remaining strings from the input file, keeping count of the number of lookups (you don’t know the number of strings in the file until EOF).
        C. Each string in the input file is unique. Hence, any positive lookups are false positives. Count the number of false positives and obtain the measured false positive rate δ = (# of positive lookups) / (# of total lookups).

(*) Repeat this procedure (i.e., A-C) for each of the combinations of the following m and k values:

  • m ∈ { 12n, 15n, 18n }
  • k ∈ { kopt - 5, kopt - 4, kopt - 3, kopt - 2, kopt - 1, kopt, kopt + 1, kopt + 2, kopt + 3, kopt + 4, kopt + 5 }

In all cases round m and k to the nearest integer (e.g., for 3.2, use 3; for 3.8, use 4; for 3.5, use 4).

This will result in 3 × 11 = 33 measured false positive rates. These 33 (m, k) pairs will have some values of k repeated. Note that kopt depends on m. Print the measured false positive rates in a table in Comma-Separated Value (CSV) format with 4 columns and 34 rows, using the following format:

  • The first row (row 1) contains four quantities as the heading: k and the 3 values of m (in terms of n).
  • Rows 2 through 12 contain the measured false positive rates for m = 12n in column 2,
  • Rows 13 through 23 contain measured false positive rates for m = 15n in column 3, and
  • Rows 24 through 34 contain measured false positive rates for m = 18n in column 4.

File sample_bf illustrates the expected format (with made-up values). Use this format!

An example has been provided on Brightspace.

Your program will be run with two command line arguments: the input filename (randstrs.txt or words_ordered.txt) and the number of insertions n, in that order. Perform the above procedure and print the table to the standard output stream System.out.
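As a rough outline, measuring δ for a single (m, k) pair could follow a pattern like this sketch (the helper name is illustrative, and it assumes a java.io.BufferedReader over the input file; adapt the I/O to whatever the starter code already imports):

// Sketch: measure the false positive rate for one (m, k) pair.
// Every string in the file is unique, so any positive lookup after the first n is a false positive.
static double measureFpr(java.io.BufferedReader reader, int n, int m, int k, String hashType)
        throws java.io.IOException {
    BloomFilter bf = new BloomFilter(m, k, hashType);
    for (int i = 0; i < n; i++) {
        bf.insert(reader.readLine());      // insert the first n strings
    }
    long lookups = 0, falsePositives = 0;
    String line;
    while ((line = reader.readLine()) != null) {
        lookups++;
        if (bf.lookup(line)) {
            falsePositives++;
        }
    }
    return (double) falsePositives / lookups;
}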

Save the output as a .csv file by copying and pasting the output text, or by using the output redirection operator >:

$ java BloomFilter input_file.txt 500000 > output_file.csv

Produce two such tables: one for the input file randstrs.txt with n = 500,000, and the other for the input file words_ordered.txt with n = 185,000, and include the .csv files with your submission. Then, plot two line charts for your report. Each chart will have three separate lines plotted, in separate colors, one for each value of m. The x-axis is k, and the y-axis is the measured false positive rate on a logarithmic scale. File sample_bf shows an example. Be sure to include titles, axis labels, and a legend with your charts.

Analyze the data generated for the two input files. Based on your observations, your report should address the following questions:

  1. For input file randstrs.txt with n = 500,000, for each of the 33 combinations of k and m, is your measured false positive rate 𝛿 always within 5% of the theoretical false positive rate P? Explain your answer.
  2. For the input file words_ordered.txt with n = 185,000, for each of the 33 combinations of k and m, is your measured false positive rate 𝛿 always within 5% of the theoretical false positive rate P? Explain your answer.
  3. For a curve associated with a fixed value of m, what can you say about the shape of the curve? In particular, is kopt the global optimum in your experiments? Explain your answer.
  4. How do the three curves (associated with m ∈{12n,15n,18n}, respectively) relate to each other? Specifically, how does kopt change with increasing values of m? Explain your answer.

2.4.2 Measuring False Positive Rate against the Size of the Bloom Filter

Given n and ε, the minimum number of bits required to achieve the false positive rate ε is

mopt = −1.44 n log2 ε.

The value of k at which this optimum is achieved is kopt = −log2 ε.
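A small helper for these two quantities (rounded to the nearest integer, as required below) might look like this sketch:

// Sketch: optimal number of bits and hash functions for n insertions and target rate eps.
static int optimalM(long n, double eps) {
    return (int) Math.round(-1.44 * n * (Math.log(eps) / Math.log(2)));
}

static int optimalKForEps(double eps) {
    return (int) Math.round(-Math.log(eps) / Math.log(2));
}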

For your experimental work on measuring false positive rates, you are given an input file randstrs.txt, from which the n items are to be inserted into the Bloom filter (as in 2.4.1).

  • Construct a Bloom filter with values m and k given below in (**), and your choice of hash function family from section 2.4.1.
  • Read the first n strings from the input file and insert them into your Bloom filter.
  • Read and lookup all the remaining strings from the input file, keeping count of the number of lookups (you don’t know the number of strings in the file until EOF).
  • Each string in the input files given to you is unique. Hence, any positive lookups are false positives. Count the number of false positives and obtain the measured false positive rate δ = (# of positive lookups) / (# of total lookups).

(**) Similar to the experimental approach used in 2.4.1, measure the false positive rate 𝛿 for the following combinations of m and k:

  • m ∈ { 0.25mopt, 0.5mopt, 0.75mopt, mopt, 1.25mopt, 1.5mopt, 1.75mopt, 2.0mopt, 2.25mopt, 2.5mopt, 2.75mopt }
  • k ∈ { 0.6kopt, kopt, 1.4kopt }

In all cases, round m and k to the nearest integer (e.g., for 3.2, use 3; for 3.8, use 4; for 3.5, use 4).

This results in 11 × 3 = 33 measured false positive rates. Unlike in 2.4.1, kopt = −log2 ε does not depend on m, so for a fixed ε, there will only be 3 distinct values of k. In particular, k ∈ { 0.6kopt, kopt, 1.4kopt }. Print the measured false positive rates in a CSV table with 12 rows and 4 columns.

  • The first row (row 1) must contain values of k, and the first column (column 1) must contain values of m.
  • Column 2, rows 2 through 12 must contain the measured false positive rates for 0.6kopt.
  • Column 3, rows 2 through 12 must contain the measured false positive rates for kopt.
  • Column 4, rows 2 through 12 must contain the measured false positive rates for 1.4kopt.

Your program will be run with three arguments: the input filename, number of insertions, and desired false positive rate:

$ java BloomFilter input_file.txt 500000 0.1 > output_file.csv

Produce three such tables with the following arguments:

  • randstrs.txt, n=500000, ε=0.1
  • randstrs.txt, n=500000, ε=0.05
  • randstrs.txt, n=500000, ε=0.02

Plot three line charts with these tables, to be included in your report. Each chart will have three separate lines plotted, in separate colors, one for each value of k. The x-axis is m, and the y-axis is the measured false positive rate on a logarithmic scale. Be sure to include titles, axis labels, and a legend with your charts.

Based on your observations, your report should address the following questions:

  1. For the 33 combinations of k and m, is your measured false positive rate 𝛿 always within 5% of the desired false positive rate ε? Explain your answer.
    1. when ε=0.1
    2. when ε=0.05
    3. when ε=0.02
  2. For a curve associated with a fixed value of k, what can you say about the shape of the curve? Specifically, is 𝛿 a monotonic function of m? Explain your answer.
  3. How do the three curves in each chart relate to each other? Explain your answer.

Include your CSVs at the end of your report!

Code Requirements for Experimental Work

The Hasher class relies on the Guava library, included with the template code. To compile it, you will need to specify the path to the .jar file as part of the classpath:

$ javac -classpath .:guava-31.1-jre.jar Hasher.java

Alternatively, we recommend you set the CLASSPATH environment variable to simplify the command:

$ export CLASSPATH=.:guava-31.1-jre.jar

$ javac Hasher.java

Compile the remaining classes similarly:

$ javac HashTable.java BloomFilter.java

To run the experiments in section 2.4.1, run the BloomFilter program with two arguments: the input filename and number of insertions n:

$ java BloomFilter input_file 500000 > output_file.csv

To run the experiments in section 2.4.2, run the BloomFilter program with three arguments: the input filename, number of insertions n, and the desired false positive rate ε:

$ java BloomFilter input_file 500000 0.1 > output_file.csv

The above examples assume the environment variable CLASSPATH has been set.

Structure of the Report

Overall report guidelines and expectations

  • The report needs to be typed (use LaTeX or Word).
  • Your name (Last, First), your Purdue email, and your section number (LE1@4:30 or LE2@1:30) need to be on top of page 1.
  • The RC statement needs to be on page 1 under your name.
  • Reports are uploaded to Gradescope, and code is uploaded to Vocareum.
  • All illustrations and figures need to be clearly readable. They cannot be handwritten or drawn, for example, on an iPad. Handwritten tables are not acceptable.
  • Report for Part 2 should address the questions stated in sections 2.4.1 and 2.4.2.
  • The suggested length of the report is up to 6 pages, with font size 12 and 1.5 spacing.
    • Include the image of your CSVs at the end of your report, above the Appendix. This will not be counted in the 6 pages. Make sure it is clear to read.

    • If you have additional supporting tables exceeding the page limit, include them as an Appendix (which has no page limit). However, only your report will be graded. The appendix may be consulted by the grader for additional information.

Grading

The expected grading rubric is given below.

Part 2 (50 points)

  • Coding (20 points)
  • Overall quality of the report for 2.4.1 (15 points)
    • Quality and conciseness of writing. In particular, addressing the four questions asked and giving accurate explanations.
    • Quality of tables and plots included.
    • Logical flow of arguments and a clear focus on relevant ideas.
  • Overall quality of the report for 2.4.2 (15 points)
    • Quality and conciseness of writing. In particular, addressing the three questions asked and giving accurate explanations.
    • Quality of tables and plots included.
    • Logical flow of arguments and a clear focus on relevant ideas.

Submission

  • To Vocareum (everything should be inside the src folder):
    • BloomFilter.java

  • To Gradescope:
    • report_p3_part2.pdf

Note:

  • Additional imports are not allowed (other than the ones in the starter code).
  • If for some reason you do not see the src folder under the work folder on Vocareum, please make a new src folder. This has been explained on Piazza already, so you can refer to it there.
  • Please go to office hours for additional help.