COMP20003 Algorithms and Data Structures

vx_codinghelp

于 2024-09-03 09:11:17 发布

阅读量143

点赞数 1

文章标签： windows 网络 linux

本文链接：https://blog.csdn.net/vx_codinghelp/article/details/141847468

版权

Assignment 2: Spellcheck Lookup
0. General
You must read fully and carefully the assignment speciﬁcation and instructions.
Course: COMP20003 Algorithms and Data Structures @ Semester 2, 2024
Deadline Submission: Friday 6th September 2024 @ 11:59 pm (end of Week 7)
Course Weight: 15%
Assignment type: individual
ILOs covered: 1, 2, 3, 4
Submission method: via ED
Purpose
The purpose of this assignment is for you to:
Improve your proﬁciency in C programming and your dexterity with dynamic memory
allocation.
Demonstrate understanding of a concrete data structure (radix tree) and implement a set of
algorithms.
Practice multi-ﬁle programming and improve your proﬁciency in using UNIX utilities.
Practice writing reports which systematically compare the eﬃciency of diﬀerent data structures.
Walkthrough1. Background
In Assignment 1 you implemented a dictionary using a linked list. Many applications of dictionaries
cannot assume that the user will query in the correct format. Take Google search as an example. If
someone is so excited to check the recent Olympics updates, they may accidentally search something
similar to this:
Thankfully, there is enough context in the original query that the intended search result is
recommended back to the user. In Assignment 2 you will be extending your implementation to
account for misspelled keys when querying your dictionary.2. Your Task
Assignment:
Overall, you will create a partial error-handling dictionary (spellchecker) using a radix tree.
You will be using the same dataset as Assignment 1.
Users will be able to query the radix tree and will get either the expected key, or the closest
recommended key.
You will then write a report to analyse the time and memory complexity of your Assignment 1
linked list compared to your radix tree implementation.
C Implementation:
Your programs will build the dictionary by reading data from a ﬁle. They will insert each
suburb into the dictionary (either the linked list (Stage 3) or radix tree (Stage 4)).
Your programs will handle the search for keys. There are three situations that your programs
must handle:
Handle exact matches: output all records that match the key (Stage 3 and 4).
Handle similar matches: if there are no exact matches, ﬁnd the most similar key and
output its associated records (Stage 4 only).
Handle no matches being found: if neither exact nor similar matches are found, indicate
that there is no match or recommendation (Stage 3 and 4).
Your programs should also be able to populate and query the dictionary with the Assignment
1 linked list approach and the radix tree approach.
Your programs should also provide enough information to compare the eﬃciency of the
linked list with the radix tree with a operation-based measure that ignores less important runtime
factors (e.g. comparison count).
It is recommended to use your own solution of Assignment 1, and extend it for Assignment 2. But, we will also
provide a solution for Assignment 1 if you prefer to work with our code.3. An Introduction to Tries and Radix Trees
First, it is important to establish the diﬀerence between a Trie and a Tree. This is best illustrated
with an example. One example of a tree is a binary search tree (BST), where each node in the tree
stores an entire string, as illustrated below. The nodes are ordered and allow easy searching. When
searching in the BST, we compare the query with the entire string at each node, deciding whether to
switch to the left or right subtree or stop (if the subtree is empty) based on the result of the
comparison.
A trie is slightly diﬀerent. It has multiple names, such as retrieval tree or preﬁx tree. In a trie, the
traversal of the tree determines the corresponding key. For the same strings as above with one
letter per node, it would look like:Tries allow for quick lookup of strings. Querying this trie with the key "hey" would ﬁnd no valid path
after the "e" node. Therefore, you can determine that the key "hey" does not exist in this trie.
A radix trie is a subtype of trie, sometimes called a compressed trie. Radix trees do not store a single
letter at each node, but compress multiple letters into a single node to save space. At the character
level, it would look like this:
Radix tries can again be broken down into diﬀerent types depending on how many bits are used to
deﬁne the branching factor. In this case, we are using a radix (r) of 2, which means every node has 2
children. This is accomplished by using 1 bit of precision, so each branch would be either a 0 or 1.
This type of radix trie (with r=2) is called a Patricia trie. Another example of a radix trie with r=4
would have 4 children, determined by the binary numbers 00, 01, 10, 11. The radix is related to the
binary precision by r = 2x where x is the number of bits used for branching.
Radix trees beneﬁt from less memory usage and quicker search times when compared to a regular
trie.
While these visual representations are at the character level, a Patricia trie is implemented using
the binary of each character. Each node in the trie stores the binary preﬁx at the current node,
rather than the character preﬁx. For example, we can insert 3 binary numbers into a PATRICIA trie:
0000, 0010 and 0011.Combining the previous worded example with the binary representation gives us a patricia tree of the
form:
You should trace along each path and validate that the stored strings are the same as the example
above. Each character is represented over 8 bits, and the last character is followed by an end of string
00000000 8 bit character, i.e. NULL .
It is important to note that these representations only show the preﬁx at each node. An actual
implementation will require more information within a node struct. To see this, look at the "extra
insertion example" slide.
Implementation TipsTo implement the Radix Tree, a recommended approach is to use the functions provided below.
1. Get a particular bit from a given character getBit(key, n) and
2. Extract a stem from a given key splitStem(key, start, length) which extracts a certain
number of bits from a given key.
If you want to understand the code provided, you need to understand the following bit operators
over numbers a and b:

Operator
a & b
a | b
a |= b
a << b
a >> b
Meaning
Where both corresponding bits in a and b are 1,
give a 1 in the output, else give a 0 in that position
Where either corresponding bit in a and b is 1,
give a 1 in the output, else give a 0 in that position.
Compute a | b and assign the result of the operation to a.
For all bits in a, move each bit b bits to the left.
For all bits in a, move each bit b bits to the right.
Use The Following Functions
The two recommended functions, getBit and splitStem are provided here as they can typically be
used without making major changes to their structure. Diagrams explaining their algorithmic
structure and example outputs are also provided.
getBit
This function works out the byte which the bit we're trying to extract lies in and then works out which
bit it will be in. The oﬀset is not simply the remainder of the division because the bits are ﬁlled and
split from the left (highest value bit) rather than the right.
/* Number of bits in a single character. */
#define BITS_PER_BYTE 8
/* Helper function. Gets the bit at bitIndex from the string s. */
static int getBit(char *s, unsigned int bitIndex);
static int getBit(char *s, unsigned int bitIndex){
assert(s && bitIndex >= 0);
unsigned int byte = bitIndex / BITS_PER_BYTE;
unsigned int indexFromLeft = bitIndex % BITS_PER_BYTE;
/*
Since we split from the highest order bit first, the bit we are interested
will be the highest order bit, rather than a bit that occurs at the end of the
number.
*/
unsigned int offset = (BITS_PER_BYTE - (indexFromLeft) - 1) % BITS_PER_BYTE;
unsigned char byteOfInterest = s[byte]; unsigned int offsetMask = (1 << offset);
unsigned int maskedByte = (byteOfInterest & offsetMask);
/*
The masked byte will still have the bit in its original position, to return
either 0 or 1, we need to move the bit to the lowest order bit in the number.
*/
unsigned int bitOnly = maskedByte >> offset;
return bitOnly;
}
An example diagram of what the calculation of the offset does from each possible value of
indexFromLeft is shown here:
Oﬀset
7 indexFromLeft
0
6
1
5
2
4
3
3
4
2
5
1
6
0
7 oﬀset
An example for getting bit number 12 from the string "hi" could be called using getBit("hi", 12) .
An example of how the calculation occurs is shown below:getBit
Input
IndexFromLeft
Oﬀset
byteOfInterest
oﬀsetMask
maskedByte
bitOnly
s
bitIndex
0b01101000 01101001 00000000
h i \0
s[byte]
12
bitIndex % 8 bitIndex / 8
4
4 7 indexFromLeft
0
6
1
5
3 2
3
4
2
5
1
6
0
7 oﬀset
1 << oﬀset
maskedByte >> oﬀset
byte = 1
0 1 1 0 1 0 0 1
byteOfInterest & oﬀsetMask
0 0 0 0 0 0 0 1
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1
createStem
This function creates a new collection of bytes from an existing collection of bytes. Conceptually this
is extracting a substring from a string and returning the new substring, but the returned value will not
necessarily be NULL terminated and may not start from the beginning of a character - so may not be
printable as a string for multiple reasons.
/* Allocates new memory to hold the numBits specified and fills the allocated
memory with the numBits specified starting from the startBit of the oldKey
array of bytes. */
char *createStem(char *oldKey, unsigned int startBit, unsigned int numBits);
char *createStem(char *oldKey, unsigned int startBit, unsigned int numBits){
assert(numBits > 0 && startBit >= 0 && oldKey);
int extraBytes = 0; if((numBits % BITS_PER_BYTE) > 0){
extraBytes = 1;
}
int totalBytes = (numBits / BITS_PER_BYTE) + extraBytes
char *newStem = malloc(sizeof(char) * totalBytes);
assert(newStem);
for(unsigned int i = 0; i < totalBytes; i++){
newStem[i] = 0;
}
for(unsigned int i = 0; i < numBits; i++){
unsigned int indexFromLeft = i % BITS_PER_BYTE;
unsigned int offset = (BITS_PER_BYTE - indexFromLeft - 1) % BITS_PER_BYTE;
unsigned int bitMaskForPosition = 1 << offset;
unsigned int bitValueAtPosition = getBit(oldKey, startBit + i);
unsigned int byteInNewStem = i / BITS_PER_BYTE;
newStem[byteInNewStem] |= bitMaskForPosition * bitValueAtPosition;
}
return newStem;
}
If the number of bits to extract matches exactly on a character border (i.e. if it is a multiple of 8) an
extra character is not needed, but otherwise, an extra character is needed - with the remainder of the
value being ﬁlled with 0's.
It works by fetching each bit from the given byte array and setting it in the new byte array.
An example of createStem("hello", 0, 12) is shown here:
Input
Processing
Output
oldKey
startBit
0b01101000 01100101 01101100 01101100 01101111 00000000
h e l l o \0
numBits
0
12
0b01101000 0110 0101 01101100 01101100 01101111 00000000
h e l l o \0
0b01101000 0110xxxx
h `-o
newStem
An example of createStem("hello", 12, 15) is shown here:Input
Processing
Output
oldKey
startBit
0b01101000 01100101 01101100 01101100 01101111 00000000
h e l l o \0
numBits
12
15
0b01101000 0110 0101 01101100 011 01100 01101111 00000000
h e l l o \0
0b0101 01101100 011x
e l `-[DEL]
newStem
Note: The result of createStem is not a string, strlen will not reliably count its length and printing it as a
string may lead to unexpected results.4. Extra Insertion Example and Insertion Process
This extra slide shows an additional example for inserting data into a preﬁx tree. This shows the
process of inserting the seven strings:
"a" 01100001 00000000
"b" 01100010 00000000
"c" 01100011 00000000
"d" 01100100 00000000
"e" 01100101 00000000
"f" 01100110 00000000
"g" 01100111 00000000
The ﬁrst inserted string ("a" - 1100001 00000000) has no splits, so the preﬁx is the whole string:
Root Node
preﬁxBits 16
preﬁx 0b01100001 00000000
a \0
branchA NULL
branchB NULL
data ...
The second inserted string ("b" - 1100010 00000000) ﬁrst diﬀers in the second last bit of the ﬁrst
character, so the tree splits at that point. The root node's preﬁx only determines the ﬁrst 6 bits of the
string - narrowing all strings in the the tree to starting with a character which has the ﬁrst six bits
011000. Therefore, the root node can lead to any character from ` to c. Check this ASCII-Binary table
(numbers 96-99).The third inserted string ("c" - 01100011 00000000) produces a new diﬀerence in the last bit (of the
ﬁrst character) in the node completing the string "b". After the split, the right branch of the second
level node can lead the (ﬁrst) character to be 'b' and 'c'.The fourth inserted string ("d" - 01100100 00000000) produces a diﬀerence in the root node, as the
(ﬁrst) character diﬀers in its third last bit - causing a split in the root node after 5 bits. After the split,
the root node only narrows the letters in the ﬁrst character of all items in the tree from the character
'`' to the character 'g'.The ﬁfth inserted string ("e"- 01100101 00000000) produces a new diﬀerence in the node completing
the string "d" in the last bit of the (ﬁrst) character in the string. This causes a split, the third last and
second last bits for 'd' and 'e' are still the same, so the node retains these two bits.The sixth inserted string ("f" - 01100110 00000000) produces a split in the node deciding between
the (ﬁrst) character being 'd' or 'e', with the second last bit of the character (the second bit in the node)
being diﬀerent.Then the ﬁnal inserted string ("g" - 01100111 00000000) produces a split in the node completing the
string "f", with the last bit of the ﬁrst character diﬀering.Insertion
In addition to the two functions provided to work over bit representations of strings, you will likely
have some kind of insertion operation to handle the example above.
The code logic details will likely vary depending on your data structure and implementation of
Assignment 1. Therefore, this diagram is provided as an overarching diagram to help you through the
task of inserting a new string.Insert into Tree
Check if in Tree Already
Not in Tree Already
Start
Start at Root
See if All Bits Match
Check Next Bit
all match
Not in Tree
Bit mismatch or
more bits in tree node
than remaining in string
Node Found
all exhausted in search string
Advance Down branchB
1
Advance Down branchA
0 Start at Root
Insert into
Found Node
See if All Bits Match
Check Next Bit
all match
Split Node
Insert Mismatched Stem
as New Branch Child
and Old Stem as Other
Branch, Leaving Common
Stem in Node
Bit mismatch or
more bits in tree node
than remaining in string
Advance Down branchB
1
Advance Down branchA
0Inserting into the found node simply adds to the list of data in the node and the split node step will
split at the common bit position, leaving the common stem in the node, placing the stem from the
inserted string in a newly created node down the appropriate branch (A if a 0 is the following bit in
the string, and B if it is a 1) and the stem from the existing string down the other branch in a newly
created branch. After this split - the insertion is complete.
You may ﬁnd it useful to set the data in the existing node to NULL to avoid the same data existing in multiple
places in your tree.
By storing the \0 character, we do not need to worry about the case where the string continues after an
existing node is exhausted - we will always reach a mismatch or the string will match, simplifying both the
search and insertion logic.5. Dataset and Assumptions
Note: this slide is identical to the Assignment 1 dataset and assumptions.
The opendatasoft website provides numerous databases, enabling organisations and individuals to
seamlessly access and share data. The dataset used in this assignment is a simpliﬁed version derived
from the "State Suburbs - Australia" database on that website which originates from ABS data. You
can browse or visualize the original database using the link above. The whole simpliﬁed dataset, as
well as smaller samples, can be found in the Dataset Download slide.
The processed dataset can be found in the Dataset Download slide. You aren't expected to perform
this processing yourself.
The provided dataset has the following 10 ﬁelds:
COMP20003 Code - The record IDs added by the COMP20003 teaching team (integer)
Official Code Suburb - The suburb ID (integer)
Official Name Suburb - The name of the suburb (string)
Year - The year the information was recorded for (integer)
Official Code State - The IDs of the corresponding states (string)
Official Name State - The names of the corresponding states (string)
Official Code Local Government Area - The IDs of the corresponding local government areas (string)
Official Name Local Government Area - The names of the corresponding local government areas (string)Latitude - The latitude (y) of the suburb centre (double)
Longitude - The longitude (x) of the suburb centre (double)
The provided text gives a detailed description of the data format and assumptions for the
assignment. Here's a summary of the key points:
Fields and Data Types:
COMP20003 Code , Official Code Suburb , Year : integers
Longitude , Latitude : doubles
all other ﬁelds: strings (may contain spaces and commas)
Special Cases:
Official Code State , Official Code Local Government Area : comma-separated lists of
integers (treated as strings for this assignment).
comma-containing strings: enclosed in double quotes ("") within the CSV ﬁle, the quotes are
removed when stored in your program according to standard CSV rules.
CSV File Assumptions:
Note that normally a suburb lies entirely inside a single state, as well as a single local government
area, but this is not a universal rule. Suburbs might be in multiple local governments areas and
multiple states. This is why the ﬁelds Official Code State and Official Code Local Government
Area are not classiﬁed as integers. Each of these ﬁelds is actually a comma-separated list of integers.
For the purposes of this assignment, it is fully suﬃcient to consider this as a string.
This data is in CSV format, with each ﬁeld separated by a comma. For the purposes of the
assignment, you may assume that:
the input data is well-formatted,
input ﬁles are not empty,
input ﬁles include a header line that contains the ﬁlled headers in the CSV format ,
any ﬁeld containing a comma will begin with a double quotation mark ( " ) and end with a
quotation mark ( " ),
each ﬁeld contains at least one character and at most 127 characters,
ﬁelds always occur in the order given above, and that
the maximum length of an input record (a single full line of the CSV) is 511 characters.
In this dataset, CSV conventions are followed for strings containing commas. These strings are
enclosed in double quotation marks (") at the beginning and end. However, when processing the data
and storing it in your dictionary, these quotation marks are removed. You can safely assume there
won't be any double quotes within the actual string values stored in the dictionary so your program
does not have to handle this case.Processing Assumptions:
The column headers (such as COMP20003 Code and Official Name Suburb ) can be assumed
to match the names expected and can be used directly in the output as the labels for each ﬁeld.
The number of columns (10 in this assignment), as well as the order, text and data type of each
column are assumed to be known. While it's possible to automatically determine this
information, it's beyond the scope of this assignment.6. Implementation Details
Assignment 2 will involve roughly three new stages.
Stage 3 will extend your Assignment 1 solution to count the number of comparisons it
takes to ﬁnd a given key, i.e. a search query.
Stage 4 will implement the lookup of data by a given key (Oﬃcial Name Suburb) in a Patricia
tree.
Stage 5 is a report about the complexity of a linked list compared to the Patricia tree.
Stage 3: Linked List Search Comparisons
In this stage, you will extend your Assignment 1 solution to count the number of binary and string
comparisons and node accesses used when searching for a given key.
Your Makefile will produce an executable called dict3 . The command line arguments are identical
to assignment 1 but with the stage being 3 instead.
The ﬁrst argument will be the stage, for this part, the value will always be 3 (for linked list
lookup with comparison count added).
The second argument will be the name of the input CSV data ﬁle.
The third argument will be the name of the output ﬁle.
Your dict3 program should function exactly the same as assignment 1's stage 1. You will add the
functionality to count comparisons when searching for a key which will be added to the stdout
output. For this stage, and this stage only, you may assume:
Each character compared adds exactly 8 bits to the bit comparison count.
The node access is incremented once per accessed linked list node.
Each string comparison, even if a mismatch occurs on the ﬁrst character, is 1 string
comparison.
You should create the functionality to store comparisons with stage 4 in mind. You will also be
recording comparisons in the Patricia tree implementation, so your code should be easily applied to
both. For this reason, you may want to extend your search function to include a pointer to a struct
that holds information about the query and results.
Important Notes:
You do not have to implement any spellchecking in the linked list. Your only task at this stage is
to add functionality to count comparisons. You do not need to add this functionality to the deletion of nodes. Only the search will be
assessed. You should, however, be able to recognize how to implement this in your deletion
code.
Example Execution
make -B dict3
./dict3 3 tests/dataset_1.csv output.txt < tests/test1.in
Example Output
The expected output to the output ﬁle would look like:
Carlton -->
COMP20003 Code: 9773, Official Code Suburb: 20495, Official Name Suburb: Carlton, Year: 2021, Official Code State: 2,
South Melbourne -->
With the following printed to stdout:
Carlton --> 1 records - comparisons: b64 n1 s1
South Melbourne --> NOTFOUND
Note: the bit comparisons are 8 times the character comparisons here
Stage 4: Patricia Tree Spellchecker
In Stage 4, you will implement the another dictionary allowing the lookup of data by key ( Suburb
Name ).
Your Makefile should produce an executable program called dict4 . This program should take three
command line arguments:
The ﬁrst argument will be the stage, for this part, the value will always be 4 (for Patricia tree
lookup and comparison counting).
The second argument will be the name of the input CSV data ﬁle.
The third argument will be the name of the output ﬁle.
Your dict4 program should:
Read the data in the same format as assignment 1 and stage 1, and save each entry in a Patricia
tree using the Oﬃcial Name Suburb as the key.
The program should:Accept Official Name Suburb s from stdin , and search the tree for the matching key.
Since your program should act as a simple spellchecker, an exact match may not be
found. Follow the process "mismatch in key" as outlined below to implement
spellchecking logic.
You must output all matching records to the output ﬁle, where a matching record may be
an exact match, or the closest match determined by the mismatch key algorithm. If no
matches for the query are found (i.e. the trie is empty), your program should output
NOTFOUND . When outputting data records, you should follow the guidelines described in
the slide Dataset and Assumptions.
In addition to outputting the record(s) to the output ﬁle, the number of records found and
the comparisons and node accesses when querying should be output to stdout .
Mismatch in Key
Since a Patricia tree is a type of preﬁx tree, any mismatch that occurs will contain the same preﬁx. For
example, say we are searching for "trie", but our dictionary contains "tree". Here, a mismatch occurs
at some binary bit in the third character. This mismatch will only occur once we are past the node in
our tree that contains all preﬁxes with "tr". Therefore, when the mismatch occurs, we can get all
descendants at the mismatched node and calculate the most likely candidate using the edit distance
of the query and a key. This way, if we query "trie", we should return "tree". This is explained with the
following graphic:
The patricia tree contains 4 strings as listed. If we query "trie", the mismatch happens on the bit
highlighted in red. At this node, the only descendant data is "tree", and is therefore determined to be
the closest match.If we query "Tree", the mismatch happens on the third bit. This means that all descendant data is
taken to be possible candidates, so all 4 keys have their edit distance against the query calculated.
"tree" is determined to be the closest match with an edit distance of 1, and is returned.
Tips:
The results struct you created to hold comparisons can hold matching data
A node access count should be incremented each time a new node of the trie is looked at.
If two strings have an equal edit distance, return all results for the ﬁrst suburb name
encountered with that edit distance (this will be the alphabetically earliest result).
Edit Distance
In Stage 4, you will need to calculate the edit distance of multiple strings to determine the closest
match. You are welcome to use the code here to calculate the edit distance - you do not have to worry
about its complexity:
int editDistance(char *str1, char *str2, int n, int m);
int int min(int a, int b, int c);
/* Returns min of 3 integers
reference: https://www.geeksforgeeks.org/edit-distance-in-c/ */
int min(int a, int b, int c) {
if (a < b) {
if(a < c) {
return a;
} else {
return c;
}
} else {
if(b < c) {
return b;
} else {
return c;
}
}
}
/* Returns the edit distance of two strings
reference: https://www.geeksforgeeks.org/edit-distance-in-c/ */
int editDistance(char *str1, char *str2, int n, int m){
assert(m >= 0 && n >= 0 && (str1 || m == 0) && (str2 || n == 0));
// Declare a 2D array to store the dynamic programming
// table
int dp[n + 1][m + 1];
// Initialize the dp table
for (int i = 0; i <= n; i++) {
for (int j = 0; j <= m; j++) {
// If the first string is empty, the only option // is to insert all characters of the second
// string
if (i == 0) {
dp[i][j] = j;
}
// If the second string is empty, the only
// option is to remove all characters of the
// first string
else if (j == 0) {
dp[i][j] = i;
}
// If the last characters are the same, no
// modification is necessary to the string.
else if (str1[i - 1] == str2[j - 1]) {
dp[i][j] = min(1 + dp[i - 1][j], 1 + dp[i][j - 1],
dp[i - 1][j - 1]);
}
// If the last characters are different,
// consider all three operations and find the
// minimum
else {
dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1],
dp[i - 1][j - 1]);
}
}
}
// Return the result from the dynamic programming table
return dp[n][m];
}
Example Execution
make -B dict4
./dict4 4 tests/dataset_15.csv output.txt < tests/test1.in
Example Output
Output ﬁle:
Carlton -->
COMP20003 Code: 9773, Official Code Suburb: 20495, Official Name Suburb: Carlton, Year: 2021, Official Code State: 2,
South Melbourne -->
COMP20003 Code: 390, Official Code Suburb: 22313, Official Name Suburb: South Wharf, Year: 2021, Official Code State:
stdout:
Carlton --> 1 records - comparisons: b58 n4 s1
South Melbourne --> 1 records - comparisons: b52 n5 s1Stage 5: Complexity Report
The ﬁnal deliverable for this assignment is a small report analyzing the complexity of both
dictionaries. You will run various test cases through your program to test its correctness and also to
examine the number of key comparisons used when searching keys across diﬀerent dictionaries. You
will report on the number of comparisons used by your linked list and patricia tree implementations
for diﬀerent data ﬁle inputs and key preﬁxes. You will compare these results with what you expect
based on theory (Big-O).
Your approach should be systematic, varying the size of the ﬁles you use and their characteristics (e.g.
sorted/unsorted, duplicates, range of key space, length of preﬁx, etc.), and observing how the number
of key comparisons varies given diﬀerent sizes of key preﬁxes. Looking at statistical measures about
the results of your tests may help.
UNIX programs such as sort (in particular -R ), head , tail , cat and awk may be valuable. You
may also ﬁnd small C programs (or additional "Stages") useful, the test data you produce and use
does not have to be restricted to the dataset provided.
Additionally, though your C code should run on Ed, you may like to run oﬄine tests or other work
outside of Ed and are welcome to do so.
You will write up your ﬁndings and submit them along with your Assignment Submission in the
Assignment Submission Tab.
Make sure to click the "Mark" button after uploading your report. It is often the case for results to be released
and more than a handful of students to ﬁnd an unexpected 0 mark for the report due to not technically
submitting it for marking.
You are encouraged to present the information you've collected in appropriate tables and graphs.
Select informative data, more is not always better.
In particular, you should present your ﬁndings clearly, in light of what you know about the data
structures used in your programs and in light of their known computational complexity. You may ﬁnd
that your results are what you expected, based on theory. Alternatively, you may ﬁnd your results do
not agree with theory. In either case, you should state what you expected from the theory, and if
there is a discrepancy you should suggest possible reasons. You might want to discuss space-time
trade-oﬀs, if this is appropriate to your code and data.
It is recommended to look at extreme cases where each option will do better or worse - e.g. think
about dense data where the Patricia tree will save many bit comparisons and character sparse data
where string comparisons would terminate quickly in both data structures. It is also highly
recommended to look at the theory covered so far and how you can match it - i.e. if a particular
complexity is expected, that it appears at large input sizes and represents the growth behaviour - not
just single point comparisons. Similarly, pay particular attention to the meaning of variables,
character comparisons, string comparisons and bit comparisons may all have diﬀerent behaviours and certain investigations might expect diﬀerent relations on these.
You are not constrained to any particular structure in this report, but a well laid out plan with a strong
structure tends to lead to the best results. A more general and helpful guiding structure is shown in
the Additional Support slide, but you may ﬁnd it valuable to keep in mind a general ﬂow:
Introduction - Summary of data structures and inputs.
Linked list and Patricia Tree:
- Data (data ﬁle, number of keys, characteristics, summary statistics for relevant metrics)
- Comparison with theory
Discussion
Indicatively, around 4 pages is typical for the report. Though a hard page limit is not imposed. But
please consider that markers will need to read your report so aim to make it reasonably concise if
you can.7. Implementation Requirements
The implementation requirements are broadly the same as for Assignment 1. Though the data structure
requirements diﬀer slightly. The comparison count matching requirements are also slightly varied to reﬂect
diﬀerences in counting that may be equally sensible to the recommended counting method.
The following implementation requirements must be adhered to:
Programming Language: You must write your code in the C programming language.
Record Structure: Each data record (representing a suburb) must be stored in a custom structure
( struct ) that reﬂects the previously mentioned data types. This struct will have member
variables for each ﬁeld (Year, Suburb Code, etc.) with the appropriate data types (int, double,
string).
Linked List Implementation: You must use a linked list to implement the dictionary in Stage 3.
The order in which data records appear in the input ﬁle must be preserved within the linked list.
Patricia Tree Implementation: You must use a Patricia tree to implement the dictionary in Stage 4.
The order in which data records appear in the input ﬁle must be preserved within the Patricia
tree (i.e. duplicates should appear in the same order as the relative order in the original ﬁle).
Modular Design: You must write your code in a modular way. This means your dictionary
operations should be kept together in a separate .c ﬁle, with its own header .h ﬁle, separate
from the main program. Other distinct modules should similarly be separated into their own
sections, requiring as little knowledge of each other's internal details as possible. This facilitates
easier use in other programs without extensive rewriting or copying.
Multiple Dictionary Support: Your code should be easily extensible to handle multiple
dictionaries. The functions for interacting with your dictionary should take arguments that
include not only the values required for the operation but also a pointer to a speciﬁc dictionary
(e.g., search(dictionary, value) ).
Single File Read: Your implementation must read the input ﬁle only once.
Space-Eﬃcient Strings: Your program should store strings in a space-eﬃcient manner. If you use
malloc() to allocate space for a string, remember to allocate enough space for the ﬁnal endof-string
character ( \0 ).
Exact Output Matching: Your outputs for sample tests should be exactly identical to the
expected output, including the order in which the data appears. This means your program's
results must match the provided test cases character-for-character, preserving the order of
elements within the output. Your comparison counts are not required to match the provided
test cases, but you should understand and note why these do not match by noting how you
count these if you do not match the provided test cases.
Makefile : A full Makefile is not provided. The Makefile should direct the compilation of
your program. To use it, ensure it's in the same directory as your code. Type make dict3 to build the ﬁrst program and make dict4 to make the second program. You must submit your
Makefile with your assignment.
Hints:
• If you haven’t used make before, try it on simple programs ﬁrst. If it doesn’t work, read the error messages
carefully. A common problem in compiling multiﬁle executables is in the included header ﬁles. Note also that
the whitespace before the command is a tab, and not multiple spaces.
• It is not a good idea to code your program as a single ﬁle and then try to break it down into multiple ﬁles.
Start by using multiple ﬁles, with minimal content, and make sure they are communicating with each other
before starting more serious coding.8. Programming Style
Below is a style guide which assignments are evaluated against. For this subject, the 80 character
limit is a guideline rather than a rule — if your code exceeds this limit, you should consider whether
your code would be more readable if you instead rearranged it.
/** ***********************
* C Programming Style for Engineering Computation
* Created by Aidan Nagorcka-Smith (aidann@student.unimelb.edu.au) 13/03/2011
* Definitions and includes
* Definitions are in UPPER_CASE
* Includes go before definitions
* Space between includes, definitions and the main function.
* Use definitions for any constants in your program, do not just write them
* in.
*
* Tabs may be set to 4-spaces or 8-spaces, depending on your editor. The code
* Below is ``gnu'' style. If your editor has ``bsd'' it will follow the 8-space
* style. Both are very standard.
*/
/**
* GOOD:
*/
#include <stdio.h>
#include <stdlib.h>
#define MAX_STRING_SIZE 1000
#define DEBUG 0
int main(int argc, char **argv) {
...
/**
* BAD:
*/
/* Definitions and includes are mixed up */
#include <stdlib.h>
#define MAX_STING_SIZE 1000
/* Definitions are given names like variables */
#define debug 0
#include <stdio.h>
/* No spacing between includes, definitions and main function*/
int main(int argc, char **argv) {
...
/** *****************************
* Variables
* Give them useful lower_case names or camelCase. Either is fine,* as long as you are consistent and apply always the same style.
* Initialise them to something that makes sense.
*/
/**
* GOOD: lower_case
*/
int main(int argc, char **argv) {
int i = 0;
int num_fifties = 0;
int num_twenties = 0;
int num_tens = 0;
...
/**
* GOOD: camelCase
*/
int main(int argc, char **argv) {
int i = 0;
int numFifties = 0;
int numTwenties = 0;
int numTens = 0;
...
/**
* BAD:
*/
int main(int argc, char **argv) {
/* Variable not initialised - causes a bug because we didn't remember to
* set it before the loop */
int i;
/* Variable in all caps - we'll get confused between this and constants
*/
int NUM_FIFTIES = 0;
/* Overly abbreviated variable names make things hard. */
int nt = 0

while (i < 10) {
...
i++;
}
...
/** ********************
* Spacing:
* Space intelligently, vertically to group blocks of code that are doing a
* specific operation, or to separate variable declarations from other code.* One tab of indentation within either a function or a loop.
* Spaces after commas.
* Space between ) and {.
* No space between the ** and the argv in the definition of the main
* function.
* When declaring a pointer variable or argument, you may place the asterisk
* adjacent to either the type or to the variable name.
* Lines at most 80 characters long.
* Closing brace goes on its own line
*/
/**
* GOOD:
*/
int main(int argc, char **argv) {
int i = 0;
for(i = 100; i >= 0; i--) {
if (i > 0) {
printf("%d bottles of beer, take one down and pass it around,"
" %d bottles of beer.\n", i, i - 1);
} else {
printf("%d bottles of beer, take one down and pass it around."
" We're empty.\n", i);
}
}
return 0;
}
/**
* BAD:
*/
/* No space after commas
* Space between the ** and argv in the main function definition
* No space between the ) and { at the start of a function */
int main(int argc,char ** argv){
int i = 0;
/* No space between variable declarations and the rest of the function.
* No spaces around the boolean operators */
for(i=100;i>=0;i--) {
/* No indentation */
if (i > 0) {
/* Line too long */
printf("%d bottles of beer, take one down and pass it around, %d
bottles of beer.\n", i, i - 1);
} else {
/* Spacing for no good reason. */
printf("%d bottles of beer, take one down and pass it around."
" We're empty.\n", i); }
}
/* Closing brace not on its own line */
return 0;}
/** ****************
* Braces:
* Opening braces go on the same line as the loop or function name
* Closing braces go on their own line
* Closing braces go at the same indentation level as the thing they are
* closing
*/
/**
* GOOD:
*/
int main(int argc, char **argv) {
...

for(...) {
...
}
return 0;
}
/**
* BAD:
*/
int main(int argc, char **argv) {
...
/* Opening brace on a different line to the for loop open */
for(...)
{
...
/* Closing brace at a different indentation to the thing it's
closing
*/
}
/* Closing brace not on its own line. */
return 0;}
/** **************
* Commenting:
* Each program should have a comment explaining what it does and who created
* it.
* Also comment how to run the program, including optional command line* parameters.
* Any interesting code should have a comment to explain itself.
* We should not comment obvious things - write code that documents itself
*/
/**
* GOOD:
*/
/* change.c
*
* Created by Aidan Nagorcka-Smith (aidann@student.unimelb.edu.au)
13/03/2011
*
* Print the number of each coin that would be needed to make up some
change
* that is input by the user
*
* To run the program type:
* ./coins --num_coins 5 --shape_coins trapezoid --output blabla.txt
*
* To see all the input parameters, type:
* ./coins --help
* Options::
* --help Show help message
* --num_coins arg Input number of coins
* --shape_coins arg Input coins shape
* --bound arg (=1) Max bound on xxx, default value 1
* --output arg Output solution file
*
*/
int main(int argc, char **argv) {
int input_change = 0;

printf("Please input the value of the change (0-99 cents
inclusive):\n");
scanf("%d", &input_change);
printf("\n");

// Valid change values are 0-99 inclusive.
if(input_change < 0 || input_change > 99) {
printf("Input not in the range 0-99.\n")
}

...

/**
* BAD:
*/
/* No explanation of what the program is doing */
int main(int argc, char **argv) { /* Commenting obvious things */
/* Create a int variable called input_change to store the input from
the
* user. */
int input_change;
...
/** ****************
* Code structure:
* Fail fast - input checks should happen first, then do the computation.
* Structure the code so that all error handling happens in an easy to read
* location
*/
/**
* GOOD:
*/
if (input_is_bad) {
printf("Error: Input was not valid. Exiting.\n");
exit(EXIT_FAILURE);
}
/* Do computations here */
...
/**
* BAD:
*/
if (input_is_good) {
/* lots of computation here, pushing the else part off the screen.
*/
...
} else {
fprintf(stderr, "Error: Input was not valid. Exiting.\n");
exit(EXIT_FAILURE);
}
Some automatic evaluations of your code style may be performed where they are reliable. As
determining whether these style-related issues are occurring sometimes involves non-trivial (and
sometimes even undecidable) calculations, a simpler and more error-prone (but highly successful)
solution is used. You may need to add a comment to identify these cases, so check any failing test
outputs for instructions on how to resolve incorrectly ﬂagged issues.9. Additional Support
Your tutors will be available to help with your assignment during the scheduled workshop times.
Questions related to the assignment may be posted on the Ed discussion forum, using the folder tag
Assignments for new posts. You should feel free to answer other students’ questions if you are
conﬁdent of your skills.
A tutor will check the discussion forum regularly and answer some questions. However, be aware
that:
Tutors might answer questions only during working hours. Critically - these do not include the
weekend, so if you have a question on the weekend, a response is much more likely from a
fellow student.
For some questions, you will need to use your judgment and document your thinking. For
example, a question like "How much data should I use for the experiments?" will not be
answered by the tutors. You must try out diﬀerent data sets and see what makes sense for your
experiment. Consider how you can contribute to helping others check their code. Consider what
is justiﬁable and make clear your rationale.
If you have questions about your code speciﬁcally which you feel would reveal too much of the
assignment, feel free to post a private question on the discussion forum.
Helpful Hints
For the report, it is highly recommended you check out the Academic Skills Unit's article on writing
research reports: Research Reports guide. Students looking at the guide while producing their report
typically received on average around 100% more marks.
Additionally, it may be worth putting together the skeleton of the report (including method and what
you expect you'll see) as early as possible. You will likely change many of these assumptions as you
explore the data structures and their performance, but it will tend to highlight areas where you were
wrong in your assumptions. It's also often easier to come up with a comprehensive method when
you don't have to put your plans into action just yet - though of course it makes sense to revise these
plans once you have the understanding from your implementation.10. Assessment
There are a total of 15 marks given for this assignment.
Your C program will be marked on the basis of accuracy, readability, and good C programming
structure, safety and style, including documentation (1 mark). Safety refers to checking whether
opening a ﬁle returns something, whether malloc() s do their job, etc. The documentation should
explain all major design decisions, and should be formatted so that it does not interfere with reading
the code. As much as possible, try to make your code self-documenting by choosing descriptive
variable names.
It is common to lose marks for modularity and eﬃciency because the Requirements slide was not adequately
read, make sure to double check requirements at intervals during the writing of your program and to check Ed
regularly.
The remainder of the code-related marks will be based on the correct functioning of your submission.
The second component of the rubric is about the associated analysis which compares the
performance between the two approaches (Stage 3 and Stage 4) - this analysis may not look too
diﬃcult at ﬁrst glance, but is non-trivial and will likely involve revisiting the lectures from earlier
weeks to successfully prepare.
Note that unlike the ﬁrst assignment, passing test cases will not guarantee you marks - for example, your
approach should be reasonably time eﬃcient (such as sorting and uniqueness checks). Passing test cases is
approximately aligned with demonstrated functionality but does not directly evaluate eﬃciency of code or
careful adherence to the speciﬁcation's Implementation Details, Requirements or Task requirements.

Marks
2
1
4
2
1
1
1
1
1
1
Task
Stage 3 implementation is correct and reasonably efficient.
Stage 3 memory correctly freed and is free of errors.
Stage 4 implementation is correct and reasonably efficient.
Stage 4 memory correctly freed and is free of errors.
Program style consistent with Programming Style slide,
code sufficiently modular. Memory allocations and
file opens checked.
Experimentation follows a sensible and systematic approach.
Analysis covers all relevant cases.
Analysis meaningfully examines complexity claims.
Diagrams are used well to support claims.
Good comparison between theory and empirical results.
UngradingIn Assignment 2, we will continue the ungrading experiment. The experimentation has historically
been an area where many marks were lost, in grading your work, you are encouraged to mark to
reﬂect further work done to engage with the problem using understanding gained in the subject and
in your own research. Since common pitfalls are easy to fall into, you might like to highlight what
work has gone into making sure your results are both convincing and as robust as possible.
Note: Getting perfect marks on Assignment 2 is often time consuming - consider taking advantage of
ungrading if you are spending far too long on part of the assignment. As mentioned in Assignment 1, you may
ﬁnd a partial solution where you have learned a lot is still valuable. However, make sure you give each area
suﬃcient time - e.g. the report is allocated one third of the marks, not completing it because you spent too
long on the other parts is not likely to justify receiving marks for it. The indicative work time is 15 hours, so if
you are nearing this, it is worthwhile planning your time to ensure you learn in all areas.11. Plagiarism
This is an individual assignment. The work must be your own work.
While you may discuss your program development, coding problems and experimentation with your
classmates, you must not share ﬁles, as doing this without proper attribution is considered
plagiarism.
If you have borrowed ideas or taken inspiration from code and you are in doubt about whether it is
plagiarism, provide a comment highlighting where you got that inspiration.
If you refer to published work in the discussion of your experiments, be sure to include a citation to
the publication or the web link.
“Borrowing” of someone else’s code without acknowledgment is plagiarism. Plagiarism is considered
a serious oﬀense at the University of Melbourne. You should read the University code on Academic
integrity and details on plagiarism. Make sure you are not plagiarizing, intentionally or
unintentionally.
You are also advised that there will be a C programming component (on paper, not on a computer) in
the ﬁnal examination. Students who do not program their own assignments will be at a disadvantage
for this part of the examination.12. Late Policy
The late penalty is 10% of the available marks for that project for each day (or part thereof) overdue.
Requests for extensions on medical grounds will need to be supported by a medical certiﬁcate. Any
request received less than 48 hours before the assessment date (or after the date!) will generally not
be accepted except in the most extreme circumstances. In general, extensions will not be granted if
the interruption covers less than 10% of the project duration. Remember that departmental servers
are often heavily loaded near project deadlines, and unexpected outages can occur; these will not be
considered as grounds for an extension.
Students who experience diﬃculties due to personal circumstances are encouraged to make use of
the appropriate University student support services, and to contact the lecturer, at the earliest
opportunity.
Finally, we are here to help! There is information about getting help in this subject on the LMS.
Frequently asked questions about the project will be answered on Ed.
Note that the process for applying for extensions is explained in more detail here:
Please see the following page to understand the Faculty of Engineering and Information Technology's
processes for assessment extensions and special consideration
https://eng.unimelb.edu.au/students/coursework/study-resources/extensions-and-special-consideration13. Dataset Download
The original dataset (dataset_full.csv) contains the following suburbs (each represented as a whiteedged
polygon):
This dataset is then sampled for four more datasets of increasing scale. Early datasets are likely to
reveal issues in your approach, while the larger datasets are more likely to reveal issues with your
memory allocation, etc.
dataset_1.csv - 1 suburbdataset_15.csv - 15 suburbs
dataset_100.csv - 100 suburbs (no map provided)
Note that in Assignment 2 - dataset_100 has all duplicates removed, removing 51 suburbs with COMP20003
Codes:
- 1986, 2856, 3201, 3670, 4278, 4338, 4794, 5549, 5926, 6139, 6165, 6206, 6466, 6499, 6636, 7035, 7057, 7302, 7720,
7750, 7903, 8161, 8205, 8540, 8552, 8972, 9032, 9130, 9272, 9414, 9602, 9993, 10486, 10671, 11249, 11480, 11627, 12364, 12417, 12570, 12786, 12852, 14412, 14446, 14507, 14539, 14551, 14736
This aﬀects the suburbs:
- Cedar Creek, Springﬁeld, Seddon, Windsor, Kensington, Kingswood, Williamstown, Ascot, Red Hill, Richmond,
Ascot, Glenroy, Greenlands, Carlton, Back Creek, Buckingham
dataset_1000.csv - 1000 suburbs (no map provided)14. Assignment Submission
Code Submission:
File Upload: Upload your C code ﬁles (including your Makeﬁle and report and any other ﬁles
required to run your code) here. Your programs must compile and run correctly on Ed.
Cross-Environment Compatibility: While you may have developed your program in another
environment, it still needs to compile and run successfully on Ed at the time of submission. Due
to potential compiler diﬀerences, it's recommended to upload and test your code on Ed
regularly, especially if you're working in a diﬀerent environment.
Missing Files: A common reason for compilation errors is accidentally omitting a ﬁle from the
submission. Please double-check your submission and resubmit all necessary ﬁles if required.
Testing:
Test Cases: Test cases for the ﬁrst 9 marks are available here. The remaining 6 marks will be
assessed manually.
Character Limit: Marking feedback is limited to 50,000 characters, so some output may be
truncated.
Sample Usage: A set of tests is provided for your reference. Here's an example of how to use
them:
./dict3 stage tests/dataset_1.csv output.out < tests/test1.in > output.stdout.out
This command should execute your ﬁrst program ( dict3 ) with the following arguments:
stage : an argument specifying the stage, must be 3 (if the executable is dict3 ) or 4 (if the
executable is dict4 ) for this assignment.
tests/dataset_1.csv : The input data ﬁle
output.out : The ﬁle where your program's output will be written
Redirects input ( < ) from tests/test1.in
Redirects standard output ( > ) to output.stdout.out
The expected outputs in tests/test1.out and tests/test1.stdout.out should then match the
corresponding ﬁles you created ( output.out and output.stdout.out ).
Remaking Your Code: You can remake your code from scratch (the -B ﬂag forces it to ignore
any existing up-to-date elements from the dependencies) using
make -B dict3 dict4The full suite of tests are:
./dict3 3 tests/dataset_1.csv output.out < tests/test1.in > output.stdout.out
./dict3 3 tests/dataset_15.csv output.out < tests/test15.in > output.stdout.out
./dict3 3 tests/dataset_100.csv output.out < tests/test100.in > output.stdout.out
./dict3 3 tests/dataset_1000.csv output.out < tests/test1000.in > output.stdout.out
./dict3 3 tests/dataset_full.csv output.out < tests/testfull.in > output.stdout.out
./dict4 4 tests/dataset_1.csv output.out < tests/test1.in > output.stdout.out
./dict4 4 tests/dataset_15.csv output.out < tests/test15.in > output.stdout.out
./dict4 4 tests/dataset_100.csv output.out < tests/test100.in > output.stdout.out
./dict4 4 tests/dataset_1000.csv output.out < tests/test1000.in > output.stdout.out
./dict4 4 tests/dataset_full.csv output.out < tests/testfull.in > output.stdout.out
Testing with valgrind will typically follow a pattern similar to:
valgrind ./dict3 3 tests/dataset_1000.csv output.out < tests/test1000.in > output.stdout.out
or
valgrind --track-origins=yes --leak-check=full ./dict3 3 tests/dataset_1000.csv output.out < tests/test1000.in > outp
Note that testing using the Mark button may take a while to run as the tests are heavier duty than the tests in
the workshop questions.15. Reﬂection Submission
Question 1
No response
Question 2
No response
Question 3
No response
Question 4
No response
Reﬂection
This form provides an opportunity to both look back at your work to reﬂect on it, as well as evaluate
how you did.
(Correctness) Self-Evaluated Marks
Stage 3 implementation is correct and reasonably eﬃcient. (2 Marks)
Stage 4 implementation is correct and reasonably eﬃcient. (4 Marks)
If you were assessing your assignment against these marking criteria, how many marks do you think
you would give your solution based on its correctness and eﬃciency?
(Correctness) Learning and Challenges
Please include your top lessons learnt and challenges faced in this criterion.
(Correctness) Ideas that Almost Worked Well
If there were any ideas you tried or wanted to try, include them here and explain why they didn't
make it.
(Correctness) New Test Cases Shared on Ed
Tell us about your test cases and why they were useful.Question 5
No response
Question 6
No response
Question 7
No response
Question 8
No response
Question 9
No response
Question 10
(Correctness) Justiﬁcation
Let us know why you assigned yourself the marks you did.
(Memory) Self-Evaluated Marks
Stage 3 memory correctly freed and is free of errors. (1 Mark)
Stage 4 memory correctly freed and is free of errors. (2 Marks)
If you were assessing your assignment against the marking criteria, how many marks do you think
you would give your solution based on its memory mastery?
(Memory) Learning and Challenges
Please include your top lessons learnt and challenges faced in this criterion.
(Memory) Ideas that Almost Worked Well
If there were any ideas you tried or wanted to try, include them here and explain why they didn't
make it.
(Memory) New Test Cases Shared on Ed
Tell us about your test cases and why they were useful.
(Memory) JustiﬁcationNo response
Question 11
No response
Question 12
No response
Question 13
No response
Question 14
No response
Question 15
Let us know why you assigned yourself the marks you did.
(Reﬂection) Self-Evaluated Marks
Experimentation follows a sensible and systematic approach. (1 mark)
Analysis covers all relevant cases. (1 mark)
Analysis meaningfully examines complexity claims. (1 mark)
Diagrams are used well to support claims. (1 mark)
Good comparison between theory and empirical results. (1 mark)
If you were assessing your reﬂection for the assignment, how many marks do you think you would
give your solution based on its quality? (Note that the rubric criteria are designed to capture a broad
range of answers - the strength of ungrading here is that you may be able to more meaningfully
reward work well done).
(Reﬂection) Learning and Challenges
Please include your top lessons learnt and challenges faced in this criterion.
(Reﬂection) Ideas that Almost Worked Well
If there were any ideas you tried or wanted to try, include them here and explain why they didn't
make it.
(Reﬂection) New Test Cases Shared on Ed
Tell us about your test cases and why they were useful.No response
Question 16
No response
Question 17
(Reﬂection) Justiﬁcation
Let us know why you assigned yourself the marks you did.
(General) Understanding Demonstrated
Academic honesty is important, however there are academically honest ways to complete the
assignment which do not demonstrate understanding (e.g. crediting a peer for the full
implementation of your assignment may well be honest, but clearly may not be worth any marks!). In
this question, reﬂect brieﬂy on any assistance you made use of (e.g. Google, ChatGPT, workshop peer
discussion, assignment consultations, Ed discussion, teamwork, prior workshop solutions). In
particular, if you found yourself without this kind of assistance in the future, let us know if you think
you would be able to complete a project with similar challenges. If you like, also let us know what
you think is reasonable (i.e. even compiler feedback could be seen as a form of assistance - one form
which you may not have access to in an exam situation).
Will you click the blue submit button in the top right corner after ﬁlling in these two surveys?
Yes
No16. Submission Certiﬁcate
Question 1
Question 2
In this form you will conﬁrm & certify your submission against the COMP20003 Honour Code and
academic integrity of your submission.
COMP20003 Honour Code
We're committed to the integrity of your work on the COMP20003 course. To do so, every learnerstudent
should:
Submit their own original work
Not share their code (solution) with other students
Did you enjoy the assignment?
(This question is for us to understand the usefulness of this project assignment for the future)
Yes, a lot!
Yes
More or less
Not that much
No, I didn't
I don't want to participate in this question
Do you feel you learned with this assignment?
(This question is for us to understand the usefulness of this project assignment for the future)Question 3
Question 4
Question 5
No response
Yes, I learned quite a lot!
Yes, I learned some things
I learned some, but not much
I learned almost nothing
I don't want to participate in this question
Since you have both your ungrading results and your centralised marks for Assignment 1, which
would you have preferred for that assignment?
Ungrading
Centralised
Which method of marking do you think you'll prefer for Assignment 2?
Ungrading
Centralised
Feel free to share any feedback here
We are interested in any constructive negative or positive feedback, both are helpful to reﬂect what
may need to change and what is working. Good humor and politeness are always welcome! ;-)
Thanks!Question 6
No response
I certify that this was all our original work. If we took any parts from
elsewhere, then they were non-essential parts of the project, and they
are clearly attributed at the top of the ﬁle and in a separate report. I will
show I agree to this honor code by typing "Yes":
This declaration is the same as the one in Khan Academy. We trust you all to submit your own work
only; please don't let us down. If you do, we will pursue the strongest consequences available to us.
You are always better oﬀ getting a very bad mark (even zero) than risking to go that path, as the
consequences are serious for students.
The project will not be marked unless this question is answered correctly and exactly with "Yes"

vx_codinghelp

关注

1
点赞
踩
3

收藏

觉得还不错? 一键收藏
0
评论
COMP20003 Algorithms and Data Structures

0. GeneralPurposetrie.form:Operatora & ba | ba |= ba << ba >> bMeaninggetBit/*number.*//**/
复制链接

扫一扫