CSSE2310 - Semester 2, 2024 Assignment 1 (Version 1.2)


The goal of this assignment is to give you practice at C programming. You will be building on this ability in the remainder of the course (and subsequent programming assignments will be more difficult than this one). You are to create a program (called uqentropy) that will determine the strength of possible passwords entered on stdin based on their presence in a set of known passwords (from one or more files), on the length and characteristics of the possible passwords, or on both. The assignment will also test your ability to code to a particular programming style guide, and to use a revision control system appropriately.

Specification
The uqentropy program will read candidate passwords from standard input (stdin) and evaluate their strength by determining their entropy, which is a measure of randomness or uncertainty. The more “random” a password, the harder it is to guess, so the stronger it is.
Two different approaches will be used by uqentropy to calculate entropy.

Entropy
The entropy (E) of a password can be defined as log2(N) where N is the number of possible passwords, i.e. the number of possible combinations of the set of symbols used to create the password. N can be defined as S to the power of L (i.e. S^L), where S is the size of the set of symbols and L is the length of the password. Entropy (E) is measured in bits. A password with 10 bits of entropy would be equivalent in strength to a set of 10 bits chosen perfectly randomly: it would require 2^10 guesses to guarantee guessing the password. Each extra bit of entropy makes a password twice as hard to guess.
For example, a password known to be constructed from 8 lower case letters (a to z only) has an entropy of E = log2(26^8) ≈ 37.6 bits. In this case, S = 26 (the number of symbols in the set, lower case letters in this case), and L = 8 (the length of the password). As another example, if a password is known to be constructed from a set of symbols including lower case letters, upper case letters and digits (62 possible symbols) and has length 10, then the entropy of the password is E = log2(62^10) ≈ 59.5 bits.
Another way of thinking about entropy is based on how many guesses it would take to guess the password using a brute force attack of all possible passwords. The expected number of guesses (n) is half the number of possible passwords (N). Sometimes a brute force attack will get it right quite quickly, other times you’ll need to check nearly all possible passwords. On average, you’ll need to check 50% of the possible passwords. Using the formula above (E = log2(N)), if you consider N = 2 × n then you can define entropy as E = log2(2 × n) (where n is the expected number of guesses needed to guess the password).
This approach gives a better estimate of entropy when a password isn’t truly random. For example, if an attacker uses a list of common passwords in a brute force attack then they will be able to guess the password quite quickly if a common password is in use. Suppose that a user uses the password “password” and that this is entry number 7 on a list of common passwords used by an attacker. The expected number of guesses to brute force this password would be 7. The entropy of this password calculated using this approach would therefore be E = log2(2 × 7) ≈ 3.8 bits, significantly lower than the value of 37.6 determined above for a combination of 8 lower case letters.
Password strength
We will use the following ratings of passwords based on their entropy:
 <35 – very weak
 35 to < 60 – weak
 60 to < 120 – strong
 120 or higher – very strong
These are somewhat arbitrary but they provide reasonable guidance to users.
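The rating thresholds can be expressed as a small lookup. This is only a sketch: the Strength enum and the function names are hypothetical, and the strings returned are the rating names alone, not the full output lines required later in this document.

```c
/* Hypothetical names for the four ratings listed above. */
typedef enum { VERY_WEAK, WEAK, STRONG, VERY_STRONG } Strength;

/* Maps an entropy value (in bits) to a rating using the thresholds
 * 35, 60 and 120. */
Strength rate_entropy(double entropy)
{
    if (entropy < 35.0) {
        return VERY_WEAK;
    }
    if (entropy < 60.0) {
        return WEAK;
    }
    if (entropy < 120.0) {
        return STRONG;
    }
    return VERY_STRONG;
}

/* Returns the rating name for printing. */
const char *strength_name(Strength strength)
{
    static const char *const names[]
            = {"very weak", "weak", "strong", "very strong"};
    return names[strength];
}
```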

Command Line Arguments
Your uqentropy program is to accept command line arguments as follows:
./uqentropy [--leetspeak] [--checkcase] [--doubleup] [--add-digits numdigits] [listfilename …]
The square brackets ([]) indicate optional arguments or groups of arguments. The italics indicate placeholders for user-supplied value arguments (numdigits and listfilename here). An ellipsis (…) indicates the previous argument can be repeated.
uqentropy expects zero or more option arguments to follow the command name (with an associated value argument in the case of --add-digits). Option arguments can be in any order. Zero or more filenames (password lists) will follow these, though if an option argument is given then at least one listfilename argument must be provided. It is acceptable to run uqentropy with no command line arguments. It is also possible to provide one or more listfilename arguments without any option arguments.
Some examples of how the program might be run include the following:
./uqentropy
./uqentropy /usr/share/dict/words
./uqentropy --checkcase ./passwordfile.txt
./uqentropy --doubleup --leetspeak --checkcase ./passwordfile.txt
./uqentropy --leetspeak list1 list2 list3 /usr/share/dict/words
./uqentropy --add-digits 5 words.txt
The meaning of the arguments is as follows. More details on the expected behaviour of the program are provided later in this document.
--checkcase - this argument specifies that each candidate password (from stdin) will be compared not just against the passwords in the given file(s) but also against any variants of those passwords obtainable by changing the case of any subset of the letters in those passwords (i.e. each letter can be either upper or lower case). If this argument is present then at least one filename must be given on the command line.
--leetspeak - this argument specifies that when we compare the candidate password against those in the password list(s), then we will also check various letter substitutions as commonly used in ‘leetspeak’. If this argument is present then at least one filename must be given on the command line.
--add-digits - this argument specifies that when we compare the candidate password against those in the password list(s), then we will also check those passwords with any combination of 1 to numdigits digits appended. If the --add-digits argument is present (with its associated value argument) then at least one filename must be given on the command line.
--doubleup - this argument specifies that we should also compare the candidate password against any combination of two passwords from the password list(s). If the --doubleup argument is present then at least one filename must be given on the command line.
listfilename … - filenames specified at the end of the command line are assumed to be the names of files containing password lists against which the candidate password(s) are to be compared. There may be zero or more of these file names specified. It can be assumed that the first (or only) file name argument does not start with the characters “--”. (If the user wants to specify a file whose name does start with “--” then they should prefix the name with “./”.) Any arguments after the first file name are also assumed to be file names, even if they begin with “--”.
Prior to doing anything else, your program must check the command line arguments for validity. If the program receives an invalid command line then it must print the (single line) message:
Usage: ./uqentropy [--leetspeak] [--checkcase] [--doubleup] [--add-digits 1…6] [listfilename …]
to standard error (with a following newline), and exit with an exit status of 13.
Invalid command lines include (but may not be limited to) any of the following:
 The --add-digits option argument is given but it is not followed by a single digit value argument which is an integer between 1 and 6 inclusive.
 One (or more) of --checkcase, --leetspeak, --add-digits (with a value), and/or --doubleup is specified but no file name is given on the command line.
 Any of the option arguments is listed more than once.
 An unexpected argument is present.
 Any argument is the empty string.
Checking whether files exist or can be opened is not part of the usage checking (other than checking that filename values are not empty). This is checked after command line validity as described below.
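The usage rules above can be checked in a single pass over argv. The following is a rough sketch under stated assumptions rather than a reference implementation: the Options struct, its field names and parse_command_line are all hypothetical, and any argument that does not begin with “--” is treated as the first file name.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
typedef struct {
    bool leetspeak;
    bool checkcase;
    bool doubleup;
    int addDigits;  /* 0 if --add-digits was not given, otherwise 1 to 6 */
    int numFiles;
    char **files;   /* points into argv; nothing is copied */
} Options;

/* Returns true and fills in *opts if the command line is valid.
 * Returns false if a usage error should be reported. */
bool parse_command_line(int argc, char **argv, Options *opts)
{
    memset(opts, 0, sizeof(*opts));
    bool anyOption = false;
    int i = 1;
    /* Option arguments come first, in any order. */
    for (; i < argc && strncmp(argv[i], "--", 2) == 0; i++) {
        bool *flag = NULL;
        if (strcmp(argv[i], "--leetspeak") == 0) {
            flag = &opts->leetspeak;
        } else if (strcmp(argv[i], "--checkcase") == 0) {
            flag = &opts->checkcase;
        } else if (strcmp(argv[i], "--doubleup") == 0) {
            flag = &opts->doubleup;
        }
        if (flag != NULL) {
            if (*flag) {
                return false; /* option listed more than once */
            }
            *flag = true;
        } else if (strcmp(argv[i], "--add-digits") == 0) {
            if (opts->addDigits != 0 || i + 1 >= argc) {
                return false; /* repeated, or value argument missing */
            }
            const char *value = argv[++i];
            if (strlen(value) != 1 || value[0] < '1' || value[0] > '6') {
                return false; /* not a single digit from 1 to 6 */
            }
            opts->addDigits = value[0] - '0';
        } else {
            return false; /* unexpected argument */
        }
        anyOption = true;
    }
    /* Everything that remains is a file name; none may be empty. */
    opts->numFiles = argc - i;
    opts->files = argv + i;
    for (int j = i; j < argc; j++) {
        if (argv[j][0] == '\0') {
            return false;
        }
    }
    /* Any option argument requires at least one file name. */
    return !(anyOption && opts->numFiles == 0);
}
```

A caller would print the usage message to stderr and exit with status 13 whenever this function returns false.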

File Checking
Your program must check that any password list files mentioned on the command line can be opened for reading and contain at least one valid password. If a file cannot be opened for reading, then your program must print the message:
uqentropy: unable to read from file "filename"
to standard error (with a following newline) where filename is replaced by the name of the file from the command line. The double quotes must be present.
If a file can be opened for reading, then the contents of the file must be read into memory. Each file is assumed to contain a list of possible passwords where each entry is separated by any number of whitespace characters. Whitespace characters at the start or end of the file are to be ignored. Valid passwords must contain at least one character and will never contain whitespace characters (since these separate passwords) nor any non-printable characters.
If an openable file is found to contain any non-printable characters (besides whitespace characters that separate passwords), then your program must print the message:
uqentropy: invalid character found in file "filename"
to stderr (with a following newline) where filename is replaced by the name of the file from the command line. The double quotes must be present. This message must be printed at most once per file, even if the file contains multiple non-printable characters. (This means uqentropy can stop reading a file as soon as such an error is detected.)
If an openable file passes the check above (i.e. it contains no non-printable characters) but is found to contain no valid passwords, then your program must print the message:
uqentropy: no valid passwords found in file "filename"
to stderr (with a following newline) where filename is replaced by the name of the file from the command line. The double quotes must be present.
Every file name listed must be checked and read into memory in turn, in the order they appear on the command line. Multiple error messages may be printed, but at most one per file.
If any error occurs, then, after checking all files, your program must exit with an exit status of 15.
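The two content checks can be done in one scan. The sketch below assumes the file contents have already been read into a buffer of known length; the function name count_list_entries and its return convention are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Scans file contents held in memory.
 * Returns -1 if a non-printable, non-whitespace character is found,
 * otherwise the number of whitespace-separated entries (0 means the
 * file contains no valid passwords). */
long count_list_entries(const char *data, size_t length)
{
    long entries = 0;
    int inEntry = 0;
    for (size_t i = 0; i < length; i++) {
        unsigned char c = (unsigned char)data[i];
        if (isspace(c)) {
            inEntry = 0; /* whitespace ends the current entry */
        } else if (!isprint(c)) {
            return -1;   /* stop at the first invalid character */
        } else if (!inEntry) {
            inEntry = 1; /* first character of a new entry */
            entries++;
        }
    }
    return entries;
}
```

A return of -1 would correspond to the “invalid character” message and a return of 0 to the “no valid passwords” message.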

Program Behaviour
Program Startup
Assuming that the checks above are successful, then your program must print the following to standard output (stdout), with a newline at the end of each line:
Welcome to UQEntropy
Written by s4903603.
Enter candidate passwords to check their strength.
Your program must then repeatedly read lines from stdin (each terminated by a newline character or pending EOF). If the line (excluding any terminating newline) is a valid candidate password then it will be evaluated for its strength. Valid candidate passwords will (a) have at least one character, (b) only contain printable characters, and (c) not contain any whitespace characters. If the line entered is not a valid candidate password then your program must print the following message (terminated by a newline) to stderr and attempt to read another candidate password from stdin:
Invalid candidate password
If a valid candidate password is entered, then your program must then calculate the entropy of the password using the first method below, and possibly also the second method, depending on the command line arguments to the program.
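The three validity conditions translate directly into a character scan. This is a minimal sketch that assumes the terminating newline has already been removed from the line; is_valid_candidate is a hypothetical name.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if the line is non-empty and every character is
 * printable and not whitespace. */
bool is_valid_candidate(const char *line)
{
    if (line[0] == '\0') {
        return false; /* (a) at least one character */
    }
    for (const char *p = line; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (!isprint(c) || isspace(c)) {
            return false; /* (b) printable only, (c) no whitespace */
        }
    }
    return true;
}
```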

Entropy Calculation 1 - Based on Symbols Used
If a valid candidate password has been entered then its entropy must be calculated based on the symbols used in the password and the size of the sets they belong to. Your program must examine the candidate password to determine the type(s) of symbols used. The symbol set size must be determined from the following table, using the row that matches the set of symbols used in the candidate password.
Symbols Used                                                        Set Size
Digits in the range 0 to 9 only                                     10
Lower case letters (a to z) only                                    26
Upper case letters (A to Z) only                                    26
Non alphanumeric printable ASCII characters only, e.g., any of
  ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~   32
Any combination of the above sets                                   Sum of the set sizes
The entropy (E1) of the candidate password should then be calculated as described in the Entropy section above, i.e. E1 = log2(S^L) = L × log2(S) where S is the size of the set of symbols used and L is the length of the candidate password. For example, if the candidate password is “abcd3f6h!j” then S = 68 (26 for lowercase letters plus 10 for digits plus 32 for other ASCII characters) and L = 10, so the entropy is E1 = log2(68^10) = 10 × log2(68) ≈ 60.9. If the candidate password is “a1B2c3%”, then the entropy is E1 = log2((26 + 26 + 10 + 32)^7) = log2(94^7) = 7 × log2(94) ≈ 45.9. It is best to use the second variant of the formula when computing this, i.e. E1 = L × log2(S), to avoid overflow when calculating S^L.
Entropy Calculation 2 - Based on Matching Passwords From List(s)
If any password lists are provided on the command line then uqentropy will check whether the candidate password can be found in the list(s) and will determine the match number, i.e., determine the number of guesses an attacker would need to make before matching the candidate password, if the attacker was using the given password list(s) and any secondary matching approach specified on the command line. If a match is found, then no further checking is performed and the entropy can then be calculated based on the match number. Attempted matching approaches are checked in the order listed here. See the Hints section for some suggestions on how to do this matching.

 Each valid entry in the list(s) is checked in turn, i.e., each valid entry in the first password list file is checked in the order that they appear in the file, followed by each valid entry in the second password list file (if specified), etc. If an exact match is found, then no further checking is performed. See below for how entropy is then calculated when a match is found.
 If no exact match has been found after checking each valid entry and --checkcase has been specified on the command line, then every valid entry in the password list(s) that contains one or more letters is checked again (in the same order as before) using all combinations of upper and lower case letters in place of the letters used in the entry. For example, if an entry in a password list file is abc123, then your program must determine whether the candidate password matches any of the following: Abc123, aBc123, abC123, ABc123, AbC123, aBC123, or ABC123. If a match is found then no further checking is performed. The match number is incremented as if all combinations of letters (other than the original) were checked, i.e. 2^n − 1 is added to the match number, where n is the number of letters in the password entry. (In this example, 7 would be added to the match number when this entry is checked.)
 If no match has yet been found and --add-digits has been specified on the command line (with value n), then every entry in the password list(s) that does not end in a digit is checked again (in the same order as before) using all combinations of 1 to n digits, appended to the entry. For example, if an entry in the password list file is password and --add-digits 2 is specified on the command line, then your program must determine whether the candidate password matches password0, password1, …, password9, password00, password01, …, password99. If a match is found then no further checking is performed. The match number is incremented based on the number of passwords checked. For example in this case, if the candidate password is password13, then the match number would be incremented by 24 (for 0,1,…,9,00,01,…,12,13). The match number would be incremented by 10^1 + 10^2 + … + 10^n if no match is possible against that password entry.
 If no match has yet been found and --doubleup has been specified on the command line, then it must be checked whether the candidate password will match the concatenation of every pair of valid entries in the password list(s). For example, if the password list contains the passwords password, 123456, and abc123, then your program must determine whether the candidate password will match the values passwordpassword, password123456, passwordabc123, 123456password, 123456123456, 123456abc123, abc123password, abc123123456, or abc123abc123. The match number is incremented based on the number of passwords that would need to be checked to find a match, or n^2 if no matches are found (where n = number of valid entries in the password list(s)). Passwords must be assumed to be checked in the order shown here, i.e., all combinations that begin with the first entry would be checked before all combinations that begin with the second entry, etc.
 If no match has yet been found and --leetspeak has been specified on the command line, then every entry in the password list(s) that contains any of the following letters (lower or upper case) is checked again (in the same order as before) using all combinations of ‘LEETspeak’ substitutions shown in the following table:
Letter  Substitute Symbol
a/A     @ or 4
b/B     6 or 8
e/E     3
g/G     6 or 9
i/I     1 or !
l/L     1
o/O     0
s/S     5 or $
t/T     7 or +
x/X     %
z/Z     2

For example, if an entry in a password list file is qwertyui, then your program must determine whether the candidate password matches any of the following: qw3rtyui, qwer7yui, qwer+yui, qwertyu1, qwertyu!, qw3r7yui, qw3r+yui, qw3rtyu1, qw3rtyu!, qwer7yu1, qwer7yu!, qwer+yu1, qwer+yu!, qw3r7yu1, qw3r7yu!, qw3r+yu1, qw3r+yu!. If a match is found then no further checking is performed. The match number is incremented as if all combinations of substitutions (other than the original) were checked, i.e. 2^a × 3^b − 1 is added to the match number, where a is the number of letters in the password entry that can be substituted by a single character and b is the number of letters in the password entry that can be substituted by two characters. (In this example, 17 would be added to the match number when this entry is checked because a is 1 (the ‘e’ can be substituted by one character (‘3’)) and b is 2 (both the ‘t’ and the ‘i’ can be substituted by two characters) and 2^1 × 3^2 − 1 = 17.)
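The per-entry increments described in the matching steps above can be computed without enumerating every variant. The sketch below covers the --checkcase, --add-digits (no match possible) and --leetspeak increments. The function names are hypothetical, the caller is assumed to skip entries that each step does not apply to (for example, entries ending in a digit for --add-digits), and the letter groups are taken from the substitution table, reading the ‘3’ row as e/E in line with the worked example.

```c
#include <ctype.h>
#include <string.h>

/* --checkcase: 2^n - 1, where n is the number of letters in the entry. */
unsigned long checkcase_increment(const char *entry)
{
    unsigned long variants = 1;
    for (; *entry != '\0'; entry++) {
        if (isalpha((unsigned char)*entry)) {
            variants *= 2; /* each letter is either upper or lower case */
        }
    }
    return variants - 1;
}

/* --add-digits with no match possible: 10^1 + 10^2 + ... + 10^n. */
unsigned long add_digits_increment(int numDigits)
{
    unsigned long total = 0;
    unsigned long power = 1;
    for (int k = 1; k <= numDigits; k++) {
        power *= 10;
        total += power;
    }
    return total;
}

/* --leetspeak: 2^a * 3^b - 1, where a counts letters with one
 * substitute and b counts letters with two substitutes. */
unsigned long leetspeak_increment(const char *entry)
{
    unsigned long variants = 1;
    for (; *entry != '\0'; entry++) {
        int c = tolower((unsigned char)*entry);
        if (strchr("eloxz", c) != NULL) {
            variants *= 2; /* original letter or its single substitute */
        } else if (strchr("abgist", c) != NULL) {
            variants *= 3; /* original letter or one of two substitutes */
        }
    }
    return variants - 1;
}
```

For the examples in the text, checkcase_increment("abc123") is 7, add_digits_increment(2) is 110 and leetspeak_increment("qwertyui") is 17.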
If a match is found then the following message must be printed to stdout:
Candidate password would be matched on guess number N
If a match is not found then the following message must be printed to stdout:
No match would be found after checking N passwords
In both cases, the message is followed by a newline and N is replaced by the match number (the number of passwords that would have to be checked using the given combination of uqentropy option arguments and password lists). You can assume that the match number will never overflow a 64 bit unsigned integer, i.e. an unsigned long type on moss (i.e. we will not test situations where this overflow happens). Neither of these messages is printed if there are no password list filenames given on the uqentropy command line.
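When a match does occur in the --add-digits or --doubleup step, its position can also be computed arithmetically and accumulated in an unsigned long. The following is a sketch only: both function names are hypothetical, and a return value of 0 means “no match here”, in which case the caller would add the no-match increment instead.

```c
#include <ctype.h>
#include <string.h>

/* Number of guesses made against one entry, up to and including the
 * candidate, when 1 to maxDigits digits are appended. Returns 0 if the
 * candidate cannot be produced from this entry. */
unsigned long add_digits_match_offset(
        const char *entry, const char *candidate, int maxDigits)
{
    size_t entryLen = strlen(entry);
    size_t candLen = strlen(candidate);
    if (entryLen == 0 || isdigit((unsigned char)entry[entryLen - 1])
            || candLen <= entryLen
            || candLen - entryLen > (size_t)maxDigits
            || strncmp(entry, candidate, entryLen) != 0) {
        return 0;
    }
    unsigned long value = 0;   /* numeric value of the digit suffix */
    unsigned long shorter = 0; /* count of all shorter digit suffixes */
    unsigned long power = 1;
    for (size_t j = entryLen; j < candLen; j++) {
        if (!isdigit((unsigned char)candidate[j])) {
            return 0;
        }
        value = value * 10 + (unsigned long)(candidate[j] - '0');
        if (j > entryLen) {
            power *= 10;
            shorter += power;
        }
    }
    return shorter + value + 1;
}

/* Position of the first pair (in the order described for --doubleup)
 * whose concatenation equals the candidate, or 0 if there is none. */
unsigned long doubleup_match_offset(
        const char *const *entries, size_t count, const char *candidate)
{
    size_t candLen = strlen(candidate);
    for (size_t i = 0; i < count; i++) {
        size_t firstLen = strlen(entries[i]);
        if (firstLen >= candLen
                || strncmp(candidate, entries[i], firstLen) != 0) {
            continue; /* candidate does not start with this entry */
        }
        for (size_t j = 0; j < count; j++) {
            if (strcmp(candidate + firstLen, entries[j]) == 0) {
                return (unsigned long)(i * count + j + 1);
            }
        }
    }
    return 0;
}
```

With the entry password and a limit of 2 digits, the candidate password13 yields 24, as in the --add-digits example; with the three-entry list from the --doubleup example, 123456abc123 is the sixth combination.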
If a match would be found during the checks above then the entropy (E2) of the password is calculated as described in the Entropy section above, i.e., E2 = log2(2 × n) where n is the match number, i.e. the number of passwords that would need to be checked to find a match using the given combination of option arguments and password lists. If no match is found then the entropy E2 cannot be calculated.

Calculating Overall Entropy and Password Strength
If a password match was found, i.e. E2 was able to be calculated, then the “overall entropy” (E) of the password will be the minimum of E1 and E2, i.e. E = min(E1,E2).
If no password match would be found (or no password lists or option arguments were provided on the command line), then the overall entropy (E) will just be E1 as described above.
Your program must output the overall entropy for the candidate password by printing the following message to stdout (with a trailing newline):
Password entropy: E
where E is replaced by the overall entropy, rounded down to the nearest 0.1. (One digit must always be shown after the decimal point.)
Your program must then print one of the following messages to stdout (with a trailing newline):
 Password classification: very weak – if the overall entropy is < 35
 Password classification: weak – if the overall entropy is in the range 35 to < 60
 Password classification: strong – if the overall entropy is in the range 60 to < 120
 Password classification: very strong – if the overall entropy is ≥ 120
Your program must then attempt to read another candidate password from stdin and repeat the process above.

Exiting the Program
If end-of-file (EOF) is detected on stdin when your program goes to read a line then your program must exit. If at least one candidate password rated “strong” or “very strong” has been entered, then your program must exit with status 0. If no “strong” or “very strong” password has been entered (including the case when no valid passwords are entered), then your program must print the following message to stdout (with a terminating newline):
No strong password(s) have been entered
and exit with exit status 6.

Other Requirements
Your program must open and read each password list file only once and store its contents in dynamically allocated memory. Your program must free all allocated memory before exiting, including if it exits due to a usage or file error. (Your program does not have to free memory if it exits due to a signal, e.g. the interrupt signal generated by pressing Ctrl-C.)
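One possible way to hold the entries in dynamically allocated memory is a growable array of heap-allocated strings with a single clean-up function. This is a sketch under assumptions: the PasswordList type, its fields and both function names are hypothetical, and the initial capacity of 16 is arbitrary.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of password entries. */
typedef struct {
    char **entries;
    size_t count;
    size_t capacity;
} PasswordList;

/* Appends a copy of word to the list, growing the array as needed.
 * Returns false if memory could not be allocated. */
bool list_append(PasswordList *list, const char *word)
{
    if (list->count == list->capacity) {
        size_t newCapacity = list->capacity ? list->capacity * 2 : 16;
        char **grown = realloc(list->entries, newCapacity * sizeof(*grown));
        if (grown == NULL) {
            return false;
        }
        list->entries = grown;
        list->capacity = newCapacity;
    }
    size_t size = strlen(word) + 1;
    char *copy = malloc(size);
    if (copy == NULL) {
        return false;
    }
    memcpy(copy, word, size);
    list->entries[list->count++] = copy;
    return true;
}

/* Frees every entry and the array itself, leaving the list empty. */
void list_free(PasswordList *list)
{
    for (size_t i = 0; i < list->count; i++) {
        free(list->entries[i]);
    }
    free(list->entries);
    list->entries = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

Calling list_free on every exit path, including the usage and file error paths, is one way to meet the requirement above.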
