07. Lists & Maps

Null

A reference variable may not be pointing at any object; in that case its value is null

- Not an object

- A reserved literal value for reference type

Just like "true" or "false" for boolean type

- For reference types, default value is "null"


Local reference variables are not automatically initialized to "null"

- You must initialize them by either calling a constructor or setting them to null

- In fact, Java assigns no default value to a local variable

- FYI: Empty String is actually a String object, not null

Note: Be aware of NullPointerException
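The points above can be sketched in a few lines. This is only an illustration: the class name NullDemo and the helper safeLength are made up for this example and are not part of any library.

```java
class NullDemo {
    // A field of reference type gets the default value null.
    // (A local variable would get no default and must be assigned before use.)
    static String name;

    // Hypothetical helper: returns the length of s, treating null as length 0.
    static int safeLength(String s) {
        return (s == null) ? 0 : s.length();
    }

    public static void main(String[] args) {
        String empty = "";      // a real String object with zero characters
        String missing = null;  // refers to no object at all

        System.out.println(name == null);         // the field was never assigned
        System.out.println(empty.length());       // legal: empty is an object
        System.out.println(safeLength(missing));  // the null check avoids an exception

        try {
            missing.length();   // calling a method through a null reference
        } catch (NullPointerException e) {
            System.out.println("NullPointerException caught");
        }
    }
}
```

Checking for null before calling a method, as safeLength does, is one common way to avoid the exception.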


java.util.ArrayList

Implements an expandable array

- Internally, maintains an array

- No direct access to the array itself (protect data)

- Instead, it provides methods to manipulate the array

- When the array is full, it creates a new larger array and copies the data from old (smaller) array into the new (larger) array


Automatically handles issues relating to length

1. Importing: import java.util.ArrayList; import java.util.List;

2. Allocation of space: List<String> a = new ArrayList<String>();

3. Appending (adding to the end): String item = ...; a.add(item);

4. Inserting before the first element: a.add(0, item);

5. Iterating (for each): for (String s : a) { ... }

6. Deleting: a.remove(i);

7. Searching: int i = a.indexOf(item); a.contains(item);
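The seven steps above, put together in one small program. The class name ArrayListDemo and the sample names are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayListDemo {
    // Builds a small list using append and insert-before-first.
    static List<String> build() {
        List<String> a = new ArrayList<String>();  // allocation
        a.add("Bob");                              // append to the end
        a.add("Carol");                            // append to the end
        a.add(0, "Alice");                         // insert before the first element
        return a;
    }

    public static void main(String[] args) {
        List<String> a = build();

        for (String s : a) {                       // for each
            System.out.println(s);
        }

        int i = a.indexOf("Carol");                // position of the item, or -1 if absent
        boolean hasBob = a.contains("Bob");        // membership test
        System.out.println(i + " " + hasBob);

        a.remove(i);                               // delete the element at index i
        System.out.println(a.size());
    }
}
```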


ArrayList Manages an Array

Makes an internal array of length 10 - this is the default capacity

Maintains a separate count of the number of elements referenced by the array (the user's perception of size)

If adding an element exceeds the capacity, allocate a new array that is 50% bigger and copy data from the old array into (the beginning of) the new array
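A minimal sketch of the grow-and-copy idea described above, assuming a list of ints for simplicity. IntArrayList is a hypothetical class written for this note; it is not the actual source of java.util.ArrayList.

```java
import java.util.Arrays;

class IntArrayList {
    private int[] data = new int[10];  // internal array; 10 is the starting capacity
    private int size = 0;              // number of elements the user has added

    void add(int value) {
        if (size == data.length) {
            // Full: allocate an array 50% bigger and copy the old contents
            // into the beginning of it.
            int newCapacity = data.length + data.length / 2;
            data = Arrays.copyOf(data, newCapacity);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return data[index];
    }

    int size() { return size; }             // what the user sees
    int capacity() { return data.length; }  // what is actually allocated

    public static void main(String[] args) {
        IntArrayList list = new IntArrayList();
        for (int i = 0; i < 11; i++) {
            list.add(i);   // the 11th add triggers one reallocation
        }
        System.out.println(list.size() + " elements, capacity " + list.capacity());
    }
}
```

Because the array grows by a fixed fraction rather than by one slot, reallocations become rarer as the list gets larger, which is why appending is cheap on average.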


java.util.LinkedList

Elements are referenced by Node objects

To add an element, a new Node object is allocated and inserted (using links in the chain) at the beginning, end or middle of the list
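A rough sketch of the node chain. SimpleLinkedList is a hypothetical class for illustration only; it links nodes in one direction to stay short, whereas java.util.LinkedList is doubly linked.

```java
class SimpleLinkedList {
    // Each Node references one element and the next Node in the chain.
    private static class Node {
        String item;
        Node next;
        Node(String item, Node next) {
            this.item = item;
            this.next = next;
        }
    }

    private Node head;  // first node, or null if the list is empty
    private Node tail;  // last node, or null if the list is empty
    private int size;

    // Adding at the beginning: the new node points at the old head.
    void addFirst(String item) {
        head = new Node(item, head);
        if (tail == null) {
            tail = head;
        }
        size++;
    }

    // Adding at the end: the old tail points at the new node.
    void addLast(String item) {
        Node n = new Node(item, null);
        if (tail == null) {
            head = n;
        } else {
            tail.next = n;
        }
        tail = n;
        size++;
    }

    // Lookup by position has to walk the chain from the head.
    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.item;
    }

    int size() { return size; }

    public static void main(String[] args) {
        SimpleLinkedList list = new SimpleLinkedList();
        list.addLast("b");
        list.addFirst("a");
        list.addLast("c");
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}
```

Neither add method copies or shifts existing elements, but get must follow links one by one.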


O(1) means an operation takes at most c time, no matter how many elements there are

O(log n) means an operation takes at most c * log n time

O(n) means an operation takes at most c * n time

O(n log n) means an operation takes at most c * n * log n time

O(n^2) means an operation takes at most c * n^2 time

* n is the number of elements and c is some constant; the bounds describe how the time grows once n is large
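To make the difference between O(n) and O(log n) concrete, the sketch below counts comparisons instead of measuring time. StepCount and its two methods are hypothetical names for this example; both assume the array is sorted in increasing order.

```java
class StepCount {
    // Linear search: looks at elements one by one, so up to n comparisons.
    static int linearSteps(int[] sorted, int target) {
        int steps = 0;
        for (int v : sorted) {
            steps++;
            if (v == target) {
                break;
            }
        }
        return steps;
    }

    // Binary search: halves the remaining range each time, so about log2(n) comparisons.
    static int binarySteps(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        int steps = 0;
        while (lo <= hi) {
            steps++;
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] == target) {
                break;
            }
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Searching for the last element is the slow case for linear search.
        System.out.println("linear: " + linearSteps(data, 1023));
        System.out.println("binary: " + binarySteps(data, 1023));
    }
}
```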


The List Interface

Specifies common methods that must be implemented by all lists

Implemented by ArrayList, Vector, LinkedList, etc.


Java Interfaces

A Java interface allows you to specify methods that must be implemented by a class

You can then use references to the interface, knowing the methods will be available, even though you don't know the specific class used

An Interface reference variable can reference any object that implements that interface regardless of its class type
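As a small illustration, the method below is written against the List interface only, so it works with any class that implements List. InterfaceDemo and totalLength are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class InterfaceDemo {
    // Accepts any List implementation; only methods promised by the interface are used.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<String>();   // interface reference, ArrayList object
        a.add("ab");
        a.add("cde");

        List<String> b = new LinkedList<String>();  // same interface, different class
        b.add("ab");
        b.add("cde");

        System.out.println(totalLength(a));
        System.out.println(totalLength(b));
    }
}
```

Declaring variables and parameters as List rather than ArrayList makes it possible to switch implementations later by changing a single line.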


Generics in Java

Java allows you to specify a class that manipulates instances of some type <T>

- Using type parameters

The compiler can now check for:

- Inserting objects of the wrong type

- Assigning an element you get to a variable of the wrong type

- It reports these as compile-time error/warning messages

No cast is required when calling the get method

Makes your program safer and easier to read
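A minimal sketch of a class with a type parameter. Box is a hypothetical class made up for this example; the standard collections such as ArrayList<T> use the same mechanism.

```java
// A container for a single value of some type T, chosen by the caller.
class Box<T> {
    private T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }

    void set(T value) {
        this.value = value;
    }
}

class GenericsDemo {
    public static void main(String[] args) {
        Box<String> b = new Box<String>("hello");
        String s = b.get();  // no cast needed: the compiler knows get() returns String
        System.out.println(s.length());

        // b.set(42);        // would not compile: 42 is not a String
    }
}
```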


Before Generics (In Java 1.4 and before)

The Collections Framework stored Objects

You could put any object into a List

When you took something out of a List...

- You needed to cast it back into your type

- You could check types using instanceof operator
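The old style still compiles today using raw types, so it can be sketched directly. RawListDemo and its methods are hypothetical names; the @SuppressWarnings annotations only silence the warnings a modern compiler gives for this style.

```java
import java.util.ArrayList;
import java.util.List;

class RawListDemo {
    // A raw List accepts any object, so mixed contents go unnoticed at compile time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List mixedList() {
        List items = new ArrayList();
        items.add("a");
        items.add(Integer.valueOf(1));
        items.add("bc");
        return items;
    }

    // Elements come out as Object: check with instanceof, then cast back.
    @SuppressWarnings("rawtypes")
    static int totalStringLength(List items) {
        int total = 0;
        for (Object o : items) {
            if (o instanceof String) {
                String s = (String) o;  // explicit cast required
                total += s.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalStringLength(mixedList()));
    }
}
```

Without the instanceof check, the cast on the Integer element would fail at run time with a ClassCastException, which is the kind of error generics move to compile time.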


Hash Functions

Convert (compress) data into a number (usually an int)

The goal: Two different inputs should generally (and ideally) have two different outputs
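One simple way to hash a string is to combine its characters with a multiplier. The sketch below uses the same formula that String.hashCode is documented to use; HashDemo, simpleHash and bucketIndex are names invented here.

```java
class HashDemo {
    // Combines the characters into one int; overflow simply wraps around.
    static int simpleHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Compresses the hash further into a position in a table of the given size.
    static int bucketIndex(String s, int buckets) {
        return Math.floorMod(simpleHash(s), buckets);  // floorMod keeps the result non-negative
    }

    public static void main(String[] args) {
        System.out.println(simpleHash("ab") + " vs " + simpleHash("ba"));  // order matters
        System.out.println(simpleHash("Aa") + " vs " + simpleHash("BB"));  // a collision
        System.out.println(bucketIndex("hello", 16));
    }
}
```

The pair "Aa" and "BB" shows why the goal says "generally": there are far more possible strings than int values, so some different inputs must share an output.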


java.util.HashMap

Elements (values) are accessed by keys (not index)

- It maps Keys to Values.

Fast to insert, delete and lookup

- But no order of the elements

Keeps spare room in its internal array so that operations stay fast

If the array gets too full, it reallocates a larger array and rehashes everything
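A short usage sketch. The class name HashMapDemo and the IDs and names in the roster are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapDemo {
    // Maps a (made-up) student ID to a student name.
    static Map<String, String> buildRoster() {
        Map<String, String> roster = new HashMap<String, String>();
        roster.put("alice1", "Alice");  // insert: key -> value
        roster.put("bob2", "Bob");
        return roster;
    }

    public static void main(String[] args) {
        Map<String, String> roster = buildRoster();

        System.out.println(roster.get("alice1"));         // lookup by key
        System.out.println(roster.get("nobody"));         // a missing key gives null
        System.out.println(roster.containsKey("bob2"));   // membership test on keys

        roster.remove("bob2");                            // delete by key
        System.out.println(roster.size());

        // Iteration works, but the order of entries is not guaranteed.
        for (Map.Entry<String, String> e : roster.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```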


Performance Comparisons

Operation            Array/ArrayList   LinkedList   HashSet/HashMap
Append After Last    O(1)              O(1)         O(1)
Insert Before First  O(n)              O(1)         O(1)
Lookup by Position   O(1)              O(n)         N/A
Lookup by Value      O(n)              O(n)         O(1) (by key)
Remove Last          O(1)              O(1)         O(1)
Remove First         O(n)              O(1)         O(1)


Question

Among these data structures, which would be the better (if not the best) choice for:

- Looking up student records by Andrew ID over and over? HashMap

- Managing the waitlist for a course?

Maintaining the order: Array/ArrayList

Adding and deleting students at both ends: LinkedList


java.util.Comparator

Pass a Comparator to sort with an alternative ordering

It's a Java Interface

- Create a class to implement it (implements Comparator<T>)

- Need to implement compare(T a, T b) method

Negative if a comes before b

0 if a and b are equal in this ordering

Positive if a comes after b
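For example, a comparator that orders strings by length rather than alphabetically might look like the sketch below. LengthComparator, ComparatorDemo and sortedByLength are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Alternative ordering: shorter strings come before longer ones.
class LengthComparator implements Comparator<String> {
    public int compare(String a, String b) {
        // Negative if a is shorter, 0 if the lengths are equal, positive if a is longer.
        return Integer.compare(a.length(), b.length());
    }
}

class ComparatorDemo {
    // Returns a sorted copy so the caller's list is left unchanged.
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new LengthComparator());  // pass the comparator to sort
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb");
        System.out.println(sortedByLength(words));
    }
}
```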


Autoboxing

Writing the code to put your ints in Integers is a hassle

- Same for other primitives

Since Java 5, the compiler automatically converts between primitives and their Object wrapper classes

- When passing parameters or returning values

- In assignment or math expressions


ArrayList<Integer>

Just declare ArrayList<Integer>

Put in and take out ints

Autoboxing automatically does the conversions
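A brief sketch of putting ints into and taking them out of an ArrayList<Integer>. AutoboxDemo and sum are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class AutoboxDemo {
    // Each Integer element is unboxed to an int as the loop reads it.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(3);              // the int 3 is boxed into an Integer
        numbers.add(4);

        int first = numbers.get(0);  // the Integer is unboxed back to an int
        System.out.println(first + " " + sum(numbers));
    }
}
```

One thing to watch for: unboxing a null Integer throws a NullPointerException, because there is no int value to produce.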


Sample Final Exam Questions

Compare the use of ArrayLists and LinkedLists. What are the advantages of each?

ArrayList: Lookup by Position

LinkedList: Insert Before First and Remove First


What is a comparator?

It's a Java Interface

- Create a class to implement it (implements Comparator<T>)

- Need to implement compare(T a, T b) method

Negative if a comes before b

0 if a and b are equal in this ordering

Positive if a comes after b


What is autoboxing? Why is it useful in Java?

Writing the code to put your ints in Integers is a hassle

- Same for other primitives

Since Java 5, the compiler automatically converts between primitives and their Object wrapper classes

- When passing parameters or returning values

- In assignment or math expressions


What are Java Generics? What are the advantages of using generic classes? What did Java programmers do before we had generic classes?

Generics in Java

Java allows you to specify a class that manipulates instances of some type <T>

- Using type parameters

The compiler can now check for:

- Inserting objects of the wrong type

- Assigning an element you get to a variable of the wrong type

- It reports these as compile-time error/warning messages

No cast is required when calling the get method

Makes your program safer and easier to read


Before Generics (In Java 1.4 and before)

The Collections Framework stored Objects

You could put any object into a List

When you took something out of a List...

- You needed to cast it back into your type

- You could check types using instanceof operator
