Learning Guava Collections: Collection Utilities

神蜗牛

Collection Utilities

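Any programmer with experience of the JDK Collections Framework knows and loves the utilities in java.util.Collections. Guava provides many more utilities along these lines: static methods applicable to all collections. These are among the most popular and mature parts of Guava.

Methods corresponding to a particular interface are grouped in a relatively intuitive manner: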

Interface	JDK or Guava?	Corresponding Guava utility class
`Collection`	JDK	`Collections2`
`List`	JDK	`Lists`
`Set`	JDK	`Sets`
`SortedSet`	JDK	`Sets`
`Map`	JDK	`Maps`
`SortedMap`	JDK	`Maps`
`Queue`	JDK	`Queues`
`Multiset`	Guava	`Multisets`
`Multimap`	Guava	`Multimaps`
`BiMap`	Guava	`Maps`
`Table`	Guava	`Tables`

Looking for transform, filter, and similar methods? Those are covered in the functional idioms section.

Static constructors

Before JDK 7, constructing a new generic collection required repeating the type arguments on both sides:

List<TypeThatsTooLongForItsOwnGood> list = new ArrayList<TypeThatsTooLongForItsOwnGood>();

I think we can all agree that this is unpleasant. Guava provides static methods that use generics to infer the type on the right-hand side:

List<TypeThatsTooLongForItsOwnGood> list = Lists.newArrayList(); 
Map<KeyType, LongishValueType> map = Maps.newLinkedHashMap();

To be sure, the diamond operator in JDK 7 makes this less of a hassle:

List<TypeThatsTooLongForItsOwnGood> list = new ArrayList<>();

But Guava goes further than this. With the factory method pattern, we can conveniently initialize collections with their starting elements.

Set<Type> copySet = Sets.newHashSet(elements); 
List<String> theseElements = Lists.newArrayList("alpha", "beta", "gamma");

Additionally, because factory methods can be named (Effective Java item 1), we can make code that initializes a collection to a given size more readable:

List<Type> exactly100 = Lists.newArrayListWithCapacity(100); 
List<Type> approx100 = Lists.newArrayListWithExpectedSize(100); 
Set<Type> approx100Set = Sets.newHashSetWithExpectedSize(100);
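As a rough sketch of how these factories behave (the class and method names below are made up for illustration and are not part of Guava), note that the sized variants only reserve room; the returned collection is still empty.

```java
import com.google.common.collect.Lists;
import com.google.common.collect.Sets;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of Guava.
class StaticFactoryDemo {
  // A mutable ArrayList pre-populated with its starting elements.
  static List<String> greekLetters() {
    return Lists.newArrayList("alpha", "beta", "gamma");
  }

  // The capacity is only a sizing hint; the list starts out empty.
  static List<String> presizedList(int capacity) {
    return Lists.newArrayListWithCapacity(capacity);
  }

  // Sized with the intent that expectedSize elements fit without resizing.
  static Set<String> presizedSet(int expectedSize) {
    return Sets.newHashSetWithExpectedSize(expectedSize);
  }

  public static void main(String[] args) {
    System.out.println(greekLetters());
    System.out.println(presizedList(100).size());
    System.out.println(presizedSet(100).isEmpty());
  }
}
```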

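The precise static factory methods provided are listed with their corresponding utility classes below.

Note: the new collection types introduced by Guava do not expose raw constructors, nor do they have initializers in the utility classes. Instead, they expose static factory methods directly, for example: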

Multiset<String> multiset = HashMultiset.create();

Iterables

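Whenever possible, Guava prefers to provide utilities that accept an Iterable rather than a Collection. At Google, it is not unusual to encounter a "collection" that is not actually stored in main memory, but is gathered from a database or from another data center, and that cannot support operations like size() without actually fetching all of its elements.

As a result, many of the operations you might expect to see supported for all collections can be found in Iterables. Additionally, most Iterables methods have a corresponding version in Iterators that accepts the raw iterator.

The overwhelming majority of operations in Iterables are lazy: they only advance the backing iteration when absolutely necessary. Methods that themselves return an Iterable return lazily computed views, rather than explicitly constructing a collection in memory.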
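The laziness described above can be observed directly. The following is a minimal sketch (the class name is hypothetical): an element added to an input list after the concat view was created still shows up when the view is finally iterated.

```java
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;
import java.util.List;

// Hypothetical demo class, not part of Guava.
class LazyViewDemo {
  // Returns the number of elements seen through a concat view
  // after one of its inputs was modified.
  static int sizeAfterLateAdd() {
    List<Integer> first = Lists.newArrayList(1, 2, 3);
    List<Integer> second = Lists.newArrayList(4, 5, 6);
    // Nothing is copied here; the result is a view over both lists.
    Iterable<Integer> view = Iterables.concat(first, second);
    first.add(0); // modify an input after the view was created
    return Iterables.size(view); // iterates now, so the new element is counted
  }

  public static void main(String[] args) {
    System.out.println(sizeAfterLateAdd());
  }
}
```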

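As of Guava 12, Iterables is supplemented by the FluentIterable class, which wraps an Iterable and provides a "fluent" syntax for many of these operations.

The following is a selection of the most commonly used utilities, although many of the more "functional" methods in Iterables are discussed in Guava functional idioms.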

General

Method	Description	See Also
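`concat(Iterable<Iterable>)`	Returns a lazy view of the concatenation of several iterables.	`concat(Iterable...)`
`frequency(Iterable, Object)`	Returns the number of occurrences of the object.	Compare `Collections.frequency(Collection, Object)`; see `Multiset`
`partition(Iterable, int)`	Returns an unmodifiable view of the iterable partitioned into chunks of the specified size.	`Lists.partition(List, int)`, `paddedPartition(Iterable, int)`
`getFirst(Iterable, T default)`	Returns the first element of the iterable, or the default value if it is empty.	Compare `Iterable.iterator().next()`, `FluentIterable.first()`
`getLast(Iterable)`	Returns the last element of the iterable, or fails fast with a `NoSuchElementException` if it is empty.	`getLast(Iterable, T default)`, `FluentIterable.last()`
`elementsEqual(Iterable, Iterable)`	Returns true if the iterables have the same elements in the same order.	Compare `List.equals(Object)`
`unmodifiableIterable(Iterable)`	Returns an unmodifiable view of the iterable.	Compare `Collections.unmodifiableCollection(Collection)`
`limit(Iterable, int)`	Returns an `Iterable` that yields at most the specified number of elements.	`FluentIterable.limit(int)`
`getOnlyElement(Iterable)`	Returns the only element in the iterable. Fails fast if the iterable is empty or has multiple elements.	`getOnlyElement(Iterable, T default)`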

Iterable<Integer> concatenated = Iterables.concat(
  Ints.asList(1, 2, 3),
  Ints.asList(4, 5, 6));
// concatenated has elements 1, 2, 3, 4, 5, 6

String lastAdded = Iterables.getLast(myLinkedHashSet);

String theElement = Iterables.getOnlyElement(thisSetIsDefinitelyASingleton);
  // if this set isn't a singleton, something is wrong!

Collection-Like

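Typically, collections support these operations naturally on other collections, but not on iterables.

Each of these operations delegates to the corresponding Collection interface method when the input is actually a Collection. For example, if Iterables.size is passed a Collection, it calls Collection.size instead of walking through the iterator.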
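To make the delegation concrete, here is a small sketch. RemoteBacked is a hypothetical collection invented for this example: it knows its size but refuses to be iterated, so the call only succeeds because Iterables.size uses Collection.size().

```java
import com.google.common.collect.Iterables;
import java.util.AbstractCollection;
import java.util.Iterator;

// Hypothetical demo class, not part of Guava.
class SizeDelegationDemo {
  // Stand-in for a remote or database-backed collection:
  // size() is cheap, iteration is not allowed at all.
  static class RemoteBacked extends AbstractCollection<String> {
    @Override
    public int size() {
      return 3;
    }

    @Override
    public Iterator<String> iterator() {
      throw new UnsupportedOperationException("iteration is expensive here");
    }
  }

  // Succeeds without ever calling iterator().
  static int sizeWithoutIterating() {
    return Iterables.size(new RemoteBacked());
  }

  public static void main(String[] args) {
    System.out.println(sizeWithoutIterating());
  }
}
```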

Method	Analogous `Collection` method	`FluentIterable` equivalent
`addAll(Collection addTo, Iterable toAdd)`	`Collection.addAll(Collection)`
`contains(Iterable, Object)`	`Collection.contains(Object)`	`FluentIterable.contains(Object)`
`removeAll(Iterable removeFrom, Collection toRemove)`	`Collection.removeAll(Collection)`
`retainAll(Iterable removeFrom, Collection toRetain)`	`Collection.retainAll(Collection)`
`size(Iterable)`	`Collection.size()`	`FluentIterable.size()`
`toArray(Iterable, Class)`	`Collection.toArray(T[])`	`FluentIterable.toArray(Class)`
`isEmpty(Iterable)`	`Collection.isEmpty()`	`FluentIterable.isEmpty()`
`get(Iterable, int)`	`List.get(int)`	`FluentIterable.get(int)`
`toString(Iterable)`	`Collection.toString()`	`FluentIterable.toString()`

FluentIterable

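Besides the methods covered above and in the functional idioms section, FluentIterable has a few convenient methods for copying into an immutable collection: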

Result Type	Method
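`ImmutableList`	`toList()`
`ImmutableSet`	`toSet()`
`ImmutableSortedSet`	`toSortedSet(Comparator)`

Note: early Guava releases named these methods `toImmutableList()`, `toImmutableSet()`, and `toImmutableSortedSet(Comparator)`.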
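A short sketch of the fluent style follows. It assumes Java 8 or later (for the lambda and method reference) and a reasonably recent Guava, where the method that copies into an ImmutableList is called toList(); the class name is hypothetical.

```java
import com.google.common.collect.FluentIterable;
import com.google.common.collect.ImmutableList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Guava.
class FluentDemo {
  // Keeps words longer than three characters and maps them to their lengths.
  static ImmutableList<Integer> lengthsOfLongWords(List<String> words) {
    return FluentIterable.from(words)
        .filter(w -> w.length() > 3) // lazy
        .transform(String::length)   // lazy
        .toList();                   // copies into an ImmutableList
  }

  public static void main(String[] args) {
    System.out.println(lengthsOfLongWords(Arrays.asList("one", "three", "four", "to")));
  }
}
```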

Lists

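In addition to static constructor methods and functional programming methods, Lists provides a number of valuable utility methods on List objects.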

Method	Description
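`partition(List, int)`	Returns a view of the underlying list, partitioned into chunks of the specified size.
`reverse(List)`	Returns a reversed view of the specified list. Note: if the list is immutable, consider `ImmutableList.reverse()` instead.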

List<Integer> countUp = Ints.asList(1, 2, 3, 4, 5);
List<Integer> countDown = Lists.reverse(countUp); // {5, 4, 3, 2, 1}

List<List<Integer>> parts = Lists.partition(countUp, 2); // {{1, 2}, {3, 4}, {5}}
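Because both methods return views rather than copies, later changes to the underlying list are visible through them. A minimal sketch, with a hypothetical class name:

```java
import com.google.common.collect.Lists;
import java.util.List;

// Hypothetical demo class, not part of Guava.
class ListViewDemo {
  // The reversed view reflects an element appended after the view was created.
  static List<Integer> reversedAfterAppend() {
    List<Integer> countUp = Lists.newArrayList(1, 2, 3);
    List<Integer> countDown = Lists.reverse(countUp); // a view, not a copy
    countUp.add(4);
    return countDown;
  }

  // Five elements in chunks of two: the last chunk holds the single leftover.
  static List<List<Integer>> chunks() {
    return Lists.partition(Lists.newArrayList(1, 2, 3, 4, 5), 2);
  }

  public static void main(String[] args) {
    System.out.println(reversedAfterAppend());
    System.out.println(chunks());
  }
}
```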

Static Factories

Lists provides the following static factory methods:

Implementation	Factories
`ArrayList`	basic, with elements, from `Iterable`, with exact capacity, with expected size, from `Iterator`
`LinkedList`	basic, from `Iterable`

Sets

The Sets utility class includes a number of handy methods.

Set-Theoretic Operations

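We provide several standard set-theoretic operations, implemented as views over the argument sets. These return a SetView, which can be used:

as a Set directly, since it implements the Set interface
by copying it into another mutable collection with copyInto(Set)
by making an immutable copy with immutableCopy()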

Method
`union(Set, Set)`
`intersection(Set, Set)`
`difference(Set, Set)`
`symmetricDifference(Set, Set)`

For example:

Set<String> wordsWithPrimeLength = ImmutableSet.of("one", "two", "three", "six", "seven", "eight");
Set<String> primes = ImmutableSet.of("two", "three", "five", "seven");

SetView<String> intersection = Sets.intersection(primes, wordsWithPrimeLength); // contains "two", "three", "seven"
// I can use intersection as a Set directly, but copying it can be more efficient if I use it a lot.
return intersection.immutableCopy();
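The other three operations and the two copy methods can be sketched the same way. The class, constants, and method names below are illustrative only:

```java
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of Guava.
class SetViewDemo {
  static final Set<String> LEFT = ImmutableSet.of("a", "b", "c");
  static final Set<String> RIGHT = ImmutableSet.of("b", "c", "d");

  // copyInto adds the view's elements to the given mutable set and returns it.
  static Set<String> unionSorted() {
    return Sets.union(LEFT, RIGHT).copyInto(new TreeSet<String>());
  }

  // Elements of LEFT that are not in RIGHT.
  static Set<String> onlyInLeft() {
    return Sets.difference(LEFT, RIGHT).immutableCopy();
  }

  // Elements that are in exactly one of the two sets.
  static Set<String> inExactlyOne() {
    return Sets.symmetricDifference(LEFT, RIGHT).immutableCopy();
  }

  public static void main(String[] args) {
    System.out.println(unionSorted());
    System.out.println(onlyInLeft());
    System.out.println(inExactlyOne());
  }
}
```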

Other Set Utilities

Method	Description	See Also
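`cartesianProduct(List<Set>)`	Returns every possible list that can be obtained by choosing one element from each set (the Cartesian product).	`cartesianProduct(Set...)`
`powerSet(Set)`	Returns the set of all subsets of the specified set.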

Set<String> animals = ImmutableSet.of("gerbil", "hamster");
Set<String> fruits = ImmutableSet.of("apple", "orange", "banana");

Set<List<String>> product = Sets.cartesianProduct(animals, fruits);
// {{"gerbil", "apple"}, {"gerbil", "orange"}, {"gerbil", "banana"},
//  {"hamster", "apple"}, {"hamster", "orange"}, {"hamster", "banana"}}

Set<Set<String>> animalSets = Sets.powerSet(animals);
// {{}, {"gerbil"}, {"hamster"}, {"gerbil", "hamster"}}

Static Factories

Sets provides the following static factory methods:

Implementation	Factories
`HashSet`	basic, with elements, from `Iterable`, with expected size, from `Iterator`
`LinkedHashSet`	basic, from `Iterable`, with expected size
`TreeSet`	basic, with `Comparator`, from `Iterable`

Maps

Maps has a number of useful utilities that deserve individual explanation.

`uniqueIndex`

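Maps.uniqueIndex(Iterable, Function) addresses the common case of having a group of objects that each have some unique attribute, and wanting to look up those objects by that attribute.

Say we have a group of strings that we know have unique lengths, and we want to be able to look up the string with a particular length.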

ImmutableMap<Integer, String> stringsByIndex = Maps.uniqueIndex(strings, new Function<String, Integer> () {
    public Integer apply(String string) {
      return string.length();
    }
  });

If the indexes are not unique, see Multimaps.index below.
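What "not unique" means in practice: uniqueIndex throws an IllegalArgumentException when the key function produces the same key for two values. The sketch below assumes Java 8 or later for the method reference; the class and method names are hypothetical.

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Maps;

// Hypothetical demo class, not part of Guava.
class UniqueIndexDemo {
  // Indexes strings by their length; lengths must be unique.
  static ImmutableMap<Integer, String> byLength(Iterable<String> strings) {
    return Maps.uniqueIndex(strings, String::length);
  }

  // "one" and "two" both have length 3, so indexing them must fail.
  static boolean rejectsDuplicateKeys() {
    try {
      byLength(ImmutableList.of("one", "two"));
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(byLength(ImmutableList.of("a", "bb", "ccc")).get(2));
    System.out.println(rejectsDuplicateKeys());
  }
}
```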

`difference`

Maps.difference(Map, Map) lets you compare all the differences between two maps. It returns a MapDifference object, which breaks the Venn diagram down into:

Method	Description
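`entriesInCommon()`	The entries that appear in both maps, with matching keys and values.
`entriesDiffering()`	The entries with the same keys but differing values. The values in this map are of type `MapDifference.ValueDifference`, which lets you look at both the left and the right value.
`entriesOnlyOnLeft()`	The entries whose keys are in the left map but not in the right.
`entriesOnlyOnRight()`	The entries whose keys are in the right map but not in the left.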

Map<String, Integer> left = ImmutableMap.of("a", 1, "b", 2, "c", 3);
Map<String, Integer> right = ImmutableMap.of("b", 2, "c", 4, "d", 5);
MapDifference<String, Integer> diff = Maps.difference(left, right);

diff.entriesInCommon(); // {"b" => 2}
diff.entriesDiffering(); // {"c" => (3, 4)}
diff.entriesOnlyOnLeft(); // {"a" => 1}
diff.entriesOnlyOnRight(); // {"d" => 5}
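To read the two sides of a differing entry, ValueDifference exposes leftValue() and rightValue(). A brief sketch using the same two maps as above (the class and method names are made up for this example):

```java
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.MapDifference;
import com.google.common.collect.Maps;
import java.util.Map;

// Hypothetical demo class, not part of Guava.
class MapDiffDemo {
  static MapDifference<String, Integer> diff() {
    Map<String, Integer> left = ImmutableMap.of("a", 1, "b", 2, "c", 3);
    Map<String, Integer> right = ImmutableMap.of("b", 2, "c", 4, "d", 5);
    return Maps.difference(left, right);
  }

  // Formats each changed entry as "key: left -> right", separated by "; ".
  static String describeChanges() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, MapDifference.ValueDifference<Integer>> e
        : diff().entriesDiffering().entrySet()) {
      if (sb.length() > 0) {
        sb.append("; ");
      }
      sb.append(e.getKey()).append(": ")
          .append(e.getValue().leftValue()).append(" -> ")
          .append(e.getValue().rightValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(describeChanges());
    System.out.println(diff().areEqual());
  }
}
```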

`BiMap` utilities

Guava's utilities for BiMap also live in the Maps class.

`BiMap` utility	Corresponding `Map` utility
`synchronizedBiMap(BiMap)`	`Collections.synchronizedMap(Map)`
`unmodifiableBiMap(BiMap)`	`Collections.unmodifiableMap(Map)`

Static Factories

Maps provides the following static factory methods:

Implementation	Factories
`HashMap`	basic, from `Map`, with expected size
`LinkedHashMap`	basic, from `Map`
`TreeMap`	basic, from `Comparator`, from `SortedMap`
`EnumMap`	from `Class`, from `Map`
`ConcurrentMap`	basic
`IdentityHashMap`	basic

Multisets

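Standard Collection operations, such as containsAll, ignore the counts of elements in a multiset and only care about whether elements are present at all. Multisets provides a number of operations that take element multiplicities into account.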

Method	Explanation	Difference from `Collection` method
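`containsOccurrences(Multiset sup, Multiset sub)`	Returns `true` if `sub.count(o) <= sup.count(o)` for all `o`.	`Collection.containsAll` ignores counts and only tests whether elements are contained at all.
`removeOccurrences(Multiset removeFrom, Multiset toRemove)`	Removes one occurrence from `removeFrom` for each occurrence of an element in `toRemove`.	`Collection.removeAll` removes all occurrences of any element that occurs even once in `toRemove`.
`retainOccurrences(Multiset removeFrom, Multiset toRetain)`	Guarantees that `removeFrom.count(o) <= toRetain.count(o)` for all `o`.	`Collection.retainAll` keeps all occurrences of any element that occurs even once in `toRetain`.
`intersection(Multiset, Multiset)`	Returns a view of the intersection of two multisets; a nondestructive alternative to `retainOccurrences`.	Has no analogue.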

Multiset<String> multiset1 = HashMultiset.create();
multiset1.add("a", 2);

Multiset<String> multiset2 = HashMultiset.create();
multiset2.add("a", 5);

multiset1.containsAll(multiset2); // returns true: all unique elements are contained,
  // even though multiset1.count("a") == 2 < multiset2.count("a") == 5
Multisets.containsOccurrences(multiset1, multiset2); // returns false

Multisets.removeOccurrences(multiset2, multiset1); // multiset2 now contains 3 occurrences of "a"

multiset2.removeAll(multiset1); // removes all occurrences of "a" from multiset2, even though multiset1.count("a") == 2
multiset2.isEmpty(); // returns true
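The count-aware operations can also be sketched as small functions. Here intersection keeps the smaller of the two counts for each element; the class and method names are illustrative, not Guava's.

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
import com.google.common.collect.Multisets;

// Hypothetical demo class, not part of Guava.
class MultisetOpsDemo {
  // Builds a multiset holding the given number of "a" elements.
  static Multiset<String> as(int count) {
    Multiset<String> multiset = HashMultiset.create();
    multiset.add("a", count);
    return multiset;
  }

  // The intersection view keeps min(2, 5) occurrences of "a".
  static int sharedCount() {
    return Multisets.intersection(as(2), as(5)).count("a");
  }

  // Removing 2 occurrences from 5 leaves 3, unlike removeAll.
  static int remainingAfterRemoveOccurrences() {
    Multiset<String> five = as(5);
    Multisets.removeOccurrences(five, as(2));
    return five.count("a");
  }

  public static void main(String[] args) {
    System.out.println(sharedCount());
    System.out.println(remainingAfterRemoveOccurrences());
    System.out.println(Multisets.containsOccurrences(as(5), as(2)));
  }
}
```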

Other utilities in Multisets include:

Method	Description
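`copyHighestCountFirst(Multiset)`	Returns an immutable copy of the multiset that iterates over its elements in descending order of frequency.
`unmodifiableMultiset(Multiset)`	Returns an unmodifiable view of the multiset.
`unmodifiableSortedMultiset(SortedMultiset)`	Returns an unmodifiable view of the sorted multiset.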

Multiset<String> multiset = HashMultiset.create();
multiset.add("a", 3);
multiset.add("b", 5);
multiset.add("c", 1);

ImmutableMultiset<String> highestCountFirst = Multisets.copyHighestCountFirst(multiset);

// highestCountFirst, like its entrySet and elementSet, iterates over the elements in order {"b", "a", "c"}

Multimaps

Multimaps provides a number of general utility operations, which deserve individual explanation.

`index`

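A cousin of Maps.uniqueIndex, Multimaps.index(Iterable, Function) handles the case where you want to look up all objects that share some particular attribute, which is not necessarily unique.

Say we want to group strings by their length.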

ImmutableSet<String> digits = ImmutableSet.of(
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine");
Function<String, Integer> lengthFunction = new Function<String, Integer>() {
  public Integer apply(String string) {
    return string.length();
  }
};
ImmutableListMultimap<Integer, String> digitsByLength = Multimaps.index(digits, lengthFunction);
/*
 * digitsByLength maps:
 *  3 => {"one", "two", "six"}
 *  4 => {"zero", "four", "five", "nine"}
 *  5 => {"three", "seven", "eight"}
 */

`invertFrom`

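Since a Multimap can map many keys to one value, and one key to many values, it can be useful to invert a Multimap. Guava provides invertFrom(Multimap toInvert, Multimap dest) to let you do this, without choosing an implementation for you.

Note: if you are using an ImmutableMultimap, consider ImmutableMultimap.inverse() instead.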

ArrayListMultimap<String, Integer> multimap = ArrayListMultimap.create();
multimap.putAll("b", Ints.asList(2, 4, 6));
multimap.putAll("a", Ints.asList(4, 2, 1));
multimap.putAll("c", Ints.asList(2, 5, 3));

TreeMultimap<Integer, String> inverse = Multimaps.invertFrom(multimap, TreeMultimap.<Integer, String> create());
// note that we choose the implementation, so if we use a TreeMultimap, we get results in order
/*
 * inverse maps:
 *  1 => {"a"}
 *  2 => {"a", "b", "c"}
 *  3 => {"c"}
 *  4 => {"a", "b"}
 *  5 => {"c"}
 *  6 => {"b"}
 */

`forMap`

Need to use a Multimap method on a Map? forMap(Map) views a Map as a SetMultimap. This is particularly useful, for example, in combination with Multimaps.invertFrom.

Map<String, Integer> map = ImmutableMap.of("a", 1, "b", 1, "c", 2);
SetMultimap<String, Integer> multimap = Multimaps.forMap(map);
// multimap maps ["a" => {1}, "b" => {1}, "c" => {2}]
Multimap<Integer, String> inverse = Multimaps.invertFrom(multimap, HashMultimap.<Integer, String> create());
// inverse maps [1 => {"a", "b"}, 2 => {"c"}]

Wrappers

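Multimaps provides the traditional wrapper methods, as well as tools to build custom Multimap implementations based on Map and Collection implementations of your choice.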

Multimap type	Unmodifiable	Synchronized	Custom
`Multimap`	`unmodifiableMultimap`	`synchronizedMultimap`	`newMultimap`
`ListMultimap`	`unmodifiableListMultimap`	`synchronizedListMultimap`	`newListMultimap`
`SetMultimap`	`unmodifiableSetMultimap`	`synchronizedSetMultimap`	`newSetMultimap`
`SortedSetMultimap`	`unmodifiableSortedSetMultimap`	`synchronizedSortedSetMultimap`	`newSortedSetMultimap`

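The custom Multimap implementations let you specify a particular implementation to be used in the returned Multimap. Caveats include:

The multimap assumes complete ownership of map and factory. Those objects should not be updated manually, they should be empty when provided, and they should not use soft, weak, or phantom references.
No guarantees are made about what the contents of the Map will look like after you modify the Multimap.
The multimap is not threadsafe when any concurrent operation updates it, even if map and the instances generated by factory are. Concurrent read operations will work correctly, though. Work around this with the synchronized wrappers if necessary.
The multimap is serializable if map, factory, the lists generated by factory, and the multimap contents are all serializable.
The collections returned by Multimap.get(key) are not of the same type as the collections returned by your Supplier, though if your supplier returns RandomAccess lists, the lists returned by Multimap.get(key) will also be random access.

Note that the custom Multimap methods expect a Supplier argument that generates fresh new collections. Here is an example of writing a ListMultimap backed by a TreeMap mapping to LinkedList.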

ListMultimap<String, Integer> myMultimap = Multimaps.newListMultimap(
  Maps.<String, Collection<Integer>>newTreeMap(),
  new Supplier<LinkedList<Integer>>() {
    public LinkedList<Integer> get() {
      return Lists.newLinkedList();
    }
  });
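One visible effect of choosing a TreeMap as the backing map is that keys iterate in sorted order. The sketch below assumes Java 8 or later, so the Supplier can be written as a lambda; the class and method names are hypothetical.

```java
import com.google.common.collect.ListMultimap;
import com.google.common.collect.Multimaps;
import java.util.Collection;
import java.util.LinkedList;
import java.util.TreeMap;

// Hypothetical demo class, not part of Guava.
class CustomMultimapDemo {
  // A ListMultimap backed by an empty TreeMap, with a fresh LinkedList per key.
  static ListMultimap<String, Integer> create() {
    return Multimaps.newListMultimap(
        new TreeMap<String, Collection<Integer>>(),
        () -> new LinkedList<Integer>());
  }

  // Inserts keys out of order and returns the multimap.
  static ListMultimap<String, Integer> sample() {
    ListMultimap<String, Integer> multimap = create();
    multimap.put("b", 2);
    multimap.put("a", 1);
    multimap.put("b", 3);
    return multimap;
  }

  public static void main(String[] args) {
    System.out.println(sample().keySet()); // sorted by the TreeMap
    System.out.println(sample().get("b")); // values keep insertion order
  }
}
```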

Tables

The Tables class provides a few handy utilities.

`customTable`

Comparable to the Multimaps.newXXXMultimap(Map, Supplier) utilities, Tables.newCustomTable(Map, Supplier<Map>) lets you specify a Table implementation using whatever row or column map you like.

// use LinkedHashMaps instead of HashMaps
Table<String, Character, Integer> table = Tables.newCustomTable(
  Maps.<String, Map<Character, Integer>>newLinkedHashMap(),
  new Supplier<Map<Character, Integer>> () {
    public Map<Character, Integer> get() {
      return Maps.newLinkedHashMap();
    }
  });
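As a usage sketch, a table built on LinkedHashMaps iterates its row keys in insertion order. This assumes Java 8 or later for the lambda; the class and method names are made up for illustration.

```java
import com.google.common.collect.Table;
import com.google.common.collect.Tables;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Guava.
class CustomTableDemo {
  // A Table whose rows and columns are both kept in LinkedHashMaps.
  static Table<String, Character, Integer> create() {
    return Tables.newCustomTable(
        new LinkedHashMap<String, Map<Character, Integer>>(),
        () -> new LinkedHashMap<Character, Integer>());
  }

  // Row keys come back in the order the rows were first inserted.
  static List<String> rowKeysInInsertionOrder() {
    Table<String, Character, Integer> table = create();
    table.put("zebra", 'x', 1);
    table.put("apple", 'y', 2);
    table.put("mango", 'x', 3);
    return new ArrayList<String>(table.rowKeySet());
  }

  public static void main(String[] args) {
    System.out.println(rowKeysInInsertionOrder());
  }
}
```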