Small tips for handling set intersections

Article link: https://blog.csdn.net/Mabanana/article/details/114838395

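I meant to write up this not-very-sophisticated post yesterday, but a few things put me in a bad mood. I'm still in that mood now, but I don't want to keep putting it off, because things that get put off tend to never get done.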

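Lately I haven't known what to write about. With not much new input (roughly zero), I'll live off what I already know and extend it a little. This time I mainly want to talk about data handling during development (tips: collections).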

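Hook 👀: Suppose you run into this scenario: you have two sets and need to check whether they share any element, returning true if they do and false if they don't. What would you do?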

The first idea is probably "brute force solves everything": a nested for loop plus a flag variable. Easy enough, right?

public static void main(String[] args) {
    Set<String> set1 = new HashSet<String>();
    Set<String> set2 = new HashSet<String>();
    set1.add("1");
    set1.add("1111");
    set1.add("2222");
    set2.add("1");
    set2.add("333");
    set2.add("111");
    Boolean flag = false;
    for (String s1 : set1) {
        for (String s2 : set2) {
            if (s1.equals(s2)) {
                flag = true;
                break; // stop scanning set2 once a match is found
            }
        }
        if (flag) {
            break;
        }
    }
    System.out.println(flag);
}

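Someone is bound to say this is too clumsy: that many lines for something this small is unbearable. Collections all have a contains method, so just use one outer for loop with contains inside, plus a flag 🤦. Let's take a look.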

public static void main(String[] args) {
    Set<String> set1 = new HashSet<String>();
    Set<String> set2 = new HashSet<String>();
    set1.add("1");
    set1.add("1111");
    set1.add("2222");
    set2.add("1");
    set2.add("333");
    set2.add("111");
    Boolean flag = false;
    for (String s1 : set1) {
        if (set2.contains(s1)) {
            flag = true;
            break;
        }
    }
    System.out.println(flag);
}

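It really is a lot cleaner, and fewer for loops is easier on the eyes. But do you really think there is nothing more than a single for loop here? There is more going on underneath. Click into the contains method and take a look.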

/**
     * Returns <tt>true</tt> if this set contains the specified element.
     * More formally, returns <tt>true</tt> if and only if this set
     * contains an element <tt>e</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;e==null&nbsp;:&nbsp;o.equals(e))</tt>.
     *
     * @param o element whose presence in this set is to be tested
     * @return <tt>true</tt> if this set contains the specified element
     */
public boolean contains(Object o) {
    return map.containsKey(o);
}

In other words, contains returns true if the set holds an element e for which the ternary expression (o==null ? e==null : o.equals(e)) is true, and false otherwise.
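To make that ternary expression concrete, here is a minimal sketch of how contains behaves for an ordinary element and for null. The class name ContainsNullDemo and the helper has are made up for this illustration; only HashSet, add and contains come from the JDK.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original post.
public class ContainsNullDemo {
    // Thin wrapper so the behaviour of contains() is easy to call and check.
    public static boolean has(Set<String> set, String element) {
        return set.contains(element);
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("1");
        // Non-null argument: matched through o.equals(e).
        System.out.println(has(set, "1"));
        // null argument before null was added: no element satisfies e == null.
        System.out.println(has(set, null));
        // HashSet permits a null element, so it can be added like any other.
        set.add(null);
        // null argument now matches through the e == null branch.
        System.out.println(has(set, null));
    }
}
```

The point is just that contains never calls equals on a null argument; it takes the e==null side of the expression instead.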

You are probably wondering: why is there a map here, and where does it come from? Let's Ctrl+F and search for it.

/**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
public HashSet() {
    map = new HashMap<>();
}

/**
     * Constructs a new set containing the elements in the specified
     * collection.  The <tt>HashMap</tt> is created with default load factor
     * (0.75) and an initial capacity sufficient to contain the elements in
     * the specified collection.
     *
     * @param c the collection whose elements are to be placed into this set
     * @throws NullPointerException if the specified collection is null
     */
public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}

Now it should all click. You can think of a HashSet as a wrapper around a HashMap. Elements you add later go into that map as keys. Don't believe it? Take a look.

/**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

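So now you know: simple on the surface, with more going on inside. Since contains on a HashSet is a key lookup in the backing HashMap (constant time on average), the second version does not hide another loop over the elements.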
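The delegation shown in the JDK source above can be sketched in a few lines. TinySet below is a hypothetical, heavily simplified stand-in written for this post, not the real HashSet; it only mirrors the two calls we just read, map.put(e, PRESENT) and map.containsKey(o).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for HashSet; NOT the JDK implementation.
public class TinySet<E> {
    // One shared dummy value stored against every key, playing the role of PRESENT.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    // put() returns the previous value for the key, so null means the key was new.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // Membership is just a key lookup in the backing map.
    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinySet<String> set = new TinySet<>();
        System.out.println(set.add("1"));      // first insert of "1"
        System.out.println(set.add("1"));      // duplicate insert of "1"
        System.out.println(set.contains("1")); // lookup goes through containsKey
        System.out.println(set.size());        // number of distinct keys
    }
}
```

Because the elements are the map's keys, a duplicate add just overwrites the dummy value and the set does not grow.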

But is that the best answer?

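If that doesn't feel good enough, there must be other options. You have probably met a variation of this scenario: computing the intersection of two sets. The usual tool is the JDK's built-in retainAll method. So can we use it here? (If you have to think about that, not great.) The answer is yes, definitely. So how do we write it? Look closely and carefully.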

public static void main(String[] args) {
    Set<String> set1 = new HashSet<String>();
    Set<String> set2 = new HashSet<String>();
    set1.add("1");
    set1.add("1111");
    set1.add("2222");
    set2.add("1");
    set2.add("333");
    set2.add("111");
    Boolean flag = false;
    set1.retainAll(set2);
    if (set1.size() > 0) {
        flag = true;
    }
    System.out.println(flag);
}

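Less code, even less code! A quick explanation: the idea is to compute the intersection first. After retainAll, set1 itself holds the intersection, because every element not in set2 has been removed from it. Then check its size: greater than 0 means the sets share elements, equal to 0 means they don't, and a size can never be less than 0.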
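One thing worth keeping in mind with this approach is that retainAll modifies the set it is called on. If set1 is still needed afterwards, a minimal sketch (assuming it is acceptable to allocate a copy) is to intersect on a copy instead. The class RetainAllCopyDemo and the helper intersection are names invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original post.
public class RetainAllCopyDemo {
    // Computes the intersection on a copy so neither input set is modified.
    public static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> copy = new HashSet<>(a);
        copy.retainAll(b); // removes from the copy everything that is not in b
        return copy;
    }

    public static void main(String[] args) {
        Set<String> set1 = new HashSet<>(Arrays.asList("1", "1111", "2222"));
        Set<String> set2 = new HashSet<>(Arrays.asList("1", "333", "111"));
        Set<String> common = intersection(set1, set2);
        // Non-empty intersection means the two sets share at least one element.
        System.out.println(!common.isEmpty());
        // set1 was only read, so it still has all of its original elements.
        System.out.println(set1.size());
    }
}
```

The trade-off is an extra copy of the first set, which is wasted work if all you want is a yes/no answer.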


Is that really 🆗❓ I thought so too at first, but after some digging I found an even nicer option. Take a good look.

public static void main(String[] args) {
    Set<String> set1 = new HashSet<String>();
    Set<String> set2 = new HashSet<String>();
    set1.add("1");
    set1.add("1111");
    set1.add("2222");
    set2.add("1");
    set2.add("333");
    set2.add("111");
    boolean flag = Collections.disjoint(set1, set2);
    System.out.println(flag);
}

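Isn't that super simple and satisfying? But once you test it, something feels odd: why does it look like this, and why is the result different from before?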

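If that confuses you, you really weren't reading carefully. "dis dis dis": important words get said three times. Do you know what dis means? It's dis, not diss, and it means doing the opposite. disjoint returns true when the two collections have no element in common, so where the earlier versions gave true, this one gives false.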
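To get the same answer as the earlier versions (true when the sets share an element), the result of disjoint just needs to be negated. A small sketch, where the class HasCommonDemo and the helper hasCommon are names made up for this example:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original post.
public class HasCommonDemo {
    // disjoint() is true when nothing is shared, so "has a common element" is its negation.
    public static boolean hasCommon(Collection<?> a, Collection<?> b) {
        return !Collections.disjoint(a, b);
    }

    public static void main(String[] args) {
        Set<String> set1 = new HashSet<>(Arrays.asList("1", "1111", "2222"));
        Set<String> set2 = new HashSet<>(Arrays.asList("1", "333", "111"));
        // Same question as the flag-based versions: do the sets share any element?
        System.out.println(hasCommon(set1, set2));
    }
}
```

Wrapping the negation in a method with a positive name also keeps the "dis" from tripping up whoever reads the call site.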

Anyone keen on learning has surely clicked into the method by now, so let's click in and look as well.

public static boolean disjoint(Collection<?> c1, Collection<?> c2) {
    // The collection to be used for contains(). Preference is given to
    // the collection who's contains() has lower O() complexity.
    Collection<?> contains = c2;
    // The collection to be iterated. If the collections' contains() impl
    // are of different O() complexity, the collection with slower
    // contains() will be used for iteration. For collections who's
    // contains() are of the same complexity then best performance is
    // achieved by iterating the smaller collection.
    Collection<?> iterate = c1;

    // Performance optimization cases. The heuristics:
    //   1. Generally iterate over c1.
    //   2. If c1 is a Set then iterate over c2.
    //   3. If either collection is empty then result is always true.
    //   4. Iterate over the smaller Collection.
    if (c1 instanceof Set) {
        // Use c1 for contains as a Set's contains() is expected to perform
        // better than O(N/2)
        iterate = c2;
        contains = c1;
    } else if (!(c2 instanceof Set)) {
        // Both are mere Collections. Iterate over smaller collection.
        // Example: If c1 contains 3 elements and c2 contains 50 elements and
        // assuming contains() requires ceiling(N/2) comparisons then
        // checking for all c1 elements in c2 would require 75 comparisons
        // (3 * ceiling(50/2)) vs. checking all c2 elements in c1 requiring
        // 100 comparisons (50 * ceiling(3/2)).
        int c1size = c1.size();
        int c2size = c2.size();
        if (c1size == 0 || c2size == 0) {
            // At least one collection is empty. Nothing will match.
            return true;
        }

        if (c1size > c2size) {
            iterate = c2;
            contains = c1;
        }
    }

    for (Object e : iterate) {
        if (contains.contains(e)) {
            // Found a common element. Collections are not disjoint.
            return false;
        }
    }

    // No common elements were found.
    return true;
}

I wonder whether you spotted the key part. The core is really just these 5 lines, and we wrote essentially the same thing earlier.

for (Object e : iterate) {
    if (contains.contains(e)) {
        // Found a common element. Collections are not disjoint.
        return false;
    }
}
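Reading the source above, when c1 is a Set the method always iterates c2 and probes c1, whatever their sizes. If both arguments are hash-based sets and one is much smaller, you could instead apply the "iterate over the smaller collection" idea from the comments yourself. The following is only a sketch of that idea under those assumptions; SmallerFirstDemo and hasCommon are hypothetical names, and this is not how the JDK method is written.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original post and not JDK code.
public class SmallerFirstDemo {
    // Iterates the smaller set and probes the larger one with contains().
    public static boolean hasCommon(Set<?> a, Set<?> b) {
        Set<?> iterate = a.size() <= b.size() ? a : b;
        Set<?> probe = (iterate == a) ? b : a;
        for (Object e : iterate) {
            if (probe.contains(e)) {
                return true; // found a shared element, stop early
            }
        }
        return false; // no element of the smaller set is in the larger one
    }

    public static void main(String[] args) {
        Set<String> small = new HashSet<>(Arrays.asList("1"));
        Set<String> large = new HashSet<>(Arrays.asList("1", "333", "111", "2222"));
        // Argument order does not matter; the smaller set is the one iterated.
        System.out.println(hasCommon(small, large));
        System.out.println(hasCommon(large, small));
    }
}
```

For small sets like the ones in this post the difference is negligible, so Collections.disjoint remains the simpler choice.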