Why String#equals does not use hashCode when comparing

Article link: https://blog.csdn.net/u013570834/article/details/130895024

How the question came up

While thinking about constant-factor optimizations, I had simply assumed that String#equals would first use hashCode as a filter.

The algorithm I had in mind was this:

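When the two hashCodes differ, return false immediately (a hash function maps equal inputs to equal outputs, so different hashes imply different strings). This skips the subsequent two-pointer comparison (time complexity: $O(\min(n, m))$).
When the two hashCodes are equal, a hash collision is still possible (different inputs may produce the same output), so the character-by-character comparison cannot be avoided.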
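To make the collision case concrete, here is a minimal sketch. The class name `HashCollisionDemo` and the helper `isCollision` are made up for illustration. The two sample strings are a well-known pair that hash to the same value under `String#hashCode`'s documented formula.

```java
// Minimal sketch: equal hash codes do not imply equal strings.
final class HashCollisionDemo {
    /** Returns true when a and b share a hashCode but differ in content. */
    static boolean isCollision(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        // String#hashCode is s[0]*31^(n-1) + ... + s[n-1], so
        // "Aa" -> 65*31 + 97 = 2112 and "BB" -> 66*31 + 66 = 2112.
        System.out.println("Aa".hashCode());         // 2112
        System.out.println("BB".hashCode());         // 2112
        System.out.println(isCollision("Aa", "BB")); // true
    }
}
```

Because such pairs exist, a matching hash can only ever be a hint: the full comparison still has to run before returning true.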

Written as code, the equals I imagined looks like this:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length && this.hashCode() == anotherString.hashCode()) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}

But the actual implementation (the char[]-based version found in JDK 8) is:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}

So my earlier line of thinking appeared to be flawed. Still, it is worth looking at HashMap#getNode:

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node; the hash is clearly used as a filter here
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

In other words, the algorithm I had in mind is sound in itself, which leaves one question: why doesn't String#equals use hashCode as a first-pass filter?

An answer I find convincing

From a highly voted Stack Overflow answer:

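String's hashCode is computed lazily, and hashing a long string is expensive (time complexity $O(N)$). For a short-lived string the cached hash is unlikely to be reused, so that cost is never amortized.
In practice, most unequal strings differ within their first few characters, so a character-by-character comparison usually stops early. Filtering by hashCode would instead require hashing both strings in full, which is $O(M + N)$.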

Taken together, these two points mean that using hashCode inside String#equals is not worth the cost.
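The trade-off can be illustrated with a rough counting model. This is only a sketch, not a benchmark: `CompareCostDemo`, `make`, `readsPlain` and `readsHashFirst` are hypothetical names, and the model assumes that neither string has a cached hash yet, so both hashes must be computed from scratch.

```java
import java.util.Arrays;

// Sketch: count character reads for "compare directly" vs. "hash first".
final class CompareCostDemo {
    /** Builds a string made of the given first character followed by n copies of 'a'. */
    static String make(char first, int n) {
        char[] tail = new char[n];
        Arrays.fill(tail, 'a');
        return first + new String(tail);
    }

    /** Character reads made by a left-to-right comparison that stops at the first mismatch. */
    static long readsPlain(String a, String b) {
        if (a.length() != b.length()) return 0; // the length check alone decides
        long reads = 0;
        for (int i = 0; i < a.length(); i++) {
            reads += 2; // one char from each string
            if (a.charAt(i) != b.charAt(i)) break;
        }
        return reads;
    }

    /** Character reads when both strings are hashed in full first, then compared only if the hashes match. */
    static long readsHashFirst(String a, String b) {
        if (a.length() != b.length()) return 0;
        long reads = (long) a.length() + b.length(); // hashing touches every char
        if (a.hashCode() == b.hashCode()) {
            reads += readsPlain(a, b);
        }
        return reads;
    }

    public static void main(String[] args) {
        String a = make('x', 999); // 1000 chars, differs from b at index 0
        String b = make('y', 999);
        System.out.println(readsPlain(a, b));     // 2
        System.out.println(readsHashFirst(a, b)); // 2000
    }
}
```

For two 1000-character strings that differ at the first character, the direct comparison stops after one pair of reads, while the hash-first variant has already walked both strings in full. The hash-first variant only wins when the hashes are already cached and the strings differ late.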

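HashMap#put, by contrast, has to hash the key anyway, so that cost is unavoidable, and the lookup path can reuse the hash for the filtering optimization described at the start.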

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
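The filtering effect can be observed from the outside with a key type that counts calls. This is a small sketch assuming the OpenJDK HashMap behaviour quoted above. `HashFilterDemo`, `CountingKey` and `callsForLookup` are hypothetical names introduced only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class HashFilterDemo {
    /** Hypothetical key type that counts calls to hashCode and equals. */
    static final class CountingKey {
        static int hashCalls = 0;
        static int equalsCalls = 0;
        private final String name;
        private final int hash;

        CountingKey(String name, int hash) {
            this.name = name;
            this.hash = hash;
        }

        @Override public int hashCode() {
            hashCalls++;
            return hash;
        }

        @Override public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof CountingKey && ((CountingKey) o).name.equals(name);
        }
    }

    /** Returns {hashCalls, equalsCalls} made by a single get(probe) on a map that holds only stored. */
    static int[] callsForLookup(CountingKey stored, CountingKey probe) {
        Map<CountingKey, String> map = new HashMap<>();
        map.put(stored, "v");
        CountingKey.hashCalls = 0;   // ignore the calls made by put
        CountingKey.equalsCalls = 0;
        map.get(probe);
        return new int[] { CountingKey.hashCalls, CountingKey.equalsCalls };
    }

    public static void main(String[] args) {
        // Hashes 1 and 17 land in the same bucket of a 16-slot table
        // ((16 - 1) & 1 == (16 - 1) & 17), but the stored hash differs,
        // so equals is never reached.
        int[] differentHash = callsForLookup(new CountingKey("a", 1), new CountingKey("b", 17));
        System.out.println(differentHash[0] + " " + differentHash[1]); // 1 0

        // Same hash, different key: the filter passes and equals must decide.
        int[] sameHash = callsForLookup(new CountingKey("a", 1), new CountingKey("b", 1));
        System.out.println(sameHash[0] + " " + sameHash[1]); // 1 1
    }
}
```

In both lookups hashCode is called exactly once, because the map needs it to find the bucket in the first place. That is why comparing the stored hash before calling equals costs HashMap almost nothing extra.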