Understanding How Java's Iterator Is Implemented, Through the ArrayList Source Code

nayi_224

Article link: https://blog.csdn.net/nayi_224/article/details/80102394

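Note: this article explains how Iterator is implemented from the source-code point of view; it does not cover how to use the List and Iterator interfaces. Once the source makes sense, though, usage follows naturally.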

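The classes in Java that implement Iterator each do it in their own way, but the differences are minor. This article therefore analyzes only the ArrayList source. It helps to be somewhat familiar with the ArrayList source already, or at least to know the underlying algorithms.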

First, here is the Iterator-related code from the ArrayList class. (It may differ slightly between JDK versions.)

    /**
     * Returns an iterator over the elements in this list in proper sequence.
     *
     * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
     *
     * @return an iterator over the elements in this list in proper sequence
     */
    public Iterator<E> iterator() {
        return new Itr();
    }

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

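Clearly, iterator() returns an instance of an inner class, and that inner class does all the work.

Because it has to cope with every possible situation, the JDK source always carries a lot of code that exists only for robustness, which gets in the way of seeing what the feature essentially does. When you want a deeper understanding, the best approach is to write a simplified version yourself and analyze from there.

Here is a simplified ArrayList I wrote that implements nothing but Iterable.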

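import java.util.Iterator;

public class Nayi224ArrayList implements Iterable<Object> {

    // An ArrayList is essentially a class that maintains an Object[]; elementData is the real storage.
    // The data is hard-coded here for simplicity.
    public Object[] elementData = {1, 2, 3, 4};
    // The familiar size() method simply returns this field; it is hard-coded as well.
    public int size = 4;

    @Override
    public Iterator<Object> iterator() {
        return new IteratorImpl();
    }

    private class IteratorImpl implements Iterator<Object> {

        private int cursor; // Index of the next element to return. int fields default to 0.

        @Override
        public boolean hasNext() {
            return cursor != size;
        }

        @Override
        public Object next() {
            return elementData[cursor++];
        }

        @Override
        public void remove() {
            /*
             * A rewrite of the core of ArrayList's remove operation: shift everything after
             * the element last returned (index cursor - 1) one slot to the left.
             * The full version also checks bounds, increments modCount (fail-fast), and
             * shrinks size while nulling the last slot (for GC). All of that is omitted,
             * so neither size nor cursor is adjusted here.
             *
             * See the Javadoc of System.arraycopy for what its arguments mean.
             */
            System.arraycopy(elementData, cursor, elementData, cursor - 1, size - cursor);
        }

    }

}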

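Every method has been boiled down to a single line, so even a beginner should be able to follow it, and I won't explain it further.

It is used exactly like the original. (The remove method is only half implemented and differs considerably from the original.)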

public class BlogTest {

    public static void main(String[] args) {

        Nayi224ArrayList list = new Nayi224ArrayList();

        java.util.Iterator it = list.iterator();

        while (it.hasNext())
            System.out.println(it.next());
    }

}

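That is all the core code needed to implement iteration.

Now let's work through the remaining parts of the real source step by step.

First comes the well-known fail-fast mechanism. Its job is to throw an exception if the list is structurally modified while it is being iterated. In ArrayList.Itr it is implemented mainly by these lines.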

        int expectedModCount = modCount;
        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }

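modCount is a field that ArrayList inherits from AbstractList. Every structural modification of the ArrayList, such as add or remove, increments it (modCount++).

expectedModCount is a field of ArrayList's inner class Itr; it is initialized to the outer class's modCount when the iterator is created.
Note: expectedModCount is a plain int, so this assignment copies the value once. It is a snapshot and does not follow later changes to modCount. What is held by reference is the outer ArrayList instance: an inner class object created from an outer object keeps a reference to that outer object.
A more explicit way to write int expectedModCount = modCount; is
int expectedModCount = ArrayList.this.modCount;
In a decompiler it may show up as something like this.this$0.modCount. Because modCount is always read through that reference, checkForComodification() compares the list's current modCount against the snapshot, and that comparison is the key to understanding the fail-fast mechanism.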
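To make the snapshot-versus-live-read distinction concrete, here is a minimal sketch that mirrors only the two fields involved. The names SnapshotDemo and Snapshot are hypothetical and chosen for illustration; this is not JDK code.

```java
class SnapshotDemo {
    int modCount; // plays the role of ArrayList's modCount

    // Non-static inner class: every instance keeps a hidden reference to its SnapshotDemo.
    class Snapshot {
        // Copied once, by value, when the Snapshot is constructed.
        final int expectedModCount = modCount;

        boolean isStale() {
            // modCount is read through SnapshotDemo.this on every call,
            // so it always reflects the outer object's current value.
            return SnapshotDemo.this.modCount != expectedModCount;
        }
    }

    public static void main(String[] args) {
        SnapshotDemo outer = new SnapshotDemo();
        SnapshotDemo.Snapshot snap = outer.new Snapshot();
        System.out.println(snap.isStale()); // false: nothing has changed yet
        outer.modCount++;                   // simulate a structural modification
        System.out.println(snap.isStale()); // true: the copy no longer matches
    }
}
```

If expectedModCount tracked modCount by reference, the two values could never differ and the check would never fire.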

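checkForComodification() is called near the start of both next and remove in Itr. If the outer list that created an Iterator is structurally modified while that Iterator is in use, a ConcurrentModificationException is thrown.

I often see people explain the iterator's fail-fast mechanism in terms of concurrency. That is close to how it bites in practice, but it misses the essence. It has nothing inherently to do with multithreading; the ArrayList was simply modified during iteration. (Note that Itr's own remove method calls the outer class's remove, which also does modCount++, and then resynchronizes expectedModCount so that the iterator stays valid.)
If you want to blow up a program, these 4 lines are all it takes: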

        List arr = new ArrayList(Arrays.asList(1, 2, 3, 4));
        Iterator it =arr.iterator();
        arr.add(1);
        it.next();

Create the list, get an iterator, add, iterate: boom!
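The four lines above can be wrapped into a small runnable sketch. It also contrasts them with removal through the iterator itself, which keeps expectedModCount in step with modCount. The class and method names (FailFastDemo, directAddFails, removeEvensViaIterator) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Modifies the list directly after the iterator has been created.
    static boolean directAddFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        list.add(5); // modCount changes; the iterator's snapshot does not
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even numbers through the iterator, which is allowed during iteration.
    static List<Integer> removeEvensViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove(); // resynchronizes expectedModCount internally
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(directAddFails());         // true
        System.out.println(removeEvensViaIterator()); // [1, 3]
    }
}
```

No second thread is involved in either method, which is the point: the exception is about modification during iteration, not about concurrency as such.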

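Many readers will have come across a sentence like this one, from the ArrayList Javadoc, in countless references.

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis.

Yet I have not seen anyone explain this passage: under what circumstances does the iterator fail to throw even on a "best-effort basis"? After reading the source, though, the answer turns out to be simple enough that it hardly needs special explanation.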

In the remove method:

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
            // Thread B performs an add here.
            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

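If thread B completes an add right after thread A has executed checkForComodification();, thread A then goes on to reassign expectedModCount, so this unsynchronized modification is never reported. This can lead to unexpected bugs.

Similarly, in the next method: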

        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

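Suppose thread A is suspended right after executing the first line, and thread B completes one add and one remove in the meantime. The list's structure has clearly changed, and thread A may even return the object thread B just added, yet this call throws no exception. Only if thread A calls next again will the exception be thrown, one call later than the modification actually happened.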
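A gap in detection can also be reproduced deterministically in a single thread, which is easier to experiment with than a thread race. The sketch below relies on the fact, visible in the source quoted earlier, that hasNext() only compares cursor with size and never calls checkForComodification(). The class and method names are hypothetical, and the behavior described assumes a hasNext() implemented as shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MissedDetectionDemo {
    // Removes the second-to-last element directly on the list during a for-each loop
    // and returns the elements the loop actually visited.
    static List<Integer> visitedWhileRemoving() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> visited = new ArrayList<>();
        for (Integer value : list) {
            visited.add(value);
            if (value == 3) {
                // Structural modification behind the iterator's back.
                // After this call size is 3 and cursor is also 3, so hasNext()
                // returns false and next() (with its check) is never called again.
                list.remove(Integer.valueOf(3));
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitedWhileRemoving()); // [1, 2, 3]
    }
}
```

The loop ends quietly without ever visiting 4, and no ConcurrentModificationException is thrown, even though the list was modified during iteration.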

The Itr class has this field:
int lastRet = -1; // index of last element returned; -1 if no such
It is used mainly in the remove method.

        // initial value is -1
        int lastRet = -1; // index of last element returned; -1 if no such

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

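Judging from the code, besides recording which index remove should delete, it serves two purposes: making remove fail when it is called on a freshly created iterator, and making a second remove fail when it is called right after a successful one. I believe this is there to satisfy the contract of Iterator.remove, which may be called only once per call to next and throws IllegalStateException otherwise.

The first error: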

        List arr = new ArrayList(Arrays.asList(1, 2, 3, 4));
        Iterator it =arr.iterator();

        it.remove();

The second error:

        List arr = new ArrayList(Arrays.asList(1, 2, 3, 4));
        Iterator it =arr.iterator();

        it.next();
        it.remove();
        it.remove();
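Both snippets can be combined into one runnable sketch that catches the exception instead of crashing. The class and method names (LastRetDemo, removeBeforeNextFails, secondRemoveFails) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LastRetDemo {
    // Case 1: remove() on a fresh iterator, while lastRet is still -1.
    static boolean removeBeforeNextFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Case 2: a second remove() without another next(); the first remove reset lastRet to -1.
    static boolean secondRemoveFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        it.next();
        it.remove(); // succeeds and removes the element 1
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeBeforeNextFails()); // true
        System.out.println(secondRemoveFails());     // true
    }
}
```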

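That covers the important parts. What remains is mostly index bounds checking or more specific implementation details, which are easy to follow with a quick read.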
