Compact Off-Heap Structures/Tuples in Java

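In my last post I detailed the implications of the access patterns your code takes to main memory. Since then I have had a lot of questions about what can be done in Java to enable a more predictable memory layout. There are patterns that can be applied using array-backed structures, which I will discuss in another post. This post explores how to simulate a feature sorely missing in Java: arrays of structures similar to what C has to offer.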

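Structures are very useful, both on the stack and on the heap. To my knowledge it is not possible to simulate this feature on the Java stack. Not being able to do this on the stack is a shame because it greatly limits the performance of some parallel algorithms, but that is a rant for another day.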

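In Java, all user-defined types have to exist on the heap. The Java heap is managed by the garbage collector in the general case, but there is more to the wider heap in a Java process. With the introduction of the direct ByteBuffer, memory can be allocated that is not tracked by the garbage collector, because it can be made available to native code for tasks such as avoiding the copying of data to and from the kernel for IO. So one reasonable means of managing structures is to fake them within a ByteBuffer. This allows compact data representations but has performance and size limitations. For example, it is not possible to have a ByteBuffer greater than 2GB, and all access is bounds checked, which impacts performance. An alternative exists using Unsafe that is both faster and not size constrained like ByteBuffer.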
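To make the ByteBuffer option concrete, here is a minimal sketch of faking fixed-size records inside a direct ByteBuffer. The class name `ByteBufferTrades` and its three-field layout are assumptions made for illustration; they are not the code measured later in this post.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: the class name and the three-field layout are assumptions for illustration.
final class ByteBufferTrades
{
    private static final int PRICE_OFFSET = 0;     // long, 8 bytes
    private static final int QUANTITY_OFFSET = 8;  // long, 8 bytes
    private static final int SIDE_OFFSET = 16;     // char, 2 bytes
    private static final int RECORD_SIZE = 18;

    private final ByteBuffer buffer;

    ByteBufferTrades(final int numRecords)
    {
        // Capacity is an int, so a single buffer cannot exceed 2GB;
        // multiplyExact fails loudly instead of silently overflowing.
        final int capacity = Math.multiplyExact(numRecords, RECORD_SIZE);
        buffer = ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    void put(final int index, final long price, final long quantity, final char side)
    {
        final int base = index * RECORD_SIZE;
        // Absolute puts: every call is bounds checked by ByteBuffer.
        buffer.putLong(base + PRICE_OFFSET, price);
        buffer.putLong(base + QUANTITY_OFFSET, quantity);
        buffer.putChar(base + SIDE_OFFSET, side);
    }

    long price(final int index)
    {
        return buffer.getLong(index * RECORD_SIZE + PRICE_OFFSET);
    }

    long quantity(final int index)
    {
        return buffer.getLong(index * RECORD_SIZE + QUANTITY_OFFSET);
    }

    char side(final int index)
    {
        return buffer.getChar(index * RECORD_SIZE + SIDE_OFFSET);
    }

    public static void main(final String[] args)
    {
        final ByteBufferTrades trades = new ByteBufferTrades(4);
        for (int i = 0; i < 4; i++)
        {
            trades.put(i, 10 + i, i + 1, (i & 1) == 0 ? 'B' : 'S');
        }

        long buyCost = 0;
        for (int i = 0; i < 4; i++)
        {
            if (trades.side(i) == 'B')
            {
                buyCost += trades.price(i) * trades.quantity(i);
            }
        }
        System.out.println("buyCost = " + buyCost); // prints "buyCost = 46"
    }
}
```

Because indices and capacity are `int`, this style hits the 2GB ceiling mentioned above once the record count multiplied by the record size exceeds `Integer.MAX_VALUE`.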

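The approach I am about to detail is not traditional Java. If your problem space involves big data or extreme performance, then there are benefits to be had. If your data sets are small and performance is not an issue, then run away now to avoid getting sucked into the dark arts of native memory management.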

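The benefits of the approach I am about to detail are:

  1. Significantly improved performance
  2. More compact data representation
  3. Ability to work with very large data sets while avoiding nasty GC pauses[1]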

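All choices have consequences. By taking the approach detailed below you take responsibility for some of the memory management yourself. Getting it wrong can lead to memory leaks or, worse, you can crash the JVM! Proceed with caution...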
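One way to limit the risk of leaks is to tie each native allocation to a scope. The sketch below is an assumption of mine rather than part of the original test code: a hypothetical `OffHeapBlock` wrapper that frees its memory in `close()`, so try-with-resources releases it even when an exception is thrown. It adds bounds checks for safety, which gives back some of the speed that raw Unsafe access offers.

```java
import sun.misc.Unsafe;

import java.lang.reflect.Field;

// Sketch only: OffHeapBlock is a hypothetical helper, not part of the original test code.
final class OffHeapBlock implements AutoCloseable
{
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long size;
    private long address;

    OffHeapBlock(final long size)
    {
        this.size = size;
        this.address = UNSAFE.allocateMemory(size);
        UNSAFE.setMemory(address, size, (byte)0); // native memory is not zeroed for us
    }

    void putLong(final long offset, final long value)
    {
        checkBounds(offset, 8);
        UNSAFE.putLong(address + offset, value);
    }

    long getLong(final long offset)
    {
        checkBounds(offset, 8);
        return UNSAFE.getLong(address + offset);
    }

    boolean isClosed()
    {
        return address == 0;
    }

    @Override
    public void close()
    {
        if (address != 0)
        {
            UNSAFE.freeMemory(address);
            address = 0; // guards against a double free
        }
    }

    private void checkBounds(final long offset, final int length)
    {
        if (address == 0)
        {
            throw new IllegalStateException("block already freed");
        }
        if (offset < 0 || offset + length > size)
        {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
    }

    private static Unsafe loadUnsafe()
    {
        try
        {
            final Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe)field.get(null);
        }
        catch (final ReflectiveOperationException e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args)
    {
        // The block is freed when the try scope exits, even on an exception.
        try (OffHeapBlock block = new OffHeapBlock(16))
        {
            block.putLong(8, 42L);
            System.out.println(block.getLong(8)); // prints 42
        }
    }
}
```

Recent JDK releases mark the memory-access methods of `sun.misc.Unsafe` as deprecated, so expect compiler or runtime warnings there.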

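A Suitable Example: Trade Data

A common challenge faced in finance applications is capturing and working with very large volumes of order and trade data. For this example I will create a large table of in-memory trade data that can have analysis queries run against it. The table will be built using two contrasting approaches. First, I take the traditional Java approach of creating a large array that references individual Trade objects. Second, I keep the usage code identical but replace the large array and Trade objects with an off-heap array of structures that is manipulated via a Flyweight pattern.

If I had used some other data structure for the traditional Java approach, such as a Map or Tree, the memory footprint would be even greater and the performance lower.

Traditional Java Approach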

public class TestJavaMemoryLayout
{
    private static final int NUM_RECORDS = 50 * 1000 * 1000;

    private static JavaMemoryTrade[] trades;

    public static void main(final String[] args)
    {
        for (int i = 0; i < 5; i++)
        {
            System.gc();
            perfRun(i);
        }
    }

    private static void perfRun(final int runNum)
    {
        long start = System.currentTimeMillis();

        init();

        System.out.format("Memory %,d total, %,d free\n",
                          Runtime.getRuntime().totalMemory(),
                          Runtime.getRuntime().freeMemory());

        long buyCost = 0;
        long sellCost = 0;

        for (int i = 0; i < NUM_RECORDS; i++)
        {
            final JavaMemoryTrade trade = get(i);

            if (trade.getSide() == 'B')
            {
                buyCost += (trade.getPrice() * trade.getQuantity());
            }
            else
            {
                sellCost += (trade.getPrice() * trade.getQuantity());
            }
        }

        long duration = System.currentTimeMillis() - start;
        System.out.println(runNum + " - duration " + duration + "ms");
        System.out.println("buyCost = " + buyCost + " sellCost = " + sellCost);
    }

    private static JavaMemoryTrade get(final int index)
    {
        return trades[index];
    }

    public static void init()
    {
        trades = new JavaMemoryTrade[NUM_RECORDS];

        final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};
        final int venueCode = pack(londonStockExchange);

        final byte[] billiton = {'B', 'H', 'P'};
        final int instrumentCode = pack( billiton);

        for (int i = 0; i < NUM_RECORDS; i++)
        {
            JavaMemoryTrade trade = new JavaMemoryTrade();
            trades[i] = trade;

            trade.setTradeId(i);
            trade.setClientId(1);
            trade.setVenueCode(venueCode);
            trade.setInstrumentCode(instrumentCode);

            trade.setPrice(i);
            trade.setQuantity(i);

            trade.setSide((i & 1) == 0 ? 'B' : 'S');
        }
    }

    private static int pack(final byte[] value)
    {
        int result = 0;
        switch (value.length)
        {
            case 4:
                result = (value[3]);
            case 3:
                result |= ((int)value[2] << 8);
            case 2:
                result |= ((int)value[1] << 16);
            case 1:
                result |= ((int)value[0] << 24);
                break;

            default:
                throw new IllegalArgumentException("Invalid array size");
        }

        return result;
    }

    private static class JavaMemoryTrade
    {
        private long tradeId;
        private long clientId;
        private int venueCode;
        private int instrumentCode;
        private long price;
        private long quantity;
        private char side;

        public long getTradeId()
        {
            return tradeId;
        }

        public void setTradeId(final long tradeId)
        {
            this.tradeId = tradeId;
        }

        public long getClientId()
        {
            return clientId;
        }

        public void setClientId(final long clientId)
        {
            this.clientId = clientId;
        }

        public int getVenueCode()
        {
            return venueCode;
        }

        public void setVenueCode(final int venueCode)
        {
            this.venueCode = venueCode;
        }

        public int getInstrumentCode()
        {
            return instrumentCode;
        }

        public void setInstrumentCode(final int instrumentCode)
        {
            this.instrumentCode = instrumentCode;
        }

        public long getPrice()
        {
            return price;
        }

        public void setPrice(final long price)
        {
            this.price = price;
        }

        public long getQuantity()
        {
            return quantity;
        }

        public void setQuantity(final long quantity)
        {
            this.quantity = quantity;
        }

        public char getSide()
        {
            return side;
        }

        public void setSide(final char side)
        {
            this.side = side;
        }
    }
}

Compact Off-Heap Structures

import sun.misc.Unsafe;

import java.lang.reflect.Field;

public class TestDirectMemoryLayout
{
    private static final Unsafe unsafe;
    static
    {
        try
        {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe)field.get(null);
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    private static final int NUM_RECORDS = 50 * 1000 * 1000;

    private static long address;
    private static final DirectMemoryTrade flyweight = new DirectMemoryTrade();

    public static void main(final String[] args)
    {
        for (int i = 0; i < 5; i++)
        {
            System.gc();
            perfRun(i);
        }
    }

    private static void perfRun(final int runNum)
    {
        long start = System.currentTimeMillis();

        init();

        System.out.format("Memory %,d total, %,d free\n",
                          Runtime.getRuntime().totalMemory(),
                          Runtime.getRuntime().freeMemory());

        long buyCost = 0;
        long sellCost = 0;

        for (int i = 0; i < NUM_RECORDS; i++)
        {
            final DirectMemoryTrade trade = get(i);

            if (trade.getSide() == 'B')
            {
                buyCost += (trade.getPrice() * trade.getQuantity());
            }
            else
            {
                sellCost += (trade.getPrice() * trade.getQuantity());
            }
        }

        long duration = System.currentTimeMillis() - start;
        System.out.println(runNum + " - duration " + duration + "ms");
        System.out.println("buyCost = " + buyCost + " sellCost = " + sellCost);

        destroy();
    }

    private static DirectMemoryTrade get(final int index)
    {
        final long offset = address + (index * DirectMemoryTrade.getObjectSize());
        flyweight.setObjectOffset(offset);
        return flyweight;
    }

    public static void init()
    {
        final long requiredHeap = NUM_RECORDS * DirectMemoryTrade.getObjectSize();
        address = unsafe.allocateMemory(requiredHeap);

        final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};
        final int venueCode = pack(londonStockExchange);

        final byte[] billiton = {'B', 'H', 'P'};
        final int instrumentCode = pack( billiton);

        for (int i = 0; i < NUM_RECORDS; i++)
        {
            DirectMemoryTrade trade = get(i);

            trade.setTradeId(i);
            trade.setClientId(1);
            trade.setVenueCode(venueCode);
            trade.setInstrumentCode(instrumentCode);

            trade.setPrice(i);
            trade.setQuantity(i);

            trade.setSide((i & 1) == 0 ? 'B' : 'S');
        }
    }

    private static void destroy()
    {
        unsafe.freeMemory(address);
    }

    private static int pack(final byte[] value)
    {
        int result = 0;
        switch (value.length)
        {
            case 4:
                result |= (value[3]);
            case 3:
                result |= ((int)value[2] << 8);
            case 2:
                result |= ((int)value[1] << 16);
            case 1:
                result |= ((int)value[0] << 24);
                break;

            default:
                throw new IllegalArgumentException("Invalid array size");
        }

        return result;
    }

    private static class DirectMemoryTrade
    {
        private static long offset = 0;

        private static final long tradeIdOffset = offset += 0;
        private static final long clientIdOffset = offset += 8;
        private static final long venueCodeOffset = offset += 8;
        private static final long instrumentCodeOffset = offset += 4;
        private static final long priceOffset = offset += 4;
        private static final long quantityOffset = offset += 8;
        private static final long sideOffset = offset += 8;

        private static final long objectSize = offset += 2;

        private long objectOffset;

        public static long getObjectSize()
        {
            return objectSize;
        }

        void setObjectOffset(final long objectOffset)
        {
            this.objectOffset = objectOffset;
        }

        public long getTradeId()
        {
            return unsafe.getLong(objectOffset + tradeIdOffset);
        }

        public void setTradeId(final long tradeId)
        {
            unsafe.putLong(objectOffset + tradeIdOffset, tradeId);
        }

        public long getClientId()
        {
            return unsafe.getLong(objectOffset + clientIdOffset);
        }

        public void setClientId(final long clientId)
        {
            unsafe.putLong(objectOffset + clientIdOffset, clientId);
        }

        public int getVenueCode()
        {
            return unsafe.getInt(objectOffset + venueCodeOffset);
        }

        public void setVenueCode(final int venueCode)
        {
            unsafe.putInt(objectOffset + venueCodeOffset, venueCode);
        }

        public int getInstrumentCode()
        {
            return unsafe.getInt(objectOffset + instrumentCodeOffset);
        }

        public void setInstrumentCode(final int instrumentCode)
        {
            unsafe.putInt(objectOffset + instrumentCodeOffset, instrumentCode);
        }

        public long getPrice()
        {
            return unsafe.getLong(objectOffset + priceOffset);
        }

        public void setPrice(final long price)
        {
            unsafe.putLong(objectOffset + priceOffset, price);
        }

        public long getQuantity()
        {
            return unsafe.getLong(objectOffset + quantityOffset);
        }

        public void setQuantity(final long quantity)
        {
            unsafe.putLong(objectOffset + quantityOffset, quantity);
        }

        public char getSide()
        {
            return unsafe.getChar(objectOffset + sideOffset);
        }

        public void setSide(final char side)
        {
            unsafe.putChar(objectOffset + sideOffset, side);
        }
    }
}

Results

Intel i7-860 @ 2.8GHz, 8GB RAM DDR3 1333MHz, 
Windows 7 64-bit, Java 1.7.0_07
=============================================
java -server -Xms4g -Xmx4g TestJavaMemoryLayout
Memory 4,116,054,016 total, 1,108,901,104 free
0 - duration 19334ms
Memory 4,116,054,016 total, 1,109,964,752 free
1 - duration 14295ms
Memory 4,116,054,016 total, 1,108,455,504 free
2 - duration 14272ms
Memory 3,817,799,680 total, 815,308,600 free
3 - duration 28358ms
Memory 3,817,799,680 total, 810,552,816 free
4 - duration 32487ms

java -server TestDirectMemoryLayout
Memory 128,647,168 total, 126,391,384 free
0 - duration 983ms
Memory 128,647,168 total, 126,992,160 free
1 - duration 958ms
Memory 128,647,168 total, 127,663,408 free
2 - duration 873ms
Memory 128,647,168 total, 127,663,408 free
3 - duration 886ms
Memory 128,647,168 total, 127,663,408 free
4 - duration 884ms

Intel i7-2760QM @ 2.40GHz, 8GB RAM DDR3 1600MHz, 
Linux 3.4.11 kernel 64-bit, Java 1.7.0_07
=================================================
java -server -Xms4g -Xmx4g TestJavaMemoryLayout
Memory 4,116,054,016 total, 1,108,912,960 free
0 - duration 12262ms
Memory 4,116,054,016 total, 1,109,962,832 free
1 - duration 9822ms
Memory 4,116,054,016 total, 1,108,458,720 free
2 - duration 10239ms
Memory 3,817,799,680 total, 815,307,640 free
3 - duration 21558ms
Memory 3,817,799,680 total, 810,551,856 free
4 - duration 23074ms

java -server TestDirectMemoryLayout 
Memory 123,994,112 total, 121,818,528 free
0 - duration 634ms
Memory 123,994,112 total, 122,455,944 free
1 - duration 619ms
Memory 123,994,112 total, 123,103,320 free
2 - duration 546ms
Memory 123,994,112 total, 123,103,320 free
3 - duration 547ms
Memory 123,994,112 total, 123,103,320 free
4 - duration 534ms


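Analysis

Let's compare the results with the three benefits promised above.

1. Significantly improved performance

The evidence here is pretty clear cut. The off-heap structures approach is more than an order of magnitude faster. At the most extreme, look at the fifth run on the Sandy Bridge processor, where there is a 43.2-fold difference in the time taken to complete the task. It is also a nice illustration of how well Sandy Bridge does with predictable data access patterns. Not only is the performance significantly better, it is also more consistent. As the heap becomes fragmented, and access patterns therefore become more random, performance degrades, as can be seen in the later runs with the standard Java approach.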
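The ratios can be checked directly from the Linux timings listed in the results. The following is a small sketch; the class name `SpeedupTable` is my own, and the durations are copied from the output above.

```java
// Sketch only: divides the on-heap durations by the off-heap durations from the Linux runs.
final class SpeedupTable
{
    static double speedup(final long onHeapMs, final long offHeapMs)
    {
        return (double)onHeapMs / offHeapMs;
    }

    public static void main(final String[] args)
    {
        // Durations in milliseconds, copied from the Linux results above.
        final long[] onHeapMs = {12262, 9822, 10239, 21558, 23074};
        final long[] offHeapMs = {634, 619, 546, 547, 534};

        for (int run = 0; run < onHeapMs.length; run++)
        {
            System.out.printf("run %d: %.1fx%n", run, speedup(onHeapMs[run], offHeapMs[run]));
        }
        // The final line printed is "run 4: 43.2x".
    }
}
```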

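2. More compact data representation

For our off-heap representation each object requires 42 bytes. To store 50 million of these, as in the example, we require 2,100,000,000 bytes. The memory required by the JVM heap is:

memory required = total memory – free memory – base JVM needs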

2,883,248,712 = 3,817,799,680 – 810,551,856 – 123,999,112

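This implies the JVM needs roughly 40% more memory to represent the same data. The overhead comes from the array of references to the Java objects plus the object headers. In a previous post I discussed object layout in Java.

When working with very large data sets this overhead can become a significant limiting factor.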
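The arithmetic behind these figures can be reproduced in a few lines. This is a sketch with names of my own choosing (`FootprintMath` and its methods); the field sizes come from the DirectMemoryTrade layout and the memory figures from the Linux run above.

```java
// Sketch only: reproduces the footprint arithmetic from the measured figures.
final class FootprintMath
{
    // Field sizes in DirectMemoryTrade: four longs, two ints and one char.
    static long recordSize()
    {
        return 8 + 8 + 4 + 4 + 8 + 8 + 2;
    }

    static long offHeapBytes(final long records)
    {
        return records * recordSize();
    }

    static long heapBytes(final long total, final long free, final long baseJvm)
    {
        return total - free - baseJvm;
    }

    static double overheadPercent(final long heapBytes, final long offHeapBytes)
    {
        return 100.0 * (heapBytes - offHeapBytes) / offHeapBytes;
    }

    public static void main(final String[] args)
    {
        final long records = 50L * 1000 * 1000;
        final long offHeap = offHeapBytes(records);
        final long heap = heapBytes(3_817_799_680L, 810_551_856L, 123_999_112L);

        System.out.printf("off-heap: %,d bytes%n", offHeap);
        System.out.printf("on-heap:  %,d bytes%n", heap);
        System.out.printf("overhead: %.1f%%%n", overheadPercent(heap, offHeap));
        System.out.printf("on-heap bytes per trade: %.1f%n", (double)heap / records);
    }
}
```

The computed overhead comes out a little above 37%, which is the "roughly 40%" quoted above, and works out to just under 58 bytes per trade on the heap against 42 off the heap.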

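3. Ability to work with very large data sets while avoiding nasty GC pauses

The sample code above forces a GC cycle before each run, which can improve the consistency of the results in some cases. Feel free to remove the call to System.gc() and observe the implications for yourself. If you run the tests with the following command line arguments, the garbage collector will output in painful detail what happened.

-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintHeapAtGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics

From analysing the output I can see the application underwent a total of 29 GC cycles. The pause times listed below were extracted from the lines in the output that indicate when the application threads were stopped.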

With System.gc() before each run
================================
Total time for which application threads were stopped: 0.0085280 seconds
Total time for which application threads were stopped: 0.7280530 seconds
Total time for which application threads were stopped: 8.1703460 seconds
Total time for which application threads were stopped: 5.6112210 seconds
Total time for which application threads were stopped: 1.2531370 seconds
Total time for which application threads were stopped: 7.6392250 seconds
Total time for which application threads were stopped: 5.7847050 seconds
Total time for which application threads were stopped: 1.3070470 seconds
Total time for which application threads were stopped: 8.2520880 seconds
Total time for which application threads were stopped: 6.0949910 seconds
Total time for which application threads were stopped: 1.3988480 seconds
Total time for which application threads were stopped: 8.1793240 seconds
Total time for which application threads were stopped: 6.4138720 seconds
Total time for which application threads were stopped: 4.4991670 seconds
Total time for which application threads were stopped: 4.5612290 seconds
Total time for which application threads were stopped: 0.3598490 seconds
Total time for which application threads were stopped: 0.7111000 seconds
Total time for which application threads were stopped: 1.4426750 seconds
Total time for which application threads were stopped: 1.5931500 seconds
Total time for which application threads were stopped: 10.9484920 seconds
Total time for which application threads were stopped: 7.0707230 seconds

Without System.gc() before each run
===================================
Test run times
0 - duration 12120ms
1 - duration 9439ms
2 - duration 9844ms
3 - duration 20933ms
4 - duration 23041ms

Total time for which application threads were stopped: 0.0170860 seconds
Total time for which application threads were stopped: 0.7915350 seconds
Total time for which application threads were stopped: 10.7153320 seconds
Total time for which application threads were stopped: 5.6234650 seconds
Total time for which application threads were stopped: 1.2689950 seconds
Total time for which application threads were stopped: 7.6238170 seconds
Total time for which application threads were stopped: 6.0114540 seconds
Total time for which application threads were stopped: 1.2990070 seconds
Total time for which application threads were stopped: 7.9918480 seconds
Total time for which application threads were stopped: 5.9997920 seconds
Total time for which application threads were stopped: 1.3430040 seconds
Total time for which application threads were stopped: 8.0759940 seconds
Total time for which application threads were stopped: 6.3980610 seconds
Total time for which application threads were stopped: 4.5572100 seconds
Total time for which application threads were stopped: 4.6193830 seconds
Total time for which application threads were stopped: 0.3877930 seconds
Total time for which application threads were stopped: 0.7429270 seconds
Total time for which application threads were stopped: 1.5248070 seconds
Total time for which application threads were stopped: 1.5312130 seconds
Total time for which application threads were stopped: 10.9120250 seconds
Total time for which application threads were stopped: 7.3528590 seconds

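As can be seen from the output, a significant proportion of the time is spent in the garbage collector. When your threads are stopped, your application is not responsive. These tests were done with default GC settings. It is possible to tweak the GC for better results, but this can be a highly skilled effort. The only JVM I know of that copes well, by not imposing long pause times even under high-throughput conditions, is the Azul concurrent compacting collector.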
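Extracting the stopped-time lines from a GC log is easy to automate. Below is a minimal sketch, assuming the log lines have the same shape as those listed above; the class name `PauseTimeSummary` and its method are hypothetical, and the sample input is just the first three pause lines from the first listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: sums the "threads were stopped" durations found in GC log lines.
final class PauseTimeSummary
{
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9.]+) seconds");

    static double totalStoppedSeconds(final Iterable<String> logLines)
    {
        double total = 0.0;
        for (final String line : logLines)
        {
            final Matcher matcher = STOPPED.matcher(line);
            if (matcher.find())
            {
                total += Double.parseDouble(matcher.group(1));
            }
        }
        return total;
    }

    public static void main(final String[] args)
    {
        // First three pause lines from the listing above, plus one unrelated line that is ignored.
        final List<String> sample = Arrays.asList(
            "Total time for which application threads were stopped: 0.0085280 seconds",
            "some other GC log line",
            "Total time for which application threads were stopped: 0.7280530 seconds",
            "Total time for which application threads were stopped: 8.1703460 seconds");

        System.out.printf("stopped for %.4f seconds%n", totalStoppedSeconds(sample));
    }
}
```

In practice the lines would come from the log file itself, for example via `Files.readAllLines`.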

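When profiling the application I can see that the majority of the time is spent allocating the objects and promoting them to the old generation, because they do not fit in the young generation. The initialisation costs could be removed from the timing, but that is not realistic. With the traditional Java approach the state needs to be built up before the query can take place. The end user of an application has to wait for the state to be built up and for the query to execute.

This test is really quite trivial. Imagine working with similar data sets at the 100 GB scale.

Note: When the garbage collector compacts a region, objects that were next to each other can be moved far apart. This can result in TLB and other cache misses.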

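Side Note on Serialization

A huge benefit of using off-heap structures in this manner is how easily they can be serialised to the network, or to storage, by a simple memory copy, as I showed in the previous post. This way we can completely bypass intermediate buffer and object allocation.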
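As a rough illustration of that idea, the sketch below writes a block of packed records to a file and reads it back without creating any per-record objects. It uses a direct ByteBuffer and a FileChannel rather than raw Unsafe addresses, which is my substitution for the sake of a self-contained example; `StructFileCopy` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: copies packed records to and from a file as raw bytes.
final class StructFileCopy
{
    static void writeAll(final ByteBuffer records, final Path file)
    {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.CREATE,
            StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING))
        {
            // duplicate() shares the bytes but has its own position and limit.
            final ByteBuffer view = records.duplicate();
            view.clear();
            while (view.hasRemaining())
            {
                channel.write(view);
            }
        }
        catch (final IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    static ByteBuffer readAll(final Path file)
    {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ))
        {
            final int size = Math.toIntExact(channel.size());
            final ByteBuffer records = ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder());
            while (records.hasRemaining() && channel.read(records) >= 0)
            {
                // keep reading until the buffer is full or the file ends
            }
            records.clear();
            return records;
        }
        catch (final IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    static ByteBuffer roundTrip(final ByteBuffer records)
    {
        try
        {
            final Path file = Files.createTempFile("trades", ".bin");
            try
            {
                writeAll(records, file);
                return readAll(file);
            }
            finally
            {
                Files.deleteIfExists(file);
            }
        }
        catch (final IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args)
    {
        // Two hypothetical 16-byte records: price and quantity as longs.
        final ByteBuffer records = ByteBuffer.allocateDirect(32).order(ByteOrder.nativeOrder());
        records.putLong(0, 100L);
        records.putLong(8, 5L);
        records.putLong(16, 101L);
        records.putLong(24, 7L);

        final ByteBuffer copy = roundTrip(records);
        System.out.println(copy.getLong(16) + " x " + copy.getLong(24)); // prints "101 x 7"
    }
}
```

Writing in native byte order keeps the copy trivial, but the resulting file is only directly readable on machines with the same endianness.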

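Conclusion

If you are willing to do some C-style programming for large datasets, it is possible to control memory layout in Java by going off-heap. If you do, the benefits in performance, compactness, and avoiding GC issues are significant. However, this is an approach that should not be used for all applications. Its benefits are only noticeable for very large datasets, or at the extreme end of performance in throughput and/or latency.

I hope the Java community can collectively realise the importance of supporting structures both on the heap and on the stack. John Rose has done some excellent work in this area, defining how tuples could be added to the JVM. His talk on Arrays 2.0 from the JVM Language Summit this year is really worth watching. In it John discusses options for arrays of structures and structures of arrays. If tuples, as proposed by John, were available, then the test described here could have comparable performance and a more pleasant programming style. The whole array of structures could be allocated in a single action, bypassing the copying of individual objects across generations, and it would be stored in a compact contiguous fashion. This would remove the significant GC issues for this class of problem.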
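For readers unfamiliar with the distinction, a structure of arrays can already be approximated on the Java heap with one primitive array per field. The sketch below is my own illustration, not John Rose's proposal and not the code measured in this post; `TradeColumns` is a hypothetical name.

```java
// Sketch only: a structure-of-arrays layout using one primitive array per field.
final class TradeColumns
{
    private final long[] prices;
    private final long[] quantities;
    private final byte[] sides;

    TradeColumns(final int numRecords)
    {
        // Three allocations in total, regardless of how many trades are stored.
        prices = new long[numRecords];
        quantities = new long[numRecords];
        sides = new byte[numRecords];
    }

    void set(final int index, final long price, final long quantity, final char side)
    {
        prices[index] = price;
        quantities[index] = quantity;
        sides[index] = (byte)side;
    }

    // Sums price * quantity over all trades on the given side ('B' or 'S').
    long cost(final char side)
    {
        long total = 0;
        for (int i = 0; i < sides.length; i++)
        {
            if (sides[i] == (byte)side)
            {
                total += prices[i] * quantities[i];
            }
        }
        return total;
    }

    public static void main(final String[] args)
    {
        final TradeColumns trades = new TradeColumns(5);
        for (int i = 0; i < 5; i++)
        {
            trades.set(i, i, i, (i & 1) == 0 ? 'B' : 'S');
        }
        // prints "buyCost = 20 sellCost = 10"
        System.out.println("buyCost = " + trades.cost('B') + " sellCost = " + trades.cost('S'));
    }
}
```

Each column is contiguous and carries no per-record header, but the data is still on the heap and the fields of one trade are spread across three arrays.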

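Lately, I have been comparing standard data structures between Java and .Net. In some cases I observed a 6-10x performance advantage for .Net for things like maps and dictionaries when .Net used native structure support. Let's get this into Java as soon as possible!

It is also pretty obvious from the results that if we are to use Java for real-time analysis on big data, then our standard garbage collectors need to improve significantly and support true concurrent operation.

[1] To my knowledge, the only JVM that deals well with very large heaps is Azul Zing.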

Happy coding, and don't forget to share!

Reference: Compact Off-Heap Structures/Tuples in Java, from our JCG partner Martin Thompson at the Mechanical Sympathy blog.


Translated from: https://www.javacodegeeks.com/2012/10/compact-off-heap-structurestuples-in.html
