A Thread-Safe Statistics Tool (Counter)

fancyerII

Categories: Data Structures and Algorithms, java. Tags: tools, thread, null, delete, string, object

Article link: https://blog.csdn.net/fancyerII/article/details/7954957

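Problem Description

We need a statistics tool that tracks things such as the number of queries and the shortest and longest query times. We also need per-category statistics, for example the number of queries per city.

It must be thread-safe, because many threads will update the statistics concurrently.

Simple Approaches

The simplest approach is to give up on thread safety altogether: these are only statistics, so being slightly off may be acceptable.

The next simplest approach is to protect all the data with synchronized. This is the recommended starting point; move on to the more complex approaches below only if you actually find a performance problem.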
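As a baseline, the fully synchronized approach might look like the sketch below. The class name SynchronizedStats and its method names are hypothetical, chosen only for illustration, and the code assumes Java 8 or later (for Map.merge and lambdas).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical baseline: every field is guarded by the object's own monitor.
class SynchronizedStats {
    private long totalQuery = 0;
    private long minTime = Long.MAX_VALUE;
    private long maxTime = Long.MIN_VALUE;
    private final Map<String, Long> countByCity = new HashMap<>();

    // Records one query; all four updates happen under a single lock.
    public synchronized void record(String city, long time) {
        totalQuery++;
        minTime = Math.min(minTime, time);
        maxTime = Math.max(maxTime, time);
        countByCity.merge(city, 1L, Long::sum);
    }

    // Readers take the same lock so that they see the latest writes.
    public synchronized long getTotalQuery() { return totalQuery; }
    public synchronized long getMinTime() { return minTime; }
    public synchronized long getMaxTime() { return maxTime; }
    public synchronized long getCount(String city) {
        return countByCity.getOrDefault(city, 0L);
    }

    public static void main(String[] args) {
        SynchronizedStats stats = new SynchronizedStats();
        stats.record("Beijing", 7);
        stats.record("Shanghai", 3);
        stats.record("Beijing", 9);
        System.out.println(stats.getTotalQuery());     // prints 3
        System.out.println(stats.getMinTime());        // prints 3
        System.out.println(stats.getMaxTime());        // prints 9
        System.out.println(stats.getCount("Beijing")); // prints 2
    }
}
```

Because a single lock covers every field, a reader never observes a total that is out of step with the per-city counts, which the finer-grained approaches do not promise.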

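Efficient but Complex Approaches

Here we assume that we need a solution that is both thread-safe and very efficient (say, because you are chasing maximum performance, or because you are a performance-optimization fanatic).

Problem 1: Counting Queries

This one is simple: just use AtomicInteger or AtomicLong. For example: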

      private final AtomicLong totalQuery=new AtomicLong();

      public void addTotalQuery(){
          totalQuery.incrementAndGet();
      }
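To check that no increments are lost, the counter can be exercised from several threads. This is a minimal sketch: the class name QueryCounterDemo and the helper countConcurrently are hypothetical, and lambdas assume Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

class QueryCounterDemo {
    private final AtomicLong totalQuery = new AtomicLong();

    public void addTotalQuery() {
        totalQuery.incrementAndGet();
    }

    public long getTotalQuery() {
        return totalQuery.get();
    }

    // Starts `threads` workers that each call addTotalQuery() `perThread` times.
    static long countConcurrently(int threads, int perThread) {
        QueryCounterDemo counter = new QueryCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.addTotalQuery();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // join() also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.getTotalQuery();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(8, 100_000)); // prints 800000
    }
}
```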

Problem 2: Tracking the Minimum Query Time

First, what happens if we do no synchronization at all?

       private long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public long getMinTime(){
           return minTime;
       }

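Suppose minTime is currently 5, thread 1 has time 3, and thread 2 has time 4. Both find that their value is smaller than 5 at the same time, so both try to assign their value to minTime.

If thread 1 assigns first and thread 2 assigns afterwards, the final value is 4, which is wrong: the true minimum is 3.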
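The bad interleaving can be replayed deterministically on a single thread by running both checks before either write. This is only an illustration of the ordering; the class name MinRaceReplay is hypothetical and no real threads are involved.

```java
class MinRaceReplay {
    // Replays the schedule: both threads pass the check, then thread 1 writes, then thread 2.
    static long replay() {
        long minTime = 5;
        long time1 = 3; // thread 1's sample
        long time2 = 4; // thread 2's sample

        boolean thread1Passes = time1 < minTime; // 3 < 5
        boolean thread2Passes = time2 < minTime; // 4 < 5, decided before thread 1 writes

        if (thread1Passes) {
            minTime = time1; // minTime becomes 3
        }
        if (thread2Passes) {
            minTime = time2; // stale decision: 3 is overwritten with 4
        }
        return minTime;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 4, although the true minimum is 3
    }
}
```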

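Besides that problem, this method has another bug: the JVM does not guarantee that a write to a non-volatile long is atomic. On a 32-bit JVM in particular, a long assignment may well write the high 32 bits first and then the low 32 bits, so a thread that reads the value between the two writes sees a strange result.

Also, even with no readers, two threads writing the same long at the same time can leave it with the high 32 bits from one thread and the low 32 bits from the other. Either volatile or AtomicLong makes the write atomic, but neither one fixes the race condition described first.

So how do we fix it? Locking is one option, of course: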

       @GuardedBy("this")
       private long minTime=Long.MAX_VALUE;
       public synchronized void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public synchronized long getMinTime(){
           return minTime;
       }

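Note that not only the update but also the read must be synchronized. Besides mutual exclusion, synchronized also defines a happens-before relationship: when getMinTime acquires the lock after updateMinTime has released it, it is guaranteed to see the result of that update. Like volatile, it guarantees visibility.

We can optimize this a little with volatile or AtomicLong: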

       @GuardedBy("this")
       private volatile long minTime=Long.MAX_VALUE;
       public synchronized void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public long getMinTime(){
           return minTime;
       }

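synchronized ensures that only one thread at a time can modify minTime, and because the field is volatile, getMinTime never sees a half-written long.

But only one thread at a time can get through updateMinTime, so this is still not particularly efficient.

We can observe that most threads do not hold the minimum: a thread will usually find that its value is larger than minTime, in which case no update is needed at all. So can we optimize as follows?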

       @GuardedBy("this")
       private volatile long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
               synchronized(this){
                   minTime=time;
               }
           }
       }
       public long getMinTime(){
           return minTime;
       }

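This still does not work: thread 1 and thread 2 can both find that their value is smaller and both try to update, so the problem above shows up again.

If you have run into the lazy initialization problem, you may know that a technique similar to double-checked locking can help here.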

       @GuardedBy("this")
       private volatile long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
               synchronized(this){
                  if(time<minTime){//check again
                       minTime=time;
                  }
               }
           }
       }
       public long getMinTime(){
           return minTime;
       }

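Double-checked locking combined with volatile is correct here, but it is clearly more complicated to implement.

Let's look more closely at why it is correct.

If thread 1 and thread 2 both find that their value is smaller than the current minTime, only one of them can enter the synchronized block at a time, and inside it the thread checks again whether its value is smaller than the current value.

If the current value is now smaller than its own, someone else changed the minimum between the first check and acquiring the lock, and to something even smaller, so there is no need to update.

Otherwise its value is still the minimum, and since it holds the lock nobody else can modify the minimum, so it can safely update.

The advantage over locking the whole method is that, most of the time, the first if already shows that the value is not the minimum, so no lock is needed at all and the call is fast.

To summarize why it is correct: the condition check (if(time<minTime)) and the modification (minTime=time;) are performed atomically, and synchronized is what guarantees this.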
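The double-checked version can be wrapped in a small stress harness to confirm that the smallest sample always wins. This is a minimal sketch: the class name DoubleCheckedMin and the helper minOfConcurrentUpdates are hypothetical, and lambdas assume Java 8 or later.

```java
class DoubleCheckedMin {
    private volatile long minTime = Long.MAX_VALUE;

    public void updateMinTime(long time) {
        if (time < minTime) {             // fast path: most samples stop here, no lock
            synchronized (this) {
                if (time < minTime) {     // re-check while holding the lock
                    minTime = time;
                }
            }
        }
    }

    public long getMinTime() {
        return minTime;
    }

    // Worker i feeds perThread, perThread-1, ..., i+1, so the global minimum is 1.
    static long minOfConcurrentUpdates(int threads, int perThread) {
        DoubleCheckedMin stats = new DoubleCheckedMin();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int lowest = i + 1;
            workers[i] = new Thread(() -> {
                for (long v = perThread; v >= lowest; v--) {
                    stats.updateMinTime(v);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats.getMinTime();
    }

    public static void main(String[] args) {
        System.out.println(minOfConcurrentUpdates(4, 100_000)); // prints 1
    }
}
```

The descending samples are deliberate: they make nearly every call pass the first check, which is the worst case for this scheme and the case where a missing second check would show up.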

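Besides synchronized, many modern processors provide atomic instructions such as CAS (compare-and-swap). We can use these instructions to get the same effect.

Java is cross-platform and has to account for all kinds of processors: it uses CAS where the hardware supports it, and otherwise falls back on the operating system or other libraries to provide semantically atomic operations.

Here we can use AtomicLong.compareAndSet.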

       private final AtomicLong minTime=new AtomicLong(Long.MAX_VALUE);
       public void updateMinTime(long sample){
           while (true) {
               long curMin = minTime.get();
               if (curMin > sample) {
                   boolean result = minTime.compareAndSet(curMin, sample);
                   if (result) break;
               } else {
                   break;
               }
           }
       }

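Let's look at the implementation carefully.

It runs in an "infinite loop". It first reads the "current" minimum; if its own value is not smaller than that, no update is needed and it breaks out of the loop.

If its own value is smaller, it tries boolean result = minTime.compareAndSet(curMin, sample);

compareAndSet tries to set minTime to sample, on the condition that minTime still equals curMin. You can think of the statement as: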

    synchronized(this){
        if(minTime==curMin){
            minTime=sample;
            return true;
        }
        return false;
    }

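This is only a semantic equivalence, of course; in practice it is most likely a single machine instruction.

In other words: I first read the "current" minimum, then take the lock, then check whether the minimum at this moment is still the "current" minimum I read earlier. If it is, my value is still the smallest (I hold the lock, so nobody can change it), and I can safely update.

If the minimum at this moment differs from the one I read, someone modified it in between. The compareAndSet fails, so I retry: I read the new minimum and compare again, and my value may or may not still be smaller.

The advantage of CAS is very high performance when contention is low. It is an "optimistic" scheme: a thread simply keeps attempting the CAS instead of acquiring a lock, so no number of threads makes another thread block (although a thread whose CAS keeps failing is effectively busy-waiting).

The drawback is that under very heavy contention a lot of CPU time is burned on failed CAS attempts, when we could instead block the current thread and let the CPU do something else.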
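The problem statement also asks for the longest query time, which follows the same pattern. The sketch below keeps an explicit CAS loop for the minimum and, as an assumption about the runtime, uses AtomicLong.accumulateAndGet (available since Java 8) for the maximum; that method performs an equivalent retry loop internally. The class name CasMinMax and the helper runConcurrently are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasMinMax {
    private final AtomicLong minTime = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong maxTime = new AtomicLong(Long.MIN_VALUE);

    public void update(long sample) {
        // Minimum: explicit CAS retry loop; exits as soon as sample is not smaller.
        long curMin;
        while (sample < (curMin = minTime.get())) {
            if (minTime.compareAndSet(curMin, sample)) {
                break;
            }
        }
        // Maximum: accumulateAndGet re-applies Math.max until its own CAS succeeds.
        maxTime.accumulateAndGet(sample, Math::max);
    }

    public long getMinTime() { return minTime.get(); }
    public long getMaxTime() { return maxTime.get(); }

    // Worker i feeds i+1, i+1+threads, ...; together the workers cover 1..threads*perThread.
    static long[] runConcurrently(int threads, int perThread) {
        CasMinMax stats = new CasMinMax();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final long first = i + 1;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    stats.update(first + (long) j * threads);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { stats.getMinTime(), stats.getMaxTime() };
    }

    public static void main(String[] args) {
        long[] result = runConcurrently(4, 1000);
        System.out.println(result[0]); // prints 1
        System.out.println(result[1]); // prints 4000
    }
}
```

The minimum and maximum are updated independently, so a reader may briefly see a new minimum paired with an older maximum. That is usually acceptable for statistics, but it is weaker than what a single lock provides.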

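Problem 3: Counting by Category (a Thread-Safe Map)

Suppose we need to count queries by city. The obvious first idea is ConcurrentHashMap<String,Long>, but updating a count is again a read-modify-write with the same synchronization problem, so we can use AtomicLong as the value: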

    private final ConcurrentHashMap<String, AtomicLong> map=new ConcurrentHashMap<String, AtomicLong>();
    public void addCount(String key,long value){
        AtomicLong counter=map.get(key);
        if(counter==null){
            counter=new AtomicLong();
            map.put(key,counter);
        }
        counter.addAndGet(value);
    }

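This code is broken in the same way as before: two threads can both find that counter is null and each create a new AtomicLong. The thread that puts second replaces the first thread's counter, so the first thread's value is never accumulated.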
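The lost update can be reproduced deterministically by replaying that schedule on a single thread, with two local variables standing in for the two threads. The class name LostUpdateReplay is hypothetical and the sketch only illustrates the ordering.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class LostUpdateReplay {
    // Replays the schedule in which both callers read null before either one puts.
    static long replay() {
        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();
        String key = "Beijing";

        // Step 1: both "threads" look up the key and find nothing.
        AtomicLong counterA = map.get(key);
        AtomicLong counterB = map.get(key);

        // Step 2: thread A creates its counter, publishes it, and adds 2.
        if (counterA == null) {
            counterA = new AtomicLong();
            map.put(key, counterA);
        }
        counterA.addAndGet(2);

        // Step 3: thread B still holds null, so it replaces A's counter and adds 1.
        if (counterB == null) {
            counterB = new AtomicLong();
            map.put(key, counterB);
        }
        counterB.addAndGet(1);

        return map.get(key).get();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1 instead of the expected 3
    }
}
```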

We can of course use a similar double check:

    private final ConcurrentHashMap<String, AtomicLong> map=new ConcurrentHashMap<String, AtomicLong>();
    public void addCount(String key,long value){
        AtomicLong counter=map.get(key);
        if(counter==null){
            synchronized(this){
                counter=map.get(key); //read again while holding the lock
                if(counter==null){ //check again
                    counter=new AtomicLong();
                    map.put(key,counter);
                }
            }
        }
        counter.addAndGet(value);
    }

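Likewise, we can use a CAS-like operation, putIfAbsent, to implement a non-blocking algorithm. It is very efficient when contention is not too heavy, and it does not block threads and so avoids the resulting context switches.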

    public void addCounter(String key, long count){
        if(count<=0) return;
        AtomicLong current = map.get(key);
        if(current == null) {
            AtomicLong al=new AtomicLong();
            current=map.putIfAbsent(key, al);
            if(current == null) current=al;
        }
        assert current!=null;
        current.addAndGet(count);
    }

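putIfAbsent tries to put a new key and value. It first checks whether the key already exists: if it does, it returns the existing value; otherwise it inserts the new key-value pair and returns null.

If the return value is null, my thread's insert succeeded, so current is simply the al I constructed; otherwise someone else's insert succeeded and I get their value.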
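Put together as a complete class, the putIfAbsent counter might look like the sketch below. The class name CategoryCounter and the helpers getCount, addCountJava8 and countConcurrently are hypothetical. addCountJava8 is an alternative that assumes Java 8 or later, where ConcurrentMap.computeIfAbsent creates the counter atomically on first use.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class CategoryCounter {
    private final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<>();

    // The get / putIfAbsent pattern described above.
    public void addCount(String key, long count) {
        if (count <= 0) return;
        AtomicLong current = map.get(key);
        if (current == null) {
            AtomicLong fresh = new AtomicLong();
            current = map.putIfAbsent(key, fresh);
            if (current == null) current = fresh; // our insert won the race
        }
        current.addAndGet(count);
    }

    // Java 8+ alternative: computeIfAbsent returns the existing or newly created counter.
    public void addCountJava8(String key, long count) {
        if (count <= 0) return;
        map.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(count);
    }

    public long getCount(String key) {
        AtomicLong counter = map.get(key);
        return counter == null ? 0L : counter.get();
    }

    // Every worker adds 1 to the same key `perThread` times, alternating both methods.
    static long countConcurrently(int threads, int perThread) {
        CategoryCounter counter = new CategoryCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (j % 2 == 0) counter.addCount("Beijing", 1);
                    else counter.addCountJava8("Beijing", 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.getCount("Beijing");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000)); // prints 40000
    }
}
```

The map field is declared final so that the map itself is safely published along with the enclosing object.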

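A careful reader might ask: as with double-checked locking, could reordering by the compiler or the CPU let another thread obtain the object from AtomicLong al=new AtomicLong(); before it is fully constructed?

I had the same doubt, and I noticed that Guava's implementation follows the same pattern as mine, so it would be affected in the same way (https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1).

I asked around, and it turns out there is no problem. The ConcurrentMap documentation states explicitly: Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.

If you are interested, see the references and the email exchange below.

References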

1. http://stackoverflow.com/questions/6072040/thread-safe-implementation-of-max

2. http://stackoverflow.com/questions/8477352/global-in-memory-counter-that-is-thread-safe-and-flushes-to-mysql-every-x-increm

3. http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/util/concurrent/AtomicLongMap.html

4. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentMap.html

5. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html#putIfAbsent%28K,%20V%29

6. https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1

7. http://www.cs.umd.edu/%7Epugh/java/memoryModel/DoubleCheckedLocking.html

[guava] is my counter thread safe and my question about AtomicLongMap?

LiLi <fancyerii@gmail.com>	Fri, Jun 29, 2012 at 6:46 PM
To: guava-discuss@googlegroups.com
hi all

I have read http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html and know a little about Java memory model. I want to implement a thread safe and efficient Map<String,int> counter. I found a related question in stackoverflow http://stackoverflow.com/questions/8477352/global-in-memory-counter-that-is-thread-safe-and-flushes-to-mysql-every-x-increm

my implementation is:

    private ConcurrentHashMap<String, AtomicLong> counter=new ConcurrentHashMap<String, AtomicLong>();
    public void addCount(String key, long count){
        if(count<=0) return;
        AtomicLong current = counter.get(key);
        if(current == null) {
            current=counter.putIfAbsent(key, new AtomicLong());
            if(current == null) current=counter.get(key);
        }
        assert current!=null;
        current.addAndGet(count);
    }

but after I reading http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html . it seems there exists a state that may fail. if thread1 call addCount("key",2); and thread2 call addCount("key",1); at the same time. thread1 executes AtomicLong current = counter.get(key); and gets null. then thread 1 execute

    if(current == null) {
        current=counter.putIfAbsent(key, new AtomicLong());

as compilers/cpus may disorder, new AtomicLong() will not be null but may not be well constructed. Then thread 2 call AtomicLong current = counter.get(key); it's not null but not well constructed. then it call current.addAndGet(). Will thread 2 crash if current is not well constructed?

I also find similar implementation in a popular lib -- guava https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1 it may also fail like my codes. e.g. thread 1 call atomic = map.putIfAbsent(key, new AtomicLong(delta)); thread 2 get a not null atomic and call it's get();

    public long addAndGet(K key, long delta) {
        outer: for (;;) {
            AtomicLong atomic = map.get(key);
            if (atomic == null) {
                atomic = map.putIfAbsent(key, new AtomicLong(delta));
                if (atomic == null) {
                    return delta;
                }
                // atomic is now non-null; fall through
            }
            for (;;) {
                long oldValue = atomic.get();
                if (oldValue == 0L) {
                    // don't compareAndSet a zero
                    if (map.replace(key, atomic, new AtomicLong(delta))) {
                        return delta;
                    }
                    // atomic replaced
                    continue outer;
                }
                long newValue = oldValue + delta;
                if (atomic.compareAndSet(oldValue, newValue)) {
                    return newValue;
                }
                // value changed
            }
        }
    }

tsuna <tsunanet@gmail.com>	Sat, Jun 30, 2012 at 12:01 AM
To: LiLi <fancyerii@gmail.com>
Cc: guava-discuss@googlegroups.com
On Fri, Jun 29, 2012 at 3:46 AM, LiLi <fancyerii@gmail.com> wrote:
> Will thread 2 crash if current is not well constructed?

No. This issue won't arise with ConcurrentHashMap because:
- Either the key is already in the map, in which case putIfAbsent returns the existing entry;
- Or the key is not already in the map, in which case putIfAbsent _atomically_ inserts it in the map and returns null.

In the latter case, as documented by the contract of ConcurrentMap ( http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentMap.html) "placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access". In other words, because the code is guaranteed to be implemented in such a way that there is a happen-before relationship between the call to putIfAbsent in Thread 1, and the call to get in Thread 2. So Thread 2 will either see a correctly constructed AtomicLong, or it will see null and will then race with thread 1 to insert the key in putIfAbsent, which is perfectly fine as only one thread will ultimately manage to create the key.

Put differently, it's as if the map was guaranteed to be implemented with one lock per key, and that the lock was used to ensure that all access to each entry in the map is done in a thread-safe manner.

Of course in practice the implementation can be different, for example ConcurrentHashMap is traditionally implemented by having multiple internal hash maps, each of which is guarded by a single lock, and that lock is only acquired on writes, whereas reads rely only on volatile. Guava's LoadingCache is implemented like that too (in fact its code is actually a fork of that of ConcurrentHashMap). There are obviously many other implementations possible, most notable the one used by NonBlockingHashMap, which is completely lock-free.

is my counter thread safe?

Li Li <fancyerii@gmail.com>	Fri, Jun 29, 2012 at 6:41 PM
To: javamemorymodel-discussion@cs.umd.edu
(Same message text as the post to guava-discuss above.)

Pavel Rappo <pavel.rappo@gmail.com>	Fri, Jun 29, 2012 at 6:54 PM
To: Li Li <fancyerii@gmail.com>
Cc: concurrency-interest <concurrency-interest@cs.oswego.edu>
Hi,

This 'counter.get(key)' cannot simply return "not well constructed" object. The reason is that you use 'putIfAbsent' to put it in the map. Either 'get' will return 'null' or it will return perfectly valid AtomicLong object. There's a happen-before edge between these two actions.

--
Sincerely yours, Pavel Rappo.

Li Li <fancyerii@gmail.com>	Fri, Jun 29, 2012 at 7:11 PM
To: Pavel Rappo <pavel.rappo@gmail.com>
do you mean my counter is safe? putIfAbsent has a happen-before semantic? any document about this? Or only current implementation guarantee this?

Pavel Rappo <pavel.rappo@gmail.com>	Fri, Jun 29, 2012 at 7:24 PM
To: Li Li <fancyerii@gmail.com>
Cc: concurrency-interest <concurrency-interest@cs.oswego.edu>
1. I can't see any obvious flaws in it.
2. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html (see "Memory Consistency Properties")

Li Li <fancyerii@gmail.com>	Fri, Jun 29, 2012 at 7:30 PM
To: Pavel Rappo <pavel.rappo@gmail.com>
thanks.

Jeremy Manson <jeremy.manson@gmail.com>	Mon, Jul 2, 2012 at 2:29 AM
To: Pavel Rappo <pavel.rappo@gmail.com>
Cc: Li Li <fancyerii@gmail.com>, concurrency-interest <concurrency-interest@cs.oswego.edu>
The counter field should be final, because another thread could read the enclosing object before it is finished being constructed (unless you've taken other precautions to prevent that).

Jeremy
