Thread-Safe Statistics Utility (Counters)

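Problem Description

    We need a statistics utility that tracks things such as the number of queries and the shortest and longest query times. We also need per-category statistics, for example the number of queries per city.

    It must be thread-safe, because many threads will update the statistics at the same time.

Simple Approaches

    The simplest approach is to not guarantee thread safety at all: these are only statistics, so being slightly off may be acceptable.

    The next simplest approach is to guard all of the data with synchronized. This is the recommended starting point; move on to the more complex approaches below only if it turns out to be a performance problem.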
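    As a minimal sketch of the "guard everything with synchronized" approach, assuming a hypothetical class SynchronizedStats with a single record method (neither name comes from the original text), it might look like this:

```java
// Hypothetical class and method names, used only for illustration.
class SynchronizedStats {
    private long totalQueries = 0;
    private long minTime = Long.MAX_VALUE;
    private long maxTime = Long.MIN_VALUE;

    // One lock (this) guards every field, so each update is atomic as a whole.
    synchronized void record(long elapsedMillis) {
        totalQueries++;
        if (elapsedMillis < minTime) minTime = elapsedMillis;
        if (elapsedMillis > maxTime) maxTime = elapsedMillis;
    }

    // Reads take the same lock so that they see the latest writes.
    synchronized long getTotalQueries() { return totalQueries; }
    synchronized long getMinTime() { return minTime; }
    synchronized long getMaxTime() { return maxTime; }
}
```

    Every caller contends for the same lock, which is exactly the cost the more complex approaches try to avoid.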

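Efficient but Complex Approaches

    From here on we assume that we need a solution that is both very efficient and thread-safe (for example, you are chasing maximum performance, or you simply enjoy performance tuning).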

 

Problem 1: Counting the Number of Queries

    This one is simple: use AtomicInteger or AtomicLong. For example:

      private final AtomicLong totalQuery=new AtomicLong();

      public void addTotalQuery(){
          totalQuery.incrementAndGet();
      }
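    To see the counter hold up under concurrent updates, here is a small sketch that drives it from several threads. The class name QueryCounterDemo, the getter, and the run helper are assumptions added for the demonstration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; only totalQuery/addTotalQuery come from the text above.
class QueryCounterDemo {
    private final AtomicLong totalQuery = new AtomicLong();

    void addTotalQuery() { totalQuery.incrementAndGet(); }
    long getTotalQuery() { return totalQuery.get(); }

    // Starts `threads` workers that each call addTotalQuery() `perThread` times,
    // waits for all of them, and returns the final count.
    static long run(int threads, int perThread) {
        QueryCounterDemo counter = new QueryCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.addTotalQuery();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.getTotalQuery();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```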


 

Problem 2: Tracking the Shortest Query Time

     First, what happens if we do no synchronization at all?

       private long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public long getMinTime(){
           return minTime;
       }


 

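     Suppose the current value of minTime is 5, thread 1 has a time of 3, and thread 2 has a time of 4. Both see that their value is smaller than 5 at the same time, so both try to assign their value to minTime.

     If thread 1 assigns first and thread 2 assigns second, the final value is 4, which is wrong.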
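     The interleaving can be replayed deterministically in a single thread. The sketch below is only a simulation of the schedule described above (the class name and variables are made up for illustration), not a real multi-threaded reproduction:

```java
// Hypothetical replay of the bad schedule; no real threads are involved.
class LostMinUpdateReplay {
    static long replay() {
        long minTime = 5;
        long t1 = 3; // thread 1's sample
        long t2 = 4; // thread 2's sample

        boolean t1Passed = t1 < minTime; // thread 1 checks: 3 < 5
        boolean t2Passed = t2 < minTime; // thread 2 checks: 4 < 5, before thread 1 writes

        if (t1Passed) minTime = t1;      // thread 1 writes 3
        if (t2Passed) minTime = t2;      // thread 2 overwrites it with 4
        return minTime;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 4, although the true minimum is 3
    }
}
```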

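     Besides the problem above, this method has another bug: in the JVM, assignment to a long is not guaranteed to be atomic. On a 32-bit JVM in particular, a long assignment may write the high 32 bits first and the low 32 bits afterwards, so a thread that reads the value in between can see a strange result.

     Even if nobody reads it, two threads writing the same long at the same time can leave it with the high 32 bits from one thread and the low 32 bits from the other. The atomicity problem can be fixed with either volatile or AtomicLong, but neither of them fixes the race condition described first.

     So how do we fix it? Locking is one option: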

       @GuardedBy("this") // e.g. net.jcip.annotations.GuardedBy
       private long minTime=Long.MAX_VALUE;
       public synchronized void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public synchronized long getMinTime(){
           return minTime;
       }

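     Note: not only the update but also the read must be synchronized. Besides mutual exclusion, synchronized also defines a happens-before relationship: once updateMinTime has released the lock, a getMinTime call that subsequently acquires the same lock is guaranteed to see the update. Like volatile, it guarantees visibility.

    We can optimize this slightly with volatile or AtomicLong: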


       @GuardedBy("this") // applies to writes; reads rely on volatile
       private volatile long minTime=Long.MAX_VALUE;
       public synchronized void updateMinTime(long time){
           if(time<minTime){
                minTime=time;
           }
       }
       public long getMinTime(){
           return minTime;
       }

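      synchronized ensures that only one thread at a time can modify minTime, and because the field is volatile, getMinTime never observes a half-written long.

      However, only one thread at a time can run the update, so this is still not particularly efficient.

      Observe that most threads do not hold the minimum value: a thread will usually find that its value is larger than minTime, in which case no update is needed at all. So can we optimize it as follows?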

       @GuardedBy("this") // applies to writes; reads rely on volatile
       private volatile long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
               synchronized(this){
                   minTime=time;
               }
           }
       }
       public long getMinTime(){
           return minTime;
       }

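       This still does not work: thread 1 and thread 2 can both find that their value is smaller and both go on to update, which leads to the same problem as before.

        Readers who have run into the lazy initialization problem may know that a technique similar to double-checked locking can be used here.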

       @GuardedBy("this") // applies to writes; reads rely on volatile
       private volatile long minTime=Long.MAX_VALUE;
       public void updateMinTime(long time){
           if(time<minTime){
               synchronized(this){
                  if(time<minTime){//check again
                       minTime=time;
                  }
               }
           }
       }
       public long getMinTime(){
           return minTime;
       }

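          This double-checked locking approach combined with volatile is correct, but it is clearly complicated to implement.

          Let us look more closely at why it is correct.

          If thread 1 and thread 2 both find that their value is smaller than the current minTime, only one of them can enter the synchronized block at a time, and inside the block it checks again whether its value is smaller than the current value.

          If the current value is now smaller than its own, someone else changed the minimum between the first check and acquiring the lock, to a value smaller still, so there is no need to update the minimum.

          Otherwise its value is still the minimum, and because it holds the lock nobody else can modify the minimum, so it can update it safely.

          The advantage over locking the whole method is that most of the time the first if already shows that the value is not the minimum, so no lock is needed at all, which makes it fast.

           To summarize why it is correct: it makes the condition check (if(time<minTime)) and the modification (minTime=time;) atomic, and this is achieved with synchronized.

           Besides synchronized, many modern processors provide atomic instructions such as CAS (compare-and-swap). We can use these atomic instructions to achieve the same thing.

           Java is cross-platform, of course, and has to account for many processors: where CAS is available it will use it, and where it is not, it falls back on the operating system or other libraries to provide semantically atomic operations.

           Here we can use AtomicLong.compareAndSet: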

    private final AtomicLong minTime=new AtomicLong(Long.MAX_VALUE);
    public void updateMinTime(long sample){
        while (true) {
            long curMin = minTime.get();
            if (curMin > sample) {
                boolean result = minTime.compareAndSet(curMin, sample);
                if (result) break;
            } else {
                break;
            }
        }
    }

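           Let us look at the implementation closely.

           It runs in an "infinite loop". It first reads the current minimum; if its own value is not smaller than the current minimum, no update is needed and it breaks out of the loop.

           If its own value is smaller, it tries  boolean result = minTime.compareAndSet(curMin, sample);

           compareAndSet tries to update minTime to sample, provided that the value of minTime is still curMin. You can think of the statement like this: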

    synchronized(this){
        if(minTime==curMin){
             minTime=sample;
             return true;
        }
        return false;
     }

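          Of course this is only a semantic equivalent; in practice it is very likely a single machine instruction.

           In other words: I first read the current minimum, then take the lock, then check whether the minimum at this moment is still the value I read earlier. If it is, my value is still the smallest (I hold the lock, so nobody can modify it), and I can update safely.

           If the minimum at this moment differs from the value I read earlier, someone modified it in between. The CAS fails and I have to retry; on the next iteration my value may or may not still be smaller than the new minimum.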
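           The success and failure cases of compareAndSet can be shown in isolation. This is a single-threaded sketch (the class name is made up) in which two updates both start from the same stale read of 5:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class showing what compareAndSet returns.
class CompareAndSetDemo {
    static String demo() {
        AtomicLong minTime = new AtomicLong(5);
        long seen = minTime.get();                       // both "threads" read 5

        boolean first = minTime.compareAndSet(seen, 3);  // value is still 5, so it becomes 3
        boolean second = minTime.compareAndSet(seen, 4); // value is now 3, not 5, so nothing changes

        return first + " " + second + " " + minTime.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true false 3"
    }
}
```

           The failed second call is what sends the update loop back to re-read the current minimum; on that retry 4 is no longer smaller than 3, so the loop exits without writing.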

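           The advantage of CAS is that performance is very high when contention is low. It is a form of optimistic locking: instead of acquiring a lock, a thread simply keeps retrying the CAS, so no matter how many threads there are, none of them is ever blocked waiting for another (although a thread whose CAS keeps failing is effectively busy-waiting).

           The disadvantage is that under very heavy contention a lot of CPU time is spent on failed CAS attempts, when we could instead block the current thread and let the CPU do something else.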
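           On Java 8 or later (an assumption; the code in this article does not require it), the same retry loop is available as AtomicLong.accumulateAndGet, and the longest query time can be tracked symmetrically with Math::max. A sketch, with a hypothetical class name MinMaxTracker:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class; tracks both the shortest and the longest sample.
class MinMaxTracker {
    private final AtomicLong minTime = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong maxTime = new AtomicLong(Long.MIN_VALUE);

    void update(long sample) {
        // accumulateAndGet applies the function to (current, sample) and
        // retries internally until the resulting value has been installed.
        minTime.accumulateAndGet(sample, Math::min);
        maxTime.accumulateAndGet(sample, Math::max);
    }

    long getMinTime() { return minTime.get(); }
    long getMaxTime() { return maxTime.get(); }
}
```

           One trade-off to be aware of: the hand-written loop skips the CAS entirely when the sample is not smaller, whereas accumulateAndGet may still perform a CAS that leaves the value unchanged, so the explicit loop can be the cheaper choice when most samples are not new extremes.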

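Problem 3: Counting by Category (a Thread-Safe Map)

           For example, we need to count queries per city. The first thing that comes to mind is ConcurrentHashMap<String,Long>. But updating a count has the same synchronization problem, so we can use AtomicLong as the value: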

    private final ConcurrentHashMap<String, AtomicLong> map=new ConcurrentHashMap<String, AtomicLong>();
    public void addCount(String key,long value){
        AtomicLong counter=map.get(key);
        if(counter==null){
            counter=new AtomicLong();
            map.put(key,counter);
        }
        counter.addAndGet(value);
    }

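          The code above has a problem. As before, two threads can both find that counter is null and both create a new AtomicLong; the thread that calls put later replaces the earlier counter, so the value added to the earlier one is lost.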
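          The lost update can again be replayed in a single thread. The sketch below simulates the bad schedule step by step; the class name and the key "beijing" are made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical replay of two addCount calls interleaving badly; no real threads.
class LostCounterReplay {
    static long replay() {
        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();

        // Both threads call map.get(key) before either one has put anything.
        AtomicLong seenBy1 = map.get("beijing"); // null
        AtomicLong seenBy2 = map.get("beijing"); // null

        if (seenBy1 == null) {                   // thread 1: addCount("beijing", 2)
            seenBy1 = new AtomicLong();
            map.put("beijing", seenBy1);
        }
        seenBy1.addAndGet(2);

        if (seenBy2 == null) {                   // thread 2: addCount("beijing", 1)
            seenBy2 = new AtomicLong();
            map.put("beijing", seenBy2);         // replaces thread 1's counter
        }
        seenBy2.addAndGet(1);

        return map.get("beijing").get();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1, although 3 was added in total
    }
}
```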

           Of course we can use a similar double-check approach:

    private final ConcurrentHashMap<String, AtomicLong> map=new ConcurrentHashMap<String, AtomicLong>();
    public void addCount(String key,long value){
        AtomicLong counter=map.get(key);
        if(counter==null){
            synchronized(this){
                counter=map.get(key); //check again: re-read from the map, the local variable is stale
                if(counter==null){
                    counter=new AtomicLong();
                    map.put(key,counter);
                }
            }
        }
        counter.addAndGet(value);
    }

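            Likewise, we can use a CAS-like operation, putIfAbsent, to implement a block-free algorithm (one that takes no explicit lock of our own). Such an algorithm is efficient when contention is not too heavy, because it does not block threads and so avoids context switches.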

    public void addCounter(String key, long count){
        if(count<=0) return;
        AtomicLong current = map.get(key);
        if(current == null) {
            AtomicLong al=new AtomicLong();
            current=map.putIfAbsent(key, al);
            if(current == null) current=al;
        }

        assert current!=null;
        current.addAndGet(count);
    }

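            putIfAbsent tries to insert a new key-value pair. It first checks whether the key is present: if it is, it returns the existing value; otherwise it inserts the new key-value pair and returns null.

            If the return value is null, this thread's insert succeeded, so current is simply the al I constructed; otherwise another thread inserted first, and I get its value directly.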
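            The two return cases of putIfAbsent can be checked directly. A single-threaded sketch, with a made-up class name and key:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class showing what putIfAbsent returns in each case.
class PutIfAbsentDemo {
    static String demo() {
        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();

        AtomicLong mine = new AtomicLong();
        AtomicLong previous = map.putIfAbsent("shanghai", mine);  // key absent: mine is inserted, null returned

        AtomicLong other = new AtomicLong();
        AtomicLong existing = map.putIfAbsent("shanghai", other); // key present: mine is returned, other is dropped

        return (previous == null) + " " + (existing == mine) + " " + (map.get("shanghai") == mine);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true true true"
    }
}
```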

            Careful readers may wonder: as with double-checked locking, could compiler or CPU reordering let another thread obtain the object created by AtomicLong al=new AtomicLong(); before it is fully constructed?

            I had the same doubt, and noticed that Guava's implementation looked like it had a similar issue to mine (https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1).

            After asking around, it turns out there is no problem. The ConcurrentMap documentation states explicitly: Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.

            Interested readers can consult the references and the email correspondence below.
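            As a side note, on Java 8 or later (an assumption; the article predates it) the get / putIfAbsent dance can be written with ConcurrentHashMap.computeIfAbsent, which creates the counter atomically for a given key. A sketch with a hypothetical class CityCounter and a small multi-threaded check:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class; an alternative sketch, not the code discussed in the emails.
class CityCounter {
    private final ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();

    void addCount(String key, long value) {
        // Only one AtomicLong is ever installed per key, even under contention.
        map.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(value);
    }

    long getCount(String key) {
        AtomicLong counter = map.get(key);
        return counter == null ? 0L : counter.get();
    }

    // Starts `threads` workers that each add 1 to the same key `perThread` times.
    static long run(int threads, int perThread) {
        CityCounter counter = new CityCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.addCount("beijing", 1);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.getCount("beijing");
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

            Note that the map field is final, which matches the advice given at the end of the email thread below.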

References

1. http://stackoverflow.com/questions/6072040/thread-safe-implementation-of-max

2. http://stackoverflow.com/questions/8477352/global-in-memory-counter-that-is-thread-safe-and-flushes-to-mysql-every-x-increm

3. http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/util/concurrent/AtomicLongMap.html

4. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentMap.html

5. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html#putIfAbsent%28K,%20V%29

6. https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1

7. http://www.cs.umd.edu/%7Epugh/java/memoryModel/DoubleCheckedLocking.html

 

[guava] is my counter thread safe and my question about AtomicLongMap?

LiLi

<fancyerii@gmail.com>
Fri, Jun 29, 2012 at 6:46 PM
To: guava-discuss@googlegroups.com
hi all
  I have read http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
and know a little about Java memory model.
  I want to implement a thread safe and efficient Map<String,int>
counter. I found a related question in stackoverflow
http://stackoverflow.com/questions/8477352/global-in-memory-counter-that-is-thread-safe-and-flushes-to-mysql-every-x-increm
  my implementation is :

       private ConcurrentHashMap<String, AtomicLong> counter=new
ConcurrentHashMap<String, AtomicLong>();

       public void addCount(String key, long count){
               if(count<=0) return;
               AtomicLong current = counter.get(key);
               if(current == null) {
                   current=counter.putIfAbsent(key, new AtomicLong());
                   if(current == null) current=counter.get(key);
               }

               assert current!=null;
               current.addAndGet(count);

       }
  but after I reading
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
. it seems there exists a state that may fail.
  if thread1 call addCount("key",2); and thread2 call
addCount("key",1); at the same time.
  thread1 executes AtomicLong current = counter.get(key);  and gets null.
  then thread 1 execute
         if(current == null) {
                   current=counter.putIfAbsent(key, new AtomicLong());
  as compilers/cpus may disorder, new AtomicLong() will not be null
but may be well constructed.
  Then thread 2 call AtomicLong current = counter.get(key); it's not
null but not well constructed. then it call current.addAndGet().
   Will thread 2 crash if current is not well constructed?

   I also find similar implementation in a popular lib -- guava
https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1
   it may also fail like my codes.
   e.g. thread 1 call atomic = map.putIfAbsent(key, new AtomicLong(delta));
          thread 2 get a not null atomic and call it's get();

public long addAndGet(K key, long delta) {
   outer: for (;;) {
     AtomicLong atomic = map.get(key);
     if (atomic == null) {
       atomic = map.putIfAbsent(key, new AtomicLong(delta));
       if (atomic == null) {
         return delta;
       }
       // atomic is now non-null; fall through
     }

     for (;;) {
       long oldValue = atomic.get();
       if (oldValue == 0L) {
         // don't compareAndSet a zero
         if (map.replace(key, atomic, new AtomicLong(delta))) {
           return delta;
         }
         // atomic replaced
         continue outer;
       }

       long newValue = oldValue + delta;
       if (atomic.compareAndSet(oldValue, newValue)) {
         return newValue;
       }
       // value changed
     }
   }
 }

--
guava-discuss@googlegroups.com
Project site: http://guava-libraries.googlecode.com
This group: http://groups.google.com/group/guava-discuss
 
This list is for general discussion.
To report an issue: http://code.google.com/p/guava-libraries/issues/entry
To get help: http://stackoverflow.com/questions/ask (use the tag "guava")

tsuna

<tsunanet@gmail.com>
Sat, Jun 30, 2012 at 12:01 AM
To: LiLi <fancyerii@gmail.com>
Cc: guava-discuss@googlegroups.com
On Fri, Jun 29, 2012 at 3:46 AM, LiLi <fancyerii@gmail.com> wrote:
>    Will thread 2 crash if current is not well constructed?

No.  This issue won't arise with ConcurrentHashMap because:
  - Either the key is already in the map, in which case putIfAbsent
returns the existing entry;
  - Or the key is not already in the map, in which case putIfAbsent
_atomically_ inserts it in the map and returns null.

In the latter case, as documented by the contract of ConcurrentMap
( http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentMap.html)
"placing an object into a ConcurrentMap as a key or value
happen-before actions subsequent to the access".

In other words, because the code is guaranteed to be implemented in
such a way that there is a happen-before relationship between the call
to putIfAbsent in Thread 1, and the call to get in Thread 2.  So
Thread 2 will either see a correctly constructed AtomicLong, or it
will see null and will then race with thread 1 to insert the key in
putIfAbsent, which is perfectly fine as only one thread will
ultimately manage to create the key.

Put differently, it's as if the map was guaranteed to be implemented
with one lock per key, and that the lock was used to ensure that all
access to each entry in the map is done in a thread-safe manner.  Of
course in practice the implementation can be different, for example
ConcurrentHashMap is traditionally implemented by having multiple
internal hash maps, each of which is guarded by a single lock, and
that lock is only acquired on writes, whereas reads rely only on
volatile.  Guava's LoadingCache is implemented like that too (in fact
its code is actually a fork of that of ConcurrentHashMap).  There are
obviously many other implementations possible, most notable the one
used by NonBlockingHashMap, which is completely lock-free.

 

is my counter thread safe?

Li Li

<fancyerii@gmail.com>
Fri, Jun 29, 2012 at 6:41 PM
To: javamemorymodel-discussion@cs.umd.edu
hi all
   I have read http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
and know a little about Java memory model.
   I want to implement a thread safe and efficient Map<String,int>
counter. I found a related question in stackoverflow
http://stackoverflow.com/questions/8477352/global-in-memory-counter-that-is-thread-safe-and-flushes-to-mysql-every-x-increm
   my implementation is :

        private ConcurrentHashMap<String, AtomicLong> counter=new
ConcurrentHashMap<String, AtomicLong>();

        public void addCount(String key, long count){
                if(count<=0) return;
                AtomicLong current = counter.get(key);
                if(current == null) {
                    current=counter.putIfAbsent( key, new AtomicLong());
                    if(current == null) current=counter.get(key);
                }

                assert current!=null;
                current.addAndGet(count);

        }
   but after I reading
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
. it seems there exists a state that may fail.
   if thread1 call addCount("key",2); and thread2 call
addCount("key",1); at the same time.
   thread1 executes AtomicLong current = counter.get(key);  and gets null.
   then thread 1 execute
          if(current == null) {
                    current=counter.putIfAbsent( key, new AtomicLong());
   as compilers/cpus may disorder, new AtomicLong() will not be null
but may be well constructed.
   Then thread 2 call AtomicLong current = counter.get(key); it's not
null but not well constructed. then it call current.addAndGet().
    Will thread 2 crash if current is not well constructed?

    I also find similar implementation in a popular lib -- guava
https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/util/concurrent/AtomicLongMap.java?name=v11.0-rc1
    it may also fail like my codes.
    e.g. thread 1 call atomic = map.putIfAbsent(key, new AtomicLong(delta));
           thread 2 get a not null atomic and call it's get();

public long addAndGet(K key, long delta) {
    outer: for (;;) {
      AtomicLong atomic = map.get(key);
      if (atomic == null) {
        atomic = map.putIfAbsent(key, new AtomicLong(delta));
        if (atomic == null) {
          return delta;
        }
        // atomic is now non-null; fall through
      }

      for (;;) {
        long oldValue = atomic.get();
        if (oldValue == 0L) {
          // don't compareAndSet a zero
          if (map.replace(key, atomic, new AtomicLong(delta))) {
            return delta;
          }
          // atomic replaced
          continue outer;
        }

        long newValue = oldValue + delta;
        if (atomic.compareAndSet( oldValue, newValue)) {
          return newValue;
        }
        // value changed
      }
    }
  }

Pavel Rappo

<pavel.rappo@gmail.com>
Fri, Jun 29, 2012 at 6:54 PM
To: Li Li <fancyerii@gmail.com>
Cc: concurrency-interest <concurrency-interest@cs.oswego.edu>
Hi,

This 'counter.get(key)' cannot simply return "not well constructed"
object. The reason is that you use 'putIfAbsent' to put it in the map.
Either 'get' will return 'null' or it will return perfectly valid
AtomicLong object. There's a happen-before edge between these two
actions.
> ______________________________ _________________
> Javamemorymodel-discussion mailing list
> Javamemorymodel-discussion@cs.umd.edu
> https://mailman.cs.umd.edu/mailman/listinfo/javamemorymodel-discussion
>


--
Sincerely yours, Pavel Rappo.

Li Li

<fancyerii@gmail.com>
Fri, Jun 29, 2012 at 7:11 PM
To: Pavel Rappo <pavel.rappo@gmail.com>
do you mean my counter is safe?
putIfAbsent has a happen-before semantic? any document about this? Or
only current implementation guarantee this?

Pavel Rappo

<pavel.rappo@gmail.com>
Fri, Jun 29, 2012 at 7:24 PM
To: Li Li <fancyerii@gmail.com>
Cc: concurrency-interest <concurrency-interest@cs.oswego.edu>
1. I can't see any obvious flaws in it.
2. http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html
(see "Memory Consistency Properties")

Li Li

<fancyerii@gmail.com>
Fri, Jun 29, 2012 at 7:30 PM
To: Pavel Rappo <pavel.rappo@gmail.com>

thanks.

On 2012-6-29 at 7:24 PM, "Pavel Rappo" <pavel.rappo@gmail.com> wrote:

Jeremy Manson

<jeremy.manson@gmail.com>
Mon, Jul 2, 2012 at 2:29 AM
To: Pavel Rappo <pavel.rappo@gmail.com>
Cc: Li Li <fancyerii@gmail.com>, concurrency-interest <concurrency-interest@cs.oswego.edu>
The counter field should be final, because another thread could read
the enclosing object before it is finished being constructed (unless
you've taken other precautions to prevent that).

Jeremy
> ______________________________ _________________
> Concurrency-interest mailing list
> Concurrency-interest@cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest

 
