Lecture 20: Thread Safety

1 What Threadsafe Means

A data type or static method is threadsafe if it behaves correctly when used from multiple threads, regardless of how those threads are executed, and without demanding additional coordination from the calling code.

  • “behaves correctly” means satisfying its specification and preserving its rep invariant;
  • “regardless of how threads are executed” means threads might be on multiple processors or timesliced on the same processor;
  • “without additional coordination” means that the data type can’t put preconditions on its caller related to timing, like “you can’t call get() while set() is in progress.”

2 Strategy 1: Confinement

Confinement: avoid races on mutable data by keeping that data confined to a single thread. Don’t give any other threads the ability to read or write the data directly.

  • Local variables are always thread confined. A local variable is stored on the stack, and each thread has its own stack.
  • Be careful: the variable itself is thread confined, but if it holds an object reference, you also need to check the object it points to, especially when that object is mutable. We must ensure that the object cannot be reached from any other thread.
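As a minimal sketch of confinement (the class and method names here are hypothetical, chosen only for illustration), the factorial below keeps all of its state in local variables, and the only object those variables refer to is an immutable BigInteger. Each thread that calls it therefore works on data no other thread can reach.

```java
import java.math.BigInteger;

class ConfinementDemo {
    /** Computes n! using only local variables, so no other thread can reach its state. */
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;   // local reference to an immutable object
        for (int i = 1; i <= n; i++) {        // i lives on the calling thread's stack
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Runs factorial(n) on threadCount threads; true if every thread computed the same value. */
    static boolean agreesAcrossThreads(int n, int threadCount) {
        final BigInteger expected = factorial(n);
        final boolean[] ok = new boolean[threadCount];  // each thread writes only its own slot
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int slot = t;
            threads[t] = new Thread(() -> ok[slot] = factorial(n).equals(expected));
            threads[t].start();
        }
        try {
            for (Thread thread : threads) thread.join();  // join() makes each slot visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        for (boolean b : ok) if (!b) return false;
        return true;
    }
}
```

Note that the ok array is deliberately not confined: it is the one place where results cross threads, and it is read only after join().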

2.1 Avoid Global Variables

Unlike local variables, static variables are not automatically thread confined.

  • If you have static variables in your program, then you have to make an argument that only one thread will ever use them, and you have to document that fact clearly.
  • Better, you should eliminate the static variables entirely.
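One way to eliminate a static variable is to turn it into an instance field of an object that a single thread creates and uses. The sketch below assumes a hypothetical LineCounter class; the names are illustrative and not taken from the lecture.

```java
class LineCounter {
    // Instance field instead of a static field: each LineCounter can be
    // confined to the one thread that created it.
    private int count = 0;

    /** Adds the number of newline-separated lines in text to the running total. */
    void add(String text) {
        if (text.isEmpty()) return;
        count += text.split("\n").length;
    }

    int total() { return count; }
}

class LineCounterDemo {
    /** Counts lines on a fresh thread; the counter never escapes that thread. */
    static int countInNewThread(String... texts) {
        final int[] result = new int[1];
        Thread worker = new Thread(() -> {
            LineCounter counter = new LineCounter();  // created and used only in this thread
            for (String s : texts) counter.add(s);
            result[0] = counter.total();
        });
        worker.start();
        try {
            worker.join();                            // result[0] is safe to read after join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return result[0];
    }
}
```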

3 Strategy 2: Immutability

We’ve said that a type is immutable if an object of the type always represents the same abstract value for its entire lifetime. But that actually allows the type the freedom to mutate its rep, as long as those mutations are invisible to clients, which is called beneficent mutation.

  • An immutable data type that uses beneficent mutation will have to make itself threadsafe using locks.
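To illustrate, here is a sketch of an immutable type with a lazily computed cache, which is a typical beneficent mutation. CachedText and its wordCount() method are hypothetical names. The cache field is not final, so the method is declared synchronized to keep the mutation safe.

```java
class CachedText {
    private final String text;
    private int wordCount = -1;   // cache; -1 means "not computed yet"

    CachedText(String text) { this.text = text; }

    /** Clients always see the same value; the write to the cache is invisible to them. */
    synchronized int wordCount() {
        if (wordCount < 0) {
            String trimmed = text.trim();
            wordCount = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }
        return wordCount;
    }
}
```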

3.1 Stronger definition of immutability

In order to be confident that an immutable data type is threadsafe without locks, we need a stronger definition of immutability:

  • No mutator methods.
  • All fields are private and final.
  • No representation exposure.
  • No mutation whatsoever of mutable objects in the rep – not even beneficent mutation.

And in Java:

  • Don’t provide “setter” methods — methods that modify fields or objects referred to by fields.
  • Make all fields final and private.
  • Don’t allow subclasses to override methods. The simplest way to do this is to declare the class as final. A more sophisticated approach is to make the constructor private and construct instances in factory methods.
  • If the instance fields include references to mutable objects, don’t allow those objects to be changed:
    • Don’t provide methods that modify the mutable objects.
    • Don’t share references to the mutable objects. Never store references to external, mutable objects passed to the constructor; if necessary, create copies, and store references to the copies. Similarly, create copies of your internal mutable objects when necessary to avoid returning the originals in your methods.
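The rules above can be sketched in a small class. Roster is a hypothetical example type: it is final, its only field is private and final, it copies the list passed to the constructor, and it hands out only an unmodifiable view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Roster {                        // final: no subclass can override methods
    private final List<String> names;       // private and final

    Roster(List<String> names) {
        // Defensive copy: never store the caller's mutable list itself.
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    /** Returns an unmodifiable view of the private copy, so the rep cannot be mutated. */
    List<String> names() { return names; }

    int size() { return names.size(); }
}
```

Later changes to the list that was passed in do not affect the Roster, and calling add() on the returned list throws UnsupportedOperationException.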

4 Strategy 3: Using Threadsafe Data Types

Strategy: Store shared mutable data in existing threadsafe data types.
For example: StringBuffer is threadsafe but StringBuilder is not.
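A small sketch of the difference, using a hypothetical SharedBufferDemo class: several threads append to one shared StringBuffer. Because StringBuffer's methods are synchronized, no append is lost. Replacing it with StringBuilder would make the final length unpredictable.

```java
class SharedBufferDemo {
    /** Appends one character at a time from several threads and returns the final length. */
    static int appendFromThreads(int threadCount, int appendsPerThread) {
        final StringBuffer shared = new StringBuffer();   // threadsafe shared mutable data
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) shared.append('x');
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return shared.length();
    }
}
```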

4.1 Threadsafe Collections

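The Collections class provides wrapper methods that make a collection threadsafe while leaving it mutable. For example:
private static Map<Integer,Boolean> cache =
                Collections.synchronizedMap(new HashMap<>());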
  • Don’t circumvent the wrapper: Make sure to throw away references to the underlying non-threadsafe collection, and access it only through the synchronized wrapper.
  • Iterators are still not threadsafe.
for (String s: lst) { ... } // not threadsafe, even if lst is a synchronized list wrapper
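  • Atomic operations aren’t enough to prevent races. Consider an isPrime() method that uses the cache:
public static boolean isPrime(int x) {
    if (cache.containsKey(x)) return cache.get(x);
    boolean answer = BigInteger.valueOf(x).isProbablePrime(100);
    cache.put(x, answer);
    return answer;
}
  • Even with the synchronized map, the isPrime() method still has potential races: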
    The synchronized map ensures that containsKey(), get(), and put() are now atomic, so using them from multiple threads won’t damage the rep invariant of the map. But those three operations can still interleave in arbitrary ways with each other, which might break the invariant that isPrime() needs from the cache: if the cache maps an integer x to a value f, then x is prime if and only if f is true. If the cache ever fails this invariant, then we might return the wrong result.
    So we have to argue that the races between containsKey(), get(), and put() don’t threaten this invariant.
    • The race between containsKey() and get() is not harmful because we never remove items from the cache: once it contains a result for x, it will continue to do so.
    • There’s a race between containsKey() and put(). As a result, two threads may both test the primality of the same x at the same time, and both will race to call put() with the answer. But both of them should call put() with the same answer, so it doesn’t matter which one wins the race; the result will be the same.
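The points in this section can be put together in one sketch. The class name PrimeCache and the helpers cachedPrimeCount() and fillConcurrently() are hypothetical additions for illustration. Only the wrapped map is kept, isPrime() relies on the safety argument above, and the traversal holds the wrapper's lock, which the Collections.synchronizedMap documentation requires when iterating over a collection view.

```java
import java.math.BigInteger;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class PrimeCache {
    // Only the synchronized wrapper is kept; the underlying HashMap is unreachable.
    private static final Map<Integer, Boolean> cache =
            Collections.synchronizedMap(new HashMap<>());

    /** Three separate atomic calls; safe only because of the argument above. */
    static boolean isPrime(int x) {
        if (cache.containsKey(x)) return cache.get(x);
        boolean answer = BigInteger.valueOf(x).isProbablePrime(100);
        cache.put(x, answer);
        return answer;
    }

    /** Iteration is not atomic, so hold the wrapper's lock for the whole traversal. */
    static int cachedPrimeCount() {
        int count = 0;
        synchronized (cache) {
            for (boolean flag : cache.values()) if (flag) count++;
        }
        return count;
    }

    /** Calls isPrime(0..limit-1) from several threads at once, then counts cached primes. */
    static int fillConcurrently(int limit, int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int x = 0; x < limit; x++) isPrime(x);
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return cachedPrimeCount();
    }
}
```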

5 How to Make a Safety Argument

If you want to convince yourself and others that your concurrent program is correct, the best approach is to make an explicit argument that it’s free from races, and write it down.

  • A safety argument needs to catalog all the threads that exist in your module or program and the data that they use, and argue which of the four techniques (confinement, immutability, threadsafe data types, or synchronization) you are using to protect against races for each data object or variable.

5.1 Thread Safety Arguments for Data Types

  • Confinement is not usually an option when we’re making an argument just about a data type, because you have to know what threads exist in the system and what objects they’ve been given access to.
  • Immutability example:
/** MyString is an immutable data type representing a string of characters. */
public class MyString {
    private final char[] a;
    // Thread safety argument:
    //    This class is threadsafe because it's immutable:
    //    - a is final
    //    - a points to a mutable char array, but that array is encapsulated
    //      in this object, not shared with any other object or exposed to a
    //      client
    // ... (constructor and operations omitted)
}
  • Just using immutable and threadsafe-mutable data types is not sufficient when the rep invariant depends on relationships between objects in the rep.
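As an illustration of the last point, consider a hypothetical Graph whose rep invariant says that both endpoints of every edge are also in nodes. Even if nodes and edges were each threadsafe sets, one thread could run removeNode() between another thread's additions to nodes and to edges, leaving an edge whose endpoint is missing. One common remedy, sketched here under the assumption that a single lock guarding the whole rep is acceptable, is to make every operation synchronized.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Graph {
    // Rep invariant: both endpoints of every edge in edges are also in nodes.
    private final Set<String> nodes = new HashSet<>();
    private final Set<List<String>> edges = new HashSet<>();  // each edge is [from, to]

    /** Adds both endpoints and the edge as one atomic step. */
    synchronized void addEdge(String from, String to) {
        nodes.add(from);
        nodes.add(to);
        edges.add(Arrays.asList(from, to));
    }

    /** Removes a node together with every edge that touches it. */
    synchronized void removeNode(String node) {
        edges.removeIf(edge -> edge.contains(node));
        nodes.remove(node);
    }

    synchronized int edgeCount() { return edges.size(); }

    /** Returns true if the rep invariant currently holds. */
    synchronized boolean checkRep() {
        for (List<String> edge : edges) {
            if (!nodes.containsAll(edge)) return false;
        }
        return true;
    }
}
```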

5.2 Serializability

  • What we demand from a threadsafe data type is that when clients call its atomic operations concurrently, the results are consistent with some sequential ordering of the calls. This property is called serializability.
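A rough way to see this property in action, using a hypothetical SerializabilityDemo class: several threads call incrementAndGet() on one AtomicInteger. In any sequential ordering of those calls, the returned values would be 1, 2, ..., N, each exactly once, so the check below verifies that the concurrent run produced exactly that set of results.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SerializabilityDemo {
    /** True if the concurrent calls returned what some sequential ordering would return. */
    static boolean resultsMatchSomeOrder(int threadCount, int callsPerThread) {
        final AtomicInteger counter = new AtomicInteger();            // threadsafe counter
        final Set<Integer> returned = ConcurrentHashMap.newKeySet();  // threadsafe result set
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    returned.add(counter.incrementAndGet());
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        int total = threadCount * callsPerThread;
        // A sequential run returns each of 1..total exactly once.
        for (int v = 1; v <= total; v++) {
            if (!returned.contains(v)) return false;
        }
        return counter.get() == total && returned.size() == total;
    }
}
```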

Reference

[1] 6.005 — Software Construction on MIT OpenCourseWare | OCW 6.005 Homepage at https://ocw.mit.edu/ans7870/6/6.005/s16/
