Trident state

Latest recommended article published on 2017-08-06 21:46:26

victory0508

Category: Storm. Tags: storm, Trident

State in Trident

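Trident has first-class abstractions for reading from and writing to stateful sources. The state can either be internal to the topology (for example, kept in memory and backed by HDFS) or stored externally in a database such as Memcached or Cassandra. The Trident API is the same in either case.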

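Trident manages state in a fault-tolerant way so that state updates are idempotent in the face of retries and failures. There are several levels of fault tolerance for state updates. Let's start with an example that shows what is needed to achieve exactly-once semantics. Suppose you are counting the tuples in a stream and storing the running count in a database, as a single value that you increment every time you process a new tuple. When a failure occurs, tuples are replayed. This creates a problem for state updates: you do not know whether a replayed tuple has already been applied to the state. Perhaps you never processed the tuple before, in which case you should increment the count. Perhaps you processed it and incremented the count, but the tuple then failed in a later step, in which case you should not increment the count. Or perhaps you saw the tuple before but got an error while writing to the database, in which case you should update the database.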

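If you store only the count in the database, you have no way of knowing whether a given tuple has been processed before, so you need more information to make the right decision. Trident provides the following semantics, which are sufficient for achieving exactly-once processing: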

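Tuples are processed in small batches (see the tutorial).
Each batch of tuples is given a unique id called the "transaction id" (txid). If the batch is replayed, it is given the exact same txid.
State updates are ordered among batches: the state updates for batch 3 are not applied until the state updates for batch 2 have succeeded.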

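With these primitives, your State implementation can detect whether a batch of tuples has been processed before and take the appropriate action to update the state consistently. The action you take depends on the exact semantics your input spouts provide about what is in each batch. There are three kinds of spouts with respect to fault tolerance, and three kinds of state to match: "non-transactional", "transactional", and "opaque transactional". Let's look at each spout type and the kind of fault tolerance it makes possible.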

Transactional spouts

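Remember that Trident processes tuples in small batches, each assigned a unique transaction id. Spouts differ in the guarantees they can make about what is in each batch. A transactional spout has the following properties: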

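The batch for a given txid is always the same: a replay of the batch for a txid contains exactly the same set of tuples as the first time that txid was emitted.
There is no overlap between batches (a tuple is in one batch or another, never in several).
Every tuple is in a batch (no tuples are skipped).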

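This is an easy kind of spout to understand: the stream is divided into fixed batches that never change. storm-contrib has an implementation of a transactional spout for Kafka.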

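So why not always use a transactional spout? They are simple and easy to understand. One reason is that they are not necessarily very fault-tolerant. For example, TransactionalTridentKafkaSpout works so that the batch for a txid contains tuples from a Kafka topic. Once a batch has been emitted, any later re-emission of that batch must contain exactly the same set of tuples to satisfy the semantics of transactional spouts. Now suppose TransactionalTridentKafkaSpout emits a batch, the batch fails to process, and at the same time a Kafka node goes down. You can no longer replay the same batch, because some of its tuples are unavailable, and processing halts.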

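This is why "opaque transactional" spouts exist: they tolerate the loss of source nodes while still allowing you to achieve exactly-once semantics. (Once Kafka supports replication, transactional spouts will be able to tolerate node failure as well, but that feature did not exist at the time of writing.)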

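Before getting to opaque transactional spouts, let's look at how you would design a State implementation with exactly-once semantics for transactional spouts. This kind of State is called a "transactional state", and it takes advantage of the fact that any given txid is always associated with exactly the same set of tuples.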

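Suppose your topology computes word counts and stores them in a key/value database. The key is the word and the value contains the count. As shown above, storing only the count is not enough to tell whether a batch of tuples has already been processed. Instead, you store the transaction id together with the count in the database as a single atomic value. When updating the count, you compare the transaction id in the database with the transaction id of the current batch. If they are the same, you skip the update: because of the strong ordering, you know for certain that the value in the database already includes the current batch. If they differ, you increment the count. This logic works because the batch for a txid never changes and Trident ensures that state updates are applied in batch order.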

Consider this example. Suppose you are processing txid 3, which consists of the following batch of tuples:

["man"]
["man"]
["dog"]

Suppose the database currently holds the following key/value pairs:

man => [count=3, txid=1]
dog => [count=4, txid=3]
apple => [count=10, txid=2]

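The txid associated with "man" is 1. Since the current txid is 3, you know for certain that this batch of tuples is not yet reflected in that count, so you can increment the count by 2 and update the txid. The txid for "dog", on the other hand, equals the current txid, so you know the increment from the current batch is already included in the database count and you skip the update. After the updates complete, the database looks like this: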

man => [count=5, txid=3]
dog => [count=4, txid=3]
apple => [count=10, txid=2]
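
The comparison rule from this example can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (TransactionalCountSketch, TxCount, applyBatch, example) are invented for this sketch and are not part of Trident, and a HashMap stands in for the key/value database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the transactional-state update rule.
// None of these names come from Trident; they exist only for this sketch.
class TransactionalCountSketch {

    // A count stored together with the txid of the last batch applied to it.
    static final class TxCount {
        final long count;
        final long txid;

        TxCount(long count, long txid) {
            this.count = count;
            this.txid = txid;
        }
    }

    // Applies one batch's partial counts to the store.
    // A key whose stored txid equals the batch txid is skipped: the batch for
    // a txid never changes, so its contribution is already in the count.
    static void applyBatch(Map<String, TxCount> db, long txid, Map<String, Long> partialCounts) {
        for (Map.Entry<String, Long> e : partialCounts.entrySet()) {
            TxCount current = db.get(e.getKey());
            if (current != null && current.txid == txid) {
                continue; // stored value already includes this batch
            }
            long base = (current == null) ? 0L : current.count;
            db.put(e.getKey(), new TxCount(base + e.getValue(), txid));
        }
    }

    // Builds the database from the example and applies the txid 3 batch.
    static Map<String, TxCount> example() {
        Map<String, TxCount> db = new HashMap<>();
        db.put("man", new TxCount(3, 1));
        db.put("dog", new TxCount(4, 3));
        db.put("apple", new TxCount(10, 2));

        Map<String, Long> batch = new HashMap<>();
        batch.put("man", 2L); // ["man"], ["man"]
        batch.put("dog", 1L); // ["dog"]
        applyBatch(db, 3, batch);
        return db;
    }
}
```

Calling applyBatch again with the same txid and the same partial counts leaves the store unchanged, which is what makes a replay of a transactional batch safe.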

Now let's look at opaque transactional spouts and how to design states for this type of spout.

Opaque transactional spouts

An opaque transactional spout cannot guarantee that the batch of tuples for a txid stays the same. An opaque transactional spout has the following property:

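Every tuple is successfully processed in exactly one batch. However, a tuple may fail to process in one batch and then succeed in a later batch.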

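OpaqueTridentKafkaSpout has this property and is fault-tolerant to losing Kafka nodes. Whenever OpaqueTridentKafkaSpout emits a batch, it starts emitting tuples from where the last batch finished. This ensures that no tuple is ever skipped and no tuple is successfully processed by more than one batch.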

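With opaque transactional spouts, you can no longer skip the state update when the transaction id in the database equals the transaction id of the current batch, because the batch may have changed between state updates.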

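What you can do is store more state in the database. Rather than storing a value and a transaction id, you store a value, a transaction id, and the previous value. Let's again use the example of a count in the database. Suppose the partial count for your batch is 2 and it is time to apply a state update. Suppose the database currently holds: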

{ value = 4,
  prevValue = 1,
  txid = 2
}

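Suppose your current txid is 3, which differs from the one in the database. In this case, you set "prevValue" to "value", add your partial count to "value", and update the txid. The new database value is: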

{ value = 6,
  prevValue = 4,
  txid = 3
}

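Now suppose instead that your current txid is 2, the same as the one in the database. You know that "value" in the database contains an update from an earlier batch with your current txid, but that batch may have been different, so you have to ignore it. In this case, you add your partial count to "prevValue" to compute the new "value". The database value then looks like this: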

{ value = 3,
  prevValue = 1,
  txid = 2
}

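This works because of the strong ordering of batches that Trident provides. Once Trident moves on to a new batch for state updates, it never goes back to a previous batch. And since opaque transactional spouts guarantee no overlap between batches, so that each tuple is successfully processed by exactly one batch, you can safely update based on the previous value.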
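
The two cases above can be sketched as one small immutable record. This is a minimal illustration, not Trident's own implementation: OpaqueCount and its apply method are names invented for this sketch, and the fields simply mirror the value / prevValue / txid layout used in the example.

```java
// Hypothetical illustration of the opaque-state update rule.
// OpaqueCount is an invented name for this sketch, not a Trident class.
final class OpaqueCount {
    final long value;
    final long prevValue;
    final long txid;

    OpaqueCount(long value, long prevValue, long txid) {
        this.value = value;
        this.prevValue = prevValue;
        this.txid = txid;
    }

    // Returns the record to store after applying a batch's partial count.
    OpaqueCount apply(long batchTxid, long partialCount) {
        if (batchTxid == txid) {
            // Same txid as stored: the earlier attempt may have held different
            // tuples, so discard "value" and recompute it from "prevValue".
            return new OpaqueCount(prevValue + partialCount, prevValue, txid);
        }
        // Different txid: the current value becomes the previous value.
        return new OpaqueCount(value + partialCount, value, batchTxid);
    }
}
```

Starting from the stored record { value = 4, prevValue = 1, txid = 2 } and a partial count of 2, apply(3, 2) produces the first result shown above and apply(2, 2) produces the second.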

Non-transactional spouts

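Non-transactional spouts provide no guarantees about what is in each batch. Such a spout may give at-most-once processing, where tuples are not retried after a failed batch, or at-least-once processing, where tuples may be successfully processed by multiple batches. There is no way to achieve exactly-once semantics with this kind of spout.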

Summary of spout and state types

The figure below shows which combinations of spouts and states enable exactly-once messaging semantics:

[Figure: Spouts vs States]

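Opaque transactional states have the strongest fault tolerance, but this comes at the cost of storing the txid and two values in the database. Transactional states require less state in the database but only work with transactional spouts. Finally, non-transactional states require the least state in the database but cannot achieve exactly-once semantics.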

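The state and spout types you choose are a tradeoff between fault tolerance and storage cost. Ultimately, your application requirements determine which combination is right for you.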
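
The combinations described in this section can be written down as a small lookup. This is one reading of the text, not an API that Trident exposes: the enum and method names are hypothetical, and the opaque-state-with-transactional-spout case is included on the grounds that a transactional spout also satisfies the opaque property (each tuple is successfully processed in exactly one batch).

```java
// Hypothetical helper encoding the spout/state combinations described above.
// None of these names are part of Trident.
final class ExactlyOnceSketch {
    enum SpoutType { NON_TRANSACTIONAL, TRANSACTIONAL, OPAQUE_TRANSACTIONAL }
    enum StateType { NON_TRANSACTIONAL, TRANSACTIONAL, OPAQUE_TRANSACTIONAL }

    // Returns true when the combination can give exactly-once semantics.
    static boolean exactlyOnce(SpoutType spout, StateType state) {
        switch (state) {
            case TRANSACTIONAL:
                // Transactional states only work with transactional spouts.
                return spout == SpoutType.TRANSACTIONAL;
            case OPAQUE_TRANSACTIONAL:
                // Opaque states tolerate batches that change across replays.
                return spout == SpoutType.TRANSACTIONAL
                        || spout == SpoutType.OPAQUE_TRANSACTIONAL;
            default:
                // Non-transactional states never give exactly-once.
                return false;
        }
    }
}
```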

State APIs

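You have seen the intricacies of what it takes to achieve exactly-once semantics. The nice thing about Trident is that it internalizes all the fault-tolerance logic within the State. As a user, you do not have to deal with comparing txids, storing multiple values in the database, or anything like that. You can write code like this: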

TridentTopology topology = new TridentTopology();        
TridentState wordCounts =
      topology.newStream("spout1", spout)
        .each(new Fields("sentence"), new Split(), new Fields("word"))
        .groupBy(new Fields("word"))
        .persistentAggregate(MemcachedState.opaque(serverLocations), new Count(), new Fields("count"))                
        .parallelismHint(6);

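All the logic needed to manage opaque transactional state is internalized in the MemcachedState.opaque call. In addition, updates are automatically batched to minimize round trips to the database.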

The base State interface has just two methods:

public interface State {
    void beginCommit(Long txid); // can be null for things like partitionPersist occurring off a DRPC stream
    void commit(Long txid);
}

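You are told when a state update is beginning and when it is ending, and you are given the txid in each case. Trident assumes nothing about how your state works, which methods update it, or which methods read from it.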

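Suppose you have a database that holds user location information and you want to access it from Trident. Your State implementation would have methods for getting and setting user information: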

public class LocationDB implements State {
    public void beginCommit(Long txid) {    
    }

    public void commit(Long txid) {    
    }

    public void setLocation(long userId, String location) {
      // code to access database and set location
    }

    public String getLocation(long userId) {
      // code to get location from database
      throw new UnsupportedOperationException("not implemented in this example");
    }
}

Next, you provide Trident with a StateFactory that can create instances of your State object within Trident tasks. The StateFactory for LocationDB might look like this:

public class LocationDBFactory implements StateFactory {
   public State makeState(Map conf, int partitionIndex, int numPartitions) {
      return new LocationDB();
   } 
}

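Trident provides the QueryFunction interface for writing Trident operations that query a source of state, and the StateUpdater interface for writing Trident operations that update a source of state. For example, let's write a "QueryLocation" operation that queries LocationDB for the locations of users. Start with how you would use it in a topology. Suppose the topology consumes an input stream of userids: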

TridentTopology topology = new TridentTopology();
TridentState locations = topology.newStaticState(new LocationDBFactory());
topology.newStream("myspout", spout)
        .stateQuery(locations, new Fields("userid"), new QueryLocation(), new Fields("location"));

The implementation of QueryLocation looks like this:

public class QueryLocation extends BaseQueryFunction<LocationDB, String> {
    public List<String> batchRetrieve(LocationDB state, List<TridentTuple> inputs) {
        List<String> ret = new ArrayList<String>();
        for(TridentTuple input: inputs) {
            ret.add(state.getLocation(input.getLong(0)));
        }
        return ret;
    }

    public void execute(TridentTuple tuple, String location, TridentCollector collector) {
        collector.emit(new Values(location));
    }    
}

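QueryFunctions execute in two steps. First, Trident collects a batch of reads and passes them to batchRetrieve; in this case, batchRetrieve receives multiple userids. batchRetrieve is expected to return a list of results the same size as the list of input tuples: the first element of the result list is the result for the first input tuple, the second for the second, and so on. Second, Trident calls execute once for each input tuple with its corresponding result. Note that this code does not take advantage of batching, since it queries LocationDB one user at a time. A better way to write LocationDB would be: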

public class LocationDB implements State {
    public void beginCommit(Long txid) {    
    }

    public void commit(Long txid) {    
    }

    public void setLocationsBulk(List<Long> userIds, List<String> locations) {
      // set locations in bulk
    }

    public List<String> bulkGetLocations(List<Long> userIds) {
      // get locations in bulk
      throw new UnsupportedOperationException("not implemented in this example");
    }
}

Then you can rewrite the QueryLocation function like this:

public class QueryLocation extends BaseQueryFunction<LocationDB, String> {
    public List<String> batchRetrieve(LocationDB state, List<TridentTuple> inputs) {
        List<Long> userIds = new ArrayList<Long>();
        for(TridentTuple input: inputs) {
            userIds.add(input.getLong(0));
        }
        return state.bulkGetLocations(userIds);
    }

    public void execute(TridentTuple tuple, String location, TridentCollector collector) {
        collector.emit(new Values(location));
    }    
}

This code is more efficient because it reduces round trips to the database.

To update state, you use the StateUpdater interface. Here is a StateUpdater that updates a LocationDB with new location information:

public class LocationUpdater extends BaseStateUpdater<LocationDB> {
    public void updateState(LocationDB state, List<TridentTuple> tuples, TridentCollector collector) {
        List<Long> ids = new ArrayList<Long>();
        List<String> locations = new ArrayList<String>();
        for(TridentTuple t: tuples) {
            ids.add(t.getLong(0));
            locations.add(t.getString(1));
        }
        state.setLocationsBulk(ids, locations);
    }
}

Here is how you would use this operation in a Trident topology:

TridentTopology topology = new TridentTopology();
TridentState locations = 
    topology.newStream("locations", locationsSpout)
        .partitionPersist(new LocationDBFactory(), new Fields("userid", "location"), new LocationUpdater());

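The partitionPersist operation updates a source of state. The StateUpdater receives the State and a batch of tuples with updates to that State. This code just collects the userids and locations from the input tuples and does a bulk set into the State.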

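partitionPersist returns a TridentState object representing the location db being updated by the Trident topology. You can then use this state in stateQuery operations. StateUpdaters are also given a TridentCollector. Tuples emitted to this collector go to the "new values stream". If you were doing something like updating counts in a database, you could emit the updated counts to that stream, and then get access to the new values stream for further processing via the TridentState#newValuesStream method.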

persistentAggregate

Trident has another method for updating States called persistentAggregate. It was used in the streaming word count example, shown again here:

TridentTopology topology = new TridentTopology();        
TridentState wordCounts =
      topology.newStream("spout1", spout)
        .each(new Fields("sentence"), new Split(), new Fields("word"))
        .groupBy(new Fields("word"))
        .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"));

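persistentAggregate is an additional abstraction built on top of partitionPersist that knows how to take a Trident aggregator and use it to apply updates to the source of state. In this case, since the stream is grouped, Trident expects the state you provide to implement the "MapState" interface. The grouping fields become the keys in the state, and the aggregation result becomes the values. The "MapState" interface looks like this: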

public interface MapState<T> extends State {
    List<T> multiGet(List<List<Object>> keys);
    List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters);
    void multiPut(List<List<Object>> keys, List<T> vals);
}

When you aggregate on a non-grouped stream (a global aggregation), Trident expects your State object to implement the "Snapshottable" interface:

public interface Snapshottable<T> extends State {
    T get();
    T update(ValueUpdater updater);
    void set(T o);
}

MemoryMapState and MemcachedState each implement both of these interfaces.

Implementing Map States

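Trident makes it easy to implement MapStates by doing almost all the work for you. The OpaqueMap, TransactionalMap, and NonTransactionalMap classes implement all the logic for their respective fault-tolerance semantics. You simply provide these classes with an IBackingMap implementation that knows how to perform multiGets and multiPuts of the respective key/values. IBackingMap looks like this: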

public interface IBackingMap<T> {
    List<T> multiGet(List<List<Object>> keys); 
    void multiPut(List<List<Object>> keys, List<T> vals); 
}

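OpaqueMap calls multiPut with OpaqueValue objects as the values, TransactionalMap passes TransactionalValue objects, and NonTransactionalMap just passes the objects from the topology through. Trident also provides the CachedMap class to do automatic LRU caching of map key/values. Finally, Trident provides the SnapshottableMap class, which turns a MapState into a Snapshottable object by storing global aggregations under a fixed key.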
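
To make the contract concrete, here is a minimal in-memory backing map. It is a sketch under stated assumptions: the IBackingMap interface is redeclared locally, copied from the listing above, so that the example compiles without Storm on the classpath, and InMemoryBackingMap is a hypothetical class name. A real implementation would issue one bulk request to the underlying store per call rather than loop over a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Local copy of the interface shown above, so this sketch is self-contained.
interface IBackingMap<T> {
    List<T> multiGet(List<List<Object>> keys);
    void multiPut(List<List<Object>> keys, List<T> vals);
}

// Hypothetical in-memory implementation; not part of Trident.
class InMemoryBackingMap<T> implements IBackingMap<T> {
    private final Map<List<Object>, T> store = new HashMap<>();

    // Returns one result per key, in the same order; null for missing keys.
    @Override
    public List<T> multiGet(List<List<Object>> keys) {
        List<T> ret = new ArrayList<>(keys.size());
        for (List<Object> key : keys) {
            ret.add(store.get(key));
        }
        return ret;
    }

    // Stores vals.get(i) under keys.get(i).
    @Override
    public void multiPut(List<List<Object>> keys, List<T> vals) {
        for (int i = 0; i < keys.size(); i++) {
            // Copy the key so later mutation by the caller cannot corrupt the map.
            store.put(new ArrayList<>(keys.get(i)), vals.get(i));
        }
    }
}
```

The value type T is left generic because, as described above, the wrapping class decides what is stored: an opaque wrapper would store records holding a value, a previous value, and a txid, while a non-transactional wrapper would store the raw aggregate.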

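Take a look at the MemcachedState implementation to see how all these utilities can be put together to make a high-performance MapState implementation. MemcachedState lets you choose between opaque transactional, transactional, and non-transactional semantics.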
