Trident state

radar1985

Tags: tuples, database, cassandra, processing, storage

Original source: https://github.com/nathanmarz/storm/wiki/Trident-state

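Trident has first-class abstractions for reading from and writing to stateful sources. The state can be internal to the topology (for example, kept in memory and backed by HDFS) or stored externally in a database such as Memcached or Cassandra. The Trident API is the same in either case.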

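Trident manages state in a fault-tolerant way, so that state updates are idempotent in the face of failures and retries. This is what lets you reason about Trident topologies as if each message were processed exactly once.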

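Several levels of fault tolerance are possible when doing state updates. Before getting to them, let's look at an example that shows why some tricks are needed to achieve exactly-once semantics. Suppose you are doing a count aggregation over a stream and want to store the running count in a database. Suppose further that you store a single value in the database representing the count, and that you increment it every time you process a new tuple.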

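When failures occur, tuples are replayed. This creates a problem for state updates (or anything else with side effects): you have no idea whether you have already successfully updated the state based on this tuple.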

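Perhaps you have never processed the tuple before, in which case you should increment the count. Perhaps you processed the tuple and successfully incremented the count, but the tuple failed in a later step; in that case you should not increment the count again. Or perhaps you saw the tuple before but got an error while updating the database, in which case you should update the database.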
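The double-counting problem can be reproduced in a few lines. The following is a minimal sketch, not Storm code: the dictionary `db` stands in for the database, and `process` and `process_with_replay` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical in-memory "database" that stores only the running count.
db = {"count": 0}

def process(tuple_, fail_downstream=False):
    """Naive update: increment the count, then run the rest of the pipeline."""
    db["count"] += 1  # the side effect happens before we know the tuple succeeded
    if fail_downstream:
        raise RuntimeError("a later step failed; the tuple will be replayed")

def process_with_replay(tuple_, fail_first_attempt):
    """Process a tuple, replaying it once if the first attempt fails."""
    try:
        process(tuple_, fail_downstream=fail_first_attempt)
    except RuntimeError:
        # On replay, nothing in the store says this tuple was already counted.
        process(tuple_)

process_with_replay("a", fail_first_attempt=False)
process_with_replay("b", fail_first_attempt=True)
print(db["count"])  # 3, although only 2 distinct tuples were seen
```

The count alone carries no record of which tuples contributed to it, so the replay of "b" is indistinguishable from a new tuple.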

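If you store only the count in the database, you cannot tell whether a given tuple has been processed before, so you need more information to make the right decision. Trident provides the following semantics, which are sufficient for achieving exactly-once processing: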

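1. Tuples are processed as small batches.

2. Each batch of tuples is given a unique "transaction id" (txid). If the batch is replayed, it is given exactly the same txid.

3. State updates are ordered among batches. That is, the state update for batch 3 is not applied until the state update for batch 2 has succeeded.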

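With these primitives, your State implementation can detect whether a batch of tuples has been processed before and take the appropriate action to update the state consistently. The action you take depends on the exact guarantees your input spouts provide about what is in each batch. With respect to fault tolerance there are three kinds of spouts: "non-transactional", "transactional", and "opaque transactional". Likewise, there are three kinds of state: "non-transactional", "transactional", and "opaque transactional". Let's look at each spout type and the kind of fault tolerance you can achieve with it.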

Transactional spouts

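Remember that Trident processes tuples as small batches, each with a unique transaction id. Spouts differ in the guarantees they give about what is in each batch. A transactional spout has the following properties: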

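1. The batch for a given txid is always the same. A replay of the batch for a txid emits exactly the same set of tuples as the first time that batch was emitted for that txid.

2. Batches of tuples do not overlap. A tuple is in one batch or another, never in several.

3. Every tuple is in a batch. No tuples are skipped.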

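This is a fairly easy type of spout to understand: the stream is divided into fixed batches that never change. trident-kafka has a transactional spout implementation for Kafka.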
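As an illustration of these three properties, here is a minimal sketch of a toy spout over a fixed list. `ToyTransactionalSpout` and `emit_batch` are hypothetical names and are not part of the Storm or trident-kafka API; the sketch also assumes txids start at 1.

```python
class ToyTransactionalSpout:
    """Hypothetical spout over a fixed list of tuples; not the Storm API."""

    def __init__(self, source, batch_size):
        self.source = list(source)
        self.batch_size = batch_size

    def emit_batch(self, txid):
        # The batch is a pure function of the txid (assumed to start at 1),
        # so a replay returns exactly the same tuples as the first emission.
        start = (txid - 1) * self.batch_size
        return self.source[start:start + self.batch_size]

spout = ToyTransactionalSpout(["man", "dog", "man", "apple", "dog"], batch_size=2)
first = spout.emit_batch(2)
replay = spout.emit_batch(2)
print(first == replay)  # True: the batch for txid 2 never changes

# Concatenating the batches reproduces the source: no overlap, nothing skipped.
all_batches = spout.emit_batch(1) + spout.emit_batch(2) + spout.emit_batch(3)
print(all_batches == spout.source)  # True
```

The toy works only because the whole source stays readable at all times, which is exactly the assumption that breaks when a source node goes down.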

You might be wondering: why wouldn't you always use a transactional spout? They are simple and easy to understand. One reason is that they are not necessarily very fault-tolerant. For example, the way TransactionalTridentKafkaSpout works is that the batch for a txid contains tuples from all the Kafka partitions for a topic. Once a batch has been emitted, any time that batch is re-emitted in the future the exact same set of tuples must be emitted to meet the semantics of transactional spouts. Now suppose a batch is emitted from TransactionalTridentKafkaSpout, the batch fails to process, and at the same time one of the Kafka nodes goes down. You are now incapable of replaying the same batch as before (since the node is down and some partitions for the topic are unavailable), and processing will halt.

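This is why "opaque transactional" spouts exist: they tolerate the loss of source nodes while still allowing exactly-once processing semantics. They are covered in the next section.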

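(A side note: once Kafka supports replication, it will be possible to have transactional spouts that are fault-tolerant to node failure, but that feature did not yet exist when this was written.)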

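Before we get to opaque transactional spouts, let's look at how you would design a State implementation that has exactly-once semantics for transactional spouts. This State type is called a "transactional state", and it takes advantage of the fact that any given txid is always associated with the exact same set of tuples.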

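Suppose your topology computes a word count and you want to store the counts in a key/value database. The key is the word and the value contains its count. You have already seen that storing just the count is not enough to know whether a batch of tuples has been processed before. Instead, you can store the transaction id together with the count as a single atomic value. Then, when updating the count, you compare the txid in the database with the txid of the current batch. If they are the same, you skip the update: because of the strong ordering, you know for sure that the value in the database already incorporates the current batch. If they differ, you increment the count. This logic works because the batch for a txid never changes and because Trident ensures that state updates are applied in batch order.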

Consider this example of why it works. Suppose you are processing txid 3, which consists of the following batch of tuples:

["man"]
["man"]
["dog"]

Suppose the database currently holds the following key/value pairs:

man => [count=3, txid=1]
dog => [count=4, txid=3]
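apple => [count=10, txid=2]

The txid associated with "man" is 1. Since the current txid is 3, you know for sure that this batch has not yet been counted for "man", so you can increment its count by 2 and update its txid to 3. The txid for "dog", on the other hand, is 3, the same as the current txid, so you know the batch has already been applied to "dog" and you can skip that update. After the updates complete, the database looks like this: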

man => [count=5, txid=3]
dog => [count=4, txid=3]
apple => [count=10, txid=2]
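The update rule for a transactional state can be sketched as follows. This is a plain simulation under stated assumptions, not Trident's State API: `apply_batch` is a hypothetical helper, and the dictionary `db` maps each word to a `(count, txid)` pair that is assumed to be written atomically.

```python
from collections import Counter

def apply_batch(db, txid, batch):
    """Apply one batch of words to db, which maps word -> (count, txid)."""
    for word, partial in Counter(batch).items():
        count, stored_txid = db.get(word, (0, None))
        if stored_txid == txid:
            # Same txid: the stored count already includes this batch.
            continue
        # Different txid: add the partial count and record the new txid together.
        db[word] = (count + partial, txid)

db = {"man": (3, 1), "dog": (4, 3), "apple": (10, 2)}
apply_batch(db, 3, ["man", "man", "dog"])
print(db)  # {'man': (5, 3), 'dog': (4, 3), 'apple': (10, 2)}

# Replaying the same batch with the same txid leaves the database unchanged.
apply_batch(db, 3, ["man", "man", "dog"])
print(db)  # {'man': (5, 3), 'dog': (4, 3), 'apple': (10, 2)}
```

The second call shows why the update is idempotent: after the first application every touched key carries txid 3, so the replay is skipped key by key.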

Opaque transactional spouts

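As described earlier, an opaque transactional spout cannot guarantee that the batch of tuples for a txid stays the same. An opaque transactional spout has the following property: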

Every tuple is successfully processed in exactly one batch. However, a tuple may fail to process in one batch and then be processed successfully in a later batch.

OpaqueTridentKafkaSpout is a spout that has this property and is fault-tolerant to losing Kafka nodes. Whenever it's time for OpaqueTridentKafkaSpout to emit a batch, it emits tuples starting from where the last batch finished emitting. This ensures that no tuple is ever skipped or successfully processed by multiple batches.
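To make the behavior concrete, here is a toy model of a spout that resumes from where the last successful batch ended and reads only from partitions that are currently reachable. It is a sketch under assumptions, not the OpaqueTridentKafkaSpout implementation: `ToyOpaqueSpout`, `emit_batch`, `commit`, and the per-partition offset bookkeeping are all hypothetical.

```python
class ToyOpaqueSpout:
    """Hypothetical spout over named partitions; not the Storm or Kafka API."""

    def __init__(self, partitions):
        self.partitions = partitions                  # name -> list of tuples
        self.committed = {p: 0 for p in partitions}   # next offset to read per partition

    def emit_batch(self, available, max_per_partition=2):
        """Return (tuples, new_offsets), reading only from available partitions."""
        batch, offsets = [], dict(self.committed)
        for name in sorted(available):
            start = self.committed[name]
            chunk = self.partitions[name][start:start + max_per_partition]
            batch.extend(chunk)
            offsets[name] = start + len(chunk)
        return batch, offsets

    def commit(self, offsets):
        # Assumed to be called only after the batch was processed successfully.
        self.committed = offsets

spout = ToyOpaqueSpout({"p0": ["a", "b", "c"], "p1": ["x", "y"]})
attempt1, _ = spout.emit_batch(available={"p0", "p1"})  # this batch fails; nothing is committed
attempt2, offsets = spout.emit_batch(available={"p0"})  # on the replay, p1's node is down
spout.commit(offsets)                                   # the replay succeeds
attempt3, offsets = spout.emit_batch(available={"p0", "p1"})
spout.commit(offsets)
print(attempt1, attempt2, attempt3)
```

In this toy run the failed attempt and its replay contain different tuples, yet the two successful batches together cover every tuple exactly once.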

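With opaque transactional spouts, you can no longer use the trick of skipping a state update when the txid in the database equals the txid of the current batch, because the contents of the batch may have changed between state updates.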

What you can do is store more state in the database. Rather than store a value and transaction id in the database, you instead store a value, transaction id, and the previous value in the database. Let's again use the example of storing a count in the database. Suppose the partial count for your batch is "2" and it's time to apply a state update. Suppose the value in the database looks like this:

{ value = 4,
  prevValue = 1,
  txid = 2
}

Suppose your current txid is 3, different than what's in the database. In this case, you set "prevValue" equal to "value", increment "value" by your partial count, and update the txid. The new database value will look like this:

{ value = 6,
  prevValue = 4,
  txid = 3
}

Now suppose your current txid is 2, equal to what's in the database. Now you know that the "value" in the database contains an update from a previous batch for your current txid, but that batch may have been different so you have to ignore it. What you do in this case is increment "prevValue" by your partial count to compute the new "value". You then set the value in the database to this:

{ value = 3,
  prevValue = 1,
  txid = 2
}

This works because of the strong ordering of batches provided by Trident. Once Trident moves onto a new batch for state updates, it will never go back to a previous batch. And since opaque transactional spouts guarantee no overlap between batches – that each tuple is successfully processed by one batch – you can safely update based on the previous value.
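The two cases above can be written as one small function. This is a minimal sketch rather than Trident's opaque state implementation: `opaque_update` is a hypothetical helper, and the stored record is modeled as a dictionary with the keys `value`, `prevValue`, and `txid` used in the text.

```python
def opaque_update(stored, txid, partial):
    """Return the new record for an opaque transactional state update."""
    if stored["txid"] == txid:
        # Same txid: "value" came from an earlier attempt whose batch may have
        # differed, so discard it and rebuild from prevValue.
        base = stored["prevValue"]
    else:
        # New txid: the stored value is final and becomes the new prevValue.
        base = stored["value"]
    return {"value": base + partial, "prevValue": base, "txid": txid}

stored = {"value": 4, "prevValue": 1, "txid": 2}

new_batch = opaque_update(stored, 3, 2)
print(new_batch)  # {'value': 6, 'prevValue': 4, 'txid': 3}

replayed = opaque_update(stored, 2, 2)
print(replayed)   # {'value': 3, 'prevValue': 1, 'txid': 2}
```

Keeping `prevValue` is what makes the replay safe: the contribution of the earlier attempt can be dropped entirely without knowing which tuples it contained.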