Various Sequence-to-Sequence Models

1. A basic LSTM encoder-decoder.

Encoder:

$X$ is the input sentence. $C$ is the final hidden state produced by the encoder, referred to as the context vector.

\[C=LSTM(X).\]

Decoder:

The output at each step serves as the input at the next step, and the very first input is the context vector produced by the encoder. The encoder's final hidden state is usually also used to initialize the decoder's state $s_{0}$.

Basic formulas:

\[y_{0} = LSTM(s_{0}, C);\]

where $C$ is the context vector produced by the encoder.
\[y_t = LSTM(s_{t-1}, y_{t-1});\]

$s$ is the LSTM state, consisting of the hidden state and the cell state ($h$ and $c$):

\[s_t=[h_t,c_t]\]
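
Below is a minimal sketch of this basic encoder-decoder. The choice of PyTorch, the `Seq2Seq` class, and all sizes are illustrative assumptions; the original post does not name an implementation.

```python
# A minimal sketch of the basic LSTM encoder-decoder described above (PyTorch is
# an assumption -- the post names no framework).
import torch
import torch.nn as nn


class Seq2Seq(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTMCell(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x, steps):
        # Encoder: C is the encoder's final hidden state (the context vector).
        _, (h_n, c_n) = self.encoder(x)          # h_n, c_n: (1, batch, dim)
        C = h_n.squeeze(0)

        # Decoder state s_0 is initialised from the encoder's final (h, c).
        h, c = h_n.squeeze(0), c_n.squeeze(0)

        # y_0 = LSTM(s_0, C): the first decoder input is the context vector.
        inp, outputs = C, []
        for _ in range(steps):
            h, c = self.decoder(inp, (h, c))      # s_t = [h_t, c_t]
            y = self.out(h)
            outputs.append(y)
            inp = y                               # y_t becomes the next input
        return torch.stack(outputs, dim=1)


model = Seq2Seq(dim=16)
y = model(torch.randn(2, 5, 16), steps=7)         # (batch=2, steps=7, dim=16)
```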

 

2. A basic LSTM encoder-decoder with peek.

The encoder is the same as above. In the decoder, the input at each step is $(s_{t-1}, y_{t-1}, C)$. The "peek" here means that the context vector $C$ is fed as an additional input at every decoding step.

Initialization: \[y_{0} = LSTM(s_{0}, C, C)\]

Recurrence at each step: \[y_{t} = LSTM(s_{t-1}, y_{t-1}, C)\]
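
A minimal sketch of the "peek" decoder follows, under the same assumed PyTorch setup as above (the `PeekDecoder` class and its sizes are hypothetical): the context vector $C$ is concatenated to the previous output at every step.

```python
# Sketch of a "peek" decoder: C is appended to the input at every step.
import torch
import torch.nn as nn


class PeekDecoder(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # The input is [y_{t-1}; C], so the cell sees 2 * dim features.
        self.cell = nn.LSTMCell(2 * dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, C, h, c, steps):
        # y_0 = LSTM(s_0, C, C): at t = 0 the "previous output" is C itself.
        y, outputs = C, []
        for _ in range(steps):
            h, c = self.cell(torch.cat([y, C], dim=-1), (h, c))
            y = self.out(h)
            outputs.append(y)
        return torch.stack(outputs, dim=1)


dim = 16
dec = PeekDecoder(dim)
C = torch.randn(2, dim)                 # context vector from the encoder
h0 = c0 = torch.zeros(2, dim)           # or the encoder's final (h, c)
y = dec(C, h0, c0, steps=7)             # (batch=2, steps=7, dim=16)
```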

 




Reposted from: https://www.cnblogs.com/ZJUT-jiangnan/p/5414732.html

Flax is a popular deep learning library built on top of JAX, a high-performance scientific computing library for Python. It provides an easy-to-use API for defining and training neural network models, while leveraging the speed and efficiency of JAX's Just-In-Time (JIT) compilation and automatic differentiation.

In the context of Flax, a model typically refers to a class or a set of functions that define the architecture of a neural network. It includes layers, activation functions, and parameters that are learned during training. Flax supports various types of models, such as feedforward networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, and more.

Here are some key aspects of a Flax model:

1. **Structured state**: Flax uses a structured state format, where all learnable parameters are stored in a single object, making it easier to manage and apply weight updates.
2. **Functional API**: The library encourages a functional programming style, allowing users to build complex models as compositions of simple functions, which makes code more modular and testable.
3. **Module system**: Flax uses a hierarchical module system that allows you to create and reuse sub-modules, enabling code reusability and organization.
4. **Modularity**: Models are composed of individual modules, each with its own forward pass, making it simple to experiment with different architectures.
5. **Dynamic shapes**: Flax handles variable-size inputs and dynamic shapes efficiently, which is crucial for sequence modeling tasks.
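
A minimal sketch of these points using `flax.linen` is shown below; the small MLP architecture and the sizes are illustrative assumptions.

```python
# A small Flax model: modules are declared inline, parameters live in a
# separate structured state object, and the forward pass is a pure function.
import jax
import jax.numpy as jnp
import flax.linen as nn


class MLP(nn.Module):
    hidden: int

    @nn.compact
    def __call__(self, x):
        x = nn.Dense(self.hidden)(x)    # sub-modules created inside the module
        x = nn.relu(x)
        return nn.Dense(1)(x)


model = MLP(hidden=32)
# init returns the structured parameter state (the "structured state" above).
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 8)))
# apply runs the forward pass functionally, with parameters passed in explicitly.
y = model.apply(params, jnp.ones((4, 8)))   # output shape: (4, 1)
```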