Good News for Sequence-Model Developers: Lingvo, a TensorFlow Framework for Sequence Modeling

Original article: https://medium.com/tensorflow/lingvo-a-tensorflow-framework-for-sequence-modeling-8b1d6ffba5bb?linkId=63952201

GitHub: https://github.com/tensorflow/lingvo

Colab: https://colab.research.google.com/github/tensorflow/lingvo/blob/master/codelabs/introduction.ipynb

 

Lingvo is the word for “language” in Esperanto, the international language. The name alludes to the roots of the Lingvo framework: it was developed as a general deep learning framework on top of TensorFlow, with a focus on sequence models for language-related tasks such as machine translation, speech recognition, and speech synthesis.

Internally, the framework gained traction and the number of researchers using it grew quickly. As a result, there are now dozens of published papers with state-of-the-art results produced using Lingvo, with more to come. Supported architectures range from traditional RNN sequence models to Transformer models and models that include VAE components. To show our support for the research community and to encourage reproducible research, we have open-sourced the framework and are starting to release the models used in our papers.

 

Lingvo was built with collaborative research in mind, and promotes code reuse by sharing the implementation of common layers across different tasks. In addition, all layers implement the same common interface and are laid out in the same way. Not only does this produce cleaner and more understandable code, it makes it extremely simple to apply improvements someone else made for a different task to your own task. Enforcing this consistency does come at the cost of requiring more discipline and boilerplate, but Lingvo attempts to minimize this to ensure fast iteration time during research.
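To make the "same common interface" idea concrete, here is a minimal plain-Python sketch. It does not use Lingvo itself: the names `LayerParams`, `BaseLayer`, `Scale`, `Offset`, `Stack`, and `fprop` are hypothetical stand-ins chosen for illustration, and Lingvo's real layer classes are considerably richer. The sketch only shows why a shared configure-then-instantiate convention lets a layer written for one task drop into another.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Type


@dataclass
class LayerParams:
    """Hypothetical config object: which layer class to build, and with what settings."""
    layer_cls: Type["BaseLayer"]
    name: str = ""
    settings: Dict[str, Any] = field(default_factory=dict)

    def set(self, **kwargs: Any) -> "LayerParams":
        # Reject unknown keys so that a typo in a config fails loudly.
        for key, value in kwargs.items():
            if key == "name":
                self.name = value
            elif key in self.settings:
                self.settings[key] = value
            else:
                raise AttributeError(f"unknown parameter: {key}")
        return self

    def instantiate(self) -> "BaseLayer":
        return self.layer_cls(self)


class BaseLayer:
    """Every layer offers the same surface: params(), __init__(params), fprop()."""
    DEFAULTS: Dict[str, Any] = {}

    @classmethod
    def params(cls) -> LayerParams:
        return LayerParams(layer_cls=cls, settings=dict(cls.DEFAULTS))

    def __init__(self, params: LayerParams) -> None:
        self.p = params

    def fprop(self, inputs: List[float]) -> List[float]:
        raise NotImplementedError


class Scale(BaseLayer):
    DEFAULTS = {"factor": 1.0}

    def fprop(self, inputs):
        return [x * self.p.settings["factor"] for x in inputs]


class Offset(BaseLayer):
    DEFAULTS = {"bias": 0.0}

    def fprop(self, inputs):
        return [x + self.p.settings["bias"] for x in inputs]


class Stack(BaseLayer):
    """Container layer: children share one interface, so no special cases are needed."""
    DEFAULTS = {"children": []}

    def __init__(self, params):
        super().__init__(params)
        self.layers = [child.instantiate() for child in params.settings["children"]]

    def fprop(self, inputs):
        for layer in self.layers:
            inputs = layer.fprop(inputs)
        return inputs


stack_params = Stack.params().set(
    name="toy_stack",
    children=[Scale.params().set(factor=2.0), Offset.params().set(bias=1.0)],
)
model = stack_params.instantiate()
result = model.fprop([1.0, 2.0, 3.0])
print(result)
```

Because `Stack` only relies on `instantiate()` and `fprop()`, swapping `Scale` for any other layer that follows the same convention requires a change to the configuration only, not to the container.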

Another aspect of collaboration is sharing reproducible results. Lingvo provides a centralized location for checked-in model hyperparameter configurations. Not only does this serve to document important experiments, it gives others an easy way to reproduce your results by training an identical model.
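One way to picture a centralized set of checked-in configurations is a registry that maps an experiment name to the class holding its hyperparameters. The sketch below is plain Python, not Lingvo's actual registry; `EXPERIMENTS`, `register`, `get_params`, and the two toy classes (with invented values) are hypothetical names used only for illustration.

```python
from typing import Dict, Type

# Hypothetical registry: experiment name -> class that knows its hyperparameters.
EXPERIMENTS: Dict[str, Type] = {}


def register(cls: Type) -> Type:
    """Class decorator that files a config class under its own name."""
    if cls.__name__ in EXPERIMENTS:
        raise ValueError(f"duplicate experiment name: {cls.__name__}")
    EXPERIMENTS[cls.__name__] = cls
    return cls


def get_params(name: str) -> Dict[str, float]:
    """Looks up a registered experiment and returns a fresh params dict."""
    return EXPERIMENTS[name].task()


@register
class ToyBaseline:
    @classmethod
    def task(cls) -> Dict[str, float]:
        # Invented values, for illustration only.
        return {"learning_rate": 1e-3, "num_layers": 2}


@register
class ToyDeeper(ToyBaseline):
    """A follow-up experiment: inherit the baseline and change only what differs."""

    @classmethod
    def task(cls) -> Dict[str, float]:
        p = super().task()
        p["num_layers"] = 4
        return p


print(sorted(EXPERIMENTS))
print(get_params("ToyDeeper"))
```

Anyone with the repository can then rebuild the exact configuration from its name alone, and a follow-up experiment records only its differences from the baseline.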

def Task(cls):
  p = model.AsrModel.Params()
  p.name = 'librispeech'

  # Initialize encoder params.
  ep = p.encoder
  # Data consists of 240-dimensional frames (80 x 3 frames), which we
  # re-interpret as individual 80-dimensional frames. See also
  # LibrispeechCommonAsrInputParams.
  ep.input_shape = [None, None, 80, 1]
  ep.lstm_cell_size = 1024
  ep.num_lstm_layers = 4
  ep.conv_filter_shapes = [(3, 3, 1, 32), (3, 3, 32, 32)]
  ep.conv_filter_strides = [(2, 2), (2, 2)]
  ep.cnn_tpl.params_init = py_utils.WeightInit.Gaussian(0.001)
  # Disable conv LSTM layers.
  ep.num_conv_lstm_layers = 0

  # Initialize decoder params.
  dp = p.decoder
  dp.rnn_cell_dim = 1024
  dp.rnn_layers = 2
  dp.source_dim = 2048
  # Use functional while based unrolling.
  dp.use_while_loop_based_unrolling = False

  tp = p.train
  tp.learning_rate = 2.5e-4
  tp.lr_schedule = lr_schedule.ContinuousLearningRateSchedule.Params().Set(
      start_step=50000, half_life_steps=100000, min=0.01)

  # Setting p.eval.samples_per_summary to a large value ensures that dev,
  # devother, test, testother are evaluated completely (since num_samples for
  # each of these sets is less than 5000), while train summaries will be
  # computed on 5000 examples.
  p.eval.samples_per_summary = 5000
  p.eval.decoder_samples_per_summary = 0

  # Use variational weight noise to prevent overfitting.
  p.vn.global_vn = True
  p.train.vn_std = 0.075
  p.train.vn_start_step = 20000

  return p

An example of a task configuration in Lingvo. The hyperparameters for each experiment are configured in their own class, separate from the code that builds the network, and checked into version control.
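The training block in the configuration above sets a base learning rate of 2.5e-4 and a schedule with `start_step=50000`, `half_life_steps=100000`, and `min=0.01`. Reading those names literally, one plausible interpretation is a multiplier that stays at 1.0 until `start_step`, then halves every `half_life_steps`, and never drops below `min`. The sketch below implements that assumed reading in plain Python to give a feel for the numbers. The function `lr_multiplier` is hypothetical, and the exact formula should be checked against the Lingvo source.

```python
def lr_multiplier(step, start_step=50000, half_life_steps=100000, min_value=0.01):
    """Assumed schedule: flat until start_step, then exponential decay with a floor."""
    # Number of steps past the point where decay begins (never negative).
    elapsed = max(step - start_step, 0)
    # Halve once per half_life_steps, but never go below min_value.
    return max(min_value, 0.5 ** (elapsed / half_life_steps))


BASE_LR = 2.5e-4  # tp.learning_rate in the configuration above

for step in (0, 50000, 150000, 250000, 1000000):
    print(step, BASE_LR * lr_multiplier(step))
```

Under this reading the effective learning rate is unchanged for the first 50,000 steps, is half the base value at step 150,000, and eventually settles at 1% of the base value.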

 

While Lingvo started out with a focus on NLP, it is inherently very flexible, and models for tasks such as image segmentation and point cloud classification have been successfully implemented using the framework. Distillation, GANs, and multi-task models are also supported. At the same time, the framework does not compromise on speed, and features an optimized input pipeline and fast distributed training. Finally, Lingvo was put together with an eye towards easy productionization, and there is even a well-defined path towards porting models for mobile inference.

To jump straight into the code, check out our GitHub page and the codelab. To learn more about Lingvo or some of the advanced features it supports, see our paper.

 

Zhihu: https://zhuanlan.zhihu.com/albertwang

 

 


 
