Transformer Models… How Did It All Start?

This article traces the origins and early development of the Transformer model in NLP: from the first neural networks applied to language, through sequence-to-sequence models and recurrent architectures (LSTM/GRU), to the limitations, such as vanishing gradients, that paved the way for Transformers.

Transformer models have revolutionised the field of Natural Language Processing, but how did it all start? To understand current state-of-the-art architectures and genuinely appreciate why these models became a breakthrough in the field, we must go further back in time, to where NLP as we know it began: when neural networks were first introduced to NLP.

The introduction of neural models to NLP opened up ways to overcome challenges that traditional methods couldn't solve. One of the most remarkable advances was the Sequence-to-Sequence (Seq2Seq) model: such models generate an output sequence by predicting one word at a time. Seq2Seq models encode the source text to reduce ambiguity and achieve context-awareness.

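To make this concrete, here is a minimal Seq2Seq sketch in PyTorch (my own illustration, not code from the article): a GRU encoder compresses the source sentence into a hidden state, and a GRU decoder predicts one word at a time, feeding each prediction back in. The vocabulary size, dimensions, and `<sos>` token index are illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128  # illustrative sizes, not from the article

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, src):                  # src: (batch, src_len)
        _, hidden = self.rnn(self.emb(src))  # hidden summarises the source text
        return hidden

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, token, hidden):        # token: (batch, 1)
        output, hidden = self.rnn(self.emb(token), hidden)
        return self.out(output), hidden      # scores over the vocabulary

# Greedy decoding: predict one word at a time, feeding each prediction back in.
encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, VOCAB, (1, 7))        # a dummy source sentence
hidden = encoder(src)                        # context from the source text
token = torch.zeros(1, 1, dtype=torch.long)  # assume index 0 is <sos>
for _ in range(5):
    scores, hidden = decoder(token, hidden)
    token = scores.argmax(-1)                # next predicted word
```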

In any language task, context plays an essential role. To understand what words mean, we have to know something about the situation in which they are used. Seq2Seq models capture context at the token level: the previous words/sentences are used to generate the next ones. Representing context as embeddings in a continuous space brought multiple advantages, such as reducing data sparsity, because data with similar contexts is mapped close together, and providing a way to generate synthetic data.

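As a toy illustration of that last point (my own example, not the author's), here is how closeness in an embedding space can be measured with cosine similarity. The vectors are hand-picked stand-ins for what a trained model would learn for words that occur in similar contexts.

```python
import torch
import torch.nn.functional as F

# Hand-picked stand-ins for learned embeddings: "cat" and "dog" are assumed to
# appear in similar contexts, "stock" in very different ones.
emb = {
    "cat":   torch.tensor([0.90, 0.80, 0.10]),
    "dog":   torch.tensor([0.85, 0.75, 0.20]),
    "stock": torch.tensor([0.10, 0.20, 0.95]),
}

def sim(a, b):
    # Cosine similarity: close to 1.0 for words used in similar contexts.
    return F.cosine_similarity(emb[a], emb[b], dim=0).item()

print(sim("cat", "dog"))    # high  -> mapped close together in the space
print(sim("cat", "stock"))  # low   -> mapped far apart
```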

However, context in language is very sophisticated. Most of the time, you can't capture context by focusing only on the previous sentence; long-range dependencies are needed to achieve context awareness. Seq2Seq models rely on Recurrent Neural Networks, typically LSTMs or GRUs. These networks have memory mechanisms that regulate the flow of information while processing a sequence, giving them a form of "long-term memory." Despite this, if a sequence is long enough, they'll have a hard time carrying information from earlier time steps to later ones.

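A minimal PyTorch sketch of that memory mechanism (the dimensions are illustrative assumptions): an LSTM carries a hidden state and a cell state across time steps, and its gates decide what to keep, what to discard, and what to expose at each step.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)

seq = torch.randn(1, 100, 32)          # one sequence of 100 time steps
outputs, (h_n, c_n) = lstm(seq)        # states are carried from step to step

print(outputs.shape)  # torch.Size([1, 100, 64]) -> one output per time step
print(h_n.shape)      # torch.Size([1, 1, 64])   -> final hidden state
print(c_n.shape)      # torch.Size([1, 1, 64])   -> final cell ("memory") state
```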

RNNs fall short when trying to process entire paragraphs of text: they suffer from the vanishing gradient problem. Gradients are the values used to update the weights of a neural network and thus to learn. The vanishing gradient problem occurs when the gradient shrinks as it backpropagates through time; once it becomes extremely small, the earliest time steps contribute almost nothing to the weight updates, so the network effectively stops learning from them.
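
A tiny numerical illustration of the effect (my own sketch, not the author's): unrolling a scalar tanh "RNN" over 50 time steps and backpropagating shows the gradient with respect to the earliest input collapsing towards zero.

```python
import torch

w = torch.tensor(0.5, requires_grad=True)   # an illustrative recurrent weight
x0 = torch.tensor(1.0, requires_grad=True)  # the "earliest" input

h = x0
for _ in range(50):          # unroll 50 time steps of h_t = tanh(w * h_{t-1})
    h = torch.tanh(w * h)

h.backward()
print(x0.grad)  # ~0: the early input barely influences the final state,
                # so long-range dependencies are effectively unlearnable.
```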
