Top NLP Libraries of 2020

Natural Language Processing

Natural Language Processing has been one of the most researched fields in deep learning in 2020, mostly due to its rising popularity, future potential, and support for a wide variety of applications.

If you have played around with deep learning before, you probably know conventional deep learning frameworks such as TensorFlow, Keras, and PyTorch. Assuming you know these basic frameworks, this tutorial briefly introduces other useful NLP libraries that you can learn and use in 2020. Depending on what you want to do, you may take away the names of a few tools that interest you or that you didn't know existed!

General Frameworks

AllenNLP

  • Popularity: ⭐⭐⭐⭐
  • Official Website: https://allennlp.org/
  • GitHub: https://github.com/allenai/allennlp
  • Explanation: AllenNLP is a general framework for deep learning in NLP, built by the Allen Institute for AI. It contains state-of-the-art reference models that you can start using quickly, and it supports a wide variety of tasks and datasets. It also includes a lot of cool demos that you can check out to decide whether you want to learn and use this framework!

Fairseq

Fast.ai

  • Popularity: ⭐⭐⭐⭐
  • Official Website: http://docs.fast.ai/
  • GitHub: https://github.com/fastai/fastai
  • Explanation: Fast.ai is built to make deep learning accessible to people without technical backgrounds through its free online courses and its easy-to-use software library. In fact, its co-founder Jeremy Howard just published (Aug. 2020) a completely new book called Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD, whose title is pretty self-explanatory. The Fast.ai library has a dedicated Text section for anything related to NLP. It offers very high-level abstractions and easy implementations for NLP data preprocessing, model construction, training, and evaluation. I really recommend Fast.ai to anyone who prefers practice over theory and wants to solve a problem fast.

Preprocessing

spaCy

  • Popularity: ⭐⭐⭐⭐⭐
  • Official Website: https://spacy.io/
  • GitHub: https://github.com/explosion/spaCy
  • Explanation: spaCy is one of the most popular and most convenient text preprocessing libraries you will find. It contains lots of easy-to-use functions for tokenization, part-of-speech tagging, named entity recognition, and much more. It also supports 59+ languages and ships several pretrained word vectors that can get you started fast!

NLTK

  • Popularity: ⭐⭐⭐⭐⭐
  • Official Website: https://www.nltk.org/
  • GitHub: https://github.com/nltk/nltk
  • Explanation: Like spaCy, NLTK is another popular preprocessing library for modern NLP. Its functions range from tokenization, stemming, and tagging to parsing and semantic reasoning. Personally, NLTK is my preprocessing library of choice because it is so easy to use. It just gets the job done, and fast.
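To make the tokenization and stemming steps mentioned above concrete, here is a minimal plain-Python sketch. It deliberately does not use the NLTK or spaCy APIs: the helper names `tokenize` and `crude_stem` and the suffix list are assumptions made up for illustration, and the suffix stripping is far cruder than the stemmers a real library provides.

```python
import re

# Suffixes tried in order; a hypothetical, deliberately tiny list.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and return runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


tokens = tokenize("The parsers tagged and parsed the training sentences.")
stems = [crude_stem(token) for token in tokens]
print(tokens)
print(stems)
```

A library stemmer handles many cases this sketch gets wrong (for example, doubled consonants), which is a large part of why a preprocessing library is worth using.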

TorchText

  • Popularity: ⭐⭐⭐⭐
  • Official Website: https://torchtext.readthedocs.io/en/latest/
  • GitHub: https://github.com/pytorch/text
  • Explanation: TorchText is officially supported by PyTorch and has grown popular as a result. It contains convenient data processing utilities that process your data and prepare it in batches before you feed it into your deep learning framework. I use TorchText quite a lot to load my train, validation, and test datasets, tokenize them, build the vocabulary, and create iterators that dataloaders can use later on. It is a handy tool that handles all the heavy lifting for you in a few simple lines. You can also easily use pretrained word embeddings, like Word2Vec or FastText, with your datasets. You can see how I use TorchText by looking at my BERT Text Classification Using Pytorch article.
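The pipeline described above (tokenize, build a vocabulary, then iterate over padded batches) can be sketched in plain Python. This is only an illustration of the idea, not TorchText's API: the names `build_vocab` and `batches`, the `<pad>` and `<unk>` tokens, and the whitespace tokenizer are all assumptions chosen for the example.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def tokenize(text):
    """Whitespace tokenizer; a real pipeline would use something smarter."""
    return text.lower().split()


def build_vocab(texts, min_freq=1):
    """Map every token seen at least min_freq times to an integer id."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    itos = [PAD, UNK] + sorted(tok for tok, n in counts.items() if n >= min_freq)
    return {tok: idx for idx, tok in enumerate(itos)}


def batches(texts, vocab, batch_size):
    """Yield lists of token ids, padded to the longest example in each batch."""
    for start in range(0, len(texts), batch_size):
        chunk = [tokenize(text) for text in texts[start:start + batch_size]]
        width = max(len(toks) for toks in chunk)
        yield [
            [vocab.get(tok, vocab[UNK]) for tok in toks]
            + [vocab[PAD]] * (width - len(toks))
            for toks in chunk
        ]


train_texts = ["the movie was great", "terrible plot", "great acting and great plot"]
vocab = build_vocab(train_texts)
for batch in batches(train_texts, vocab, batch_size=2):
    print(batch)
```

Each yielded batch is a rectangular list of integers, which is the shape a deep learning framework expects before converting it to a tensor.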

Transformers

Hugging Face

  • Popularity: ⭐⭐⭐⭐⭐
  • Official Website: https://huggingface.co/
  • GitHub: https://github.com/huggingface/transformers
  • Explanation: This is the most popular library for transformers, implementing a wide variety of them, from BERT and GPT-2 to BART and Reformer. I use it on a daily basis, and in my experience the code is readable and the documentation is crystal clear. In the official GitHub repo, the Python scripts are even organized by task, such as language modeling, text generation, question answering, multiple choice, etc. There are built-in scripts for running the baseline transformers for each of these tasks, so they are really convenient to use!

Specific Tasks

Gensim

  • Popularity: ⭐⭐⭐
  • Official Website: https://radimrehurek.com/gensim/
  • GitHub: https://github.com/RaRe-Technologies/gensim
  • Task: Topic Modeling, Text Summarization, Semantic Similarity
  • Explanation: Gensim is high-end, industry-level software for topic modeling of a specific piece of text. It is very robust, platform-independent, and scalable. I used it during my internship at an AI startup, where we wanted to judge the semantic similarity between two newspaper articles. There's a really simple function call that does just that and returns their similarity score, so it's extremely handy!
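As a rough illustration of what a similarity score between two articles means, the sketch below computes the cosine similarity of two bag-of-words count vectors in plain Python. It is not Gensim's API and not necessarily the method Gensim uses internally: the function names and the two sample sentences are assumptions invented for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two made-up snippets standing in for newspaper articles.
article_a = "The central bank raised interest rates to fight inflation."
article_b = "Inflation fears pushed the central bank to raise rates again."
score = cosine_similarity(bag_of_words(article_a), bag_of_words(article_b))
print(round(score, 3))
```

Raw word counts only capture surface overlap, so "raised" and "raise" do not match here. Libraries such as Gensim offer richer representations (for example TF-IDF weighting or topic vectors) on top of the same basic idea.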

OpenNMT

  • Popularity: ⭐⭐⭐
  • Official Website: https://opennmt.net/
  • GitHub: https://github.com/OpenNMT/OpenNMT-py
  • Task: Machine Translation
  • Explanation: OpenNMT is a convenient and powerful tool for machine translation and sequence learning tasks. It contains highly configurable models and training procedures that make it a very simple framework to use. I have coworkers who recommend OpenNMT for different kinds of sequence learning tasks because it's open-source and simple.

ParlAI

  • Popularity: ⭐⭐⭐
  • Official Website: https://parl.ai/
  • GitHub: https://github.com/facebookresearch/ParlAI
  • Task: Task-Oriented Dialogue, Chit-chat Dialogue, Visual Question Answering
  • Explanation: ParlAI is Facebook's #1 framework for sharing, training, and testing dialogue models for different kinds of dialogue tasks. It provides an all-in-one environment that supports a wide variety of reference models, pretrained models, datasets, etc. Unlike most of the other tools on this list, ParlAI requires some coding and machine learning expertise if you want to customize things on your own. In other words, it's a bit more complicated to use, but it is nevertheless a great tool if you're into dialogue.

DeepPavlov

  • Popularity: ⭐⭐⭐
  • Official Website: http://deeppavlov.ai/
  • GitHub: https://github.com/deepmipt/DeepPavlov
  • Task: Task-Oriented Dialogue, Chit-chat Dialogue
  • Explanation: DeepPavlov is an alternative to ParlAI. I would say it is aimed more at application and deployment than at research, although you can definitely still do quite a lot of customization with it. I would argue that DeepPavlov is to ParlAI what TensorFlow is to PyTorch. DeepPavlov is a framework mainly for chatbot and virtual assistant development, as it provides all the environment tools necessary for a production-ready, industry-grade conversational agent. I used it once during a hackathon to fine-tune a conversational agent for the restaurant domain (so that users can check the menu and order the food they want), and the end result worked like a charm!

Translated from: https://towardsdatascience.com/top-nlp-libraries-to-use-2020-4f700cdb841f
