Basic Sentiment Analysis with Julia Using LSTM

杨_明

Original article: https://medium.com/@EmoryRaphael/basic-sentiment-analysis-with-julia-using-lstm-e12d4754ee6b

This post is an introduction to sentiment analysis using Julia, a language designed for high performance whose syntax resembles Python's.

Sentiment analysis has grown within artificial intelligence over the last few years. It has changed how we collect information about users' perception of a product, and it has found uses in areas such as treating patients and detecting diseases. Researchers use many datasets to measure the performance of their models; for this post, we use the IMDB dataset, which contains user reviews.

Some Julia packages provide pre-processed datasets. For this article, we will use the dataset provided by CorpusLoaders. To load it, we just need a simple command:

using CorpusLoaders
dataset_train_pos = load(IMDB("train_pos"))

The argument passed to IMDB can be "train_pos", "train_neg", "test_pos", or "test_neg". Each one returns the corresponding part of the dataset, already labeled.

dataset_test_pos = load(IMDB("test_pos"))
dataset_train_neg = load(IMDB("train_neg"))
dataset_test_neg = load(IMDB("test_neg"))

Let's transform this into a single array of tokens:

julia> using Base.Iterators

julia> docs = collect(take(dataset_train_pos, 2))

This transforms our dataset into an array of arrays (Array{Array{String,1}}); in short, we get a list of tokenized sentences. The tokens still include stopwords, however, so in the next step let's remove them.
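As a minimal sketch of what "a single array of tokens" could look like, the snippet below flattens a small nested array using `flatten` from Julia's `Base.Iterators`. The `docs` literal is a made-up stand-in for the loaded reviews, not real IMDB data, and the nesting actually returned by the loader may differ.

```julia
using Base.Iterators: flatten

# Hypothetical stand-in for the loaded reviews:
# each inner array is one tokenized sentence.
docs = [["this", "movie", "is", "great"],
        ["i", "liked", "the", "cast"]]

# Concatenate every inner array into one flat Vector{String}.
tokens = collect(flatten(docs))

println(length(tokens))  # 8
```

Flattening this way leaves the original nested `docs` untouched, so sentence boundaries are still available later if needed.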

Stopwords

Stopwords are words that add little value to a sentence. Some examples are "is", "like", "as", and many others.
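One way to drop stopwords, sketched here with plain Julia and no extra packages, is to filter each token against a set. The `STOPWORDS` set and the `remove_stopwords` helper are illustrative assumptions, not the list or function used in the original article; a real stopword list is much longer.

```julia
# Tiny illustrative stopword list (an assumption; real lists are much longer).
const STOPWORDS = Set(["is", "like", "as", "the", "a", "i"])

# Hypothetical helper: keep only the tokens that are not in the
# stopword set, comparing case-insensitively.
remove_stopwords(tokens) = filter(t -> !(lowercase(t) in STOPWORDS), tokens)

sentence = ["The", "movie", "is", "great", "as", "expected"]
cleaned = remove_stopwords(sentence)

println(cleaned)  # ["movie", "great", "expected"]
```

Using a `Set` keeps each membership check at roughly constant cost, which matters once the list grows to a few hundred words.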
