Sentiment Analysis of Stock Market News in 50 Lines of Python

Latest recommended article published on 2025-03-30 17:36:00

Python学研大本营

Tags: python, mathematical modeling, programming languages

Article link: https://blog.csdn.net/weixin_39915649/article/details/131068607

Stock News Sentiment Analysis with Python

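Sentiment analysis is a technique for extracting subjective information from text. Applied to stock news, it can gauge the overall sentiment of the news articles about a particular stock, which can help you make better-informed decisions.

In this tutorial, we will use Python and a few popular libraries to perform sentiment analysis on stock news.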

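Step 1: Collect the Data

The first step is to collect the data. Several APIs provide access to financial news, such as NewsAPI, the Bloomberg API, and the Yahoo Finance API. In this tutorial, we will use NewsAPI.

To use NewsAPI, you need to register for an API key on their website, https://newsapi.org/. It is free and takes about 30 seconds. Once you have an API key, you can use Python's requests library to send requests to the API and retrieve news articles.

Here is an example of using NewsAPI to retrieve news articles about Apple: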

import requests

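api_key = 'YOUR_API_KEY'
url = 'https://newsapi.org/v2/everything'

# Pass the query as params so that requests URL-encodes it
response = requests.get(url, params={'q': 'Apple', 'apiKey': api_key}, timeout=10)
response.raise_for_status()  # fail early on HTTP errors such as an invalid key
data = response.json()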

articles = data['articles']
print(articles)

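This code retrieves news articles about the stock "Apple", stores them in the articles variable, and prints them. The articles variable is a list of dictionaries parsed from the JSON response. You can change the q parameter to search for news about a different stock.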
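Because the q parameter is the only thing that changes between searches, the request URL can be assembled by a small helper. The following is a minimal sketch rather than part of the original tutorial: build_news_url is a hypothetical helper name, the NEWSAPI_KEY environment variable is an assumed convention for keeping the key out of source code, and language, sortBy, and pageSize are optional parameters of the /v2/everything endpoint as I understand the NewsAPI documentation.

```python
import os
from urllib.parse import urlencode

BASE_URL = 'https://newsapi.org/v2/everything'

def build_news_url(query, api_key, language='en', page_size=20):
    """Return a NewsAPI search URL with the query string safely encoded."""
    params = {
        'q': query,
        'language': language,       # assumed optional parameter
        'sortBy': 'publishedAt',    # assumed optional parameter
        'pageSize': page_size,      # assumed optional parameter
        'apiKey': api_key,
    }
    return f'{BASE_URL}?{urlencode(params)}'

# Read the key from the environment (assumed variable name), else use a placeholder.
api_key = os.environ.get('NEWSAPI_KEY', 'YOUR_API_KEY')
print(build_news_url('Apple stock', api_key))
```

Encoding the parameters this way keeps queries that contain spaces or special characters from producing a malformed URL.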

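Step 2: Preprocess the Data

Now that we know how to retrieve news articles, the data needs to be preprocessed before sentiment analysis. This involves converting the text to lowercase and removing irrelevant information from the articles, such as stop words and punctuation.

We will use the nltk library for preprocessing. Before using it, you need to install the library and download the stop-word corpus, along with the tokenizer data that word_tokenize relies on.

This function takes a news article as input and returns a preprocessed version of the text: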

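import string

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download('stopwords')
nltk.download('punkt')      # tokenizer models required by word_tokenize
nltk.download('punkt_tab')  # name used for the same data in newer NLTK releases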

def preprocess_text(text):
    # Convert to lowercase
    text = text.lower()

    # Remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))

    # Tokenize text
    tokens = word_tokenize(text)

    # Remove stop words
    stop_words = set(stopwords.words('english'))
    tokens = [token for token in tokens if token not in stop_words]

    # Rejoin tokens into a string
    text = ' '.join(tokens)

    return text
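To see what each preprocessing step does to a headline without downloading any NLTK data, here is a standard-library-only sketch. It is an illustration, not a replacement for the function above: preprocess_demo and DEMO_STOP_WORDS are hypothetical names, the stop-word list is a tiny hand-picked subset, the headline is invented, and str.split stands in for word_tokenize.

```python
import string

# Tiny illustrative stop-word list (an assumption; NLTK's English list is much longer).
DEMO_STOP_WORDS = {'the', 'a', 'an', 'is', 'are', 'of', 'to', 'and', 'in', 'on'}

def preprocess_demo(text):
    # Convert to lowercase
    text = text.lower()
    # Remove ASCII punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # Split on whitespace (stand-in for word_tokenize)
    tokens = text.split()
    # Remove stop words
    tokens = [token for token in tokens if token not in DEMO_STOP_WORDS]
    return ' '.join(tokens)

headline = "Apple's shares are UP 3% on strong iPhone sales!"
print(preprocess_demo(headline))
# prints: apples shares up 3 strong iphone sales
```

Note that stripping punctuation turns "Apple's" into "apples" and drops the percent sign, which is the kind of information loss to keep in mind when choosing preprocessing steps.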

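Step 3: Perform Sentiment Analysis

Once the data has been preprocessed, we can run sentiment analysis on the news articles. We will use the vaderSentiment library. It was designed for sentiment analysis of social media text and has been shown to perform well on short, informal text such as tweets and news headlines.

You need to install the library before using it: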

!pip install vaderSentiment

This function takes a preprocessed news article as input and returns a sentiment score between -1 (negative sentiment) and 1 (positive sentiment):

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def get_sentiment(text):
    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)
    sentiment = scores['compound']
    return sentiment
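The compound value is bounded between -1 and 1 because it is a normalized sum of per-word valence scores. As far as I can tell from the vaderSentiment source, the normalization is x / sqrt(x^2 + alpha) with alpha = 15. The sketch below reproduces that formula in isolation so the bounded range is easy to see; normalize_compound is an illustrative name, and the raw sums passed in are made-up numbers, not real lexicon output.

```python
import math

def normalize_compound(raw_sum, alpha=15):
    """Squash an unbounded valence sum into the range [-1, 1]."""
    score = raw_sum / math.sqrt(raw_sum * raw_sum + alpha)
    # Clamp to guard against floating-point overshoot
    return max(-1.0, min(1.0, score))

# Made-up raw sums, chosen only to show the shape of the curve
for raw in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(f'{raw:+.1f} -> {normalize_compound(raw):+.3f}')
```

The curve is steep near zero and flattens toward the extremes, so a few strongly worded terms move the score quickly while very long emotional texts saturate near -1 or 1.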

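Step 4: Analyze the Data

Now that we have functions to preprocess the text of news articles and score its sentiment, we can combine them to retrieve articles, preprocess them, and run sentiment analysis: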

def get_stock_news_sentiment(api_key, stock):
    # Make request to NewsAPI
    url = f'https://newsapi.org/v2/everything?q={stock}&apiKey={api_key}'
    response = requests.get(url)
    data = response.json()
    sentiments = []
    for article in data['articles']:
        # Preprocess text
        # title or description can be missing or None in the API response
        text = (article.get('title') or '') + ' ' + (article.get('description') or '')
        text = preprocess_text(text)
        
        # Perform sentiment analysis
        sent = get_sentiment(text)
        sentiments.append(sent)
    # Calculate average sentiment
    if len(sentiments) > 0:
        avg_sentiment = sum(sentiments) / len(sentiments)
    else:
        avg_sentiment = 0

    return avg_sentiment
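The function above works when the request succeeds and every article carries usable text. As a hedged sketch of how the aggregation could be made more defensive and exercised without network access, the following splits it into two hypothetical helpers, extract_articles and average_sentiment. It assumes that NewsAPI marks failed requests with a status other than "ok" plus a message field, and that title or description may be null. The score_fn argument stands in for any scoring function, for example one that applies preprocess_text and then get_sentiment.

```python
def extract_articles(data):
    """Return the article list from a parsed NewsAPI response, or raise on error."""
    if data.get('status') != 'ok':
        raise RuntimeError(f"NewsAPI error: {data.get('message', 'unknown error')}")
    return data.get('articles', [])

def average_sentiment(articles, score_fn):
    """Average score_fn over the title and description of each article."""
    scores = []
    for article in articles:
        # Either field may be missing or None, so fall back to an empty string
        parts = [article.get('title') or '', article.get('description') or '']
        text = ' '.join(part for part in parts if part)
        if text:  # skip articles with no usable text
            scores.append(score_fn(text))
    return sum(scores) / len(scores) if scores else 0.0

# Toy scorer used only to exercise the plumbing; it is not a sentiment model.
def toy_score(text):
    return 1.0 if 'beats' in text else -1.0

# Invented sample response for illustration
sample = {
    'status': 'ok',
    'articles': [
        {'title': 'Apple beats estimates', 'description': None},
        {'title': 'Supplier warns of delays', 'description': 'Shipments slip'},
        {'title': None, 'description': None},
    ],
}
print(average_sentiment(extract_articles(sample), toy_score))
# prints: 0.0
```

Passing the scorer in as an argument keeps the averaging logic testable on its own, independent of NLTK and VADER.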

This code retrieves news articles about Apple and Google, preprocesses them, performs sentiment analysis, and prints the average sentiment scores:

api_key = 'YOUR_API_KEY'

apple_sentiment = get_stock_news_sentiment(api_key, 'AAPL')
google_sentiment = get_stock_news_sentiment(api_key, 'GOOGL')

print(f'Average sentiment for Apple: {apple_sentiment:.2f}')
print(f'Average sentiment for Google: {google_sentiment:.2f}')

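Example output:

Average sentiment for Apple: 0.23
Average sentiment for Google: 0.19

Note that sentiment scores range from -1 to 1: negative scores indicate negative sentiment, positive scores indicate positive sentiment, and scores close to 0 indicate neutral sentiment.

Here is the complete code: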
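Turning a score into a label requires a cutoff for what counts as "close to 0". The sketch below uses plus or minus 0.05, the thresholds commonly suggested in the VADER documentation for the compound score. label_sentiment is a hypothetical helper, and the cutoff is an assumption you may want to tune for financial headlines.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'

# 0.23 and 0.19 are the example averages above; the other two are made up.
for score in (0.23, 0.19, 0.01, -0.40):
    print(f'{score:+.2f} -> {label_sentiment(score)}')
```

Under these thresholds both example averages, 0.23 and 0.19, would be labeled positive.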

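import string

import requests
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download('stopwords')
nltk.download('punkt')      # tokenizer models required by word_tokenize
nltk.download('punkt_tab')  # name used for the same data in newer NLTK releases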

def preprocess_text(text):
    # Convert to lowercase
    text = text.lower()

    # Remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))

    # Tokenize text
    tokens = word_tokenize(text)

    # Remove stop words
    stop_words = set(stopwords.words('english'))
    tokens = [token for token in tokens if token not in stop_words]

    # Rejoin tokens into a string
    text = ' '.join(tokens)

    return text

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def get_sentiment(text):
    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)
    sentiment = scores['compound']
    return sentiment

def get_stock_news_sentiment(api_key, stock):
    # Make request to NewsAPI
    url = f'https://newsapi.org/v2/everything?q={stock}&apiKey={api_key}'
    response = requests.get(url)
    data = response.json()
    sentiments = []
    for article in data['articles']:
        # Preprocess text
        # title or description can be missing or None in the API response
        text = (article.get('title') or '') + ' ' + (article.get('description') or '')
        text = preprocess_text(text)
        
        # Perform sentiment analysis
        sent = get_sentiment(text)
        sentiments.append(sent)
    # Calculate average sentiment
    if len(sentiments) > 0:
        avg_sentiment = sum(sentiments) / len(sentiments)
    else:
        avg_sentiment = 0

    return avg_sentiment

api_key = 'YOUR_API_KEY'

apple_sentiment = get_stock_news_sentiment(api_key, 'AAPL')
google_sentiment = get_stock_news_sentiment(api_key, 'GOOGL')

print(f'Average sentiment for Apple: {apple_sentiment:.2f}')
print(f'Average sentiment for Google: {google_sentiment:.2f}')