答:
之前的部分代码如下,可以输出 stopwords:

```python
import csv
import string

import nltk
from nltk.stem.snowball import SnowballStemmer
from sklearn import preprocessing
from sklearn.dummy import DummyClassifier

from Chapter01.tokenization import tokenize_nltk
stemmer = SnowballStemmer('english')
bbc_dataset = "Chapter04/bbc-text.csv"
stopwords_file_path = "Chapter01/stopwords.csv"
stopwords = []
def read_in_csv(csv_file):
with open(csv_file, 'r', encoding='utf-8') as fp:
reader = csv.reader(fp, delimiter=',', quotechar='"')
data_read = [row for row in reader]
return data_read
def tokenize_and_stem(sentence):
tokens = nltk.word_tokenize(sentence)
filtered_tokens = [t for t in tokens if t not in string.punctuation]
stems = [stemmer.stem(t) for t in filtered_tokens]
return stems
def get_stopwords(path=stopwords_file_path):
stopwords = read_in_csv(path)
stopwords = [word[0] for word in stopwords]
stemmed_stopwords = [stemmer.stem(word) for word in stopwords]
stopwords = stopwords + stemmed_stopwords
return stopwords
stopwords = get_stopwords(stopwords_file_path)
def get_data(filename):
data = read_in_csv(filename)
data_dict = {}
for row in data[1:]:
category = row[0]
text = row[1]
if (category not in data_dict.keys()):
data_dict[category] = []
data_dict[category].append(text)
return data_dict
```