Tianchi Competition
宝友你好
Tianchi News Text Classification: Machine Learning Solution
Bag-of-words representation
    from sklearn.feature_extraction.text import CountVectorizer
    corpus = [
        'This is the first document.',
        'This document is the second document.',
        'And this is the third one.',
        'Is this the first document?',
    ]
    vectorizer = CountV…
Reposted · 2021-09-14 08:26:27 · 70 views · 0 comments
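The excerpt above is cut off before the vectorizer is used, so here is a minimal standard-library sketch of what a bag-of-words count matrix looks like for the same four-sentence corpus. The helper names `tokenize` and `bag_of_words` are hypothetical, and the tokenization rule (lowercase, tokens of two or more word characters, vocabulary sorted alphabetically) is an assumption chosen to mimic `CountVectorizer`'s defaults. It is not the original post's code.

```python
import re
from collections import Counter

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]

def tokenize(text):
    # Assumption: lowercase, then keep tokens of two or more word characters.
    return re.findall(r"\b\w\w+\b", text.lower())

def bag_of_words(docs):
    # One Counter per document: token -> number of occurrences.
    counts = [Counter(tokenize(doc)) for doc in docs]
    # The vocabulary is every distinct token, sorted alphabetically.
    vocabulary = sorted(set().union(*counts))
    # Each row holds the count of every vocabulary term in one document.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

vocabulary, matrix = bag_of_words(corpus)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row is one document and each column one vocabulary term. Word order is discarded, which is why the first and fourth sentences end up with identical rows.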
Tianchi News Text Classification: FastText Solution
Baseline
    import pandas as pd
    from sklearn.metrics import f1_score
    import fasttext
    # Convert to the format fastText expects
    train_df = pd.read_csv('./data/train_set.csv', sep='\t', nrows=1500)
    train_df['label_ft'] = '__label__' + train_df['label'].astype(str)
    # Convert the first 10000 rows to use as the training set
    train…
Original · 2021-09-13 20:20:16 · 632 views · 0 comments
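The excerpt stops before the training file is written, so the following is only a sketch of the formatting step it describes: fastText's supervised mode reads one sample per line, with the label carrying a `__label__` prefix ahead of the text. The helpers `to_fasttext_line` and `split_and_format`, the split size, and the sample rows are all made up for illustration and use only the standard library rather than pandas.

```python
def to_fasttext_line(label, text, prefix="__label__"):
    # Collapse any whitespace or newlines so each sample stays on one line.
    cleaned = " ".join(str(text).split())
    return f"{prefix}{label} {cleaned}"

def split_and_format(rows, n_train):
    # rows is an iterable of (label, text) pairs; the first n_train
    # formatted lines become the training set, the rest the validation set.
    lines = [to_fasttext_line(label, text) for label, text in rows]
    return lines[:n_train], lines[n_train:]

# Hypothetical placeholder rows, not taken from the competition data.
rows = [
    (2, "2967 6758 339"),
    (11, "4464 486\n6352"),
    (3, "7346 4068 5074"),
]

train_lines, valid_lines = split_and_format(rows, n_train=2)
print("\n".join(train_lines))
```

Writing `train_lines` to a text file, one entry per line, would give the kind of file that `fasttext.train_supervised` takes as its `input` argument.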