Predicting financial market movements from daily news
TF-IDF + SVM baseline
Dataset download: https://www.kaggle.com/aaron7sun/stocknews#Combined_News_DJIA.csv
The dataset description explains the three downloaded files:
Combined_News_DJIA.csv is the already-merged file, DJIA holds the daily market data, and RedditNews holds the daily top-25 news headlines.
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
data = pd.read_csv('input/Combined_News_DJIA.csv')
Since we want to use all 25 headlines, we concatenate the Top1-Top25 columns into a new column of data called combined_news:
data['combined_news'] = data.filter(regex = ("Top.*")).apply(lambda x:''.join(str(x.values)),axis = 1)
data.filter(regex = ("Top.*")) selects the Top1-Top25 columns; apply then joins each row into one long string, so data now has 28 columns.
Using .apply(lambda) on a DataFrame: https://blog.csdn.net/m0_37712157/article/details/84331493
Split the data into training and test sets, as the task specifies:
train = data[data['Date'] < '2015-01-01']
test = data[data['Date'] > '2014-12-31']
Vectorize the data with TF-IDF:
###TF-IDF
#TF-IDF scores how important a word is to a document; it tends to down-weight very common words and keep the informative ones
feature_extraction = TfidfVectorizer()
X_train = feature_extraction.fit_transform(train['combined_news'].values)
#fit_transform is fit followed by transform: learn the vocabulary and idf weights from the data, then transform it
#fit_transform returns a sparse document-term matrix with one column per vocabulary word
X_test = feature_extraction.transform(test['combined_news'].values)
#on the test set we only transform with the already-fitted vectorizer, no fitting
#transform can only be called after fit, otherwise it raises an error
y_train = train['Label'].values
y_test = test['Label'].values
#.values is not strictly necessary here; it just keeps everything as plain arrays
Here TF-IDF is applied to the combined_news column of both train and test; the difference is that train uses fit_transform while test only uses transform.
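What TF-IDF computes can be illustrated by hand on a toy corpus (a simplified sketch; the actual TfidfVectorizer uses a smoothed idf and L2-normalizes each row, so its numbers differ):

```python
import math

# toy corpus: "market" appears in every document, "crash" in only one
docs = [
    ["stock", "market", "rises"],
    ["market", "crash", "fears"],
    ["market", "rallies"],
]

def tf_idf(word, doc, docs):
    # term frequency within this document
    tf = doc.count(word) / len(doc)
    # inverse document frequency over the corpus
    df = sum(1 for d in docs if word in d)
    idf = math.log(len(docs) / df)
    return tf * idf

# "market" occurs everywhere -> idf = log(3/3) = 0, so its weight is 0
print(tf_idf("market", docs[1], docs))  # 0.0
# "crash" occurs in one document -> gets a positive weight
print(tf_idf("crash", docs[1], docs))
```

This is why common filler words contribute almost nothing to the feature vectors.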
Next, train an SVM on the processed data.
Using sklearn's SVM: https://blog.csdn.net/qq_41577045/article/details/79859902
###SVC
clf = SVC(probability = True,kernel = 'rbf')
clf.fit(X_train,y_train)
Predict on the test set with the trained model, and evaluate with AUC as the task specifies:
predictions = clf.predict_proba(X_test)#returns the probabilities of class 0 and class 1
print('ROC-AUC yields: ' + str(roc_auc_score(y_test, predictions[:,1])))
ROC-AUC yields: 0.5731406810035843
The score is only 0.573, which is quite low.
Note how roc_auc_score is used:
from sklearn.metrics import roc_auc_score
score = roc_auc_score(y_test, predictions[:,1])
For background on ROC and AUC, see:
https://blog.csdn.net/shenxiaoming77/article/details/72627882
Usage of the various evaluation metrics:
https://blog.csdn.net/qq_16095417/article/details/79590455
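roc_auc_score depends only on how the predicted scores rank the examples, not on any threshold. A hand-rolled version of the probabilistic definition (the chance that a randomly chosen positive is scored above a randomly chosen negative, ties counting half) makes that concrete:

```python
def auc(y_true, y_score):
    # probability that a random positive example is ranked above
    # a random negative example (ties count as half a win)
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 therefore means the scores rank positives no better than chance, which is why 0.573 is considered weak.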
Attempt: preprocess the text before feeding it into TF-IDF
Dropping the raw text straight into TF-IDF is simple and convenient, but not very rigorous. We can clean the original text further first.
Text preprocessing
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
data = pd.read_csv('input/Combined_News_DJIA.csv')
data['combined_news'] = data.filter(regex = ("Top.*")).apply(lambda x:''.join(str(x.values)),axis = 1)
train = data[data['Date'] < '2015-01-01']
test = data[data['Date'] > '2014-12-31']
###text cleaning
#lowercase with .lower() and strip the quote characters
X_train = train['combined_news'].str.lower().str.replace('"','').str.replace("'",'').str.split()
X_test = test['combined_news'].str.lower().str.replace('"','').str.replace("'",'').str.split()
#print(X_test[1611])
print(len(X_test[1611]))
#remove stopwords
from nltk.corpus import stopwords
stop = stopwords.words('english')
#drop words containing digits: returns True if the input string contains a digit, else False
#hasNumbers('I have 3 apples') returns True
#hasNumbers('she is beautiful') returns False
import re
def hasNumbers(inputString):
    return bool(re.search(r'\d',inputString))
#lemmatization: normalize plurals, tenses and other inflected forms
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
def check(word):
    #return True if the word should be kept
    #return False if it should be removed
    #we remove stopwords and words containing digits
    if word in stop:
        return False
    elif hasNumbers(word):
        return False
    else:
        return True
X_train = X_train.apply(lambda x: [wordnet_lemmatizer.lemmatize(item) for item in x if check(item)])
X_test = X_test.apply(lambda x: [wordnet_lemmatizer.lemmatize(item) for item in x if check(item)])
#print(X_test[1611])
print(len(X_test[1611]))
After this cleanup, X_test[1611] drops from 507 words to 329.
The entries of X_train and X_test are now lists of words, but sklearn's vectorizers only accept strings, so we join the lists back into strings:
#external libraries such as sklearn expect string input, so turn the cleaned lists back into strings
X_train = X_train.apply(lambda x:' '.join(x))
X_test = X_test.apply(lambda x:' '.join(x))
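The whole cleanup above (lowercase, strip quotes, drop stopwords and numeric tokens, rejoin) can be sketched without nltk; the tiny stopword set below is a hypothetical stand-in for nltk's English list, and lemmatization is omitted since it needs the WordNet data:

```python
import re

stop = {"the", "a", "is", "in", "of"}  # stand-in for nltk's English stopword list

def has_numbers(s):
    # True if the string contains any digit
    return bool(re.search(r'\d', s))

def clean(text):
    # lowercase, strip quote characters, split into words
    words = text.lower().replace('"', '').replace("'", '').split()
    # drop stopwords and tokens containing digits, then rejoin
    kept = [w for w in words if w not in stop and not has_numbers(w)]
    return ' '.join(kept)

print(clean('The Dow fell 300 points in a "volatile" session'))
# dow fell points volatile session
```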
Retrain:
###retrain the model
#feed the cleaned text into TF-IDF
feature_extraction = TfidfVectorizer(lowercase = False)
X_train = feature_extraction.fit_transform(X_train.values)
X_test = feature_extraction.transform(X_test.values)
#train
y_train = train['Label']
y_test = test['Label']
clf = SVC(probability = True,kernel = 'rbf')
clf.fit(X_train,y_train)
predictions = clf.predict_proba(X_test)
print('ROC-AUC yields: ' + str(roc_auc_score(y_test,predictions[:,1])))
ROC-AUC yields: 0.465809811827957
It actually got worse; our cleanup backfired.
There are several possible reasons:
• Too few data points: for natural language processing, a few thousand rows is a very small dataset.
With a large dataset, a standard text-preprocessing pipeline is still needed and does improve accuracy.
• One-off result
So far we have only run each model once. If we ran cross-validation on this data, as in earlier examples, we might well find that the classifier with the higher score is actually overfitted.
So always validate your classifier with CV.
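The bookkeeping behind cross-validation is simple: split the indices into k folds, train on k-1 of them, score on the held-out one, and average. A minimal sketch of the fold splitting (contiguous folds, no shuffling or stratification):

```python
def kfold_indices(n, k):
    # split range(n) into k contiguous (train, validation) index pairs;
    # the last fold absorbs the remainder when n is not divisible by k
    fold = n // k
    for i in range(k):
        val = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        train = [j for j in range(n) if j not in val]
        yield train, val

for train_idx, val_idx in kfold_indices(10, 3):
    print(len(train_idx), len(val_idx))
# every example lands in exactly one validation fold
```

sklearn's cross_val_score does this splitting, fits a fresh clone of the estimator on each training fold, and returns the per-fold scores.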
for c in [0.5,1,1.5]:
    clf = SVC(probability = True,kernel = 'rbf',C = c)
    clf.fit(X_train,y_train)
    predictions = clf.predict_proba(X_test)
    print('ROC-AUC yields: ' + str(roc_auc_score(y_test,predictions[:,1])))
Varying the parameter C over these three values gives exactly the same score, which is a sign of overfitting.
Improved version: word2vec
word2vec:
word2vec is a common tool for training word vectors. A typical workflow is to pretrain word vectors with word2vec on a public corpus, load them into your own model, and then fine-tune them to fit your dataset.
To obtain word vectors we can train a language model and take its parameters as the vectors. Formally we are training a language model, but the real goal is the word vectors, and what we actually care about is whether those vectors are reasonable.
Word2vec trains word vectors from co-occurrence within a context window. It has two training modes, Skip-Gram and CBOW (continuous bag of words): Skip-Gram predicts the context from the target word, while CBOW predicts the target word from its context. Part of the trained model's parameters are then used as the word vectors.
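The context-window idea is easiest to see in the training pairs Skip-Gram consumes: every word is paired with its neighbours within the window. A minimal sketch of the pair generation (gensim then learns vectors so that targets predict their contexts):

```python
def skipgram_pairs(sentence, window=2):
    # for each target word, pair it with every word within `window` positions
    pairs = []
    for i, target in enumerate(sentence):
        for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs

print(skipgram_pairs(["stocks", "fall", "on", "news"], window=1))
# [('stocks', 'fall'), ('fall', 'stocks'), ('fall', 'on'),
#  ('on', 'fall'), ('on', 'news'), ('news', 'on')]
```

CBOW uses the same window but groups the pairs the other way round: the context words jointly predict the target.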
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
data = pd.read_csv('input/Combined_News_DJIA.csv')
data['combined_news'] = data.filter(regex = ("Top.*")).apply(lambda x:''.join(str(x.values)),axis = 1)
train = data[data['Date'] < '2015-01-01']
test = data[data['Date'] > '2014-12-31']
X_train = train[train.columns[2:]]#take every column from the third on, i.e. the news content
corpus = X_train.values.flatten().astype(str)
X_train = X_train.values.astype(str)
X_train = np.array([' '.join(x) for x in X_train])
X_test = test[test.columns[2:]]
X_test = X_test.values.astype(str)
X_test = np.array([' '.join(x) for x in X_test])
#this time each entry of X_train/X_test is one day's 25 headlines joined with spaces
#unlike the earlier combined_news column, which stringified each row's values in one go
y_train = train['Label'].values
y_test = test['Label'].values
Next, tokenize corpus, X_train and X_test, then preprocess the tokens:
#token
from nltk.tokenize import word_tokenize
corpus = [word_tokenize(x) for x in corpus]
X_train = [word_tokenize(x) for x in X_train]
X_test = [word_tokenize(x) for x in X_test]
#preprocessing: clean up the tokens
#stopwords
from nltk.corpus import stopwords
stop = stopwords.words('english')
# digits
import re
def hasNumbers(inputString):
    return bool(re.search(r'\d', inputString))
# special symbols
def isSymbol(inputString):
    return bool(re.match(r'[^\w]', inputString))
# lemma
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
def check(word):
    """
    Return True if the word should be kept,
    False if it should be removed.
    """
    word = word.lower()
    if word in stop:
        return False
    elif hasNumbers(word) or isSymbol(word):
        return False
    else:
        return True
# combine the steps above
def preprocessing(sen):
    res = []
    for word in sen:
        if check(word):
            # this only strips the b'/b" markers left over from storing bytes as str; the raw data wasn't cleaned upstream, other datasets won't have this
            word = word.lower().replace("b'", '').replace('b"', '').replace('"', '').replace("'", '')
            res.append(wordnet_lemmatizer.lemmatize(word))
    return res
corpus = [preprocessing(x) for x in corpus]
X_train = [preprocessing(x) for x in X_train]
X_test = [preprocessing(x) for x in X_test]
Build the model with Word2Vec:
#NLP model: Word2Vec
from gensim.models.word2vec import Word2Vec
model = Word2Vec(corpus, size=128, window=5, min_count=5, workers=4)
#corpus serves as the training corpus
#size is the dimensionality of the word vectors: each word becomes a 128-dim vector
The trained model behaves like a dictionary: you can look up each word's vector:
model['ok']
__main__:1: DeprecationWarning: Call to deprecated `__getitem__` (Method will be removed in 4.0.0, use self.wv.__getitem__() instead).
Out[39]:
array([-0.01584208, -0.02958466, 0.5098639 , -0.1369552 , -0.4654954 ,
       ...
       -0.10940692, -0.26989475, -0.06113369], dtype=float32)
Note that only words appearing in the training corpus can be looked up; out-of-vocabulary words cannot be queried.
Some words in X_test never occur in the training corpus; such words have no vector, so we represent them with a 128-dim zero vector.
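The averaging-with-zero-for-OOV scheme can be sketched with a toy embedding dictionary standing in for the trained model (hypothetical 2-dim vectors instead of 128, for readability):

```python
import numpy as np

# toy stand-in for the trained model's word -> vector lookup
embeddings = {
    "market": np.array([1.0, 0.0]),
    "crash":  np.array([0.0, 2.0]),
}

def text_vector(words, emb, dim=2):
    # average the vectors of in-vocabulary words; unknown words contribute nothing
    res = np.zeros(dim)
    count = 0
    for w in words:
        if w in emb:
            res += emb[w]
            count += 1
    return res / count if count else res  # an all-OOV text maps to the zero vector

print(text_vector(["market", "crash", "unknownword"], embeddings))  # [0.5 1. ]
```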
Take all the word vectors of one text and average them (a crude approach; more refined text-vector methods, such as a CNN, are also possible):
#first grab the full vocabulary
vocab = model.wv.vocab
#compute the vector of an arbitrary text
def get_vector(word_list):#averages the vectors of all in-vocabulary words in a word list
    res = np.zeros([128])
    count = 0
    for word in word_list:
        if word in vocab:
            res += model[word]
            count += 1
    return res/count if count else res#guard against texts with no in-vocabulary words
print(get_vector(['hello','from','the','other','side']))
This gives the vector of one preprocessed text:
[ 3.81215563e-02 1.60741026e-02 1.51827719e-01 -1.51812445e-01
  ...
 -2.02882424e-02 -7.22090930e-02 -8.35210342e-02 -1.24551871e-01]
Represent all of X_train and X_test this way:
###turn every text into a vector
wordlist_train = X_train
wordlist_test = X_test
X_train = [get_vector(x) for x in X_train]
X_test = [get_vector(x) for x in X_test]
print(X_train[10])
Train an SVR model with cross-validation:
###build the SVM model
#the 128 dims are continuous values, which suits a continuous (regression-style) model
from sklearn.svm import SVR#SVC: SVM Classification, SVR: SVM Regression
from sklearn.model_selection import cross_val_score
params = [0.1,0.5,1,3,5,7,10,12,16,20,25,30,35,40,50,60]
test_scores = []
for param in params:
    clf = SVR(gamma = param)
    test_score = cross_val_score(clf,X_train,y_train,cv = 3,scoring = 'roc_auc')
    test_scores.append(np.mean(test_score))
import matplotlib.pyplot as plt
plt.plot(params,test_scores)
plt.title('Param vs CV AUC Score')
The best scores fall with gamma between 20 and 30, so we cross-validate again within that interval.

We finally settle on gamma = 24:
clf = SVR(gamma = 24)
clf.fit(X_train,y_train)
predictions = clf.predict(X_test)
print('ROC-AUC yields: ' + str(roc_auc_score(y_test,predictions)))
ROC-AUC yields: 0.45463709677419356
Switching the model to a CNN
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
import pandas as pd
import numpy as np
from sklearn.metrics import roc_auc_score
data = pd.read_csv('input/Combined_News_DJIA.csv')
data['combined_news'] = data.filter(regex = ("Top.*")).apply(lambda x:''.join(str(x.values)),axis = 1)
train = data[data['Date'] < '2015-01-01']
test = data[data['Date'] > '2014-12-31']
X_train = train[train.columns[2:]]#take every column from the third on, i.e. the news content
corpus = X_train.values.flatten().astype(str)
#treat each headline as one sentence: flatten() gives a list where each item is a single headline
X_train = X_train.values.astype(str)
X_train = np.array([' '.join(x) for x in X_train])
X_test = test[test.columns[2:]]
X_test = X_test.values.astype(str)
X_test = np.array([' '.join(x) for x in X_test])
#this time each entry of X_train/X_test is one day's 25 headlines joined with spaces
#unlike the earlier combined_news column, which stringified each row's values in one go
y_train = train['Label'].values
y_test = test['Label'].values
#token
from nltk.tokenize import word_tokenize
corpus = [word_tokenize(x) for x in corpus]
X_train = [word_tokenize(x) for x in X_train]
X_test = [word_tokenize(x) for x in X_test]
#preprocessing: clean up the tokens
#stopwords
from nltk.corpus import stopwords
stop = stopwords.words('english')
# digits
import re
def hasNumbers(inputString):
    return bool(re.search(r'\d', inputString))
# special symbols
def isSymbol(inputString):
    return bool(re.match(r'[^\w]', inputString))
# lemma
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
def check(word):
    """
    Return True if the word should be kept,
    False if it should be removed.
    """
    word = word.lower()
    if word in stop:
        return False
    elif hasNumbers(word) or isSymbol(word):
        return False
    else:
        return True
# combine the steps above
def preprocessing(sen):
    res = []
    for word in sen:
        if check(word):
            # this only strips the b'/b" markers left over from storing bytes as str; the raw data wasn't cleaned upstream, other datasets won't have this
            word = word.lower().replace("b'", '').replace('b"', '').replace('"', '').replace("'", '')
            res.append(wordnet_lemmatizer.lemmatize(word))
    return res
corpus = [preprocessing(x) for x in corpus]
X_train = [preprocessing(x) for x in X_train]
X_test = [preprocessing(x) for x in X_test]
#NLP model: Word2Vec
from gensim.models.word2vec import Word2Vec
model = Word2Vec(corpus, size=128, window=5, min_count=5, workers=4)
#corpus serves as the training corpus
#size is the dimensionality of the word vectors: each word becomes a 128-dim vector
#this only yields a vector per word; we still have no vector for a whole text
#save the processed X_train, X_test, and the Word2Vec model trained on corpus
import pickle
with open('X_test.pickle','wb') as file1:
    pickle.dump(X_test,file1)
with open('X_train.pickle','wb') as file2:
    pickle.dump(X_train,file2)
with open('word2vec_model.pickle','wb') as file3:
    pickle.dump(model,file3)
Since keras is not installed in my current environment, I save the data here and load it in another environment.
In the environment with the neural-network stack installed, load the data back in:
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
import pandas as pd
import numpy as np
from sklearn.metrics import roc_auc_score
data = pd.read_csv('input/Combined_News_DJIA.csv')
data['combined_news'] = data.filter(regex = ("Top.*")).apply(lambda x:''.join(str(x.values)),axis = 1)
train = data[data['Date'] < '2015-01-01']
test = data[data['Date'] > '2014-12-31']
y_train = train['Label'].values
y_test = test['Label'].values
import pickle
with open('X_train.pickle','rb') as file1:
    X_train = pickle.load(file1)
with open('X_test.pickle','rb') as file2:
    X_test = pickle.load(file2)
with open('word2vec_model.pickle','rb') as file3:
    model = pickle.load(file3)
We still need X_train, X_test, model, y_train and y_test.
The model:
###use the model to turn the first 256 words of each day's news into 256 vectors of 128 dims, i.e. a 256*128 matrix
#for each day's news we keep the first 256 words; shorter texts are padded with zero vectors
# vec_size is the dimensionality of the word vectors
def transform_to_matrix(x, padding_size=256, vec_size=128):
    res = []
    for sen in x:
        matrix = []
        for i in range(padding_size):
            try:
                matrix.append(model[sen[i]].tolist())
            except:
                # two cases land here:
                # 1. the word is out of vocabulary
                # 2. sen is shorter than padding_size
                # either way we append an all-zero vector
                matrix.append([0] * vec_size)
        res.append(matrix)
    return res
X_train = transform_to_matrix(X_train)
X_test = transform_to_matrix(X_test)
#X_train is now 1611 matrices
#X_test is now 378 matrices
#convert to numpy arrays for easier handling
X_train = np.array(X_train)
X_test = np.array(X_test)
print('X_train.shape:',X_train.shape)
print('X_test.shape:',X_test.shape)
#reshape into the 4D tensor the CNN expects
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1], X_train.shape[2])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1], X_test.shape[2])
print('X_train.reshape',X_train.shape)
print('X_test.reshape',X_test.shape)
###build the network
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Convolution2D,MaxPooling2D
from keras.layers.core import Dense,Dropout,Activation,Flatten
#set parameters
batch_size = 32
n_filter = 16
filter_length = 4
nb_epoch = 5
n_pool = 2
#create a Sequential model
model = Sequential()
model.add(Convolution2D(n_filter,(filter_length,filter_length),activation = 'relu',input_shape = (1,256,128),data_format = 'channels_first'))
model.add(Convolution2D(n_filter,(filter_length,filter_length),activation = 'relu'))
model.add(MaxPooling2D(pool_size = (n_pool,n_pool)))
model.add(Dropout(0.25))
model.add(Flatten())
#followed by a fully-connected head
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('softmax'))#note: softmax over a single output unit always outputs 1; sigmoid is the usual choice for binary classification
#compile the model (mse trains, but binary_crossentropy is the standard loss for binary classification)
model.compile(loss = 'mse',optimizer = 'adadelta',metrics = ['accuracy'])
#fit
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
verbose=0)
score = model.evaluate(X_test, y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
model.add(Convolution2D(n_filter,(filter_length,filter_length),activation = 'relu',input_shape = (1,256,128),data_format = 'channels_first'))
In this line, data_format = 'channels_first' is essential; without it the model kept failing with:
(op: 'Conv2D') with input shapes: [?,1,256,128], [4,4,128,16].
This is usually caused by putting the channel axis in the wrong position.
Keras supports both the TensorFlow and the Theano data layouts. Here X_train and X_test were reshaped to:
X_train.reshape (1611, 1, 256, 128)
X_test.reshape (378, 1, 256, 128)
which follows the Theano convention, so the channel axis must come first: channels_first.
Under the TensorFlow convention the shape would be (1611, 256, 128, 1);
under the Theano convention it is (1611, 1, 256, 128).
For example, for RGB (3-channel) images with samples = 128:
TensorFlow convention: (128, 256, 256, 3)
Theano convention: (128, 3, 256, 256)
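The two conventions differ only by a permutation of axes, which numpy can make explicit (shapes taken from the RGB example above):

```python
import numpy as np

# Theano / channels_first layout: (samples, channels, height, width)
x_first = np.zeros((128, 3, 256, 256))

# move the channel axis to the end -> TensorFlow / channels_last layout
x_last = np.moveaxis(x_first, 1, -1)
print(x_last.shape)  # (128, 256, 256, 3)
```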
The final result:
Test score: 0.4920634942710715
Test accuracy: 0.5079365098287189