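Python's dynamic typing is great fun for whoever writes the code and a mixed blessing for whoever has to read it. Let me illustrate with an example:
I recently downloaded a seq2seq program from GitHub, and its data-loading code looks like this: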
data = pd.read_csv("./data/neural_network_patent_query.csv")
data.head()
training_dict, word_idx, idx_word, sequences = get_data("./data/neural_network_patent_query.csv", training_len = 50)
I now need to modify this code, so I have to know the types of training_dict, word_idx, idx_word, and sequences. Pressing F12 (Go to Definition) on get_data takes me here:
def get_data(file, filters='!"%;[\\]^_`{|}~ ', training_len=50,
             lower=False):
    """Retrieve formatted training and validation data from a file"""
    # Load the patents and drop rows that have no abstract
    data = pd.read_csv(file, parse_dates=["patent_date"]).dropna(subset=["patent_abstract"])
    abstracts = [format_sequence(a) for a in list(data["patent_abstract"])]
    # Build the vocabulary mappings, sequences, features and labels
    word_idx, idx_word, num_words, word_counts, texts, sequences, features, labels = make_sequences(
        abstracts, training_len, lower, filters)
    X_train, X_valid, y_train, y_valid = create_train_valid(features, labels, num_words)
    training_dict = {"X_train": X_train, "X_valid": X_valid,
                     "y_train": y_train, "y_valid": y_valid}
    return training_dict, word_idx, idx_word, sequences
??????
First of all, data = pd.read_csv is written twice (although that one is not Python's fault)...
Next, what on earth are X_train, X_valid, y_train, and y_valid?
And what is make_sequences?
And what is format_sequence?
And what is create_train_valid?
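On top of that, VS Code cannot offer any type hints for these values, and type() just prints numpy.ndarray, which says nothing about shape or contents. It took me a whole morning to figure out what the code was trying to say.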
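One way to shorten that kind of digging is to print a compact summary of whatever a function returns. The helper below is a minimal sketch and not part of the original repository: describe is a hypothetical name, and it relies only on the shape and dtype attributes that NumPy arrays expose, so it runs with nothing beyond the standard library.

```python
def describe(obj, depth=0, max_depth=2):
    """Return a short text summary of obj's type and structure.

    Hypothetical helper for illustration; not part of the original code.
    """
    indent = "  " * depth
    name = type(obj).__name__
    # Array-like objects (e.g. numpy.ndarray) expose shape and dtype.
    shape = getattr(obj, "shape", None)
    if shape is not None:
        dtype = getattr(obj, "dtype", "?")
        return f"{indent}{name} shape={tuple(shape)} dtype={dtype}"
    if isinstance(obj, dict):
        lines = [f"{indent}dict with {len(obj)} keys"]
        if depth < max_depth:
            for key, value in obj.items():
                lines.append(f"{indent}  {key!r}:")
                lines.append(describe(value, depth + 2, max_depth))
        return "\n".join(lines)
    if isinstance(obj, (list, tuple)):
        head = f"{indent}{name} of length {len(obj)}"
        if obj and depth < max_depth:
            # Only the first element is inspected, assuming a homogeneous container.
            return head + ", first element:\n" + describe(obj[0], depth + 1, max_depth)
        return head
    return f"{indent}{name}"
```

Assuming the values in training_dict are NumPy arrays, as the type() output suggests, print(describe(training_dict)) would list each key together with the shape and dtype of its array.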
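Summary: in a mainstream statically typed language such as C#, a [return value made of 4 items] like this would need to be wrapped in a class, which greatly improves readability. In addition, in C# and similar languages you can read the return type directly from the function signature, unlike Python, where every variable is effectively forced to be a var.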
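For comparison, the same wrapping can be approximated in Python with dataclasses and type hints, which editors such as VS Code can use for hover and completion. The sketch below is only an illustration under stated assumptions: TrainingData, SeqData, and wrap_get_data_result are hypothetical names, and the field types are guessed from the variable names (the arrays are left as Any so the block runs without NumPy).

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class TrainingData:
    """Train/validation split; array types are unknown here, hence Any."""
    X_train: Any
    X_valid: Any
    y_train: Any
    y_valid: Any


@dataclass(frozen=True)
class SeqData:
    """Named, typed stand-in for the 4-tuple returned by get_data."""
    training: TrainingData
    word_idx: Dict[str, int]      # assumed: word -> integer index
    idx_word: Dict[int, str]      # assumed: integer index -> word
    sequences: List[List[int]]    # assumed: one list of word indices per abstract


def wrap_get_data_result(result: tuple) -> SeqData:
    """Convert the original (training_dict, word_idx, idx_word, sequences) tuple."""
    training_dict, word_idx, idx_word, sequences = result
    # The dict keys match the TrainingData field names, so ** unpacking works.
    return SeqData(TrainingData(**training_dict), word_idx, idx_word, sequences)
```

With this in place, wrap_get_data_result(get_data(...)) would give a value whose fields an editor can list by name. The annotations are hints only, though: Python does not enforce them at run time, so they document intent rather than guarantee it.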