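1. Running the shell script directly with bash run.sh causes problems.
Workaround: copy the contents of the script and run them in the shell directly.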
2. Be careful with the '\' character when writing paths: one too many or one too few will cause problems.
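3. The parameters do_train, do_eval and do_predict control whether training, evaluation and prediction are run. Set each to True or False as needed, but at least one of them must be True.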
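The rule in item 3 can be sketched as a small check. This is only an illustration: `check_run_modes` is a hypothetical helper written for these notes, not a function from the BERT code, and the error message wording is an assumption.

```python
def check_run_modes(do_train: bool, do_eval: bool, do_predict: bool) -> list:
    """Return the names of the enabled modes; raise if none is enabled.

    Hypothetical helper: it only mirrors the rule that at least one of
    the three flags has to be True.
    """
    modes = {"train": do_train, "eval": do_eval, "predict": do_predict}
    enabled = [name for name, is_on in modes.items() if is_on]
    if not enabled:
        raise ValueError(
            "At least one of do_train, do_eval or do_predict must be True."
        )
    return enabled


if __name__ == "__main__":
    # Train and predict, skip evaluation.
    print(check_run_modes(do_train=True, do_eval=False, do_predict=True))
```

Running a check like this before launching a long job fails fast when all three flags were left as False by mistake.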
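4. Running out of memory
- The smaller max_seq_length and train_batch_size are, the less memory is used.
- Among the pretrained models, BERT-Base uses less memory than BERT-Large.
- Using a different optimizer also has some effect on memory use.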
max_seq_length: The released models were trained with sequence lengths
up to 512, but you can fine-tune with a shorter max sequence length to save
substantial memory. This is controlled by the max_seq_length flag in our
example code.
train_batch_size: The memory usage is also directly proportional to
the batch size.
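As a rough way to compare two settings, the number of tokens processed per training step can serve as a proxy. This is a sketch under stated assumptions: `tokens_per_batch` is a hypothetical helper, the values 32/512 and 16/128 are made-up examples, and the result is only a relative comparison, not a memory estimate in bytes. Self-attention cost grows faster than linearly with sequence length, so the real saving from a shorter max_seq_length can be larger than this proxy suggests.

```python
def tokens_per_batch(train_batch_size: int, max_seq_length: int) -> int:
    """Tokens fed to the model in one training step (a rough memory proxy)."""
    return train_batch_size * max_seq_length


# Example settings chosen only for illustration.
baseline = tokens_per_batch(train_batch_size=32, max_seq_length=512)
reduced = tokens_per_batch(train_batch_size=16, max_seq_length=128)

# Halving the batch size and quartering the sequence length
# cuts the tokens per step by a factor of 8.
print(baseline, reduced, baseline / reduced)  # 16384 2048 8.0
```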
5. Reading files
Read the CSV with pandas; the result is a table similar to an Excel sheet.
import pandas as pd
path = '/home/weibo_classification/'
pd_all = pd.read_csv(path + 'weibo