1. Problematic code
model = TFBertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
2. Error message
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "D:\Anaconda\lib\site-packages\transformers\modeling_tf_utils.py", line 1292, in from_pretrained
missing_keys, unexpected_keys = load_tf_weights(model, resolved_archive_file, load_weight_prefix)
File "D:\Anaconda\lib\site-packages\transformers\modeling_tf_utils.py", line 471, in load_tf_weights
with h5py.File(resolved_archive_file, "r") as f:
File "D:\Anaconda\lib\site-packages\h5py\_hl\files.py", line 406, in __init__
fid = make_fid(name, mode, userblock_size,
File "D:\Anaconda\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (truncated file: eof = 242385056, sblock->base_addr = 0, stored_eof = 478309336)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\pycharm\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 1448, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "D:\pycharm\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "F:/09-code/11-自然语言处理与知识图谱/pypro/chapters03/demo02_Bert实现文本分类.py", line 33, in <module>
model = TFBertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
File "D:\Anaconda\lib\site-packages\transformers\modeling_tf_utils.py", line 1294, in from_pretrained
raise OSError(
OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True.
3. Error analysis
The error occurs while loading weights from an H5 file for the TFBertForSequenceClassification model in TensorFlow. The traceback contains two separate messages; the breakdown below explains each one and then gives a step-by-step plan to resolve the problem.
- Error Analysis:
  - Truncated file: The first error, "OSError: Unable to open file (truncated file: eof = 242385056, sblock->base_addr = 0, stored_eof = 478309336)", means the H5 file on disk is incomplete. Here eof is the actual size of the file and stored_eof is the size recorded inside the file itself, so only about half of the expected 478 MB is present. This usually results from an interrupted download or a corrupted file.
  - Loading weights error: The second error, "OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True", is raised by transformers because the H5 file could not be read. It suggests that if the weights come from a PyTorch checkpoint, you should pass the from_pt=True argument.
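To make the truncation concrete, the two sizes can be pulled out of the error text and compared. This is only a minimal sketch: parse_truncation is a hypothetical helper written for this post, and it assumes the message has the same "eof = ..., stored_eof = ..." layout as the one in the traceback above.

```python
import re


def parse_truncation(message: str):
    """Return (actual_size, expected_size) in bytes, or None if the text does not match.

    Assumes the h5py message layout seen in the traceback above.
    """
    match = re.search(r"eof = (\d+).*?stored_eof = (\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


message = (
    "Unable to open file (truncated file: eof = 242385056, "
    "sblock->base_addr = 0, stored_eof = 478309336)"
)
actual, expected = parse_truncation(message)
# Report how much of the file is on disk and how many bytes are missing.
print(f"{actual / expected:.1%} of the file is present; "
      f"{expected - actual} bytes are missing")
```

A large gap between the two numbers points to an incomplete download rather than to a problem in the model code.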
- Step-by-Step Plan:
  - Step 1: Verify the integrity of the H5 file. Make sure the file was downloaded completely and correctly; if it is corrupted, download it again.
  - Step 2: Check the model source. Determine whether the model you are loading was originally a PyTorch model. If so, pass the from_pt=True argument to the from_pretrained method.
  - Step 3: Reinstall or update dependencies. Outdated or corrupted installations can cause such errors, so consider reinstalling or updating the transformers and h5py libraries.
  - Step 4: Check the TensorFlow version. Make sure your TensorFlow version is compatible with the version used for the model; incompatibility can lead to such errors.
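Step 1 can be sketched as a small retry wrapper: try the normal load, and if it fails with OSError (as a truncated cache file does), try once more while asking for a fresh download. load_with_redownload is a hypothetical helper, not part of transformers; it relies only on force_download being a keyword argument accepted by from_pretrained, and it was not part of the original fix.

```python
from typing import Any, Callable


def load_with_redownload(loader: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call loader; on OSError, retry once with force_download=True.

    A second failure is not caught and propagates to the caller.
    """
    try:
        return loader(*args, **kwargs)
    except OSError:
        # The cached file may be truncated, so request a fresh copy.
        retry_kwargs = {**kwargs, "force_download": True}
        return loader(*args, **retry_kwargs)


# Intended usage (requires transformers and TensorFlow to be installed):
# from transformers import TFBertForSequenceClassification
# model = load_with_redownload(
#     TFBertForSequenceClassification.from_pretrained,
#     "bert-base-chinese",
#     num_labels=2,
# )
```

Passing the loader in as an argument keeps the helper independent of any particular model class, at the cost of one extra line at the call site.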
- Default Assumptions:
  - If no additional information is provided, I will assume that the H5 file is intended to be compatible with TensorFlow and that you have a stable internet connection for downloading files.
- Additional Information:
  - It would be helpful to know where you obtained the H5 file and whether it is a custom-trained model or a pre-trained model provided by Hugging Face or another source.
- Confirmation:
  - Please confirm whether the above plan seems appropriate, or whether you have any specific details to add regarding the source of the model or the setup of your environment.
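4. Modified code, problem solved
Added from_pt=True to the from_pretrained call, so that the weights are loaded from the PyTorch checkpoint rather than from the truncated H5 file: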
model = TFBertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2, from_pt=True)