pd.read_excel raises an error when analyzing an Excel file with pandas and numpy

ModuleNotFoundError                       Traceback (most recent call last)
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\compat\_optional.py:135, in import_optional_dependency(name, extra, errors, min_version)
    134 try:
--> 135     module = importlib.import_module(name)
    136 except ImportError:

File ~\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py:126, in import_module(name, package)
    125         level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1050, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1027, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:1004, in _find_and_load_unlocked(name, import_)

ModuleNotFoundError: No module named 'xlrd'

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
Cell In[2], line 1
----> 1 table1 = pd.read_excel("./升学46.xlsx", "c4g")
      2 table1.head()

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\excel\_base.py:495, in read_excel(io, sheet_name, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, date_format, thousands, decimal, comment, skipfooter, storage_options, dtype_backend, engine_kwargs)
    493 if not isinstance(io, ExcelFile):
    494     should_close = True
--> 495     io = ExcelFile(
    496         io,
    497         storage_options=storage_options,
    498         engine=engine,
    499         engine_kwargs=engine_kwargs,
    500     )
    501 elif engine and engine != io.engine:
    502     raise ValueError(
    503         "Engine should not be specified when passing "
    504         "an ExcelFile - ExcelFile already has the engine set"
    505     )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\excel\_base.py:1567, in ExcelFile.__init__(self, path_or_buffer, engine, storage_options, engine_kwargs)
   1564 self.engine = engine
   1565 self.storage_options = storage_options
-> 1567 self._reader = self._engines[engine](
   1568     self._io,
   1569     storage_options=storage_options,
   1570     engine_kwargs=engine_kwargs,
   1571 )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\excel\_xlrd.py:45, in XlrdReader.__init__(self, filepath_or_buffer, storage_options, engine_kwargs)
     33 """
     34 Reader using xlrd engine.
     35 
   (...)
     42     Arbitrary keyword arguments passed to excel engine.
     43 """
     44 err_msg = "Install xlrd >= 2.0.1 for xls Excel support"
---> 45 import_optional_dependency("xlrd", extra=err_msg)
     46 super().__init__(
     47     filepath_or_buffer,
     48     storage_options=storage_options,
     49     engine_kwargs=engine_kwargs,
     50 )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\compat\_optional.py:138, in import_optional_dependency(name, extra, errors, min_version)
    136 except ImportError:
    137     if errors == "raise":
--> 138         raise ImportError(msg)
    139     return None
    141 # Handle submodules: if we have submodule, grab parent module from sys.modules

ImportError: Missing optional dependency 'xlrd'. Install xlrd >= 2.0.1 for xls Excel support Use pip or conda to install xlrd.

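This error means the xlrd library is missing from your Python environment. pandas treats xlrd as an optional dependency and, as the last line of the traceback says, needs xlrd >= 2.0.1 for xls Excel support. You can install xlrd with pip as follows: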

1. Open cmd (the Windows command prompt).

2. Run pip install xlrd

3. Run the code again. The error is gone and the table is displayed.
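
The traceback shows pandas choosing the xlrd reader even though the file name ends in .xlsx, which suggests the file may actually be stored in the legacy .xls format. The sketch below is one way to check that before installing anything: it looks at the file's first bytes and reports whether the matching engine module can be imported. The helper names `sniff_excel_format` and `engine_status` are made up for this illustration and are not part of pandas. Treating every ZIP container as xlsx is a simplifying assumption.

```python
import importlib.util

# Leading bytes that identify the two common Excel containers.
XLS_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (legacy .xls)
ZIP_SIGNATURE = b"PK\x03\x04"                        # ZIP archive (.xlsx, .xlsm)

# Engines pandas uses by default: xlrd for .xls, openpyxl for .xlsx.
ENGINE_FOR_FORMAT = {"xls": "xlrd", "xlsx": "openpyxl"}


def sniff_excel_format(path):
    """Guess the real format from the first bytes, ignoring the extension.

    Assumption: any ZIP container is reported as "xlsx", although other
    spreadsheet formats (for example .ods) are ZIP archives too.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(XLS_SIGNATURE):
        return "xls"
    if head.startswith(ZIP_SIGNATURE):
        return "xlsx"
    return None


def engine_status(path):
    """Return (engine_name, is_installed), or (None, False) if the format is unknown."""
    fmt = sniff_excel_format(path)
    if fmt is None:
        return None, False
    engine = ENGINE_FOR_FORMAT[fmt]
    # find_spec returns None when the module cannot be found in this environment.
    return engine, importlib.util.find_spec(engine) is not None
```

For the file in this post, `engine_status("./升学46.xlsx")` returning `("xlrd", False)` would match the error above, and `pip install xlrd` is then the fix. In a notebook it can help to install with `python -m pip install xlrd`, using the same interpreter the kernel runs, so the package lands in the environment that pandas is imported from.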
