raise PDFEncryptionError('Unknown algorithm: param=%r' % param) pdfminer.pdfdocument.PDFEncryptionE...

使用pdfminer遇到的pdf文件加密问题:

raise PDFEncryptionError('Unknown algorithm: param=%r' % param)

pdfminer.pdfdocument.PDFEncryptionError: Unknown algorithm: param={'CF': {'StdCF': {'Length': 16, 'CFM': /AESV2, 'AuthEvent': /DocOpen}}, 'O': '}\xe2>\xf1\xf6\xc6\x8f\xab\x1f"O\x9bfc\xcd\x15\xe09~2\xc9\\x87\x03\xaf\x17f>\x13\t^K\x99', 'Filter': /Standard, 'P': -1548, 'Length': 128, 'R': 4, 'U': 'Kk>\x14\xf7\xac\xe6\x97\xb35\xaby!\x04|\x18(\xbfN^Nu\x8aAd\x00NV\xff\xfa\x01\x08', 'V': 4, 'StmF': /StdCF, 'StrF': /StdCF}

 

原因:这个pdf文件有密码,但密码是空字符串,所以必须要解密一下才可以做解析

 

解决方案:

from subprocess import call
# pdf_filename代表源文件路径, pdf_copy_filename代表解密后的文件路径, 密码为''
call('qpdf --password=%s --decrypt %s %s' % ('', pdf_filename, pdf_copy_filename), shell=True)

 之后直接解析pdf_copy_filename文件即可!

 

转载于:https://www.cnblogs.com/LiCheng-/p/7550282.html

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Sure! Here's one possible implementation of the mean-shift algorithm: ```python def mean_shift(xs: np.ndarray, num_iter: int = 50, k_type: str = 'rbf', bandwidth: float = 0.1) -> np.ndarray: """ Implement the mean-shift algorithm :param xs: a set of samples with size (N, D), where N is the number of samples, D is the dimension of features :param num_iter: the number of iterations :param k_type: the type of kernels, including 'rbf', 'gate', 'triangle', 'linear' :param bandwidth: the hyperparameter controlling the width of rbf/gate/triangle kernels :return: the estimated means with size (N, D) """ N, D = xs.shape means = xs.copy() for i in range(num_iter): for j, x in enumerate(xs): # compute the kernel weights if k_type == 'rbf': w = np.exp(-0.5 * np.sum((x - means) ** 2, axis=1) / bandwidth ** 2) elif k_type == 'gate': w = np.exp(-np.sqrt(np.sum((x - means) ** 2, axis=1)) / bandwidth) elif k_type == 'triangle': w = np.maximum(0, 1 - np.sqrt(np.sum((x - means) ** 2, axis=1)) / bandwidth) elif k_type == 'linear': w = np.sum((x - means) * xs, axis=1) / (np.linalg.norm(x - means, axis=1) * np.linalg.norm(x)) else: raise ValueError("Unsupported kernel type: {}".format(k_type)) # update the mean by weighted average means[j] = np.sum(w[:, np.newaxis] * xs, axis=0) / np.sum(w) return means ``` In this implementation, we start with the initial means being the same as the input samples, and then iteratively update the means by computing the kernel weights between each sample and the current means, and then computing the weighted average of the samples using these weights. The kernel can be selected using the `k_type` parameter, and the kernel width can be controlled by the `bandwidth` parameter. The number of iterations can be controlled by the `num_iter` parameter. Finally, the estimated means are returned.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值