  • Blog posts (5)
  • Resources (1)

Original: NLP · PyTorch

Since I have only just started writing PyTorch code, many functions are new to me, so I am collecting their usage here for later reference.
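As a flavor of the kind of usage notes such a post collects, here is a minimal, hypothetical sketch of a few commonly confused PyTorch tensor operations; the specific functions shown are my own choice for illustration and are not necessarily the ones covered in the post (assumes PyTorch is installed):

```python
import torch

# view: reinterpret the same data with a new shape
x = torch.arange(12)                   # shape (12,)
m = x.view(3, 4)                       # shape (3, 4)

# unsqueeze / squeeze: add or remove a size-1 dimension
row = m[0].unsqueeze(0)                # shape (1, 4)
flat = row.squeeze(0)                  # shape (4,)

# gather: pick elements along a dimension with an index tensor
idx = torch.tensor([[0, 2], [1, 3], [0, 1]])
picked = m.gather(dim=1, index=idx)    # shape (3, 2)

print(m.shape, row.shape, flat.shape, picked)
```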

2022-11-02 09:57:55 213 1

Original: Natural Language Processing: Methods Based on Pre-trained Models - Reading Notes (Part 1)

Reading notes on Natural Language Processing: Methods Based on Pre-trained Models. Chapter 1, Introduction: basic concepts of natural language processing; the eight difficulties facing NLP (abstraction, compositionality, ambiguity, evolution, irregularity, subjectivity, reliance on knowledge, and poor transferability); classification of NLP tasks. 1.1 The concept of NLP: (1) natural language processing (NLP) studies the theories and methods that let computers understand and generate natural language, and is often also called computational linguistics (CL); (2) AI has developed from computational intelligence to perceptual intelligence to cognitive intelligence. 1.2 The difficulties of NLP: the difficulty for computers in processing natural language lies in the high degree of abstraction of natural language...

2022-05-14 14:24:54 361

Original: The painful process of installing HanLP

Installing HanLP was a truly painful process, because it kept failing and every guide online wanted me to download something else. To summarize: 1. Download Anaconda. 2. Once Anaconda is installed, run these two commands in its command prompt: "conda install -c conda-forge jpype1" and "pip install pyhanlp". The wait is extremely long, and the campus network does not help; it was already painfully slow when I started the download in the morning...
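A minimal sketch of how the installation can be checked once the two commands above finish; the sample sentence is arbitrary, and on the first run pyhanlp downloads the HanLP 1.x data files, which also takes a while:

```python
from pyhanlp import HanLP  # pyhanlp wraps HanLP 1.x through JPype

# Segment a Chinese sentence; each result is a Term with .word and .nature (POS tag).
terms = HanLP.segment("自然语言处理是人工智能的一个重要方向")
print([str(t.word) for t in terms])
```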

2019-06-24 16:55:09 683

Original: Detailed steps for installing the jieba library for Python

1. Download jieba, the third-party Chinese word-segmentation library, from https://pypi.org/project/jieba/#files. 2. Install it from the command prompt: first change into the directory that contains jieba's setup.py, then run "python setup.py install". Once the command finishes, the installation is complete. Finally, run a small example to check that it is installed correctly...
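A minimal sketch for that final test, assuming the install succeeded; the sample sentence is the one used in jieba's own documentation, and "pip install jieba" is an alternative to the setup.py route described above:

```python
import jieba

# Default (precise) mode; lcut returns a plain Python list of words.
words = jieba.lcut("我来到北京清华大学")
print(words)  # expected: ['我', '来到', '北京', '清华大学']
```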

2018-05-17 15:00:43 124898 36

Original: ACM - Union-Find - How Many Tables - HDU 1213

D - How Many Tables. Today is Ignatius' birthday. He invites a lot of friends. Now it's dinner time. Ignatius wants to know how many tables he needs at least. You have to notice that not all the f...
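Since the post is filed under union-find, here is a minimal sketch of the classic disjoint-set approach to this kind of problem, written in Python rather than the C/C++ usual for ACM submissions; it assumes the standard HDU 1213 setup of n people (numbered from 1) and a list of friend pairs, with the answer being the number of connected components:

```python
def min_tables(n, friend_pairs):
    """Minimum number of tables = connected components among people 1..n."""
    parent = list(range(n + 1))

    def find(x):                       # root lookup with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):                   # merge the components of a and b
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for a, b in friend_pairs:
        union(a, b)
    return len({find(i) for i in range(1, n + 1)})

# Sample from the HDU 1213 statement: 5 people, friendships (1,2), (2,3), (4,5) -> 2 tables
print(min_tables(5, [(1, 2), (2, 3), (4, 5)]))
```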

2017-07-16 20:54:36 343

Automatically Maintaining Wrappers for Semi-Structured Web Sources

Available for download if needed; it is the PDF file only and does not include source code. The abstract is as follows: A substantial subset of the web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents. A program able to provide software applications with a structured view of those semi-structured web sources is usually called a wrapper. Wrappers are able to accept a query against the source and return a set of structured results, thus enabling applications to access web data in a similar manner to that of information from databases. A significant problem in this approach arises because web sources may experiment changes that invalidate the current wrappers. In this paper, we present novel heuristics and algorithms to address this problem. Our approach is based on collecting some query results during wrapper operation. Then, when the source changes, they are used to generate a set of labeled examples that are then provided as input to a wrapper induction algorithm able to regenerate the wrapper. We have tested our methods in several real-world web data extraction domains, obtaining high accuracy in all the steps of the process.

2020-04-06
