An Embarrassingly Easy but Strong Baseline for Nested Named Entity Recognition

Paper link:

https://aclanthology.org/2023.acl-short.123.pdf

ACL 2023

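Introduction

        Problem

        Span-based approaches to nested NER mostly enumerate all spans first and then classify each one. In effect they produce a score matrix in which every element represents one span (for example, the cell at row n, column m corresponds to span(token_n, token_m)). The authors argue that this paradigm ignores the spatial relations between spans.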
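
        As a minimal sketch of the span-to-matrix mapping described above (the function names, the example sentence, and the score value are illustrative assumptions, not taken from the paper), the enumeration can be written as:

```python
def enumerate_spans(tokens):
    """Return every (start, end) pair with start <= end, using inclusive indices."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i, n)]


def empty_score_matrix(n):
    """n x n matrix; cell [i][j] is reserved for span(token_i, token_j)."""
    return [[0.0] * n for _ in range(n)]


tokens = ["New", "York", "University", "opened"]
spans = enumerate_spans(tokens)
scores = empty_score_matrix(len(tokens))

# Mark one span with a made-up score, purely for illustration.
start, end = 0, 2
scores[start][end] = 1.0
span_text = " ".join(tokens[start:end + 1])
```

        Only the upper triangle (start <= end) holds valid spans, so a sentence of n tokens yields n(n+1)/2 candidates.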

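        IDEA

        In the matrix, each span and its neighboring spans lie close to one another in the original sentence, so adjacent cells carry spatial semantic information. The authors therefore propose using a CNN to model the spatial relations between spans.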
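
        To make the adjacency argument concrete, here is a small sketch (the helper `neighbor_spans` is a hypothetical name, and the 3x3 window is an assumption chosen for illustration) that lists the spans sitting next to a given cell. Each neighbor differs from the center span by at most one token at either boundary:

```python
def neighbor_spans(start, end, n):
    """Spans in the 3x3 window around cell (start, end) of an n x n matrix.

    Cells with start > end are not valid spans, so the lower triangle is skipped.
    """
    out = []
    for ds in (-1, 0, 1):
        for de in (-1, 0, 1):
            s, e = start + ds, end + de
            if (ds, de) != (0, 0) and 0 <= s <= e < n:
                out.append((s, e))
    return out


neighbors = neighbor_spans(1, 2, 5)
```

        For span (1, 2) in a five-token sentence, the window contains spans that contain it, such as (0, 3), and spans inside it, such as (1, 1) and (2, 2), which is the kind of relation a convolution kernel can pick up.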

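Method

        Overall, the model first enumerates spans, then uses a Biaffine decoder to obtain a three-dimensional feature matrix. A CNN convolves over this matrix so that spans interact with one another, which enriches the span representations, and each span is finally classified. The paper's architecture figure shows the overall structure.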

Span-based Representation

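        A pretrained model (such as BERT) produces the embeddings for the input sentence. When a word is split into several tokens by the tokenizer, max-pooling over those tokens gives the embedding of the word.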
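
        The pooling step can be sketched as follows. This is an illustration under assumptions: the vectors are made-up three-dimensional values rather than real BERT outputs, and `word_ids` (the word index of each piece) is assumed to be supplied by the tokenizer.

```python
def pool_word_embeddings(piece_vectors, word_ids):
    """Element-wise max over the piece vectors that belong to each word.

    piece_vectors: list of equal-length float lists, one per word piece.
    word_ids: word index of each piece, non-decreasing and starting at 0.
    """
    words = []
    for vec, wid in zip(piece_vectors, word_ids):
        if wid == len(words):
            words.append(list(vec))  # first piece of a new word
        else:
            words[wid] = [max(a, b) for a, b in zip(words[wid], vec)]
    return words


# Hypothetical vectors for the pieces "em", "##bed", "##ding", "layer".
pieces = [[0.1, 0.9, -0.2], [0.4, 0.2, 0.0], [0.3, 0.5, -0.7], [1.0, -1.0, 0.5]]
word_ids = [0, 0, 0, 1]
word_vectors = pool_word_embeddings(pieces, word_ids)
```

        The result has one vector per word, so the span matrix built later is indexed by words rather than by word pieces.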

        A multi-head Biaffine decoder then produces the score matrix R over all spans.
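
        The sketch below shows a generic biaffine scorer, not the paper's exact multi-head formulation. It assumes each word has a start-role vector and an end-role vector, and that every output channel c has its own weight matrix U[c] and bias. All names and values are illustrative.

```python
def biaffine_scores(starts, ends, U, bias):
    """R[i][j][c] = starts[i]^T U[c] ends[j] + bias[c].

    starts, ends: n vectors of size d (start-role and end-role views of the words).
    U: one d x d matrix per output channel; bias: one float per channel.
    """
    n = len(starts)
    R = [[[0.0] * len(U) for _ in range(n)] for _ in range(n)]
    for i, s in enumerate(starts):
        for j, e in enumerate(ends):
            for c, Uc in enumerate(U):
                # left = s^T U[c], a vector of size d
                left = [sum(s[a] * Uc[a][b] for a in range(len(s))) for b in range(len(e))]
                R[i][j][c] = sum(l * x for l, x in zip(left, e)) + bias[c]
    return R


starts = [[1.0, 0.0], [0.0, 1.0]]
ends = [[1.0, 2.0], [3.0, 4.0]]
U = [[[1.0, 0.0], [0.0, 1.0]]]  # one channel, identity matrix
R = biaffine_scores(starts, ends, U, bias=[0.5])
```

        The output has shape n x n x channels, which matches the three-dimensional feature matrix mentioned in the method overview and is what the CNN consumes next.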

CNN on Feature Matrix

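        A CNN models the interaction between each span and its neighboring spans.

        Because sentences contain different numbers of tokens, the score matrix R differs in size from sentence to sentence. To allow batched computation, the matrices are padded with zeros.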
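
        A minimal sketch of the padding and convolution steps follows. It is single-channel and uses a fixed, hypothetical kernel that simply sums the 3x3 window. The real model learns its kernels and works on a multi-channel feature matrix, so this only illustrates the mechanics.

```python
def pad_matrix(matrix, size):
    """Zero-pad a square n x n matrix to size x size (n <= size)."""
    n = len(matrix)
    return [
        [matrix[i][j] if i < n and j < n else 0.0 for j in range(size)]
        for i in range(size)
    ]


def conv3x3(matrix, kernel):
    """Same-size 3x3 cross-correlation with zero padding at the border."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < n and 0 <= c < n:
                        total += kernel[di + 1][dj + 1] * matrix[r][c]
            out[i][j] = total
    return out


# Two sentences of length 2 and 3 share one batch after padding to 3 x 3.
batch = [
    [[1.0, 2.0], [0.0, 3.0]],
    [[1.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]],
]
max_len = max(len(m) for m in batch)
padded = [pad_matrix(m, max_len) for m in batch]

sum_kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical kernel: sums the window
mixed = [conv3x3(m, sum_kernel) for m in padded]
```

        After the convolution, each cell mixes in values from its neighbors, including padded cells. A practical implementation would typically mask the padded positions so they do not contribute to the loss.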

Output

        An MLP produces the prediction logits for each span.

        The model's loss function is a binary cross-entropy over the span classifications.
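
        A binary cross-entropy over per-span, per-label logits can be sketched as below. This is a plain-Python illustration with made-up logits; the averaging over all (span, label) pairs is an assumption, and the naive `log` calls are not numerically stable for very large logits.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(logits, targets):
    """Mean BCE over all (span, label) pairs.

    logits: one list of label logits per span; targets: matching 0/1 lists.
    """
    total, count = 0.0, 0
    for span_logits, span_targets in zip(logits, targets):
        for z, y in zip(span_logits, span_targets):
            p = sigmoid(z)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
            count += 1
    return total / count


# Two spans, two entity labels each (hypothetical values).
uninformed = binary_cross_entropy([[0.0, 0.0], [0.0, 0.0]], [[1, 0], [0, 0]])
confident = binary_cross_entropy([[4.0, -4.0], [-4.0, -4.0]], [[1, 0], [0, 0]])
```

        Treating each label as an independent binary decision lets one span carry several labels or none, which is why a sigmoid is used here rather than a softmax over labels.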

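Experiments

        Experiments were run on the ACE2004 and ACE2005 datasets; the results are reported in the paper's results table.

        Experiments were also run on the GENIA dataset (with BioBERT-base as the pretrained model); the results are likewise reported in the paper.

        To study why the CNN helps nested NER, the authors split entities into two groups: nested entities and flat entities. They design four metrics: FEPR (flat entity precision), FERE (flat entity recall), NEPR (nested entity precision), and NERE (nested entity recall).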
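
        The following sketch shows one way such flat/nested metrics could be computed. The definitions are assumptions made for illustration and may differ from the paper's: an entity counts as nested if it shares at least one token with another entity in the same set, recall is measured over gold entities of each kind, and precision over predicted entities of each kind.

```python
def overlaps(a, b):
    """True if inclusive spans a and b (start, end, label) share a token."""
    return a[0] <= b[1] and b[0] <= a[1]


def split_flat_nested(entities):
    """Partition a set of entities into (flat, nested)."""
    flat, nested = set(), set()
    for e in entities:
        if any(e != o and overlaps(e, o) for o in entities):
            nested.add(e)
        else:
            flat.add(e)
    return flat, nested


def ratio(hit, total):
    return hit / total if total else 0.0


def flat_nested_metrics(gold, pred):
    gold, pred = set(gold), set(pred)
    gold_flat, gold_nested = split_flat_nested(gold)
    pred_flat, pred_nested = split_flat_nested(pred)
    return {
        "FEPR": ratio(len(pred_flat & gold), len(pred_flat)),
        "FERE": ratio(len(gold_flat & pred), len(gold_flat)),
        "NEPR": ratio(len(pred_nested & gold), len(pred_nested)),
        "NERE": ratio(len(gold_nested & pred), len(gold_nested)),
    }


# Hypothetical gold and predicted entities as (start, end, label).
gold = [(0, 2, "ORG"), (0, 1, "GPE"), (5, 5, "PER")]
pred = [(0, 2, "ORG"), (5, 5, "PER"), (7, 8, "GPE")]
metrics = flat_nested_metrics(gold, pred)
```

        In this toy case the model finds the only flat gold entity but misses the inner entity (0, 1), so flat recall is perfect while nested recall is 0.5.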

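Conclusion

        The paper's idea is simple: it applies convolution so that different spans interact and each span can learn from the spans around it. Judging from the experimental results, however, adding the CNN does not bring a large improvement. Still, applying convolution to NER counts as a modest novelty, and it may be worth considering convolution beyond span-to-span interaction.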
