DBLP Study Notes

 

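1. What is DBLP

DBLP is short for Digital Bibliography & Library Project. It is an integrated database of English-language computer science literature, organized around authors: for each author it lists research output by year, covering publicly published papers from international journals and conferences. DBLP does not index or search Chinese-language literature; a comparable integrated search system for authoritative Chinese journals and major conference papers is C-DBLP.

The project is developed and maintained by Michael Ley at the University of Trier, Germany. It offers a search service for computer science literature but stores only the metadata of these publications, such as title, authors, and publication date. Unlike the usual approach, DBLP does not use a database; it stores the metadata as XML.

Official DBLP website: https://dblp.uni-trier.de/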

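2. Overview of the DBLP dataset download site

The dataset can be downloaded from: https://dblp.uni-trier.de/xml/

The files available at that address are:

CHANGES.txt: lists recent changes to the DBLP format.

README.txt: copyright and licensing notes for DBLP.

dblp.dtd: the format definition (DTD) for the XML file.

dblp.xml.gz: the DBLP dataset itself.

docu/: related documentation.

release/: released versions of the XML dataset.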
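As a rough sketch of a first look at the download, the snippet below reads the first few lines of a local copy of dblp.xml.gz without unpacking it. The helper name peek_lines and its parameters are assumptions made for illustration; the ISO-8859-1 default is taken from the XML declaration in the excerpt shown in section 3 and may need adjusting for other releases.

```python
import gzip
from itertools import islice


def peek_lines(path, count=3, encoding="ISO-8859-1"):
    """Return the first `count` lines of a gzip-compressed text file."""
    # gzip.open in text mode decompresses on the fly, so the large
    # archive never has to be unpacked to disk just to inspect it.
    with gzip.open(path, "rt", encoding=encoding) as handle:
        # islice stops early and copes with files shorter than `count` lines.
        return [line.rstrip("\n") for line in islice(handle, count)]
```

Calling peek_lines("dblp.xml.gz") on a downloaded copy (the file name here assumes it sits in the current directory) should show the XML declaration and the DOCTYPE line that refers to dblp.dtd.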

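3. The DBLP dataset in detail

The DBLP dataset is an XML file; for background on the XML format, see an XML tutorial. In dblp.xml the root element is <dblp>. The root has many child elements, but the nesting depth generally does not exceed 3 (root, record, field); the formatting tags that may appear inside <title>, listed in section 5, add one more level. An excerpt of the XML file: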

      

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
[...]
<article key="journals/cacm/Gentry10" mdate="2010-04-26">
<author>Craig Gentry</author>
<title>Computing arbitrary functions of encrypted data.</title>
<pages>97-105</pages>
<year>2010</year>
<volume>53</volume>
<journal>Commun. ACM</journal>
<number>3</number>
<ee>http://doi.acm.org/10.1145/1666420.1666444</ee>
<url>db/journals/cacm/cacm53.html#Gentry10</url>
</article>
[...]
<inproceedings key="conf/focs/Yao82a" mdate="2011-10-19">
<title>Theory and Applications of Trapdoor Functions (Extended Abstract)</title>
<author>Andrew Chi-Chih Yao</author>
<pages>80-91</pages>
<crossref>conf/focs/FOCS23</crossref>
<year>1982</year>
<booktitle>FOCS</booktitle>
<url>db/conf/focs/focs82.html#Yao82a</url>
<ee>http://doi.ieeecomputersociety.org/10.1109/SFCS.1982.45</ee>
</inproceedings>
[...]
<www mdate="2004-03-23" key="homepages/g/OdedGoldreich">
<author>Oded Goldreich</author>
<title>Home Page</title>
<url>http://www.wisdom.weizmann.ac.il/~oded/</url>
</www>
[...]
</dblp>

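The tree structure of this example is shown in:

https://dblp.org/faq/xmlstructure.png

DBLP has two kinds of records: publication records and person records. The publication record types are given here; for person records and a detailed description of the elements of DBLP XML records, see https://dblp.org/faq/16154937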
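To make the structure concrete, here is a minimal sketch that loads a shortened copy of the excerpt with Python's standard xml.etree.ElementTree, lists the tag and key of each record, and measures the nesting depth. The names SAMPLE, depth, records and max_depth are illustrative, and several fields are dropped from the sample for brevity. The sample deliberately contains no named character entities; the full file does use them, and the standard-library parser is expected to reject those unless the entity declarations from dblp.dtd are made available to it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt above (some fields omitted for brevity).
SAMPLE = """<dblp>
<article key="journals/cacm/Gentry10" mdate="2010-04-26">
<author>Craig Gentry</author>
<title>Computing arbitrary functions of encrypted data.</title>
<year>2010</year>
<journal>Commun. ACM</journal>
</article>
<inproceedings key="conf/focs/Yao82a" mdate="2011-10-19">
<title>Theory and Applications of Trapdoor Functions (Extended Abstract)</title>
<author>Andrew Chi-Chih Yao</author>
<year>1982</year>
<booktitle>FOCS</booktitle>
</inproceedings>
<www mdate="2004-03-23" key="homepages/g/OdedGoldreich">
<author>Oded Goldreich</author>
<title>Home Page</title>
</www>
</dblp>"""


def depth(elem):
    """Depth of the subtree rooted at elem, counting elem itself as level 1."""
    return 1 + max((depth(child) for child in elem), default=0)


root = ET.fromstring(SAMPLE)

# Each direct child of <dblp> is one record; its tag is the record type
# and its "key" attribute is the DBLP key.
records = [(rec.tag, rec.get("key")) for rec in root]
max_depth = depth(root)

for tag, key in records:
    print(tag, key)
print("depth:", max_depth)
```

For the real multi-gigabyte file, loading everything at once like this is unlikely to be practical; a streaming parser would be the more natural choice.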

      

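Since our research focuses on the publication records in the DBLP data, the rest of these notes is mainly about publications. Each kind of publication record contains many elements (for example author, editor, title). We are interested in the common ones, {author, editor, title, booktitle, year, journal, url, ee, publisher, crossref, series, school}, which are introduced one by one below. First we give the format in which publication records appear in the DBLP XML dataset. Table 1 lists, for each kind of publication record, all elements that can be seen in the XML dataset and the elements that every record of that kind has. Table 2 gives 1 to 3 example records for each kind (all taken from the DBLP XML dataset); together the examples try to cover the elements seen in that kind of record, and each example is listed with the elements of interest it contains. For instance, Table 2 has three article examples. The first (article1) contains the elements {author, title, year, journal, ee, crossref, url}; the second (article2) differs in that it has a booktitle element; the third (article3) has a publisher element. Together these three examples cover the elements that article records in the XML dataset can contain.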

Table 1

Publication record | All elements seen under this record tag | Elements present in every record
article | 'url', 'author', 'ee', 'publisher', 'year', 'booktitle', 'crossref', 'editor', 'journal', 'title' | 'title'
inproceedings | 'url', 'author', 'ee', 'year', 'booktitle', 'crossref', 'editor', 'title' | 'title', 'url', 'year', 'booktitle'
book | 'school', 'author', 'ee', 'publisher', 'booktitle', 'crossref', 'title', 'url', 'year', 'series', 'editor' | 'title', 'year'
proceedings | 'url', 'author', 'ee', 'publisher', 'year', 'booktitle', 'editor', 'series', 'journal', 'title' | 'title', 'year'
incollection | 'url', 'author', 'ee', 'publisher', 'year', 'booktitle', 'crossref', 'title' | 'title', 'url', 'year', 'booktitle'
phdthesis | 'school', 'url', 'author', 'ee', 'publisher', 'year', 'series', 'title' | 'title', 'year'
mastersthesis | 'school', 'author', 'ee', 'year', 'title' | 'title', 'year', 'school', 'author'
www | 'url', 'author', 'ee', 'year', 'crossref', 'editor', 'title' | (none listed)
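The two columns of Table 1 correspond to a union and an intersection over the child tags of all records of one type. The sketch below shows one way such a table could be computed; it is an illustration under that assumption, not necessarily how the table above was produced. The function name element_profile and the tiny SAMPLE document (with made-up keys and values) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny made-up sample; keys and field values are placeholders.
SAMPLE = """<dblp>
<article key="a1"><author>A</author><title>T1</title><year>2011</year><journal>J</journal></article>
<article key="a2"><editor>E</editor><title>T2</title><year>1990</year><publisher>P</publisher></article>
<www key="w1"><author>A</author><title>Home Page</title></www>
</dblp>"""


def element_profile(root):
    """Map each record type to (all child tags seen, child tags in every record)."""
    seen = defaultdict(set)  # union of child tags per record type
    always = {}              # running intersection per record type
    for record in root:
        tags = {child.tag for child in record}
        seen[record.tag] |= tags
        if record.tag in always:
            always[record.tag] &= tags
        else:
            always[record.tag] = set(tags)
    return {kind: (seen[kind], always[kind]) for kind in seen}


profile = element_profile(ET.fromstring(SAMPLE))
for kind, (all_tags, required) in profile.items():
    print(kind, sorted(all_tags), sorted(required))
```

On the sample, the article type ends up with six elements seen overall but only title and year present in both records, which mirrors the shape of a Table 1 row.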

 

Table 2

article1
Elements of interest: author, title, year, journal, ee, crossref, url

<article mdate="2017-05-28" key="journals/tcci/Begier11">
<author>Barbara Begier</author>
<title>Quality Assessment of an Expert System: An Instrument of Regular Feedback from Users.</title>
<pages>199-214</pages>
<year>2011</year>
<volume>3</volume>
<journal>Trans. Computational Collective Intelligence</journal>
<ee>https://doi.org/10.1007/978-3-642-19968-4_10</ee>
<crossref>journals/tcci/2011-3</crossref>
<url>db/journals/tcci/tcci3.html#Begier11</url>
</article>

article2
Elements of interest: author, title, year, crossref, booktitle, ee, url

<article mdate="2017-05-28" key="journals/tcsb/TalcottD06">
<author>Carolyn L. Talcott</author>
<author>David L. Dill</author>
<title>Multiple Representations of Biological Processes.</title>
<pages>221-245</pages>
<year>2006</year>
<crossref>journals/tcsb/2006-6</crossref>
<booktitle>Trans. Computational Systems Biology</booktitle>
<ee>https://doi.org/10.1007/11880646_10</ee>
<url>db/journals/tcsb/tcsb6.html#TalcottD06</url>
</article>

article3
Elements of interest: editor, title, year, journal, publisher

<article mdate="2017-06-08" key="tr/ibm/IWBS112" publtype="informal">
<editor>Thomas Wetter</editor>
<editor>Rolf Engelbrecht</editor>
<editor>Reinhold Haux</editor>
<editor>Frank Puppe</editor>
<editor>Hans Vo&szlig;</editor>
<title>Wissensbasierte Systeme in der Medizin: GMDS/GI, Abstracts des 1. gemeinsamen Workshops der AG Expertensysteme der GMDS und der FG Diagnostik und Klassifikation im GI-Fachausschu&szlig; 1.5, 29.-30. M&auml;rz 1990, Heidelberg</title>
<journal>IWBS Report</journal>
<volume>112</volume>
<year>1990</year>
<publisher>IBM Germany Science Center, Institute for Knowledge Based Systems</publisher>
</article>

 

inproceedings1
Elements of interest: editor, author, title, year, crossref, booktitle, ee, url

<inproceedings mdate="2016-02-19" key="conf/uss/MartinS02">
<editor>David M. Martin Jr.</editor>
<author>Andrew Schulman</author>
<title>Deanonymizing Users of the SafeWeb Anonymizing Service.</title>
<pages>123-137</pages>
<year>2002</year>
<crossref>conf/uss/2002</crossref>
<booktitle>USENIX Security Symposium</booktitle>
<ee>http://www.usenix.org/publications/library/proceedings/sec02/martin.html</ee>
<url>db/conf/uss/uss2002.html#MartinS02</url>
</inproceedings>

proceedings1
Elements of interest: editor, title, year, publisher, series, ee, booktitle, url

<proceedings mdate="2019-05-14" key="journals/tcci/2014-14">
<editor>Ngoc Thanh Nguyen</editor>
<title>Transactions on Computational Collective Intelligence XIV</title>
<year>2014</year>
<publisher>Springer</publisher>
<series href="db/series/lncs/index.html">Lecture Notes in Computer Science</series>
<volume>8615</volume>
<ee>https://doi.org/10.1007/978-3-662-44509-9</ee>
<isbn>978-3-662-44508-2</isbn>
<booktitle>Trans. Computational Collective Intelligence</booktitle>
<url>db/journals/tcci/tcci14.html</url>
</proceedings>

proceedings2
Elements of interest: editor, author, title, year, publisher, series, ee, url

<proceedings mdate="2019-10-19" key="conf/stm/2010">
<editor>Jorge Cu&eacute;llar</editor>
<editor>Gilles Barthe</editor>
<editor>Alexander Pretschner</editor>
<author>Javier L&oacute;pez 0001</author>
<title>Security and Trust Management - 6th International Workshop, STM 2010, Athens, Greece, September 23-24, 2010, Revised Selected Papers</title>
<volume>6710</volume>
<year>2011</year>
<publisher>Springer</publisher>
<series href="db/series/lncs/index.html">Lecture Notes in Computer Science</series>
<ee>https://doi.org/10.1007/978-3-642-22444-7</ee>
<isbn>978-3-642-22443-0</isbn>
<booktitle>STM</booktitle>
<url>db/conf/stm/stm2010.html</url>
</proceedings>

book1
Elements of interest: author, title, year, publisher, school, ee, series

<book mdate="2019-05-14" key="series/lncs/Lamprecht13">
<author>Anna-Lena Lamprecht</author>
<title>User-Level Workflow Design - A Bioinformatics Perspective.</title>
<year>2013</year>
<pages>1-202</pages>
<publisher>Springer</publisher>
<series href="db/series/lncs/index.html">Lecture Notes in Computer Science</series>
<volume>8311</volume>
<school>Dortmund University of Technology</school>
<isbn>978-3-642-45388-5</isbn>
<isbn>978-3-642-45389-2</isbn>
<ee>https://doi.org/10.1007/978-3-642-45389-2</ee>
<ee>http://d-nb.info/1044795522</ee>
</book>

book2
Elements of interest: editor, title, year, crossref, publisher, series, ee, booktitle, url

<book mdate="2019-08-06" key="series/lncs/3864">
<editor>Yang Cai 0002</editor>
<editor>Julio Abascal</editor>
<title>Ambient Intelligence in Everyday Life - Foreword by Emile Aarts</title>
<year>2006</year>
<crossref>series/lncs/3864</crossref>
<publisher>Springer</publisher>
<series href="db/series/lncs/index.html">Lecture Notes in Computer Science</series>
<volume>3864</volume>
<ee>https://doi.org/10.1007/11825890</ee>
<isbn>978-3-540-37785-6</isbn>
<booktitle>Ambient Intelligence in Everyday Life</booktitle>
<url>db/series/lncs/lncs3864.html</url>
</book>

incollection1
Elements of interest: author, title, booktitle, publisher, year, crossref, url, ee

<incollection mdate="2019-06-27" key="books/mk/minker88/Kanellakis88">
<author>Paris C. Kanellakis</author>
<title>Logic Programming and Parallel Complexity</title>
<pages>547-585</pages>
<booktitle>Foundations of Deductive Databases and Logic Programming.</booktitle>
<publisher href="db/publishers/mkp.html">Morgan Kaufmann</publisher>
<year>1988</year>
<crossref>books/mk/Minker88</crossref>
<url>db/books/collections/minker88.html#Kanellakis88</url>
<ee>https://doi.org/10.1016/b978-0-934613-40-8.50018-x</ee>
</incollection>

phdthesis1
Elements of interest: author, title, year, publisher, school, ee, url

<phdthesis mdate="2018-08-08" key="phd/dnb/Saif17">
<author>Hassan Saif</author>
<title>Semantic sentiment analysis in social streams.</title>
<year>2017</year>
<pages>1-286</pages>
<publisher>IOS Press</publisher>
<school>The Open University, UK</school>
<series href="db/series/ssw/index.html">Studies on the Semantic Web</series>
<volume>30</volume>
<isbn>978-3-89838-726-2</isbn>
<ee>http://d-nb.info/1135222851</ee>
<url>db/series/ssw/index.html</url>
</phdthesis>

phdthesis2
Elements of interest: author, title, year, school, publisher, ee, series

<phdthesis mdate="2019-10-04" key="series/lnbip/Weber09">
<author orcid="0000-0002-4833-5921">Ingo M. Weber</author>
<title>Semantic Methods for Execution-level Business Process Modeling - Modeling Support Through Process Verification and Service Composition</title>
<volume>40</volume>
<year>2009</year>
<pages>3-231</pages>
<school>Karlsruhe Institute of Technology, Germany</school>
<publisher>Springer</publisher>
<series href="db/series/lnbip/index.html">Lecture Notes in Business Information Processing</series>
<isbn>978-3-642-05084-8</isbn>
<isbn>978-3-642-05085-5</isbn>
<ee>https://doi.org/10.1007/978-3-642-05085-5</ee>
<ee>http://nbn-resolving.de/urn:nbn:de:1111-20091214234</ee>
<ee>http://d-nb.info/998030961</ee>
<ee>http://d-nb.info/999067885</ee>
<ee>https://www.wikidata.org/entity/Q58196259</ee>
</phdthesis>

mastersthesis1
Elements of interest: author, title, year, school, ee

<mastersthesis mdate="2018-06-13" key="ms/Vollmer2006">
<author>Stephan Vollmer</author>
<title>Portierung des DBLP-Systems auf ein relationales Datenbanksystem und Evaluation der Performance.</title>
<year>2006</year>
<school>Diplomarbeit, Universit&auml;t Trier, FB IV, Informatik</school>
<ee>http://dbis.uni-trier.de/Diplomanden/Vollmer/vollmer.shtml</ee>
</mastersthesis>

www1
Elements of interest: author, title, url

<www mdate="2018-03-17" key="homepages/31/6654">
<author>Mike Jensen</author>
<title>Home Page</title>
<url>https://www.internethalloffame.org/inductees/mike-jensen</url>
</www>

www2
Elements of interest: crossref, title

<www mdate="2018-02-19" key="homepages/01/2835">
<crossref>homepages/152/4195</crossref>
<title>Home Page</title>
</www>

www3
Elements of interest: editor, title, year, ee

<www mdate="2019-05-27" key="www/org/w3/TR/query-datamodel">
<editor>Mary F. Fernandez</editor>
<editor>Jonathan Robie</editor>
<title>XML Query Data Model</title>
<year>2001</year>
<ee>http://www.w3.org/TR/query-datamodel</ee>
</www>

 

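4. BibTeX types of publication records

article: an article in a journal or magazine
book: a book with an identified publisher
incollection: a part of a book that has its own title
inproceedings: an article in a conference proceedings
proceedings: the proceedings of a conference
mastersthesis: a master's thesis
phdthesis: a PhD thesis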

 

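5. Elements of publication records

The following content is from https://blog.csdn.net/kite1988/article/details/5186628

Common elements

Record element: description
title: title of the paper, the one element every record carries. This tag may contain the child tags {sub, sup, i, tt, ref}.
author: authors of the paper; their order in the record matches the order printed on the paper.
editor: editor(s).
booktitle: short name of the conference or workshop.
year: year of publication, written as four digits.
crossref: may appear in article and inproceedings records. It expresses a link that ties the article or inproceedings record to a proceedings record, so following crossref leads to the proceedings volume that contains the paper. (Its value corresponds to the key of the referenced record.)
journal: name of the journal.
school: school (institution) of the author.
publisher: publisher.
series: reference to the publication series.
ee: link to the electronic edition.
url: link to the DBLP web page.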
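The crossref link can be followed with a plain dictionary lookup on record keys. The sketch below assumes all records are already loaded in memory; the names by_key and container_of are illustrative, and the proceedings record in SAMPLE is a stand-in with a placeholder title rather than a copy of the real DBLP entry.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<dblp>
<proceedings key="conf/uss/2002">
<title>(placeholder proceedings title)</title>
<year>2002</year>
</proceedings>
<inproceedings key="conf/uss/MartinS02">
<title>Deanonymizing Users of the SafeWeb Anonymizing Service.</title>
<year>2002</year>
<crossref>conf/uss/2002</crossref>
</inproceedings>
</dblp>"""

root = ET.fromstring(SAMPLE)

# Index every record by its "key" attribute so crossref values can be resolved.
by_key = {rec.get("key"): rec for rec in root}


def container_of(record):
    """Return the record that `record` points to via <crossref>, or None."""
    ref = record.findtext("crossref")
    return by_key.get(ref) if ref else None


paper = by_key["conf/uss/MartinS02"]
volume = container_of(paper)
print(volume.tag, volume.get("key"))
```

A lookup that returns None means either that the record has no crossref or that the referenced key is not in the loaded data.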

 

 

Less common elements

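pages: page range of the paper
volume: volume of the source in which the publication appeared
number: issue number of the source in which the publication appeared
month: month of publication
cdrom: PDF electronic copy of the publication
note: note attached to a paper in a conference proceedings
chapter: chapter number within an incollection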

 

 

For these less common elements, we give some publication records for reference.

article
Elements: author, title, journal, volume, pages, year, ee

<article>
<author>Rita Ley</author>
<author>Markus Casper</author>
<author>Hugo Hellebrand</author>
<author>Ralf Merz</author>
<title>Catchment classification by runoff behaviour with self-organizing maps (SOM)</title>
<journal>Hydrology and Earth System Sciences</journal>
<volume>15</volume>
<pages>2947-2962</pages>
<year>2011</year>
<ee>https://doi.org/10.5194/hess-15-2947-2011</ee>
</article>

article
Elements: author, title, pages, year, volume, journal, number, ee, url

<article>
<author>Ignace Loris</author>
<title>L1Packv2: A Mathematica package for minimizing an <i>l</i><sub>1</sub>-penalized functional.</title>
<pages>895-902</pages>
<year>2008</year>
<volume>179</volume>
<journal>Computer Physics Communications</journal>
<number>12</number>
<ee>https://doi.org/10.1016/j.cpc.2008.07.010</ee>
<url>db/journals/cphysics/cphysics179.html#Loris08</url>
</article>

inproceedings
Elements: author, title, month, year, pages, booktitle, crossref, url, note, cdrom

<inproceedings>
<author>E. F. Codd</author>
<title>Seven Steps to Rendezvous with the Casual User.</title>
<month>January</month>
<year>1974</year>
<pages>179-200</pages>
<booktitle>IFIP Working Conference Data Base Management</booktitle>
<crossref>conf/ds/1974</crossref>
<url>db/conf/ds/dbm74.html#Codd74</url>
<note>IBM Research Report RJ 1333, San Jose, California</note>
<cdrom>DS/DS1974/P179.pdf</cdrom>
</inproceedings>

incollection
Elements: author, title, booktitle, publisher, year, crossref, pages, chapter, url, cdrom

<incollection>
<author>Michael D. Soo</author>
<author>Christian S. Jensen</author>
<author>Richard T. Snodgrass</author>
<title>An Algebra for TSQL2</title>
<booktitle>The TSQL2 Temporal Query Language</booktitle>
<publisher>Kluwer</publisher>
<year>1995</year>
<crossref>books/kl/Snodgrass95</crossref>
<pages>501-544</pages>
<chapter>27</chapter>
<url>db/books/collections/snodgrass95.html#SooJS95</url>
<cdrom>tsql2/P501.pdf</cdrom>
</incollection>
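Because a title may contain formatting child tags such as i and sub (as in the Loris record above), reading only the element's text attribute returns just the part before the first child tag. A small sketch of the difference, using a shortened copy of that record; the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Loris record above (only three fields kept).
RECORD = """<article>
<author>Ignace Loris</author>
<title>L1Packv2: A Mathematica package for minimizing an <i>l</i><sub>1</sub>-penalized functional.</title>
<year>2008</year>
</article>"""

title = ET.fromstring(RECORD).find("title")

# .text stops at the first child element, so the title comes out truncated.
text_only = title.text

# itertext() walks the text of the element and of all nested tags in
# document order, which recovers the whole title without the markup.
full_title = "".join(title.itertext())

print(repr(text_only))
print(repr(full_title))
```

The same approach should apply to any field that is allowed to carry nested markup.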

 

 

