SIGMOD Paper Reading Notes

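The six papers covered here are not purely academic papers; they are grounded in industrial practice. My reading therefore focused on how each system is implemented, its experimental results, and where it might be used in the future.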

ExDRa: Exploratory Data Science on Federated Raw Data

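The paper opens by arguing that data science is still an open, immature field: many problems have not been solved well, and many admit several solution approaches that are still waiting to be discovered. It presents ExDRa, a system intended as a basic tool that helps data scientists explore the relationships within their data, aimed mainly at distributed, heterogeneous, raw data sources.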

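INTRODUCTION uses a concrete example (Example 1, Data Ownership Dilemma) to show the kind of problem data scientists face today, a scenario common in industry: the party that needs a model does not want to hand its data to the model trainer, yet still wants the trainer to build the model from that data. This leads to federated machine learning, a term that is currently hotly debated in the field. USE CASES describes the concrete challenges data science meets in domains such as fertilizer production and paper production, and explains why ExDRa is needed for federated learning. SYSTEM ARCHITECTURE presents the high-level structure of ExDRa. FEDERATED RUNTIME describes the functional interfaces ExDRa provides; federated learning and the other data processing operations are all built on these interfaces. EXPERIMENTS shows that the ExDRa system can speed up machine learning.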
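To make the data ownership dilemma concrete, here is a minimal sketch of one federated pattern: each site computes aggregates over its own raw data, and only those aggregates reach the coordinator, which fits a line y = a*x + b. This is an illustration under my own assumptions, not ExDRa's interface; the names `LocalStats`, `local_stats`, and `fit_line` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Aggregates a site is willing to share (hypothetical structure)."""
    n: int
    sx: float
    sy: float
    sxx: float
    sxy: float


def local_stats(xs, ys):
    """Runs at a data owner's site; only these sums leave the site."""
    return LocalStats(
        n=len(xs),
        sx=sum(xs),
        sy=sum(ys),
        sxx=sum(x * x for x in xs),
        sxy=sum(x * y for x, y in zip(xs, ys)),
    )


def fit_line(all_stats):
    """Runs at the coordinator: combine the sums and solve least squares."""
    n = sum(s.n for s in all_stats)
    sx = sum(s.sx for s in all_stats)
    sy = sum(s.sy for s in all_stats)
    sxx = sum(s.sxx for s in all_stats)
    sxy = sum(s.sxy for s in all_stats)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Two sites hold disjoint samples of the same relationship y = 2x + 1.
site_a = local_stats([0.0, 1.0], [1.0, 3.0])
site_b = local_stats([2.0, 3.0], [5.0, 7.0])
slope, intercept = fit_line([site_a, site_b])
```

The coordinator recovers the same fit it would get from the pooled data without ever seeing an individual row. Whether sharing such aggregates is acceptable in a given setting is a separate privacy question that this sketch does not address.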

DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python

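Exploratory data analysis (EDA) is an important part of data science today. The existing tools, pandas+plotting and pandas-profiling, each have drawbacks, so an effective tool for EDA is still missing. The paper presents the open-source package DataPrep.EDA to address the problems of both.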

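INTRODUCTION describes pandas+plotting and pandas-profiling and explains how DataPrep.EDA differs from them. TASK-CENTRIC EDA shows how five EDA tasks are implemented in DataPrep.EDA. SYSTEM ARCHITECTURE describes the three main modules of DataPrep.EDA: the config manager, the compute module, and the render module. IMPLEMENTATION covers issues that can arise in practice, including building on Dask, slowness on small data, and the details of the data processing pipeline. EXPERIMENTAL EVALUATION runs two sets of experiments that evaluate DataPrep.EDA's performance and its user experience, respectively.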
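The task-centric idea can be sketched as a single entry point whose arguments select the EDA task. The function `describe` below is a hypothetical stand-in written with the standard library only; it is not the DataPrep.EDA API, and the statistics it returns are my own choice for illustration.

```python
from statistics import mean


def describe(rows, x=None, y=None):
    """Hypothetical task-centric entry point: the arguments pick the task."""
    if x is None:
        # No column given: overview of the whole dataset.
        columns = sorted({key for row in rows for key in row})
        return {"task": "overview", "rows": len(rows), "columns": columns}

    xs = [row[x] for row in rows if row.get(x) is not None]
    if y is None:
        # One column given: univariate summary.
        return {
            "task": "univariate",
            "column": x,
            "mean": mean(xs),
            "missing": len(rows) - len(xs),
        }

    # Two columns given: bivariate summary over rows where both are present.
    pairs = [
        (row[x], row[y])
        for row in rows
        if row.get(x) is not None and row.get(y) is not None
    ]
    return {"task": "bivariate", "columns": (x, y), "pairs": len(pairs)}


rows = [
    {"age": 30, "income": 10.0},
    {"age": 40, "income": None},
    {"age": None, "income": 30.0},
]
overview = describe(rows)
univariate = describe(rows, "age")
bivariate = describe(rows, "age", "income")
```

The point of this design is that the user states the task through the call shape and gets the relevant summaries back, rather than assembling them by hand from separate plotting calls.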

MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis

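Automatic exploratory data analysis (automatic EDA) automatically converts data into a specific format, that is, it turns unstructured data into structured knowledge. MetaInsight reduces this problem to a classification problem: data is classified as either common or outlying (exceptional), which yields the structured result.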

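PRELIMINARIES and FORMULATION give the problem definition and the terminology involved. APPROACH presents the three key designs of MetaInsight: the scoring function, the mining procedure, and the ranking algorithm. The scoring function is used for classification; the mining procedure consists of four steps (search, query, evaluation, and optimization); and ranking is based on the total usefulness score. EVALUATION runs two sets of experiments: one measures MetaInsight's performance (5.1) and the other measures user experience (5.2).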
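The common-versus-exception classification can be sketched as follows. This is a simplified illustration, not the paper's scoring function: the function name `split_commonness`, the threshold `tau`, and the example labels are all assumptions of mine.

```python
from collections import Counter


def split_commonness(patterns, tau=0.5):
    """Split detected patterns into one common pattern and its exceptions.

    patterns maps a data scope (e.g. a city) to the pattern label found there.
    A label counts as common only if more than a fraction tau of scopes share it.
    """
    if not patterns:
        return None, {}
    label, count = Counter(patterns.values()).most_common(1)[0]
    if count / len(patterns) <= tau:
        # No label is dominant enough, so nothing is reported as common.
        return None, dict(patterns)
    exceptions = {scope: p for scope, p in patterns.items() if p != label}
    return label, exceptions


detected = {
    "City A": "peak in Jul",
    "City B": "peak in Jul",
    "City C": "peak in Jul",
    "City D": "peak in Jan",
}
common, exceptions = split_commonness(detected)
```

Here three of the four scopes share one pattern, so it is reported as common and the remaining scope is flagged as the exception. The result is a compact, structured summary rather than four separate findings.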

An Ecosystem of Applications for Modeling Political Violence

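The paper identifies four challenges that conflict researchers currently face: how to model conflicts, how to measure them, how to handle their spatio-temporal characteristics, and how to handle the potentially large volume of information and interpretations. Based on these four challenges, it proposes an ecosystem of tools to help conflict researchers address them.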

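INTRODUCTION describes the four challenges and the existing tools in detail. TOOLS FOR CONFLICT EXPLORATION introduces TwoRavens, Auctus, and PODS. CASE STUDIES presents three case studies as usage scenarios for the conflict exploration ecosystem.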

MIDAS: Towards Efficient and Effective Maintenance of Canned Patterns in Visual Graph Query Interfaces

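The paper proposes the MIDAS framework for maintaining a database's canned patterns efficiently and effectively. MIDAS adopts a selective maintenance strategy that guarantees pattern coverage increases progressively without sacrificing pattern diversity or cognitive load. Experiments on real datasets and visual graph query interfaces show that MIDAS is effective compared with static GUIs.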

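INTRODUCTION uses two examples to motivate the canned pattern maintenance (CPM) problem. BACKGROUND covers the background on graphs, including graph topology, the characteristics of canned patterns, and existing tools for the CPM problem. THE CPM PROBLEM gives the formal definition of the problem, the two challenges in solving it, and the strategies for addressing them, and then introduces the MIDAS framework; MAINTENANCE OF CLUSTERS & CSGS and CANNED PATTERN GENERATION give the details of how Algorithm 1 is implemented. PERFORMANCE STUDY runs four sets of experiments: the effect of parameters on the results, the cost of MIDAS, a comparison with similar tools, and scalability. RELATED WORK surveys related work, and CONCLUSION gives the final conclusions.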

Exploring Ratings in Subjective Databases

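The paper develops SUBDEX, a framework for subjective data exploration (SDE). The framework supports jointly exploring items, people, and people's opinions of items in a guided multi-step process, where each step summarizes the most useful and most diverse trends in the form of rating maps.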

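INTRODUCTION uses a concrete case to introduce SDE. DATA MODEL AND SDE models the data as triples and, on that basis, describes SUBDEX's operations and rating maps. OUR SDE FRAMEWORK explains how the components of the SUBDEX framework cooperate to deliver its functionality. EXPERIMENTAL STUDY runs two sets of experiments, described in 5.2 and 5.3 respectively.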
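The triple data model and the idea of a rating map can be sketched as below. I am assuming the triples are (reviewer, item, rating) and that a rating map is a per-group distribution of rating scores; the function `rating_map` and the sample data are hypothetical and much simpler than what SUBDEX actually builds.

```python
from collections import Counter, defaultdict


def rating_map(ratings, item_attrs, key):
    """Group ratings by one item attribute and count each score per group.

    ratings is a list of (reviewer, item, score) triples;
    item_attrs maps an item to a dict of its attributes.
    """
    groups = defaultdict(Counter)
    for _reviewer, item, score in ratings:
        groups[item_attrs[item][key]][score] += 1
    return {group: dict(counts) for group, counts in groups.items()}


ratings = [
    ("u1", "r1", 5),
    ("u2", "r1", 4),
    ("u1", "r2", 2),
    ("u3", "r2", 2),
]
item_attrs = {
    "r1": {"cuisine": "thai"},
    "r2": {"cuisine": "pizza"},
}
by_cuisine = rating_map(ratings, item_attrs, "cuisine")
```

Each exploration step in a guided process could then present a few such maps, chosen for usefulness and diversity. How SUBDEX scores and selects them is not covered by this sketch.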
