SIGMOD 2017 Papers: Abstracts and Notes

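The SIGMOD 2017 papers focus on database concurrency control (for example ACIDRain and Cicada) and on storage and distributed systems (for example Azure Data Lake Store and OctopusFS). They also cover data stream processing, version maintenance, parallel and distributed query processing, tree and graph processing, new hardware, interactive data exploration, and approximate query processing (AQP). This post highlights recent advances in efficient storage, concurrent transaction processing, distributed file systems, and real-time stream analytics.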

SIGMOD2017

Continuously updated

3.1 Concurrency

ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications


Cicada: Dependably Fast Multi-Core In-Memory Transactions


BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads for Interactive Applications


Transaction Repair for Multi-Version Concurrency Control


Concerto: A High Concurrency Key-Value Store with Integrity


Fast Failure Recovery for Main-Memory DBMSs on Multicores


Bringing Modular Concurrency Control to the Next Level

3.2 Storage and Distribution

Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics


OctopusFS: A Distributed File System with Tiered Storage Management


Monkey: Optimal Navigable Key-Value Store
[Vocabulary]:
navigable: adj. able to be sailed on or steered; passable


Wide Table Layout Optimization based on Column Ordering and Duplication


Query Centric Partitioning and Allocation for Partially Replicated Database Systems


Spanner: Becoming a SQL System


3.3 Streams

Enabling Signal Processing over Data Streams


Complete Event Trend Detection in High-Rate Event Streams


LittleTable: A Time-Series Database and Its Uses

3.4 Versions and Incremental Maintenance

Incremental View Maintenance over Array Data


Incremental Graph Computations: Doable and Undoable


DEX: Query Execution in a Delta-based Storage System

3.5 Parallel and Distributed Query Processing

Massively Parallel Processing of Whole Genome Sequence Data: An In-Depth Performance Study


Distributed Provenance Compression

Abstract
Network provenance, which records the execution history of network events as meta-data, is becoming increasingly important for network accountability and failure diagnosis. For example, network provenance may be used to trace the path that a message traversed in a network, or to reveal how a particular routing entry was derived and the parties involved in its derivation. A challenge when storing the provenance of a live network is that the large number of arriving messages may incur substantial storage overhead. In this paper, we explore techniques to dynamically compress distributed provenance stored at scale. Logically, compression is achieved by grouping equivalent provenance trees and maintaining only one concrete copy for each equivalence class. To efficiently identify the equivalent provenance, we (1) introduce distributed event-based linear programs (DELPs) to specify distributed network applications, and (2) statically analyze DELPs to allow for quick detection of provenance equivalence at runtime. Our experimental results demonstrate that our approach leads to significant storage reduction and query latency improvement over alternative approaches.
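The core idea in the abstract, grouping equivalent provenance trees and keeping one concrete copy per equivalence class, can be illustrated with a small sketch. This is not the paper's DELP-based method: the `ProvTree` shape, the `equivalence_key` function, and the `CompressedStore` class are all hypothetical names made up for illustration, and "equivalent" is assumed here to mean "same rules applied in the same tree shape".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvTree:
    """Toy provenance tree (hypothetical): a rule name plus child derivations."""
    rule: str
    children: tuple = ()


def equivalence_key(tree: ProvTree) -> tuple:
    """Assumed equivalence: identical rule names arranged in the same shape."""
    return (tree.rule, tuple(equivalence_key(c) for c in tree.children))


@dataclass
class CompressedStore:
    """Stores one concrete tree per equivalence class; events keep only a key."""
    classes: dict = field(default_factory=dict)       # key -> representative tree
    event_to_key: dict = field(default_factory=dict)  # event id -> key

    def add(self, event_id: str, tree: ProvTree) -> None:
        key = equivalence_key(tree)
        # Only the first tree of each class is materialized.
        self.classes.setdefault(key, tree)
        self.event_to_key[event_id] = key

    def lookup(self, event_id: str) -> ProvTree:
        return self.classes[self.event_to_key[event_id]]


# Two messages that took the same path share a single stored tree.
store = CompressedStore()
path_a = ProvTree("recv", (ProvTree("forward", (ProvTree("send"),)),))
path_a_again = ProvTree("recv", (ProvTree("forward", (ProvTree("send"),)),))
path_b = ProvTree("recv", (ProvTree("drop"),))
store.add("m1", path_a)
store.add("m2", path_a_again)
store.add("m3", path_b)
```

In this sketch the key is computed by walking each tree at insert time. The abstract describes avoiding that runtime cost by statically analyzing the program, which this example does not attempt.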


ROBUS: Fair Cache Allocation for Data-parallel Workloads


Heterogeneity-aware Distributed Parameter Servers


Distributed Algorithms on Exact Personalized PageRank


Parallelizing Sequential Graph Computations (Best paper award)

3.6 Tree & Graph Processing

Landmark Indexing for Evaluation of Label-Constrained Reachability Queries


Efficient Ad-Hoc Graph Inference and Matching in Biological Databases


DAG Reduction: Fast Answering Reachability Queries


Flexible and Feasible Support Measures for Mining Frequent Patterns in Large Labeled Graphs


Exploiting Common Patterns for Tree-Structured Data


Extracting and Analyzing Hidden Graphs from Relational Databases


TrillionG: A Trillion-scale Synthetic Graph Generator using a Recursive Vector Model


ZipG: A Memory-efficient Graph Store for Interactive Queries


All-in-One: Graph Processing in RDBMSs Revisited


Computing A Near-Maximum Independent Set in Linear Time by Reducing-Peeling

3.7 New Hardware

Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures


A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs


FPGA-based Data Partitioning


Template Skycube Algorithms for Heterogeneous Parallelism on Multicore and GPU Architectures

3.8 Interactive Data Exploration and AQP

Controlling False Discoveries During Interactive Data Exploration


MacroBase: Prioritizing Attention in Fast Data


Data Canopy: Accelerating Exploratory Statistical Analysis


Two-Level Sampling for Join Size Estimation


A General-Purpose Counting Filter: Making Every Bit Count


BePI: Fast and Memory-Efficient Method for Billion-Scale Random Walk with Restart

3.9 Beliefs, Conflicts, Knowledge

Beta Probabilistic Databases: A Scalable Approach to Belief Updating and
