Google: Excellent Papers for 2011
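On its research blog, Google has rounded up its standout papers of 2011 in "Excellent Papers for 2011", covering social networks, machine learning, human-computer interaction, information retrieval, natural language processing, multimedia, systems, and other areas. It is an excellent collection of papers. http://googleresearch.blogspot.com/2012/03/excellent-papers-for-2011.html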
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a
In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:
“Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function”,
Lyon's long title summarizes a result that he has been working toward over many years of modeling sound processing in the inner ear.
Electronic Commerce and Algorithms
“Online Vertex-Weighted Bipartite Matching and Single-bid Budgeted Allocations”,
The authors introduce an elegant and powerful algorithmic technique to the area of online ad allocation and matching: a hybrid of random perturbations and greedy choice to make decisions on the fly. Their technique sheds new light on classic matching algorithms, and can be used, for example, to pick one among a set of relevant ads, without knowing in advance the demand for ad slots on future web page views.
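To make the "random perturbations plus greedy choice" idea concrete, here is a minimal sketch of a perturbed-greedy matcher. It assumes the perturbation multiplies each offline vertex's weight by 1 - e^(x - 1) for a uniform random x, which is our reading of the paper rather than something stated in the summary above. The function name, the input layout, and the ad/page-view framing are hypothetical.

```python
import math
import random


def perturbed_greedy(weights, arrivals, rng=None):
    """Match online arrivals to weighted offline vertices, one arrival at a time.

    weights:  dict mapping offline vertex (e.g. an ad) to its weight.
    arrivals: list of (online_vertex, neighbors) in arrival order
              (e.g. page views and the ads relevant to each).
    The perturbation function below is an assumption, not taken from the summary.
    """
    rng = rng or random.Random()
    # One random draw per offline vertex, fixed before any arrival is seen.
    perturbed = {
        v: w * (1.0 - math.exp(rng.random() - 1.0)) for v, w in weights.items()
    }
    matched = {}
    used = set()
    for online, neighbors in arrivals:
        free = [v for v in neighbors if v not in used]
        if not free:
            continue  # nothing left to offer this arrival
        # Greedy step: take the free neighbor with the highest perturbed weight.
        best = max(free, key=lambda v: perturbed[v])
        matched[online] = best
        used.add(best)
    return matched


if __name__ == "__main__":
    ads = {"a": 5.0, "b": 1.0, "c": 2.0}
    views = [("u1", ["a", "b", "c"]), ("u2", ["a", "b"]), ("u3", ["a"])]
    print(perturbed_greedy(ads, views, random.Random(1)))
```

Each decision uses only the arrivals seen so far, which is what lets the rule run without knowing future demand.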
“Milgram-routing in social networks”,
Milgram’s "six-degrees-of-separation experiment" and the fascinating small world hypothesis that follows from it have generated a lot of interesting research in recent years. In this landmark experiment, Milgram showed that people unknown to each other are often connected by surprisingly short chains of acquaintances. In the paper we prove theoretically and experimentally how a recent model of social networks, "Affiliation Networks", offers an explanation for this phenomenon and inspires interesting techniques for local routing within social networks.
“Non-Price Equilibria in Markets of Discrete Goods”, Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Noam Nisan,
We present a correspondence between markets of indivisible items and a family of auction-based n-player games. We show that a market has a price-based (Walrasian) equilibrium if and only if the corresponding game has a pure Nash equilibrium. We then turn to markets which do not have a Walrasian equilibrium (which is the interesting case), and study properties of the mixed Nash equilibria of the corresponding games.
“From Basecamp to Summit: Scaling Field Research Across 9 Locations”,
The paper reports on our experience with a basecamp research hub to coordinate logistics and ongoing real-time analysis with research teams in the field. We also reflect on the implications for the meaning of research in a corporate context, where much of the value may lie less in a final report and more in the curated impressions and memories our colleagues take away from the research trip.
“User-Defined Motion Gestures for Mobile Interaction”, Jaime Ruiz,
Modern smartphones contain sophisticated sensors that can detect rich motion gestures, that is, deliberate movements of the device by end-users to invoke commands. However, little is known about best practices in motion gesture design for the mobile computing paradigm. We systematically studied the design space of motion gestures via a guessability study that elicits end-user motion gestures to invoke commands on a smartphone device. The study revealed consensus among our participants on parameters of movement and on mappings of motion gestures onto commands, from which we developed a taxonomy for motion gestures and compiled an end-user-inspired motion gesture set. The work lays the foundation of motion gesture design, a new dimension for mobile interaction.
“Reputation Systems for Open Collaboration”, B.T. Adler, L. de Alfaro,
This paper describes content-based reputation algorithms that rely on automated content analysis to derive user and content reputation, and their applications for Wikipedia and Google Maps. The Wikipedia reputation system WikiTrust relies on a chronological analysis of user contributions to articles, metering positive or negative increments of reputation whenever new contributions are made. The Google Maps system Crowdsensus compares the information provided by users on map business listings and computes both a likely reconstruction of the correct listing and a reputation value for each user. Algorithmic-based user incentives ensure the trustworthiness of evaluations of Wikipedia entries and Google Maps business information.
Machine Learning and Data Mining
“Domain adaptation in regression”,
Domain adaptation is one of the most important and challenging problems in machine learning.
“On the necessity of irrelevant variables”, David P. Helmbold,
Relevant variables sometimes do much more good than irrelevant variables do harm, so that it is possible to learn a very accurate classifier using predominantly irrelevant variables.
“Online Learning in the Manifold of Low-Rank Matrices”,
Learning measures of similarity from examples of similar and dissimilar pairs is a problem that is hard to scale. LORETA uses retractions, an operator from matrix optimization, to learn low-rank similarity matrices efficiently. This makes it possible to learn similarities between objects such as images or texts that are represented with many more features than was previously feasible.
“Training a Parser for Machine Translation Reordering”, Jason Katz-Brown,
Machine translation systems often need to understand the syntactic structure of a sentence to translate it correctly. Traditionally, syntactic parsers are evaluated as standalone systems against reference data created by linguists. Instead, we show how to train a parser to optimize reordering accuracy in a machine translation system, resulting in measurable improvements in translation quality over a more traditionally trained parser.
“Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation”, Ashish Venugopal, Jakob Uszkoreit, David Talbot,
We propose a general method to watermark and probabilistically identify the structured results of machine learning algorithms with an application in statistical machine translation. Our approach does not rely on controlling or even knowing the inputs to the algorithm and provides probabilistic guarantees on the ability to identify collections of results from one’s own algorithm, while being robust to limited editing operations.
“Inducing Sentence Structure from Parallel Corpora for Reordering”,
Automatically discovering the full range of linguistic rules that govern the correct use of language is an appealing goal, but extremely challenging.
Multimedia and Computer Vision
“Kernelized Structural SVM Learning for Supervised Object Segmentation”,
The paper proposes a principled way for computers to learn how to segment the foreground from the background of an image given a set of training examples. The technology is built upon a specially designed nonlinear segmentation kernel under the recently proposed structured SVM learning framework.
“Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths”,
Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. Professionally shot videos, on the other hand, usually rely on carefully designed camera configurations, use specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our stabilization technique automatically converts casual shaky footage into more pleasant and professional-looking videos by mimicking these cinematographic principles. The original, shaky camera path is divided into a set of segments, each approximated by either constant, linear or parabolic motion, using an algorithm based on robust L1 optimization. The stabilizer has been part of the YouTube Editor (youtube.com/editor) since March 2011.
“The Power of Comparative Reasoning”,
The paper describes a theory-derived vector space transform that converts vectors into sparse binary vectors such that Euclidean space operations on the sparse binary vectors imply rank space operations in the original vector space. The transform (a) does not need any data-driven supervised or unsupervised learning, (b) can be computed from polynomial expansions of the input space in linear time (in the degree of the polynomial), and (c) can be implemented in 10 lines of code. We show competitive results on similarity search and sparse coding (for classification) tasks.
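As a rough illustration of how short such a transform can be, the sketch below implements a winner-take-all style hash as we understand the idea: for each of several random permutations, look at the first k permuted entries and record the position of the largest one. The function name and parameters are hypothetical, and this is not the authors' reference code.

```python
import random


def wta_hash(x, num_codes, k, seed=0):
    """Return num_codes integers in [0, k), each the arg-max of a random k-subset.

    Only the relative order of the entries of x matters, so any strictly
    increasing rescaling of x produces the same codes. Expanding each code
    to a one-hot vector of length k gives a sparse binary representation.
    """
    rng = random.Random(seed)  # the same seed must be used for every vector
    n = len(x)
    codes = []
    for _ in range(num_codes):
        # k distinct indices in random order: the head of a random permutation.
        window = rng.sample(range(n), k)
        codes.append(max(range(k), key=lambda j: x[window[j]]))
    return codes


if __name__ == "__main__":
    v = [0.3, 1.2, -0.5, 4.0, 2.2, 0.1]
    rescaled = [2.0 * t + 1.0 for t in v]
    # Both calls print the same list, since the ordering of entries is unchanged.
    print(wta_hash(v, num_codes=8, k=3, seed=7))
    print(wta_hash(rescaled, num_codes=8, k=3, seed=7))
```

Two vectors can then be compared by counting how many codes they share, which depends on their rank orderings rather than on their raw magnitudes.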
“Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections”, Dipanjan Das,
We would like to have natural language processing systems for all languages, but obtaining labeled data for all languages and tasks is unrealistic and expensive. We present an approach which leverages existing resources in one language (for example English) to induce part-of-speech taggers for languages without any labeled training data. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in a hidden Markov model trained with the Expectation Maximization algorithm.
“TCP Fast Open”, Sivasankar Radhakrishnan,
TCP Fast Open enables data exchange during TCP’s initial handshake. It decreases application network latency by one full round-trip time, a significant speedup for today's short Web transfers. Our experiments on popular websites show that Fast Open reduces the whole-page load time by over 10% on average, and in some cases by up to 40%.
“Proportional Rate Reduction for TCP”,
Packet losses increase the latency of Web transfers and negatively impact user experience. Proportional rate reduction (PRR) is designed to recover from losses quickly, smoothly and accurately by pacing out retransmissions across received ACKs during TCP’s fast recovery. Experiments on Google Web and YouTube servers in the U.S. and India demonstrate that PRR reduces the TCP latency of connections experiencing losses by 3-10%, depending on response size.
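The pacing idea can be sketched as a per-ACK computation of how many segments the sender may transmit. The sketch below loosely follows the rule described in RFC 6937 as we recall it, counting in whole segments. The function name is hypothetical, the variable names mirror the RFC's as best remembered, and the bookkeeping around it (updating `prr_out` after each send, computing `pipe`) is assumed to happen elsewhere.

```python
def prr_sndcnt(prr_delivered, prr_out, delivered, pipe, ssthresh, recover_fs, mss=1):
    """Segments allowed out in response to one ACK during fast recovery.

    prr_delivered: total data delivered to the receiver since recovery began
    prr_out:       total data sent since recovery began
    delivered:     data newly delivered by this ACK
    pipe:          estimate of data currently in flight
    ssthresh:      target window chosen by the congestion control algorithm
    recover_fs:    data in flight when recovery started
    """
    if pipe > ssthresh:
        # Proportional part: send ssthresh / recover_fs segments per segment
        # delivered, using an integer ceiling to avoid floating-point error.
        target = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
        sndcnt = target - prr_out
    else:
        # Below the target window: grow back toward ssthresh, but bounded
        # by what has actually been delivered (slow-start style).
        limit = max(prr_delivered - prr_out, delivered) + mss
        sndcnt = min(ssthresh - pipe, limit)
    return max(sndcnt, 0)


if __name__ == "__main__":
    # 20 segments were in flight when loss was detected; the window is halved to 10.
    print(prr_sndcnt(prr_delivered=1, prr_out=0, delivered=1, pipe=18, ssthresh=10, recover_fs=20))
    print(prr_sndcnt(prr_delivered=2, prr_out=1, delivered=1, pipe=18, ssthresh=10, recover_fs=20))
```

With the window halved, the proportional branch releases about one segment for every two delivered, so the reduction is spread across the recovery period instead of arriving as one burst or one long stall.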
Security and Privacy
As software is increasingly written in high-level, type-safe languages, attackers have fewer means to subvert system fundamentals, and attacks are more likely to exploit errors and vulnerabilities in application-level logic.
“App Isolation: Get the Security of Multiple Browsers with Just One”, Eric Y. Chen, Jason Bau,
We find that anecdotal advice to use a separate web browser for sites like your bank is indeed effective at defeating most cross-origin web attacks.
“Improving the speed of neural networks on CPUs”,
As deep neural networks become state-of-the-art in real-time machine learning applications such as speech recognition, computational complexity is fast becoming a limiting factor in their adoption. We show how to best leverage modern CPU architectures to significantly speed up their inference.
“Bayesian Language Model Interpolation for Mobile Speech Input”,
Voice recognition on the Android platform must contend with many possible target domains - e.g. search, maps, SMS. For each of these, a domain-specific language model was built by linearly interpolating several n-gram LMs from a common set of Google corpora. The current work has found a way to efficiently compute a single n-gram language model with accuracy very close to the domain-specific LMs but with considerably less complexity at recognition time.
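For readers unfamiliar with the baseline being improved on, linear interpolation simply mixes the component models' probabilities with weights that sum to one. The toy sketch below does this for unigram distributions only. The model contents, weights and function name are made up for illustration, and it does not implement the paper's Bayesian method.

```python
def interpolate(models, weights):
    """Linearly interpolate word distributions: P(w) = sum_i weights[i] * P_i(w).

    models:  list of dicts mapping word -> probability (each sums to 1)
    weights: list of mixture weights, one per model, summing to 1
    """
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    vocab = set().union(*models)  # union of the dicts' keys
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }


if __name__ == "__main__":
    # Hypothetical component models and domain weights.
    search_lm = {"weather": 0.6, "pizza": 0.4}
    sms_lm = {"hello": 0.7, "pizza": 0.3}
    sms_domain = interpolate([search_lm, sms_lm], [0.25, 0.75])
    print(sorted(sms_domain.items()))
```

A separate set of weights per target domain yields one mixed model per domain, which is the setup the paper replaces with a single model.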
“Large-Scale Parallel Statistical Forecasting Computations in R”,
This paper describes the implementation of a framework for utilizing distributed computational infrastructure from within the R interactive statistical computing environment, with applications to timeseries forecasting. This system is widely used by the statistical analyst community at Google for data analysis on very large data sets.
“Dremel: Interactive Analysis of Web-Scale Datasets”, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton,
Dremel is a scalable, interactive ad-hoc query system. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. Besides continued growth internally at Google, Dremel now also backs an increasing number of external customers, including BigQuery, and UIs such as the AdExchange front-end.
“Representative Skylines using Threshold-based Preference Distributions”,
The paper adopts a principled approach towards representative skylines and formalizes the problem of displaying k tuples such that the probability that a random user clicks on one of them is maximized. This requires mathematically modeling (a) the likelihood with which a user is interested in a tuple, as well as (b) how one negotiates the lack of knowledge of an explicit set of users. This work presents theoretical and experimental results showing that the suggested algorithm significantly outperforms previously suggested approaches.
“Hyper-local, directions-based ranking of places”, Petros Venetis, Hector Gonzalez,
Click-through information is one of the strongest signals we have for ranking web pages. We propose an equivalent signal for ranking real-world places: the number of times that people ask for precise directions to the address of the place. We show that this signal is competitive in quality with human reviews while being much cheaper to collect. We also show that the signal can be incorporated efficiently into a location search system.
“Power Management of Online Data-Intensive Services”, David Meisner, Christopher M. Sadler,
Compute- and data-intensive Web services (such as Search) are a notoriously hard target for energy-saving techniques. This article characterizes the statistical hardware activity behavior of servers running Web search and discusses the potential opportunities of existing and proposed energy-saving techniques.
“The Impact of Memory Subsystem Resource Sharing on Datacenter Applications”, Lingjia Tang, Jason Mars, Neil Vachharajani,
In this work, the authors expose key characteristics of an emerging class of Google-style workloads and show how to enhance system software to take advantage of these characteristics to improve efficiency in data centers. The authors find that across datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing micro-architectural resources on a single machine (such as on-chip caches and bandwidth to memory). Co-locating threads from multiple applications with diverse memory behavior changes the optimal mapping of threads to cores for each application. By employing an adaptive thread-to-core mapper, the authors improved the performance of the datacenter applications by up to 22% over the status quo thread-to-core mapping, achieving performance within 3% of optimal.
“Language-Independent Sandboxing of Just-In-Time Compilation and Self-Modifying Code”, Jason Ansel, Petr Marchenko,
Since its introduction in the early 90's, Software Fault Isolation, or SFI, has been a static code technique, commonly perceived as incompatible with dynamic libraries, runtime code generation, and other dynamic code.
“Thialfi: A Client Notification Service for Internet-Scale Applications”, Atul Adya, Gregory Cooper,
This paper describes a notification service that scales to hundreds of millions of users, provides sub-second latency in the common case, and guarantees delivery even in the presence of a wide variety of failures.