A differentiable approach to inductive logic programming


Recent work in neural abstract machines has proposed many useful techniques for learning sequences of applications of discrete but differentiable operators. These techniques allow us to model traditionally procedural problems using neural networks. In this work, we are interested in using neural networks to learn to perform logical reasoning. We propose a model that has access to differentiable operators which can be composed to perform reasoning. These differentiable reasoning operators were first introduced in TensorLog, a recently proposed probabilistic deductive database. Equipped with a model that can perform logical reasoning, we further investigate the task of inductive logic programming.


Inductive logic programming (ILP) [1] refers to a broad class of problems that aim to find logic rules that model the observed data. The observed data usually contains background knowledge and examples, typically in the form of database relations or knowledge graphs. Inductive logic programming is often combined with the use of probabilistic logics, and is a useful technique for knowledge base completion and other relational learning tasks [2]. However, past inductive logic programming approaches have involved discrete search through the space of possible structures [3]. This search is expensive and difficult to integrate with neural networks.
TensorLog [4] is a recently proposed probabilistic deductive database. The major contribution of TensorLog is that it provides a principled way to define differentiable reasoning processes. TensorLog reduces a broad class of logic programs to inferences made on factor graphs, with logical variables encoded as multinomials over the domain of database constants and logical relations encoded as factors. By “unrolling” the factor graph inference into a sequence of message passing steps, one can obtain a differentiable function that answers a particular class of local queries against a database.
The key idea in this paper is to learn a neural network controller that composes TensorLog’s message passing operators sequentially. Since the operations are differentiable, and can support reasoning, the resulting “neural ILP” system can learn logic programs. These logic programs can be interpreted as induced logic rules.

A neural network model for inductive logic programming

Even though the reasoning process can be made differentiable, it is still not an easy task to design a neural network model for inductive logic programming. Many challenges remain, such as the interface between the neural network controller and the reasoning operators, the representation of logic rules, and the interface to dynamic memory.
To address these problems, we adapt techniques developed in the neural abstract machines literature, including Neural Programmer [5], Memory Networks [6], Differentiable Neural Computer [7], and attention mechanisms [8], as well as recent work on learning procedural tasks, such as Neural Theorem Prover [9].
The main part of the model is a recurrent neural network acting as a controller, and the controller has access to differentiable operators and memory. Figure 1 provides an overview of the model. The controller takes in the previous state and produces a new state to select three things via attention mechanisms: the next operator, the first input to that operator, and the second input (if there is one). After selection, the operator is applied to the arguments and the output is stored in the next available memory slot.
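The controller step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the weight matrices (`W_h`, `W_op`, `W_arg1`, `W_arg2`), the sizes, the dot-product attention over memory slots, and the three stand-in operators are all hypothetical, the weights are random rather than trained, and the query input to the controller is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CONST, HIDDEN = 4, 8  # illustrative sizes, not taken from the paper


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


# Stand-in operators (assumptions): each takes two weight vectors over constants.
OPERATORS = [
    lambda a, b: a,                         # unary pass-through, ignores b
    lambda a, b: np.clip(a + b, 0.0, 1.0),  # soft set union
    lambda a, b: a * b,                     # soft set intersection
]

# Hypothetical, untrained controller parameters.
W_h = rng.normal(scale=0.5, size=(HIDDEN, HIDDEN))
W_op = rng.normal(scale=0.5, size=(len(OPERATORS), HIDDEN))
W_arg1 = rng.normal(scale=0.5, size=(N_CONST, HIDDEN))
W_arg2 = rng.normal(scale=0.5, size=(N_CONST, HIDDEN))


def controller_step(h, memory):
    """One step: update the state, attend, apply operators, write to memory."""
    h_new = np.tanh(W_h @ h)                      # new controller state
    op_att = softmax(W_op @ h_new)                # attention over operators
    slots = np.stack(memory)                      # shape (num_slots, N_CONST)
    arg1_att = softmax(slots @ (W_arg1 @ h_new))  # attention for first input
    arg2_att = softmax(slots @ (W_arg2 @ h_new))  # attention for second input
    a = arg1_att @ slots                          # softly selected first input
    b = arg2_att @ slots                          # softly selected second input
    out = sum(w * op(a, b) for w, op in zip(op_att, OPERATORS))
    return h_new, memory + [out], op_att          # output fills the next slot


h = rng.normal(size=HIDDEN)
memory = [np.eye(N_CONST)[0], np.eye(N_CONST)[1]]  # two initial one-hot slots
attention_trace = []
for _ in range(3):
    h, memory, op_att = controller_step(h, memory)
    attention_trace.append(op_att)
```

Because every selection is a softmax-weighted mixture rather than a hard choice, each step stays differentiable with respect to the controller parameters, which is what makes end-to-end training possible in this kind of design.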
Intuitively, the operators correspond to the mathematical operations used in message passing. Each operator maps a distribution over database constants to a new distribution: for example, an operator op_relation might map a distribution over a set X to a distribution over {y : relation(x, y), x ∈ X}, the set of all database constants that stand in this particular relation to some element of X. There is a constant number of operators for each database predicate (i.e. relation), as well as some additional operators for set union and intersection.
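One common way to realise such operators is to store each relation as a 0/1 matrix and apply it by matrix-vector multiplication. The sketch below assumes that encoding; the constants, the facts, the argument order of the relations, and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy database; constants and facts are illustrative only.
constants = ["ann", "bob", "carl", "dana"]
index = {c: i for i, c in enumerate(constants)}
facts = {
    "brother": [("carl", "bob")],  # assumed fact: brother(carl, bob)
    "father": [("bob", "ann")],    # assumed fact: father(bob, ann)
}


def relation_matrix(pairs):
    """Build M with M[j, i] = 1 when relation(x_i, y_j) holds."""
    m = np.zeros((len(constants), len(constants)))
    for x, y in pairs:
        m[index[y], index[x]] = 1.0
    return m


def one_hot(name):
    v = np.zeros(len(constants))
    v[index[name]] = 1.0
    return v


def op_relation(m, v):
    """Move weight from each x to every y with relation(x, y)."""
    return m @ v


def op_union(u, v):
    return np.clip(u + v, 0.0, 1.0)


def op_intersection(u, v):
    return u * v


M_brother = relation_matrix(facts["brother"])
M_father = relation_matrix(facts["father"])

# Chain two operators starting from all weight on "carl".
result = op_relation(M_father, op_relation(M_brother, one_hot("carl")))
answer = constants[int(np.argmax(result))]
```

Here all the weight ends up on `ann`, since the chain follows brother(carl, bob) and then father(bob, ann). Each operator is a linear or elementwise map, so a composition of them is differentiable in its inputs.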
The memory has two parts. The first part contains embeddings of the inputs to the query. In the case of an argument retrieval query, the inputs to the query are the database constants and predicates. The embeddings can be learned jointly during training [10]. The second part of the memory is used to store intermediate operator outputs, which can be thought of as the messages being passed around in the graphical model setting.
Once trained, the model can be used to induce logic rules that connect relations in the database. For example, if we want to know how other relations imply the relation uncle, we can ask the trained model a query that involves uncle. The trained model will compose operators to perform reasoning about the query. We can then read off the operators that have the most attention at each time step. If these operators correspond to brother and father, we know that the model has correctly learned to use these two relations to reason about uncle.
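Reading a rule off the attention weights amounts to taking the arg-max operator at each time step. The snippet below is a minimal sketch of that readout; the operator names, the attention values, and the rule syntax are made up for illustration and are not results from the paper.

```python
import numpy as np

# Hypothetical operator inventory and attention weights (one row per time step).
operator_names = ["op_father", "op_brother", "op_mother", "op_union"]
attention = np.array([
    [0.05, 0.85, 0.05, 0.05],  # step 1: most weight on op_brother
    [0.90, 0.04, 0.03, 0.03],  # step 2: most weight on op_father
])


def read_rule(attention, names, head):
    """Pick the most-attended operator at each step and format a rule string."""
    chosen = [names[i] for i in attention.argmax(axis=1)]
    return f"{head} <- " + ", ".join(chosen)


rule = read_rule(attention, operator_names, "uncle")
```

With these illustrative weights, `rule` is the string `uncle <- op_brother, op_father`. A hard arg-max discards the remaining attention mass, so a readout like this is most trustworthy when the attention at each step is sharply peaked.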


We experiment with a European royal family dataset. This dataset contains 3007 entities, 28373 facts, and 12 relations. The relations are father, mother, husband, wife, son, daughter, brother, sister, uncle, aunt, nephew, and niece. We split the entities into train and test sets.
To learn to induce logic rules about a specific relation R, we let the database consist of facts about the other relations for both train and test sets. During training, we ask the model to answer queries about the relation R using facts in the database. The loss is the mean squared error between the model’s answer and the true answer. The model is trained with weak supervision: only the query input and answer are used, and the intermediate steps receive no supervision. Table 1 lists induced logic rules for some relations. We found that the model can learn not only rules with multiple predicates, but also rules that involve disjunctions. In the future, we plan to apply the model to more complex and challenging datasets, including web-scale knowledge bases.
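The training loss mentioned above can be written down directly. The sketch below assumes that both the model's answer and the true answer are vectors of weights over the database constants; that encoding and the numbers are assumptions made for illustration.

```python
import numpy as np


def mse_loss(predicted, target):
    """Mean squared error between the predicted and true answer vectors."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))


# Hypothetical answer over four constants; the true answer is the second one.
predicted = np.array([0.1, 0.7, 0.1, 0.1])
target = np.array([0.0, 1.0, 0.0, 0.0])
loss = mse_loss(predicted, target)  # (0.01 + 0.09 + 0.01 + 0.01) / 4 = 0.03
```

Only the final answer enters the loss, which matches the weak supervision setting: no term constrains which operators or arguments are chosen at intermediate steps.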


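Recent work on neural abstract machines has proposed many useful techniques for learning sequences of applications of discrete but differentiable operators. These techniques allow us to model traditionally procedural problems using neural networks. In this work, we are interested in using neural networks to learn to perform logical reasoning. We propose a model that has access to differentiable operators that can be composed to perform reasoning. These differentiable reasoning operators were first introduced in TensorLog, a recently proposed probabilistic deductive database. Equipped with a model that can perform logical reasoning, we further investigate the task of inductive logic programming.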





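Intuitively, the operators correspond to the mathematical operations used in message passing. An operator maps a distribution over database constants to a new distribution: for example, an operator OP maps a distribution over a set X to a distribution over the set of database constants that stand in a particular relation to X. There is a fixed number of operators for each database predicate (i.e. relation), plus some additional operators for set union and intersection.
The memory has two parts. The first part contains the embeddings of the query inputs. In the case of an argument retrieval query, the query inputs are the database constants and predicates. The embeddings can be learned jointly during training. The second part of the memory stores the intermediate outputs of the operators, which can be viewed as the messages passed in the graphical model setting.
Once trained, the model can be used to induce logic rules that connect relations in the database. For example, if we want to know how other relations imply the relation “uncle”, we can pose a query about “uncle” to the trained model. The trained model will compose operators to reason about the query. We can then read off the operator that receives the most attention at each time step. If these operators correspond to “father” and “brother”, we know that the model has correctly learned to use these two relations to reason about “uncle”.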







