Course:CPSC522/Knowledge Compilation

Knowledge Compilation

Knowledge compilation is a technique for compiling a knowledge base into a target data structure on which many desired operations can be performed efficiently.
Principal Author: Bahare Fatemi

Abstract

Knowledge compilation has been emerging recently as a new direction of research with regard to the computational intractability of general propositional reasoning. The aim of knowledge compilation is to compile a models/theories into target data structures supporting tractable reasoning. In this page, we provides a brief description of knowledge compilation technique and a series of target languages that are computationally tractable for several reasoning tasks. We use real world examples to provide a clear insight of the technique used by knowledge compilation. We also provide a comparison of the target languages in terms of supporting tractable reasoning as well as succinctness.

Builds on

Knowledge compilation is a subset of artificial intelligence which is used for reasoning in propositional logic.

More general than

Long before knowledge compilation emerged as a research field, compilation to languages like BDD and PI were used for efficient query answering. Knowledge compilation emerged as a research field with the introduction of the NNF (negation normal form) language and with representing that the previous languages are subsets of the NNF.

Content

Introduction

Imagine a library with thousands of books without any order. Suppose you are searching for a specific book. Searching through all the books one by one to find a desired book takes a considerable amount of time.

Now suppose the books are in alphabetical order. Searching for a book takes much less time.

Imagine that you are a librarian and you are in charge of helping people find their books. Assume that searching the whole library for a specific book takes 5 minutes and as a librarian, you have to do the search 5 million times. If you decide to sort the books of the library, that takes 300 minutes. But after sorting the books, searching through the library for a book will cost one minute. If the books are not sorted, you have to take 1 million times * 5 minutes each time = 5000000 minutes and if the books are sorted alphabetically, you have to take 300 minutes + 1 million times * 1 hour each = 1000300 hours, which is a huge save of time.

Now let's consider a similar problem in the field of data structure and algorithm. Suppose we have the information about students in computer science department of UBC, and the query is to find a specific student record. Sorting students will help the department to find the specific student easier. Before sorting, finding a specific student record has O(n) time complexity where n is the number of records in the dataset. If we sort the students' records and store them in a binary tree, binary search algorithm helps us to find a specific record in O(log(n)). Querying on the data is like searching through the library and students' records are like the books.

In both of the above examples, we have a data and some desired queries. We spend some time to convert the original data to a target structure, on which the queries can be answered more efficiently. Knowledge compilation plays the same role in artificial intelligence field. In artificial intelligence there are lots of real world models and we have to query on these models. Knowledge compilation compiles these real world models to a target data structure in which operations are less costly as output. So the function of the knowledge compilation is to compile the input model to a target data structure. Figure 1 shows a high level description of knowledge compilation.

Knowledge compilation has been emerging recently as a new direction of research with the computational intractability of general propositional reasoning. Knowledge compilation splits the reasoning process into two phases: off-line compilation phase and on-line query-answering phase. The main motivation for knowledge compilation is to put as much load as possible in off-line phase to make the reasoning and querying more efficient. The other motivation is to make the reasoning system simpler in on-line phase.

As an example of knowledge compilation, we can think of the input model as Boolean logic. Basic operations in Boolean logic are conditioning, conjoining, disjoining, and so on, as we can see in figure 2. Using knowledge compilation, one can compile Boolean logic theory into a data structure called DNNF (decomposable negation normal form) , on which operations are less costly. For example conditioning can be done in linear time complexity using DNNF, while it has exponential time complexity in CNF (Conjunctive Normal Form), which is the prevalent language of Boolean logic theories.

The NNF language and its subsets

Negation normal form (NNF) is a rooted, directed acyclic graph where each leaf is labeled with true, false, $Xor\neg X$ and each internal leaf is labeled with $\land$ or $\lor$ .

Variables of a node $C_{i}$ in the NNF structure can be exploited by $Vars(C_{i})$ .

Many of the target data structures used by knowledge compilation for logical theories are subsets of NNF. These subsets correspond to NNFs with specific properties. Here are such properties.

Decomposability

For all and-node $C$ in an NNF, with its children $C_{1},...,C_{n}$ . If for $i\neq j,Vars(C_{i})\cap Vars(C_{j})=\emptyset$ this NNF satisfies the decomposability property .

Determinism

For all or-node $C$ in an NNF, with its children $C_{1},...,C_{n}$ . If for $i\neq j,C_{i}\land C_{j}\models false$ this NNF satisfies the determinism property .

Smoothness

For all or-node $C$ in an NNF, with its children $C_{1},...,C_{n}$ . If for $i\neq j,Vars(C_{i})=Vars(C_{j})$ this NNF satisfies the smoothness property .

Decision

A decision node in an NNF, is a node labeled with true, false, or is an or-node having the form $(X\land \alpha )\lor (\neg X\land \beta )$ , in which X is a variable and $\alpha$ and $\beta$ are decision nodes .

Subsets of NNF language

There are different kinds of languages which are subsets of NNF language. Some of them are widely studied in computer science. This section introduces them by their acronym in table 1 taken from .

DNNF is a NNF with decomposability property, BDD is an NNF whose root is a decision node, and OBDD is defined as well-known ordered binary decision diagrams . Other languages are easy to recognize by their acronyms.

Table 1: Language acronyms
Acronym Description
NNF Negation Normal Form
DNNF Decomposable Negation Normal Form
d-NNF Deterministic Negation Normal Form
s-NNF Smooth Negation Normal Form
d-DNNF Deterministic Decomposable Negation Normal Form
sd-DNNF Smooth Deterministic Decomposable Negation Normal Form
BDD Binary Decision Diagram
OBDD Ordered Binary Decision Diagram

Queries

There are some important queries for logical theories which have different time complexities in different languages. There is a list of some important queries and a brief description of them in table 2 taken from .

Table 2: Notation for queries.
Notation Query
CO polytime consistency check
VA polytime validity check
CE polytime clausal entailment check
IM polytime implicant check
EQ polytime equivalence check
SE polytime sentential entailment check
CT polytime model counting
ME polytime model enumeration

Table 3 taken from  shows subsets of the NNF language and queries they support in polytime. ✔ means satisfaction of that query on the corresponding language, ✘ means not satisfaction of that query on the corresponding language, and ? means it is an open problem.

Table 3: Subsets of NNF language and their corresponding polytime queries.
L CO VA CE IM EQ SE CT ME
NNF
DNNF
d-NNF
s-NNF
d-DNNF ?
sd-DNNF ?
BDD
OBDD

Compiling languages

There are algorithms for compiling a Boolean logic theory into the above languages. A detail analysis of these algorithms is out of this page's scope. Interested readers can refer to c2d compiler and DSHARP.

Succinctness

Imposing different properties like decomposability, smoothness, and determinism to NNFs results in more polynomial operation. On the other side adding different properties makes the data structure exponentially larger. for example there is a succinctness ordering of target compilation language between DNNF and OBDD which is taken from :

$DNNf Annotated Bibliography

 Darwiche, Adnan. "Decomposable negation normal form." Journal of the ACM (JACM) 48.4 (2001): 608-647.

 Darwiche, Adnan. "On the tractable counting of theory models and its application to belief revision and truth maintenance." arXiv preprint cs/0003044 (2000).

 Darwiche, Adnan, and Pierre Marquis. "A knowledge compilation map." Journal of Artificial Intelligence Research 17.1 (2002): 229-264.

 Bryant, Randal E. "Graph-based algorithms for boolean function manipulation." Computers, IEEE Transactions on 100.8 (1986): 677-691. Permission is granted to copy, distribute and/or modify this document according to the terms in Creative Commons License, Attribution-NonCommercial-ShareAlike 3.0. The full text of this license may be found here: CC by-nc-sa 3.0 