GENERIC INFERENCE
www.it-ebooks.info
GENERIC INFERENCE
A Unifying Theory for Automated Reasoning
Marc Pouly
Cork Constraint Computation Centre
University College Cork, Ireland
Interdisciplinary Centre for Security, Reliability and Trust
University of Luxembourg
Jürg Kohlas
Department of Informatics
University of Fribourg, Switzerland
WILEY
A JOHN WILEY & SONS, INC., PUBLICATION
www.it-ebooks.info
Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be
suitable for your situation. You should consult with a professional where appropriate. Neither the
publisher nor author shall be liable for any loss of profit or any other commercial damages, including
but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic formats. For more information about Wiley products, visit our web site at
www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Pouly, Marc, 1980-, author.
Generic Inference : A Unifying Theory for Automated Reasoning / Marc Pouly, Jürg Kohlas.
p. cm
Includes bibliographical references and index.
ISBN 978-0-470-52701-6 (hardback)
1. Valuation theory. 2. Algorithms. 3. Algebra, Abstract. I. Kohlas, Jürg, 1939-,
author. II. Title.
QA162.P65 2011
519.54—dc22
2010042336
Printed in Singapore.
oBook ISBN: 9781118010877
ePDF ISBN: 9781118010846
ePub ISBN: 9781118010860
10 9 8 7 6 5 4 3 2 1
www.it-ebooks.info
To Marita and Maria
www.it-ebooks.info
CONTENTS
List of Instances and Applications
xiii
List of Figures and Tables
xvii
Acknowledgments
xxi
Introduction
xxiii
PART I
1
LOCAL COMPUTATION
Valuation Algebras
1.1
1.2
1.3
3
Operations and Axioms
First Examples
Conclusion
Appendix: Generalizations of the Valuation Algebra Framework
A.l
Ordered Sets and Lattices
A. 1.1 Partitions and Partition Lattices
A.2
Valuation Algebras on General Lattices
A.3
Valuation Algebras with Partial Projection
Problem Sets and Exercises
4
6
17
17
17
19
20
24
26
VII
www.it-ebooks.info
viii
2
CONTENTS
29
2.1
2.2
2.3
2.4
3
Inference Problems
30
32
33
44
44
Graphs, Trees and Hypergraphs
Knowledgebases and their Representation
The Inference Problem
Conclusion
Problem Sets and Exercises
Computing Single Queries
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
3.10
3.11
3.12
47
Valuation Algebras with Variable Elimination
49
Fusion and Bucket Elimination
50
3.2.1
The Fusion Algorithm
50
3.2.2
Join Trees
55
3.2.3
The Bucket Elimination Algorithm
57
3.2.4
First Complexity Considerations
61
3.2.5
Some Generalizing Complexity Comments
66
3.2.6
Limitations of Fusion and Bucket Elimination
68
Valuation Algebras with Neutral Elements
69
3.3.1
Stable Valuation Algebras
70
Valuation Algebras with Null Elements
73
Local Computation as Message-Passing Scheme
75
3.5.1
The Complexity of Fusion as Message-Passing Scheme 77
Covering Join Trees
78
Join Tree Construction
82
3.7.1
Join Tree Construction by Triangulation
83
The Collect Algorithm
87
3.8.1
The Complexity of the Collect Algorithm
90
3.8.2
Limitations of the Collect Algorithm
91
Adjoining an Identity Element
91
The Generalized Collect Algorithm
92
3.10.1 Discussion of the Generalized Collect Algorithm
96
3.10.2 The Complexity of the Generalized Collect Algorithm
97
An Application: The Fast Fourier Transform
98
Conclusion
100
Appendix : Proof of the Generalized Collect Algorithm
Problem Sets and Exercises
101
103
4
109
Computing Multiple Queries
www.it-ebooks.info
CONTENTS
4.1
4.2
4.3
4.4
4.5
4.6
4.7
4.8
4.9
IX
The Shenoy-Shafer Architecture
4.1.1
Collect & Distribute Phase
4.1.2
The Binary Shenoy-Shafer Architecture
4.1.3
Performance Gains due to the Identity Element
4.1.4
Complexity of the Shenoy-Shafer Architecture
4.1.5
Discussion of the Shenoy-Shafer Architecture
4.1.6
The Super-Cluster Architecture
Valuation Algebras with Inverse Elements
4.2.1
Idempotent Valuation Algebras
The Lauritzen-Spiegelhalter Architecture
4.3.1
Complexity of the Lauritzen-Spiegelhalter Architecture
The HUGIN Architecture
4.4.1
Complexity of the HUGIN Architecture
The Idempotent Architecture
4.5.1
Complexity of the Idempotent Architecture
Answering Uncovered Queries
4.6.1
The Complexity of Answering Uncovered Queries
Scaling and Normalization
Local Computation with Scaling
4.8.1
The Scaled Shenoy-Shafer Architecture
4.8.2
The Scaled Lauritzen-Spiegelhalter Architecture
4.8.3
The Scaled HUGIN Architecture
Conclusion
111
114
116
117
118
120
120
121
122
123
127
128
131
131
134
134
141
143
147
147
150
151
152
Appendix: Valuation Algebras with Division
D. 1 Properties for the Introduction of Division
D.l.l
Separative Valuation Algebras
D.I.2 Regular Valuation Algebras
D.1.3 Idempotent Valuation Algebras
D.2
Proofs of Division-Based Architectures
D.2.1 Proof of the Lauritzen-Spiegelhalter Architecture
D.2.2 Proof of the HUGIN Architecture
D.3
Proof for Scaling in Valuation Algebras
Problem Sets and Exercises
153
154
155
164
167
167
167
168
169
173
PART II GENERIC CONSTRUCTIONS
5
Semiring Valuation Algebras
181
5.1
182
Semirings
www.it-ebooks.info
X
CONTENTS
5.2
5.3
5.4
5.5
5.6
5.7
5.8
5.9
5.1.1
Multidimensional Semirings
5.1.2
Semiring Matrices
Semirings and Order
Semiring Valuation Algebras
Examples of Semiring Valuation Algebras
Properties of Semiring Valuation Algebras
5.5.1
Semiring Valuation Algebras with Neutral Elements
5.5.2
Stable Semiring Valuation Algebras
5.5.3
Semiring Valuation Algebras with Null Elements
Some Computational Aspects
Set-Based Semiring Valuation Algebras
Properties of Set-Based Semiring Valuation Algebras
5.8.1
Neutral and Stable Set-Based Semiring Valuations
5.8.2
Null Set-Based Semiring Valuations
Conclusion
185
185
186
190
192
195
196
196
197
198
200
203
203
204
204
Appendix: Semiring Valuation Algebras with Division
E. 1
Separative Semiring Valuation Algebras
E.2
Regular Semiring Valuation Algebras
E.3
Cancellative Semiring Valuation Algebras
E.4
Idempotent Semiring Valuation Algebras
E.5
Scalable Semiring Valuation Algebras
Problem Sets and Exercises
205
205
209
212
213
215
217
6
Valuation Algebras for Path Problems
219
6.1
6.2
6.3
6.4
6.5
6.6
220
223
232
237
245
246
249
253
256
256
262
262
6.7
6.8
6.9
6.10
Some Path Problem Examples
The Algebraic Path Problem
Quasi-Regular Semirings
Quasi-Regular Valuation Algebras
Properties of Quasi-Regular Valuation Algebras
Kleene Algebras
6.6.1
Matrices over Kleene Algebras
Kleene Valuation Algebras
Properties of Kleene Valuation Algebras
Further Path Problems
Conclusion
Problem Sets and Exercises
www.it-ebooks.info
CONTENTS
7
XI
Language and Information
265
7.1
266
266
269
271
274
275
279
280
281
286
289
290
7.2
7.3
7.4
Propositional Logic
7.1.1
Language and Semantics
7.1.2
Propositional Information
7.1.3
Some Computational Aspects
Linear Equations
7.2.1
Equations and Solution Spaces
7.2.2
Algebra of Affine Spaces
Information in Context
7.3.1
Contexts: Models and Language
7.3.2
Tuple Model Structures
Conclusion
Problem Sets and Exercises
PART III
8
APPLICATIONS
Dynamic Programming
293
8.1
8.2
294
297
297
299
303
304
304
307
311
317
318
8.3
8.4
8.5
9
Solutions and Solution Extensions
Computing Solutions
8.2.1
Computing all Solutions with Distribute
8.2.2
Computing some Solutions without Distribute
8.2.3
Computing all Solutions without Distribute
Optimization and Constraint Problems
8.3.1
Totally Ordered Semirings
8.3.2
Optimization Problems
Computing Solutions of Optimization Problems
Conclusion
Problem Sets and Exercises
Sparse Matrix Techniques
321
9.1
322
323
325
328
335
338
339
343
9.2
Systems of Linear Equations
9.1.1
Gaussian Variable Elimination
9.1.2
Fill-ins and Local Computation
9.1.3
Regular Systems
9.1.4
LDR-Decomposition
Symmetric, Positive Definite Matrices
9.2.1
Valuation Algebra of Symmetric Systems
9.2.2
Solving Symmetric Systems
www.it-ebooks.info
XII
CONTENTS
9.3
9.4
10
9.2.3
Symmetrie Gaussian Elimination
9.2.4
Symmetrie Deeompositions and Elimination Sequenees
9.2.5
An Application: The Least Squares Method
Semiring Fixpoint Equation Systems
9.3.1
Computing Quasi-Inverse Matrices
9.3.2
Local Computation with Quasi-Regular Valuations
9.3.3
Local Computation with Kleene Valuations
Conclusion
Problem Sets and Exercises
348
354
357
364
365
366
371
382
382
Gaussian Information
385
10.1
10.2
10.3
Gaussian Systems and Potentials
Generalized Gaussian Potentials
Gaussian Information and Gaussian Potentials
10.3.1 Assumption-Based Reasoning
10.3.2 General Gaussian Information
10.3.3 Combination of Gaussian Information
10.3.4 Projection of Gaussian Information
Valuation Algebra of Gaussian Potentials
An Application: Gaussian Dynamic Systems
An Application: Gaussian Bayesian Networks
Conclusion
386
395
400
400
403
407
408
414
417
423
428
Valuation Algebra Properties of Hints
Gaussian Densities
Problem Sets and Exercises
428
428
433
435
10.4
10.5
10.6
10.7
Appendix:
J.l
J.2
References
437
Index
447
www.it-ebooks.info
List of Instances and Applications
1.1 Indicator Functions, Boolean Functions and Crisp Constraints
7
1.2 Relational Algebra
9
1.3 Arithmetic Potentials
10
1.4 Set Potentials
12
1.5 Density Functions
13
1.6 Gaussian Densities and Gaussian Potentials
15
A.7 Set Potentials on Partition Lattices
24
A.8 Quotients of Density Functions
26
2.1 Вayesian Networks
33
2.2 Query Answering in Relational Databases
35
2.3 Dempster-Shafer Theory of Evidence
36
2.4 Satisfiability of Constraints and Propositional Logic
38
2.5 Hadamard Transform
41
xiii
www.it-ebooks.info
XIV
LIST OF INSTANCES AND APPLICATIONS
2.6 Discrete Fourier and Cosine Transforms
42
B.4 Filtering, Prediction and Smoothing in hidden Markov chains
45
4.4 Probability Potentials
146
4.5 Probability Density Functions
146
D.9 Scaling of Belief Functions
170
5.1 Weighted Constraints, Spohn Potentials and GAI Preferences
192
5.2 Possibility Potentials, Probabilistic Constraints and Fuzzy Sets
193
5.3 Set-based Constraints and Assumption-based Reasoning
194
5.3 Possibility Measures
202
5.3 Disbelief Functions
202
E.15 Quasi-Spohn Potentials
212
E. 18 Bottleneck Constraints
215
6.1 Connectivity Path Problem
220
6.2 Shortest and Longest Distance Problem
221
6.3 Maximum Capacity Problem
221
6.4 Maximum Reliability Problem
222
6.5 Regular Languages
223
6.6 Path Counting Problem
257
6.7 Markov Chains
257
6.8 Numeric Partial Differentiation
259
6.9 Matrix Multiplication
261
F. 1 Path Listing Problems
263
F.2 Testing whether Graphs are Bipartite
263
F.3 Identifying Cut Nodes in Graphs
263
F.6 Network Compilation
264
F.6 Symbolic Partial Differentiation
264
7.1 Satisfiability in Prepositional Logic
272
7.2 Theorem Proving in Prepositional Logic
272
www.it-ebooks.info
LIST OF INSTANCES AND APPLICATIONS
XV
7.3 Propositional Logic
283
7.4 Linear Equations Systems and Affine Spaces
283
7.5 Linear Inequality Systems and Convex Polyhedra
284
7.6 Predicate Logic
284
G. 1 Consequence Finding in Propositional Logic
290
8.1 Classical Optimization
307
8.2 Satisfiability of Constraints and Propositional Logic
308
8.3 Maximum Satisfiability of Constraints and Propositional Logic
308
8.4 Most and Least Probable Explanations
308
8.5 Bayesian and Maximum Likelihood Decoding
309
8.6 Linear Decoding and Gallager-Tanner-Wiberg Algorithm
310
9.1 Least Squares Method
357
9.2 Smoothing and Filtering in Linear Dynamic System
361
10.1 Gaussian Time-Discrete Dynamic Systems
395
10.2 Kaiman Filter: Filtering Problem
418
10.3 Gaussian Bayesian Networks
423
www.it-ebooks.info
List of Figures
I.l
The graph associated with the inference problem of Exercise 1.1.
A.l
A tree of diagnoses for stubborn cars.
21
2.1
Undirected graphs.
30
2.2
Directed graphs.
31
2.3
A labeled graph.
31
2.4
Hypergraph, primal graph and dual graph.
32
2.5
Possibilities of knowledgebase representation.
33
2.6
Bayesian network of a medical example.
35
2.7
A digital circuit of a binary adder.
39
2.8
The result table of a full adder circuit.
39
2.9
Connecting two 1-bit-adders produces a 2-bit-adder.
40
2.10
Input and output of a discrete Fourier transform.
42
2.11
Function board of a discrete Fourier transform.
44
xxvi
xvii
www.it-ebooks.info
XVÜi
LIST OF FIGURES
2.12
A hidden Markov chain.
46
3.1
Finalization of the graphical fusion process.
56
3.2
Extending a labeled tree to a join tree.
56
3.3
The bucket-tree of Example 3.5.
59
3.4
The join tree obtained from the bucket-tree of Figure 3.3.
60
3.5
The join tree for Example 1.2.
61
3.6
Different elimination sequences produce different join trees.
63
3.7
A primal graph and its induced graph.
64
3.8
A covering join tree of an inference problem.
76
3.9
The fusion algorithm as message-passing scheme.
77
3.10
A covering join tree of Instance 2.1.
79
3.11
A covering join tree of Instance 2.1.
81
3.12
A graph with two possible triangulations.
83
3.13
The triangulated primal graph of Instance 2.1.
86
3.14
Join graph and derived join tree.
86
3.15
A complete run of the collect algorithm.
89
3.16
Join tree of a discrete Fourier transform.
99
4.1
Changing the root of a join tree.
110
4.2
Mailboxes in the Shenoy-Shafer architecture.
111
4.3
Join tree to illustrate the Shenoy-Shafer architecture.
113
4.4
Using non-binary join trees creates redundant combinations.
117
4.5
Larges treewidth due to non-binary join trees.
117
4.6
The performance gain due to the identity element 1.
118
4.7
The performance gain due to the identity element 2.
119
4.8
Illustration of the super-cluster approach.
121
4.9
Separator in the HUGIN architecture.
129
4.10
The path between node 1 and node 6 in the join tree of Figure 4.3. 140
4.11
A graphical summary of local computation architectures.
www.it-ebooks.info
154
LIST OF FIGURES
XIX
D. 1
Embedding a separative valuation algebra into a union of groups.
157
D.2
Decomposing a regular valuation algebra into a union of groups.
165
5.1
Semiring valuation algebras with neutral and null elements.
199
E. 1
Embedding a separative semigroup into a union of groups.
207
E.2
Decomposing a regular semigroup into a union of groups.
210
E.3
Embedding a cancellative semigroup into a group.
213
E.4
Semiring valuation algebras and division-related properties.
216
6.1
The connectivity path problem.
221
6.2
The shortest distance problem.
222
6.3
The maximum capacity problem.
222
6.4
The maximum reliability problem.
223
6.5
Determining the language of an automaton.
224
6.6
Path sets in cycled graphs may be infinite.
225
6.7
The adjacency matrix of a directed, weighted graph.
226
6.8
An interpretation of equation (6.32).
250
6.9
The path counting problem.
257
6.10
Interpreting computations in Markov chains as path problems.
259
6.11
The computational graph of a function.
260
8.1
A binary, unreliable, memory less communication channel.
310
8.2
The join tree that belongs to the node factors of Example 8.4.
314
8.3
A serial connection of unreliable, memoryless channels.
318
8.4
A parallel connection of unreliable, memoryless channels.
319
9.1
Zero-pattern of a linear equation system.
325
9.2
The join tree for Figure 9.1.
326
9.3
The join tree induced by the variable elimination in Example 9.1.
334
9.4
Recursive buildup of lower triangular matrices.
350
9.5
The join tree of Example 9.3.
353
9.6
The graph representing the non-zero pattern of Example 9.4.
355
www.it-ebooks.info
XX
LIST OF FIGURES
9.7
The join tree of the triangulated graph in Figure 9.6.
9.8
The join tree covering the equations of the linear dynamic system. 362
9.9
A path problem with values from a partially ordered semiring.
379
10.1
A causal model for the wholesale price of a car.
388
10.2
The graph of the time-discrete dynamic system.
396
10.3
A covering join tree for the system (10.6).
397
10.4
Influence among variables in the Kaiman filter model.
419
10.5
A covering join tree for the Kaiman filter model.
419
www.it-ebooks.info
355
Acknowledgments
The authors would like to thank our collaborators from the Department of Informatics
at the University of Fribourg (Switzerland), the Cork Constraint Computation Centre
at the University College Cork (Ireland) and the Interdisciplinary Centre for Security,
Reliability and Trust at the University of Luxembourg for the many interesting
discussions that helped to improve the content and presentation of this book. In
particular, we are grateful for the indispensable support of Christian Eichenberger
during our journey through the Gaussian world and to Radu Marinescu and Nie Wilson
for providing expert knowledge regarding inference, search methods and semiring
systems. Jutta Langel and Jacek Jonczy both corrected parts of this book and provided
valuable feedback. Special thank also goes to Cassie Craig from John Wiley & Sons,
Inc. for the editing support and to the Swiss National Science Foundation, which
financed the research stay of Marc Pouly at the Cork Constraint Computation Centre.
Marc Pouly & Jiirg Kohlas
XXI
www.it-ebooks.info
Introduction
Generic Algorithms
Abstract mathematical structures usually have a large number of models. Algorithms
based on operations and laws of such structures therefore have an identical form:
they are generic. This means that for each particular instance of the structure, only
the basic operations have to be adapted, whereas the overall algorithm remains the
same. The simplest and typical problem is the one of sorting. It applies to totally
ordered structures with an order relation <. Any sorting algorithm can be formulated
in terms of the order relation < and it can be adapted to any particular model of an
order relation by specifying this relation; for example for integers, real numbers or for
a lexicographical order relation of alphanumeric strings. In addition, many programming languages allow for generic programming, i.e. in their syntax they provide the
means to formulate generic algorithms and to specialize them to particular instances.
In this book, an abstract algebraic structure called valuation algebra is proposed.
It provides the foundation to formulate an abstract inference problem, to construct
a graphical data structure and a generic inference algorithm to solve the inference
problem. Usually, one arrives at such general structures from concrete problems and
algorithms by lifting them to their most abstract form. This also holds for the present
case. In 1988, Lauritzen and Spiegelhalter (Lauritzen & Spiegelhalter, 1988) proposed an algorithm for solving the inference problem for Bayesian networks based
XXIII
www.it-ebooks.info
XXÏV
INTRODUCTION
on a paradigm called local computation. Soon after, Shenoy and Shafer (Shafer &
Shenoy, 1988; Shenoy & Shafer, 1990) noted that the same algorithm could also be
applied to solve inference problems with belief functions. They proposed a small
but sufficient system of axioms for an algebraic framework that makes possible the
application of the generic inference algorithm. The algebraic theory of valuation
algebras was developed in (Kohlas, 2003), and it was shown that they permit, under
certain varying additional conditions, four different generic inference architectures.
In (Pouly, 2008) a computer implementation of these generic architectures together
with a number of concrete instances is described, see also (Pouly, 2010).
In the meantime, many different models of valuation algebras from very distant fields of Mathematics and Computer Science were identified. They range from
probabilistic and statistical systems, different uncertainty formalisms to constraint
systems, passing by various systems of logic, relational databases and systems of
linear equations over fields and semirings. Many of these instances are presented
in this book. Their computational interest can always be expressed in terms of the
general inference problem, which can be solved by the same generic algorithm or
inference architectures. This is first and foremost of great practical interest. Instead
of developing and programming inference algorithms for each instance separately,
it is possible to use a single generic system. Of course, in the past, individual solution algorithms and computer programs have been proposed and written mostly
independent from each other, using the same basic concepts but often naming them
quite differently. This adds to a certain Babylonian confusion, especially for students
who should get an overall view of Computer Science. Further, it also represents a
dissipation of valuable human resources in research and problem solving.
Second, the process of abstraction is of great theoretical importance. It provides a
unifying view of seemingly different fields of Computer Science and allows to elaborate both similarities and differences between problem classes. In fact, valuation
algebras allow for a very appealing general view of information processing: information comes in pieces, usually from different sources. Each piece of information
concerns a particular domain and answers, possibly only partially, specific questions.
Pieces of information can be aggregated or combined, which is one of the basic operations of a valuation algebra. So, the general inference problem essentially consists
of combining all available information. But usually one is only interested in some
particular facet or aspect of the total information. Therefore, the parts of the total
information, which are relevant to precise questions, must be extracted. Extraction
or projection of information is the second basic operation of a valuation algebra.
In relational databases for example, combination corresponds to the join operation,
whereas information extraction corresponds to the usual operation of projection in the
relational algebra. So, valuation algebras show that this scheme of relational algebra
is much more general. The form and nature of information elements may be very
different from model to model.
www.it-ebooks.info
- Xem thêm -