
CSC242 Intro to AI

Project 3: Uncertain Inference

In this project you will get some experience with uncertain inference by implementing some of the algorithms for it from the textbook and evaluating your results. We will focus on Bayesian networks, since they are popular, well-understood, and well-explained in the textbook. They are also the basis for many other formalisms used in AI for dealing with uncertain knowledge.

Requirements Overview

1. You must design and implement a framework for representing Bayesian networks for random variables with finite ranges, as seen in class and in the textbook (Chapter 13; 3rd ed. Chapter 14).

2. You must implement the Inference by Enumeration algorithm for exact inference described in AIMA Section 13.3/14.4.

3. You must implement the Rejection Sampling and Likelihood Weighting algorithms for approximate inference described in AIMA Section 13.4/14.5.

4. You must implement a method for reading networks from files, following a format known as XMLBIF.

We will run your programs and test them on a number of queries for a number of networks. Your grade will be based on how many of them your programs answer correctly. Your programs must be able to read networks from files.

Each of these are described in more detail in subsequent sections, and there is starter code posted with the project if you don’t want to design and build it yourself.

Background (AIMA 4th ed. Chapter 13, 3rd ed. Chapter 14)

A Bayesian network is a directed acyclic graph whose vertices are the random variables {X} ∪ E ∪ Y, where:

• X is the query variable

• E are the evidence (observed) variables

• Y are the unobserved (or hidden) variables

The inference problem for Bayesian networks is to compute P(X | e), where e are the observed values of the evidence variables. That is, inference in a Bayesian network means computing the posterior distribution of the query variable given the evidence: a probability for each possible value of the query variable, given the evidence.

In general, from the definition of conditional probability and the process of marginalization, we have that:

P(X | e) = α P(X, e) = α Σ_y P(X, e, y)        (AIMA Eq. 12.9/13.9)

And for a Bayesian network you can factor that full joint distribution into a product of the conditional probabilities stored at the nodes of the network:

P(x1, ..., xn) = Π_{i=1}^{n} P(xi | parents(Xi))        (AIMA Eq. 13.2/14.2)

Therefore, for a Bayesian network:

P(X | e) = α Σ_y Π_{i=1}^{n} P(xi | parents(Xi))

Or in words: “a query can be answered from a Bayesian Network by computing sums of products of conditional probabilities from the network” (AIMA p. 428/523).  These equations are the basis for the various inference algorithms for Bayesian networks.
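For instance, for the burglary network (AIMA Fig. 13.2/14.2), the query P(B | j, m) expands to

P(B | j, m) = α Σ_e Σ_a P(B) P(e) P(a | B, e) P(j | a) P(m | a)

where the hidden variables are Earthquake (E) and Alarm (A): two nested sums, each term a product of five numbers looked up in the network's tables.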

Designing an Implementation of Bayesian Networks

You will need to implement a representation of Bayesian networks and several inference algorithms that use the representation. Before you can do that, you need to understand what these things are. Really understand.

So: what is a Bayesian network?

THINK ABOUT THIS YOURSELF NOW, THEN READ ON.

DID YOU THINK ABOUT IT?

REALLY?

WITH A WHITEBOARD OR A PIECE OF PAPER IN FRONT OF YOU? IF SO, READ ON.

I thought first of a directed graph.  A directed graph is a set of nodes (vertices) with directed edges between them. Everyone in CSC242 should be comfortable with graphs. If you’re a bit rusty on them, either look at your notes from CSC172 or go find the textbook used in CSC173 online (Chapter 9 is on graphs).

In a Bayesian network, each node in the graph is associated with a random variable. Each random variable has a name and a range of possible values. We will assume finite ranges for this project.

Each node in the graph also stores a probability distribution. Root nodes (with no parents) store the prior probability distribution of their random variable. Non-root nodes store the conditional probability distribution of their random variable given the values of its parents: P(Xi | parents(Xi)).

Next question: What is a Bayesian network inference problem?

THINK ABOUT THIS YOURSELF NOW, THEN READ ON.

A Bayesian network inference problem involves computing the distribution P(X | e), where X is the query variable and e is an assignment of values to the evidence variables (note the similarity with our terminology from Unit 2).

The answer to a Bayesian network inference problem is a posterior distribution for the query variable given the evidence. That is, for each possible value of the query variable, it’s the probability that the query variable takes on that value, given the evidence. Another way of thinking about distributions is that they are a mapping from values of a variable to probabilities—the probability that the variable takes on that value, given the evidence. Note that distributions must satisfy the axioms of probability.

A Bayesian network inference problem solver takes a Bayesian network (any Bayesian network) and an inference problem (any inference problem, query and evidence, using variables from the network), and has one or more methods that compute and return the posterior probability distribution of the query variable given the evidence.  The solver may have additional properties or parameters if needed.
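To make these relationships concrete, here is one possible shape for the abstract layer, as a minimal Java sketch. All of the names are illustrative; they are not necessarily the ones used in the posted starter code (package bn.core).

import java.util.List;
import java.util.Map;

// One variable with a name and a finite range of values.
interface RandomVariable {
    String getName();
    List<String> getRange();
}

// A distribution over one variable's values (sums to 1 once normalized).
interface Distribution {
    double get(String value);
    void set(String value, double p);
    void normalize();
}

// The network: variables plus the probabilities stored at their nodes.
interface BayesianNetwork {
    List<RandomVariable> getVariablesSortedTopologically();
    RandomVariable getVariableByName(String name);
    // P(X = x | parents(X)), with the parents' values taken from the assignment
    double getProbability(RandomVariable X, String x,
                          Map<RandomVariable, String> assignment);
}

// A solver: takes any network and any query over its variables.
interface Inferencer {
    Distribution ask(BayesianNetwork bn, RandomVariable X,
                     Map<RandomVariable, String> evidence);
}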

Implementing Bayesian Networks

With the background fully understood, start by designing the main elements of your program. This should be done abstractly: define what the different pieces are and what they need to do but not how they do it. An important aspect of this is the relationships between the pieces.

If you will be programming in Java, I strongly recommend using abstract classes or interfaces for the abstract specifications. This allows you to concentrate on the relationships between the pieces rather than the details of how to make them do what they need to do. You will implement the interfaces after you have the design right. Non-Java programmers can do similar things, although why wouldn’t you use Java for a project like this?

So: networks (graphs), variables, ranges, assignments. These should be fairly straightforward. Use Java Collections where appropriate, or something similar in another language, although why wouldn’t you use Java for a project like this?

One tricky thing to design is the representation of probability distributions. You need to be able to represent in the computer both prior (unconditional) distributions P(X) and posterior (conditional) distributions P(X | Y, Z, ...).

Think about it.

SERIOUSLY, THINK ABOUT IT. THEN READ ON.

Figure 1: Example of a small Bayesian network possibly useful during development

We said earlier that for random variables with finite ranges such as we are considering, a prior (or unconditional) probability distribution is a mapping from the possible values of a variable to the probability that the variable has that value. So a mapping... could be something involving a hashtable, right? But also consider that the ranges are generally not very large, so maybe a simple array or list of value-probability pairs would be easier and perhaps even faster in practice.
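For example, here is a minimal sketch of a map-based distribution; the normalize method is the α from the equations above. The class name is illustrative.

import java.util.LinkedHashMap;
import java.util.Map;

// A finite distribution as a map from values to probabilities. A list of
// value-probability pairs would work just as well for small ranges.
public class SimpleDistribution {
    private final Map<String, Double> probs = new LinkedHashMap<>();

    public void set(String value, double p) { probs.put(value, p); }

    public double get(String value) { return probs.getOrDefault(value, 0.0); }

    // Rescale so the probabilities sum to 1 (the alpha in the equations).
    public void normalize() {
        double total = 0.0;
        for (double p : probs.values()) total += p;
        for (Map.Entry<String, Double> e : probs.entrySet()) {
            e.setValue(e.getValue() / total);
        }
    }
}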

For random variables with finite ranges, a posterior (or conditional) probability distribution is a table (a conditional probability table, or CPT). Each CPT is “for” the variable at that node. Each row of the table corresponds to an assignment of values to its parents’ variables. The columns of a CPT give the probabilities for each value of the variable, given the assignment for the row. So the table can be thought of as a mapping from assignments to distributions, although that might not be the easiest way to implement them.
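One minimal sketch of that idea keys the rows on a canonical string for the parent assignment; a real implementation might use a proper Assignment class as the key instead (see the warning below). All names are illustrative.

import java.util.HashMap;
import java.util.Map;

// A CPT as a mapping from parent assignments to rows. The key is a
// canonical string such as "B=true,E=false", built by listing the
// parents in a fixed order.
public class CPT {
    private final Map<String, Map<String, Double>> rows = new HashMap<>();

    public void setEntry(String parentAssignment, String value, double p) {
        rows.computeIfAbsent(parentAssignment, k -> new HashMap<>())
            .put(value, p);
    }

    // P(X = value | parents = parentAssignment)
    public double get(String parentAssignment, String value) {
        return rows.get(parentAssignment).get(value);
    }
}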

One warning for Java programmers: hashtables whose keys are not instances of built-in classes can be tricky. Be sure you understand how hashCode and equals work (see the documentation for java.lang.Object for a discussion).
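For instance, a hypothetical key class pairing a variable name with a value needs both methods overridden consistently, or hashtable lookups will silently fail:

import java.util.Objects;

// If objects like this are used as hashtable keys, equals and hashCode
// must agree: equal objects must produce equal hash codes.
public class Pair {
    private final String variable;
    private final String value;

    public Pair(String variable, String value) {
        this.variable = variable;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pair)) return false;
        Pair other = (Pair) o;
        return variable.equals(other.variable) && value.equals(other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(variable, value);
    }
}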

So now you’ve got the basic framework of classes set up. You should be able to create a small Bayesian network in code, perhaps like the one shown in Figure 1. You could also create the Burglary/Earthquake Alarm network (AIMA Fig. 13.2/14.2) and the Wet Grass example (AIMA Fig. 13.15/14.12) by hand if you want.

Time to think about inference. And then you still need to tackle the problem of reading networks from files.

Exact Inference

For the first part of the project, you must implement the “inference by enumeration” algorithm described in AIMA Sec. 13.3/14.4.  Pseudocode for the algorithm is given in Fig. 13.11/14.9 and Sec. 13.3.2/14.4.2 suggests some speedups for you to consider.

You should be able to turn the pseudocode into real code using your classes and interfaces (or whatever you Java haters are doing). I actually copy the pseudocode into my Java source file, put “//” in front of each line to make it into comments, and use it as a guide as I write the code.

One comment from hard experience: the inference by enumeration algorithm requires that the variables be set in topological order (so that by the time you need to look up a probability in a variable’s node’s conditional probability table, you have the values of its parents). I don’t think that the pseudocode in the textbook makes this clear.
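Here is a sketch of how the pseudocode might come out against the interfaces and SimpleDistribution class sketched earlier (a sketch, not the posted solution); note that the variable list must come back in topological order:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// ENUMERATION-ASK (AIMA Fig. 13.11/14.9). The topological ordering
// guarantees a variable's parents are assigned before it is reached.
public class EnumerationInferencer {

    public SimpleDistribution ask(BayesianNetwork bn, RandomVariable X,
                                  Map<RandomVariable, String> e) {
        SimpleDistribution Q = new SimpleDistribution();
        for (String x : X.getRange()) {
            Map<RandomVariable, String> extended = new HashMap<>(e);
            extended.put(X, x);
            Q.set(x, enumerateAll(bn, bn.getVariablesSortedTopologically(), extended));
        }
        Q.normalize();
        return Q;
    }

    private double enumerateAll(BayesianNetwork bn, List<RandomVariable> vars,
                                Map<RandomVariable, String> assignment) {
        if (vars.isEmpty()) return 1.0;
        RandomVariable Y = vars.get(0);
        List<RandomVariable> rest = vars.subList(1, vars.size());
        if (assignment.containsKey(Y)) {
            // Y has a value: multiply in its probability and recurse.
            return bn.getProbability(Y, assignment.get(Y), assignment)
                    * enumerateAll(bn, rest, assignment);
        }
        // Y is hidden: sum over its possible values.
        double sum = 0.0;
        for (String y : Y.getRange()) {
            assignment.put(Y, y);
            sum += bn.getProbability(Y, y, assignment) * enumerateAll(bn, rest, assignment);
        }
        assignment.remove(Y);  // undo the extension before returning
        return sum;
    }
}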

You should be able to have your implementation answer queries on the small network that you built by hand. You can compute the correct answers by hand for any choice of probabilities in the network. You could also test your implementation on the Alarm and Wet Grass networks if you created them, since there are exact answers for one query for each of them in the book (and you could work out more by hand).

Approximate Inference

You must also implement algorithms for approximate inference in Bayesian networks. The algorithms described in the textbook (AIMA 13.4/14.5) and in class are:

1.  Rejection sampling

2.  Likelihood weighting

3. Gibbs sampling

The first two are straightforward to implement once you have the representation of Bayesian networks and their components, but they can be very inefficient. Gibbs sampling is not that hard, the only complication being sampling from the distribution of a variable given its Markov blanket (see AIMA Eq. 13.10/14.12). Sketches of the first two follow.
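A sketch of rejection sampling, again against the hypothetical interfaces and SimpleDistribution class above: generate complete samples in topological order, discard those that contradict the evidence, and count the rest.

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Rejection sampling: estimate P(X | e) from the samples that happen to
// agree with the evidence. Inefficient when the evidence is unlikely.
public class RejectionSampler {
    private final Random rng = new Random();

    public SimpleDistribution ask(BayesianNetwork bn, RandomVariable X,
                                  Map<RandomVariable, String> e, int numSamples) {
        SimpleDistribution counts = new SimpleDistribution();
        for (String x : X.getRange()) counts.set(x, 0.0);
        for (int i = 0; i < numSamples; i++) {
            Map<RandomVariable, String> sample = priorSample(bn);
            if (consistent(sample, e)) {
                String x = sample.get(X);
                counts.set(x, counts.get(x) + 1.0);
            }
        }
        counts.normalize();  // assumes at least one sample survived
        return counts;
    }

    // Sample every variable, in topological order, from P(Y | parents(Y)).
    private Map<RandomVariable, String> priorSample(BayesianNetwork bn) {
        Map<RandomVariable, String> sample = new HashMap<>();
        for (RandomVariable Y : bn.getVariablesSortedTopologically()) {
            double r = rng.nextDouble(), cumulative = 0.0;
            String chosen = null;
            for (String y : Y.getRange()) {
                cumulative += bn.getProbability(Y, y, sample);
                if (r <= cumulative) { chosen = y; break; }
            }
            if (chosen == null) {  // guard against floating-point rounding
                chosen = Y.getRange().get(Y.getRange().size() - 1);
            }
            sample.put(Y, chosen);
        }
        return sample;
    }

    private boolean consistent(Map<RandomVariable, String> sample,
                               Map<RandomVariable, String> e) {
        for (Map.Entry<RandomVariable, String> ev : e.entrySet()) {
            if (!ev.getValue().equals(sample.get(ev.getKey()))) return false;
        }
        return true;
    }
}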

You must implement the first two algorithms. You would learn a lot by also implementing Gibbs Sampling, but it is not required and we will not give extra credit for it. In fact, there are no opportunities for extra credit on this project.
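For the second required algorithm, likelihood weighting, the only change is in how samples are generated: evidence variables are fixed at their observed values, and each sample carries a weight. A sketch under the same assumptions:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Likelihood weighting: never reject a sample; instead weight it by the
// likelihood of the evidence given the sampled parent values.
public class LikelihoodWeightingSampler {
    private final Random rng = new Random();

    public SimpleDistribution ask(BayesianNetwork bn, RandomVariable X,
                                  Map<RandomVariable, String> e, int numSamples) {
        SimpleDistribution W = new SimpleDistribution();
        for (String x : X.getRange()) W.set(x, 0.0);
        for (int i = 0; i < numSamples; i++) {
            Map<RandomVariable, String> sample = new HashMap<>();
            double weight = 1.0;
            for (RandomVariable Y : bn.getVariablesSortedTopologically()) {
                if (e.containsKey(Y)) {
                    // Evidence: fix the observed value and multiply its
                    // probability into the weight.
                    String y = e.get(Y);
                    weight *= bn.getProbability(Y, y, sample);
                    sample.put(Y, y);
                } else {
                    // Non-evidence: sample from P(Y | parents(Y)).
                    double r = rng.nextDouble(), cumulative = 0.0;
                    String chosen = Y.getRange().get(Y.getRange().size() - 1);
                    for (String y : Y.getRange()) {
                        cumulative += bn.getProbability(Y, y, sample);
                        if (r <= cumulative) { chosen = y; break; }
                    }
                    sample.put(Y, chosen);
                }
            }
            String x = sample.get(X);
            W.set(x, W.get(x) + weight);
        }
        W.normalize();
        return W;
    }
}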

Reading Bayesian Networks From Files

Creating Bayesian networks by hand using code is incredibly tedious and error-prone. So you must also be able to read in networks from files. This is the only way that we can test your program, since we aren’t going to do a bunch of programming to set up a new problem.

We will give you Java code for parsers that read two quasi-standard file representations for Bayesian networks: the original BIF format (Cozman, 1996) and its successor XMLBIF (Cozman, 1998). The downloads for this project include several example networks (see below for descriptions), and there are many others available on the Internet.

If you develop your own representation classes, which I highly recommend, you should definitely be able to use the XMLBIF parser with a little reverse-engineering.  The BIF parser might be harder, since it was itself generated from a grammar of the BIF format using ANTLR.

Whatever you do, your program must be able to read XMLBIF files. If you can also read BIF files, let us know in your README.
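If you are curious what reading the format involves, here is a sketch of walking an XMLBIF file with the standard Java DOM API. The tag names (VARIABLE, OUTCOME, DEFINITION, FOR, GIVEN, TABLE) come from the XMLBIF specification; the posted bn.parser code does all of this properly, so this is for orientation only.

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Orientation only: print what an XMLBIF file declares.
public class XmlBifSketch {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File(args[0]));
        // Each VARIABLE element declares a name and its finite range.
        NodeList vars = doc.getElementsByTagName("VARIABLE");
        for (int i = 0; i < vars.getLength(); i++) {
            Element v = (Element) vars.item(i);
            String name = v.getElementsByTagName("NAME").item(0).getTextContent();
            int nValues = v.getElementsByTagName("OUTCOME").getLength();
            System.out.println("variable " + name + " with " + nValues + " values");
        }
        // Each DEFINITION gives a variable (FOR), its parents (GIVEN,
        // repeated), and the flattened entries of its CPT (TABLE).
        NodeList defs = doc.getElementsByTagName("DEFINITION");
        for (int i = 0; i < defs.getLength(); i++) {
            Element d = (Element) defs.item(i);
            String forVar = d.getElementsByTagName("FOR").item(0).getTextContent();
            int nParents = d.getElementsByTagName("GIVEN").getLength();
            String[] table = d.getElementsByTagName("TABLE").item(0)
                    .getTextContent().trim().split("\\s+");
            System.out.println(forVar + ": " + nParents + " parents, "
                    + table.length + " CPT entries");
        }
    }
}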

Running Your Programs

Your implementation must be able to handle different problems and queries. For exact inference, your program must accept the following arguments on the command-line:

• The filename of the XMLBIF encoding of the Bayesian network.

• The name of the query variable, matching one of the variables defined in the file.

• The names and values of evidence variables, again using names and range values as defined in the file.

So for example, if this was a Java program, you would use the following to invoke it on the alarm example:

java -cp ... MyBNInferencer aima-alarm.xml B J true M true

That is, load the network from the XMLBIF file aima-alarm.xml; the query variable is B, and the evidence variables are J with value true and M also with value true.

The “wet grass” example from the textbook (also seen in class) requires the following:

java -cp ... MyBNInferencer aima-wet-grass.xml R S true

The network is in the XMLBIF file aima-wet-grass.xml, the query variable is R (for Rain), and the evidence is that S (Sprinkler) has value true.

The output of your program must be the posterior distribution of the query variable given the evidence. That is, print out the probability of each possible value of the query variable given the evidence.
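A sketch of a main method following this argument convention (the class name matches the example above; loading and inference are left as comments since they depend on your own classes):

import java.util.LinkedHashMap;
import java.util.Map;

// Parse the command-line arguments described above, then run inference.
public class MyBNInferencer {
    public static void main(String[] args) throws Exception {
        String filename = args[0];
        String queryName = args[1];
        // Remaining arguments come in name/value pairs.
        Map<String, String> evidence = new LinkedHashMap<>();
        for (int i = 2; i + 1 < args.length; i += 2) {
            evidence.put(args[i], args[i + 1]);
        }
        // ... load the network from filename, look up the query variable
        // and evidence variables by name, run inference, and print one
        // line per value of the query variable, e.g. "P(B = true | e) = 0.284".
    }
}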

Your README must make it very clear how to run your program and specify these parameters. If you cannot make them work as described above, you should check with the TAs before the deadline.  You can also use make or similar tools for running your programs from the command-line.

Do not submit Eclipse projects that can only be run using “Run Configurations.”

For running approximate inferencers, we must be able to specify the number of samples to be used for the approximation. This must be the first parameter to your program. For example:

java -cp ... MyBNApproxInferencer 1000 aima-alarm.xml B J true M true

This specifies that 1000 samples be used for the run. The distribution of the random variable B will be printed at the end of the run. If you need additional parameters, document their use carefully in your README.

Bayesian Network Examples

You should test your inference problem solvers on the AIMA Alarm example (Fig. 13.2/14.2) and the AIMA Wet Grass example (Fig. 13.15/14.12). You should try different combinations of query variable and evidence. You can work out the correct answers by hand if necessary. Note that these both use only boolean variables, but in general variables may have any (finite) range.

Here are a few more examples of Bayesian networks and references to the literature where they were introduced. We may use some of these to test your program (in addition to the AIMA problems), and we may use some that are not listed here.

• The “dog problem” network from (Charniak, 1991).

• The “car starts” problem from (Heckerman et al., 1995).

• The SACHS protein-signaling network from (Sachs et al., 2005).

• The ALARM network for determining whether to trigger an alarm in a patient monitoring system, from (Beinlich et al., 1989).

• The INSURANCE network from (Binder et al., 1997).

• The HAILFINDER network from (Abramson et al., 1996).

Project Downloads

You should try to write the code for this project yourself.

TRY IT. YOU WILL LEARN THE MOST IF YOU DO THIS YOURSELF.

If you give it a solid try and just can’t get it right, we have provided some resources for you:

• You may download the javadoc documentation for my implementation of the project. I strongly urge you to try it yourself before you look at mine.

File: CSC242-project-03-doc.zip

• If you really can’t write the abstract specifications, you may download mine (package bn.core).

File: CSC242-project-03-core.zip

• If you really can’t implement the data structures, you may download mine (packages bn.base and bn.util).

File: CSC242-project-03-base.zip

• You may download the code for the BIF and XMLBIF parsers, for use with your own code or with mine, as described above (package bn.parser).

File: CSC242-project-03-parser.zip

• You may download the set of examples, including examples of creating Bayesian networks manually for the AIMA burglary and wet grass examples (package bn.examples).

File: CSC242-project-03-examples.zip

You must still implement the inference algorithms by yourself. Not only will you learn more by doing the rest yourself, it is often harder to figure out somebody else’s code than to design and write your own. The javadoc bundle includes the documentation of my package bn.inference if you want some suggestions for what your methods might look like.

I strongly suggest that you avoid the “Java Bayes” website for the duration of this project (if you want to learn anything and not run into Academic Honesty issues).

Additional Requirements and Policies

The short version:

• You may only use Java or Python. I STRONGLY recommend Java.

• You must use good object-oriented design (yes, even in Python).

• There are other language-specific requirements detailed below.

• You must submit a ZIP including your source code, a README, and a completed submission form by the deadline.

• You must tell us how to build your project in your README.

• You must tell us how to run your project in your README.

•  Projects that do not compile will receive a grade of 0.

•  Projects that do not run or that crash will receive a grade of 0 for whatever parts did not work.

•  Late projects will receive a grade of 0 (see below regarding extenuating circumstances).

• You will learn the most if you do the project yourself, but collaboration is permitted in groups of up to 3 students.

Programming Requirements

• You may use Java or Python for this project.

– I STRONGLY recommend that you use Java.

– Any sample code we distribute will be in Java.

– Other languages (Haskell, Clojure, Lisp, etc.) only by prior arrangement with the instructor.

• You must use good object-oriented design.

– You must have well-designed classes (and perhaps interfaces).

– Yes, even in Python.

• No giant main methods or other unstructured chunks of code.

– Yes, even in Python.

• Your code should use meaningful variable and function/method names and have plenty of meaningful comments.

– I can’t believe that I even have to say that.

– And yes, even in Python.

Submission Requirements

You must submit your project as a ZIP archive containing the following items:

1. The source code for your project.

2. A file named README.txt or README.pdf (see below).

3. A completed copy of the submission form posted with the project description (details below).

Your README must include the following information:

1. The course: “CSC242”

2. The assignment or project (e.g., “Project 1”)

3. Your name and email address

4. The names and email addresses of any collaborators (per the course policy on collaboration)

5.  Instructions for building and running your project (see below).

The purpose of the submission form is so that we know which parts of the project you attempted and where we can find the code for some of the key required features.

• Projects without a submission form or whose submission form does not accurately describe the project will receive a grade of 0.

•  If you cannot complete and save a PDF form, submit a text file containing the questions and your (brief) answers.

Project Evaluation

You must tell us in your README file how to build your project (if necessary) and how to run it.

Note that we will not load projects into Eclipse or any other IDE. We must be able to build and run your programs from the command-line. If you have questions about that, go to a study session.

We must be able to cut-and-paste from your documentation in order to build and run your code. The easier you make this for us, the better your grade will be. It is your job to make the building of your project easy and the running of its program(s) easy and informative.

For Java projects:

• The current version of Java as of this writing is: 19.0.2 (OpenJDK)

•  If you provide a Makefile, just tell us in your README which target to make to build your project and which target to make to run it.

• Otherwise, a typical instruction for building a project might be:

javac *.java

Or for an Eclipse project with packages in a src folder and classes in a bin folder, the following command can be used from the src folder:

javac -d ../bin `find . -name '*.java'`

• And for running, where MainClass is the name of the main class for your program:

java MainClass [arguments if needed per README]

or

java -cp ../bin pkg.subpkg.MainClass [arguments if needed per README]

• You must provide these instructions in your README.

For Python projects:

I strongly recommend that you not use Python for projects in CSC242. All CSC242 students have at least two terms of programming in Java. The projects in CSC242 involve the representations of many different types of objects with complicated relationships between them and algorithms for computing with them. That’s what Java was designed for.

But if you insist. . .

• The latest version of Python as of this writing is: 3.11.2 (python.org).

• You must use Python 3 and we will use a recent version of Python to run your project.

• You may NOT use any non-standard libraries. This includes things like NumPy or pandas. Write your own code—you’ll learn more that way.

• We will follow the instructions in your README to run your program(s).

• You must provide these instructions in your README.

For ALL projects:

We will NOT under any circumstances edit your source files. That is your job.

Projects that do not compile will receive a grade of 0. There is no way to know if your program is correct solely by looking at its source code (although we can sometimes tell that it is incorrect).

Projects that do not run or that crash will receive a grade of 0 for whatever parts did not work. You earn credit for your project by meeting the project requirements. Projects that don’t run don’t meet the requirements.

Any questions about these requirements: go to study session BEFORE the project is due.

Late Policy

Late projects will receive a grade of 0. You must submit what you have by the deadline. If there are extenuating circumstances, submit what you have before the deadline and then explain yourself via email.

If you have a medical excuse (see the course syllabus), submit what you have and explain yourself as soon as you are able.

Collaboration Policy

I assume that you are in this course to learn. You will learn the most if you do the projects yourself.

That said, collaboration on projects is permitted, subject to the following requirements:

• Teams of no more than 3 students, all currently taking CSC242.

• Any team member must be able to explain anything they or their team submits, IN PERSON AT ANY TIME, at the instructor’s or TA’s discretion.

• One member of the team should submit code on the team’s behalf in addition to their writeup. Other team members must submit a README (only) indicating who their collaborators are.

• All members of a collaborative team will get the same grade on the project.

Academic Honesty

I assume that you are in this course to learn. You will learn nothing if you don’t do the projects yourself.

Do not copy code from other students or from the Internet.

Avoid Github and StackOverflow completely for the duration of this course. There is code out there for all these projects. You know it. We know it.

Posting homework and project solutions to public repositories on sites like GitHub is a violation of the University’s Academic Honesty Policy, Section V.B.2 “Giving Unauthorized Aid.” Honestly, no prospective employer wants to see your coursework. Make a great project outside of class and share that instead to show off your chops.

References

Abramson, B., J. Brown, W. Edwards, A. Murphy, and R. L. Winkler (1996). Hailfinder: A Bayesian system for forecasting severe weather. International Journal of Forecasting, 12(1):57-71.

Andreassen, S., R. Hovorka, J. Benn, K. G. Olesen, and E. R. Carson (1991). A Model-based Approach to Insulin Adjustment. In Proceedings of the 3rd Conference on Artificial Intelligence in Medicine, pp. 239-248. Springer-Verlag.

Beinlich, I., H. J. Suermondt, R. M. Chavez, and G. F. Cooper (1989). The ALARM Monitoring System: A Case Study with Two Probabilistic Inference Techniques for Belief Networks. Proceedings of the 2nd European Conference on Artificial Intelligence in Medicine, pp. 247–256.

Binder, J., D. Koller, S. Russell, and K. Kanazawa (1997). Adaptive Probabilistic Networks with Hidden Variables. Machine Learning, 29(2-3):213–244.

Charniak, E. (1991). Bayesian Networks Without Tears. AI Magazine, Winter 1991, pp. 50–63.

Cozman, F. (1996). The Interchange Format for Bayesian Networks [BIF]. Website at http://sites.poli.usp.br/p/fabio.cozman/Research/InterchangeFormat/xmlbif02.html (accessed March 2019).

Cozman, F. (1998). The Interchange Format for Bayesian Networks [XMLBIF]. Website at http://sites.poli.usp.br/p/fabio.cozman/Research/InterchangeFormat/index.html (accessed March 2019).

Heckerman, D., J. S. Breese, and K. Rommelse (1995). Decision-Theoretic Troubleshooting. Communications of the ACM, 38:49-57.

Sachs, K., O. Perez, D. Pe’er, D. A. Lauffenburger and G. P. Nolan (2005).  Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data.   Science, 308:523-529.