Production system


Draft


Definition

A production system (PS) is a way of specifying and defining an algorithm. Its main feature is a set of so-called rewrite rules, which are also used to define generative grammars in linguistics and computer science.

Production systems were developed a few decades ago in the context of research on human information processing and knowledge representation. They are still used in various kinds of expert systems and intelligent tutoring systems. An XSLT processor can also be considered a production system when it executes templates.

Rewrite rules

Before introducing production systems we shall briefly explain the formal properties of such rules. Generally speaking, a rewrite rule takes a chain of symbols and rewrites it into another chain of symbols. For example, the rule

(hello xy) --> (goal yz) xz 

rewrites the string "(hello xy)" into the string

(goal yz) xz"

A generative grammar is a collection of rewrite rules which make it possible to translate some input into some output in successive steps. A good example is a parser, which translates a chain of symbols representing a sentence into an axiomatic chain of symbols defining the grammatical structure of the sentence. This is called bottom-up parsing, i.e. the resulting tree defines the input chain. The same procedure can be done in the opposite direction: starting from an axiom, a longer string of symbols can be generated. This procedure can be used in top-down parsing, e.g. generating possible structures until one matches the input. Here we will not use these linguistic grammars directly again.
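
As a small illustration (a sketch, not part of the original formalism), such a rewrite step can be expressed in a few lines of Python:

 # Minimal sketch: applying one rewrite rule to a chain of symbols,
 # in the spirit of "(hello xy) --> (goal yz) xz" above.
 def rewrite(chain, lhs, rhs):
     """Replace the first occurrence of lhs in chain by rhs."""
     return chain.replace(lhs, rhs, 1)

 print(rewrite("(hello xy)", "(hello xy)", "(goal yz) xz"))
 # prints: (goal yz) xz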

Production systems are in some sense a generalization of these rewrite grammars. First, their rules may be more complex. Second, there is also a formalized mechanism which uses these rules, i.e. an engine that applies this "grammar" to some data in some memory. Thus we can state the three essential components of a PS:

  1. A set of production rules (rewrite rules)
  2. A data base
  3. An interpreter which uses (1) and (2) in order to perform computations.

In a simple production system a production rule has the following general form:


if <SITUATION> then <ACTION>

This means that if the interpreter discovers a certain situation or condition in the data base which matches the "situation side" of a rule, the rule is triggered and ready to be used. Thus the first function of the interpreter is evaluative or interpretive: it looks through the data base in order to find rules it can possibly use. If it finds only one rule that matches a situation, that rule is executed, i.e. its action part is used to transform the data base or to do some other symbolic computation. Thus instead of "situation" and "action" we can also use the terms "if" and "then", or "condition" and "conclusion".
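
As a rough sketch (the representation is purely illustrative, not taken from any particular PS implementation), these three components can be written down as plain Python data, with each rule reduced to a name, a list of condition clauses and a list of action clauses:

 # Illustrative sketch of the three PS components as plain Python data.
 # Each rule is a (name, conditions, actions) triple of clause strings.
 rule = ("example-rule",
         ["some condition holds"],            # the "if" / situation side
         ["some conclusion is asserted"])     # the "then" / action side
 database = {"some condition holds"}          # working memory of symbolic facts
 rules = [rule]                               # the rule base used by the interpreter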

Let us now examine two rules inspired by a production system used to model the application of a law.


(define-rule decision-yes
 IF
 (person is submitted to authorization regime)
 (person gets an authorization)
 THEN
 (message: "The acquisition will be authorized")
 (the system made a final legal conclusion)
)
(define-rule authorization-regime-yes
 IF
 (person acquired real estate)
 (person is foreign)
 THEN
 (person is submitted to authorization regime)
)

You don't need to understand in detail what these rules do; just get the general flavor. The rules are written in a pseudo-English language. Each rule has a name, an "if-part" and a "then-part". The first rule (called decision-yes) checks two things in its condition part (i.e. two legal conditions). If they are true, we know that the clauses in the conclusion part are also true, i.e. we can write information about a legal case and a legal conclusion into the data base. The second rule establishes whether one of the condition clauses of the first rule holds.
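
Using the same illustrative representation as above, the two rules could be encoded as data for such an interpreter (the clause strings are simply the pseudo-English sentences used in the rules):

 # Illustrative encoding of the two rules above as (name, conditions, actions)
 # triples; the clauses are kept as plain strings.
 decision_yes = (
     "decision-yes",
     ["person is submitted to authorization regime",
      "person gets an authorization"],
     ["message: The acquisition will be authorized",
      "the system made a final legal conclusion"],
 )
 authorization_regime_yes = (
     "authorization-regime-yes",
     ["person acquired real estate",
      "person is foreign"],
     ["person is submitted to authorization regime"],
 )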

If the interpreter finds several rules which could be executed, it has to choose the best one. This is its control function.

Very often it has to rely on another set of rules (selection heuristics) in order to be able to do so. An interpreter works in cycles. It looks at the data base (which may contain several symbols) and tries to find the rules that match that situation, i.e. a symbol or a set of symbols. Then it executes the rule which seems best. This has an effect on the data base, i.e. the action part of the rule changes some of its content. At this point the interpreter starts looking for rules again, thus beginning its second cycle. Such a computation cannot go on forever, of course. Some rules might tell the system that a problem is solved, or that it is not solvable, and that it has to stop.
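
A minimal sketch of such a cycle, assuming facts and clauses are plain strings and using "take the first applicable rule" as a stand-in for real selection heuristics, could look like this:

 # Sketch of the recognize-act cycle: match all applicable rules against
 # the data base, pick one, apply its action part, and repeat until no
 # rule changes anything.
 def run(rules, database):
     while True:
         applicable = [(name, conds, acts) for (name, conds, acts) in rules
                       if all(c in database for c in conds)
                       and not set(acts) <= database]
         if not applicable:
             return database                 # nothing left to do: stop
         name, conds, acts = applicable[0]   # crude conflict resolution
         database |= set(acts)               # the action part rewrites the data base
         print("fired", name)

 rules = [("authorization-regime-yes",
           ["person acquired real estate", "person is foreign"],
           ["person is submitted to authorization regime"])]
 facts = {"person acquired real estate", "person is foreign"}
 run(rules, facts)   # fires authorization-regime-yes once, then stops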

Now we are able to establish the correspondence with the theory of the human information processor: the sets of rules are part of long-term memory, the data base on which the interpreter works represents short-term memory, and the processor contains the interpreter. It is of course possible that a rewrite rule itself might be rewritten, or at least that the system might change the order in which it prefers to execute rules when there is a choice. This means that production systems can be used to model adaptive behavior: over time the system will apply different rules to the same situation. Production systems can also be used to simulate real learning; in that case we need rules that are able to write new rules. Finally, we have to say that even GPS (the General Problem Solver) had a more complex organisation than the one sketched above, and that more modern production systems only have the name in common with the first models.

This brief presentation of some principles of GPS shows the spirit and also some of the fallacies of earlier AI programs. First of all, the authors made the mistake of generalizing the ideas they gained from the analysis of specific problem-solving domains to the whole area of human problem solving. Second, they did not pay much attention to issues of representation. The way in which problems and rules were stated was fairly simple, which was indeed possible for easily formalizable problem domains. More modern systems integrate more elaborate knowledge structures (e.g. plans and methods) than the simple if-then rules GPS used, i.e. single productions sometimes look like complex frames, a concept we will introduce later. Attempts are also made to model real learning, i.e. building completely new rules with the aid of learning heuristics, such as discovery or rule abstraction from input. GPS was a precursor of the production systems that are used today for expert system engineering. Expert system knowledge engineers don't make any claims about the nature of human thought; they just build programs that perform certain intelligent tasks. As such they are also very valuable for the social sciences: they allow for analytical modeling, as we shall demonstrate in the second part.

Forward chaining and backward chaining

Forward chaining and backward chaining are very important concepts for the kind of search done with a production system. In the previous discussion we didn't really say why the interpreter searches for certain rules to apply.

There are two very basic situations. If the interpreter works in a goal-oriented way, it wants to prove the "then-part" of a rule, i.e. some kind of intellectual goal. Let's examine our rules again: in the "decision-yes" rule we have a clause "(the system made a final legal conclusion)". Let's assume further that this is the goal of the whole problem-solving process. Knowing that, the interpreter now tries to find out whether the two conditions in the "if-part" are also true. In other words, it sets up two sub-goals and tries to find rules that "prove" these clauses. Looking through the rule base it will certainly find our second rule, which makes such a conclusion if its conditions are true. This process continues until some facts are directly asserted as such in the data base; e.g. in this case we might have the information that a person acquired real estate and that the person is also foreign. Now it is understandable why this goal-oriented search procedure is called backward chaining. In goal-oriented terminology we can say that the mechanism starts from a general goal and decomposes that general goal into subgoals (or problems to be solved) until it stumbles on some data asserting that a subgoal is reached. More technically speaking, it walks down a decision tree.
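
As an illustrative sketch (using the same string-based rule encoding as above), such a goal-directed search can be written as a recursive procedure:

 # Sketch of backward chaining: prove a goal either because it is asserted
 # directly in the data base, or by recursively proving the conditions of a
 # rule whose action part concludes it.
 def prove(goal, rules, database):
     if goal in database:
         return True                          # the fact is directly asserted
     for name, conditions, actions in rules:
         if goal in actions and all(prove(c, rules, database) for c in conditions):
             return True                      # all subgoals were proved
     return False

 rules = [
     ("decision-yes",
      ["person is submitted to authorization regime", "person gets an authorization"],
      ["the system made a final legal conclusion"]),
     ("authorization-regime-yes",
      ["person acquired real estate", "person is foreign"],
      ["person is submitted to authorization regime"]),
 ]
 # "person gets an authorization" is asserted directly here for the sake of
 # the example; the rule that would derive it is only introduced below.
 facts = {"person acquired real estate", "person is foreign",
          "person gets an authorization"}
 print(prove("the system made a final legal conclusion", rules, facts))   # True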

"Forward-chaining" works just the other way round. it is fundamentally "fact-driven". Let's take again our two rules and imagine that the interpreter just sits there and waits for data that trigger rules. Something might "write" into the data base now that "(person is foreign)" and that "(person acquired real estate)". Seeing these assertions, the interpreter will now immediately take this second rule and conclude that "(person is submitted to authorization regime").

Now imagine a third rule:

(define-rule authorization-yes
 IF
 (person is a company)
 (real estate is used as production site)
 THEN
 (person gets an authorization)
)
If the interpreter also finds the facts "(person is a company)" and "(real estate is used as production site)", it can conclude that "(person gets an authorization)". With these two intermediate conclusions combined, the first rule will "fire" and conclude that the system has made a final legal conclusion.
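
Put together, a fact-driven run over the three rules might look like the following sketch: starting from four asserted facts, the interpreter fires the two intermediate rules and then the decision rule.

 # Sketch of a forward-chaining run over the three rules: keep firing any
 # rule whose conditions all hold and whose conclusions are not yet in the
 # data base, until nothing changes. (The message clause is omitted.)
 rules = [
     ("decision-yes",
      ["person is submitted to authorization regime", "person gets an authorization"],
      ["the system made a final legal conclusion"]),
     ("authorization-regime-yes",
      ["person acquired real estate", "person is foreign"],
      ["person is submitted to authorization regime"]),
     ("authorization-yes",
      ["person is a company", "real estate is used as production site"],
      ["person gets an authorization"]),
 ]
 facts = {"person acquired real estate", "person is foreign",
          "person is a company", "real estate is used as production site"}

 changed = True
 while changed:
     changed = False
     for name, conditions, conclusions in rules:
         if all(c in facts for c in conditions) and not set(conclusions) <= facts:
             facts |= set(conclusions)
             print("fired", name)
             changed = True

 print("the system made a final legal conclusion" in facts)   # True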

Now let's try to generalize these rather technical concepts of forward and backward chaining. Both actually represent two very important aspects of human intellectual activity and are thus important concepts for models of decision makers. Our small example made clear that problem-solving activity is guided both by goal-oriented and "forward-propagation" activities. Let's examine this phenomenon from a different, but connected, angle. Assume that we model a decision maker with a simple data base (short-term memory) and a production system (long-term heuristic memory). Each time we insert something into the data base (memory), i.e. the problem solver learns something, some processes (rules) are triggered and conclusions are made, which in turn can trigger other conclusions. Sometimes a problem solver wants to know something, i.e. he wants to fetch something from memory. This is a goal which can be triggered by some input into the data base. If he does not directly find the answer in the data base, he tries to apply the rule base until this goal is satisfied. So, generally speaking, we perform forward inference whenever we learn something (i.e. insert something into a knowledge base), and we perform backward inference whenever we "search" for something (i.e. we fetch something from a knowledge base). These processes can of course be mixed and become quite complex. We shall present a more complete model including these concepts later. Before that we must discuss the nature of complex knowledge structures other than networks and rule bases.