Production system

The educational technology and digital learning wiki
Jump to navigation Jump to search

Draft

<pageby nominor="false" comments="false"/> Note: This entry is just a copy/paste of some paragraphs I wrote many years ago when I was a graduate student - Daniel K. Schneider 19:25, 14 August 2007 (MEST) (be warned).

Definition

A production system (PS) is a form of specifying and defining an algorithm. Their main features are the so-called rewrite rules which are also used to define generative grammars in linguistics and computer science.

Production systems have been developed a few decades ago in the context of research on human information processing and knowledge representation. They are still being used in various kinds of expert systems and intelligent tutoring systems. An XSLT processor also can be considered a production system when it executes templates.

Rewrite rules

Before introducing production systems we shall briefly explain the formal properties of such rules. A rewrite rule takes very generally speaking a chain of symbols and rewrites it into another chain of symbols.

(hello xy) --> (goal yz) xz 

rewrites the string "(hello xy)" into the string

(goal yz) xz

A generative grammar is a collection of rewrite rules which permit to translate in successive steps some input into some output. A good example are so called parsers which translate a chain of symbols representing a sentence, into an axiomatic chain of symbols defining the grammatical structure of the sentence. This is called to perform perform bottom-up parsing, i.e. the resulting tree defines the input chain. The same procedure can be done in the opposite way. Starting form a axiom a longer string of symbols can be generated. This procedure can be used in top-down parsing, e.g. generating possible structures, until one matches the input. In this work we will not use these linguistic grammars directly again.

Production systems are in some sort a generalization of these rewrite grammars. First its rules may be more complex. Second there is also a formalized mechanism which uses these rules, i.e. an engine that applies this "grammar" to some data in some memory. Thus we can state the three essential components of a PS:

  1. A set of production rules (rewrite rules)
  2. A data base
  3. An interpreter which uses (1) and (2) in order to perform computations.

In a simple production system a production rule has the following general form :

if <SITUATION> then <ACTION>

This means that if the interpreter discovers a certain situation or condition in the data base which matches the "situation-side" of a rule, the rule is evoked and ready to be used. Thus the first function of the interpreter is evaluative or interpretive. It looks through the data base in order to find rules which it can possibly use. If it finds only one such rule that matches a situation, it is executed, that means that its action-part is used in order to transform the data base or to do some other symbolic computation. Thus instead of "situation" and "action" we can also use the terms "if" - "then", or "condition" - "conclusion".

Let's examine two rules in the following that are inspired by a production system used to model the application of a law.

(define-rule decision-yes
 IF
 (person is submitted to authorization regime)
 (person gets an authorization)
 THEN
 (message: "The acquisition will be authorized")
 (the system made a final legal conclusion)
)
(define-rule authorization-regime-yes
 IF
 (person acquired real estate)
 (person is foreign)
 THEN
 (person is submitted to authorization regime)
)

You don't need to understand in detail what it does, just get the general flavor of it. The rules are written in some pseudo-English language. Each rule has a name, an "if-part" and a "then-part". The first rule (called decision-yes) checks two things in the condition part (i.e. two legal conditions). If these things are true we know that the clauses in the conclusion part are also true, i.e. we can write into the data base an information about a legal case and a legal conclusion. The second rule checks whether one of the clauses in the first rule holds.

If the interpreter finds several rules which could be executed, it has to choose the best of all rules. This is its control function.

Very often he has to rely on another set of rules (selection heuristics) in order to able to do so. An interpreter works in cycles. It looks at the data-base (which may contain several symbols) and tries to find the rules which match that situation, i.e.. a symbol or a set of symbols. Then it executes the rule which seems to look best. This has an effect on the data base, i.e. the action part of a rule changes some of its content. At this point the interpreter starts looking for rules again and thus started its second cycle. Such a computation cannot go forever of course. Some rules might tell the system that a problem is solved or that it is not solvable and that it has to stop.

Forward chaining and backward chaining

Forward chaining and backward chaining are very important concepts of search that is done with production system. In our previous discussion we didn't really tell why the interpreter searches certain rules to apply. There are two very basic situations:

If the interpreter works in a goal-oriented way (backward chaining), it wants to prove the "then-part" of a rule, i.e. some kind of intellectual goal. Let's examine our rules again: In the "decision-yes" rule we have a clause "(the system made a final legal conclusion)". Let's assume further on that this might be the goal of the whole problem solving process. The interpreter knowing that, tries know to find out whether the two conditions in the "if-part" are also true. In other words he sets up two sub-goals and he tries to find two rules that "prove" these clauses. Looking through the rule-base he will certainly find our second rule that makes such a conclusion if its conditions are true. This process continues until some facts are directly asserted as such in the data-base. E.g. in this case we might have the information that a person acquired real estate and that the person is also foreign. Now it is understandable why this goal-oriented search procedure is called backward-search.

In goal-oriented terminology we can say that the mechanism starts from a general goal, and decomposes that general goal into subgoals (or problems to be solved) until it stumbles on some data that assert that a subgoal is reached. More technically speaking it walks down a decision tree.

Forward-chaining works just the other way round, it is fundamentally "fact-driven". Lets take again our two rules and imagine that the interpreter just sits there and waits for data that trigger rules. Something might "write" into the data base at some point that "(person is foreign)" and that "(person acquired real estate)". Seeing these assertions, the interpreter will now immediately take this second rule and conclude that "(person is submitted to authorization regime").

Now imagine a third rule:

(define-rule authorization-yes

 IF
 (person is a company)
 (real estate is used as production site)
 THEN
 (person gets an authorization)

If the interpreter also finds the facts that "(person is a company)" and "(real estate is used as production site)" it also can conclude that "(person gets an authorization)". With these two intermediate conclusions combined, the first rule will "fire" and conclude that the system made a legal conclusion.

Now let's try to generalize these rather technical concepts forward and backward chaining. Our small example should have made clear that problem-solving activity can be guided both by goal-oriented and "forward-propagation" activities. Both could represent two very aspects of human intellectual activity and are thus help to model at least some kinds of human thought, e.g. decision making.

Let's examine this phenomenon under a different, but connected angle. Let's assume that we model a learner with a simple data-base (short term memory) and a production system (long term heuristic memory).

Each time we insert something into the data base (memory), the problem solver learns a fact and some processes (rules) are triggered and conclusions are made which in turn can trigger other conclusions, i.e. more complex learning than just fact storing occurs.

Sometimes a problem solver wants to satisfy a goal, i.e. deduce something from facts and rules in his memories. This goal can be triggered by some input into the data base. If he does not directly find the answer (this something) in the data base, he tries to apply the rule base until this goal is satisfied.

So very generally speaking one could model some thought processes in 2 ways:

  • We perform forward inference whenever we learn some new facts that will lead to yet new fact. We insert something into a knowledge base that triggers rules.
  • We perform backward inference whenever we need to deduct something, i.e. we fetch something from a knowledge base and if can't find it.

These processes can be mixed of course and become quite complex.

Other related computer modelling techniques include semantic networks, application of mental scripts, schemas, frames, plans etc.

A note on the relationship to cognitive theory

There is a correspondence with the theory of the human information processing and various models of knowledge representation. The sets of rules are part of the long term memory, the data base on which the interpreter works represents the short term memory, and the processor contains the interpreter. Now it is of course possible that a rewrite rule itself might be rewritten, or at least the system might change the order in which it prefers executing these rules in the case of choice. This means that production systems can be used to model adaptive behavior. In this case it will apply different rules to the same situation. Production systems can also be used to simulate real learning. In this case we need rules that are able to write new rules. Finally, we have to say that that computer models of human information like GPS and more modern successors like the ACT theory (see procedure learning) had a more complex organization than the one which we have sketched above and that more modern production systems only have the name in common with the models used in simpler expert systems.

Recent developments

With talk (and some action) around the semantic web and business process modeling, there is renewed interest in production systems

PRR rules

Production Rule Representation (PRR) is a 2009 standard by OMG. Below is an example found in the [v1.0 specification, p 52

English text
If the shopping cart contains between 2 and 4 items and either the purchase value is greater than $100 and the customer category is gold or the purchase value is greater than $200 and the customer category is Silver then apply a 15% discount on the shopping cart value.
A proprietary rule language format (of one of the submitters)
rule discount {
when
{
   ?customer: Customer();
   ?shoppingCart: ShoppingCart(customer == ?customer);
   evaluate((?shoppingCart.containsItemsInRange(2, 4)) &&
            (((((?shoppingCart.getValue() > 100d) && (?customer.category equals "Gold")) ||
            ((?shoppingCart.getValue() > 200d) &&
}
then
{
     modify ?shoppingCart
            {
               shoppingCart.discountValue = shoppingCart.discountValue + 15f);
            }
}
}

PRR OCL:

Rule discount

ruleVariable:
       ?customer: Customer = Customer->any()
       ?shoppingCart: ShoppingCart = ShoppingCart->any(c: customer | c=?customer)

Condition:
       (?shoppingCart.containsItemsInRange(2, 4)
       and
       (((?shoppingCart.items->collect(i:Item|i.value))->sum()>100
        and
        ?customer.category == "Gold")
       or
       ((?shoppingCart.items->collect(value))->sum() > 200
         and
         ?customer.category == "Silver")))

Action:
      shoppingCart.discountValue = shoppingCart.discountValue+15f

Links and references

Bibliography

  • Forgy, Charles (1982). "Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem," Artificial Intelligence, 19, pp

17-37.