The Semiotics of Control Rules: 'What Do You Mean by Positive Small?'

Brian SCHOTT and Thomas WHALEN

Decision Sciences Department / Georgia State University / Atlanta, Georgia USA 30303

bschott@gsu.edu and whalen@gsu.edu


We report various approaches for identifying linguistic predicates and the linguistic terms that define them in fuzzy logic controllers. Approaches include statistical quadratic response surface methods, nonlinear iterative search, nomographic generalization, and connectionist methods, especially those that address the issue of linguistic term meaning.

The marriage of artificial neural networks and fuzzy rules has yielded a greatly strengthened offspring, empowered by the execution speed and training capability of networks and by the perspicuity of linguistic rules. Imprecise rules expressed in the ordinary working language of an expert can be more readily acquired for a knowledge-based system. Fuzzy logic provides alternative methods of representing the rules in a canonical form. The exact identification of the predicates, linguistic terms, and connectives for implementing such a knowledge base can require considerable craft and tuning. Identification is often referred to as tuning when it is done offline and as learning or adaptation when it is done during the control process.

This paper begins with a review of linguistic control rules and control systems. Next we discuss various identification methods. Finally we summarize the potentials and pitfalls of the numerous possibilities.

Linguistic control rules

Control systems are designed to automatically manage the behavior of a dynamic physical process by comparing the state of the controlled system (the plant) to some ideal value (the control objective or set point) and applying some control signal. The rules that define the control system can be derived mathematically when suitable simplifications are justified, or from human expertise when it exists. Human expertise comes in various forms, including physical skill developed with experience. An expert's physical skill can sometimes be recorded live while the expert is controlling the system, and methods have been developed to translate such samples into effective controllers. An expert's physical skill can also be acquired verbally using knowledge engineering principles. It is for this kind of control system that fuzzy logic has yielded many effective, self-documenting controllers. These fuzzy controllers have received considerable attention, both practical and theoretical, because domain experts with no special training in control engineering can qualitatively frame rules for narrowly defined systems [Sugeno & Yasukawa, 1991][1]. The general structure of such rules can often be acquired rather directly because of their linguistic flavor; nevertheless, tuning or calibrating the fuzzy variables can still be very challenging.

A typical fuzzy controller encodes its rules as one or more combinations of subrules. For example one rule might contain subrules which all tell how to adjust the control signal depending on the values of the same two plant output variables. One such subrule might be, "The push force should be Positive Small if the error angle is negative small and the change in the error angle (since the last observation) is negative small." Clearly, a term like Positive Small must have different meanings in contexts that use different units of measurement (push force, error angle, change in error angle). Some systems also allow a term to have different values on the same measurement scale depending on which rule set, or even which subrule, provides the context.
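
The dependence of a term's meaning on its measurement scale can be illustrated with a minimal Python sketch. The triangular shape, the normalized breakpoints in POSITIVE_SMALL, and the full-scale values in SCALES are assumptions chosen for illustration; none of them comes from a particular controller.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical normalized definition of "Positive Small" on [-1, 1].
POSITIVE_SMALL = (0.0, 0.25, 0.5)

# Hypothetical full-scale ranges for each variable (illustrative units only).
SCALES = {"push_force": 10.0, "error_angle": 30.0, "error_angle_change": 5.0}

def positive_small(variable, value):
    """Membership of a raw value in Positive Small for the named variable."""
    left, peak, right = (p * SCALES[variable] for p in POSITIVE_SMALL)
    return triangle(value, left, peak, right)
```

Under these assumed scales, a raw value of 2.5 is fully Positive Small as a push force, only partly so as an error angle, and not at all as a change in error angle.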

===================================================

Some Alternative adjustments and calibrations measurement unit rescaling

===================================================

The design and tuning of linguistic variables in a control system can involve adjustments and calibrations which include methods that vary the topography of the controller or rescale its dimensions. Rescaling the dimensions of the universes of discourse alters all of the term meanings together, thus altering controller behavior. Dramatic change can be achieved, for example, by moving from a linear to a logarithmic scale in which the linguistic terms indicate orders of magnitude. Linguistic logarithms occur in ordinary speech, for example in First Samuel 18:7: "Saul has slain his thousands, and David his tens of thousands." More gradual adjustments can be achieved by multiplicative ("gain") adjustment parameters. While such rescaling is necessary, it is often insufficient for tuning. The shape and placement of the linguistic terms are as important as rescaling.
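
Both kinds of rescaling can be sketched in a few lines of Python. The term set TERMS and the helper names grades, gain, and log_scale are hypothetical; the point is only that one rescaling function shifts the meaning of every term at once.

```python
import math

def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical term set defined once on a normalized universe.
TERMS = {
    "small": (0.0, 1.0, 2.0),
    "medium": (1.0, 2.0, 3.0),
    "large": (2.0, 3.0, 4.0),
}

def grades(x, rescale):
    """Membership of raw value x in every term after rescaling the universe."""
    u = rescale(x)
    return {name: triangle(u, *abc) for name, abc in TERMS.items()}

def gain(g):
    """Multiplicative rescaling by a gain parameter g."""
    return lambda x: g * x

def log_scale(x):
    """Logarithmic rescaling: terms then denote orders of magnitude."""
    return math.log10(x)
```

With a gain of 1.0 a reading of 2.0 is fully "medium"; halving the gain makes the same reading fully "small". Under log_scale, a reading of 1000 sits at the peak of "large" because the terms now count powers of ten.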

After the fixed features and the adjustable features of the control system have been designated, an optimization method requires a figure of performance merit. Common performance variables for mobile systems include success vs failure, time to failure, safety, fuel economy, operating smoothness, avoidance of overshooting and undershooting, and speed of recovery. Multiple goals can be combined additively, multiplicatively, lexicographically, or with fuzzy operations such as minimum and maximum.
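
The four ways of combining goals can lead to different choices, as the following Python sketch suggests. The two candidate tunings and their scores (safety first, smoothness second, each on a 0 to 1 scale) are invented for illustration.

```python
import math

def additive(weights):
    """Weighted-sum figure of merit for the given goal weights."""
    return lambda scores: sum(w * s for w, s in zip(weights, scores))

def multiplicative(scores):
    """Product of the goal scores."""
    return math.prod(scores)

def conjunctive(scores):
    """Fuzzy 'and' of the goals: the weakest goal decides."""
    return min(scores)

def lexicographic(scores):
    """Goals compared in priority order; later goals only break ties."""
    return tuple(scores)

def best(candidates, merit):
    """Name of the candidate with the largest figure of merit."""
    return max(candidates, key=lambda name: merit(candidates[name]))

# Hypothetical scores: (safety, smoothness).
CANDIDATES = {"A": (0.9, 0.4), "B": (0.8, 0.8)}
```

Here the lexicographic criterion prefers tuning A, because safety is compared first, while the additive (equal weights), multiplicative, and minimum criteria all prefer tuning B.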

A control surface is a surface plot which portrays the relation between controller inputs and outputs; for each combination of input values, a point on the surface gives the corresponding control output value. Additional objectives can be expressed relative to properties of the control surface.
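
A control surface can be tabulated directly from a rule base. The sketch below assumes a hypothetical two-input controller with three triangular terms per input (N, Z, P), singleton consequents, minimum as the "and" connective, and weighted-average defuzzification; other connectives and defuzzifiers would give a different surface.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical three-term partition shared by both inputs on [-1, 1].
TERMS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}

# Hypothetical singleton output for each (error, change) pair of terms.
RULES = {
    ("N", "N"): 1.0, ("N", "Z"): 0.5, ("N", "P"): 0.0,
    ("Z", "N"): 0.5, ("Z", "Z"): 0.0, ("Z", "P"): -0.5,
    ("P", "N"): 0.0, ("P", "Z"): -0.5, ("P", "P"): -1.0,
}

def control(error, change):
    """Weighted average of rule outputs, weighted by rule firing strength."""
    num = den = 0.0
    for (e_term, c_term), out in RULES.items():
        strength = min(triangle(error, *TERMS[e_term]),
                       triangle(change, *TERMS[c_term]))
        num += strength * out
        den += strength
    return num / den if den else 0.0

def control_surface(steps=5):
    """Grid of control outputs: rows index error, columns index change."""
    grid = [-1.0 + 2.0 * i / (steps - 1) for i in range(steps)]
    return [[control(e, c) for c in grid] for e in grid]
```

Each entry of the returned grid is one point on the surface; redefining or repositioning a term in TERMS changes every entry that the term touches.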

Schott and Whalen[2] have studied the topology of the control surface as terms are redefined and as their relative positions are varied. Discretization error and even unwanted nonmonotonic changes in control surfaces accompanied many gradual repositioning strategies relative to linguistic terms. Few generalizable guidelines for repositioning and reshaping terms have been identified. Neural network methods (discussed later) may be able to accomplish desirable term set revisions, but mostly without retention of the clear linguistic meaning.

Many methods for optimization have been applied to the tuning problem. Statistical response surface methodology (RSM) assumes that the adjustable parameters which define the controller relate to the defined objective in a manner that produces a (unimodal) quadratic surface to be minimized. The eigenstructure of the matrix that defines a response surface allows the system to tune a controller in relatively few iterations, each of which requires a decision by the designer. Nonlinear iterative search methods that extend simplex methods (such as Box's "complex" method [Whalen & Schott[3]]) heuristically seek a local extremum by interpolation or extrapolation relative to the parameter values which yield the best gradient for the criterion variable. These heuristic methods have been adapted to allow multiple objectives chosen lexicographically [Whalen and Schott[4]]. These methods give very different results depending on the selected figure of merit.
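
The quadratic assumption behind RSM can be illustrated in one dimension. The sketch below is a single-parameter analogue only, not the multivariate eigenstructure analysis: it fits a parabola through three trial settings of one tuning parameter and returns the setting at which the fitted cost is smallest. The function name quadratic_vertex and the trial data in the usage note are hypothetical.

```python
def quadratic_vertex(points):
    """Fit y = a + b*x + c*x**2 through three (x, y) points.

    Returns the x at which the fitted parabola is minimal.
    """
    (x0, y0), (x1, y1), (x2, y2) = points
    # Divided differences give the coefficients in Newton form.
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    c = (d12 - d01) / (x2 - x0)
    b = d01 - c * (x0 + x1)
    if c <= 0:
        raise ValueError("fitted surface has no interior minimum")
    return -b / (2.0 * c)
```

For example, if trial gains of 0, 1, and 3 produced costs of 4.25, 2.25, and 4.25, the fitted minimum lies at a gain of 1.5, which would be the next setting to try.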

Neural networks

Artificial neural networks (ANNs or "neural nets") contain layers of connected nodes referred to as input, hidden, and output layers. Information usually proceeds unidirectionally from inputs via hidden layers to output layers. There may be multiple hidden layers between the single input and single output layers. Most neural nets designed for control have a single node in the output layer for each control variable. The value of this control variable is input to the controlled system. Not all nodes in adjacent layers are connected; the pattern of connections, as well as the weights associated with the connections, determines the output of the net. The nodes in the hidden and output layers contain simple processing units variously called activation [Lin[5]], signal [Kosko[6]], or transfer functions [Zahedi[7]]. The processing units accept as input values which may have been rescaled since being triggered or output by node(s) in the preceding layer. In their simplest form the processing units are accumulators which, upon reaching or exceeding a threshold value, produce a pulse output. Others use logistic curves, ramps, or other response functions. The impact of the output on the next layer is mitigated by the intermediate connection weights.
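
A minimal Python sketch of such processing units follows. The function names and the list-of-rows weight layout are assumptions made for illustration; a zero weight stands for a missing connection.

```python
import math

def threshold(total, theta=1.0):
    """Pulse output once the accumulated input reaches the threshold."""
    return 1.0 if total >= theta else 0.0

def logistic(total):
    """Smooth S-shaped response between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-total))

def ramp(total):
    """Linear response clipped to the interval [0, 1]."""
    return min(1.0, max(0.0, total))

def layer(inputs, weights, activation):
    """One layer: weights[j][i] connects input i to node j (0.0 = no link)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

def feedforward(inputs, hidden_w, output_w, activation=logistic):
    """Unidirectional pass: inputs -> hidden layer -> output layer."""
    return layer(layer(inputs, hidden_w, activation), output_w, activation)
```

For a controller with one control variable, output_w has a single row, so feedforward returns a one-element list holding the control signal.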

Connectionist network models are being constructed for a wide variety of real-world systems ("plants" in control systems). The design of networks includes arranging and connecting the network nodes as well as selecting the activation functions. The strategy for designing each system can be extremely elusive. But the rules of fuzzy logic have suggested convincing, concrete architectures for connectionist models. The rules and the linguistic terms can suggest connectionist nodes and levels almost as naturally as the arrangement of a rectangular array of pixels can portray a picture.

Fuzzy neural networks

Fuzzy sets and their associated representation of linguistic terms have been used to determine the connection patterns, the processing units, and the connection weights of neural nets. Different strategies have been employed by various controller designers. We will describe some of the more common designs, especially those which preserve the linguistic nature of the components. Fuzzy controllers are comprised of linguistic terms, if-then rules, and variables. Each possible value of a linguistic variable has an associated membership value or possibility value. (In cases where the linguistic variable is continuous valued, discretization produces a finite number of base values.) The same linguistic terms can be applied to multiple variables; for example the values of the speed and the acceleration variables might both be low, medium, and high.

Membership function quantitative values are represented either by the connection weights or by the output of nodes' activation functions. Both the original tuning of a net and adaptive learning of a tuned net often rely on either (a) revising pointwise the membership values associated with individual base values or (b) reshaping or repositioning each linguistic term in a wholesale manner. The first method can be accomplished very efficiently in ANNs by allowing the connection weights to model the membership values. But the linguistic perspicuity of the revised net is minimal. The second method requires that the membership functions be represented by the node activation functions. Limiting the activation functions to being monotonic thresholding functions can undermine the natural shape and interpretation of linguistic variables unless pairs of nodes can be coaxed to act as left-side and right-side partners. To alleviate these problems many controller designers have let the activation functions be the linguistic terms represented as various unimodal shapes such as Gaussian or normal, triangular or trapezoidal.
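
The unimodal shapes, and the pairing of two monotonic nodes into one term, can be sketched as follows. The parameter names are assumptions; the sigmoid_pair function is one plausible way, among others, for left-side and right-side partners to cooperate.

```python
import math

def gaussian(x, center, width):
    """Bell-shaped term: 1 at center, falling off with distance."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def trapezoid(x, a, b, c, d):
    """Trapezoidal term: rises on [a, b], flat on [b, c], falls on [c, d].

    Setting b == c gives a triangular term.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def logistic(t):
    """Monotonic thresholding response between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))

def sigmoid_pair(x, left, right, slope):
    """Two monotonic nodes acting as left and right shoulders of one term."""
    return logistic(slope * (x - left)) - logistic(slope * (x - right))
```

Because each shape is governed by a few location and width parameters, learning can reposition or reshape a whole term by adjusting those parameters rather than individual membership values.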

Unsupervised learning

Neural networks are trained using either unsupervised or supervised learning. In unsupervised learning the connection weights (and in some cases membership function parameters) are adjusted competitively (relative to other weights) according to how close the nodes they connect are to the specific sample value being processed; if the associated layer's node values are close (remote), the weight value is increased (decreased) and the change is directly related to the distance. If the untrained network is presented with a sample of input-output pairs for which the system was in control, the competitive learning enhances connections that resemble the behavior of the expert who properly controlled the system. If the neural network is equipped with all possible combinations of terms comprising possible fuzzy if-then rules (each combination is seeded with similar but randomly different connection weights), the learning favors connections that are effective.
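
One common form of competitive adjustment is the winner-take-all update sketched below. It is a generic illustration, not the exact rule of any system cited here; the function name competitive_step and the learning rate are assumptions.

```python
def competitive_step(prototypes, sample, rate=0.1):
    """Move the prototype closest to the sample toward it.

    prototypes is a list of weight vectors, updated in place.
    Returns the index of the winning prototype.
    """
    def sq_dist(p):
        return sum((pi - si) ** 2 for pi, si in zip(p, sample))

    winner = min(range(len(prototypes)),
                 key=lambda k: sq_dist(prototypes[k]))
    # The size of the change is proportional to the remaining distance.
    prototypes[winner] = [pi + rate * (si - pi)
                          for pi, si in zip(prototypes[winner], sample)]
    return winner
```

Repeated over recorded input-output pairs from a well-controlled run, updates of this kind pull the winning weight vectors toward the regions the expert actually visited.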

Supervised learning

Networks employing supervised learning activate all the layers of the network in order, starting with the sample inputs and ending with the computed output values ("feedforward"). Once the output values are computed by the network, they are compared with the desired output values and adjustments are made according to mathematical equations developed by the human network architect ("backpropagation"). Potentially these adjustments can alter both connection weight values (but differently from unsupervised adjustments) and parameters in the activation functions. Since both can be altered, potentially faster learning, or learning with smaller sample sets, can be achieved.
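
The idea that both a connection weight and an activation-function parameter can be adjusted is sketched below for the smallest possible case: one Gaussian term feeding one weighted output, trained by gradient descent on squared error. The Gaussian form, the fixed width, and the learning rate are assumptions made for illustration.

```python
import math

def forward(x, center, width, weight):
    """Feedforward pass: membership grade, then weighted output."""
    grade = math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return grade, weight * grade

def train_step(x, target, center, width, weight, rate=0.1):
    """One gradient step on 0.5 * (output - target)**2.

    Returns the updated (center, weight); the width is held fixed.
    """
    grade, output = forward(x, center, width, weight)
    error = output - target
    # Gradient with respect to the weight: error * grade.
    new_weight = weight - rate * error * grade
    # Gradient with respect to the center:
    # error * weight * grade * (x - center) / width**2.
    new_center = center - rate * error * weight * grade * (x - center) / width ** 2
    return new_center, new_weight
```

Repeating train_step over a sample set moves the term's location and the connection weight together, which is how both kinds of parameter share the work of reducing the error.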

Fuzzy neural net designs

Kosko demonstrates unsupervised learning to train networks containing connection weights that model membership values for the final control surface. That is, the fuzzy terms in each subrule define a control surface pertaining only to a small section of the whole surface. The system can deductively calculate the membership values based on an expert's terms and rules or it can inductively learn the membership values based on an expert's recorded behavior. Both systems require that the weighted sum of the subrules be computed before the results are defuzzified. The weights are interpreted as a level of confidence for the subrule. This allows Kosko to think of the terms as being of fixed shape and location. The weights are derived inductively in all cases using "product-space clustering" based on the relative frequency of the use of the subrules in training runs.
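
The relative-frequency idea can be approximated very crudely by counting how often recorded samples fall into each antecedent-consequent cell. The sketch below is a simplified histogram stand-in for product-space clustering, not Kosko's procedure; the term centers and the nearest-center assignment are assumptions.

```python
from collections import Counter

def nearest_term(value, centers):
    """Name of the term whose center lies closest to the value."""
    return min(centers, key=lambda name: abs(value - centers[name]))

def rule_weights(samples, in_centers, out_centers):
    """Relative frequency with which each (input term, output term) cell
    is visited by the recorded (input, output) samples."""
    counts = Counter(
        (nearest_term(x, in_centers), nearest_term(y, out_centers))
        for x, y in samples
    )
    total = sum(counts.values())
    return {rule: n / total for rule, n in counts.items()}

# Hypothetical fixed term centers shared by input and output.
CENTERS = {"N": -1.0, "Z": 0.0, "P": 1.0}
```

Cells that the expert's recorded behavior visits often receive large weights, and cells never visited receive none, so the terms themselves can stay fixed in shape and location.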

Rovatti and Guerrieri[8] (R&G) suggest a promising method based on what they call fuzzy sets of rules that extends Kosko's method to use supervised neural learning and to give a theoretical explanation for the weighted sum approach.

Lin, Lin and Lee[9] (LL&L) build a five layer network in which the membership functions of linguistic terms are parameterized as symmetric trapezoid shaped activation functions in the second and fourth layer nodes, modelling respectively the antecedent and consequent terms. This strategy of modelling membership functions in the nodes themselves as "radial basis functions" has become very popular with fuzzy neural system architects designing backpropagation supervised networks. The third layer, connecting the antecedent and consequent terms, models the rules by connecting only the desired nodes in layers two and four and aggregating according to the activation function's definition. The fifth layer transmits output from the system and acts as a defuzzifier.

Prior to the supervised learning stage LL&L employ unsupervised learning separately on the first two layers and on the last two layers (but in reverse order: layer 5, then layer 4) using input sample data. The systems described here perform the unsupervised and supervised phases in an automatic sequence. LL&L present two alternative unsupervised methods of developing the network structure: the first method depends on existing defined terms and rules (both of which may be automatically adjusted during training); it produces a set of input terms that are arranged in a tiled pattern to cover the cross-product input space. The unsupervised stage adds consequent terms based on a fuzzy similarity criterion. Not all consequent terms are linked to all antecedent terms in the final network. For example the illustrated system contained 3, 5, and 5 terms in the three antecedent variables and it started with 3 consequent terms, but resulted in 10 consequent terms; the final system required 45 rule nodes, short of the complete complement of 3×5×5=75 potential rule nodes. The trained system yields linguistically defined rules for which the antecedent terms have been repositioned and reshaped to optimally give meaning to the original input terms; the consequent terms have been newly positioned and shaped.
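
The count of potential rule nodes is simply the size of the cross product of the antecedent term sets, as this short Python check shows. The term indices are placeholders; only the counts 3, 5, 5 and the 45 retained rule nodes come from the text.

```python
from itertools import product

# Term counts per antecedent variable in the illustrated system.
TERM_COUNTS = (3, 5, 5)

def candidate_rules(term_counts):
    """Every combination of one term index per antecedent variable."""
    return list(product(*(range(n) for n in term_counts)))

rules = candidate_rules(TERM_COUNTS)
retained = 45  # rule nodes reported for the final system
print(len(rules), len(rules) - retained)  # prints: 75 30
```

The 30 combinations that do not appear in the final network are the rule nodes that training left unlinked.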

LL&L's second method depends only on providing the number of linguistic terms in the input and output variables, not the terms or rule connections; the input terms are irregularly organized in the cross-product input space in this method. This unsupervised learning prunes the number of rules from all possible to a more manageable number, and adjusts the locations of the linguistic terms' membership functions. The trained system yields linguistically defined rules for which the terms have been positioned and shaped totally automatically; the terms must be given names after the fact by the developer. LL&L claim that the second method is more effective when a large number of variables define the controller.

Lee, Kwang and Wang[10] (LK&W) describe another system which places parameterized triangle shaped linguistic term membership functions in the network nodes to be adjusted by supervised backpropagation, and they allow the last layer to act as a defuzzifier. LK&W's approach differs from similar approaches like that of LL&L in that LK&W allow only the antecedent linguistic terms to vary, holding the consequent terms fixed. The learned revision of the antecedent terms serves to identify them linguistically. The activation functions in the last layers use the more traditional sigmoid shape and the defuzzification is accomplished with more hidden neural layers. The number of rules in the final system equals the product of the numbers of linguistic terms in all of the antecedent variables (no rule pruning is sought).

Reinforcement learning

Berenji[11] and Lin and others have extended the supervised learning of neural systems to reinforcement learning. Reinforcement learning networks generate a feedback signal directly to the controller during actual running of the control system and can produce working systems for which no expert training runs are available. A separate controller called a critic is designed to produce the necessary feedback. In Berenji's early system the membership functions are piecewise monotonic functions modelled and updated as connection weights. Later systems allow parameterized linguistic term membership functions in the network nodes to be adjusted by the reinforcement learning. Berenji reports that the reinforcement learning methods can produce linguistically defined rules with relatively few system runs.

Reinforcement learning in fuzzy neural systems has been used by Berenji and by Lin to finely tune the fuzzy linguistic terms in controllers. System success and failure in repeated trials provide the reinforcement required by such controllers. Reinforcement learning refines the parameters of the linguistic terms that define the inputs and outputs of the controller. For example the quantification of a "large" output signal from a controller can be adjusted by a location or a scaling parameter.

Conclusion

Each of the research streams reviewed in this paper focused on optimizing the performance of a particular benchmark control system. Conscious attention to semiotic questions of the nature and origin of meaning ranges from peripheral to nonexistent. Nevertheless, there is much to be learned from a comparative study of the ways neural and fuzzy systems manipulate, modify, combine and create linguistic terms to accomplish the goals of their design. On the one hand, bringing some of the insights of traditional semiotics to such a comparative study can lead to new generations of neural and fuzzy systems which are more powerful, more perspicuous or both. On the other hand, the evolution of language within a neural/fuzzy system and between generations of systems can provide an exciting source of analogy and perhaps theory testing for those interested in the nature and processes of linguistic meaning among humans. The present highly preliminary paper has attempted to be a small first step towards these lofty goals.

References

[1].Sugeno, M. & Yasukawa, T. "Linguistic modeling based on numerical data," Proceedings of the 4th International Fuzzy Systems Association Congress, 1991, pp 264-267.

[2].Schott, B. and Whalen, T. "Nonmonotonicity and Discretization Error in Fuzzy Rule-Based Control using COA and MOM Defuzzification," in Proceedings of the Fifth IEEE International Conference on Fuzzy Systems 1, 1996, pp 450-456.

[3].Whalen, T. and Schott, B., "Optimal tuning of a fuzzy controller using Box's 'Complex' algorithm," in Fifth IFSA World Congress, 1993, pp 1350-1353.

[4].Whalen, T. H. and Schott, B., "Lexicographic tuning of a fuzzy controller," in Fuzzy Logic and its Applications, Information Sciences, and Intelligent Systems, 1995, pp 91-99.

[5].Lin, C. T. Neural Fuzzy Control Systems with Structure and Parameter Learning, World Scientific (River Edge, NJ) 1994.

[6].Kosko, Bart "Fuzzy Associative Memories," chapter 8 in Neural Networks and Fuzzy Systems, Addison Wesley (Englewood Cliffs, NJ) 1992.

[7].Zahedi, Fatemeh, Intelligent Systems for Business, Wadsworth Publishing (Belmont, CA) 1993.

[8].Rovatti, R. and Guerrieri, R., "Fuzzy Sets of Rules for System Identification," in IEEE Transactions on Fuzzy Systems 4 2, May 1996, pp 89-102.

[9].Lin, Chin-Teng, Lin, Cheng-Jian, and Lee, C. S. George, "Fuzzy adaptive learning control network with on-line neural learning," Fuzzy Sets & Systems 71 1, 1995, pp 25-45.

[10].Lee, Keon-Myung, Kwang, Dong-Hoon, and Wang, Hyung Leek, "A fuzzy neural network model for fuzzy inference and rule tuning," International Journal of Uncertainty, Fuzziness, and Knowledge-Based System 2 3, 1994, pp 265-277.

[11].Berenji, Hamid R., "A reinforcement learning-based architecture for fuzzy logic control," International Journal of Approximate Reasoning 6 2, 1992, pp 267-292.