Next: The second question: are Up: A dialogue on representation: Previous: A dialogue on representation:

The first question: are representation languages natural languages?

YW: This dialogue, though not a philosophical or psychological one, overlaps with one of the main aspects of Fodor's Language of Thought [4] claims: that the basis of mental representation is language-like in nature. Fodor presents a set of claims concerning the language-like properties of his putative LOT: in particular, the hierarchical, or tree-like, nature of its structures and the non-compositionality of the meanings of its predicates. The former has involved Fodor in extended disputes with connectionists about whether or not tree structures can be replicated by connectionist learning techniques. The latter runs in the face of standard compositionality-oriented accounts of text (or sentence) meaning. We differ from Fodor on the crucial issue of what it means to be ``language-like''.

SN: Fodor's is only one set of criteria for what features make a language ``NL-like''. We can suggest additional ones. Two basic features will probably be the ``live'', ``unconstructed'' character of NL and the functional criterion of being designed to support human communication, that is, relying on the human apparatus of understanding.

YW: The first feature of language that should concern us in this discussion is as follows: can the predicates of a formal representational language avoid ending up ambiguous as to sense? A negative answer to this question would make RLs NL-like. It would also mean that understanding a representation involves knowing what sense a symbol is being used in. If NLs are necessarily extensible as to sense (and words get new senses all the time), then can RLs that use NL symbols avoid this fate?

SN: The predicates of a representational language are consciously CONSTRUCTED. They do not exist except through the will of the acquirer. We can argue about the process of construction and how the elements of a representational language get realized in practice. But the crucial difference is that NLs HAPPEN, RLs are MADE. You presuppose somehow that an RL is not constructed but rather EXISTS. And if, indeed, RL symbols are allowed to be ambiguous, then having to know in what sense an RL symbol is being used simply sends the task of disambiguation one step further: to yet another, this time unambiguous, RL.

YW: Here is where we start to disagree strongly: for me, RLs are not made, or rather they are made up of existing NL bits, all too often English. And I can give no sense to the claim that we make symbols ambiguous or not; we have no such control of NLs or RLs.

SN: In some NLP applications, texts in an RL (such as the Mikrokosmos TMR, see, e.g. [15]) are typically used to represent the meaning of an NL text. Stating that RL elements are ambiguous is equivalent to saying that NL meaning cannot be truly extracted and represented. Another complication is the difficulty in using ambiguous RL statements as inputs to generation.

One can reconstruct the impetus for the above question in a deep pessimism about whether one can create an unambiguous RL. This is an important issue, though different from the original one. Briefly, there are two ways in which a representation language can be ambiguous. First, when one and the same RL representation can correspond to two or more non-synonymous NL texts and, secondly, when one and the same NL text can be represented with two or more distinct representations. In the latter case, an added issue is how to establish whether these distinct representations are synonymous or maybe differ only in the grain size of description (which might be considered allowable variation for the purposes of NLP applications). I believe that occurrences of ambiguity of the former kind are to be avoided if possible in RLs.
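The two kinds of ambiguity just distinguished can be made concrete with a small sketch. It is only an illustration under stated assumptions: the sentences, the RL expressions such as `FAIL(bank)`, the function name `find_ambiguities`, and the `synonymous` test are all invented here and are not taken from Mikrokosmos or any other system. In particular, deciding whether two NL texts are synonymous is simply handed to a caller-supplied predicate.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (NL text, RL representation) pairs, invented for illustration.
PAIRS = [
    ("The financial bank failed", "FAIL(bank)"),
    ("The river bank failed", "FAIL(bank)"),
    ("John runs", "MOVE-FAST(john)"),
    ("John runs", "LOCOMOTE(john, speed=high)"),
    ("Mary sleeps", "SLEEP(mary)"),
]


def find_ambiguities(pairs, synonymous=lambda a, b: a == b):
    """Return (first_kind, second_kind) ambiguities found in `pairs`.

    first_kind:  one RL representation shared by non-synonymous NL texts.
    second_kind: one NL text given two or more distinct representations.
    `synonymous` is an assumed, caller-supplied judgment on NL texts.
    """
    texts_by_repr = defaultdict(set)
    reprs_by_text = defaultdict(set)
    for text, rep in pairs:
        texts_by_repr[rep].add(text)
        reprs_by_text[text].add(rep)

    # A representation is ambiguous if any two of its texts are not synonymous.
    first_kind = {
        rep: sorted(texts)
        for rep, texts in texts_by_repr.items()
        if any(not synonymous(a, b) for a, b in combinations(sorted(texts), 2))
    }
    # A text is ambiguously represented if it maps to more than one expression.
    second_kind = {
        text: sorted(reps)
        for text, reps in reprs_by_text.items()
        if len(reps) > 1
    }
    return first_kind, second_kind


first_kind, second_kind = find_ambiguities(PAIRS)
```

On this toy data the first kind flags `FAIL(bank)` and the second kind flags ``John runs''. Whether the two representations of ``John runs'' are themselves synonymous, or differ only in grain size, is exactly the added issue mentioned above, and the sketch does not attempt to settle it.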

YW: Yes, but that is not what I mean by ambiguity in RLs. I see no difference between what you describe and the question of whether a passage of, say, Italian and one of French are synonymous. I have no problem allowing that they are, modulo Quinean doubts. My worry is about the symbols that comprise them, be they RLs or NLs. Fodor faces this problem no more than do formalists who write ``runs(John)'' and appear to simply know which of the many senses of ``run'' the symbol bears in that context.

SN: One reason is that formalists do not usually work with an RL which has an interpreted vocabulary (they sometimes refer to such an entity when they talk about models in model-theoretic semantics). Of course, ``runs(John)'' is not an expression in a natural language even if ``run'' is taken as an unspecified sense of the English word. The reason is almost trivial: the parentheses and the intended assignment of a function-like character to ``runs'' and an argument-like character to ``John''. Next, the artificial language used in this notation assumes an interpreter with human semantic capacity because a particular sense of the English word ``run'' must be selected, as well as a particular sense of ``John''. Of course, the most blatant abuse of the similarity with NL and the most overt proof that such notations expect human interpreters is the ending -s on the predicate. Logicians would not think twice about writing ``run(John&Mary)'', fully expecting the interpreter to understand that ``runs'' and ``run'' are identical! It is only in this sense that we can say that this representation is like English. The formal system in which such language is used never bothers to explain formally this reliance on the human processor, concentrating instead on studying the formal manipulations of symbols.

YW: Your run/runs point is a nice one and tells on my side, I feel, at least as to how formalists and would-be formalists actually use formulae in a casual, self-deceptive, way. Shall we now ask, what are the essential properties of being language-like and does a representational language have any of those properties, accidentally or necessarily?

SN: Suppose I set these out in a table as follows, where a plus against one of our initials means agreement with the purported feature to the left (as applied to NL or RL symbols), a minus means disagreement, a query means uncertainty, and +/- means both yes and no.

Feature                          NL              RL
ambiguous as to sense            SN:+  YW:+      SN:-    YW:+
extensible                       SN:+  YW:+      SN:+/-  YW:+
constructed (vs. accepted)       SN:-  YW:-      SN:+    YW:+
structures hierarchical (JF)
non-compositional
presupposes human processor:
  - at processing time           SN:+  YW:+      SN:-    YW:-
  - at acquisition time          SN:+  YW:+      SN:+    YW:- (?)
primitives are:
  NL words/phrases               SN:+  YW:+      SN:-    YW:+
  NL word-senses                 SN:-  YW:?      SN:+/-  YW:-



YW: Let us put this matter very crudely: you seem to believe that the classificatory hierarchy of, say, WordNet [13] consists of English words while that of Mikrokosmos does not. To me they seem both to consist of an ascending thesaurus of similar terms up to some notional top nodes, but I can see no difference between them IN PRINCIPLE, only in richness of structure. This point is merely about the interpretation of symbols in a verbal/ontology hierarchy and how they are normally interpreted.

SN: Normally interpreted by WHOM? By people? Or by computer programs?

One of the differences between us may simply relate to how information is stored in the static knowledge sources of an NLP system, and used as a basis for inferences on representations. If the representation is ambiguous, as I believe it would be if an RL were an NL, then the inference system would have to disambiguate RL symbols. The decision to retain ambiguity in an RL leads to the situation where disambiguation occurs AFTER a representation is obtained.

YW: I can accept of course that life would be simpler, if duller, if NLs consisted of unambiguous symbols, and the same goes for RLs. What I cannot grasp here is that it is, as you say, a matter of decision, to let a representation be ambiguous or not. I cannot understand that. Let me ask again at its simplest: how can you believe the elements in Mikrokosmos, or Schank's CD [21], or anything else like that, are other than English words, with their own sense ambiguity?

SN: This question may be understood in at least two different ways. Let me first comment on the one for which I feel I have a better repartee: surely RL elements are not just additional senses of NL words with which they share the ASCII codes. You wouldn't say that the Spanish MAYOR is another sense of the English MAYOR, would you? As to the second interpretation of your statement, let me just say that having a separate language for primitives helps to explain paradigmatic relations among word- and phrase-senses, such as synonymy, antonymy etc. Of course, as in WordNet, one can bypass this explicitness about relations through the use of devices like synsets, but then one ends up with a knowledge source which does not support all the operations necessary for automatic meaning processing.

YW: OK, so we are moving into Quinean territory now, as when he used the impossibility of veridical paraphrase and translation to attack the folk-content notion of meaning [19]. It would seem natural, if we want to defend the latter (what shall we call it: Commonsense semantics?), that we also believe in some provable/demonstrable notion of paraphrase. I would argue that even the present poor quality Information Extraction (IE) and MT are steps towards a practical notion of paraphrase, but that's not a philosophical defense.

SN: Do you mean that in order to defend the feasibility of meaning representation one needs to defend the feasibility of paraphrasing and translation? Clearly, even paraphrases and translations made by humans do not always convey EXACTLY the same meaning. The simple defense might be that philosophers habitually operate with an ideal RL and do not take into account the notion of the grain size of description, to say nothing about the possibility of a slip of judgment or an outright error on the part of acquirers.

YW: There is a venerable tradition of describing meaning through translation and paraphrase without representing it separately. It was Frege who seems to have wanted a functional notion of word meaning representation, or sense, (Sinn) that related surface entities (referents, or Bedeutungen, for Frege) but which did not ITSELF YIELD OR POSSESS CONTENT. For Frege [5] sense does not contain a coded meaning: it is just a function that allows you to specify or locate plausible referents in the world: a black box, or a sort of recognizer if you like!

SN: And that brings us back to the issue of whether a representation of text meaning is required or can simply be pointed at, and compared with the meaning of another text OR connected with a particular denotatum (which is in the real world, not in another set of symbols). In reality, we MODEL the world of denotata using a set of symbols because we simply cannot avoid it if we want to develop computer applications which require meaning representations.

YW: Well, of course, that is exactly what connectionists deny (they think they can give some sense to non-symbolic models), but I don't suppose we need bother with them here. And I admit that my obsessive questions about the exact status of primitives in a KR (and whether they are NL words or not) ignore Frege's best known injunction, which was not to consider the sense of the symbol OUTSIDE AN EXPRESSION. Will we get any closer to resolution here by considering a possible scale of NL-likeness for an RL:

English or Bulgarian
Esperanto
Predicate calculus
Some Interlingua

Are these all equally expressive and, if so, how could we know or prove it? If they are, then that is one NL/RL link: anything one can say in one, one can say in the other. Certainly many users of Predicate calculus and Interlingual formalisms have held this position of equal expressivity.

SN: An RL must support automatic inferencing operations. One might just consider the difficulty (or otherwise) of adapting any of the above kinds of RLs to this task. The major consideration is, again, whether the language is intended for people or for machines. The answer is easy in the case of English, Bulgarian and Esperanto; more problematic for predicate calculus; and impossible for interlinguas, which can be constructed. Any two independently constructed grammars of a language will be different, though they may well have the same weak or even strong generative capacity. The ideal interlingua would be good for both computers and people: it would support inferencing in a broad domain, thus permitting high-quality meaning representation for texts; and it would also be easy to repair and expand (which, for the foreseeable future, will remain largely a task for people).

YW: I am not sure this tells one way or the other on the language-likeness issue, though it does make one ask if we have a good notion of ``equivalent coverage'' for representational languages in the way we do for grammars.

A key phrase that may help clarify our difference is that you say NLs are comprised of words and RLs of word-senses. But language research is different from other AI areas because, in all areas but language, we can imagine a computer system being better than us: better than physicians or grandmaster chess players. We CANNOT imagine the system understanding language better than people, and this point is not often appreciated in some NLP areas.

SN: At the risk of sounding like a broken record, I'd like to insist that the purpose of a representation is to get the symbol ambiguity out, which is exactly what you think cannot be done. But does that point require an objective measure of symbol ambiguity, anywhere in our discussion or outside it?

YW: Somewhere in his discussion of what he calls ``The Concrete Lexicon versus The Abstract Dictionary'', Martin Kay [9] seemed to be arguing that the brain MUST subscript symbols to separate the senses of (brain) RL primitives within the Concrete Lexicon, i.e. the head. I have never been sure quite what he meant, but he was clearly discussing the same issue as us, as many have before, and he seemed to me to be roughly on my side here: conceding that RL atoms could be ambiguous and this would have to be resolved by the processor that used them, that is, the brain itself. Indeed, this is what you said earlier would be needed, and is not the same as the RL expressions being ambiguous, which you take me to mean here and I do not.

SN: If we must talk about the brain, I am agnostic. I don't know an awful lot about what is going on inside that device.

YW: No, neither do I, but people like us who talk about the nature of RLs for human knowledge must, like all AI workers, be making at least potential claims about the brain, whether we admit it or not.

SN: I wonder whether we indeed do. Maybe if we concentrate on representation by computers and for computers, we will be off the hook.


Gillian Callaghan 2000-03-29