Translating thought into language

The essence of one’s thought is a very hard thing to describe. And even though we often think to ourselves in terms of words, there is something nebulous and remote which exists before the words appear. Perhaps it is an inclination of the soul, or what some describe as “intent”, but by casting that essence into words we make it concrete.

The nature of this essential thought is highly complex, which suggests that something is probably lost when we force it into a few simple words. Usually it suffices for us when we think with these words, because we are also aware of the subtle context surrounding the choice of those words. What “anger” means to two people can greatly differ, but we have no trouble using “anger” when thinking to ourselves, because we know exactly what we mean.

Some words have a common definition, such as “horse” or “table”. Other words usually require clarification in order to avoid misunderstandings. The degree of clarification needed is somehow related to the complexity of the original thought which prompted the choice of that word.

The role of this “clarification” can be viewed as follows: In the field of our minds, imagine that a certain tree is growing. This tree provides us with fruit, shade, and many other wonderful things.

One day a friend notices how happy we are, and they wonder why. We tell them about the tree and decide they should have one of their own. But because a tree is very difficult to uproot (and uprooting sometimes kills it), we have to give them a tiny little seed, which they can plant themselves.

The seed is not the tree, but in another way it is. If the person treats it well, it will grow up similar to (but not exactly the same as) the original. The smallness of the seed and its apparent uselessness – there’s no shade or fruit whatsoever – represents the difference between the subtlety of our thoughts, and the words we use to communicate. The “context” spoken of before is the care and feeding that the seed requires in order to grow properly.

What is happening is that the first person is taking a very complex thing, call it meaning, and encapsulating that meaning into something portable. That portable something, the seed, requires a few other things in order for a second tree to be created.

In a similar way, the essence of our thoughts is captured in words, but these words mean nothing without a common understanding of: (1) a basic definition of the word, (2) the specific “slant” I intend for the word to have, and (3) why I’m saying the word at all.

If the other person can understand these three things, then a second essence, very much like our own (though not exactly the same), will appear in their mind.

This basically describes what happens when we “parse” the words of another person. They are spewing basic units of information at us (words), which have a common basic definition (that is, they have a lexical form, or a basic representation which we can understand simply from knowing the same language). These lexical forms combine in certain patterns to convey a specific intent. Telling someone, “Your house is on fire!” says something very different from “Is your house on fire?” The lexical units, or lexemes, are the same (is, house, fire, your, on), but the combinations imply a different meaning. Also, only certain combinations are “legal”. What determines legality is something called syntax. Syntax specifies how lexemes can be sequenced. Syntax also indicates that certain sequences have a specific meaning: placing “your” in front of “house” means that the house belongs to you. But “house your” is not legal syntax.
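The two ideas in this paragraph, that the same lexemes can be sequenced differently and that syntax decides which sequences are legal, can be sketched in a few lines of Python. This is only an illustration: the word lists and the single rule in `is_legal` are assumptions made up for the example, not a real grammar of English.

```python
import re

def lexemes(sentence):
    """Split a sentence into lowercase word lexemes, dropping punctuation."""
    return re.findall(r"[a-z]+", sentence.lower())

# Toy syntax (an assumption for this sketch): a possessive must be
# immediately followed by a noun.
POSSESSIVES = {"your", "my"}
NOUNS = {"house", "fire"}

def is_legal(words):
    """Return True if every possessive is directly followed by a noun."""
    for i, word in enumerate(words):
        if word in POSSESSIVES:
            if i + 1 >= len(words) or words[i + 1] not in NOUNS:
                return False
    return True

statement = lexemes("Your house is on fire!")
question = lexemes("Is your house on fire?")

same_lexemes = sorted(statement) == sorted(question)  # same set of units
same_sequence = statement == question                 # but a different order

print(same_lexemes, same_sequence)
print(is_legal(lexemes("your house")), is_legal(lexemes("house your")))
```

The statement and the question contain exactly the same lexemes and differ only in their order, and the toy rule accepts “your house” while rejecting “house your”.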

This notion of meaning is called semantics. Several syntaxes can express the same semantic intent. “Your house”, “Dein Haus”, “Su casa”, all say exactly the same thing. In fact, they even have the same syntax; it is the lexemes which differ. However, each syntax defines its legal lexemes, making these three phrases part of three different syntaxes, or grammars. Normally we refer to English and German as languages, but we could also refer to them as grammars, each of which implies a single syntax with its own lexical dictionary.

The same syntax can also have different semantics. This difference is usually determined by the context of the expression, such as whether “Su casa” means “his house” or “your house”. It depends on the subject.

It is the job of a translator to take a sentence from one syntax and re-express that sentence using a different syntax, but without changing the semantics. Translating “Mein Haus” to “My house” does the job faithfully, without changing the meaning at all. When two computer languages are involved, and the second language is some kind of machine code, we call the translating program a compiler.
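A translator of this kind can be sketched as two steps: parse the source words into a language-neutral meaning, then generate words for that meaning in the target language. The sketch below is a toy under stated assumptions: the two lexicons hold only the words of the example, the names `GERMAN`, `ENGLISH`, `parse`, `generate` and `translate` are invented for illustration, and both languages are assumed to share the same word order.

```python
# Hypothetical mini-lexicons, just large enough for the example phrase.
# Each German word maps to a (category, concept) pair: its meaning.
GERMAN = {
    "mein": ("possessive", "speaker"),
    "haus": ("noun", "house"),
}

# Each meaning maps back to an English word.
ENGLISH = {
    ("possessive", "speaker"): "my",
    ("noun", "house"): "house",
}

def parse(phrase, lexicon):
    """Turn the words of a phrase into a list of language-neutral meanings."""
    meaning = []
    for word in phrase.lower().split():
        if word not in lexicon:
            raise ValueError(f"unknown lexeme: {word!r}")
        meaning.append(lexicon[word])
    return meaning

def generate(meaning, lexicon):
    """Express a list of meanings using the words of the target lexicon."""
    words = [lexicon[item] for item in meaning]
    return " ".join(words).capitalize()

def translate(phrase):
    """Re-express a German phrase in English without changing its meaning."""
    return generate(parse(phrase, GERMAN), ENGLISH)

print(translate("Mein Haus"))
```

The intermediate list of meanings is the part that survives the translation unchanged; only the lexemes around it are swapped. A compiler follows the same outline, with a programming language on one side and machine code on the other.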

A compiler works in much the same way as two people who are having a conversation. To an outside observer, nothing is really passing between the two individuals but sound waves. The sound waves are fluid, having high points, low points, and strange fluctuations.

The hearer in the conversation first chops these sounds into units. He takes in each unit, in relation to the one before, and builds up sentences. As the sentences grow, their meaning is formed until the hearer understands what the other person was saying. But understanding cannot happen at the level of words; it happens in units of its own. Sometimes these units are very small, such as sentences, and sometimes they are very large. There is actually a gradation of size, starting with a single unit that encompasses the entire conversation, then each theme in the conversation, and further down to each point within the theme.
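The hearer's two steps, chopping a continuous stream into units and then grouping those units into sentences, can be sketched as follows. Text stands in for sound here, and the function names `chop_into_units` and `build_sentences` are hypothetical; real speech recognition is far more involved than this.

```python
import re

SENTENCE_ENDS = {".", "!", "?"}

def chop_into_units(stream):
    """Chop a raw stream into word units and sentence-ending marks."""
    return re.findall(r"[A-Za-z']+|[.!?]", stream)

def build_sentences(units):
    """Group word units into sentences, closing one at each ending mark."""
    sentences = []
    current = []
    for unit in units:
        if unit in SENTENCE_ENDS:
            if current:
                sentences.append((current, unit))
                current = []
        else:
            current.append(unit)
    if current:
        # Words left over after the last ending mark form an open sentence.
        sentences.append((current, ""))
    return sentences

stream = "Your house is on fire! Is your house on fire?"
units = chop_into_units(stream)
sentences = build_sentences(units)

for words, mark in sentences:
    print(words, mark)
```

Each stage works on larger units than the one before it: characters become words, and words become sentences. Themes and whole conversations would be further stages stacked on top in the same way.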

A computer language communicates in units which parallel these in many ways. A program is made up of routines, along with statements which express relationships among routines and data.
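As one concrete illustration of these units, Python's standard `ast` module can parse a small program and show how it breaks down into routines and statements. The program in `SOURCE` is made up for the example; other languages divide their programs differently in detail.

```python
import ast

# A tiny made-up program: two routines and one top-level statement.
SOURCE = '''
def greet(name):
    message = "Hello, " + name
    return message

def shout(name):
    return greet(name).upper()

print(shout("world"))
'''

tree = ast.parse(SOURCE)

# The largest unit is the whole program; its body is a list of top-level units.
top_level = [type(node).__name__ for node in tree.body]

# Each routine is itself a unit made up of smaller units: its statements.
routines = {
    node.name: len(node.body)
    for node in tree.body
    if isinstance(node, ast.FunctionDef)
}

print(top_level)
print(routines)
```

The gradation mirrors the conversation: the program as a whole, then each routine, then each statement within a routine.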