Natural Language Processing: Natural Language as Logic
I am trying to swap in enough background in Natural Language Processing and Understanding to build a new natural language-based game. One of my initial goals is to be able to convert an English sentence into a form of logic that can be used in a Prolog program to do something in the game.
Since I have effectively zero experience in this area, it took a lot of exploring to understand even the intro paragraphs to a lot of the academic research I’ve been reading. Here’s what I’ve found so far in case it helps other weary developers attempting the same thing.
Let me cut to the chase and then loop back and explain: I want some code that turns English into logic that I can use in a game. The best tool I’ve found so far is the “English Resource Grammar (ERG)”. That tool converts English into a predicate logic (ish) form. However, understanding how to use it takes a bunch of background. I won’t claim to have gotten deep into this yet, or even to have full understanding of the key concepts, but read on for what has helped me so far…
Mental Model for Natural Language Syntax and Semantics
Honestly, just figuring out the right mental model – the right way to think about – parsing and understanding natural languages took me a while. Maybe it is obvious, but let me make a sweeping generalization that has at least helped me frame the problem and is probably more or less true:
Parsing natural language is like parsing a computer language: A C++ parser turns the characters and words you type in a program into a form the computer can act on and an English parser takes the words in some utterance and turns them into a form a computer can act on.
C++ has a well-understood grammar that describes the allowed syntax, plus a set of rules for taking the grammar output and interpreting it to do something (the semantic rules). English is different: it appears to me that its grammar has been largely worked out, but the semantics is still a work in progress by linguists. So, if you start searching, you’ll still find lots of papers, debate, etc. about how to represent the meaning (i.e. semantics) of an utterance in a form a computer can do something with. We still don’t have a complete set of rules that maps language to a meaning that agrees with the mapping most humans would do.
[BTW: The analogy is by no means perfect, kind of like comparing a cheetah and a car when trying to understand movement. Human language is organic and fuzzy and resilient, the C++ language is designed and precise and brittle. Regardless, the point is that, in my game, the user is going to type English and the game needs to do something and show that it understood things in a deep way (not based on keywords, etc.), much like what a compiler does with program text.]
All that said, lots of practical progress has been made and below is a summary of some key ideas, concepts and approaches I’ve found so far.
Is Natural Language Logical?
The first step is to decide if it is even feasible to convert the meaning of an English utterance to a rigorous logical form. As far as I can tell, whether this can be done for every statement is still an open question, but it hasn’t been disproven, and a ton of progress has been made. It looks like there were two people whose contributions threw the area wide open with a one-two punch: Noam Chomsky claimed that there were formalizable rules for syntax (i.e. the ways a sentence can be put together) and Richard Montague claimed the same thing for semantics (i.e. the meaning of utterances).
“Montague’s idea that a natural language like English could be formally described using logicians’ techniques was a radical one at the time.” Encyclopedia of Language and Linguistics
As I’ve explored the intersection of linguistics and logic, I’ve encountered the term “Montague Grammar” constantly. Unfortunately, his founding paper, “Universal Grammar” (1970), is behind a paywall. The internet abounds with great summaries, however. The best I’ve found at getting at the gist of what Montague was saying, at least with respect to how to compose things together, is Compositionality (Janssen, 1997).
His papers and books are so influential that they have gained acronyms (maybe this is something that happens a lot, but I haven’t noticed it before). “PTQ”, or “The Proper Treatment of Quantification in Ordinary English” (1973), appears to be the most important for understanding the semantic parsing tool that I’m focused on now: “The English Resource Grammar.”
I’ll be honest, there is no way I made it through that document and my eyes glazed over in the executive summary I found as well. That is some dense reading.
After a ton of searching, I found some course notes on Generalized Quantifier Theory that gave me the basic understanding I needed of what logic quantifiers (logic statements like “There exists an x such that:” or “For all x:”) have to do with natural language semantics. Getting through the first 3 classes there will give you enough basics to start understanding why the terms “quantifier” and “quantification” come up so often in the ERG and why they are important for converting English into logic. I definitely recommend reading those course notes (interestingly, the good ol’ encyclopedia has a really good overview as well).
The gist of it is this: Many parts of a sentence can just be translated into simple predicates that specify properties of other things in predicate logic: the “blue” in “blue hat” can be converted to asserting a property “blue” of an object “hat” –> “blue(hat)”. However, many cannot. Instead, they are specifying a “scope” that then gets applied to other parts of the sentence.
An easy example is “everyone” as in “everyone is happy”. We can’t represent it like “happy(everyone)” because this would be interpreted as an individual named “everyone”. “Everyone” is specifying a whole domain, much like the logic quantifier “For all X…” (i.e. ∀X) is. It would be more accurate to represent it as “∀X(happy(X))”.
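Here is a minimal Python sketch of the difference between the two readings, just to make it concrete. The tiny “world” (the set of people and who is happy) is made up for illustration; nothing here comes from the ERG.

```python
# Hypothetical toy world: these names and facts are invented for illustration.
people = {"alice", "bob", "carol"}
happy_facts = {"alice", "bob", "carol"}


def happy(x):
    """Predicate: true if the individual x is recorded as happy."""
    return x in happy_facts


# Naive reading: treats "everyone" as the name of a single individual.
naive = happy("everyone")

# Quantified reading: for all X in the domain, happy(X).
quantified = all(happy(x) for x in people)

print(naive, quantified)
```

The naive version fails because there is no individual called “everyone” in the world; the quantified version ranges over the whole domain, which is what the English sentence means.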
OK, so now we’ve got two different ways to represent things. Wouldn’t it be nice to have a single unifying way to think about modeling natural language? Yes it would. Generalized Quantifiers give you that. Read the links.
Once you understand Generalized Quantifiers, it appears you have begun your journey to understanding the main path for converting English into a form of logic. At least from my reading of the Internet.
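As I understand it, the core move in Generalized Quantifier Theory is to treat a determiner like “every” or “most” as a relation between two sets: the things the noun describes and the things the rest of the sentence describes. Below is a rough Python sketch of that idea. The function names and the sample sets are my own assumptions for illustration, not anything defined by the ERG.

```python
def every(restriction, scope):
    """'Every A is B': A is a subset of B."""
    return restriction <= scope


def some(restriction, scope):
    """'Some A is B': A and B share at least one member."""
    return bool(restriction & scope)


def no(restriction, scope):
    """'No A is B': A and B share no members."""
    return not (restriction & scope)


def most(restriction, scope):
    """'Most As are B': more As are B than are not B."""
    return len(restriction & scope) > len(restriction - scope)


# Hypothetical toy world, invented for illustration.
hats = {"hat1", "hat2", "hat3"}
blue = {"hat1", "hat2", "sky"}

results = {
    "every hat is blue": every(hats, blue),
    "some hat is blue": some(hats, blue),
    "no hat is blue": no(hats, blue),
    "most hats are blue": most(hats, blue),
}
print(results)
```

The nice part is that “most”, which can’t be written with plain ∀ and ∃, fits the same two-set pattern as “every” and “some”.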
(Neo) Davidsonian Representations
Generalized Quantifiers get you a lot of the way there, but if you start trying to understand how verbs are represented in many (most?) descriptions of turning language into logic (including the ERG) you’ll run head-on into “(Neo)Davidsonian representations”. “Davidsonian semantics” was developed by Donald Davidson and articulated in his paper “The Logical Form of Action Sentences” (1967).
Before Davidson (hand waving here…), you may have modelled “Brutus stabbed Caesar” as “stabbed(Brutus, Caesar)”. Seems logical (so to speak). The problem is how to deal with “brutally stabbed”, “with a knife”, or other phrases that modify the stabbing event. He proposed that verbs introduce an “event variable” that can be used and modified elsewhere, like this: “stabbed(e, Brutus, Caesar) & brutal(e)”. So far so good.
The “Neo” part comes from Terry Parsons’ “Events in the Semantics of English”. He noticed that writing “stabbed(e, Brutus, Caesar)” still gives Brutus and Caesar a special place in the event, and asked: why not make this more general, like this: “stabbed(e) & Agent(e, Brutus) & Target(e, Caesar) & brutal(e)”?
This type of representation solves a few problems. First, now we can ask things about e that were difficult (impossible?) to ask when it was a single predicate: “Who did it?” (Agent(e, X) -> X = Brutus), “Was it brutal?” (brutal(e) -> true), etc. Plus: we don’t have to figure out how many arguments to have in the predicate. We just keep adding predicates as long as we need to.
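To convince myself this works, here is a small Python sketch that stores the Neo-Davidsonian form as a set of facts and answers the two questions above. The tuple layout, the helper names, and the “Instrument” role label are all my own assumptions for illustration; this is not how the ERG represents things.

```python
# Each fact is a tuple: (predicate, event) or (role, event, participant).
# "e1" is a hypothetical event variable for the stabbing.
facts = {
    ("stabbed", "e1"),
    ("Agent", "e1", "Brutus"),
    ("Target", "e1", "Caesar"),
    ("brutal", "e1"),
}


def holds(*fact):
    """True if exactly this fact has been asserted."""
    return tuple(fact) in facts


def lookup(role, event):
    """Return every X such that role(event, X) has been asserted."""
    return sorted(
        f[2] for f in facts
        if len(f) == 3 and f[0] == role and f[1] == event
    )


who_did_it = lookup("Agent", "e1")      # "Who did it?"
was_it_brutal = holds("brutal", "e1")   # "Was it brutal?"

# A new modifier is just one more fact; no predicate changes its arity.
facts.add(("Instrument", "e1", "knife"))
with_what = lookup("Instrument", "e1")

print(who_did_it, was_it_brutal, with_what)
```

In Prolog the same thing would be a handful of facts and queries against them, which is part of why this representation looks like a good fit for a game.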
I’m glossing over a lot, but that is the gist as far as I can tell.
Again, it helps to remember that this is like archeologists figuring out the C++ semantics without any specifications. We are slowly building up a model that lets you represent an English statement in a logical form, and each breakthrough clears up a part of the “English Specification”.
Next Steps
I’m continuing to dive into trying to get the output of the “English Resource Grammar (ERG)” into a form I can use in a game. Stay tuned…