Converting Phrases to MRS and Well-Formed Trees
To use our completed backtracking solver, we need to write the code that will convert a human phrase into TreePredications
and call this solver with them. So, we need to generate all the MRS documents for the phrase and all the well-formed trees for the MRS documents.
To do this, we’ll write code to use the ACE parser to convert a phrase into an MRS document. We can use the ACEParser
class from pydelphin
to do this. The only trick is that we need to supply a grammar file. The grammar file tells ACE which language we are speaking. It is platform dependent, so we’ve got a helper function that determines which one to return for the current user:
def mrss_from_phrase(self, phrase):
# Don't print errors to the screen
f = open(os.devnull, 'w')
# Create an instance of the ACE parser and ask to give <= 25 MRS documents
with ace.ACEParser(self.erg_file(), cmdargs=['-n', '25'], stderr=f) as parser:
ace_response = parser.interact(phrase)
for parse_result in ace_response.results():
# Keep track of the original phrase on the object
mrs = parse_result.mrs()
mrs.surface = phrase
yield mrs
def erg_file(self):
if sys.platform == "linux":
ergFile = "erg-2020-ubuntu-perplexity.dat"
elif sys.platform == "darwin":
# Mac returns darwin for both M1 and Intel silicon, need to dig deeper
unameResult = platform.uname()
if "ARM" in unameResult.version:
# M1 silicon
ergFile = "erg-2020-osx-m1-perplexity.dat"
else:
# Intel silicon
ergFile = "erg-2020-osx-perplexity.dat"
else:
ergFile = "erg-2020-ubuntu-perplexity.dat"
return ergFile
Next, we need to take those MRS documents and turn them into well-formed trees. For this, we’ll create a function called trees_from_mrs()
. It will call the function we wrote in the section on well-formed trees called valid_hole_assignments()
that does the assignments of predication labels to “holes” as discussed in that section. It will then call the tree_from_assignments()
function (also included below) that does the work of actually building a tree from those assignments and represents the tree using the text format we designed in the MRS to Python topic:
def trees_from_mrs(self, mrs):
# Create a dict of predications using their labels as each key
# for easy access when building trees
# Note that a single label could represent multiple predications
# in conjunction so we need a list for each label
mrs_predication_dict = {}
for predication in mrs.predications:
if predication.label not in mrs_predication_dict.keys():
mrs_predication_dict[predication.label] = []
mrs_predication_dict[predication.label].append(predication)
# Iteratively return well-formed trees from the MRS
for holes_assignments in valid_hole_assignments(mrs, self.max_holes):
# valid_hole_assignments can return None if the grammar returns something
# that doesn't have the same number of holes and floaters (which is a grammar bug)
if holes_assignments is not None:
# Now we have the assignments of labels to holes, but we need
# to actually build the *tree* using that information
well_formed_tree = tree_from_assignments(mrs.top,
holes_assignments,
mrs_predication_dict,
mrs)
pipeline_logger.debug(f"Tree: {well_formed_tree}")
yield well_formed_tree
def tree_from_assignments(hole_label, assignments, predication_dict, mrs):
# Get the list of predications that should fill in the hole
# represented by labelName
if hole_label in assignments.keys():
predication_list = predication_dict[assignments[hole_label]]
else:
predication_list = predication_dict[hole_label]
# predication_list is a list because multiple items might
# have the same key and should be put in conjunction (i.e. be and'd together)
conjunction_list = []
for predication in predication_list:
tree_node = [predication.predicate]
# Recurse through this predication's arguments
# and look for any scopal arguments to recursively convert
for arg_name in predication.args.keys():
original_value = predication.args[arg_name]
# CARG arguments contain strings that are never
# variables, they are constants
if arg_name in ["CARG"]:
new_value = original_value
else:
argType = original_value[0]
if argType == "h":
new_value = tree_from_assignments(original_value, assignments, predication_dict, mrs)
else:
new_value = original_value
tree_node.append(new_value)
conjunction_list.append(tree_node)
# Since these are "and" they can be in any order
# Sort them into an order which ensures event variable
# usage comes before introduction (i.e. ARG0)
return sort_conjunctions(conjunction_list)
The sort_conjunctions()
function isn’t shown because it is not a small amount of code and it isn’t important to understanding the material here. It is there because our evaluation model evaluates predications in a depth-first manner. Terms that are in conjunction need to be evaluated in a particular order so that event arguments are filled in before they are used, and sort_conjunctions()
does this. You can browse the code for it here.
With all that, we can now write code that takes a phrase and generates all the trees from it:
Todo: update example
The next topic will describe a heuristic for determining which of those trees is the one the user meant.
Comprehensive source for the completed tutorial is available here.
Last update: 2023-05-14 by EricZinda [edit]