The question element
The question element represents a question that is presented to the user as a literal text string that they can submit to C-Phrase. Question elements may also be used in testing, and questions can be generated from logs of user utterances.
<!ELEMENT question (sql)*>
<!ATTLIST question
  text     CDATA #REQUIRED
  status   CDATA #IMPLIED
  comments CDATA #IMPLIED>
text is the literal text of the question.
status can be empty or one of the following constants: "success", "adequate", "inadequate", "failure", "excluded", "pending". We have found these keywords to be a good mix for administrators who are marking questions in a corpus collected from logs.
comments allows you to attach arbitrary documentation to a question.
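Because status is declared as plain CDATA, the DTD itself does not restrict it to the constants listed above. The following is a minimal sketch, not part of C-Phrase, of how a corpus could be checked for misspelled status values using Python's standard library. The function name invalid_statuses and the last two sample questions are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Status constants listed in this section; an absent or empty status is also allowed.
ALLOWED_STATUSES = {"success", "adequate", "inadequate", "failure", "excluded", "pending"}

def invalid_statuses(corpus_xml):
    """Return (text, status) pairs for questions whose status is not a known constant."""
    root = ET.fromstring(corpus_xml)
    bad = []
    for q in root.iter("question"):
        status = q.get("status", "")
        # Empty or missing status is permitted, so only non-empty values are checked.
        if status and status not in ALLOWED_STATUSES:
            bad.append((q.get("text"), status))
    return bad

# Hypothetical sample: the second question has a misspelled status.
sample = """<corpus>
  <question text="give me the largest state" status="success"/>
  <question text="rivers in utah" status="sucess"/>
  <question text="capital of texas"/>
</corpus>"""

print(invalid_statuses(sample))
```

A check like this can be useful after a corpus has been marked up by hand, since a mistyped status would otherwise pass unnoticed.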
The only sub-element of question is the sql element, which gives the target SQL that the question corresponds to. In cases of ambiguity there can be several sql sub-elements.
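To illustrate how a question with several sql sub-elements can be read, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not C-Phrase code. The helper name load_questions is an assumption made for this example, and the sample reuses two questions from the geo corpus.

```python
import xml.etree.ElementTree as ET

def load_questions(corpus_xml):
    """Map each question's text to the list of its target SQL queries."""
    root = ET.fromstring(corpus_xml)
    return {
        q.get("text"): [s.get("query") for s in q.findall("sql")]
        for q in root.iter("question")
    }

sample = """<corpus>
  <question text="what is the population of new york" status="success">
    <sql query="SELECT DISTINCT X1.population FROM City AS X1 WHERE X1.name='New York'"/>
    <sql query="SELECT DISTINCT X1.population FROM State AS X1 WHERE X1.name='New York'"/>
  </question>
  <question text="give me the cities in california" status="success">
    <sql query="SELECT DISTINCT X1.name,X1.state,X1.population FROM City AS X1, State AS X2 WHERE X1.state=X2.name AND X2.name='California'"/>
  </question>
</corpus>"""

questions = load_questions(sample)
# A question is ambiguous when it has more than one sql sub-element.
ambiguous = [text for text, queries in questions.items() if len(queries) > 1]
print(ambiguous)
```

Here "what is the population of new york" is reported as ambiguous because New York can name either a city or a state, so the question carries one sql element per reading.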
Here is an example corpus file for geo.
<?xml version="1.0" encoding="UTF-8"?>
<corpus>
  <question text="give me the cities in california" status="success">
    <sql query="SELECT DISTINCT X1.name,X1.state,X1.population FROM City AS X1, State AS X2 WHERE X1.state=X2.name AND X2.name='California'"/>
  </question>
  <question text="what is the population of new york" status="success">
    <sql query="SELECT DISTINCT X1.population FROM City AS X1 WHERE X1.name='New York'"/>
    <sql query="SELECT DISTINCT X1.population FROM State AS X1 WHERE X1.name='New York'"/>
  </question>
  <question text="give me the largest state" status="success">
    <sql query="SELECT DISTINCT X1.name,X1.population,X1.area,X1.density,X1.capital FROM (SELECT DISTINCT Y1.* FROM State AS Y1 WHERE Y1.area IS NOT NULL ORDER BY Y1.area IS NULL, Y1.area DESC LIMIT 1) AS X1"/>
    <sql query="SELECT DISTINCT X1.name,X1.population,X1.area,X1.density,X1.capital FROM (SELECT DISTINCT Y1.* FROM State AS Y1 WHERE Y1.density IS NOT NULL ORDER BY Y1.density IS NULL, Y1.density DESC LIMIT 1) AS X1"/>
    <sql query="SELECT DISTINCT X1.name,X1.population,X1.area,X1.density,X1.capital FROM (SELECT DISTINCT Y1.* FROM State AS Y1 WHERE Y1.population IS NOT NULL ORDER BY Y1.population IS NULL, Y1.population DESC LIMIT 1) AS X1"/>
  </question>
  <question text="give me the states that border utah" status="success">
    <sql query="SELECT DISTINCT X1.name,X1.population,X1.area,X1.density,X1.capital FROM State AS X1, Border AS X2, State AS X3 WHERE X1.name=X2.state1 AND X2.state2=X3.name AND X3.name='Utah'"/>
  </question>
  <question text="smallest city in the largest state" status="success">
    <sql query="SELECT DISTINCT X1.name,X1.state,X1.population FROM (SELECT DISTINCT Y1.* FROM City AS Y1, (SELECT DISTINCT Z1.* FROM State AS Z1 WHERE Z1.area IS NOT NULL ORDER BY Z1.area IS NULL, Z1.area DESC LIMIT 1) AS Y2 WHERE Y1.state=Y2.name AND Y1.population IS NOT NULL ORDER BY Y1.population IS NULL, Y1.population ASC LIMIT 1) AS X1"/>
    <sql query="SELECT DISTINCT X1.name,X1.state,X1.population FROM (SELECT DISTINCT Y1.* FROM City AS Y1, (SELECT DISTINCT Z1.* FROM State AS Z1 WHERE Z1.density IS NOT NULL ORDER BY Z1.density IS NULL, Z1.density DESC LIMIT 1) AS Y2 WHERE Y1.state=Y2.name AND Y1.population IS NOT NULL ORDER BY Y1.population IS NULL, Y1.population ASC LIMIT 1) AS X1"/>
    <sql query="SELECT DISTINCT X1.name,X1.state,X1.population FROM (SELECT DISTINCT Y1.* FROM City AS Y1, (SELECT DISTINCT Z1.* FROM State AS Z1 WHERE Z1.population IS NOT NULL ORDER BY Z1.population IS NULL, Z1.population DESC LIMIT 1) AS Y2 WHERE Y1.state=Y2.name AND Y1.population IS NOT NULL ORDER BY Y1.population IS NULL, Y1.population ASC LIMIT 1) AS X1"/>
  </question>
  <question text="3 tallest Mountains in alaska" status="success">
    <sql query="SELECT DISTINCT X1.state,X1.name,X1.height FROM (SELECT DISTINCT Y1.* FROM Mountain AS Y1, State AS Y2 WHERE Y1.state=Y2.name AND Y1.height IS NOT NULL AND Y2.name='Alaska' ORDER BY Y1.height IS NULL, Y1.height DESC LIMIT 3) AS X1"/>
  </question>
  <question text="3 tallest mountains on west coast" status="success">
    <sql query="SELECT DISTINCT X1.state,X1.name,X1.height FROM (SELECT DISTINCT Y1.* FROM Mountain AS Y1, State AS Y2 WHERE Y1.state=Y2.name AND Y1.height IS NOT NULL AND Y2.name IN('California','Oregon','Washington') ORDER BY Y1.height IS NULL, Y1.height DESC LIMIT 3) AS X1"/>
  </question>
</corpus>