The table element
Tables in C-Phrase correspond exactly to the SQL notion of tables and include columns, primary keys, foreign keys, etc. Still, we encourage administrators to consider tables as representing either entities or many-to-many relationships in some ER model of the database domain (foreign keys represent many-to-one and one-to-many relationships).
A simple example entity table element from geo.cphrase
is:
<table id="Lake" lex="lake,lakes,sjö,sjöar"> <column id="name" type="text"/> <column id="area" type="int"/> <primary_key key="name"/> </table>
A many-to-many relationship between Lake
and State
is represented in the table:
<table id="Lakeinstate" silent="true" lex="lake-state membership record,lake-state membership records,accesses,within,has within,is within"> <column id="state" type="text" properties="none"/> <column id="lake" type="text" properties="none"/> <foreign_key from="state" to="State"/> <foreign_key from="lake" to="Lake"/> </table>
Finally, to complete the example, the many-to-one relationship between mountains and states is represented directly on the
Mountain
table element as a foreign key:
<table id="Mountain" lex="mountain,mountains,mount,peak,summit"> <column id="state" type="text" lex="state,states,located,where" properties="none"/> <column id="name" type="text"/> <column id="height" type="int" lex="height,heights,how tall" units="meters"> <max lex="highest,tallest"/> <min lex="shortest"/> </column> <primary_key key="name"/> <foreign_key from="state" to="State" lex="in,that have,of"/> </table>
We find that distinguishing tables as entities (nouns) or relationships (verbs) fits very well with NLIs. Still these are only recommended design principles and are not enforced in any way within C-Phrase.
Returning to the more abstract notion of a table, table
elements have the following DTD-based definition:
<!ELEMENT table (column|primary_key|unique|primary_value|foreign_key|virtual_column)*> <!ATTLIST table id CDATA #REQUIRED lex CDATA #IMPLIED writes CDATA #IMPLIED select CDATA #IMPLIED silent CDATA #IMPLIED explore CDATA #IMPLIED alive CDATA #IMPLIED description CDATA #IMPLIED>
id is the exact identifier of the table in the underlying database (e.g. "Lake").
lex gives a comma separated string that associates a sequence of
phrases with the given table. C-Phrase follows the convention that the first phrase of this list (e.g.
"product") is the preferred singular form while the second phrase (e.g. "products") is the
preferred plural form. If the table can be crossed and thus represent a relationship, the third phrase is
what is used in the forward traversal of the relationship while the fourth form is used in the
inverse direction. In Lakeinstate
above, the first form is "lake-state membership record",
the second is "lake-state membership records", the third is the
crossing from state to lake and is "accesses" (e.g. "Michigan accesses Lake Superior") and the fourth
phrase for lake to state is "within" (e.g. "Lakes within Michigan").
C-Phrase uses these phrases when generating paraphrases of queries.
writes takes the value of either "full", "insert", or (as default) "none".
The value "full" means that unrestricted natural language updates are allowed to the underlying table. The value
"insert" means that only single tuple inserts are allowed over the underlying table. The value "none", which is
default, means that updates via natural language are disallowed. In the example above
updates are thus not allowed over the Northwind products
table. Also, if the user specified
in the database connection is granted only read-only privileges over the underlying table, then updates will not
be accepted regardless of how writes
within C-Phrase is assigned.
select determines whether the table can be selected (in the SQL sense)
via natural language. Defaulting to "true", the value "false"
signals that
select is disabled. Most often select is false if you want to only allow a relation to be used as a relationship.
Still, it is fairly rare that you will need to turn select off.
silent determines whether the table can be used in a query without being explicitly mentioned.
This is normally something that attaches to a many-to-many relationship table. For the geo example, the
LakeInState
table has silent turned on. This then lets users type in questions like "Lake california".
Because it is marked as silent C-Phrase can include the table automatically during interpretation and infer that the
question is "lakes within the State named California". This is an example of ellipsis in natural language.
Currently, this is limited in C-Phrase to only one silent many-to-many relationship between any given pair of tables.
explore determines whether the table will appear in the explorer of the NLI Portal. This defaults to true, but for certain support tables, you may want to switch it to false.
Sub-elements
The sub-elements of tables consist of the familiar notions of columns, primary keys, uniques and foreign keys. Tables also represent the NLI-based notion of a primary values. Finally tables may also have virtual columns. These are treated almost exactly as regular columns, although the are obtained by applying substitutions of an SQL expression to derive the value of the attribute.