The table element

Tables in C-Phrase correspond exactly to the SQL notion of tables and include columns, primary keys, foreign keys, etc. Still, we encourage administrators to consider tables as representing either entities or many-to-many relationships in some ER model of the database domain (foreign keys represent many-to-one and one-to-many relationships).

A simple example entity table element from geo.cphrase is:

    <table id="Lake" lex="lake,lakes,sjö,sjöar">
        <column id="name" type="text"/>
        <column id="area" type="int"/>
        <primary_key key="name"/>
    </table>

A many-to-many relationship between Lake and State is represented in the table:

    <table id="Lakeinstate" silent="true" lex="lake-state membership record,lake-state membership records,accesses,within,has within,is within">
        <column id="state" type="text" properties="none"/>
        <column id="lake" type="text" properties="none"/>
        <foreign_key from="state" to="State"/>
        <foreign_key from="lake" to="Lake"/>
    </table>

Finally, to complete the example, the many-to-one relationship between mountains and states is represented directly on the Mountain table element as a foreign key:

   <table id="Mountain" lex="mountain,mountains,mount,peak,summit">
        <column id="state" type="text" lex="state,states,located,where" properties="none"/>
        <column id="name" type="text"/>
        <column id="height" type="int" lex="height,heights,how tall" units="meters">
            <max lex="highest,tallest"/>
            <min lex="shortest"/>
        </column>
        <primary_key key="name"/>
        <foreign_key from="state" to="State" lex="in,that have,of"/>
    </table>
    

We find that distinguishing tables as entities (nouns) or relationships (verbs) fits very well with NLIs. Still these are only recommended design principles and are not enforced in any way within C-Phrase.

Returning to the more abstract notion of a table, table elements have the following DTD-based definition:

<!ELEMENT table (column|primary_key|unique|primary_value|foreign_key|virtual_column)*>
<!ATTLIST table
          id CDATA #REQUIRED
          lex CDATA #IMPLIED
          writes CDATA #IMPLIED
          select CDATA #IMPLIED
          silent CDATA #IMPLIED
          explore CDATA #IMPLIED
          alive CDATA #IMPLIED
          description CDATA #IMPLIED>

id is the exact identifier of the table in the underlying database (e.g. "Lake").

lex gives a comma separated string that associates a sequence of phrases with the given table. C-Phrase follows the convention that the first phrase of this list (e.g. "product") is the preferred singular form while the second phrase (e.g. "products") is the preferred plural form. If the table can be crossed and thus represent a relationship, the third phrase is what is used in the forward traversal of the relationship while the fourth form is used in the inverse direction. In Lakeinstate above, the first form is "lake-state membership record", the second is "lake-state membership records", the third is the crossing from state to lake and is "accesses" (e.g. "Michigan accesses Lake Superior") and the fourth phrase for lake to state is "within" (e.g. "Lakes within Michigan"). C-Phrase uses these phrases when generating paraphrases of queries.

writes takes the value of either "full", "insert", or (as default) "none". The value "full" means that unrestricted natural language updates are allowed to the underlying table. The value "insert" means that only single tuple inserts are allowed over the underlying table. The value "none", which is default, means that updates via natural language are disallowed. In the example above updates are thus not allowed over the Northwind products table. Also, if the user specified in the database connection is granted only read-only privileges over the underlying table, then updates will not be accepted regardless of how writes within C-Phrase is assigned.

select determines whether the table can be selected (in the SQL sense) via natural language. Defaulting to "true", the value "false" signals that select is disabled. Most often select is false if you want to only allow a relation to be used as a relationship. Still, it is fairly rare that you will need to turn select off.

silent determines whether the table can be used in a query without being explicitly mentioned. This is normally something that attaches to a many-to-many relationship table. For the geo example, the LakeInState table has silent turned on. This then lets users type in questions like "Lake california". Because it is marked as silent C-Phrase can include the table automatically during interpretation and infer that the question is "lakes within the State named California". This is an example of ellipsis in natural language. Currently, this is limited in C-Phrase to only one silent many-to-many relationship between any given pair of tables.

explore determines whether the table will appear in the explorer of the NLI Portal. This defaults to true, but for certain support tables, you may want to switch it to false.

Sub-elements

The sub-elements of tables consist of the familiar notions of columns, primary keys, uniques and foreign keys. Tables also represent the NLI-based notion of a primary values. Finally tables may also have virtual columns. These are treated almost exactly as regular columns, although the are obtained by applying substitutions of an SQL expression to derive the value of the attribute.