Queries


The Art & Science of The Query

A query is a question. A query allows you to retrieve and manipulate information that meets certain criteria from the data stored within a table. However, a query isn't as simple as asking a plain english question. It is made up of criteria, operations, and conditions to compare that which is stored against that which is desired.

MadSearch2 uses a scaled-down form of the Query-By-Example (QBE) query language. QBE specifies queries through a graphical representation with a bit of formal syntax thrown in. It was chosen as MadSearch2's query language because of its approachability for the average user.

You may be wondering how queries fit within the grand scheme of the universe. We can't answer this. We don't understand this ourselves. However, we can tell you how they fit within an application.

First, the basics: MadSearch2 requires input to provide output. Input, in this case, will be a query provided by an HTML form. Output will be both an answer table containing the records applicable to our query as well as a report -- a graphical presentation of our answer records' data. The answer table is a temporary storage area for records that answer our query. However, we need never concern ourselves with it.

In this section, "Queries", we will discuss how to construct queries and provide them to MadSearch2 as input. We will look at the output of information through reports in the next section, "Reports".


The Elements of a QBE Query

Before we discuss the syntax, structure, and details of QBE queries, let's look at the components of any query. All queries made to MadSearch2 have certain elements in common. They're the things MadSearch2 needs to know in order to provide us with what we want. You need to specify:

  • the name and location of the table that you wish to query
  • the data fields within the table that are of interest to you
  • what you want to do with these data fields or records
  • why the data fields you've specified are of interest to you

Let's discuss two of the points a bit further before moving on.


What Is In A Name?


Plenty. MadSearch2 needs to know where it should go to obtain or manipulate the data specified by your query. Thus, you must provide both the location of table (i.e. - the directory in which it resides) as well as the table's name.

Now, for those of you who have peeked at how MadSearch2 stores it's tables, you may have noticed that MadSearch2 appends a ".msdb2" extension to it's data tables. When you specify the name of the table, you needn't -- in fact, you shouldn't include the table's ".msdb2" extension. MadSearch2 is on a strictly first-name basis with your tables. After all, it created them.

It's important to note that MadSearch2 is quite particular about the case in which you specify the name and location. This is because MadSearch2 works with the underlying unix file system which is case-sensitive. Therefore, "myTable" is not the same as "MYTABLE" which is not the same as "mYtAbLe". Got the idea? Good. If not, get it. This is one of the few occasions where MadSearch2 is case-sensitive.

Perhaps an example would be good at this point. Say, you've created a table in the directory "/www/data/cu/myData/" and the table's name is "guestBook", you would type the following to specify these to the query:

/www/data/cu/myData/guestBook

Hold on. "Where do I specify this?" you ask. Well, I'm not going to tell you just yet. Read on, gentle reader.


So, What Are You Asking Me To Do?


If you're creating a query, you must want to do something to or with the data. But just what is it that you want to do? MadSearch2 needs to know.

There are four types queries MadSearch2 can perform: reads, updates, insertions, and deletions. Each of these operates over the records within your table. Which records they operate on are specified by criteria. But we get ahead of ourselves. First let's define what these four types of queries:

  • Reads

    Read queries compare your table data against the criteria specified within your query. These types of queries return those records that correspond to your criteria. No modifications are made to table data. If a query type isn't specified within a query, a read query is assumed.

  • Updates

    As with all queries, MadSearch2 compares your table data against the criteria specified within the query and modifies the records that correspond to the criteria. In order to modify field values, both criteria and update values for the various fields are specified. MadSearch performs validity checks upon update values including data formatting, length, type, and required status. Fields of type AutoIncrement can not be modified.

  • Insertions

    With insertion queries, criteria are not specified. Instead the query is used to construct a new record to be inserted into the data table. MadSearch2 performs validity checks upon the values specified for the various fields of your data table include formatting, length, and type.

  • Deletions

    In deletion queries, MadSearch2 compares the table data against the criteria specified within your query and removes records that correspond to the criteria.


The Structure Of A QBE Query

QBE queries are very much in structure a graphical representation of the table they are to query. Each of the elements described come in to play. You can think of a QBE query as a rectangle divided into four pieces. Each of these pieces house one of the essential elements:

table name & location the table data fields that are of interest to you
the type of query why the data fields are of interest to you (criteria)

Ok. we've divided our QBE query into quadrants. But, things are not so beautiful as all this. We have to specify our query as a string of text. Let's proceed with a sample query. Pretend we're working with a table that stores people's name and addresses.

Our First Sample Query
/www/data/cu/myData/myTable | Name  | Address | Telephone | Age  |
read                        | check | check   | check     | > 21 |

In the upper left quadrant of the query is the name and location of the table. In the lower left quadrant, below the name and location of the table is the query type declaration. Note that we've specified that this is a read query on "/www/data/cu/myData/myTable". Technically, we didn't have to specify "read" for the query type as MadSearch2 assumes a read query if no query-type is specified within the lower left quadrant.

The right quadrants of our query specify the data fields of the table that we are interested in and why they are of interest to us. In the upper right quadrant we see that there are four data fields specified: "Name", "Address", "Telephone", and "Age". In the lower right quadrant we are told why these fields are of interest.

Note how the fields are divided from one another by "|" characters. Fields separated by "|"'s are called query fields. Query fields map to and reference our table's data fields. For example,

| Name  |
| check |

is a query field. The top of the query field specifies which data field with which we are concerned. The bottom part of the query field specifies why. In the "Name" query field, the word "check" is a query field operator that tells MadSearch2 to return this data field as part of the answer record. Now look above to the "Age" query field. There is no "check" operator specified. So, we're not interested in having MadSearch2 return it to us as part of the answer. Instead, we've specified a criterion, "> 21". We're interested in all the people who are older than age 21.

In the example above, our query fields line up nicely and everything fits within the width of our screen. What if things weren't lined up so nicely? MadSearch2 doesn't care so long as query fields are separated by "|" characters. Additionally, MadSearch2 ignores all the white space should you decide to line things up nicely.

As to those times when not everything fits within the width of our screen, MadSearch2 allows for queries to be specified on multiple lines. Take a look.

Our First Sample Query :: Revised
/www/data/cu/myData/myTable | Name  | Address | 
read                        | check | check   |

/www/data/cu/myData/myTable | Telephone | Age  |
                            | check     | > 21 |

To MadSearch2, it's exactly the same query as we specified before. However, there are a few things to note. The query lines above are separated by a blank line. This is not optional. MadSearch2 will expect this. Additionally, you do not have to re-specify the query's type. You can if you want to, but it's unnecessary as MadSearch2 already knows this. Why do you have to re-specify the table then? If I told you, I'd have to kill you. For now, let's just say it allows for MadSearch2 to continue to blossom and mature.


It's The Criteria That Are Going To Get You Stuff

We have already seen the "check" operator at work in query fields. We've even seen a small example of a criterion being specified as a search condition. Let's now look at how to create more complicated criteria expressions.

First, do you remember the operators AND and OR from high school truth tables? Okay, let's review. The AND operator is true only when both of the two expressions or events it joins evaluate to true. For example:

( 10 < 12 ) and ( 15 < 14 )  --> False
( 10 < 12 ) and ( 15 > 14 )  --> True
( 10 > 12 ) and ( 15 > 14 )  --> False

The OR operator is true if both expressions are true or one or the other of the expressions it joins evaluates to true. For example:

( 10 < 12 ) or ( 15 < 14 )  --> True
( 10 < 12 ) or ( 15 > 14 )  --> True
( 10 > 12 ) or ( 15 > 14 )  --> False

So how is this important? Well, this logic is what underlies MadSearch2's evaluation of whether the records in your data table fit the criteria you have specified. It is important to understand this evaluative process in order to obtain the results you want. There ends our review of high school logic.


Evaluation of Query Fields


When MadSearch2 evaluates whether a record fulfills the requirements specified by query's criteria, it does so query field by query field. For a record to fullfill your query, it must satisfy the criteria provided to each of its query fields. MadSearch2 first evaluates the criteria expression for each query field. Each query field is evaluated independently of other query fields. The results of these query field evaluations are ANDed together. If the criteria within every query field evaluates to true, then the underlying data record will satisfy your query. Let's take the following query:

Our Second Query
/www/data/cu/myData/myTable | Name                    | Address | Telephone | Age  |
read                        | check, "John" or "Luke" | check   | check     | > 21 |

This read query is being executed against our address book database as in our previous examples. It is asking MadSearch2 to return all records where the person's name is "John" or "Luke" and whose age is greater than 21. Now let's apply the query to the following records:

Name Address Phone Age
Mary NY, NY 555-1234 23
John NY, NY 555-2356 19
Luke Cambridge, MA 555-1234 32

MadSearch2 would first apply the query to the first record. "Mary" is not equal to "John" or "Luke", so the "Name" field for this record evaluates to false. 23 is greater than 21, so the "Age" field evaluates to true. MadSearch2 now ANDs the results of evaluating these query fields. A true expression ANDed with a false expression is false. Therefore, MadSearch2 rejects the record and moves on.

MadSearch2 doesn't actually evaluate things in this manner. It cuts some corners. Once a query field's expression evaluates to false, there is no way that the record will be selected. We're simply trying to emphasize the point that the criteria supplied to query fields are evaluated independently of criteria expressions provided to other query fields. Oh, the only record that would be selected above by our query would be Luke's.


Query Fields Are Quite Tolerant


Hey, here's a question. What is a query field is specified, but no criteria expression is provided for it? For example,

A blank query field
| Name |
|      |

Does MadSearch2 bug out? No, it will simply ignore it. MadSearch2 also ignores certain flavors of uncompleted expressions. After all, it's running in a web-type of environment. It has to be somewhat tolerant. For example, the following will not be flagged as an error:

An incomplete query field
| Name |
| >    |

MadSearch2 will shrug this off saying, "Oh, well. The dumby forgot to specify a criterion value."

But let's push the envelope of tolerance a little further. What if we leave all the query fields blank. For example,

Our Second Query
/www/data/cu/myData/myTable | Name | Address | Telephone | Age  |
read                        |      |         |           |      |

MadSearch2 isn't going to like this query. It's like asking MadSearch2 to wake up and do nothing. That's quite annoying. At the very least, the query must return something. As no fields have the query field operator "check" in them, no fields are going to be returned as an answer for this query. However, the following query is valid.

Our Second Query
/www/data/cu/myData/myTable | Name  | Address | Telephone | Age   |
read                        | check | check   | check     | check |

Even though no criteria are specified, the "check" operator has MadSearch2 return the checked data fields for all the records in the underlying table. Note that when we do not specify criteria for a query field, it's akin to asking everything to be a match for the field.


How To Specify Criteria


The criteria strings we place within query fields are composed of criterion operators, criterion values, and criterion conditions. Using MadSearch2 we can create arbitrarily long and complex criteria strings.

The most basic way to specify a criterion for a query field is simply to specify a criterion value that the underlying data field must have in order to have a record selected. For example, if we wanted all the people whose age is 21 within our address table, we could place the criterion value 21 within out "Age" queryfield:

Query to find all people whose age is 21
/www/data/cu/myData/myTable | Name  | Address | Telephone | Age  |
read                        | check | check   | check     | 21   |

That was easy enough. So, criterion values are used to specify a value that the underlying record data fields must possess in order to be selected by our query. Note that when we specify a criterion value by itself, it is implicit that we mean "equal to this value." For example, we could also have written the above query as follows:

Another query to find all people whose age is 21
/www/data/cu/myData/myTable | Name  | Address | Telephone | Age  |
read                        | check | check   | check     | = 21 |

With that one little equal sign above, we have used our first criterion operator and demonstrated how the "equal to" operation is implicit when no criterion operator is specified. It would be quite difficult and time-consuming were there no criterion operators. We often need to compare things against each other. No? Thus, MadSearch2 provides a rich set of criterion operators.

Criterion operators act differently depending upon the type of data field they are to operate on. For example, 900 is less than 1000 when comparing integral values, but not when comparing alphanumeric values since "1" comes before "9" when comparing character strings. Certain criterion operators only operate on certain data field types. For example, the "within" operator which determines whether a bit of text is within a data field value, only operates on alphanumeric fields.

The following table lists MadSearch2's criterion operators, the data fields on which they operate, and their purpose.

Operator Data Field Types Description
= All equal to comparison (default)
not not equal to comparison
changeto in queries of type update, if the a record is selected by the query, changes the value of the data field to the value specified by the criterion value which follows this operator
> AlphaNumeric
Float
Int
Date
Time
DateTime
AutoIncrement
User
Group
Entity
greater than comparison
< less than comparison
>=, => greater than or equal to comparison
<=, =< less than or equal to comparison
within AlphaNumeric
User
Group
Entity
determines whether the criterion value is within the data field alphanumeric string.
startsWith determines whether a data field's alphanumeric string begins with the criterion value specified.
endsWith determines whether a data field's alphanumeric string ends with the criterion value specified.
givePermission UserPermissions
GroupPermissions
OtherPermissions
EntityPermissions
much like the "changeto" operator decribed above. in queries of type update, in a record is selected by the query, adds a permission to the permissions already specified within the underlying data field. while the "changeto" operator sets an entirely new value for a data field, the "givePermission" operator adds a permission to the existing permissions.
removePermission much like the "givePermission" operator decribed above. in queries of type update, in a record is selected by the query, removes a permission from the permissions already specified within the underlying data field.
hasPermission tests whether the permission specified by the criterion value exists.
hasntPermission tests whether the permission specified by the criterion value does not exist.

You have already seen examples of how criterion expressions are specified. They consist of a criterion operator followed by a criterion value. If the criterion operator is omitted, it is assumed that the criterion operator is "equal to". But oftentimes, our comparisons aren't going to be as simple as comparing our data field values to one criterion value. Sometimes, we're going to want to obtain records from a limited range of values or records that meet this "or" that criterion.

In such cases, you'll use criterion conditions. Criterion conditions conjoin criterion expressions together to form criteria expressions. You've already seen an example of the conjunction of two criteria expressions in an example above. Remember this query field?

Our Second Query's Name Query Field
| Name                    |
| check, "John" or "Luke" |

Within this query field, the criterion expression "John" (= "John") is being conjoined to with the criterion expression "Luke" (= "Luke") to form a criteria expression that tells MadSearch2 we want all the records where the data field, "Name", contains "John" or "Luke".

As another example, suppose we wished to specify a range of values for the age field from our address book. Say we wanted to retrieve the contact information all those people whose age is greater than or equal to 21 and less than or equal to 65, but certainly not 33. Our query might look like the following:

Query to find all people whose age is greater than or equal to 21 and less than or equal to 65, but not equal to 33 using the "and" operator
/www/data/cu/myData/myTable | Name  | Address | Age                        |
read                        | check | check   | >= 21 and <= 65 and not 33 |

So we have now seen two examples in which we have used the operators "and" and "or". The comma operator is also often used to specify an "and" operation and can be used in place of the "and" operator. For example:

Query to find all people whose age is greater than or equal to 21 and less than or equal to 65, but not equal to 33 using the comma operator
/www/data/cu/myData/myTable | Name  | Address | Age                         |
read                        | check | check   | check, >= 21, <= 65, not 33 |

The Devil Is In The Details


So we now know how to specify criteria by which to query by. However, there are details still to be exposed. Some have to do with what values are acceptable to specific fields. Others have to do with special operators that may be used to make queries more flexible.

  • Special Operators

    Blank :: So far we've looked at operators that make comparisons on fields that contain information. But what if we want to find fields that don't contain data?

    For this MadSearch2 provides the "blank" operator. The "blank" operator can be used in place of a criterion value within a criterion expression. For example, say we wanted to find all the people within our address table for whom we have no address entered. We could create a query similar to the following:

    Query to find all people who have no address listed
    /www/data/cu/myData/myTable | Name  | Address        | Age   |
    read                        | check | check, blank   | check |
    

    Ok. This makes sense. But now you should be asking, how are blank fields evaluated in other comparisons? Well, a blank field has no value per se. For example, when speaking about integer fields it doesn't equate to 0 or even negative infinity. Yet, blank fields are always evaluated as being less than any other value in comparison operations. Good to know.


    Check :: You've already been introduced to the check operator. When the check operator is specified within a query field, it instructs MadSearch2 to return the underlying data field within the answer records generated.

    The check operator is always placed at the beginning of a query field. If a criteria expression is to be used within the same query field, a comma should separate the check operator from the criteria expression. You should remember from above that the comma is itself an operator within query fields specifying the "and" operation.


    CheckAll :: The checkAll operator is a query operator as opposed to a query field operator. It is placed before the query type to specify that all of the data fields from the underlying data table should be included within the answer records generated by the query. For example,

    Query to find all people who have no address listed.
    /www/data/cu/myData/myTable | Address |
    read, checkall              | blank   |
    

    Despite our having only specified a query field for "Address". the checkAll operator will return for us the fields "Name", "Address", and "Age" for those records which match our query. This query is equivalent to the the example query we provided above for the "blank" operator.

    The checkAll operator is equivalent to our creating a query field for every corresponding data field within out source table and then specifying "check" operator within each of these query fields. You can think of it as a short hand notation.

    Important syntactual point: the checkAll operator must follow the query type operator (in this case "read"). The comma separating the query type operator and the checkAll operator is additionally required to separate the new operators.


  • Special Fields


    AlphaNumeric Fields :: So far our examples for alphanumeric fields have been quite simple -- typically one word criterion values. But what if we need to specify multiple words and include operator words which shouldn't be evaluated by MadSearch2 as criterion or query field operators?

    For such cases, MadSearch2 requires that the values be placed within quotation marks. For example, say we were looking for the phrase "this or that" within quotations from a database of quotations. Let's try the following query:

    Bad query to find quotes containing the phrase "this or that"
    /www/data/cu/myData/famousQuotes | AttributedTo | Quotation                  |
    read                             | check        | check, within this or that |
    

    This query isn't going to work for us. Why? Because the phrase "this or that" is not surrounded by quotation marks, MadSearch2 is going to evaluate the "or" as the criterion condition operator "or". Our query asks MadSearch2, "find me all quotations where the word "this" is within the quotation or the quotation is equal to "that". Not quite what we were going for. The correct query is below:

    Good query to find quotes containing the phrase "this or that"
    /www/data/cu/myData/famousQuotes | AttributedTo | Quotation                    |
    read                             | check        | check, within "this or that" |
    

    Note: To specify quotes using html, you'll have to use the code (&quot;) for quotation marks, otherwise the browser will not interpret the form variable specifying the query correctly. We'll discuss html issues later.


    Date Fields :: Querying for dates involves specifying the criterion value in the following format:

    month/day/year
    

    where the year element is expressed as a 4-digit value (eg. 1999). However, there exists a special "today" operator that may be used with date fields. The "today" operator is subsituted by MadSearch2 when it evaluates the query with the current date.

    But wait, there's more to querying date fields. When placed within double quotes, one can have MadSearch2 query an offset from a given date. Perhaps, an example will help here. Let's say we require a query to retrieve all records last updated within the past week. We might express this as:

    Query to find all records updated within the past week
    /www/data/cu/myData/famousQuotes | RecordId | LastUpdated    |
    read                             | check    | >= "today - 7" |
    

    Note in the above example our use of the today operator as well as the subtraction operator with an integral value. Note that this entire criterion value expression is surrounded by quotation marks. If we were not to surround the expression, MadSearch2 wouldn't evalue the entire criterion expression properly. In addition to the subtraction operator on date values, we may also use the addition operator. The unit being added or subtracted is always expressed an an integral number of days.


    Time Fields :: As with dates, querying for times involves specifying the criterion value in a specific format:

    hours:minutes:seconds
    

    Hours are expressed using military time (the 24 hour clock). For example, a valid time criterion value for 2:34pm would be "14:34:00".

    Just as the date fields have the special "today" operator, time fields have the special "now" operator. When MadSearch2 comes upon the "now" operator as a criterion value, it substitutes the current time.


    DateTime Fields :: As with dates and time fields, querying by datetime fields involves specifying the criterion value in a specific format:

    month/day/year hours:minutes:seconds
    

    All the restrictions imposed upon date and time fields are carried over to datetime fields (eg. military time and 4-digit years). In addition, both the "today" and "now" operators are supported. However, in the context of a datetime field, these operators are substituted for the current system date and time.


    Permissions Fields :: The criterion values that may be specified to express permissions are what make user, group, entity, and other permissions fields special.

    There exist five permissions that may be granted or revoked on permissions fields. They are expressed in the following table:


    Permission Permissions
    Symbol
    Description
    Read R grants read privileges at the database level or record levels
    Update U grants update privileges at the database level or record levels
    Insert I grants insert privileges at the database level
    Delete D grants deletion privileges at the database or record levels
    Administration A grants administrative privileges at the database level

    In creating a criterion value for an expression field one concatenate together the permission symbols they wish to grant or revoke into a string. For example, suppose we wished to find all users who have only read and update privileges within our database. We might use the following query:

    Query to determine which users have only read and update privileges
    /www/data/cu/myData/securityTable | Entity | EntityName | EntityPermissions |
    read                              | check  | check      | check, "RU"       |
    

    The query above is asking for an equality condition on the permission field. Permissions don't always work that way. We might want to find users that have both read and update priviliges and perhaps other privileges as well. In this case, we would use the following query:

    Query to determine which users have read and update privileges, but maybe others as well
    /www/myData/securityTable | Entity | EntityName | EntityPermissions                |
    read                      | check  | check      | check, within "R" and within "U" |
    

    For more information regarding MadSearch2's security policies, please see the "Security" section of this manual.


    Entity Field :: The special concern related to specifying criterion values for the entity field is simply the values that may validly occupy the enity field.

    Recall that the entity field exists only within MadSearch2 security tables and is used to qualify entities as users, groups, or other. Therefore, the values stored within MadSearch2 will be either "User", "Group", or "Other".



Expressing Queries Within HTML

We've seen how to write out queries using "|" characters to express what records we wish to retrieve from our underlying data tables. But how do we send these query statements to MadSearch2?

Within HTML forms, we specify input fields using the input, select, textarea, etc. tags. Each time we specify an input field, we define two attributes for it: the name of the field and its value. For example, with the following bit of html we our defining an input field called "zipCode" whose initial value is blank.

<input type="text" name="zipCode" value="">

We refer to these input fields as variables for we can assign values to them. The name of the variable is specified by the "name" attribute of the input field while the value is specified by the "value" attribute or by what the user enters. For example, this documentation would refer to the input tag above as the zipCode variable.

So, having defined what a form variable is we can now explain how to express a query statment in html. A query statement is simply a variable whose name is query and whose value is a query statement.

Let's take a look at an example. The following is a simple, yet complete html form that directs a query to be sent to MadSearch2 for execution:


<form action="https://www1.columbia.edu/sec-cgi-bin/mad/mad2" method="post">

<input type="hidden" name="query" value="
/www/data/cu/myData | address | city  | state                 | zip   |
read                | check   | check | check, &quot;NY&quot; | check |">

<input type="submit">

</form>

There are a two things of interest to us from this example. The first interesting item is the query variable we define using a hidden input tag. Notice the &quot; character codes being used within the value attribute. Were we not to use these character codes, the value attribute might otherwise be cut short. It is important to remember to use the &quot; character code whenever we are placing information within a quoted html tag attribute value.

The other interesting item in this query is our <form> tag. The action attribute of the tag specifies the location of the MadSearch2 database engine. Currently MadSearch2 is located in two areas depending upon whether you require user authentication:

  • User Authentication :: https://www1.columbia.edu/sec-cgi-bin/mad/mad2

  • No User Authentication :: http://www.columbia.edu/cgi-bin/mad/mad2

This is the location to which the form input will be sent for MadSearch2 to execute. The method attribute of the tag specifies the posting method. In HTML, there are two posting methods: GET and POST. It's not important to understnad these. Simply remember to use POST as your method.

On The Syntax of Different Types of Queries

Up to this point, we've looked at many examples of queries of type read. But what if we wish to do more than simply retrieve information? We've already discussed that there exist four types of queries: reads, updates, inserts, and deletions. How do express the other types of queries?

You'll be glad to know that they're not too much different in form from read queries. Let's go over the differences.


Updates


Update queries are vey similar to read queries in that you specify criteria in order for MadSearch2 to select the records you wish to make updates to. The one obvious change is that you replace the query operator "read" with "update" in the lower left hand corner of the query.

But how do you change field values? Answer: By using the changeto operator which was described above. The changeto operator is a criterion operator whose purpose is to change the value of a field within a selected record. The changeto operator is always followed by a criterion value -- the value to which you wish to set the new field value.

When used in a situation in which other criteria have been specified within a query field, the changeto phrase formed by the changeto operator and its value should be set apart from the rest of criteria expression using the comma operator. Can you guess what the comma operator looks like?

Let's take a look at an update query. Let's suppose there's been a zip code change and we want to change the zip code of all people living in 10027 to 10025. We might write the following query.


Query to change the zip code of all those who live in 10027 to 10025
/www/data/cu/myData/addresses | Name | Address | Zip                       |
update                        |      |         | "10027", changeto "10025" |

Note that the check operator doesn't appear in the query above. This is because with all query types other than read queries, we do not specify fields to be returned. All the fields of the records modified will be returned to use among the answer records. Notice, too, the use of the comma operator to separate the changeto phrase from the criterion value specified in order to select the appropriate records.


Inserts


Insertion queries are probably the easiest of all the query types syntactually. There exist no criteria to specify. You need only specify the intial field values for a new records to be inserted.

That's right. No comparison operators. No ands. No ors. No comma operators. Nothing but values.

Let's take a look. Using our example table from above, let's suppose we wanted to insert a new address record in our addresses table. Our query might look as follows:


Query to insert a new address record
/www/data/addresses | Name          | Address          | Zip     |
insert              | "john grogan" | "mail code 5727" | "10027" |

See? Values alone are specified. We've also changed the query operator in the lower left hand corner to "insert". You may be wondering what is returned to us in the answer table that's generated for reports. After all, we haven't selected anything. The answer is simple. Our results table will contain the newly inserted record.

Values may not be specified for fields of type autoincrement. MadSearch2 will automatically assign an integral value to a record's autoincrement field should it exist.


Deletions


Our last query type looks in every way to be a read query, but for two aspects: 1) we do not specify and fields to be "checked" and returned to us, 2) we specify the query operator "delete" in the lower left hand corner of the query line.

It makes sense that our deletion query would look like a read query as we're trying to select records for deletion. Therefore, a deletion query's query fields are going to be filled with criteria expressions. Once MadSearch2 has selected the records that meet the crtieria, it will then remove the records from the underlying data table.

Deletion queries return an answer table filled with the deleted records for our reports.

A sample deletion query follows. Let's suppose we want to delete all individuals from our address table whose name contains "john" and lives within the 10027 zip code. Our query might look as follows:


Query to remove all those whose name contains "john"
and live within the 10027 zip code
/www/data/addresses | Name          | Address | Zip       |
delete              | within "john" |         | = "10027" |

For The Love Of Query Variables

We now know how to generate all manner of wonderful static queries. But this is exactly the problem. We only know how to create static queries. What if we want to allow users to specify criteria or update values? What if we want to create only a framework for much more dynamic queries?

When speaking of how to specify and send the query to MadSearch2 using HTML form input tags, we spoke of the notion of variables. In order to create dynamic queries, MadSearch2 provides query variables.

What is a query variable? A query variable is simply an HTML form input tag that has a name and a value. For example, the following bit of HTML creates two query variables: a "name" variable in which a user will provide their name and a variable named "classYear" whose value will be derived from the user's selection.


<form action="https://www1.columbia.edu/sec-cgi-bin/mad/mad2" method="post">

<input type="hidden" name="query" value="
/www/data/cu/myData | name  | classYear  |
update              | ~name | ~classYear |">

<input type="text" name="name" value="" size="30">

<select name="~classYear">
<option value="1">First Year</option>
<option value="2">Second Year</option>
<option value="3">Third Year</option>
<option value="4">Fourth Year</option>
</select>

<input type="submit">

</form>

This is easy. No? But how do we insert the values into our query string? Notice in the query above how we have placed on the criteria line the names of our variables with tilde ("~") characters prefixed to them.

Before MadSearch2 processes your query string to check syntax, it first looks for query variables. For MadSearch2, a query variable is a word prefixed with a tilde character. When MadSearch2 comes upon a query variable it substitutes the value of the query variable for the reference to the query variable within your query string. When setting the name attribute of an HTML form input tag to declare the name of query variable, the following names specify the same query variable:

~myVariable
denotes the same variable as
myVariable

Some people prefer to declare query variable names with the tilde as they will specify the query variable in their query string using the query variable. For those who wish to manipulate form values and query variables using javascript, however, the declaration without the tilde will be much easier with which to work.

There are few restrictions on query variables. You can use as many or as few as you like within a query string. They needn't be used for criteria and update values alone. They may be used to substitute for any part of the query expression. For instance, you could use a query variable to specify the column name to query instead of hardcoding the column name:


<form action="https://www1.columbia.edu/sec-cgi-bin/mad/mad2" method="post">

<input type="hidden" name="query" value="
/www/data/cu/myData | name  | ~columnName |
read                | check | check       |">

<select name="~columnName">
<option value="1">Address</option>
<option value="2">City</option>
<option value="3">State</option>
<option value="4">ZipCode</option>
</select>

<input type="submit">

</form>

Query variables can even reference one another so long as they do not create a endless loop in doing so. How would they reference one another? By placing a reference to a query variable within the value of another query variable. For example,

<input type="hidden" name="queryVar1" value=">= ~queryVar2 and <= ~queryVar3">
<input type="text" name="queryVar2" value="">
<input type="text" name="queryVar3" value="">

Query variables are evaluated and their values substituted in a top to bottom, left to right order. It is important to remember that after evaluation, the resulting query string must be complete and syntactually correct -- as if you had originally and statically specified the query.


Quoted Query Variables


Between HTML and the need for users to be able to provide input in an unrestricted manner, it is necessary for MadSearch2 to provide a special form of query variable whose value is not evaluated.

Need an idea of what we're talking about? Say the user were providing a description of their qualifications for a job. They might enter this information using carriage returns, quotations, spaces, and other english words that correspond to criterion operators (eg. "and"). This would be a problem. MadSearch2 would become very confused with the query string resulting from the insertion of the query variable's value.

Quoted query variables prevent such problems. MadSearch2 guarantees that whatever is provided as a value to a quoted query variable will not be evaluated as part of the query string. It will simply be treated as a value. However, it is important to note that because MadSearch2 does not evaluate the value of such a variable, that the quoted variable cannot reference another query variable.

How does one specify a quoted query variable? Simply by prefixing a variable name with the string "quoted". A sample query should help:


<form action="https://www1.columbia.edu/sec-cgi-bin/mad/mad2" method="post">

<input type="hidden" name="query" value="
/www/data/cu/myData | name  | qualifications          |
insert              | ~name | ~quotedMyQualifications |">

<input type="text" name="name" value="" size="30">

<textarea name="quotedMyQualifications">
</textarea>

<input type="submit">

</form>

It is advisable to use quoted query variables whenever user input is variable or unrestricted.

What happens when a query variable's names is referenced but no query variable with the name specified exits? MadSearch2 will simply assume that you're specifying a literal word prefixed for some reason with a tilde. There will be no substitution and the query variable reference will appear in the final, post-processed query string.


Special Query Variables


MadSearch2 additionally provides special query variables whose values are taken from MadSearch2's execution environment. The special query variables are specified within a query string in the same manner as regular query variables or quoted query variables. However, no value may be specified for these variables and no HTML form input tag may declare a variable of the same name as these variables.


Variable Name Variable Value
RemoteUser
Remote_User
UserId
The user id (kerberos handle) of the authenticated user. Should no user be authenticated, a blank is inserted.
More To Come... although not a special query variable, this is a reminder to john to add the others


On Dealing With HTML

We've already discussed one issue in constructing queries within HTML -- that of specifying quotation marks within the quotation marks that set off an html tag attribute's value. To review, internet browser become confused when a quotation mark appears as part of a value. The way in which we deal with this is to use the &quot; character code to instead specify a quotation mark as part of an html tag's value attribute.

There are two further cases in which we must substitute character codes in order to specify character when working with HTML. HTML tags are surrounded by < and > characters.

These characters are the very same characters with which we specify criterion comparison operations (eg. < is the less than comparison operation and > is the greater than comparison operation). How do we embed these characters within an HTML document without an internet browser interpretting these as an opening tag mark or closing tag mark? Then answer is simply to use the code &lt; in place of the less than sign and &gt; in place of the greater than sign. Although it is sometimes possible to get away with using the original character, it is usually safer to substitute the character code.