How to use stdgrep

Version 2.0

Table of Contents

  1. Overview
  2. Fields in the Form
  3. Bugs
  4. Credits

Overview

stdgrep is a simple, easy-to-use Perl CGI script that searches a file for a user query string. In its simplest form, stdgrep behaves like the program grep, which prints the lines of a file that match a particular pattern. Currently, it supports only the POST method of passing form data.
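The grep-like behavior described above can be pictured with a short Python sketch. This is only an illustration of the idea, not stdgrep's Perl source; the function name grep_lines and the sample lines are made up for the example, and case is ignored because the target field is documented later as case-insensitive.

```python
import re

def grep_lines(lines, pattern):
    """Return the lines that match `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if regex.search(line)]

# Hypothetical contents of a searchable HTML file, one record per line.
sample = [
    "<li>Apple, price 120</li>",
    "<li>Banana, price 80</li>",
    "<li>Pineapple, price 250</li>",
]

matches = grep_lines(sample, "apple")
for line in matches:
    print(line)
```

Because whole lines are returned, a file built as HTML yields HTML fragments that can be sent straight back to the browser.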

stdgrep is useful for searching moderately sized files for data. At the development site, we have used it successfully to query mid-sized static representations of databases. This avoids the trouble and system load of querying the actual database system, which other "gateway" search engines, such as GSQL, require. Also, since stdgrep returns the actual lines of the file that match the query, we have found it useful to construct the original files as HTML files themselves. This also gives users of the search engine a way to browse through the entire database if they wish.

Version 2.0 of stdgrep adds the ability to run a search with multiple targets. This is particularly useful in a static database environment if you wish to let your clients specify constraints on several fields of interest.


Fields in the Form

As with any WWW-based search engine, you must present the user with a form, and that form also serves as the interface to the search program itself. This section explains the fields that are meaningful to the search engine. (Note: all field names are case-sensitive.)

Required Fields

There are only two required fields. If you specify only these two fields, the program is essentially functioning as grep.

filename
The file to search. If your webmaster installed the stdgrep script correctly, you need to specify this path relative to the document root of your WWW server. For example, if I wanted a search form to search the document at http://foo.bar.com/dir1/a.html, then the value for this field would be "/dir1/a.html".
target1
The first target to search for. Currently, the search is case-insensitive. Usually, this field would be specified by the user in an input type=text form field.
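As a rough illustration of what a form with only the two required fields sends, the following Python sketch builds the POST body a browser would produce. The field names filename and target1 and the path come from the text above; the search term "smith" is a placeholder, and the standard form encoding is assumed.

```python
from urllib.parse import urlencode, parse_qs

# The two required fields; "smith" stands in for whatever the user types.
fields = {
    "filename": "/dir1/a.html",  # path relative to the document root
    "target1": "smith",          # first (and only) search target
}

# Encode the fields the way a browser encodes a submitted form.
body = urlencode(fields)
print(body)

# Decoding the body recovers the original field names and values.
decoded = parse_qs(body)
```

Field names are case-sensitive, so "Filename" or "Target1" would not be recognized as the required fields.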

Global Fields

dbname
The name of the database that you want the search engine to report it is searching.

Search String Fields

For any search string "set" there can be three fields pertaining to that one search. The first search string has "1" appended to its field names (i.e. the target string has the field name "target1", and its start and end strings are "start1" and "end1" respectively). In this way you may specify as many separate search strings as you like. For a line to be printed, all of the conditions must be satisfied; it is as if all of the search strings were joined by a boolean AND.

target
the text to be searched for
start
Any delimiting text that must always be found right before the text of the target.
end
The closing delimiter that terminates the framing text around the target.

Note that the search strings are not strictly literal strings. stdgrep uses Perl's pattern matching facility, so each search string is really a Perl regular expression, and you can use the special symbols understood by Perl when constructing your search fields.
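The way several search sets combine can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's actual code: it assumes the start, target and end strings are simply concatenated into one pattern, and it uses Python's re module in place of Perl's regular expressions. The names build_pattern and line_matches and the sample table rows are hypothetical.

```python
import re

def build_pattern(target, start="", end=""):
    # Assumption: the three strings are joined into one regular expression.
    return re.compile(start + target + end, re.IGNORECASE)

def line_matches(line, search_sets):
    """True only if every search set matches the line (boolean AND)."""
    return all(build_pattern(**s).search(line) for s in search_sets)

# Two search sets, as if the form carried target1/start1/end1 and target2.
search_sets = [
    {"target": "smith", "start": "<td>", "end": "</td>"},
    {"target": "new york"},
]

lines = [
    "<tr><td>Smith</td><td>New York</td></tr>",
    "<tr><td>Smithson</td><td>New York</td></tr>",
    "<tr><td>Smith</td><td>Boston</td></tr>",
]

hits = [line for line in lines if line_matches(line, search_sets)]
for line in hits:
    print(line)
```

Under these assumptions the delimiters keep "smith" from matching "Smithson", and the second set rejects the Boston row. Since the strings are patterns, a target such as "sm[iy]th" would match both spellings.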

In addition to these standard searches, you can also search for numeric values that fall between certain values. For any search of this kind, we define a field in the form named "option" to have the value "between". (In future versions of stdgrep, the "option" field may be expanded, but for now the only useful value is "between".)

Setting the "option" string for a search (e.g. to make target 2 a between search, you would define the field "option2" to be "between") makes the target string behave differently. Instead of being used as a pattern to search for, the target field now specifies the numeric minimum and maximum to search for. Thus the target should be of the form min-max, where min is the field's lowest acceptable value and max is the highest acceptable value.

For example, setting the target field to "100-200" with the option field set to "between" would match numbers between 100 and 200 inclusive.
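A between search can be pictured with the Python sketch below. It is only an illustration: the text does not say how stdgrep locates the number within a line, so the sketch assumes the number sits between the start and end delimiters of the same search set. The names parse_range and in_range are hypothetical, and only non-negative numbers are handled.

```python
import re

def parse_range(target):
    """Split a 'min-max' target such as '100-200' into two numbers."""
    low, high = target.split("-", 1)
    return float(low), float(high)

def in_range(line, target, start="", end=""):
    """True if a number framed by start/end lies within min..max inclusive."""
    low, high = parse_range(target)
    # Assumption: the number is framed by the start and end delimiters.
    pattern = re.compile(start + r"(\d+(?:\.\d+)?)" + end, re.IGNORECASE)
    return any(low <= float(m.group(1)) <= high
               for m in pattern.finditer(line))

print(in_range("<li>Lamp, price 150</li>", "100-200", start="price "))
print(in_range("<li>Sofa, price 250</li>", "100-200", start="price "))
```

Both endpoints count as matches, which is what "inclusive" means in the example above.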

Finally, if you would like to give the user the simplest access to these values, you may fill in "user" in the target field of a between search. This delegates the search minimum and maximum to two additional fields, "low" and "high", that the user can specify.

For example, for the second target, I might want to match a price restriction with the minimum and maximum specified by the user. So I define option2 as "between" and have the user fill in a low-end value and a high-end value in two text fields called low2 and high2, respectively. To enable these two fields, I define target2 as "user".

Otherwise, I can have the user type the target into a text field, telling them that they must insert the dash symbol themselves. Alternatively, I can predefine a couple of ranges using radio buttons or select input fields.
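The two ways of supplying a range can be compared in a small Python sketch. This shows one plausible reading of the rules above rather than the script's actual logic; the function name resolve_range and the sample form values are made up, while the field names option2, target2, low2 and high2 follow the naming described in the text.

```python
def resolve_range(fields, n):
    """Return (low, high) for search set n of a 'between' search."""
    target = fields[f"target{n}"]
    if target == "user":
        # The limits come from the separate low<n> and high<n> fields.
        return float(fields[f"low{n}"]), float(fields[f"high{n}"])
    # Otherwise the target itself has the form min-max.
    low, high = target.split("-", 1)
    return float(low), float(high)

# Form where the user fills in two text fields.
user_form = {"option2": "between", "target2": "user",
             "low2": "100", "high2": "200"}

# Form where the range is typed with a dash or preset by a radio button.
preset_form = {"option2": "between", "target2": "100-200"}

print(resolve_range(user_form, 2))
print(resolve_range(preset_form, 2))
```

Either form describes the same search; the choice only affects how much the user has to type.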


Bugs


Credits


Last modified 12/26/06 19:44
Research and Development Group
Academic Information Systems
Columbia University
Help Line: 212 854.1919
Email: consultant@columbia.edu