stdgrep
stdgrep is a simple, easy-to-use Perl
CGI-BIN script which searches a file for a user query string. In its
simplest form, stdgrep performs an operation just like the
program grep, which prints out the lines of a file that match a
particular pattern. Currently, it only supports the POST
method of information passing.
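As a rough illustration of the grep-like behavior described above, here is a minimal sketch in Python. It is not the stdgrep source (which is Perl); the function name grep_lines and the sample data are made up for the example.

```python
import re

def grep_lines(lines, pattern):
    """Return the lines that match pattern, in their original order.

    Hypothetical helper for illustration; stdgrep itself is a Perl script.
    """
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Made-up sample "file" contents.
sample = ["apple pie", "banana split", "apple tart"]
matches = grep_lines(sample, "apple")
```

Only the lines containing the query string are kept, which is all the program does when a single target is given.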
stdgrep is useful for searching moderately-sized files for
data. At the development site, we have used it successfully to query mid-sized
static representations of databases. This saves the trouble and system
load of querying the actual database system, as is the case with other
"gateway" search engines, such as GSQL. Also, since
stdgrep returns the actual lines of the file matching the
query, we have found it useful to construct these original files as HTML
files themselves. This is particularly good for leaving users of the
search engine a way to browse through the entire database if they wish.
With version 2.0 of stdgrep, the capability to run a
search with multiple targets has been added as well. This is particularly
useful in a static database environment, if you wish to let your
clients specify constraints on several fields of interest.
fields
As with any WWW-based search engine, it is necessary to present a form interface to the user that also handles the interface to the search program itself. In this section I will explain the fields that are meaningful to the search engine. (Note: all field names are case sensitive.)
There are only two required fields. If you specify only these two
fields, the program is essentially functioning as grep.
The first required field names the file to search. For the stdgrep
script to find the file correctly, you need to specify it from the document root of your
WWW server. For example, if I wanted to have a search form to search the
document at http://foo.bar.com/dir1/a.html, then the value for this field
would be "/dir1/a.html".
The second required field is the search string itself, typically supplied by the user through an
input type=text form field.
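The document-root path in the example above is simply the path component of the document's URL. A small sketch, assuming the file is served directly from the document root with no aliasing (the helper name document_root_path is hypothetical):

```python
from urllib.parse import urlparse

def document_root_path(url):
    """Return the path part of url, i.e. the value to give stdgrep.

    Assumes the server maps URL paths straight onto the document root.
    """
    return urlparse(url).path

path = document_root_path("http://foo.bar.com/dir1/a.html")
```

If your server uses aliases or rewrites, the URL path and the document-root path may differ, and you would need to work out the value by hand.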
For any search string "set" there can be three fields pertaining to that one search. The first search string has "1" appended to its fields (i.e. the target string would have the field name "target1" and its start and end strings would be "start1" and "end1" respectively). The second set uses "2", and so on; in this way you may specify as many separate search strings as you like. In order for a line to be printed, all of the conditions must be satisfied; it is as if all of the strings were joined by a boolean AND.
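The boolean AND over numbered targets can be sketched as follows. This is only an illustration in Python under my own assumptions: the form is represented as a plain dictionary, the function names are made up, and the start and end fields are ignored entirely.

```python
import re

def collect_targets(form):
    """Gather target1, target2, ... until the next number is missing."""
    targets = []
    n = 1
    while f"target{n}" in form:
        targets.append(form[f"target{n}"])
        n += 1
    return targets

def search_all(lines, form):
    """Keep only the lines matched by every target (boolean AND)."""
    targets = collect_targets(form)
    return [line for line in lines
            if all(re.search(t, line) for t in targets)]

# Made-up data: two targets, so both must match.
lines = ["red small car", "red large truck", "blue small car"]
form = {"target1": "red", "target2": "car"}
result = search_all(lines, form)
```

A line that satisfies only one of the two targets is dropped, which is the behavior described above.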
You should note that the search strings are not exactly literal strings. We
are using Perl's pattern matching facility, so each search string is really
a Perl regular expression, and thus you can use special
symbols understood by Perl in the construction of your search
fields.
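To show what treating the search string as a regular expression means in practice, here is a short Python sketch. Python's re syntax is close to Perl's for simple patterns like these but is not identical, so take it as an approximation; the sample lines are made up.

```python
import re

lines = ["price: $5", "price: 15", "cost: 5"]

# "^" and "$" anchor the pattern to the start and end of the line.
anchored = [l for l in lines if re.search(r"^price.*5$", l)]

# To search for a literal "$5", the special symbol must be escaped;
# re.escape does this in Python (Perl has quotemeta for the same job).
literal = [l for l in lines if re.search(re.escape("$5"), l)]
```

The practical consequence is that a user who types a special symbol such as "$" or "." into a search field gets its regular-expression meaning unless the form or the user escapes it.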
In addition to these standard searches, you can also search for numeric
values that lie between certain values. For any search of this kind, we
define a field of the form "option" followed by the search number to have
the value "between". (In
future versions of stdgrep, the "option" field may be expanded,
but for now the only useful value is "between".)
Setting the "option" field for a search (e.g. for target 2 to
be a between search, you would define the field "option2" to be
"between") makes the target string behave differently. Instead of being
used as a literal value to search for, the target field now
specifies the numeric minimum and maximum to search for. Thus the target
should be of the form min-max where min is the
field's lowest acceptable value and max is the highest
acceptable value.
For example, setting "100-200" for the target field with the
option field set to "between" would match numbers between
100 and 200 inclusive.
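A between search of this kind might be sketched like this in Python. How stdgrep actually picks the number out of a line is not described here, so this sketch assumes that any unsigned number appearing in the line is tested; the function names are hypothetical, and negative bounds are not handled because the dash is used as the separator.

```python
import re

def parse_range(target):
    """Split a 'min-max' target into two numbers."""
    low, high = target.split("-", 1)
    return float(low), float(high)

def line_in_range(line, target):
    """True if any number in line lies between min and max inclusive.

    Assumption: every unsigned number in the line is a candidate.
    """
    low, high = parse_range(target)
    numbers = re.findall(r"\d+(?:\.\d+)?", line)
    return any(low <= float(n) <= high for n in numbers)

in_range = line_in_range("widget 150", "100-200")
too_high = line_in_range("widget 250", "100-200")
```

Both end points count as matches, in keeping with the inclusive range described above.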
Finally, if you would like the user to have the simplest access to the fields, you may fill in "user" in the target field of a between search. This delegates the search maximum and minimum to two additional fields, "high" and "low" (each with the search number appended), that the user can fill in.
For example, for the 2nd target, I might want to match
a price restriction with the maximum and minimum specified by the user. So
I define option2 as between, and can have the user fill
in a low-end value and a high-end value in two text fields called
low2 and high2, respectively. If I think this is
appropriate, then I define target2 as "user" to enable these
two fields.
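The way the option, target, low and high fields combine can be sketched as below. Again this is a Python illustration under my own assumptions, with the form as a dictionary and a hypothetical function name, not the actual stdgrep logic.

```python
def between_target(form, n):
    """Return the 'min-max' string for search set n, or None.

    None means set n is not a between search. When the target is
    "user", the range comes from the low<n> and high<n> fields.
    """
    if form.get(f"option{n}") != "between":
        return None
    target = form.get(f"target{n}", "")
    if target == "user":
        low = form[f"low{n}"]
        high = form[f"high{n}"]
        return f"{low}-{high}"
    return target

# Made-up form submission for the price example.
form = {"option2": "between", "target2": "user",
        "low2": "10", "high2": "50"}
price_range = between_target(form, 2)
```

With target2 set to "user", the two text fields are joined into the same min-max form that a hand-typed target would have.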
Otherwise I can have the user enter the target in a text field, instructing them to include the dash themselves. Alternatively, you can predefine a few ranges using radio buttons or select input fields.
Last modified 12/26/06 19:44
Research and Development Group, Academic Information Systems, Columbia University
Help Line: 212 854.1919
Email: consultant@columbia.edu