by
Thomas W. Baker
Arthur W. Diamond Law Library
Columbia University
Presented at the Innovative User Group meeting,
Oakland, Cal., April 1999
Let me begin by thanking Sandy Westall of Innovative for her presentation this morning on the basics of the SCAT table. In her presentation, she laid out all the information for you: what the SCAT table is for, and how it should be set up for use with Library of Congress and Dewey call numbers, the standard call number systems for which it was designed. But what do you do when your library uses call number systems other than those standard ones? Can you categorize them too? How far can you go, what can you manage to do, and just where do you finally come to that point where you have to stop and say, "It might be nice, but I will have to make do without that"? Since there are probably some who did not hear Sandy's presentation, let's do a very small amount of reviewing. [slide 2: Interpret SCAT] The abbreviation SCAT is so swift and handy that it's sometimes easy to forget what it stands for: SCAT stands for Statistical Category. The SCAT table is used to create categories, or groups, or "buckets," as Sandy calls them, into which your call numbers will fall, so that you can generate meaningful statistics based on the call number of an item. You create a category by defining two boundaries for that category, a lower boundary and an upper boundary. [slide 3: How do you define a SCAT boundary in LC?] With LC call numbers, a boundary is made using the first set of letters and the first set of numbers, and it is also possible to make a decimal division of that first set of numbers. So, to take a couple of examples, you could choose to begin a category at KF1, or you might want to end a category with JX1974.5. You could not, however, divide your JX1974.5s between two categories, nor could you make a special category out of just one part of your JX1974.5s, say, for example, all the JX1974.5s by a particular author. Still, for most purposes, it's fine enough. As for Dewey numbers, [slide 4: How do you define a SCAT boundary in Dewey?]
we do not use them in our library, so I have no working experience with them, but as far as the machine is concerned, Dewey call numbers are just LC call numbers without the initial letters. (By the way, I want to be clear that my slides indicate what our machine accepts as input; I don't know how many decimal places your machine would accept, but for our purpose now, it really doesn't matter.) What, then, is the test a call number must meet in order to fall into a particular category? [slide 5: What is the test?] Our machine seems to apply this test: in order to fall into a particular bucket, a call number must be
(Again, this is the behavior which our machine exhibits.) There are three possible cases: a call number falls exactly on one boundary, or exactly on the other, or somewhere in between. We will take three examples, all drawn from one of our non-standard call number systems. We will start with a sample set of call numbers which the SCAT table mechanism handles perfectly well. Then we will take up two different but interrelated problem cases, cases where the SCAT table mechanism has limitations, but where we can at least separate the problem class marks fairly well from the class marks nearest to them alphabetically. [slide 6: Books on African Law]
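The boundary test can be sketched in a few lines of Python. This is only an illustrative model, not Innovative's code: the function names `class_key` and `in_bucket` are hypothetical, and the sketch assumes that both boundaries are inclusive (which is consistent with the Af0000 and BEM0000 examples later in this talk) and that only the first set of letters and the first set of numbers are compared.

```python
import re
from decimal import Decimal


def class_key(code):
    """Reduce a call number or boundary to (letters, number).

    Only the first run of letters and the first number (with an optional
    decimal part) are kept; cutters and anything else are ignored.
    Dewey-style numbers simply have an empty letter part.
    """
    m = re.match(r"\s*([A-Za-z]*)\s*(\d+(?:\.\d+)?)", code)
    if m is None:
        raise ValueError(f"no class number found in {code!r}")
    return m.group(1).upper(), Decimal(m.group(2))


def in_bucket(call_number, lower, upper):
    """Assumed test: lower boundary <= call number <= upper boundary."""
    return class_key(lower) <= class_key(call_number) <= class_key(upper)


# The three possible cases, using the LC boundaries mentioned earlier.
print(in_bucket("KF1 .A5", "KF1", "KF9999.999"))     # exactly on the lower boundary
print(in_bucket("JX1974.5 .B3", "JX1", "JX1974.5"))  # exactly on the upper boundary
print(in_bucket("JX1000", "JX1", "JX1974.5"))        # somewhere in between
```

Because the cutter is discarded before the comparison, every JX1974.5 lands in the same bucket regardless of author, which matches the limitation described above.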
In our library, our foreign law, that is, the law of another country, is classed using the Schiller classification system, which was developed in 1933 by Prof. A. Arthur Schiller, Professor of Law at Columbia University. In this system, there are letters which are derived from the name of the country, and then numbers, always three digits, from 000 to 999. The same numbers are used across the countries for the same types of law, so that comparative shelf browsing is easy. The slide shows a list of five representative books which go from near the beginning to near the end of our class mark Af. The Af indicates that the book contains comparative discussion of law across all of Africa. The numbers are assigned in accordance with the main topic. The cuttering is sometimes by author, sometimes by title, sometimes by some other attribute, depending on the class. Suppose that we wanted our Af books to fall into two categories, the second one beginning with the Af 300 book. How would we draw up the SCAT table? [slide 7: SCAT with break at Af 300]
Here is what our SCAT table would look like. For the moment, I don't want to discuss the lower boundary of the first category. The reason why will become clear in a moment, so let's start with the upper boundary of the first category. The coding of this upper boundary, Af0[zero]299.999, says, "I want this category to contain all Af books lower than Af 300." The coding you see here for this upper boundary is simply the highest possible class number less than Af 300. (You must include the leading zero for the thousands column even though it does not appear on the book and is not part of the schedule.) As for the upper boundary of our second category, Af9999.999 is the highest imaginable Af. In fact, it's 9000 numbers higher than where our Af call numbers really end. Again, Af9999.999 is just a conventional way of writing the code, so that anyone who looks at it later will be in no doubt, "this person meant to go all the way to the end of Af," and of course if we were to start using numbers 1000 and above, we're already covered. All of this works exactly as LC call numbers would work, so the system will handle these call numbers just as well as it handles the LC ones. But our Schiller system presents two problems, and the first of them will bring us back to the question of the lower boundary of the first category, which I didn't want to discuss before. There was something I didn't tell you about our Af category. Prof. Schiller treated Africa differently from Europe or Asia, evidently because he wanted all the books on African law to be shelved together. Whereas books in French law were labeled Fr, and books in German law were labeled Ger, individual countries in Africa were given sets of letters like this: [slide 8: First problem case: Letters followed by letters, first slide]
for Algeria: Af Al; for Zimbabwe: Af Zim. We are now looking at letters subdivided by more letters, not by numbers, which does not conform to the normal LC standard. This poses a problem for the SCAT table mechanism, because letters are supposed to be followed by numbers. The program seems to solve this problem as follows: in evaluating the call number, when it looks after the space and finds more letters where it was expecting numbers, it "throws up its hands," as it were, stops evaluating, and discards these extra letters and everything that follows them. What it's then left with is just plain Af, which it declares equivalent to Af0000. It then puts this plain Af into the first category it can find which contains Af plus all zeros, or Af0000. In other words, the machine treats any call number reading Af plus a space plus more letters as being, for practical purposes, identical to Af plus all zeros. The effect of this in our library is that all books on the law of any and all individual African countries, on any subject, will go into one category, namely the first one which contains Af0000. We can separate books on French law and books on German law from each other perfectly well, and we can also subdivide books on French law and German law by subject if we want to, but we cannot do either of these things for the African countries. Does this mean that all is lost? No, because while we cannot separate the countries of Africa from each other or analyze them by subject, we can at least separate them almost completely from the other categories. [slide 9: First problem case: Letters followed by letters, second slide]
We do this by creating a special category which has the same beginning and ending boundary, namely Af0000. This specially-coded category will allow us to keep all call numbers consisting of Af + letters separate from other call numbers, but there are three problems which we can't solve and which you see summarized on the screen. We can't separate the African countries from each other, we can't separate them from our real Af 000s, which are encyclopedias on all African law, and we can't break this category down further by subject subdivision. Our SCAT table for Africa then looks like this: [slide 10: SCAT table for Af]
It now has three categories. Our first category consists of whatever evaluates to exactly Af0000, namely all of our Algerian, Nigerian, and Zimbabwean law books, as well as the general encyclopedias of African law, which actually do have a call number of Af 000. Our second category holds our Af books less than Af 300 (except for those encyclopedias), and our third holds our Af books Af 300 and higher. It's not perfect, but it's as good as we can get, and we can probably use the various Boolean list functions to work on our African collections when we want to. The Millennium reports would also allow us to download a report based on this table into some kind of a spreadsheet, where we could aggregate the data from Europe or Asia, so as to make comparisons with our African agglomeration. The Schiller system poses one other problem for the SCAT table mechanism, which is that while most Schiller class marks have two or three letters, some have four or more, [slide 11: Second problem case: Four or more letters, first slide]
for example, Bela for Belarus, or Eccl for church law. Here I found I needed some special coding, and I wouldn't blame you for thinking, when I'm through, that we've suddenly slipped into the Twilight Zone. [slide 12: Second problem case: Four or more letters, second slide] The problem is this: category boundaries are expressed in one to three letters, which is the limit of the Innovative mechanism. How can we capture class marks of four or more letters, and isolate them from our other class marks? Let's take Bela as our example. First, we must begin with a lower boundary which will always be lower than any Bela call number. What will serve this function? The answer is: the highest possible Bel, or BEL9999.999. We usually expect to see this array of nines in the ending column, representing the upper boundary, but now it will appear in the beginning column. And how will we end our category? [slide 13: Second problem case: Four or more letters, third slide] We need something as low as possible in its own terms, but which will always be higher than any Bela call number. It turns out that what will work here is "BEM0000." Bem is alphabetically exactly one step higher than "Bel," the letters with which this category starts. [slide 14: Second problem case: Four or more letters, third slide] This gives us the rather odd-looking code of BEL9999.999 BEM0000. Any book with a call number of Bela anything will be higher than the lower boundary AND lower than the higher boundary, so the machine will put it here. The next slide picks up from this one. [slide 15: Second problem case: Four or more letters, fourth slide] It's quite true that we won't be able to analyze our Bela books by subject, but a category coded to run from BEL9999.999 to BEM0000 (a "distance," if you will, of one one-thousandth of a call number) will isolate them.
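Both problem cases can be reproduced with a small Python model. It is a sketch under loudly stated assumptions, not a description of how the Innovative software is actually written: the names `sort_key` and `in_bucket` are hypothetical, letters are assumed to compare alphabetically before numbers are considered, boundaries are assumed inclusive, and a class mark whose letters are followed by more letters is assumed to evaluate to "letters plus all zeros," as our machine appears to do.

```python
import re
from decimal import Decimal


def sort_key(call_number):
    """Return (letters, number) for the leading class mark.

    Assumption: if the first run of letters is not followed by digits
    (for example 'Af Al 340'), evaluation stops and the number is zero.
    """
    m = re.match(r"\s*([A-Za-z]+)\s*(\d+(?:\.\d+)?)?", call_number)
    if m is None:
        raise ValueError(f"no class mark found in {call_number!r}")
    number = Decimal(m.group(2)) if m.group(2) else Decimal(0)
    return m.group(1).upper(), number


def in_bucket(call_number, lower, upper):
    """Assumed inclusive test on (letters, number) pairs."""
    return sort_key(lower) <= sort_key(call_number) <= sort_key(upper)


# First problem case: letters followed by letters collapse to Af0000.
print(sort_key("Af Zim 340"))                       # evaluates like plain Af0000
print(in_bucket("Af Al 340", "Af0000", "Af0000"))   # caught by the special category
print(in_bucket("Af 300", "Af0000", "Af0000"))      # a real Af 300 is not

# Second problem case: four letters sort between BEL9999.999 and BEM0000.
print(in_bucket("Bela 340", "BEL9999.999", "BEM0000"))
print(in_bucket("Bel 999", "BEL9999.999", "BEM0000"))
```

Under this model "BELA" sorts after "BEL" and before "BEM" however large the number that follows, which is why the odd-looking one-thousandth-wide category catches every Bela call number while an ordinary three-letter class mark such as Bel 999 stays out.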
Note that this will not conflict with any of our books on Belgian law (in other words, with a call number beginning with Bel), because the highest Belgian class mark is Bel 999. We do need to code Belgium differently in one respect, however. We need to make our "Belgium" class end not with BEL9999.999, as our coding convention would call for, but with BEL9999.998. You can see this on my paper example, which highlights these same places in our SCAT table. That way, no Bela book can fall into Belgium, and no Belgian book can fall into Belarus. We may be grateful that Bemidji has not seceded from Minnesota and from the Union, so we do not need a class mark Bem. But if there were a class mark "Bem", only the books classed in Bem 000 could not be separated from Belarus. All the rest of our books on "Bemidjian law" would be separate, and that would no doubt be most of them. Again, it's not perfect, but it's what you can do, and one's statistics probably already contain greater impurities than that anyway. OK, you can turn off the Twilight Zone music now. We're almost done. I just wanted *to explain how to test a single call number to see where your SCAT table mechanism is putting it, and* to conclude by saying a few words about category numbers, as you can do some swift things with them. *To resolve problems with your SCAT table, all you really need to know is how to test where a particular call number is going. First you need to get the bib. record number of the record which contains the call number you want to test. Then you go down the following menu chain: M > MANAGEMENT Information, then S > Create STATISTICAL reports, then R > To enter a RANGE of record numbers. You supply your bib. record number as both the beginning and the end of your "range." You answer "Yes" to the question: Is the range correct?
You then choose: C > Statistical reports based on CALL NUMBERS. The machine flies into action and soon you have your answer, the category into which that particular call number will fall.* [slide 16: On category numbers] On the subject of category numbers, there are three things to remember: 1. On reports, the results for each category are printed in order of category number. This means that you can use your category numbers to create large blocks or sections in your report. In our library, we have three different call number systems: Hicks for basic Anglo-American legal materials, Schiller for foreign and comparative materials, and LC or modified LC for treatises on international and American law. For the first, I use category numbers between 500 and 999, for the second, category numbers between 1000 and 4999, and for the third, category numbers between 5000 and 9999, which is the highest possible category number. The category numbers must be "interfiled" in the table itself, because the table goes strictly in alphabetical order by class mark. This is a bit of a headache to code, but the result is a report in which the three call number systems appear as large blocks, which is how our librarians want to see them. You have a sample page in the printed examples. In this, as in most places, the Innovative system is highly configurable to your individual requirements. 2. The second thing to remember about category numbers is that you can assign the same category number to different spans of call numbers. You would do this to indicate to the machine that books from these disjunct spans do in fact belong to the same category, and should be summarized together. On my sample sheet, there is an example of this toward the bottom of the page, in which books on Christianity and Christian denominations are brought together over a gap made necessary by the insertion of six categories, for Bru, BrV, BS, BT, Bul and Bur.
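The two points about category numbers can be illustrated with a short Python sketch. It does not reproduce the Innovative report program; the function name `summarize` and the sample counts are hypothetical, and the only facts taken from the talk are that report lines appear in category-number order and that spans sharing a category number are summarized together.

```python
from collections import Counter


def summarize(span_counts):
    """Combine per-span item counts into one report line per category.

    span_counts: iterable of (category_number, item_count) pairs, one pair
    per span in the table, in whatever order the table lists them.
    Returns (category_number, total) pairs sorted by category number.
    """
    totals = Counter()
    for category, count in span_counts:
        totals[category] += count
    return sorted(totals.items())


# Hypothetical counts: two disjunct spans share category 1200, and the
# table order (alphabetical by class mark) differs from the report order.
table_order = [(1200, 5), (510, 2), (1200, 3), (5000, 1)]
for category, total in summarize(table_order):
    print(category, total)
```

Sorting by category number is what lets the 500s, the 1000 to 4999 range, and the 5000 to 9999 range come out as three contiguous blocks even though their spans are interfiled in the table.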
I would like to leave you with one last piece of advice: don't use consecutive category numbers. If you do, you are forever giving up the possibility of subdividing your categories, and if you did subdivide them, the report lines which correspond to the categories you have just added would have to appear somewhere else in your report, and this may disturb the format you have so carefully constructed. Instead, it seems to me better to borrow a trick from old COBOL, in which one had to work with line numbers: leave a gap of 10 between each category number. After all, your SCAT table cannot contain more than 850 or so lines, and there are 9999 available category numbers. If you make all your original category numbers divisible by ten, they will be easily recognizable as such when you are looking at your table, and you will have plenty of space left over to handle the situation which arises when your needs for analysis change, or when the call number schedules themselves change. [Last slide] And so, I hope you have enjoyed this little Etude in Scat Singing, and thank you very much.

- - - - -

* I omitted the italicized section at the IUG meeting for the sake of time, but I am including it here, because it contains the 'recipe' for how to test where an individual call number will fall in your SCAT table. Thanks to Amy First of Innovative for this helpful tip. I have omitted the slides themselves in order to keep the download time reasonable, but where a slide contains information which isn't spelled out in the text, I have added the contents of the slide at the appropriate place.
© Thomas W. Baker May 6, 1999 (minor revision, Oct. 14, 1999)