When it comes time to computerize their
collections, registrars and curators of small museums commonly find themselves
in a dilemma. On one hand, the features and capabilities offered by the newer
commercial and professional collection control systems are more than they need,
more than they can support, and more than they can afford. On the other hand,
their own database management skills may be inadequate to address their problems
effectively.[2]
For these embattled museum workers there has been little
help. Discussions of databases for small collections rarely focus on the
mechanics of building data systems. Database management tools successful in
business may be inadequate for museum work. Questions of database structures,
internal controls, and strategies useful for building small systems and entering
data are often ignored at conferences and in papers. But these subjects are, in
their own right, as important as concern for vocabulary control or subject
access, and must be addressed and understood if the difficulties of
computerizing collection management and collection research are to be solved
efficiently and skillfully.
* * *
This paper outlines the development of a simple object
database designed for a medium-sized university art museum. As with other small
systems, this one had to use resident equipment and a prescribed database
management system, and had to adhere to severe limitations of time and budget.
Yet, in spite of these constraints, within eight months a significant portion of
the database had been constructed, over 33,000 inventory records had been
entered, a previous database had been adapted to fit a new environment, a set of
nearly 140 standard prepared reports defined, and full documentation had been
prepared. More important than the above, however, was the creation of a strategy
for development. Following this method, required functions could be developed
and specialized needs could be accommodated.
Although the system is not currently in use, enough has been learned from
the experience that it can be presented as a model that small museums
may wish to consider as they plan their own automated programs. The results are
of special interest to museums considering automating their collections
management because they demonstrate how an inhospitable environment of old and
inconsistent paper records easily yielded to simple database techniques, and how
a complex and varied collection could be catalogued and indexed quickly,
efficiently, and inexpensively.
Especially significant for small museums was the cost of the
project. Today, all equipment, software and supplies may be purchased for about
$6,000. Most museums will already own most of the required equipment, so the
additional cost in most cases will be considerably less.[3]
The Method
For the present purpose, the database itself is less
significant than the procedure used to build it--there will be no catalogue of
data elements here. Rather, we shall see how data modeling and file definition,
the design of entry and query utilities, and the entry of retrospective data
became integral elements of a uniform development strategy.
The technique used has been dubbed "the bottom up approach"
because its method of development is progressive and incremental. At the core of
this system, and essential for all future growth, lies an authoritative object
inventory. Each stage of development uses the inventory to produce a fully
usable collections module. The strategy does not require that all current
records be entered, and does not require that the paper records and files that
are entered be entered in full.
Furthermore, and for this writer essential, the process
works with the museum's accumulated paper records as it tries to maintain the
integrity of the data found in the paper records. That is, it supports an
environment congenial to the needs of scholars and curators, as well as
providing necessary functions for the administrator.
Most readers will be aware that in many automated systems
practical necessity so often leads to compromise. Consequently, the scholarly
staff cannot trust the automated data available, while the registrar, to make
his own data clear and concise, cannot accommodate the curator's need for
accurate information. Object administrators commonly find themselves faced with
the unwelcome prospect of rethinking traditional procedures, of keying in
decades of outdated data, and, worst of all, of having to decide what to
simplify, abridge, or omit.[4]
Database planners know the importance of controlling data
structures--of assuring that the content of the data held in the system is in
harmony with the structure of its files and the disposition of its fields. One
may think of data as the vocabulary, the data structure among files as the
syntax, and the rules which define their relations as the grammar of an
artificial communication system. Whereas business and administrative databases
emphasize the transactional life of their objects (e.g. sales, clients,
inventories), fine arts databases must place greater emphasis upon the structure
and definition of attributes assigned to their objects, in addition to
whatever administrative transactions the database must monitor. In arts
databases one should expect to find a synthesis of form and content: the form of
a database reflects its function, and the way it operates provides meaning to
its objects.
The customary way to initiate small-scale descriptive
databases is to begin with field selection, not with a model of the museum's data. This
process often ignores the influence that the database management system will
exert on the structure and disposition of its information.[5] Of course, the fields chosen must
satisfy the need for object description, for management functions, and for
locating designated points of entry into the collection database, but they must
be designed in concert with the logical structure of a database management
system.[6]
No matter how intricate these in-house systems may become,
the usual practice is to devote one file per logical data group: Object
information in the object file, and so on. In these, data are input
record by record--a process that sees the extant paper record as a fixed
inviolate unit, and that assumes that the database model reflects the structure
of the paper based files. This process may be said to work from the top,
down.
The information system described here takes a totally
different approach from this recognized standard (Fig. 1).
Instead of slicing the process of data entry horizontally, so that museum
records are entered by object, this system slices the task vertically.
Data are entered by type. The file structure is determined by and echoes the
data entry method. It builds the database from the "bottom up." Rather than one
large file burdened with the task of holding the mass of object data, in this
system the object record is divided into linked groups of specialized files,
each dedicated to the task of recording specific types of information. In the
end, this process makes for faster queries, more efficient use of data storage
resources, and a more supple system all around.
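The division of the object record into linked, specialized files can be sketched as follows. This is only an illustration in Python with SQLite, not SWAP's Informix code; the table names, column names, and sample values are all invented, since no catalogue of data elements is given here.

```python
import sqlite3

# Hypothetical tables and columns; SWAP's actual fields are not listed in the paper.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (
        accession TEXT PRIMARY KEY,
        classification TEXT NOT NULL
    );
    CREATE TABLE description (
        accession TEXT PRIMARY KEY REFERENCES inventory(accession),
        title TEXT
    );
    CREATE TABLE inscription (
        accession TEXT REFERENCES inventory(accession),
        wording TEXT
    );
""")

# First pass: enter one type of data (the inventory) for every object.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("A.1950.1", "European/German/Metal"),
     ("A.1950.2", "Ancient/Metal/Mirror"),
     ("A.1951.7", "Ancient/Ceramic/Vessel")],
)
# Later passes: add other types of data only where they exist.
conn.execute("INSERT INTO description VALUES (?, ?)", ("A.1950.1", "Covered beaker"))
conn.executemany(
    "INSERT INTO inscription VALUES (?, ?)",
    [("A.1950.1", "maker's mark on base"), ("A.1950.1", "owner's initials on lid")],
)

def object_view(accession):
    """Reassemble one object's record from the linked files."""
    row = conn.execute(
        "SELECT i.classification, d.title FROM inventory i "
        "LEFT JOIN description d ON d.accession = i.accession "
        "WHERE i.accession = ?", (accession,)).fetchone()
    notes = [r[0] for r in conn.execute(
        "SELECT wording FROM inscription WHERE accession = ? ORDER BY rowid",
        (accession,))]
    return {"classification": row[0], "title": row[1], "inscriptions": notes}
```

An object that has only an inventory entry still yields a usable, if sparse, record, and no space is reserved for descriptions or inscriptions that do not exist.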
SWAP in Context
In this paper the program will be called SWAP, though that
was not the name under which it ran.[7] SWAP was developed as an independent enterprise for a museum
participating in the Getty Museum Prototype Project.[8] SWAP benefited indirectly from the technical assistance provided by
the Getty Trust and by Willoughby Associates, their advisers. SWAP was written
in Informix for MS-DOS, the database management system chosen for the
Prototype Project.[9]
The collection for which SWAP was written is diverse. It
holds the finds of archaeological expeditions; collections of figurines, glass
and ceramic vessels, both ancient and modern; tableware, furniture, university
portraits; Renaissance drawings and prints; Eastern manuscripts and decorative
arts; and paintings from the Middle Ages to the present, and more. Some objects
have been in the custody of the university from its inception in the mid-eighteenth
century; many were collected as early as the nineteenth, but inventoried for the
first time only in the early twentieth.
To create the database, the first obstacle was to overcome
the inconsistencies of past cataloguing traditions. The internal museum
catalogue was arranged by curatorial department and further broken down,
hierarchically, but sometimes illogically, into sub-divisions defined by
geographical origin, media, object type, stylistic and/or historical class. Some
curatorial divisions were defined by culture (Far Eastern, Ancient, Medieval,
European), and some by media or technique (Manuscripts, Prints, Drawings and
Photography).
Within curatorial classifications, accession cards were
further broken down into divisions defined by culture, media and object type.
The card catalogue has developed organically through the years, and consequently
manifests many inconsistent, non-parallel divisions. Whereas some areas are not
partitioned at all, arranging all objects in accession order, others, the
"Ancient" and "European" drawers, in particular, show many divisions. Compare
the following examples with each other:
| Class | Level 1 | Level 2 | Level 3 | Level 4 |
|---|---|---|---|---|
| Ancient | Ceramic | Vessel | Greek | Geometric |
| Ancient | Metal | Mirror | | |
| Ancient | Metal | Egyptian | Gold | |
| European | German | Metal | Vessel | |
Most object cards were filed thematically, not in
accession order.
The museum kept no shelf list or locations file,
though it did have sets of manuscript ledgers in which all objects were assigned
accession numbers. Unfortunately, these accession ledgers would rarely lead a
researcher to the location of the accession card for an object.[10] For this reason, if for no other,
the files needed an automated index of accession numbers and drawer
locations.
Historically, several accessioning schemes were used
to register objects. When the registration ledgers were begun in the 1920s, each
object then in the collection was assigned an undated inventory number. Later
acquisitions were labeled with the year of receipt, the sequence of accession,
and a "parts" suffix. An identical parallel system was applied to the collection
of works on paper. University portraits received their own identification
numbers, following a published inventory. Any single accession number,
therefore, could refer to several objects. Loans to the museum were registered
with a similar system.
The above technical difficulties notwithstanding, the
vitality of the museum made adoption of an advanced automated collections
management system advisable. Because the museum is an academic institution closely tied to the
department of art history, unusual demands are placed upon its small
staff. Objects from storage are frequently pulled out for class use. Each
academic year sees more than twenty small didactic exhibitions of works culled
from the collection. Visiting scholars are treated with great respect and
hospitality, and given easy access to materials in storage. Moreover, hundreds,
perhaps thousands of unaccessioned objects, kept in storage or on display, have
been loaned to the museum by alumni and faculty. Many of the museum's own works
are lent out, sometimes to departments and exhibitions on campus, sometimes to
large travelling exhibitions. Objects continually move in and out of
conservation. But with all this, there is no public listing of the collection
contents. Researchers must be provided direct access to the accession cards;
ordinary students and the public usually have no access.
Obviously, the casual structure of object
classifications hindered efficient use of the collection catalogue, and made the
chores of the administrative personnel particularly time-consuming and
frustrating--especially since there was no locations file. The formidable loan
program needed automation, but because the museum had just built a new wing, and
was to move nearly every object from the old space to temporary storage in the
new wing while the old areas were renovated, an object and packing inventory was
the first priority.
SWAP Data Structure
The Inventory File
Because SWAP is a "bottom up" collections system, it
will be described here in the order it was created. SWAP began conventionally
enough as a simple accession number check-list prepared to monitor the move to
the new wing. This list was made from the accession card catalogue (Fig. 2).
Into SWAP's inventory file was entered each
object's accession number and filing classification--establishing the needed
index to the accession drawers. With this simple file, one could query by any
combination of classification tiers, by accession number, or by any of its
components.
Borland's Superkey, a keyboard macro enhancer,
was used to simplify the process of data entry (Fig. 3). A macro was designed to insert all repeating data, to send
the database through its addition routine and to open a new record for the next
object. Only those elements which changed from object to object were input by
hand. Indexed fields, which always impede data entry, were kept to a minimum.
Additional indices were added only after the basic inventory was complete.
Normally fewer than six keystrokes were sufficient to make the system create a
record. On a good day, working alone, I could enter one thousand
objects.
The accession number was cast as a "composite" set of
four fields: collection prefix, year of acquisition, acquisition sequence, and
parts identifier. This compartmentalization sped entry, facilitated queries, and
made sorting more convenient. To avoid a costly renumbering program, it was
vital that the accession number serve as the key identifier of the object. The
overlapping schemes were resolved with the addition of the collection prefix. In
this way, one system could identify all standard acquisitions, loans to the
museum, and any other category which may prove useful. In this system the
accession number, in all its four parts, also serves as the link connecting the
constituent database components and as a convenient point of entry into any file
using it. File integrity is maintained by a process which looks to the accession
number for authority to enter and change records.
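A minimal sketch of such a four-part accession number, written in Python rather than Informix, may make the benefits for sorting and querying concrete. The dotted display form, the prefix codes "A" and "L", and the sample numbers are assumptions made for illustration; the museum's actual notation is not given here.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class AccessionNumber:
    prefix: str      # collection prefix (hypothetical codes: "A" acquisition, "L" loan)
    year: int        # year of acquisition
    sequence: int    # acquisition sequence within the year
    part: str = ""   # parts identifier; empty for an undivided object

    @classmethod
    def parse(cls, text):
        """Parse a hypothetical display form such as 'L.1987.12.b'."""
        pieces = text.split(".")
        if len(pieces) == 3:          # no parts identifier given
            pieces.append("")
        prefix, year, sequence, part = pieces
        return cls(prefix, int(year), int(sequence), part)

    def __str__(self):
        base = f"{self.prefix}.{self.year}.{self.sequence}"
        return f"{base}.{self.part}" if self.part else base

numbers = [AccessionNumber.parse(t) for t in
           ["A.1987.12", "L.1987.12", "A.1987.2.b", "A.1987.2.a", "A.1953.40"]]

# Sorting compares prefix, then year, then sequence, then part.
ordered = sorted(numbers)
# Any single component can serve as a query term.
loans = [n for n in numbers if n.prefix == "L"]
```

Because the sequence is held as a number in its own field, item 2 sorts before item 12, which a single undivided text field would not guarantee; and the prefix keeps a loan and an acquisition with the same year and sequence distinct.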
Informix's "composite" field type contains no
data of its own, only pointers to its components. Although its length is
minimal, the composite name can be specified in any query in place of its
component field names. This is handy, of course, but even more importantly,
files can be joined on the composite, one composite can "lookup" another for
verification, and composites can be indexed, either allowing or disallowing
repetition of values.
Properly used, these features permit great control
over the contents of a database, and simplify its operation. For example, the
composite of accession number fields was used to supervise the process of
recording them. By indexing the composite of the accession number, and by
choosing the proper indexing strategy for each component of the accession
number, the system guaranteed uniqueness and sped searches. It took only 10 to
20 seconds to pull one record out of 33,000 on a slow IBM XT.
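The effect of a unique index over the whole accession number can be imitated in SQLite, as in the sketch below. It is an analogy only: SQLite has no "composite" field type, so a unique index over the four component columns stands in for it, and the table, column, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        prefix TEXT NOT NULL,
        year INTEGER NOT NULL,
        sequence INTEGER NOT NULL,
        part TEXT NOT NULL DEFAULT '',
        classification TEXT
    )
""")
# One unique index over all four components rejects duplicate accession numbers
# and also serves lookups by the full number.
conn.execute(
    "CREATE UNIQUE INDEX accession_idx ON inventory (prefix, year, sequence, part)"
)

def add_record(prefix, year, sequence, part, classification):
    """Return True if the record was stored, False if the number already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO inventory VALUES (?, ?, ?, ?, ?)",
                         (prefix, year, sequence, part, classification))
        return True
    except sqlite3.IntegrityError:
        return False

first = add_record("A", 1987, 12, "", "European/German/Metal")
duplicate = add_record("A", 1987, 12, "", "Ancient/Metal/Mirror")  # same number again
loan = add_record("L", 1987, 12, "", "European/German/Metal")      # differs by prefix
```

The second call is refused because all four components repeat an existing number, while the third succeeds because the collection prefix differs.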
The requirement of uniqueness did slow the entry
process considerably. When the paper records were incorrect, as
they often were, entry stopped and the correct number had to be researched.
Bothersome, to be sure, but the result of this effort was a highly accurate list
of accession numbers, free from typographical errors, that would serve as the
foundation on which to build the remainder of the database.
SWAP's inventory file used fewer than 120
characters per record, yet from this it was possible to produce important
inventory documents: sorted lists by collection category, by object type, by
accession year, by card-drawer order, and so on. Curators received their
first-ever check-list of objects in their domain. Summary reports tallied
objects by accession year and accession type. This modest flat-file database
provided the museum with its first accurate count of its documented holdings in
each area (Figs. 7 and 13).
Beyond the Inventory File
When the inventory file was complete, SWAP's
second file was drawn out of the first (Fig. 7, top).
Without additional data entry, each unique set of classification values was read
into a master file. This file was turned into a "lookup" authority, against
which any new or changed classification had to be tested before the computer
would accept a new entry into the inventory file.[11]
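The derivation of a "lookup" authority from data already entered can be sketched in a few lines of Python. The classification tiers are taken from the card-drawer examples given earlier; the accession numbers and the function name are invented, and the sketch does not reproduce Informix's own lookup mechanism.

```python
# Inventory entries as (accession number, classification tiers).
# Accession numbers are invented for illustration.
inventory = [
    ("A.1950.1", ("Ancient", "Ceramic", "Vessel", "Greek", "Geometric")),
    ("A.1950.2", ("Ancient", "Metal", "Mirror")),
    ("A.1951.7", ("Ancient", "Metal", "Mirror")),
    ("A.1962.3", ("European", "German", "Metal", "Vessel")),
]

# Step 1: read each unique set of classification values into a master list.
# No additional data entry is needed.
authority = {classification for _, classification in inventory}

# Step 2: test every new or changed classification against the authority.
def add_to_inventory(accession, classification):
    """Accept a new inventory entry only if its classification is authorized."""
    classification = tuple(classification)
    if classification not in authority:
        raise ValueError(f"unknown classification: {' / '.join(classification)}")
    inventory.append((accession, classification))

add_to_inventory("A.1988.4", ["Ancient", "Metal", "Mirror"])
try:
    add_to_inventory("A.1988.5", ["Ancient", "Metal", "Miror"])  # typing error
    rejected = False
except ValueError:
    rejected = True
```

A misspelled tier is refused at entry, so the list of classifications cannot drift as the inventory grows.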
Although the database now served as an authoritative
inventory mechanism and index, at this stage it addressed only the two most
important entry points for object designation: Accession Number and Object
Classification. It contained none of the traditional fields for object queries
or object description. In fact, there were no object descriptions at all, no
makers, no subjects, no titles, no media, no size, no dates, no styles, no
cultures, and no donors or valuations.
To include these elements, conventional wisdom for
fixed-field database users would have the inventory file expanded by
adding fields. Fields added in this way would be available even if they were not
needed.[12] This
procedure was not used. Instead, the inventory file was left as is, and
turned into an authority file--the "lookup" file to which object records would
refer (Figs. 7, bottom, and 13).
Here is one of the practical benefits of the "bottom
up" method: With the inventory file complete, there was no need to
describe every object the museum owned. The thousands of accessioned objects of
no current interest, never shown, never cited, never even described in the paper
accession records, could be omitted from the object file--at least until
descriptions were needed or available.[13]
The inventory file and the object file
each prohibited duplicate accession numbers, so there could be only one object
record per inventory record. However, since every object part was given
its own unique entry in the inventory file, each part or accession
subdivision could be counted as an independent entity. A sketch-book, or a
portfolio of prints, for instance, might contain pictures by different artists.
Each drawing or print could be described and attributed independently, or not,
as desired, even as the item was described as a whole.[14]
Resource Management: Bernoulli Box Strategy
The entire SWAP system ran on modest equipment: an
original IBM XT with a 10 megabyte hard disk, a Bernoulli Box with two removable
20 megabyte cartridges, and a cheap Epson printer.[15]
The use of interchangeable media was essential to the
development plan of the database and offered a way to overcome some of its
inherited constraints (Fig. 8). Using the Bernoulli
Box, the database could be permitted to grow larger than the total online
capacity. To do this, one Bernoulli disk was designated permanent. It held the
large inventory and object files, and had to remain on line. The
other disk was considered interchangeable, and might hold various administrative
or descriptive files, plugged in as needed--manually.
Unlike some database management systems, such as
dBase III or R:Base, SWAP's system, Informix, permits its
data files to be distributed among device volumes. Furthermore, Informix
does not require the entire database to be available. If a file is not called,
it does not have to be on line. Thus donor, valuation, loans, specialized object
description, notes, inscriptions, bibliography, provenance, whatever, could be
available, or not, as determined by each task. Bernoulli disk management was
assigned to the system interface.[16]
SWAP's master Query-Update-Entry screen for the
object catalogue provides general access to its main object files and
facilitates interactive "many-to-many" queries (Fig. 9). The form is crowded because all data elements are displayed
on a single screen. It is not designed for rapid data entry, but, rather, for
general system maintenance and ad hoc form queries, allowing the user to
investigate the central portion of the catalogue and object/maker
relationships.[17]
The 20 megabyte Bernoulli disk imposed severe
restrictions upon the size of the object record, strictly limiting the list of
possible fields. This limitation was turned into an asset. The fields selected
had to be only those common to all objects and those required for the most
fundamental queries. Coded fields annotated dates and other field types.
Superkey "pop-up" help-screens, loaded by the interface, were used to
enter codes and explain their significance. Data-form programming would display
the meaning of important codes as they were entered or when they appeared in
queries.[18] As a
result, the object file was densely packed, with little waste, but not
overloaded with inexplicably obscure codes.
Use of the Getty Prototype Project
The Getty Prototype Project had given the museum a
database describing 900 of their paintings and 600 artists. SWAP used this data.
However, the Prototype data model was designed specifically for paintings,
and the size of each object record had been allowed to approach the system
limit. To fit 33,000 records or more onto 20 megabytes, some alteration was
mandatory. SWAP's object file was begun by using only segments of the
Prototype data. To fit the demands of SWAP's stipulated object record size of
350 characters, the object file stripped the inherited data of every
nonessential field--in the end cutting the record size to less than 20 percent
of the original.
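The storage budget behind these figures can be checked with simple arithmetic. The sketch below assumes one byte per character and a 20 megabyte cartridge of 20 x 1,024 x 1,024 bytes, and it ignores index and file overhead, none of which is reported here; the results are therefore rough upper-bound estimates, not measurements.

```python
RECORDS = 33_000
INVENTORY_RECORD = 120            # characters; the inventory used fewer than this
OBJECT_RECORD = 350               # stipulated object record size in characters
DISK_BYTES = 20 * 1024 * 1024     # assumption: binary megabytes, one byte per character

inventory_bytes = RECORDS * INVENTORY_RECORD
object_bytes = RECORDS * OBJECT_RECORD      # as if every object were described
permanent_disk_bytes = inventory_bytes + object_bytes

object_share = object_bytes / DISK_BYTES          # roughly 0.55
total_share = permanent_disk_bytes / DISK_BYTES   # roughly 0.74

# "Less than 20 percent of the original" implies that the Prototype record
# was longer than 350 / 0.20 characters.
implied_original_minimum = OBJECT_RECORD * 5
```

On these assumptions both the inventory and a fully populated object file fit on the permanent cartridge with room to spare, whereas a record of the implied original length (more than 1,750 characters) would not have fit for 33,000 objects.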
This reduction should not be considered a loss as
much as the first stage of a redistribution. Here, again, is where the Bernoulli
System was used to good purpose. If the new object record had no space for
acquisition information or inscriptions, this data was not deleted; rather, it
would be attached to the object data via associated files located on the other
Bernoulli disk. In fact, this new format would allow more flexible and efficient
fielding. For instance, one could record multiple inscriptions per object, and
not be forced to reserve space for data which did not exist.[19] One could conduct the museum's
object administration without forcing the program to sift through files of
curatorial data. In this way, this fixed-field relational database achieves some
of the benefits of database systems offering variable length and multi-valued
fielding, but without sacrificing the advantages of a relational
system.
SWAP's architecture and development plan are clear and
simple. Onto a highly controlled and authoritative core of object accession
numbers and filing classifications are attached modules as needed for new data
structures or functions. As long as the core data is complete, pure and
accurate, associated data may be entered voluntarily--as required for
administrative and descriptive needs. The presence of a complete inventory helps
monitor the development of associated records. Subtracting the accession number
list of new records from relevant portions of the complete set yields a list of
missing items. The user will be comfortable allowing the system to grow only as
circumstances demand and resources allow because "tickler" reports can always
reveal those items still missing from the database.
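The subtraction behind a "tickler" report is a set difference, as the following Python sketch shows. The function name, the optional filter, and the sample accession numbers are hypothetical; SWAP produced its reports with Informix's own tools.

```python
def tickler(inventory, described, wanted=None):
    """List accession numbers in the inventory that still lack a record.

    `wanted` optionally restricts the report to a portion of the inventory.
    """
    relevant = {n for n in inventory if wanted is None or wanted(n)}
    return sorted(relevant - set(described))

# Invented accession numbers, for illustration only.
inventory = ["A.1950.1", "A.1950.2", "A.1951.7", "L.1987.12"]
described = ["A.1950.2"]

missing_all = tickler(inventory, described)
missing_loans = tickler(inventory, described, wanted=lambda n: n.startswith("L."))
```

Because the inventory is complete, the report is exhaustive: anything absent from an associated file is guaranteed to appear in the list.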
Informix as a Relational Database Manager
Queries, Entries and Updates
Much of SWAP's power must be attributed to
Informix's remarkable Perform program, its multi-purpose
Query-Update-Entry utility. Perform allows the user to create complexly
programmed forms that can control data entry and queries in many ways, including
testing for values, and acting on values entered. These are not R:Base
QBE, or "query-by-example" forms, where queries are executed outside of the form
and displayed in it. Rather, Perform forms may be used for data-entry,
updating, record deletion, queries, and more. All functions are executed from
within the form, in mixed succession if need be.[20] Using this facility, it is
possible to implement limited Boolean searches ("and" and "not"),
range searches, last- and first-entry searches, double-ended wild-card searches,
or truncated indexed searches. Queries may operate in several fields at a time.
The field spaces used for data entry are the same as those used for queries.
Further, each Informix form can relate fields from as many as eight files
simultaneously, can implement complex "lookups", show programmed "display-only"
fields and manipulate data entry in standard and unusual fashions.
The routine of adding and querying within forms
becomes especially valuable when adding data, and is uniquely well suited to the
needs of collection management and cataloguing. For instance, one may add an
object record, then turn to the artist file to query for the object's
artist. If present, the artist record can be joined to the object on the fly; if
not, a new artist record may be added, and then joined to the new object, its
record still current.
Correcting newly entered or queried data is
especially convenient. When errors are detected, the user may page back to the
offending records, correct them, and return to his work, all without
changing forms and without executing another query.
Perform's most powerful attribute, and the one
which makes the program so useful for scholarly use, is certainly its ability to
execute "master/detail" queries--Informix's term for managing
"one-to-many" and "many-to-many" relationships. With this feature, without
changing forms, it is possible to query in one file, and from the resulting
query-list choose one record on which automatically to query all linked records
in another file. The query on the join is usually executed with a single
keystroke, sometimes two. This powerful utility makes many otherwise routine
uses of a command query language unnecessary, and is ideal for working with
records composed of many fields and databases with many files. Joins are planned
in advance at the database dictionary level, but are not cited there; they are
defined and built into each Perform query form. After viewing the records
in the linked file, the user may return to the primary list from which he
started--still current--choose another record, and request another "detail," or
he can execute another linked query into a third file from his current point,
thus chaining linkages, each time choosing the direction of query. Operating in
a simple object catalogue, this feature allows the user to query for an artist,
inspect object records attached to him, choose one of these works and back-query
to find all attributions made to it.[21]
Help
The standard Informix customized help system
is rather spare, but Perform works well with MS-DOS macro-processor
environments, like Superkey, to make up for this lack. Coded fields,
often so unwieldy, but so handy when controlling a limited vocabulary set,
become nearly friendly when bundled with Superkey display screens.
Keystrokes may be passed through these screens into the current program to set a
code. Perform also provides for "display-only fields." The data in these
fields are in the form, not in the database, but appear when prompted by the
appropriate screen programming. Thus, if coded fields are used, the form may be
set to show their significance (Fig. 9, bottom, and
Fig. 12).
Perform offers useful system help with the F1
function key. It permits user defined messages on the field level and supports
programmable "error" messages. Most of its own error messages are quite clear,
using the application names for fields and files when necessary.[22] In addition, full text screens may
always be written into the form itself to provide information specific to the
current application.
Development
Informix programs are easy to develop, but the
DBMS does not offer the user a friendly environment such as the one provided for
R:Base. Informix does not supply its own application generator and
screen painting utilities. Development takes place in the command-driven DOS
workspace. However, this allows the database developer great freedom to use
familiar tools. All Informix code may be written in ASCII on your own
wordprocessor. (I use XyWrite and Nota Bene, whose editors also serve as
handy shells and fine scholarly wordprocessors.)
Error handling is usually very good. When code fails
to compile, Informix creates error files. These are duplicate versions of
the code-file into which the compiler has inserted notes marking and annotating
errors, even giving advice about better programming practice.
Obviously, Informix merges easily with the DOS
environment. Data entry screens and reports written with Informix's
famous Ace report language can all be called as parameters to executable
programs. Thus standard DOS menu systems may be used to create user interfaces.
Informix includes its own menu utility which may be used to call
programs, run batch files, and provide necessary parameters for both.
Informix 3.3 does have several undesirable
features: Perform will not sort the results of a query; characters are
restricted to the seven-bit standard ASCII set, hence screen painting is
unattractive and international characters cannot be used; and queries are always
case sensitive. Some of these faults have been addressed in the latest SQL
version of the program. The new version supports diacritics and use of DOS box
drawing characters (with the separate 4GL programming language) and permits 16
files to be open at once, instead of just eight. All queries and sorting are
still case sensitive, however, sometimes forcing the user to define fields that
automatically turn all entries and queries into upper or lower case. But its
biggest disadvantage for collection management is one which it shares with most
database management systems designed for the PC environment: fields are single
valued and of fixed length, and must be coaxed into accepting certain forms of
data.[23]
SWAP Attribution System
Principles
Discussion of SWAP's object attribution system has
been left to the end. Thus far SWAP has been presented as a utility for the
collection-manager. Its underpinning is its authoritative inventory and tools
with which to build collection management functions. How does this system
support a credible working environment for the curator and researching scholar
without compromising its administrative purpose?
To understand SWAP's object attribution system it is
necessary to follow its genesis out of the Getty Prototype database. This
experimental database addressed many complex questions regarding the merging of
institutional data, but it was not a registrar's tool, and was never intended to
be one.[24]
Rather, emphasis was placed on object description and attribution--curatorial
tools.
To this end the Prototype database used a
conventional three file system to define "many-to-many" relationships among
artists and objects (Fig. 10). With this mechanism
any single object could be connected to any number of artists, and any
single artist could be connected to any number of objects. A linker file
made the "many-to-many" connection possible.[25]
SWAP used this system, too, but refined it, by making
the link file map the route connecting object and maker. For example,
SWAP's linking file might specify the maker's role in the creation of the work.
The benefit of this procedure is best seen in the way SWAP handled the problem
of works attributed to anonymous artists working under the influence of a known
person, for instance, as when a work is attributed to "The School of Peter Paul
Rubens."
In these cases the Prototype database defined a
generic artist entry for each designation--"Rubens, Peter Paul, Follower of"
being one (Fig. 11). Occasionally, such "Follower
of Rubens" attributions would be consolidated into a single artist record,
although, quite clearly, the term may be assumed to imply different
personalities. Similarly, "School of Rubens," "Manner of Rubens," and other
attribution subtleties create distinct artist records, each with its own
family of objects attached.
This procedure is a carry-over from old methods. When
these phrases are written on accession cards, or appear in flat-file databases,
their meanings are clear. In a relational system, where a single artist record
(axiomatically a unique biographical entity) may be attached to a number of
objects, their significance becomes clouded and confusing, obscuring the path to
objects and falsifying the relation of objects to one another.[26]
In contrast, SWAP's attribution procedure is modeled
on a simple grammatical principle. The artist file and the object
file contain distinct elements; the links define the relationships between them.
"Artist-Relation-Object" corresponds to "Subject-Verb-Object." Nominative
entities are not inflected by the objects to which they relate.[27]
SWAP's method isolates all significant conditioning
terms and puts them in the linkage file, not the artist file (Fig. 12).[28] Each
qualifying anonymous object is ultimately tied to the record of the artist
named. Thus a query for Rubens, for instance, will direct one immediately to all
the attributions to him and his circle. The initial search is not conditioned by
whatever "uncontrolled" vocabulary the curator and historians have chosen to
qualify the attribution. When one realizes that the linkage file can be used to
designate unofficial and official attributions, the source of attributions, and
even to record information regarding the function of multiple makers, details of
patronage, or the pictorial sources for objects copied after other works, the
system, for all its value as an inventory and administrative tool, also begins
to serve fundamental research needs of the museum's academic staff and its
scholarly public.[29]
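The effect of moving the conditioning terms into the linkage file can be shown with a small sketch. It is written in Python with the built-in sqlite3 module rather than Informix, and it is an illustration only: the field names relation and opinion, the accession numbers, and the titles are assumptions made for the example, with sample terms borrowed from those quoted in the notes.

```python
import sqlite3

def build_attribution_db():
    """One artist record; the qualifying terms live in the link records."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE artist (artist_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE object (accession_no TEXT PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE attribution (
            artist_no    INTEGER NOT NULL REFERENCES artist(artist_no),
            accession_no TEXT    NOT NULL REFERENCES object(accession_no),
            relation     TEXT    NOT NULL,  -- e.g. "School of", "Follower of"
            opinion      TEXT    NOT NULL   -- e.g. "Verified", "Dubious"
        );
    """)
    # Hypothetical data: a single Rubens record serves every attribution.
    con.execute("INSERT INTO artist VALUES (1, 'Rubens, Peter Paul')")
    con.executemany("INSERT INTO object VALUES (?, ?)", [
        ("1950.1", "Landscape"),
        ("1962.7", "Holy Family"),
        ("1971.3", "Portrait"),
    ])
    con.executemany("INSERT INTO attribution VALUES (?, ?, ?, ?)", [
        (1, "1950.1", "Autograph", "Verified"),
        (1, "1962.7", "School of", "Traditional"),
        (1, "1971.3", "Follower of", "Dubious"),
    ])
    return con

def attributions_for(con, name_fragment):
    """Return every attribution tied to artists whose name starts with the fragment."""
    rows = con.execute(
        "SELECT o.accession_no, t.relation, t.opinion FROM artist a "
        "JOIN attribution t ON t.artist_no = a.artist_no "
        "JOIN object o ON o.accession_no = t.accession_no "
        "WHERE a.name LIKE ? ORDER BY o.accession_no",
        (name_fragment + "%",))
    return rows.fetchall()
```

A single query on "Rubens" returns the autograph work together with the school and follower pieces, each still carrying its own qualifying term, and no separate "Follower of" artist record is needed.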
Recognizing that, like artist, school
is an assigned attribute, not a "property" of objects, and consequently subject
to different naming practices, different specifications, and different opinions,
SWAP turned this entity into a file unto itself. The linking file was therefore
given the dual function of describing just how each entity (school and/or
artist) was associated with the object.
Because attribution opinions and similar assignments
of objects to persons and schools are inserted into the attribution link
file with real art-historical terminology as found in the paper records, the
artist and object files are simplified. The object file
contains no artist information; the artist file contains no object
information.[30]
Artist Pseudonyms
Artist name authorities have always created problems
in simple databases. One museum's practice of naming an artist might conflict
with recommended usage; names of many artists do not fit into western naming
traditions; names by which artists are commonly known do not correspond to the
"first name" "last name" structure of name fields. SWAP sidestepped this entire
issue by providing a means of entering virtually any name into the artist record
and providing a way for nearly any query to meet success.
This was achieved partly by using Informix's
ability to index composite fields, and partly by redefining the nature of the
fields into which the artist name was entered. Rather than calling the two
fields for artist name first and last, the fields were named
sort name and other name. The primary artist record, record 'A,'
would use these fields to render the artist name just as the museum wished it to
appear. It did not matter whether a user could use this version to achieve a
successful query; the only principle followed was to produce
proper form in reports. Since SWAP was constructed to allow any artist to be
known by up to twenty-six pseudonyms, nearly any useful combination of names
could be entered in addition to the one required. The museum may prefer
"Barbieri, Francesco", but the database may also contain simply "Guercino." For
the sake of users, Rembrandt's given name could be placed in the sort
name field, without abandoning the museum preference for "Hermansz. van
Rijn, Rembrandt."[31]
Behind the scenes, the artist file and the
attribution link file are joined by an arbitrary artist number assigned
sequentially. Artist pseudonyms share a common artist number but are each
assigned a unique name variation code. The composite of artist number and
name variation code is indexed to prohibit repetition. This procedure prevents
repetitions of number/variation-code combinations, but permits repetitions of
artist numbers. The program is instructed to expect more than one artist record
per artist number, just as it expects to find more than one attribution
link file record per artist.
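A minimal sketch of this arrangement follows, again in Python with sqlite3 standing in for Informix. The letter codes 'A', 'B', 'C' for the name variation are an assumption suggested by the reference to record 'A' and the limit of twenty-six pseudonyms; the field and function names are invented for the example.

```python
import sqlite3

def build_artist_file():
    """Artist file in which several name records share one artist number."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE artist (
            artist_no  INTEGER NOT NULL,           -- may repeat
            variation  TEXT    NOT NULL,           -- 'A' is the report form
            sort_name  TEXT    NOT NULL,
            other_name TEXT    NOT NULL DEFAULT ''
        );
        -- Composite index: the pair must be unique, the number alone need not be.
        CREATE UNIQUE INDEX artist_name_key ON artist (artist_no, variation);
    """)
    con.executemany("INSERT INTO artist VALUES (?, ?, ?, ?)", [
        (1, "A", "Barbieri", "Francesco"),
        (1, "B", "Guercino", ""),
    ])
    return con

def add_variation(con, artist_no, variation, sort_name, other_name=""):
    """Try to add a name record; return False if the pair is already taken."""
    try:
        con.execute("INSERT INTO artist VALUES (?, ?, ?, ?)",
                    (artist_no, variation, sort_name, other_name))
        return True
    except sqlite3.IntegrityError:
        return False

def report_name(con, query):
    """Resolve any recorded name to the museum's preferred 'A' form."""
    row = con.execute(
        "SELECT a.sort_name, a.other_name FROM artist q "
        "JOIN artist a ON a.artist_no = q.artist_no AND a.variation = 'A' "
        "WHERE q.sort_name LIKE ? LIMIT 1", (query + "%",)).fetchone()
    if row is None:
        return None
    sort_name, other_name = row
    return f"{sort_name}, {other_name}" if other_name else sort_name
```

A user who knows only "Guercino" still reaches the record the museum prints as "Barbieri, Francesco," while a second attempt to enter variation 'B' for the same artist number is refused by the index.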
Similar methods were used to prevent duplicate links
between artists and objects. A later version of SWAP allowed automatic sequential
assignment of new artist numbers even though these values were allowed to
repeat.[32]
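Because the artist number repeats, it cannot be generated by an auto-incrementing key. One plausible way to offer the next number is to look up the highest value already in use, as in the sketch below. This is an assumed mechanism written in Python with sqlite3, not SWAP's documented routine, and the helper names are hypothetical.

```python
import sqlite3

def make_artist_file(rows):
    """Build a small artist file; rows are (artist_no, variation, sort_name)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE artist (artist_no INTEGER NOT NULL, "
                "variation TEXT NOT NULL, sort_name TEXT NOT NULL)")
    con.execute("CREATE UNIQUE INDEX artist_name_key "
                "ON artist (artist_no, variation)")
    con.executemany("INSERT INTO artist VALUES (?, ?, ?)", rows)
    return con

def next_artist_no(con):
    """Offer the highest artist number in use, increased by one."""
    (last,) = con.execute(
        "SELECT COALESCE(MAX(artist_no), 0) FROM artist").fetchone()
    return last + 1
```

The value is only an offer: nothing in the table prevents the operator from entering an existing number in order to add a pseudonym to an artist already on file.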
Controls on file integrity
As systems created from the bottom up grow in
complexity and add more functions and more files, there is justified concern
that data can wander away from its associated records. SWAP defined a protocol
that governs the maintenance of linkages and maintains the authority of the data
(Fig. 13). For the files cited
thus far, here are some of the rules.
For the Inventory File: No object
record could be entered into the inventory file unless its accession
number was unique. No accession number could be entered unless it was assigned
an authorized set of classifications corresponding to extant drawer
divisions.
For the Object File: No object record
could be entered into the object file unless its accession number was
unique and already existed in the inventory file. Only one object record
could exist per inventory record. The inventory record could not be deleted or
changed when an object record was attached to it.
For the Attribution System: When an
object file record was linked to an artist file record, the artist
record could not be deleted without unhooking all link file attachments
to the artist. Similarly, objects could not be deleted when they were attached
to artists or schools. Links joining them must be destroyed first.
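The rules above can be expressed compactly as declared constraints. The sketch below does so in Python with sqlite3; it is an illustration of the rules, not a reconstruction of SWAP, where such controls were programmed rather than necessarily declared in this way. The table and field names are invented, and the school file is omitted for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE classification (code TEXT PRIMARY KEY);
CREATE TABLE inventory (
    accession_no   TEXT PRIMARY KEY,            -- must be unique
    classification TEXT NOT NULL REFERENCES classification(code)
);
CREATE TABLE object (
    -- Primary key plus reference: at most one object record per
    -- inventory record, and the inventory record must already exist.
    accession_no TEXT PRIMARY KEY REFERENCES inventory(accession_no),
    title        TEXT NOT NULL
);
CREATE TABLE artist (artist_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE link (
    artist_no    INTEGER NOT NULL REFERENCES artist(artist_no),
    accession_no TEXT    NOT NULL REFERENCES object(accession_no)
);
"""

def open_db():
    """Open an in-memory database with foreign-key enforcement switched on."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    con.executescript(SCHEMA)
    return con

def allowed(con, sql):
    """Run one statement; report whether the integrity rules permitted it."""
    try:
        con.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

With these declarations an object cannot be entered before its inventory record, an inventory record cannot be deleted or renumbered while an object is attached, and an artist or object cannot be deleted until its links have been removed:

```python
con = open_db()
allowed(con, "INSERT INTO classification VALUES ('Prints')")
allowed(con, "INSERT INTO object VALUES ('1950.1', 'Landscape')")    # refused
allowed(con, "INSERT INTO inventory VALUES ('1950.1', 'Prints')")    # accepted
allowed(con, "INSERT INTO object VALUES ('1950.1', 'Landscape')")    # accepted
allowed(con, "DELETE FROM inventory WHERE accession_no = '1950.1'")  # refused
```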
These, and other programmed controls, help make the
database difficult to corrupt with sloppy maintenance. Data is secure by virtue
of the rules required to create and remove entries. This means that the system
may be administered by several people in succession, with full confidence that
the private working habits of a former operator will not have seriously
compromised the ability of the database to perform for its current
workers.
Most importantly, here is a database that began as a
simple inventory project. In a short time, with minimal financial commitment, it
evolved into a system of moderate sophistication with potential for further
growth. SWAP is the kind of system that can be put together by someone who has
not been trained in database techniques or theory. It does not offer the bells
and whistles found in some of the commercial systems, but it goes further than
some in respecting the differing needs of all those who must use or manage
object data within the small museum environment.
Figures
Notes
| 1. |
A shorter version of this
paper was read at the 1988 conference of the Museum Computer Network at Santa
Monica, California. [text] |
| 2. |
Within the last few years
museums have seen the advent of ever more capable automated management and
cataloguing systems. Among other features, these programs allow the user to
execute sophisticated searches through structured lexicons and thesauri, and
submit routine administrative procedures to computer control. Although some of
these systems are wonderfully sophisticated, they may be costly. [text] |
| 3. |
Prices are quoted at
approximate mail-order discount: XT clone MS-DOS computer with 20 or 30 megabyte
hard-disk: $1500, Bernoulli Box: $1700, 10 Bernoulli disks: $800, DBMS and
related software $1500, Printer: $350. Miscellaneous supplies, about $350. This
assortment would form the bottom-line minimal configuration necessary to yield
acceptable results for collections of up to forty or fifty thousand objects.
Newer MS-DOS computers based on the '286 or '386 Intel chip will work faster.
The Bernoulli Box is not the fastest storage device, but its speed can be
increased for AT class and faster computers by decreasing the interleave value.
[text] |
| 4. |
Many readers may find
espousal of this dual ambition surprising, for database managers commonly have
been forced to adopt schematized or abbreviated notations for objects. It was
thought that the hard facts required for clean reports and efficient
object management could not co-exist with the soft facts--the opinions
and value judgments--the critical object history prized by
curators. |
| |
The following hypothetical situation
illustrates how real-world records might describe an object:
Over the years, historians and our curators have
attributed this painting to three artists, Tom, Dick, and Harry. The present
curator, however, believes this work to be from Harry's workshop with traces of
studio hands Larry and Carrie. However, under the terms by which the work was
given to the museum, it must be identified in the museum and museum publications
as by Dick, whose lost composition our painting probably imitates. Incidentally,
Harry is commonly known by several pseudonyms, but the museum insists on using
his given name, by which only a few know
him. |
| |
Not many modest collection databases can
accommodate this kind of data in their artist/object attribution apparatus.
[text] |
| 5. |
A potentially intricate
and frustrating endeavor when diverse collections must be accommodated.
[text] |
| 6. |
In the MS-DOS world the
resulting database system often is of the fixed-field variety, the file
structure is flat--one file, or related--essentially joined flat files. Advanced
home-grown arrangements might include separate "maker" and other files or tables
to reduce the occurrence of repeating data. Transaction records are kept in
similar linked files. Some systems might apply this concept to other situations
where data tends to duplicate, including, perhaps, cultural or stylistic
groupings, donors, bibliography, exhibition history--all of which must be linked
to object descriptions.
Some of the more sophisticated formats of this
genre use a three file system to establish "many-to-many" relations. With these,
one object can be linked to multiple makers, and one maker linked to multiple
objects. One borrower record can be linked to all items borrowed, and one item
can be linked to all those who have borrowed it. But servicing these systems and
getting them running often require use of intermediate data entry forms, or
other slow and intricate methods of data entry. The consequential systematic
transfer of complete object records is often unproductive and unnecessary; the
process ignores the fact that only a small portion of a museum's accumulated
data will be useful for current operations.
Although remarkable
techniques for rapid data entry have been developed, and clever design of data
entry screens will go a long way to make life pleasant, the end product may
still be very wasteful of computer resources, yielding slow stodgy systems,
packed with empty fields and repeating data. [text] |
| 7. |
The museum for which the
program was built wishes to remain anonymous. SWAP was a spare-time enterprise
undertaken midst this writer's other duties as the museum's Prototype Project
administrator. [text] |
| 8. |
The Museum Prototype
Project, sponsored by The Getty Art History Information Program (AHIP), studied
the feasibility of creating a unified collection catalogue for paintings
residing in eight trial museums. When project funding ceased, SWAP's development
halted. The program was abandoned by the museum, functional but incomplete.
PCPHASE, the database provided to the museum by the Prototype Project, was
abandoned too. [text] |
| 9. |
Informix, Version
3.3 subsequently has been replaced by a more advanced and versatile release.
Some of its features are discussed on page. [text] |
| 10. |
Records in the accession
ledgers were fixed, but the order of cards in the drawers was movable, and
tended to obey the logic of whoever cared for the collection. [text] |
| 11. |
Additionally, as each
object was packed for transfer, its accession number and packing-box number were
entered into another, coördinated database. The goal was to compare the list
of objects to the list of cards--in order to discover those cards for which no
objects were found, and those objects having no cards. This packing database was
to form the basis of a true location list as unpacking commenced, but this step
was probably never realized. [text] |
| 12. |
For example, if an artist
field were added, each record would be given field spaces for an artist name, or
spaces in which to indicate school, or media, or donor. Some flat-file and
relational databases designate fields for several artists, even adding areas for
printers, publishers, manufacturers, and so on--wasting more data storage
resources and introducing unnecessary complications for queries and reports.
[text] |
| 13. |
The object file
might better be considered a "Working Object File," at least provisionally.
[text] |
| 14. |
Whereas some objects,
such as the sketchbook cited above, required expansion, others needed
conflation. Although the museum would try to assign consecutive accession
numbers to objects which belong in a group, as in a portfolio of prints, there
was no way to identify groups. Sometimes such lots would be given to the museum
in stages, so their numbering could not be consecutive. For these situations,
another field, not part of the accession number composite, served to unite
disparate objects. Arbitrary numbers entered into a lot field, and
described in a specially reserved portion of the title field would allow
users to collect each such grouping with a single query. This field was a data
element in the catalogue file, not the inventory file. [text] |
| 15. |
The Bernoulli system
makes an excellent backup device. It takes less than five minutes to produce an
exact duplicate of a 20 megabyte disk. Copies made this way need not be
"restored" for use. When the Bernoulli Box is used for the database, it need not
interfere with other uses since the entire database or any other application can
be removed easily. This means that if the collection manager already has an
MS-DOS computer and printer, the only equipment he need purchase is the
Bernoulli device.
The printer does not have to be fast, but it must be
reliable. I chose the Epson FX-85 because of its proven worth and low cost, and
because it was possible to make it print the entire IBM character set, including
characters with diacritical marks. [text] |
| 16. |
The system interface
prompted for insertion of proper disks, loaded selected Entry-Update-Query
forms, and custom help files, and ran reports and utilities. Building the
interface required no specialized programming ability. It was created out of DOS
batch files and Superkey macros, all called through the Informix
menu utility. When a query or report form was called, a batch file would check
for the correct disposition of disks, prompt for a missing disk, load the proper
Superkey help screen set, and finally load the form or run the report.
[text] |
| 17. |
Specialists in rapid data
entry will be quick to tell you that this type of crowded form is not
efficiently designed for their purpose. They are right, of course; but placing
all fields on a single screen does make it very easy to use Informix's
form-Query-Update-Entry utility. The compactness disappears with familiarity.
Rapid data entry screens should have simple layout, address a limited number of
fields, and minimize cross file relations, but they are not efficient tools for
general object administration and collection queries. [text] |
| 18. |
These values were not
data elements. They were programmed into the form and did not take up precious
data storage resources. [text] |
| 19. |
Not every object contains
an inscription. No database planner would want to waste valuable space in every
object record for inscription text, notes about its meaning, language, location,
author, calligraphy, style, media, etc. An inscription file can link its records
to the small number of inscribed objects. Another benefit: no compromise need be
made when one object has several inscriptions; just link several inscription
records to one object. [text] |
| 20. |
Any number of forms may
be created for the same materials. One form may bar designated activities in
specified files and fields, while another permits it. One form may be designed
for rapid data entry and another for general system use and occasional entry.
[text] |
| 21. |
A typical query session
might proceed as follows: The goal is to find the complete set of attributions
for an object we know only as the work of one particular artist. We do not know
if other attributions to this object exist in the database. We do not know the
object's accession number, and may not even remember its title. The strategy is
to query the artist file under the name we know, and explore the links to
objects, to choose the object required, and then find all its links to artists
and schools.
(1) Query artist file for artist. Artist record is current. One link and its one object record also show.
(2) Ask for detail of link file from artist file. Link file is now active, all links to artist are current.
(3) Page through link file records to find target object. (Object records joined to link records show as each new link record is displayed.)
The above procedure is standard for finding all
objects attached to an artist. Now we continue, searching for all links to a
particular object.
(4) Ask for detail of object file from appropriate link record. One target object record is now current.
(5) Ask for detail of link file from object record to see all artist and school records attached to object. Page through links to see all connected schools and artists. Any current query list can be output to system files or printer. Its form copies the screen format. |
| |
However, if we knew the accession number of the
object, we could have queried for it in the object file and moved directly to
step five. These instructions are more complex in the telling than in practice.
The sequence outlined above requires input of only the artist's name, or a
truncated part of it. The rest of the query procedure is accomplished with
simple system controls; no data are input, and no linking fields are filled for
queries. [text] |
| 22. |
In contrast,
Informix's system error messages are opaque. [text] |
| 23. |
Informix for
MS-DOS environments is a single user system with no concurrency, password, or
security controls. Dedicating specific disks to sensitive information,
including, but not limited to, Donor and Vendor files, Valuation files, etc.,
overcomes this limitation somewhat--by default creating a poor-man's security
system. One of the virtues of the removable Bernoulli cartridge is the de
facto database security it provides. If physical disks are under lock and
key, they cannot be accessed. If sections of the database are to be made
available at other locations on a read-only basis, duplicates of the required
files may be run on independent machines. Unauthorized access cannot occur if
data is not present. Unauthorized writes on satellite machines never alter the
main data files. Transferring whole files to satellite locations is much easier
than down- and up-loading files or tables into alternate databases. Single-user
MS-DOS Informix applications may be upgraded to a multi-user UNIX system,
with security password controls, if necessary. [text] |
| 24. |
Utilities were not
provided with which to register object location, object valuation, loan
activity, etc. [text] |
| 25. |
This is a standard
database management tool, such as is commonly used to associate students to
courses and courses to students; clients to services, services to clients,
etc.
The Prototype object record included some fields that might better
have been conceived as relations. One of these was donor, another was
school. The latter could not be repeated within the record for a single
object. When these fields are defined as part of the object record, they must be
understood as "properties." Properties such as size offer little
opportunity for subjective conditioning and uncontrolled multiple nomenclatures,
though they may be multi-valued. Multi-valuedness, in itself, does not
constitute a relationship. Lists of media, where nomenclature problems do exist,
are best handled by multi-valued and variable length fields tied to a lexicon.
Object titles offer additional problems. Although often determined by opinion,
tradition, and scholarly interpretation, titles are usually thought to be a
property of the work of art, even when one work may be known by many titles.
Although overlapping is to be expected, the title field should not be confused
with fields created for subject access. SWAP made no attempt to offer
opportunity to create multi-valued titles or provide subject access, but there
is no reason why a subject file cannot be linked to object records for this
purpose. See Figure
1. [text] |
| 26. |
In cases where objects
are truly anonymous, be they Roman glass vessels or Renaissance paintings, the
user is encouraged to define no artist at all. In these situations, all relevant
information derives from the object. The database of the Prototype Project,
insisting on the metaphor of the maker, defined anonymous artists with
two-hundred year active lifespans for objects attributed to the seventeenth or
eighteenth centuries, and active lives of a single year for anonymous makers of
dated paintings. In contrast, breaking its own rules, SWAP offered a single
anonymous artist record to which any such object could be attached, though
technically, anonymous works should be left unattached. [text] |
| 27. |
The following variations
on attributions to Rubens and his circle were found in the databases of several
of the museums participating in the Prototype Project:
Rubens, Peeter Pauwel
Rubens, Peeter Pauwel (Copy after)
Rubens, Peeter Pauwel (Follower of)
Rubens, Peeter Pauwel (Imitator of)
Rubens, Peter Paul
Rubens, Peter Paul (Studio of)
Rubens, Peter Paul, Attributed to
Rubens, Peter Paul, Copy after (Flemish, probably XVII century)
Rubens, Peter Paul, Copy after (XVII century)
Rubens, Peter Paul, Copy after (probably XVIII century)
Rubens, Peter Paul, Sir
Rubens, Peter Paul, Sir (Studio of)
Rubens, Peter Paul, Workshop of |
| |
It seems obvious from the above list that anonymous
objects provide the source of the artist name. These names do not stand for
identifiable, but anonymous personalities, as, for example, the "Master of the
Housebook" does. Rather, they are generic attributions, personifications or
projections drawn out of the objects, or analogies made to styles of known
masters. In these cases, the database is not relational but tautological.
[text] |
| 28. |
SWAP's attribution
link file provides four filters. The first identifies the kind of artist
record attached. Thus one may specify that the object is a) by the hand of the
linked artist, b) anonymous, but attached to the name of the named person, c)
anonymous, and linked to a school or cultural tag. The second field designates
the manner of relation: "School of," "Tradition of," "After a design by," etc.
The third provides opportunity to register an opinion about the stated relation:
"Verified," "Rejected," "Traditional," "Dubious," etc. The fourth offers space
in which to cite the artist's role and is used to specify printers, publishers,
founders, assistants, and so on.
Potentially, each link could be signed:
Scholar A says the object is a "copy after;" scholar B says it is "autograph;"
scholar C says it is "definitely Jordaens;" but the museum has officially called
it "School of Rubens." If desired, additional programming could connect each
link to a bibliography or citation file. [text] |
| 29. |
In the Getty Prototype
system a user could not execute an indexed query on "Rubens" if he wished to
obtain a single list of all objects attached to his name. To find all works
given to Rubens and his school, etc., the researcher would have to query the
artist file for every occurrence of the name Rubens, then issue a new query on
each permutation bearing his name to find attached objects. The method is
awkward, and quickly brings the user near his frustration threshold. Although
the distinctions between "manner of" and "style of," may be meaningful
art-historically, and certainly must be respected, the method does not recognize
how databases are used to find objects and attributions. Rather than aiding the
user, this procedure stymies him and places impediments before him. [text] |
| 30. |
Following the practice
established in the Prototype database, each assignment of artist to object was
numbered in the link file. This procedure enables related schools,
artists and makers to be sorted in any specified sequence. A similar field in
the object record indicated the total number of links attached. The data entry
forms were programmed to prohibit entry of more attribution links than had been
authorized in the object record. If the database grew to such proportions that
the scholarly trappings overwhelmed the more limited needs of the registrar, any
query made from the object file could be conditioned by limiting the
number of artist or school records allowed to appear. [text] |
| 31. |
Persons do not need given
names. Master of Flémalle, Hand G, and Monogrammist CC are all acceptable.
[text] |
| 32. |
Because these values are
allowed to repeat, the artist number cannot be a serial field. Although the user
is free to assign any number he wishes, any request for a new number
automatically offers to increase the last number assigned by the value of one.
[text] |