Abstract - The University of Arizona recognizes data as an asset. This
paper covers the University's efforts to manage institutional data and the
events that led to developing an Information Architecture (IA). This paper will
explore what an Information Architecture is; the process for developing the IA;
the arguments for doing an IA; the benefits of an IA; and the issues that have
surfaced during the course of this project.
Introduction
In 1994 the University of Arizona engaged in a comprehensive assessment of
its current information technology environment and identified its major goals in
a plan that looked to the twenty-first century. That plan, "Strategic Directions
for the Year 2000," focused on the development of an integrated information
environment. This included integrating the operational systems, developing
information systems (i.e., a data warehouse), and consolidating redundant
business processes. Essentially this shift, from a process-based systems
environment to a data-centric information environment, meant a change in our
historical approach to system development. No longer would systems be purchased
or developed in isolation. Collaboration would be the cornerstone of integrating
data, consolidating business units, and refining data into information.
The effort to integrate data began with the formation of a working group of
University Data Stewards. Before data could be integrated, some understanding of
what data were essential to the operation of the University was needed. The Data
Stewards Work Group began by exploring the discipline of data administration.
While learning about data administration, they began establishing some of the
fundamental tools that would support their efforts. To date they have:
- adopted data naming standards
- established standard abbreviations
- wrote an employee data access policy
- formulated data access rules
- compiled a glossary of data and computing terms
- examined methods for identifying, collecting and documenting business rules
- started exploring the issues and problems surrounding data integrity.
The Data Stewards Work Group needed a way to store the inventoried data. They
considered the products that were available but found none that provided the
functionality they were looking for, so the decision was made to develop an
application in-house: the Institutional Data Catalog (IDC). The application used
the latest technologies (an object-oriented language, a relational database, a
client/server design, and a graphical user interface) and has more recently been
converted to a Web application.
The IDC contains entities and entity definitions; entity attributes and
attribute definitions; alias information about each attribute (i.e., where it's
stored, how it's formatted, how it's named); data steward names and contact
points; administrative system names and descriptions; rules that control
access to certain pieces of data (e.g., "student grade" is classified as
"limited access" data); and other useful information. The IDC can be accessed
through this URL: http://da.arizona.edu/ds
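As a rough illustration of the catalog contents described above, the following
Python sketch models one entity with an attribute, an alias, a data steward, and
an access classification. It is not the actual IDC application: every class and
field name, the system name, and the storage format are assumptions made for
this example. Only the "student grade" attribute and its "limited access"
classification come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alias:
    """Where and how an attribute is stored (hypothetical fields)."""
    system_name: str   # administrative system that holds the data
    stored_name: str   # name used for the attribute in that system
    data_format: str   # how the value is formatted there


@dataclass
class Attribute:
    name: str
    definition: str
    access_class: str = "unclassified"   # e.g. "limited access"
    aliases: List[Alias] = field(default_factory=list)


@dataclass
class Entity:
    name: str
    definition: str
    steward: str                         # data steward contact point
    attributes: Dict[str, Attribute] = field(default_factory=dict)

    def add_attribute(self, attribute: Attribute) -> None:
        # Record each attribute once and only once.
        if attribute.name in self.attributes:
            raise ValueError(f"duplicate attribute: {attribute.name}")
        self.attributes[attribute.name] = attribute


def access_class(entity: Entity, attribute_name: str) -> str:
    """Look up the access classification recorded for an attribute."""
    return entity.attributes[attribute_name].access_class


# Illustrative data only; the steward, system name, and format are made up.
student = Entity("student", "A person enrolled at the University", "Registrar")
grade = Attribute("student grade", "Grade awarded for a course", "limited access")
grade.aliases.append(Alias("Example Student System", "STU_GRD", "CHAR(2)"))
student.add_attribute(grade)
```

Keying attributes by name inside each entity is one simple way to enforce that
an attribute is cataloged only once per entity.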
In 1995, while the Data Stewards Work Group was working on these tasks,
an RFP was written seeking professional development strategies in Business
Process Engineering and Information Architecture development. The contract was
awarded to Texas Instruments Inc. The University formed two teams: one to learn
Business Process Engineering techniques and one to learn to develop an
Information Architecture (IA).
A team composed of data analysts from the Computer Center and
representatives from the Data Stewards Work Group met with a consultant from
Texas Instruments to develop an IA for the University.
What is an Information Architecture?
An IA is a comprehensive view of the business activities of the enterprise
and the data required for its operation. An IA consists of logical data
models based on the activities that have been confirmed as a complete, correct,
and stable statement of the business as it currently operates and is likely to
operate in the foreseeable future. The IA is developed independently of
organizational structure and technology trends.
How is an Information Architecture Developed?
An IA is developed through interviews with University employees who are most
knowledgeable about a particular business area. Typically two to three employees
from a business area meet with two IA modelers. At least two meetings are
scheduled with each business unit: one for conducting the interviews and one for
verifying the accuracy of the information collected. More than two sessions may
be required in more complex business situations. The steps for developing an IA
are:
- Through the interviews the modelers identify the subject areas important to
the business unit. These subject areas are called "entities". An entity is
loosely defined as anything that the University is interested in. It can be a
concept, person, place, event, or thing.
- An Entity Relationship Diagram (ERD) for each business activity is drawn.
This models the relationships that exist between entities as well as the
cardinality (i.e., occurs once and only once, occurs one or more times) and
optionality (i.e., may occur, must occur) of each relationship. [Note: At this
time we are using LBMS’s System Engineer data modeling tool to draw the logical
data models.]
- The attributes or descriptive characteristics of each entity are identified.
- A Data Item Set Diagram (DIS) for each super entity is developed. The DIS
exhibits the super entity and its subtypes and their attributes.
- Every component of the IA (activities and data) is defined to eliminate
duplication and erroneous assumptions.
- Finally, the entity and its attributes are added to the Institutional Data
Catalog.
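The cardinality and optionality recorded in an ERD can be captured in a small
data structure. The Python sketch below is one possible representation, not the
format used by the modeling tool; the class names and the two example
relationships are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    ONE = "occurs once and only once"
    MANY = "occurs one or more times"


class Optionality(Enum):
    MAY = "may occur"
    MUST = "must occur"


@dataclass(frozen=True)
class Relationship:
    source: str               # entity the relationship starts from
    target: str               # related entity
    cardinality: Cardinality
    optionality: Optionality

    def allows(self, count: int) -> bool:
        """Return True if `count` related target occurrences are permitted."""
        if count < 0:
            raise ValueError("count cannot be negative")
        if count == 0:
            # Zero occurrences are acceptable only for an optional relationship.
            return self.optionality is Optionality.MAY
        if self.cardinality is Cardinality.ONE:
            return count == 1
        return True


# Hypothetical relationships, not taken from the University's models.
enrollment = Relationship("student", "course", Cardinality.MANY, Optionality.MAY)
offering = Relationship("course section", "department",
                        Cardinality.ONE, Optionality.MUST)
```

Under these assumptions, `enrollment.allows(0)` is true because the relationship
is optional, while `offering.allows(0)` and `offering.allows(2)` are both false
because a course section must belong to exactly one department.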
The IA identifies the salient activities and data once and only once. It’s
important to continually review and analyze the activities and data as they are
collected to prevent duplication.
Why do an Information Architecture?
Oceans of data - We are essentially drowning in an ocean of data. But
while the quantity of data is ever-increasing, the availability of reliable,
accurate information (especially for sound business decisions) seems
increasingly in doubt. The solution is to control the rate of acquisition of
data. An institution cannot effectively do that unless it has an "inventory"
or "catalog" of institutional data.
Duplication of data - Another dimension to the information inventory
problem involves duplication of data across organizational and political
boundaries. Duplication of data spawns many secondary evils. When multiple
agencies collect the same data, there is duplication of effort. Data is
collected in an inconsistent manner and errors become built in. Reconciliation
of data for decision support or reporting purposes becomes difficult and in many
cases impossible. Data duplication has a high cost that few public institutions
can afford.
Information redundancy - Redundant information is a direct result of
duplicated database development efforts within individual organizational units.
Information repositories containing redundant data will result when there is no
overall information architecture in place to orchestrate development.
To achieve a general understanding - Understanding the processes and
data needed to fulfill the University's mission seems on the surface a simple
thing. However, the tendency is to think of the institution in terms of its
organization and political structure, both of which are complex and ever
changing. The IA provides a view of the business that does not change as the
infrastructure changes. Unless the business of higher education changes, the IA
will change little over time. What we come to understand today will be true well
into the future.
Institutions require reliable data for decision making - Institutions
compete for students, research dollars, private and public donations, and
faculty. This requires current, accurate information.
Ensure compliance with regulatory reporting - Funding from the Federal
and State governments for research, student financial aid, and operations is
based on reported data. Inaccurate reporting can result in reduced funding or
penalties being assessed against the institution.
Information is an important and expensive corporate asset - For all of
the reasons stated above, shouldn't we employ more techniques to improve the
quality of data and reduce data redundancy?
Benefits of an Information Architecture
- Facilitates integration of systems, processes, data, and information
- Documents processes and data in a central repository
- Supports data control, data management, and data inventory functions
- Increases the understanding of the business and promotes a common data
vocabulary
- Establishes data as a corporate asset
- Identifies redundant data and processes
- Documents the most elementary business rules
Issues
Long-term project in a quick-fix culture - Changing from process-based
systems to a data-centric information environment requires a major shift in
system development strategies. While technology increasingly supports rapid
application development, the IA is effecting cultural, political, and
organizational changes, all of which occur gradually.
The "not invented here" mind-set - Application development teams are
producing their own logical models rather than using those already in existence.
They either don’t trust or don’t understand the work that’s already been done.
As the IA proves itself to be a true and complete statement of the business
activities and data on which to base integration, and as more people understand
and use the IA as it was intended, this issue will lessen.
The transformation of a logical model to a physical model - Only
practical experience in this area will allow us to fully actualize the process.
At times the logical model has elements of a physical model and the physical has
elements of the logical. This in itself creates confusion and mistrust in the IA
and the value of the logical data model. Clearly establishing the scope of each
of these views will reduce duplication of effort and permit the completion of
the cycle from logical to physical and back to logical.
Maintaining the IA over time - Procedures need to be put in place for
maintaining the IA as new data and information requirements are established. The
success of this will depend on our ability to market the IA to the campus.
Education - Educating the campus on the purpose and use of an IA is
our greatest challenge. Continued use of the historical system development cycle
will produce the same results: unplanned redundant data and duplication of
effort across the organization. Integrated systems may result, but integrated
data and processes will not.
Strategies and Plans
Early Involvement with New System Development Initiatives - An
information architecture project pays off when the IA becomes
part of the organization's computing strategy. The IA needs to be seen as the
first stop on the development path or a measuring stick for any purchased
application software.
Make the IDC Available over the Web - The Institutional Data Catalog
can best serve the widest audience if it is a Web-based (update and query)
application. Not only can the Data Stewards maintain their data via the Web
using a thin client, but the Web also provides a link for other data initiatives
interested in using the definitions contained in the IDC.
Identify and Catalog Business Rules - Business rules are defined as
"collection(s) of specific rules or business policies that govern the
enterprise's behavior. Since these rules govern changes in the state of an
enterprise, they translate directly into updating rules for its databases". A
rule processor from Pinnacle Software Corporation is being evaluated to
determine if it can reasonably provide a central store for business rules. The
IA defines the relationship between entities. These relationships are the most
elementary business rules.
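To make concrete the idea that a relationship translates into an updating rule,
the Python sketch below turns a mandatory ("must occur") relationship into a
check applied before a record is stored. It does not represent the rule
processor under evaluation; the function names, the record layout, and the
example rule are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
# A rule inspects a record and returns an error message, or None if satisfied.
Rule = Callable[[Record], Optional[str]]


def must_reference(field_name: str, description: str) -> Rule:
    """Build an updating rule from a mandatory relationship."""
    def check(record: Record) -> Optional[str]:
        value = record.get(field_name)
        if value is None or value == "":
            return f"rule violated: {description}"
        return None
    return check


def validate(record: Record, rules: List[Rule]) -> List[str]:
    """Apply every rule to the record and collect any violations."""
    violations = []
    for rule in rules:
        message = rule(record)
        if message is not None:
            violations.append(message)
    return violations


# Hypothetical rule derived from a hypothetical relationship.
rules = [
    must_reference("department",
                   "each course section must be offered by a department"),
]

valid_section = {"section": "101", "department": "History"}
orphan_section = {"section": "102", "department": None}
```

Here `validate(valid_section, rules)` returns an empty list, while
`validate(orphan_section, rules)` reports the violated rule, so an update that
breaks the relationship could be rejected before it reaches the database.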
Make Code Tables Accessible - Identify and make available code tables
from the transactional systems through either hot links on a web site or within
an appropriate data warehouse.
Build a University Vocabulary for Data - It makes life much easier
when common computer- and data-related terms are defined and when those
definitions are more or less commonly agreed upon. Toward that end, the data
catalog has a glossary section composed of computing and data-related terms,
easily accessible on the Web so that they can be shared and critiqued as
necessary. Defining terms is a cooperative effort involving diverse populations
working together.
Complete the Information Architecture - It is our intention to fulfill
our charter, "to document the Information Architecture of The University of
Arizona".