Ariel Glenn & David Millman
Columbia University
ABSTRACT. We describe the construction of access management "broker" software for web-based services in a university setting. The broker works with an existing institutional ID and directory infrastructure, permits delivery of complex remote services from providers outside of the home organization, and provides user attributes to remote service providers. We discuss ways in which the broker might be used to develop a cross-organizational access management system.
1. INTRODUCTION.
Cross-organizational access management for web-based resources has emerged as a topic of great interest among many information consuming institutions and information resource providers. These organizations wish, as precisely and as flexibly as possible, to enable access to particular networked resources to particular members of institutional consumer communities. Access should be simple for the user, should guarantee a large measure of privacy to the user, should not depend entirely on the user's location or network address but rather on the user's membership in appropriate communities, and should provide management and demographic information to institutional consumer administrators and to resource providers. Here, we will describe several architectural models for such cross-organizational access management services now under development at Columbia University.
A flexible and robust access management service is more than a technical architecture; it must address a number of other difficult issues, including policy and infrastructure considerations, deployment of technology in an uncertain market, and broad consensus and development of standards among key players. Clifford Lynch [Lyn98] has recently provided an excellent summary and discussion of the issues and the state of the art in cross-organizational authentication and access management. While we will touch on many of these issues as they pertain to our motivation and our work, we refer the interested reader to Lynch's comprehensive and thoughtful document for additional context.
We will discuss below the history of our institutional access management systems and our motivation for our current work; we will describe our proposed architecture models; we will discuss some outstanding issues out of the scope of our proposal; and we will indicate the directions we feel are appropriate next steps or areas requiring new research.
2. HISTORY AND MOTIVATION.
Two technical infrastructure components are minimally required for an institutional access management system: the ability of a user to obtain an identity on the network, known as authentication; and the ability to correlate a user's identity with rights and permissions to use various services, called authorization.
Often these two services are combined in simple ways which blur their distinction, such as the UNIX implementation of file permission policy through group membership, "uids" and "gids." More robust and scalable authentication and authorization services may instead arise independently and be supported by special purpose systems rather than as side-effects of particular operating systems or other technology.
Management systems for computer identities at Columbia University began in 1983 with the deployment of a simple database to consolidate and manage email accounts for a quickly growing population. Over the next several years the process became increasingly automated and complex, with data feeds of potential email users arriving from the personnel and student administrative systems, from affiliate institutions, and from the libraries. By 1990 we found our database to be the most authoritative directory of individuals on campus, so we initiated an online "phone book" lookup service. Soon after, we participated in the NYSERnet X.500 directory pilot. We also began an experiment with the Kerberos authentication software [Col92]. Kerberos does not require a directory entry, only a unique identifier for an individual and the person's password. In "pure" Kerberos environments, a "Kerberos login" enables secure network communication within a local jurisdiction, and can be logically decoupled from email accounts or particular timesharing systems. But we did not have such an environment and had no immediate practical use for our Kerberos service.
Meanwhile, Columbia had developed a terminal-based "Campus-Wide Information System," called ColumbiaNet, in 1988. At the time it was one of a number of such systems underway at several universities, offering anonymous online public access to such things as class, shuttle bus & gym schedules, campus events and, of course for us, the phone book directory. When the "Gopher" protocol emerged as a standard for exchange of public information over the Internet, ColumbiaNet embraced it and extended it. ColumbiaNet became our transition software to offer multiple library catalogs in a single interface; and we began plans to develop it into a gateway to remote licensed online resources, such as the RLG and OCLC catalogs (RLIN, FirstSearch) and full-text reference books from our university press.
For licensed services, totally anonymous access was no longer possible. We had the Kerberos infrastructure available to identify, with id and password, any of the now 60,000 individuals known to us from our many directory feeds--our extended community in some sense. But only a subset, the students and employees of Columbia, were covered by our licenses. We began to employ our directory service in order to screen individuals for access. In 1992 the ColumbiaNet application became the first "customer" of combined authentication and authorization services. Kerberos authenticates individuals, but for services only as deemed appropriate through the screening process. And ColumbiaNet, a terminal-based gateway, was able to invisibly "script" the login negotiation with remote service providers, thus acting as both an institutional filter for incoming users and as a trusted institutional "representative" to remote services.
Lately, our nearly total migration to the web has incorporated most of these same mechanisms, with end-to-end encryption on the network (SSL), with directory access updated to the current standard (LDAP), and with our institutional Kerberos identity infrastructure kept intact through a local modification to our web server (Figure 1). But the web architecture, while providing tremendous new capabilities to so many new users, has largely disabled our institutional ability to act as a trusted mediator, to offer up our pre-screened population to remote service providers.
The prevalent web architecture today relies on a more fundamental, and for us an older, method of access policy: by Internet address ("IP source address"). This method identifies the topological location of a user on the network. Many institutions have also deployed web proxy servers, which alleviate some of the access management difficulties of the basic IP source address method. The drawbacks and trade-offs of these methods have been discussed at length elsewhere, and most recently in the Lynch paper. But neither of these methods enables access management based on the characteristics of the user. They provide authentication and authorization in a single, imprecise step, based primarily on network location.
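The imprecision of the IP source address method can be seen in a minimal Python sketch. It is an illustration only: the address block and the function name are hypothetical, and the block used is a reserved documentation range rather than any real campus network.

```python
import ipaddress

# Hypothetical licensed address block; 192.0.2.0/24 is a documentation range.
LICENSED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def allowed_by_source_address(source_ip: str) -> bool:
    """Grant or deny access using only the requester's network location."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in LICENSED_NETWORKS)


# Anyone at an on-campus workstation is admitted, whoever they are,
# while a licensed user connecting from elsewhere is refused.
on_campus = allowed_by_source_address("192.0.2.17")
off_campus = allowed_by_source_address("198.51.100.5")
```

Nothing in the decision refers to who the user is, which is why this method cannot express a policy such as "students and employees only."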
As remote service providers create increasingly sophisticated services which are customized to individual users, they find they must implement independent user "registration" infrastructures: essentially building duplicate id, password, and user-profile systems for populations which are already part of well established and carefully maintained institutional id systems and directories at their "home" institutions. This is an unfortunate duplication of effort for the provider and an annoyance to the user (who must log in to each such service independently). Its problems are compounded by its underlying security model: still largely by IP source address or proxy web server.
3. ARCHITECTURE.
We have been investigating alternative architectures which can leverage the existing authentication and authorization databases at the local organization based on the following guidelines:
In the descriptions of architecture models below, we use a few terms in specialized ways:
3A. Initial Model.
As described above, our web service includes a locally customized server module. In this model both the user and the service provider are members of the same institution.
A single transaction in this scenario would proceed as follows:
This is an improvement over the plain vanilla web authentication provided by a typical web server; it allows the server to leverage the existing infrastructure of the institutional ID system and directory. It assumes that there is a secure communication channel (SSL) between the browser and the web server.
The access management module maintains a cache of user credentials and attributes so that it need not contact the authentication and directory systems for every user request. A user might request a document containing 20 restricted images; the institutional validation and directory systems will be contacted only once.
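The caching behaviour can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the module described here: the class name, the stub back end, the user "jdoe" and the choice to key the cache on a password digest are all hypothetical.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

Attributes = Dict[str, str]


class CredentialCache:
    """Remembers successful validations so back ends are contacted once."""

    def __init__(self, validate: Callable[[str, str], Optional[Attributes]]):
        self._validate = validate          # hypothetical back-end call
        self._entries: Dict[Tuple[str, str], Attributes] = {}
        self.backend_calls = 0             # counted only for illustration

    def lookup(self, user_id: str, password: str) -> Optional[Attributes]:
        # Key on a digest so the clear-text password is not kept in the cache.
        digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
        key = (user_id, digest)
        if key in self._entries:
            return self._entries[key]
        self.backend_calls += 1
        attributes = self._validate(user_id, password)
        if attributes is not None:         # failed validations are never cached
            self._entries[key] = attributes
        return attributes


def stub_backend(user_id: str, password: str) -> Optional[Attributes]:
    """Stand-in for the institutional authentication and directory systems."""
    if (user_id, password) == ("jdoe", "correct-password"):
        return {"affiliation": "student"}
    return None


cache = CredentialCache(stub_backend)
# A page with 20 restricted images produces 20 requests with the same credentials.
results = [cache.lookup("jdoe", "correct-password") for _ in range(20)]
```

After the twenty lookups, `cache.backend_calls` is 1: only the first request reaches the back end.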
Public facilities pose their traditional problems in this model. User credentials are kept in the browser until the browser is closed or until the user logs out (typically by requesting a URL which asks the user to explicitly erase the stored password). In this model user credentials do not expire, and there is no provision in the web protocol for the server to demand fresh credentials. Therefore, if the user does not remember to log out or exit the browser, the next user who sits down at the same workstation will inherit their credentials and capabilities.
This model does not scale well. If more than one such web server exists at the institution, each must be modified to communicate with the institutional authentication and directory systems. Any change to institutional authentication or directory systems requires a change to all modified servers. If there is more than one institutional authentication or directory system in use, the server has no means of choosing which one to use. And this method cannot be scaled to cross-organizational access management. Each web server providing restricted services would need to support the authentication and directory systems of all subscribers to its services; this would rapidly become unmanageable.
3B. Broker Model.
Almost immediately, we required more than a single directory system: the alumni offices and certain financial centers at our institution already maintained independent directory systems, wished to continue using them and wished to incorporate them into our access management process. We attempted a more scalable architecture by introducing a "broker" service to consolidate and generalize access management. This service includes a new Access Management Broker server and a new plug-in module for web servers.
In this model, requirements for granting service are encoded, in advance, in sets of "rules." The new Access Management Broker server uses a particular rule to decide which authentication and authorization components to use (perhaps several), and how to combine and interpret the information retrieved from them. The protocol between web server and broker server permits the web server to suggest a preferred rule.
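One way to picture rule-driven decisions is the Python sketch below. It is an assumption-laden illustration rather than the broker's actual rule format: the `Rule` fields, the way results are combined (any listed authenticator may validate the user; directory results are merged in order), the rule names and the stub back ends are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Authenticator = Callable[[str, str], bool]      # (id, password) -> valid?
Directory = Callable[[str], Dict[str, str]]     # id -> attributes


@dataclass
class Rule:
    authenticators: List[Authenticator]   # any one of these may validate the user
    directories: List[Directory]          # consulted in order, later ones override
    required: Dict[str, str]              # attribute values needed for access


class RuleBroker:
    def __init__(self, rules: Dict[str, Rule], default_rule: str):
        self._rules = rules
        self._default = default_rule

    def decide(self, user_id: str, password: str,
               preferred_rule: Optional[str] = None) -> bool:
        # The web server may suggest a rule; an absent or unknown name
        # falls back to the broker's default rule.
        name = preferred_rule or self._default
        rule = self._rules.get(name, self._rules[self._default])
        if not any(check(user_id, password) for check in rule.authenticators):
            return False
        attributes: Dict[str, str] = {}
        for directory in rule.directories:
            attributes.update(directory(user_id))
        return all(attributes.get(k) == v for k, v in rule.required.items())


# Stub back ends standing in for separate campus and alumni systems.
def campus_login(user_id: str, password: str) -> bool:
    return {"jdoe": "pw1"}.get(user_id) == password

def alumni_login(user_id: str, password: str) -> bool:
    return {"asmith": "pw2"}.get(user_id) == password

def campus_directory(user_id: str) -> Dict[str, str]:
    return {"jdoe": {"affiliation": "student"}}.get(user_id, {})

def alumni_directory(user_id: str) -> Dict[str, str]:
    return {"asmith": {"affiliation": "alumnus"}}.get(user_id, {})


broker = RuleBroker(
    rules={
        "licensed-resource": Rule([campus_login], [campus_directory],
                                  {"affiliation": "student"}),
        "alumni-services": Rule([campus_login, alumni_login],
                                [campus_directory, alumni_directory],
                                {"affiliation": "alumnus"}),
    },
    default_rule="licensed-resource",
)
```

Here an alumni account is refused under the default rule but accepted when the web server suggests "alumni-services", since that rule also consults the alumni systems.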
A single transaction would thus be:
This model has some nice properties. Maintenance is much easier: web servers need only to talk to the broker through a single protocol; changes made to institutional ID or directory systems are reflected by changes in the broker software and nowhere else. The broker maintains a cache of credentials and attributes, preserving the performance of the previous model. A web server may request that a cache entry be considered stale beyond some interval, or it might request live verification every time for increased security at the expense of performance.
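The trade-off between cached and live verification can be expressed as a single freshness test. The Python sketch below is illustrative only; the parameter names, and the convention that a limit of zero means "verify live every time," are assumptions rather than part of the protocol described here.

```python
from typing import Optional


def may_use_cached_entry(entry_age_seconds: float,
                         max_age_seconds: Optional[float]) -> bool:
    """Decide whether a cached validation satisfies the web server's request.

    max_age_seconds is the (hypothetical) freshness limit sent by the web
    server: None means no limit was requested, 0 means live verification
    on every request.
    """
    if max_age_seconds is None:
        return True
    return entry_age_seconds < max_age_seconds
```

A server protecting sensitive material would send a small limit and accept the extra back-end traffic; a server delivering many small objects per page would send a larger one.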
User privacy is also stronger in this model. The entire directory (names, demographics, all user attributes) is no longer known to the web server by default. A web server must explicitly request any user attributes it requires for business (e.g., a fax number if the service to be performed requires fax to the user, or a cost center identification). Broker server rules can therefore implement useful access management policies in a manner similar to that suggested by Arms [Arm98]. (It is conceivable that a rogue web server could make multiple broker requests and derive considerable user demographics by statistical methods, but this is detectable by auditing broker transactions and is in any case unlikely within an institution.)
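The release policy amounts to an intersection of what a server asks for and what it is allowed to see, with each request recorded for later audit. The Python sketch below assumes hypothetical attribute names, server name and audit format; it is not the broker's actual interface.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (server, user, attributes asked for); kept so unusual request patterns can be audited.
AuditEntry = Tuple[str, str, Tuple[str, ...]]
audit_log: List[AuditEntry] = []


def release_attributes(server: str, user_id: str,
                       record: Dict[str, str],
                       requested: Iterable[str],
                       permitted: Set[str]) -> Dict[str, str]:
    """Return only attributes that were both requested and permitted."""
    asked = tuple(requested)
    audit_log.append((server, user_id, asked))
    return {name: record[name]
            for name in asked
            if name in permitted and name in record}


record = {"name": "J. Doe", "fax": "555-0100",
          "cost_center": "CC-42", "birth_date": "1975-01-01"}
released = release_attributes("reprints.example.edu", "jdoe", record,
                              requested=["fax", "birth_date"],
                              permitted={"fax", "cost_center"})
```

In this example only the fax number is released: the birth date was requested but not permitted, and the cost center was permitted but not requested.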
3C. Cross-Organizational Access Management by Proxy--an Interim Approach?
The above methods cannot be extended out of an institution because the user name and password still move through the web server unencrypted. In a cross-organizational setting, that web server would be operated by a remote, or "third-party" service provider, i.e. a provider at another institution, commercial or otherwise. As a first attempt to solve this, we proposed to channel all requests for such third-party services through a proxy.
This interaction might work like this:
(Note step 0: the user must have previously authenticated with the proxy in order to use it. These credentials are present in every request the user sends to the proxy and are kept in the proxy's cache.)
Future requests from the user for the same service must be intercepted by the proxy so that the institutional credential can be passed along to the third-party service provider with the request.
This setup is relatively easy to roll out quickly, as it requires modifications only to the institutional proxy server. But it has the usual drawbacks associated with proxies: if the proxy does not handle all requests, the user is initially directed to the proxy through another resource at the user's institution. In this case, absolute URLs in documents returned from the third-party through the proxy must be rewritten to point back to the proxy; and relative URLs must be rewritten so that the proxy recognizes them as requests to forward. If the proxy handles all traffic it may become a bottleneck.
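The URL rewriting burden can be illustrated with a short Python sketch. It shows one possible scheme, not necessarily the one a given proxy uses: the proxy address is hypothetical, and relative links are handled here by resolving them against the source document before wrapping them.

```python
from urllib.parse import quote, unquote, urljoin

# Hypothetical forwarding address on the institutional proxy.
PROXY_PREFIX = "https://proxy.example.edu/forward?url="


def rewrite_link(link: str, document_url: str) -> str:
    """Point a link found in a third-party document back at the proxy."""
    # Relative links are first resolved against the document they came from,
    # so the proxy always receives a complete third-party URL.
    absolute = urljoin(document_url, link)
    return PROXY_PREFIX + quote(absolute, safe="")


def original_target(proxy_url: str) -> str:
    """Recover the third-party URL the proxy should forward to."""
    return unquote(proxy_url[len(PROXY_PREFIX):])


page = "http://provider.example.com/docs/index.html"
relative = rewrite_link("images/fig1.gif", page)
absolute = rewrite_link("http://provider.example.com/search?q=x", page)
```

Every link in every returned document must pass through such a function, which is part of why the proxy approach becomes fragile with complex third-party pages.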
There is no mechanism in this method for the third-party service provider to retrieve user attributes when required. We considered this a big disadvantage. And this method encourages service providers to allow access based on a fixed password, the institutional credential. While that seems to be the state of the art today, we prefer not to encourage its continued use.
Since many institutions are moving towards the use of proxies as gateways to third-party services, this model may be a natural first step for them. However, we continued our investigation in search of a model better suited to our needs.
3D. Cross-Organizational Access Management via Cryptographic Module.
This model is still in development but appears to have the best long-term promise. As of the Spring of 1998, both the Netscape and the Microsoft Internet Explorer browsers are able to incorporate cryptographic plug-in modules.
We are now developing a PKCS#11 module [RSA1] for the Netscape browser in Unix. Once this has been tested we will look into moving it to other platforms and browsers.
Before any services are requested, the user must activate the browser module. This is currently done in Netscape by choosing the module and selecting "login." The user is prompted for a "PIN" and responds with an id, a password and an identification of the local institutional access management broker. The browser module, acting over a secure channel, contacts the broker and presents the user's credentials (id and password) and the preferred processing rule for the broker. The broker validates the user, as described above, and then obtains or generates a temporary private key and a temporary digital certificate, which are returned to the module. The certificate contains the address of the broker and an opaque identifier of the user, again temporary, and known only to the broker.
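The essential property of the login step, that the credential handed to the browser names the broker but not the user, can be sketched in Python. This is a deliberately simplified stand-in: a real module would return an X.509 certificate and private key through the PKCS#11 interface, whereas the sketch below issues a plain record, and its class names, lifetime and addresses are hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TemporaryCredential:
    """Stand-in for the temporary certificate; carries no real identity."""
    broker_address: str
    opaque_id: str
    expires_at: float


class IssuingBroker:
    def __init__(self, address: str, passwords: Dict[str, str],
                 lifetime_seconds: float = 3600.0):
        self.address = address
        self._passwords = passwords            # stub for the institutional ID system
        self._lifetime = lifetime_seconds
        self._sessions: Dict[str, str] = {}    # opaque id -> real user id

    def login(self, user_id: str, password: str,
              now: Optional[float] = None) -> Optional[TemporaryCredential]:
        if self._passwords.get(user_id) != password:
            return None
        # The identifier is random, so only the broker can map it to a person.
        opaque_id = secrets.token_hex(16)
        self._sessions[opaque_id] = user_id
        issued = time.time() if now is None else now
        return TemporaryCredential(self.address, opaque_id,
                                   issued + self._lifetime)


issuer = IssuingBroker("broker.example.edu", {"jdoe": "pw1"})
credential = issuer.login("jdoe", "pw1", now=1000.0)
```

Because the mapping from opaque identifier to user lives only in the broker, a third party holding the credential learns the user's home institution and nothing more until it asks the broker.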
This new certificate is then available within the browser for subsequent use with third-party services. In current browsers, the user must then explicitly select it, indicating a desire to be "known" by this identity.
Again we have provided a server plug-in which handles the broker communication at remote, third-party providers. Third-party service providers use this server plug-in and use SSL (secure channel). The user's new certificate is sent to the third-party server during initialization of the SSL channel. The plug-in will use the certificate as the user's credentials rather than id and password.
Because the certificate contains the address of the broker, the third-party server can establish contact with that broker and send it the user's certificate, the class of service it plans to provide, and a list of any further user attributes it needs for business purposes. The broker can perform authentication as above and return the result, together with any requested (and permitted) attributes, to the third party.
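The exchange between third-party server and broker can be outlined in Python. The sketch is an assumption about the shape of the messages, not the protocol under development: the field names, the service class "document-delivery" and the stub broker are hypothetical, and network transport and certificate verification are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class PresentedCertificate:
    """Stand-in for the fields the plug-in reads from the user's certificate."""
    broker_address: str
    opaque_id: str


@dataclass(frozen=True)
class BrokerQuery:
    opaque_id: str
    service_class: str
    wanted_attributes: List[str]


def build_query(cert: PresentedCertificate, service_class: str,
                wanted: List[str]) -> Tuple[str, BrokerQuery]:
    """Third-party side: the certificate itself says which broker to ask."""
    return cert.broker_address, BrokerQuery(cert.opaque_id, service_class,
                                            list(wanted))


class StubBroker:
    def __init__(self, sessions: Dict[str, str],
                 directory: Dict[str, Dict[str, str]],
                 permitted: Dict[str, Set[str]]):
        self._sessions = sessions      # opaque id -> user id
        self._directory = directory    # user id -> attributes
        self._permitted = permitted    # service class -> releasable attributes

    def answer(self, query: BrokerQuery) -> Optional[Dict[str, str]]:
        user_id = self._sessions.get(query.opaque_id)
        if user_id is None:
            return None                # unknown or expired identifier
        allowed = self._permitted.get(query.service_class, set())
        record = self._directory.get(user_id, {})
        return {name: record[name] for name in query.wanted_attributes
                if name in allowed and name in record}


stub = StubBroker(sessions={"a1b2c3": "jdoe"},
                  directory={"jdoe": {"fax": "555-0100", "name": "J. Doe"}},
                  permitted={"document-delivery": {"fax"}})
cert = PresentedCertificate("broker.example.edu", "a1b2c3")
address, query = build_query(cert, "document-delivery", ["fax", "name"])
reply = stub.answer(query)
```

In this sketch a reply of `None` means the user could not be authenticated, while a dictionary (possibly empty) means the user is valid and carries whatever attributes the broker was willing to release.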
This model appears the most secure and most flexible of any. We continue to refine the protocol and requirements for third-parties.
4. Related Issues and Future Work.
Much work is being done in this area as other large research institutions face similar problems. Several web access management systems leverage pre-existing Kerberos ID infrastructures, either by using a callback to the user's workstation, invoked by the web server, as Stanford's WebAuth does [STANFORD1] or by using a browser plug-in, invoked by a document with a special MIME type, as in CMU's Minotaur [CMU1]. Others, such as North Carolina State's Web Realm Authentication Protocol ("WRAP") [NC1], use HTTP's basic authentication to retrieve a user id and password and then, by a combination of setting "cookies" and redirecting from one server to another, enable the user to authenticate with the user's institutional "Web Authorization Server" and then return to the original resource.
None of these approaches quite met our needs; they were designed to solve somewhat different problems. None was explicitly designed for general inter-organizational access management. WRAP permits a limited form of inter-organizational management, but without the ability to use arbitrary user attributes (it employs only the user's name and institutional affiliation). We wanted to enable any subset of a user community to access a service. Additionally, we were reluctant to rely on cookies as part of our access management mechanism: cookies were designed explicitly for user tracking and so may not be shared across Internet DNS domains, which limits their utility for our purposes, and their use in user tracking has raised legitimate privacy concerns [EPIC1].
Several areas remain to be explored as we continue work on this project.
Since all of the entities in our model authenticate using digital certificates, we need a public key infrastructure for managing those certificates and their associated keys. A first step might be the establishment of a trusted directory from which broker certificates could be retrieved and where revocation lists might be posted. Real-time modifications of entries in this directory must be supported while maintaining data integrity.
We need to think about ways to link disjoint ID and directory systems that operate within the same institution, for purposes of attribute retrieval. Attributes must be associated with the same user entity no matter which of several different disjoint ID systems may have been used to validate the user's credentials. Attributes must be marked by their source directory.
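One possible shape for such linking is a table that maps each (ID system, local id) pair to a single person, with every retrieved attribute tagged by its source. The Python sketch below is a thought experiment, not a design we have built; the system names, identifiers and directory contents are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Attribute(NamedTuple):
    name: str
    value: str
    source: str          # directory the value was read from


# Hypothetical link table: two disjoint ID systems know the same person.
ID_LINKS: Dict[Tuple[str, str], str] = {
    ("kerberos", "jdoe"): "person-001",
    ("alumni", "doe.j.1990"): "person-001",
}

# Hypothetical directories, each keyed by the linked person identifier.
DIRECTORIES: Dict[str, Dict[str, Dict[str, str]]] = {
    "campus-ldap": {"person-001": {"affiliation": "staff"}},
    "alumni-db": {"person-001": {"class_year": "1990"}},
}


def attributes_for(id_system: str, local_id: str) -> List[Attribute]:
    """Collect attributes for a user regardless of which ID system was used."""
    person = ID_LINKS.get((id_system, local_id))
    if person is None:
        return []
    found: List[Attribute] = []
    for source, entries in DIRECTORIES.items():
        for name, value in entries.get(person, {}).items():
            found.append(Attribute(name, value, source))
    return found
```

Whichever system validated the credentials, the same attribute set comes back, and the `source` field lets a rule weigh a value by the directory that supplied it.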
While attribute names and values may differ from one institution to another (and even within institutions), we should standardize the names of user attributes that might be requested from a broker. X.500 may be a useful starting point for this [X.500]. Translating standard attribute names into the varying names used within an institution is yet another task. We might also consider standardizing the format of service names that are reported to brokers, depending on the type of service provided.
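The translation task can be pictured as a per-institution mapping table. In the Python sketch below, the left-hand names are X.500-style attribute types chosen for illustration, and the right-hand local names are hypothetical; no standard set has been agreed.

```python
from typing import Dict, Iterable

# Standard names (left) map to one institution's local field names (right).
STANDARD_TO_LOCAL: Dict[str, str] = {
    "telephoneNumber": "campus_phone",
    "ou": "dept_code",
    "cn": "full_name",
}


def answer_in_standard_names(local_record: Dict[str, str],
                             requested: Iterable[str]) -> Dict[str, str]:
    """Answer a request phrased in standard names from a local record."""
    answer: Dict[str, str] = {}
    for standard in requested:
        local = STANDARD_TO_LOCAL.get(standard)
        if local is not None and local in local_record:
            answer[standard] = local_record[local]
    return answer


local_record = {"campus_phone": "555-0123", "dept_code": "LIB",
                "full_name": "J. Doe"}
answer = answer_in_standard_names(local_record, ["telephoneNumber", "ou", "mail"])
```

Names with no local equivalent, such as "mail" in this example, are simply absent from the answer, so a remote provider never sees the institution's internal field names.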
A user's attributes may be distributed across more than one institution, i.e. the user is a member of more than one community. In this case, one organization's broker may need to consult another broker, at another organization, to retrieve certain user attributes.
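Broker-to-broker consultation might look like the Python sketch below. It rests on strong simplifying assumptions that remain open questions: both brokers know the user under a shared identifier, peers are consulted in a fixed order, and trust between brokers is taken for granted. All names are hypothetical.

```python
from typing import Dict, List, Optional, Set


class FederatedBroker:
    def __init__(self, name: str, attributes: Dict[str, Dict[str, str]]):
        self.name = name
        self._attributes = attributes            # user id -> local attributes
        self.peers: List["FederatedBroker"] = []

    def lookup(self, user_id: str, attribute: str,
               visited: Optional[Set[str]] = None) -> Optional[str]:
        """Answer locally if possible, otherwise ask peer brokers."""
        visited = set() if visited is None else visited
        if self.name in visited:                 # guard against referral loops
            return None
        visited.add(self.name)
        value = self._attributes.get(user_id, {}).get(attribute)
        if value is not None:
            return value
        for peer in self.peers:
            value = peer.lookup(user_id, attribute, visited)
            if value is not None:
                return value
        return None


university = FederatedBroker("university", {"jdoe": {"affiliation": "faculty"}})
society = FederatedBroker("society", {"jdoe": {"membership": "fellow"}})
university.peers.append(society)
society.peers.append(university)                 # mutual referral is allowed

membership = university.lookup("jdoe", "membership")
```

The `visited` set matters once brokers refer to each other: without it, a request for an attribute that neither broker holds would circulate indefinitely.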
Generally, we see a need for a better understanding of user identity and communities in the networked environment. Membership in networked communities is increasingly important and many areas of research, including legal and public policy, play as critical a role as the technology work we have described here.
References.
[ALA94] OITP, American Library Association. Principles for the Development of the National Information Infrastructure. June 29, 1994. http://www.ala.org/oitp/principles.html
[Arm98] W. Arms. Implementing Policies for Access Management. D-Lib Magazine. February, 1998. http://www.dlib.org/dlib/february98/arms/02arms.html
[CMU1] Computing Services, Carnegie Mellon University. Project Minotaur. http://andrew2.andrew.cmu.edu/minotaur/
[Col92] Columbia University - Presbyterian Hospital Committee on Data Security. Network User Authorization and Authorization Architecture. June 16, 1992. ftp://ftp.columbia.edu/cpsec/nua.ps
[Col97] Student Services Office, Columbia University. "Appendix A: Policy on Access to Student Records under the Federal Family Educational Rights and Privacy Act (FERPA) of 1974." Facets -- Facts About Columbia Essential to Students. 1997. http://www.columbia.edu/cu/facets/50.html
[EPIC1] Electronic Privacy Information Center. The Cookies Page. http://www.epic.org/privacy/internet/cookies/
[Lyn98] C. Lynch, ed. A White Paper on Authentication and Access Management Issues in Cross-organizational Use of Networked Information Resources. Coalition for Networked Information. Revised Discussion Draft of April 14, 1998. http://www.cni.org/projects/authentication/authentication-wp.html
[NC1] Web Services, NC State University. Welcome 2 WRAP. http://www.ncsu.edu/wrap/
[RSA1] RSA Laboratories. PKCS#11: Cryptographic Token Interface Standard. April 15, 1997. http://www.rsa.com/rsalabs/pubs/PKCS/html/pkcs-11.html
[STANFORD1] Stanford University Web Authentication.
[X.500] ISO/IEC 9594-1/ITU-T Recommendation X.500. "Information Technology - Open Systems Interconnection -- The Directory: Overview of Concepts, Models and Services". 1997 edition.