<?xml version="1.0" encoding="UTF-8"?>
<!-- DOCTYPE article SYSTEM "file:/Users/tingle/Desktop/ox/docbook/dtd/docbookx.dtd" -->
<!DOCTYPE article SYSTEM "file:/home/tingle/Oxygen3.1/docbook/dtd/docbookx.dtd">
<article>
    <articleinfo>
        <title>Managing Content Diversity with METS Profiles</title>
        
        <othercredit>
            <contrib>Prepared for METS Opening Day West Coast; Stanford University, April 7-8, 2004</contrib>
        </othercredit>
        <author>
            <firstname>Brian</firstname>
            <surname>Tingle</surname>
            <affiliation>
                <jobtitle>Content Management Designer</jobtitle>
                <orgdiv>California Digital Library</orgdiv>
                <address>
                    <otheraddr>University of California, Office of the President</otheraddr>
                    <street>415 20th Street, Floor 4</street>
                    <city>Oakland</city>
                    <state>CA</state>
                    <postcode>94612</postcode>
                </address>
            </affiliation>
        </author>
        <corpauthor>California Digital Library</corpauthor>
        <copyright>
            <year>2004</year>
            <holder>The Regents of the University of California</holder>
        </copyright>
        <abstract>
            <para> The California Digital Library maintains a growing collection of Metadata Encoding
                and Transmission Standard (<acronym>METS</acronym>) Documents. These METS Documents are
                representations of digital library objects that are made available through a variety of
                management and end-user services. Extant METS Documents are mostly the results of
                converting non-METS representations of diverse digital objects, of various types and
                formats, from various sources. METS Profiles provide a systematic way of identifying
                species of METS Documents and recording their important characteristics. </para>
        </abstract>
    </articleinfo>
    <section>
        <title>Profile source files</title>
        <para>For each METS Profile supported in production on ark.cdlib.org, there is a master
            XML source file describing the profile. The master file has the extension
            '.profile.xml'. Once any XML Includes are expanded and any resulting
            'xml:base' attributes are removed, the master validates against the METS_Profile schema. </para>
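        <para>A minimal sketch of the second clean-up step, assuming the XML Includes have
            already been expanded by an XInclude-aware parser: an XSLT 1.0 identity transform
            that copies the master file unchanged except for 'xml:base' attributes, which it
            drops. This only illustrates the idea; it is not necessarily the stylesheet used
            in production at CDL.</para>
        <screen><![CDATA[
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty rule: xml:base attributes added during inclusion
       are matched here and produce no output. -->
  <xsl:template match="@xml:base"/>

</xsl:stylesheet>]]></screen>
        <para>With a processor that can expand XML Includes before transforming (for example,
            xsltproc run with its --xinclude option), the output of this stylesheet is the
            document that would then be checked against the METS_Profile schema.</para>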
        <section>
            <title>Narrative Descriptions of Profiles</title>
            <para>The schema is expressive enough to record the detailed specifications that a
                METS instance must meet in order to conform to a profile. We hope two audiences
                will be especially interested in this content: people programming systems that create
                METS objects matching a profile, and people programming systems that process
                those objects. </para>
            <para>As of April 2004, all METS used in production access services at CDL have been
                created at CDL, usually through an iterative process. The METS Profiles that exist
                now are more like field notes on observations of object behaviour in the wild than
                formal specifications that someone could use to create new objects that meet a
                profile. Time and resources permitting, they will be modified to increase their formality and
                utility as specifications. </para>
        </section>
        <section>
            <title>Machine Readable Descriptions of Profiles</title>
            <para>ark.cdlib.org uses two different XSL Transforms on METS documents. One extracts a
                Dublin Core record from the object. The other is used to create the HTML index page
                for the object in the display service.</para>
            <para>The tool section of the METS_Profile schema provides a URI element. For a profile
                to get "picked up" in the system, the METS_Profile needs one tool element with a
                tool/description/p[@ID='toQDC'] and another with a tool/description/p[@ID='toHTML']. The
                tool/URI of these tools must reference, respectively, an XSLT stylesheet that extracts
                a Dublin Core record from the METS and an XSLT stylesheet that creates an HTML page from it.</para>
            <screen><![CDATA[
    <tool>
        <agency>California Digital Library</agency>
        <URI>http://ark.cdlib.org/xslt/extract-dc/kt3v19p5bk.dc.xslt</URI>
        <description>
            <p ID="toQDC">This XSLT is used to extract a
            Dublin Core record from the object</p>
        </description>
    </tool>
    <tool>
        <agency>California Digital Library</agency>
        <URI>http://ark.cdlib.org/xslt/mets-page/kt3v19p5bk.html.xslt</URI>
        <description>
            <p ID="toHTML">This XSLT is used to create an 
            HTML index page for the object.</p>
        </description>
    </tool>]]></screen>
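            <para>As an illustration of how a system might locate these two stylesheets, the
                following XSLT 1.0 sketch prints the tool URIs of a profile. It assumes the profile
                elements are in no namespace, as in the unprefixed paths used in this article; if the
                profile declares a namespace, the paths need a matching prefix. The output format
                (one 'name=URI' line per tool) is hypothetical.</para>
            <screen><![CDATA[
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- URI of the stylesheet that extracts a Dublin Core record -->
    <xsl:text>toQDC=</xsl:text>
    <xsl:value-of
        select="normalize-space(//tool[description/p/@ID='toQDC']/URI)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- URI of the stylesheet that builds the HTML index page -->
    <xsl:text>toHTML=</xsl:text>
    <xsl:value-of
        select="normalize-space(//tool[description/p/@ID='toHTML']/URI)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>]]></screen>
            <para>If either tool is missing, the corresponding line has an empty value, which is
                one simple way to detect a profile that is not yet ready to be picked up.</para>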
        </section>
        <section>
            <title>BASE Profiles and XML Includes</title>
            <para>Many profiles share requirements. Common elements can be kept in a BASE profile
                and included from it with XML Includes, as in this example:</para>
            <screen><![CDATA[
<structMap>
<xi:include 
href="./BASE-DynaXML2003.profile.xml
#xpointer(/METS_Profile/structural_requirements/structMap/requirement)"/>
</structMap>]]></screen></section>
        <section>
            <title>Profiles from outside CDL</title>
            <para>While all the profiles and profile URIs in the system as of April 2004 are assigned and maintained by CDL,
                this scheme will accommodate profiles from other sources. The published METS_Profile
                will be saved to the master source directory. If the required &lt;tool&gt;
                sections do not exist, the two XSLTs will be created and the sections added to the
                profile. The profile will then be ready for use in the system. </para>
        </section>
    </section>
    <section>
        <title>Files Generated from source files</title>
        <para>Right now, everything that has been assigned an ARK at ark.cdlib.org has a METS
            Document representation acting as its binding record. Since modern CDL profiles have
            been assigned ARKs, they also need to be transformed into METS files for their ARKs to
            work. To enable this, there is a METS Profile for METS Profiles. The METS Profile for
            METS Profiles includes an XSLT to generate an HTML representation of the METS Profile instance.</para>
        <para>Multiple systems at CDL are profile aware. The METS Profiles are the authoritative
            source of information for these applications. They either access the files directly, or
            batch programs process the directory of profiles into a configuration file. </para>
        <para>The official list of CDL profiles will be generated by batch processing METS Profiles. </para>
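        <para>A sketch of what one step of such a batch process could look like: an XSLT 1.0
            stylesheet that turns a single METS Profile into a DocBook glossary entry for a
            profile listing. The child element names title, URI and abstract, and the absence of
            a namespace on the profile elements, are assumptions about the profile instance; this
            is not the actual CDL batch program.</para>
        <screen><![CDATA[
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- One glossentry per profile: "title (URI)" plus the abstract. -->
  <xsl:template match="/METS_Profile">
    <glossentry>
      <glossterm>
        <xsl:value-of select="normalize-space(title)"/>
        <xsl:text> (</xsl:text>
        <xsl:value-of select="normalize-space(URI)"/>
        <xsl:text>)</xsl:text>
      </glossterm>
      <glossdef>
        <para>
          <xsl:value-of select="normalize-space(abstract)"/>
        </para>
      </glossdef>
    </glossentry>
  </xsl:template>

</xsl:stylesheet>]]></screen>
        <para>Running this over every '.profile.xml' file in the master source directory and
            concatenating the results would give the body of a glosslist.</para>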
    </section>
    <section>
        <title>Related URLs</title>
        <section>
            <title>Standards</title>
            <itemizedlist>
                <listitem>
                    <para>
                        <ulink url="http://www.loc.gov/standards/mets/">METS Official Web Site</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://ark.cdlib.org/mets/profile_schema_documentation/">HTML
                            version of METS Profile Schema</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink
                            url="http://www.loc.gov/standards/mets/profile_docs/METS.profile.requirements.rtf">
                            Rich Text Format version of METS Profile Documentation</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://www.cdlib.org/inside/diglib/ark/">Archival Resource Key
                        (ARK)</ulink> page on Inside CDL.</para>
                </listitem>
            </itemizedlist>
        </section>
        <section>
            <title>Human user web services "powered by" METS Profiles</title>
            <itemizedlist>
                <listitem>
                    <para>
                        <ulink url="http://www.oac.cdlib.org/">The Online Archive of California</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://texts.cdlib.org/ucpress/">University of California Press:
                            eScholarship Editions</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink
                        url="http://www.californiadigitallibrary.org/">www.californiadigitallibrary.org</ulink>
                        (portal to UC websites with public content in CDL's METS collection) </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://jarda.cdlib.org/">Japanese American Relocation Digital Archives</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://www.bampfa.berkeley.edu/moac/">MOAC</ulink>: California
                        museums working with libraries and archives to increase and enhance access
                        to cultural collections</para>
                </listitem>
                <listitem>
                    <para>
                        <ulink url="http://sunsite.berkeley.edu/CalHeritage/">The California
                            Heritage Collection</ulink>
                    </para>
                </listitem>
                <listitem>
                    <para>The <ulink url="http://calcultures.cdlib.org/">CalCultures Project</ulink>
                        will have content in production Summer 2004. This will be the first project
                        where CDL will be ingesting METS from a third party into production. All the
                        METS for this project will be generated by GenX at UC Berkeley. The exact
                        details are not resolved yet, but I imagine that we will have one profile
                        set up at CDL for all METS that are generated with GenX, unless we find out
                        we need more.</para>
                </listitem>
            </itemizedlist>
        </section>
        <section>
            <title>CDL Profiles</title>
            <para><ulink url="http://ark.cdlib.org/mets/profiles/">http://ark.cdlib.org/mets/profiles/</ulink>
                will be maintained with an up-to-date listing
                of profiles in production, with links to available documentation.</para>
            <para>
                <glosslist>
                   
                    <glossentry>
                        <glossterm> BEPRESS repository export (http://ark.cdlib.org/ark:/13030/kt200014dk)</glossterm>
                        <glossdef>
                            <para>This profile is under development. It is not active at this time. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> DC OAC image (OAC-LSTA-DC) (http://ark.cdlib.org/ark:/13030/kt4g5012g0)</glossterm>
                        <glossdef>
                            <para>Image objects created for LSTA from Dublin Core source. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> DC OAC text (OAC-ETEXT) (http://ark.cdlib.org/ark:/13030/kt7j49p867)</glossterm>
                        <glossdef>
                            <para>Profile for OAC texts with Dublin Core metadata. Structure is
                                optimized for dynaXML. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> DDI Table (http://ark.cdlib.org/ark:/13030/kt1g5010zb)</glossterm>
                        <glossdef>
                            <para>DDI object for Counting California</para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> EAD DAO* extracted object (http://ark.cdlib.org/ark:/13030/kt3q2nb7vz)</glossterm>
                        <glossdef>
                            <para>Encoded Archival Description provides a mechanism to define
                                objects in its dao and daogrp tags. This profile is for METS objects
                                created by a batch extraction process from EAD Finding Aids in the
                                Online Archive of California.</para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> EAD Finding Aid (http://ark.cdlib.org/ark:/13030/kt0t1nb6x7)</glossterm>
                        <glossdef>
                            <para>This profile is used internally by CDL during the ingest of EAD
                                encoded Finding Aids.</para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> MODS OAC image (http://ark.cdlib.org/ark:/13030/kt400011f8)</glossterm>
                        <glossdef>
                            <para>Images created for LSTA project from MARC source use this profile. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> MODS OAC text (http://ark.cdlib.org/ark:/13030/kt5k40135s)</glossterm>
                        <glossdef>
                            <para>Profile for OAC texts with MODS metadata. Structure is optimized
                                for dynaXML. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> MODS eSch text (oceans) (http://ark.cdlib.org/ark:/13030/kt5z09p6zn)</glossterm>
                        <glossdef>
                            <para>Same as the OAC MODS profile, but with different branding. We do
                                not really want to trigger branding with a profile, so this is just
                                temporary until we have better branding. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> Profile for Profiles (http://ark.cdlib.org/ark:/13030/kt8s20152f)</glossterm>
                        <glossdef>
                            <para>A Profile for METS documents describing profiles of METS documents</para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> UCPEE netlib book (http://ark.cdlib.org/ark:/13030/kt3v19p5bk)</glossterm>
                        <glossdef>
                            <para>Objects created by the UC Press eScholarship Editions project. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> pre MODS (crs reports) (http://ark.cdlib.org/ark:/13030/kt667nb8wm)</glossterm>
                        <glossdef>
                            <para>This is a placeholder profile to assign ARKs to these items. Now
                                that these have ARKs, SCP can create MARC records, and then we can
                                create a MODS-based profile for these. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> submission package profile (http://ark.cdlib.org/ark:/13030/kt4k40124g)</glossterm>
                        <glossdef>
                            <para>Placeholder for a profile in development. </para>
                        </glossdef>
                    </glossentry>
                    
                     <glossentry>
                        <glossterm> BASE Profile for DynaXML 2003 (BASE-DynaXML2003)</glossterm>
                        <glossdef>
                            <para>This is not a Full Profile. It is just a BASE to build on. It
                                documents how DynaXML 2003 (not XTF) requires METS. This will change
                                in 2004 with XTF. </para>
                        </glossdef>
                    </glossentry>
                    <glossentry>
                        <glossterm> BASE Profile for LSTA images (BASE-DynaXML2003)</glossterm>
                        <glossdef>
                            <para>This is not a Full Profile. It is just a BASE to build on. It
                                documents how simple/moderately complex image objects designed from
                                LSTA are structured. </para>
                        </glossdef>
                    </glossentry>
                    
                </glosslist>
            </para>
        </section>
    </section>
</article>

