XML
XML-related Issues Covered:
  MARC -- Metadata
  MARC -- Web-based MARC
  Applications for Other Technical Services
    OAI-PMH
    FRBR
    RSS
Pure Librarian Who are We? Metadata Specialist Hybrid Librarian Knowledge Facilitator WWW -- Hybrid Library -- Digital Library Access Catalog -- Holding Catalog Person-to-Person Reference Service -- Online Services (online help, tutorial, Electronic Cooperative Reference Service -- Question Point)
TEI SGML HTML XML http://public.ptl.edu.tw/publish/suyan/45/text_05.htm Digital Library
Digital Library : CD-ROM Digital Library -- : Surrogate( ) -- ( metadata ) : : Tiff, Gif, Pdf : txt, (mark-up language) :??
Digital Library -- : Html Pdf Java & JavaScript CGI PHP ASP,? ( IP )
MARC 1970 1980 MARC 1990 OPAC MARC 856
MARC Leader/06: Type of Record
  m (Computer File): computer software (including programs, games, fonts), numeric data, computer-oriented multimedia, online systems or services
MARC 006: Additional Material Characteristics
  00: Form of material (m -- computer file)
  09: Type of computer file (008/26)
MARC 008: Fixed-Length Data Elements
  26 (type of computer file):
    a (Numeric Data)
    b (Computer Program)
    e (Bibliographic Data)
    j (Online System or Service)
MARC 256: COMPUTER FILE CHARACTERISTICS
  $a Computer file characteristics
  Examples:
    256 $a Computer data (2 files : 876,000, 775,000 records).
    256 $a Computer programs (2 files : 4300, 1250 bytes).
    256 $a Data (1 file : 350 records).
MARC 516: TYPE OF COMPUTER FILE OR DATA NOTE
  $a Type of computer file or data note
  Examples:
    516 $a Computer programs
    516 $a Numeric (Summary statistics)
    516 $a Numeric (Spatial data: Point)
    516 $a Text (Law reports and digests)
MARC 538: SYSTEM DETAILS NOTE
  Technical information about an item, such as software programming language, computer requirements, peripheral requirements
  $a System details note
  Examples:
    538 $a Data in extended ASCII character set.
    538 $a Written in FORTRAN H with 1.5K source program statements.
    538 $a System requirements: IBM 360 and 370; 9K bytes of internal memory; OS SVS and OS MVS.
    538 $a Disk characteristics: Disk is single sided, double density, soft sectored.
    538 $a VHS
    538 $a Compact disc
MARC 856: ELECTRONIC LOCATION AND ACCESS
  $k Password
  $l User Id
  $u Uniform Resource Locator (NR)
  Example:
    856 $u http://www.flysheet.com.tw/$kquest$lguest
MARC Holdings Data
  5xx & 84x: Note Fields
  852 & 856: Location & Access Fields
  853-855: Caption & Pattern Fields
  863-865: Enumeration & Chronology Fields
  866-868: Textual Holdings Statement Fields
  876-878: Item Information Fields
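The 856 example above packs a URL, a password, and a user ID into one string. A minimal Python sketch of pulling those subfields apart, assuming that `$` followed by a single lowercase letter or digit always starts a new subfield (the function name `parse_subfields` is hypothetical, and a URL that itself contains `$` would need more careful handling):

```python
import re

def parse_subfields(field_text):
    """Split a MARC display string into (subfield code, value) pairs."""
    # Assumption: '$' plus one lowercase letter or digit marks a subfield.
    parts = re.split(r"\$([a-z0-9])", field_text)
    # With a capture group, re.split returns [prefix, code1, value1, code2, value2, ...].
    codes = parts[1::2]
    values = [value.strip() for value in parts[2::2]]
    return list(zip(codes, values))

subfields = parse_subfields("$u http://www.flysheet.com.tw/$kquest$lguest")
for code, value in subfields:
    print(code, value)
```

Here `$u` carries the URL, `$k` the password, and `$l` the user ID, matching the subfield list on the slide.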
MARC MARC 10,, 856 856 MARC MARC HTML ( ) MARC OPAC Z39.50 MARC MARC Alternatives to the Issues: MARC vs. Metadata (Semantic -- ) Web-based MARC (Structure -- )
DC                                        MARC21
Publisher                                 260 $b
Publisher place                           260 $a
Subject                                   6XX (650, 651, 600)
Language                                  546, 041, 008/35-37
Relation (Is Format Of, Has Format Of)    776
Contributor (CorporateName)               710 $a $b
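The DC and MARC21 pairs listed above amount to a crosswalk table. A small Python sketch of holding that table as data follows; the dotted key names such as `publisher.place` and the function `marc_targets` are illustrative assumptions, not part of either standard.

```python
# Crosswalk built only from the DC / MARC21 pairs listed on the slides.
# Key spellings are hypothetical labels chosen for this sketch.
DC_TO_MARC21 = {
    "publisher": ["260$b"],
    "publisher.place": ["260$a"],
    "subject": ["650", "651", "600"],
    "language": ["546", "041", "008/35-37"],
    "relation.isFormatOf": ["776"],
    "relation.hasFormatOf": ["776"],
    "contributor.corporateName": ["710$a", "710$b"],
}

def marc_targets(dc_element):
    """Return the MARC21 targets for a DC element, or an empty list if unmapped."""
    return DC_TO_MARC21.get(dc_element, [])

print(marc_targets("language"))
print(marc_targets("rights"))  # not in the table, so nothing is returned
```

Several DC elements map to more than one MARC21 location, which is why each value is a list rather than a single tag.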
DC Date Modified 500 DC MARC21 DC MARC21 DC DC MARC21 DC MARC
MARC21 DC MARC21 DC MARC21 MARC21 DC MARC21 MetaData DC MARC21 Metadata
Metadata Metadata Metadata RDF/XML Metadata RDF / XML Metadata XML MARC21 XML RDF / XML Metadata MARC DC
MARC vs. Dublin Core; MARC vs. XML
Semantic & structure perspectives: element vs. tag; qualifier vs. subfield.
Syntax perspectives:
  Dublin Core: W3C XML
  MARC: ISO 2709
Metadata: source metadata and target metadata
  Z39.50 (metadata: Bib-1 Attribute Set)
  OAI (metadata: Dublin Core)
  OpenURL (metadata: self-defined)
Alternatives to the Issues:
  MARC vs. Metadata (Semantic)
  Web-based MARC (Structure)
HTML Display -- MARC Display -- ISO-2709
A Bibliographic Record's HTML Display, MARC Display, ISO-2709
This is a test.txt This is a test.rtf This is a test.doc MARC Test.txt -- test.rtf -- test.doc Iso-2709 WWW (XML)
Plain Text_1 What is ISO-2709 Plain Text_2 Where What & How?
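To make "What is ISO-2709" concrete: an ISO 2709 record is a 24-character leader, a directory of 12-character entries (3-character tag, 4-digit field length, 5-digit starting position), and then the field data. A minimal Python sketch that builds one tiny record and reads it back follows. The helper names and the sample control number `demo0001` are made up for the demonstration, and the leader values follow common MARC21 practice rather than anything stated on the slides.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"

def build_record(fields):
    """Serialize [(tag, data_bytes), ...] into one ISO 2709 record (demo only)."""
    directory = b""
    body = b""
    for tag, data in fields:
        data = data + FIELD_TERMINATOR
        # Directory entry: tag (3) + field length (4) + start offset (5).
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(data), len(body))
        body += data
    base = 24 + len(directory) + 1   # leader + directory + directory terminator
    total = base + len(body) + 1     # plus the record terminator
    leader = b"%05dnam  22%05d   4500" % (total, base)
    return leader + directory + FIELD_TERMINATOR + body + RECORD_TERMINATOR

def parse_record(raw):
    """Return (leader, [(tag, data_bytes), ...]) from one ISO 2709 record."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])                 # base address of data
    directory = raw[24:base - 1]              # drop the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop field terminator
        fields.append((tag, data))
    return leader, fields

sample_fields = [
    ("001", b"demo0001"),
    ("245", b"10" + SUBFIELD_DELIMITER + b"aSister Carrie"),
]
raw = build_record(sample_fields)
leader, parsed_fields = parse_record(raw)
print(leader)
print(parsed_fields)
```

The point of the sketch is that ISO 2709 locates everything by byte position, which is what makes it compact for exchange and awkward to read or display directly on the Web.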
The WWW is independent of:
  Operating system
  Database structure
  File format
WWW markup language
  Content: (Web Page) (Web Resources)
  Presentation (Appearance)
  Representation (Display)
Markup Language:
  HTML: a single, predefined markup language
    A language for specifying the layout of web pages.
    Markup specifies how content appears:
      Level of headings
      Emphasis (bold or italic) on portions of the content
  XML
  SGML
What is XML
  XML is a formal recommendation from the World Wide Web Consortium (W3C).
  A metalanguage -- a language for describing other languages -- which lets you design your own customized markup languages for limitless different types of documents.
  XML (Extensible Markup Language) is a flexible way to create common information formats and share both the format and the data on the World Wide Web.
  XML is "extensible" because, unlike HTML, the markup symbols are unlimited and self-defining.
  XML describes the content in terms of what data is being described, logically & semantically.
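As a small illustration of "self-defining" markup, the Python sketch below invents three tags (`book`, `title`, `author`) that exist in no predefined vocabulary, serializes them, and reads them back. The tag names are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# The tag names below are made up for this example; XML does not predefine them.
book = ET.Element("book")
ET.SubElement(book, "title").text = "Sister Carrie"
ET.SubElement(book, "author").text = "Theodore Dreiser"

xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)

# Any XML parser can read the document back without knowing the vocabulary in advance.
parsed = ET.fromstring(xml_text)
print(parsed.find("title").text)
```

Unlike HTML's `<b>` or `<h1>`, these tags say what the data is rather than how it should look.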
(Semantic) MetaData (Structure) DTD; XML/Schema (Storage, Syntax) XML Overall
XML DTD -- XML -- XSL OUTPUT XML -- Database
ISO-2709 -- MARC -- XML XML OAI_MARC -- OAI_MARC DTD -- xml schema MARCXML -- MARCXML DTD -- xml schema MODS -- MODS Schema What is MODS ISO 2709 XML -- -- ISO 2709 XML ISO 2709 data ( metadata)
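Of the XML targets named above, MARCXML keeps the MARC structure nearly one to one: a `record` holds a `leader`, `controlfield` elements, and `datafield` elements with `subfield` children. A hedged Python sketch of producing such a record from already-parsed MARC data follows. The function `to_marcxml`, its argument layout, and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"

def q(name):
    """Qualify an element name with the MARCXML namespace."""
    return "{%s}%s" % (MARCXML_NS, name)

def to_marcxml(leader, controlfields, datafields):
    """Build a MARCXML record element.

    controlfields: [(tag, value), ...]
    datafields:    [(tag, ind1, ind2, [(code, value), ...]), ...]
    """
    record = ET.Element(q("record"))
    ET.SubElement(record, q("leader")).text = leader
    for tag, value in controlfields:
        ET.SubElement(record, q("controlfield"), {"tag": tag}).text = value
    for tag, ind1, ind2, subfields in datafields:
        field = ET.SubElement(
            record, q("datafield"), {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, value in subfields:
            ET.SubElement(field, q("subfield"), {"code": code}).text = value
    return record

ET.register_namespace("marc", MARCXML_NS)
record = to_marcxml(
    "00000nam  2200000   4500",               # placeholder leader for the demo
    [("001", "demo0001")],                    # made-up control number
    [("245", "1", "0", [("a", "Sister Carrie")])],
)
print(ET.tostring(record, encoding="unicode"))
```

The byte offsets of ISO 2709 disappear; tags, indicators, and subfield codes become attributes that any XML tool can query or restyle.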
What: XML Schema (DTD)?
  A schema describes what one or more XML documents can look like, and defines:
    The elements the document contains, and the order in which they appear
    The element content, and element attributes if any
Why:
  The purpose of a schema is to allow machine validation of document structure.
  Instead of using the syntax of XML 1.0 DTD declarations, schema definitions use XML element syntax. A correct XML schema definition is, therefore, a well-formed XML document.
DTD vs. Schema
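The claim that a schema definition is itself a well-formed XML document can be checked directly: an ordinary XML parser can read it. The Python sketch below parses a small, made-up schema for a `book` element and lists the element names it declares. It only demonstrates well-formedness; Python's standard library does not validate documents against a schema.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A made-up schema for illustration. The DTD equivalent of the content model
# would be:  <!ELEMENT book (title, author)>
schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Parsing succeeds because the schema is well-formed XML.
schema_root = ET.fromstring(schema_text)

# Walk the schema like any other XML document and collect declared element names.
declared = [el.get("name") for el in schema_root.iter("{%s}element" % XSD_NS)]
print(declared)
```

A DTD declaration such as `<!ELEMENT ...>` uses its own syntax and could not be processed this way.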
DTD vs. XML Schema Schema -- XML Document or XML Document -- Schema
So What? And Then?
Application of XML for the Technical Services Librarian
Cataloging:
  Bibliographic information resources for both physical and virtual collections
  Z39.50 and XML
  Z39.50 and OpenURL
  Controlled vocabularies (classification, thesaurus, and subject headings)
Selection & Acquisition:
  Publisher -- Aggregator -- Dealer -- Trader -- Acquisition Module -- Cataloging Module
  Library
Application of XML
  OAI: Reader Services; Technical Services
  RSS: Reader Services; Technical Services
  FRBR
Application of XML: OAI-PMH
  What is OAI-PMH?
  How does OAI-PMH work?
  Why do we need to know about OAI-PMH?
Definition
  Open
    Does not mean free or unlimited access
    Facilitates the availability of content from a variety of providers
  Archive
    A repository of scholarly papers
    A repository for stored information
Mission
  To facilitate the efficient dissemination of content
  To enhance access to e-print archives
  To open up access to a range of digital materials
Data Provider
  Maintains one or more repositories (Web servers) that support the OAI protocol as a means of exposing metadata about their content
Service Provider
  Issues OAI protocol requests to data providers and uses the metadata as a basis for building value-added services
[Diagram: Data Providers (Repositories) -- Service Provider -- User]
Repository
  A network-accessible server to which OAI protocol requests, embedded in HTTP, can be submitted
[Diagram: Data Provider (Repository) -- Service Provider -- User]
OAI Metadata Harvesting Protocol Verbs
  Identify: return administrative information about a repository
  ListMetadataFormats: return the list of metadata formats supported by the repository, or for a specific record in the repository
  ListSets: return a list of sets supported by the repository
  GetRecord: return one record, given an identifier and the desired format
  ListIdentifiers: return a list of record identifiers, optionally filtered by date or set
  ListRecords: return a list of records in a given metadata format, optionally filtered by date or set
Can't Retrieve/Filter by Subject or Keyword
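Because OAI protocol requests are embedded in HTTP, each verb above is just a query-string parameter on the repository's base URL. A minimal Python sketch of composing such requests follows. The base URL `http://repository.example.org/oai` is a placeholder and `oai_request_url` is a hypothetical helper; the argument names (`metadataPrefix`, `from`) are standard OAI-PMH arguments.

```python
from urllib.parse import urlencode

OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "GetRecord", "ListIdentifiers", "ListRecords",
}

def oai_request_url(base_url, verb, arguments=None):
    """Compose an OAI-PMH GET request URL for one of the six verbs."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: %s" % verb)
    params = {"verb": verb}
    # A dict is used because 'from' is a Python keyword and cannot be a keyword argument.
    params.update(arguments or {})
    return base_url + "?" + urlencode(params)

# Placeholder repository address, not a real service.
BASE = "http://repository.example.org/oai"

identify_url = oai_request_url(BASE, "Identify")
harvest_url = oai_request_url(
    BASE, "ListRecords", {"metadataPrefix": "oai_dc", "from": "2004-01-01"}
)
print(identify_url)
print(harvest_url)
```

Note that the only filters available are date and set, which matches the slide's point that harvesting cannot select by subject or keyword.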
OAI-- Metadata Harvesting
OAI -- Metadata XML & Stylesheet OAI is Used Differently!
Why do I, as a Technical Services Librarian, need to care about OAI? Application of XML FRBR What is FRBR How it works! Why?
FRBR Group 1: Family of Works and Content Relationships (B. Tillett, Dec. 2001)
  Equivalent (same work, same expression):
    Microform reproduction, Simultaneous publication, Copy, Exact reproduction, Reprint, Facsimile
  Derivative (same work, new expression):
    Translation, Editions, Revision, Variations or versions, Illustrated edition, Abridged edition, Expurgated edition, Arrangement, Slight modifications
  -- Cataloging rules cut-off point --
  Derivative (new work):
    Summary, Abstract, Digest, Free translation, Dramatization, Novelization, Screenplay, Libretto, Change of genre, Adaptations, Parody, Imitations, Same style or thematic content
  Descriptive (new work):
    Review, Casebook, Criticism, Annotated edition, Evaluation, Commentary
  Content relationships: Equivalent, Derivative, Descriptive
  Work-to-work relationships are inherited by hierarchically related Expressions, Manifestations, and Items.
FRBR Group 2
  Work is created by Person / Corporate Body (many)
  Expression is realized by Person / Corporate Body
  Manifestation is produced by Person / Corporate Body
  Item is owned by Person / Corporate Body
FRBR Group 3
  Work has as subject (many):
    Work, Expression, Manifestation, Item
    Person, Corporate Body
    Concept, Object, Event, Place
FRBR Entity Levels
  Work: The Novel; The Movie
  Expression: Orig. Text, Transl., Critical Edition; Orig. Version
  Manifestation: Paper, PDF, HTML
  Item: Copy 1 Autographed; Copy 2
FRBR Entity Levels Family of works Work: The Novel The Movie Expression: Orig. Text Transl. Critical Edition Orig. Version Manifestation: Paper PDF HTML FRBR--- Sister Carrie
Sister Carrie
1  Sister Carrie. Theodore Dreiser (1871-1945). Johnson Reprint Corp, 1969
2  Sister Carrie. Theodore Dreiser (1871-1945). Doubleday, 1997
3  Sister Carrie. Theodore Dreiser (1871-1945). University of Pennsylvania Press, 1981
4  Sister Carrie. Sheldon Norman Grebstein. Everett/Edwards, 1970
expression manifestation manifestation expression
Sister Carrie: MARC-to-FRBR Algorithm
1. Title and author: Work = author (100, 110, 111) + title (240, 243, 245)
2. Title only: when there is no 100, 110, or 111, match and display by 240, 243, 245
3. Expression-level match: Leader/06 (type of record) and 008/35-37 (language)
4. Manifestation level: 008/07-10 (publication date)
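The four steps above can be sketched as a set of grouping keys. This is an illustration under stated assumptions, not the actual OCLC algorithm: records are represented as plain dictionaries, strings are normalized by lowercasing and collapsing whitespace, every record is assumed to carry a title field, and the helper names and sample 008 values are made up.

```python
def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())

def first_present(record, tags):
    """Return the value of the first tag found in the record, else None."""
    for tag in tags:
        if tag in record:
            return record[tag]
    return None

def frbr_keys(record):
    """Return (work, expression, manifestation) keys for one MARC-like record."""
    author = first_present(record, ("100", "110", "111"))
    title = first_present(record, ("240", "243", "245"))  # assumed to exist
    if author is not None:
        work = (normalize(author), normalize(title))       # step 1
    else:
        work = (normalize(title),)                         # step 2: title only
    # Step 3: Leader/06 (type of record) and 008/35-37 (language).
    expression = work + (record["leader"][6], record["008"][35:38])
    # Step 4: 008/07-10 (publication date).
    manifestation = expression + (record["008"][7:11],)
    return work, expression, manifestation

def make_008(year, language):
    """Build a 40-character 008 with only the date and language filled in (demo)."""
    return "970101" + "s" + year + " " * 24 + language + " d"

doubleday = {"leader": "00000nam  2200000   4500", "008": make_008("1997", "eng"),
             "100": "Dreiser, Theodore", "245": "Sister Carrie"}
penn = {"leader": "00000nam  2200000   4500", "008": make_008("1981", "eng"),
        "100": "Dreiser,  Theodore", "245": "Sister  Carrie"}

keys_a = frbr_keys(doubleday)
keys_b = frbr_keys(penn)
print(keys_a[0] == keys_b[0])  # same work
print(keys_a[1] == keys_b[1])  # same expression (text, English)
print(keys_a[2] == keys_b[2])  # different manifestations (1997 vs. 1981)
```

Records that share a work key collapse into one work, and the longer keys split that group into expressions and then manifestations.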
Where is XML in FRBR? Iso-2709 -- XML (slimfrbr.xml) Clean.xml Match.xml Result.html Result.xml http://tw.news.yahoo.com/rss/
http://www.kclibrary.org/rss/ http://lbxml.ust.hk/na/na_display.pl
RSS (Really Simple Syndication): XML
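Since an RSS feed is plain XML, a library can both publish and consume one with ordinary XML tools. The Python sketch below reads item titles and links from a small RSS 2.0 document. The feed content and the `library.example.org` addresses are made up for the example.

```python
import xml.etree.ElementTree as ET

# A made-up feed; a real one would be fetched from a feed URL.
rss_text = """<rss version="2.0">
  <channel>
    <title>New Arrivals</title>
    <link>http://library.example.org/new</link>
    <description>Recently cataloged titles</description>
    <item>
      <title>Sister Carrie</title>
      <link>http://library.example.org/record/1</link>
    </item>
    <item>
      <title>An American Tragedy</title>
      <link>http://library.example.org/record/2</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(rss_text).find("channel")
feed_title = channel.findtext("title")
items = [(item.findtext("title"), item.findtext("link"))
         for item in channel.findall("item")]

print(feed_title)
for title, link in items:
    print(title, link)
```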
Latest Research from OCLC
  ERRoL
  OAI Registry
  Extensible Repository Resource Locators (ERRoLs) for OAI Identifiers
  "info" URIs
  Naming
  Addressing
  LC name authority file
  FAST
Finally! But What Do We Have? Why Do We Have These? How Do We Use These?
  XML is not HTML (Present vs. Represent)
  We make an HTML file, but we generate (systematically) an XML file
  The 3 S's of XML (Semantic, Structure, Syntax)
    Semantic: attributes for the objects; semantic (MODS) & structured (MARCXML) perspectives
    Structure: ComplexType, Element, Attribute, Type, Value (defined by a DTD or XML Schema)
    Syntax: elements reference each other (XML namespaces)
  XML is displayable but not readable -- Stylesheet
  XML & databases are highly related to each other
No more! That's It. Q & A