Showing posts with label UNT. Show all posts

Tuesday, January 30, 2007

UNT Call Number

Well, it's not Rolling Stone but I got my picture on the cover of Call Number.

Wednesday, December 28, 2005

MARC Content Designation Utilization Project

The MARC Content Designation Utilization (MCDU) Project is now making available the first set of results from analyses of the MCDU dataset of more than 56 million MARC bibliographic records from OCLC's WorldCat database. Separate data reports containing basic frequency count analysis are posted on the project website; go there to view the reports.

Tuesday, October 04, 2005

MARC Field Usage

A few days ago I mentioned the UNT MARC study of field usage in MARC records. It seems a similar study was done in 1997 in Germany. They found 33 elements that appeared in more than 1% of the records. Only 5 fields were used in 100% of the records (245, 260, 300, 050, 008). That study covered over 4 million records from LC; the UNT study is using a much larger sample and breaking it down to the subfield level. Still, using the Walt Crawford study from 1988, the German study from 1997, and the UNT study together will give us a view over time. Someone strong in statistics could combine the studies to do a meta-analysis. The German study is in German, naturally, but since MARC uses numeric tags, even those of us not able to read German can grasp some of the results.
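
The two figures quoted above (elements in more than 1% of records, fields in 100% of records) are record-level presence measures. Here is a minimal sketch of how such a measure could be computed, assuming each record has already been reduced to a list of its field tags. The function name and the toy data are hypothetical and are not taken from any of the studies.

```python
from collections import Counter

def field_presence(records):
    """Return {tag: share of records that contain the tag at least once}."""
    seen = Counter()
    total = 0
    for tags in records:
        total += 1
        seen.update(set(tags))  # count each tag once per record, not per occurrence
    if total == 0:
        return {}
    return {tag: n / total for tag, n in seen.items()}

# Hypothetical toy data: each record is just the list of tags it contains.
records = [
    ["008", "245", "260", "300", "650", "650"],
    ["008", "245", "260", "300"],
    ["008", "245", "260", "300", "700"],
    ["008", "245", "260", "300", "650"],
]

presence = field_presence(records)
universal = sorted(t for t, share in presence.items() if share == 1.0)
above_quarter = sorted(t for t, share in presence.items() if share > 0.25)
```

Repeated fields (the two 650s in the first record) count once here, which is what separates "share of records" from "share of all field occurrences".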

Thursday, September 29, 2005

MARC Content Designation Utilization

Seymour Lubetzky asked "Is This Rule Necessary?" Now a study is being done asking "Is this field necessary?" The MARC Content Designation Utilization study is examining just what fields are being used. This could inform decisions on what fields to include in a Core Record and what fields to include in MODS. When we went from cards to computers we carried over the card into a new format. Now, as we are moving from MARC to FRBR and MARCXML, we should not continue to carry over unnecessary fields and practices. This study will provide an empirical basis for those and many other decisions.
The results from a recent analysis of 400,000 MARC records conducted as part of an IMLS National Leadership Grant to establish a Z39.50 interoperability testbed indicated less than 50% of nearly 2,000 MARC 21 fields/subfields occurred even once in the records, and that only 36 of the fields/subfields accounted for approximately 80% of all use. These preliminary results have sparked interest by catalogers, managers of cataloging operations, standards developers, people involved in machine generation of metadata, and others. We are proposing a research project that builds upon the initial analysis to carry out a systematic analysis of MARC content designation use in large random samples of format-specific MARC 21 bibliographic records.
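
The two statistics in that passage (the share of defined fields/subfields that occur at all, and the small set that accounts for about 80% of use) can be illustrated with a short sketch. It assumes occurrence counts per field/subfield are already available. The function names, the element labels, and the toy counts are hypothetical and do not reproduce the project's data or method.

```python
from collections import Counter

def coverage_set(counts, target=0.80):
    """Smallest set of most-frequent elements whose occurrences reach `target` of all use."""
    total = sum(counts.values())
    covered = 0
    chosen = []
    for element, n in counts.most_common():
        if total == 0 or covered / total >= target:
            break
        chosen.append(element)
        covered += n
    return chosen

def share_used(counts, defined):
    """Share of defined elements that occur at least once."""
    used = sum(1 for element in defined if counts[element] > 0)
    return used / len(defined)

# Hypothetical toy counts of field/subfield occurrences.
counts = Counter({"245$a": 50, "260$b": 35, "650$a": 10, "700$a": 5})
# Hypothetical list of defined elements, half of which never occur.
defined = ["245$a", "260$b", "650$a", "700$a", "765$z", "880$6", "886$b", "754$c"]

top = coverage_set(counts)
used = share_used(counts, defined)
```

In this toy data two elements cover 85% of all occurrences and half of the defined elements are never used, which is the same shape of result the passage describes at much larger scale.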

Thursday, July 21, 2005

MARC Content Designation Utilization

The MARC Content Designation Utilization (MCDU) project has posted two documents:
  • MCDU Project MARC Records Dataset: Decomposition Specification, Database Design, and Parser Software
    This document provides information about the MARC dataset, the specifications for decomposing the MARC record, the design of the database to hold the decomposed records, and the parsing software that was designed to decompose the records and load the data to the database.
  • Validation Procedures for MARC Record Parsing Software
    This document describes the procedures for testing the parsing scripts used to decompose the MARC records. A sample of the raw MARC records from the dataset and the resulting parsed records are subjected to the validation procedures detailed below to verify the integrity of the software and ascertain that the data from the MARC records are correctly represented prior to loading into the database.
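
As a rough illustration of what decomposing and validating a record can involve, here is a minimal sketch that splits one raw MARC (ISO 2709) record into rows suitable for a database table and then checks the result against a known sample. This is not the MCDU parser. The function names and the sample record are hypothetical, and the sketch assumes UTF-8 data, whereas real records may use MARC-8.

```python
FIELD_END = b"\x1e"      # field terminator
RECORD_END = b"\x1d"     # record terminator
SUBFIELD_MARK = b"\x1f"  # subfield delimiter

def decompose(record):
    """Split one raw record into (tag, indicators, subfield_code, value) rows."""
    base = int(record[12:17])        # leader/12-16: base address of data
    directory = record[24:base - 1]  # 12-byte entries, without the closing terminator
    rows = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        body = record[base + start:base + start + length - 1]  # drop field terminator
        if tag.startswith("00"):     # control fields: no indicators or subfields
            rows.append((tag, "", "", body.decode("utf-8")))
            continue
        indicators = body[0:2].decode("ascii")
        for chunk in body[2:].split(SUBFIELD_MARK)[1:]:
            if chunk:
                rows.append((tag, indicators, chr(chunk[0]), chunk[1:].decode("utf-8")))
    return rows

def build_record(fields):
    """Serialize (tag, body) pairs into a minimal record (hypothetical test helper)."""
    directory = b""
    data = b""
    for tag, body in fields:
        body = body + FIELD_END
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1
    total = base + len(data) + 1
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FIELD_END + data + RECORD_END

# Hypothetical sample record with one control field and one data field.
sample = build_record([
    ("001", b"ocm00000001"),
    ("245", b"10\x1faTitle :\x1fbsubtitle."),
])
rows = decompose(sample)
length_ok = len(sample) == int(sample[0:5])  # leader/00-04 states the record length
```

Comparing `rows` against a hand-checked expectation for a sample of raw records is one simple form the validation step could take before anything is loaded into a database.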

Wednesday, April 13, 2005

MARC Records Under the Microscope

The University of North Texas (UNT)-Texas Center for Digital Knowledge (TxCDK) announces a project investigating the coding of information in MARC records from the OCLC WorldCat database. The Institute of Museum and Library Services, an independent Federal grant-making agency dedicated to creating and sustaining a nation of learners by helping libraries and museums serve their communities, is funding the project with a National Leadership Grant of $233,115. TxCDK Fellows Dr. William E. Moen and Dr. Shawne D. Miksa, both from the UNT School of Library and Information Sciences (SLIS), are the Principal Investigators of this project, entitled MARC Content Designation Utilization: Inquiry and Analysis (MCDU Project). SLIS Ph.D. student Serhiy Polyakov and Master's students Amy Eklund and Gregory Snyder serve as Research Assistants.

During the course of the 2-year project, Drs. Moen and Miksa will investigate the extent of catalogers' use of MARC 21 from an empirical perspective and will provide the first publicly available data on its usage. In the Z-Interoperability project, funded in 2003 by another IMLS National Leadership Grant, Dr. Moen discovered strong indications that only 36 of the approximately 2000 MARC fields/subfields accounted for 80% of all utilization, and that fewer than 50% of the available fields/subfields occurred even once in the records. These preliminary findings have important implications for library catalogers, standards developers, and people involved in the machine generation of metadata.

The Online Computer Library Center (OCLC, www.oclc.org) initially agreed to supply 1 million records for this project. After recent discussions, however, OCLC has agreed to provide the project with all of its approximately 55 million bibliographic records. This new development will significantly increase the accuracy of the research results. The OCLC WorldCat database contains unique bibliographic records shared by more than 50,000 libraries in 84 countries and territories around the world. For this project, only those records that were created by OCLC member libraries and contain original cataloging will be examined. The MARC records will be placed into study samples based on format of the material and record date-of-creation. The format-specific samples will allow determination of content designation use among similar types of records. The date-of-creation samples will intersect with project activities to document how MARC content designation use by catalogers has changed over time.
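
Grouping records by format and date of creation could look something like the sketch below. It assumes that "format" is read from leader position 06 (type of record) and that the creation date is read from positions 00-05 of field 008 (date entered on file, yymmdd). The project may define its samples differently. The function names, the `pivot` cutoff for two-digit years, and the sample values are all hypothetical.

```python
from collections import defaultdict

def stratum(leader, field_008, pivot=50):
    """Return (record_type, decade) for one record.

    `pivot` is an assumed cutoff: two-digit years below it are read as 20yy,
    all others as 19yy.
    """
    record_type = leader[6]
    yy = int(field_008[0:2])
    year = 2000 + yy if yy < pivot else 1900 + yy
    return record_type, year - year % 10

def group_by_stratum(records):
    """Group (leader, 008) pairs by their stratum key."""
    groups = defaultdict(list)
    for leader, field_008 in records:
        groups[stratum(leader, field_008)].append((leader, field_008))
    return dict(groups)

# Hypothetical leader and 008 values; only the positions read above matter here.
records = [
    ("00000nam a2200000 a 4500", "850101s1985    xxu           000 0 eng d"),
    ("00000nam a2200000 a 4500", "870615s1987    xxu           000 0 eng d"),
    ("00000njm a2200000 a 4500", "040315s2004    xxu           000 0 eng d"),
]
groups = group_by_stratum(records)
sizes = {key: len(members) for key, members in groups.items()}
```

Here the two language-material records from the 1980s fall into one stratum and the sound recording from 2004 into another, so content designation use can be compared within and across strata.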

The project has three goals:

  1. to provide empirical evidence to document MARC 21 content designation use;
  2. to explore the evolution of MARC content designation for patterns of availability and adoption/use level; and
  3. to investigate a methodological approach to understand the factors contributing to current levels of MARC content designation use and relationships with the cataloging enterprise.

The results of the research will be disseminated to the LIS community through periodical publication of findings, including a methodology that could be applied to similar studies of utilization of MARC or other metadata schemas. The MCDU Project group will also work on designating a set of "core elements" based on occurrence in the samples and comparison with the PCC and FRBR initiatives' core record recommendations. A database application containing MARC 21 content designation specifications, which will allow for the analysis of trends and patterns, is currently under construction. This tool will be made available to the LIS community after the project's completion.

Dr. Miksa describes how the project's research strategies will examine MARC records as artifacts of the cataloging process. Data resulting from the project will greatly inform cataloging education and curricula, which are critical to the continued development and improvement of information retrieval systems in libraries worldwide.

Details of the MCDU Project can be found at a website created and maintained by SLIS Master's student Bryce Benton. Any additional inquiries regarding project activities can be directed to Bill Moen (wemoen@unt.edu) or Shawne Miksa (smiksa@unt.edu).