[XML4Lib] Getting MARCXML into a relational database

Bigwood, David bigwood at lpi.usra.edu
Fri Jun 12 12:34:19 EDT 2009


Dao,

Not sure what kind of analysis you want to do, but MARC RTP
http://xrl.in/2ggq gives some basic information about the fields and
subfields used and the length of each. It gives a decent overview of the
database. I load our database into it every year or two to see if there
are any fields or subfields used that are not valid. For example, any
852u subfields most likely should be 856u.

It uses MARC records in the transmission format, not MARCXML. Terry
Reese's nifty MarcEdit can make that conversion.
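
For a similar tally taken directly from MARCXML, without converting to
transmission format first, a short Python script may be enough. The sketch
below is not how MARC RTP works; it is only an illustration that assumes the
records use the standard MARCXML namespace. The function name subfield_usage
and the sample record are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_URI = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_URI}

# Made-up sample data, for illustration only.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="852" ind1=" " ind2=" ">
      <subfield code="u">http://example.org/a</subfield>
    </datafield>
    <datafield tag="856" ind1="4" ind2="0">
      <subfield code="u">http://example.org/b</subfield>
    </datafield>
  </record>
</collection>"""


def subfield_usage(xml_text):
    """Count each (tag, subfield code) pair and track its longest value."""
    counts = Counter()
    max_len = {}
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so this works for a <collection>
    # of records as well as for a single <record>.
    for datafield in root.iter("{%s}datafield" % MARC_URI):
        tag = datafield.get("tag")
        for subfield in datafield.findall("marc:subfield", NS):
            key = (tag, subfield.get("code"))
            counts[key] += 1
            max_len[key] = max(max_len.get(key, 0), len(subfield.text or ""))
    return counts, max_len


if __name__ == "__main__":
    counts, max_len = subfield_usage(SAMPLE)
    for (tag, code), n in sorted(counts.items()):
        print(tag, code, n, max_len[(tag, code)])
```

Scanning the printed list for unexpected pairs, such as 852 u, is then a
manual step, much like reading the report from the tool.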

Sincerely,
David Bigwood
dbigwood at gmail.com
http://catalogablog.blogspot.com
Twitter: LPI_Library


-----Original Message-----
From: xml4lib-bounces at webjunction.org
[mailto:xml4lib-bounces at webjunction.org] On Behalf Of Gong, Dao Rong
Sent: Friday, June 12, 2009 10:50 AM
To: Eric Lease Morgan; xml4lib
Subject: RE: [XML4Lib] Getting MARCXML into a relational database


Thanks Eric and all for the responses.

We will retire an old record keeping system (a homegrown system called
MicroMarc). The data format is reasonably close to MARC. We want to
export it into a database and analyze the data kept there. Since I don't
know any quick way to get MARC into a database, I output the records as XML. I
started with MS Access but, as Dave said, they often end up in a text
column. I'm now trying to write an XSLT to transform them into a more
Access-friendly format but wondered if there are better ways to handle this,
or some tools if possible. I haven't heard about the PostgreSQL XML data
type; maybe it is the way to go?

Dao
Libraries, Michigan State University 


-----Original Message-----
From: xml4lib-bounces at webjunction.org on behalf of Eric Lease Morgan
Sent: Thu 6/11/2009 8:11 PM
To: xml4lib
Subject: Re: [XML4Lib] Getting MARCXML into a relational database
 

On Jun 11, 2009, at 4:56 PM, Gong, Dao Rong wrote:

> Has anyone had successful experience importing a MARCXML file into a
> relational database?

The short answer to your question is, "Yes, many of us have experience  
doing this sort of work."

The long answer is, "What is the problem you are trying to solve?"  
Putting a single MARCXML file containing a single MARCXML record into  
a single text field of a (relational or flat file) database is  
straightforward. If the single MARCXML file contains a collection of
many MARCXML records, then the text field might need to be rather  
large -- megabytes and megabytes in size.

If you want to parse each MARCXML record into distinct fields (title,  
author, subject, etc.), then you will probably want to apply some sort  
of XSL processing against the file. If you wanted a challenge, then
you could convert the MARCXML into "real" MARC records and parse them
that way. Pulling the data out in a cursory way is easy. Pulling it
out in a more finely grained way is more difficult because there are
literally thousands of options. Then of course, to what degree do you
want to apply relational database techniques to your problem? Join
tables are fun and productive, and the design of databases like these
is not difficult, but inserting data into them requires the creation
of keys and the use of SQL to do inserts. More programming.
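
As a rough illustration of the keys-and-inserts approach, here is a minimal
Python sketch that splits MARCXML into three joined tables in SQLite. The
table layout (record, field, subfield), the function name load_marcxml, and
the sample record are all assumptions made for the example, not a recommended
schema; Access or PostgreSQL would need their own drivers and SQL dialects.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_URI = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_URI}

# Made-up sample data, for illustration only.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">rec-1</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example title /</subfield>
      <subfield code="c">An Author.</subfield>
    </datafield>
    <datafield tag="650" ind1=" " ind2="0">
      <subfield code="a">Cataloging.</subfield>
    </datafield>
  </record>
</collection>"""

# Hypothetical three-table layout: one row per record, field, and subfield.
SCHEMA = """
CREATE TABLE IF NOT EXISTS record (
    id INTEGER PRIMARY KEY, control_number TEXT);
CREATE TABLE IF NOT EXISTS field (
    id INTEGER PRIMARY KEY, record_id INTEGER REFERENCES record(id),
    tag TEXT, ind1 TEXT, ind2 TEXT);
CREATE TABLE IF NOT EXISTS subfield (
    id INTEGER PRIMARY KEY, field_id INTEGER REFERENCES field(id),
    code TEXT, value TEXT);
"""


def load_marcxml(xml_text, conn):
    """Insert every record, data field, and subfield into the tables."""
    conn.executescript(SCHEMA)
    root = ET.fromstring(xml_text)
    for rec in root.iter("{%s}record" % MARC_URI):
        ctrl = rec.find("marc:controlfield[@tag='001']", NS)
        cur = conn.execute(
            "INSERT INTO record (control_number) VALUES (?)",
            (ctrl.text if ctrl is not None else None,))
        record_id = cur.lastrowid  # key generated by SQLite
        for df in rec.findall("marc:datafield", NS):
            cur = conn.execute(
                "INSERT INTO field (record_id, tag, ind1, ind2) "
                "VALUES (?, ?, ?, ?)",
                (record_id, df.get("tag"), df.get("ind1"), df.get("ind2")))
            field_id = cur.lastrowid
            for sf in df.findall("marc:subfield", NS):
                conn.execute(
                    "INSERT INTO subfield (field_id, code, value) "
                    "VALUES (?, ?, ?)",
                    (field_id, sf.get("code"), sf.text))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_marcxml(SAMPLE, conn)
    query = ("SELECT r.control_number, s.value FROM subfield s "
             "JOIN field f ON s.field_id = f.id "
             "JOIN record r ON f.record_id = r.id "
             "WHERE f.tag = '245' AND s.code = 'a'")
    for row in conn.execute(query):
        print(row)
```

A generic layout like this keeps every tag and subfield without deciding up
front which ones matter; the cost is that questions such as "all titles"
become joins rather than a simple column lookup.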

Again, what is the real problem you are trying to solve?

-- 
Eric Lease Morgan
University of Notre Dame



_______________________________________________
XML4Lib mailing list
XML4Lib at webjunction.org
http://lists.webjunction.org/mailman/listinfo/xml4lib



