[XML4Lib] Database schema for MARCXML?
Kyle Banerjee
kyle.banerjee at gmail.com
Wed Aug 2 13:31:38 EDT 2006
The best structure depends on what you want to do with the data and
how much you have.
For example, if you want to be able to search individual subfields of
a huge bibliographic set, you'll want to index them separately.
However, if you only need to be able to search and display a few
fields, it might be much easier to just index those individually and
store the entire MARCXML record in one field. If you're interested in
what the fields mean as opposed to what they are (for example, do you
want to search for "authors" or do you really want people to specify
all the 1xx and 7xx fields), you'll want to assign tag types to
simplify searching.
I'd be inclined to retain the whole record and design for the need at
hand. If needs change, you can always add more tables and indexes.
kyle
As Mike hints, the joins get expensive if you
On 8/2/06, Mike Rylander <mrylander at gmail.com> wrote:
> On 8/2/06, jkglenn at umich.edu <jkglenn at umich.edu> wrote:
> > Hello,
> >
> > Hopefully the list is willing to entertain a naive question from a MARC
> > and XML newbie. I'm trying to figure out the best way to store MARCXML
> > records in a relational database, and am wondering if there are any
> > standard database schemas out there for doing this. In particular I
> > would like to avoid having to write database queries to replicate the
> > functionality of CCL commands that search a particular set of MARC
> > fields. I've found some limited info on the Library of Congress
> > website, but nothing very helpful. If someone could point me in the
> > right direction I would be most grateful.
>
> Unfortunately, MARC doesn't translate to a relational model very well
> ... not in a normalized, scalable way, at least.
>
> There are basically 2 routes you can take:
>
> Flat, denormalized table of field-subfield-data, something like this
>
> CREATE TABLE marcdata (
> recid int,
> tag char(3),
> ind1 char(1),
> ind2 char(1),
> subfield char(1),
> data varchar(1024)
> );
>
> Or you can go to a normalized, multi-table setup, along these lines
>
> CREATE TABLE marcdoc (
> id int,
> source text
> );
>
> CREATE TABLE marctag (
> id int,
> record int,
> tag char(3)
> );
>
> CREATE TABLE controlfield (
> tag int,
> data text,
> ),
>
> CREATE TABLE datafield_indicators (
> tag int,
> ind1 char(1),
> ind2 char(2)
> );
>
> CREATE TABLE datafield_subfield
> tag int
> subfield char(1),
> value text
> );
>
>
> Hope that helps! :)
>
> >
> > Thanks,
> > Jacob Glenn
> > _______________________________________________
> > XML4Lib mailing list
> > XML4Lib at webjunction.org
> > http://lists.webjunction.org/mailman/listinfo/xml4lib
> >
>
>
> --
> Mike Rylander
> mrylander at gmail.com
> GPLS -- PINES Development
> Database Developer
> http://open-ils.org
> _______________________________________________
> XML4Lib mailing list
> XML4Lib at webjunction.org
> http://lists.webjunction.org/mailman/listinfo/xml4lib
>
More information about the XML4Lib
mailing list