[XML4Lib] Database schema for MARCXML?

Clay Redding credding at princeton.edu
Wed Oct 11 15:25:10 EDT 2006


Hi Brian,

That's kinda what's going on, but they've built custom functions for use 
in SQL out of libxml and libxslt2, so that you can use XPath functions 
inside your queries.  I think this newly supported stuff even can do a 
XSLT inside the SQL and return a result, bypassing the need for any 
external transformation tool.

Here's an old example (and, of course, you could easily add in some 
XML::LibXSLT functionality here) of how I used this way back when in 2003:

#!/usr/bin/perl

use DBI;
use DBD::Pg;
use Time::HiRes qw(gettimeofday tv_interval); # for timing, included in 5.8

use vars qw($dbh);

#Location of data
my $data='/var/www/html/ead/stellabloch.xml';

#XML source, slurped in as string.
my $xml_content=&get_xml($data);

#Connection info
my $dbConnect="dbi:Pg:dbname=xml;host=localhost;port=5432";
my $dbUser="xxxx33333";
my $passwd="";

my $query = "UPDATE xml SET data=? where id=1;";

my $dbh ||= DBI->connect($dbConnect, $dbUser, $passwd)
                or die "Couldn't connect to database: " . DBI->errstr;

my $sth = $dbh->prepare($query)
                or die "Cannot prepare: " . $dbh->errstr();

my $results = $sth->execute($xml_content)
                or die "Cannot execute: " . $sth->errstr();

$sth->finish();

$query2 = "select xpath_string(data,'//ead/\@audience') ";
$query2 .= " FROM xml WHERE id = 1;";

my $t0 = [gettimeofday];
$sth2 = $dbh->prepare($query2);
$sth2->execute();
while (($xpath_string) = $sth2->fetchrow()){
       $xpath_value = $xpath_string;
}
$elapsed = tv_interval ($t0) * 1000;
print "The value of //ead/\@audience is \"$xpath_value\" \n";
print "query time was $elapsed ms. \n";

$sth2->finish();
$dbh->disconnect;

sub get_xml {
                # Pull in an XML document from the filesystem
                my ($xml_source) = (@_);
                local $/ = undef;
                open(IN,$xml_source);
                my $xml_content=<IN>;
                close(IN);
                return $xml_content;
}





Brian Sheppard wrote:

> My reading of this is that one uses an SQL query to retrieve a well- 
> formed chunk of xml, then processes that separately with whatever xsl  
> tools you prefer (e.g. perl's XML::LibXSLT).
> -Brian
>
> On Oct 11, 2006, at 1:05 PM, Andrew Nagy wrote:
>
>> Clay, Hope all is well across the river in Princeton.
>>
>> I was wondering if you have used the Postgres XPath functions?  I  
>> saw the site that you referenced below and it says that the xpath  
>> library is now apart of the core package.  But after searching  
>> Postgres's site for "XPath", i find nothing.  Do you know of any  
>> documentation about this anywhere?  I'd like to give it a shot, but  
>> I am curious as to what capabilities are available.
>>
>> Thanks!
>> Andrew
>>
>> Clay Redding wrote:
>>
>>> Hi Jacob,
>>>
>>> If you are required to use a relational database for this, what I  
>>> have to say might not help.  I've long since given up trying to  map 
>>> common XML metadata formats to RDBMS -- especially ones that  don't 
>>> translate very well (like Mike said).  These days I largely  use 
>>> native XML dbs (such as eXist or X-Hive) to do this sort of  XML 
>>> work, and the development time (even factoring in learning  XQuery) 
>>> seems shorter than working out a RDBMS schema.   Alternatively, I've 
>>> used PostgreSQL, which allows you to store the  XML as a whole file 
>>> inside a table.  Then you use a hybrid of SQL  and XPath/XSLT to 
>>> interact with the data:
>>>
>>> http://www.throwingbeans.org/postgresql_and_xml_updated.html
>>>
>>> Clay
>>>
>>> jkglenn at umich.edu wrote:
>>>
>>>> Hello,
>>>>
>>>> Hopefully the list is willing to entertain a naive question from  a 
>>>> MARC and XML newbie. I'm trying to figure out the best way to  
>>>> store MARCXML records in a relational database, and am wondering  
>>>> if there are any standard database schemas out there for doing  
>>>> this. In particular I would like to avoid having to write  database 
>>>> queries to replicate the functionality of CCL commands  that search 
>>>> a particular set of MARC fields. I've found some  limited info on 
>>>> the Library of Congress website, but nothing very  helpful. If 
>>>> someone could point me in the right direction I would  be most 
>>>> grateful.
>>>>
>>>> Thanks,
>>>> Jacob Glenn
>>>> _______________________________________________
>>>> XML4Lib mailing list
>>>> XML4Lib at webjunction.org
>>>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> XML4Lib mailing list
>>> XML4Lib at webjunction.org
>>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>
>>
>> _______________________________________________
>> XML4Lib mailing list
>> XML4Lib at webjunction.org
>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>
>>
>
>
>
> --------------------------------------------------
> Brian Sheppard
> University of Wisconsin Digital Collections Center
> bsheppard at library.wisc.edu    (608) 262-3349
>
>
>
>



More information about the XML4Lib mailing list