[XML4Lib] Database schema for MARCXML?
Clay Redding
credding at princeton.edu
Wed Oct 11 15:25:10 EDT 2006
Hi Brian,
That's kinda what's going on, but they've built custom functions for use
in SQL out of libxml and libxslt2, so that you can use XPath functions
inside your queries. I think this newly supported stuff even can do a
XSLT inside the SQL and return a result, bypassing the need for any
external transformation tool.
Here's an old example (and, of course, you could easily add in some
XML::LibXSLT functionality here) of how I used this way back when in 2003:
#!/usr/bin/perl
use DBI;
use DBD::Pg;
use Time::HiRes qw(gettimeofday tv_interval); # for timing, included in 5.8
use vars qw($dbh);
#Location of data
my $data='/var/www/html/ead/stellabloch.xml';
#XML source, slurped in as string.
my $xml_content=&get_xml($data);
#Connection info
my $dbConnect="dbi:Pg:dbname=xml;host=localhost;port=5432";
my $dbUser="xxxx33333";
my $passwd="";
my $query = "UPDATE xml SET data=? where id=1;";
my $dbh ||= DBI->connect($dbConnect, $dbUser, $passwd)
or die "Couldn't connect to database: " . DBI->errstr;
my $sth = $dbh->prepare($query)
or die "Cannot prepare: " . $dbh->errstr();
my $results = $sth->execute($xml_content)
or die "Cannot execute: " . $sth->errstr();
$sth->finish();
$query2 = "select xpath_string(data,'//ead/\@audience') ";
$query2 .= " FROM xml WHERE id = 1;";
my $t0 = [gettimeofday];
$sth2 = $dbh->prepare($query2);
$sth2->execute();
while (($xpath_string) = $sth2->fetchrow()){
$xpath_value = $xpath_string;
}
$elapsed = tv_interval ($t0) * 1000;
print "The value of //ead/\@audience is \"$xpath_value\" \n";
print "query time was $elapsed ms. \n";
$sth2->finish();
$dbh->disconnect;
sub get_xml {
# Pull in an XML document from the filesystem
my ($xml_source) = (@_);
local $/ = undef;
open(IN,$xml_source);
my $xml_content=<IN>;
close(IN);
return $xml_content;
}
Brian Sheppard wrote:
> My reading of this is that one uses an SQL query to retrieve a well-
> formed chunk of xml, then processes that separately with whatever xsl
> tools you prefer (e.g. perl's XML::LibXSLT).
> -Brian
>
> On Oct 11, 2006, at 1:05 PM, Andrew Nagy wrote:
>
>> Clay, Hope all is well across the river in Princeton.
>>
>> I was wondering if you have used the Postgres XPath functions? I
>> saw the site that you referenced below and it says that the xpath
>> library is now apart of the core package. But after searching
>> Postgres's site for "XPath", i find nothing. Do you know of any
>> documentation about this anywhere? I'd like to give it a shot, but
>> I am curious as to what capabilities are available.
>>
>> Thanks!
>> Andrew
>>
>> Clay Redding wrote:
>>
>>> Hi Jacob,
>>>
>>> If you are required to use a relational database for this, what I
>>> have to say might not help. I've long since given up trying to map
>>> common XML metadata formats to RDBMS -- especially ones that don't
>>> translate very well (like Mike said). These days I largely use
>>> native XML dbs (such as eXist or X-Hive) to do this sort of XML
>>> work, and the development time (even factoring in learning XQuery)
>>> seems shorter than working out a RDBMS schema. Alternatively, I've
>>> used PostgreSQL, which allows you to store the XML as a whole file
>>> inside a table. Then you use a hybrid of SQL and XPath/XSLT to
>>> interact with the data:
>>>
>>> http://www.throwingbeans.org/postgresql_and_xml_updated.html
>>>
>>> Clay
>>>
>>> jkglenn at umich.edu wrote:
>>>
>>>> Hello,
>>>>
>>>> Hopefully the list is willing to entertain a naive question from a
>>>> MARC and XML newbie. I'm trying to figure out the best way to
>>>> store MARCXML records in a relational database, and am wondering
>>>> if there are any standard database schemas out there for doing
>>>> this. In particular I would like to avoid having to write database
>>>> queries to replicate the functionality of CCL commands that search
>>>> a particular set of MARC fields. I've found some limited info on
>>>> the Library of Congress website, but nothing very helpful. If
>>>> someone could point me in the right direction I would be most
>>>> grateful.
>>>>
>>>> Thanks,
>>>> Jacob Glenn
>>>> _______________________________________________
>>>> XML4Lib mailing list
>>>> XML4Lib at webjunction.org
>>>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> XML4Lib mailing list
>>> XML4Lib at webjunction.org
>>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>
>>
>> _______________________________________________
>> XML4Lib mailing list
>> XML4Lib at webjunction.org
>> http://lists.webjunction.org/mailman/listinfo/xml4lib
>>
>>
>
>
>
> --------------------------------------------------
> Brian Sheppard
> University of Wisconsin Digital Collections Center
> bsheppard at library.wisc.edu (608) 262-3349
>
>
>
>
More information about the XML4Lib
mailing list