[XML4Lib] Question: Getting XML records of web resources

Jon Gorman jonathan.gorman at gmail.com
Wed Oct 1 09:35:06 EDT 2008


> We are looking at options in bulk ingesting resources and this
> requires XML of metadata of resources. I am looking for possible
> sources of XML records that we are dealing with. LOC is one source,
> but it limits the result to 10,000 and not all web-resources are
> available therein. If XML records are not readily available, can there
> be a method to convert the records into XML format.

I think you may have to clarify your question a little.  What
schema/format of XML?  Are you trying to find MarcXML records of
things like blogs, videos, music and other documents available online?
 This list pretty much revolves around (correct me if I'm wrong)
manipulating common XML formats in libraries.  So we might share some
tips on working with EAD's XML and converting it to MarcXML, or talk
about what type of parser is good for large collections of records.

You may be better off asking on a list more dedicated to cataloging or
harvesting records about good places to find information on web
resources.  Then take it in whatever form you can get it and convert
it to an XML schema if necessary.  There are several tools out there
for converting records, but without knowing your input format it's
difficult to recommend one with any certainty.  MarcEdit and the yaz
tools both have converters for MARC, and there are libraries for
working with MARC in many programming languages.
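
To give a rough idea of what the "convert it" step can look like, here
is a minimal Python sketch using only the standard library.  It assumes
the input is nothing more than a title and a URL per web resource,
which is a guess since the input format isn't known.  The function name
web_resource_to_marcxml is made up for illustration, and the output is
a skeletal MARCXML record (245 $a for the title, 856 $u for the URL)
with no leader or control fields, so treat it as a starting point
rather than a complete record.

```python
import xml.etree.ElementTree as ET

# Namespace of the Library of Congress MARCXML ("MARC21 slim") schema.
MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def web_resource_to_marcxml(title, url):
    """Build a skeletal MARCXML <record> for one web resource.

    Hypothetical helper: assumes the only metadata available is a
    title and a URL. No leader or control fields are generated.
    """
    # Serialize with MARCXML as the default namespace (no prefix).
    ET.register_namespace("", MARCXML_NS)
    record = ET.Element(f"{{{MARCXML_NS}}}record")

    def add_datafield(tag, ind1, ind2, code, value):
        # One datafield holding a single subfield.
        field = ET.SubElement(
            record,
            f"{{{MARCXML_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        subfield = ET.SubElement(
            field, f"{{{MARCXML_NS}}}subfield", {"code": code}
        )
        subfield.text = value

    # 245 $a: title statement.
    add_datafield("245", "0", "0", "a", title)
    # 856 $u: electronic location (ind1 "4" = HTTP, ind2 "0" = resource).
    add_datafield("856", "4", "0", "u", url)

    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(web_resource_to_marcxml("Example Blog", "http://example.org/blog"))
```

If the records already exist as binary MARC, one of the converters
mentioned above is the better route than building the XML by hand.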

Jon Gorman



