[XML4Lib] Re: ignoring doctype

Eric Lease Morgan emorgan at nd.edu
Sun Oct 29 10:13:44 EST 2006


On Oct 29, 2006, at 7:42 AM, Eric Lease Morgan wrote:

> How do I get my XSLT program to ignore the DOCTYPE declaration in  
> an HTML file?
>
> I am trying to use xsltproc to transform my valid HTML files into  
> a format that can be easily indexed by the Alvis filter of Zebra  
> but my transformations only work if I remove the DOCTYPE definition  
> from the HTML. I have the following HTML snippet:
>
>   <?xml version="1.0" encoding="utf-8"?>
>   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>   <html>
>     <head>
>       <title>Communication is the key to our success / Eric Lease Morgan</title>
>       <meta name="identifier" content="musings-91" />
>       <meta name="author" content="Eric Lease Morgan" />
>       <meta name="title" content="Communication is the key to our success" />
>     </head>
>     <body><h1>Hello, World!</h1></body>
>   </html>
>
> I have this XSLT file:
>
>   <xsl:stylesheet
>     xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
>     xmlns:z   = "http://indexdata.dk/zebra/xslt/1"
>     version   = "1.0">


I think I can answer my own question:

   1) The HTML snippet above is not accurate. My original content  
declares a default (unprefixed) namespace on the root element and  
looks like this:

     <html xmlns="http://www.w3.org/1999/xhtml">

   2) Declaring that namespace in my XSLT, bound to a prefix, makes  
the problem go away:

     <xsl:stylesheet
       xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
       xmlns:h   = "http://www.w3.org/1999/xhtml"
       version   = "1.0">

   3) Last, I need to use that prefix in my templates, because an  
unprefixed name in a match pattern only matches elements that are in  
no namespace:

     <xsl:template match="h:meta">
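
Put together, the three points above might look like the following
minimal sketch. It assumes the input is the XHTML file described in
point 1. The <record> and <field> output elements are hypothetical
placeholders of my own, not the format the Alvis filter expects.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
  xmlns:h   = "http://www.w3.org/1999/xhtml"
  exclude-result-prefixes = "h"
  version   = "1.0">

  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Match the namespaced root element; a bare "html" would match
       nothing because the input puts <html> in the XHTML namespace. -->
  <xsl:template match="/h:html">
    <!-- <record> is a hypothetical wrapper element for illustration -->
    <record>
      <xsl:apply-templates select="h:head/h:meta"/>
    </record>
  </xsl:template>

  <!-- Unprefixed attributes such as @name and @content are in no
       namespace, so they are selected without the h: prefix. -->
  <xsl:template match="h:meta">
    <!-- <field> is a hypothetical output element for illustration -->
    <field name="{@name}">
      <xsl:value-of select="@content"/>
    </field>
  </xsl:template>

</xsl:stylesheet>
```

The prefix "h" is arbitrary; only the namespace URI has to match the
one declared in the HTML. Every element step in a match or select
expression needs the prefix, not just the first one.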

Namespaces have always been confusing to me. Sigh.

-- 
Eric Morgan



