[XML4Lib] Re: ignoring doctype
Eric Lease Morgan
emorgan at nd.edu
Sun Oct 29 10:13:44 EST 2006
On Oct 29, 2006, at 7:42 AM, Eric Lease Morgan wrote:
> How do I get my XSLT program to ignore the DOCTYPE declaration in
> an HTML file?
>
> I am trying to use xsltproc to transform my valid HTML files into
> a format that can be easily indexed by the Alvis filter of Zebra
> but my transformations only work if I remove the DOCTYPE definition
> from the HTML. I have the following HTML snippet:
>
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> <html>
> <head>
> <title>Communication is the key to our success / Eric Lease
> Morgan</title>
> <meta name="identifier" content="musings-91" />
> <meta name="author" content="Eric Lease Morgan" />
> <meta name="title" content="Communication is the key to our
> success" />
> </head>
> <body><h1>Hello, World!</h1></body>
> </html>
>
> I have this XSLT file:
>
> <xsl:stylesheet
> xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
> xmlns:z = "http://indexdata.dk/zebra/xslt/1"
> version = "1.0">
I think I can answer my own question:
1) The HTML snippet above is not accurate. My original content
declares a default (unprefixed) namespace and looks like this:
<html xmlns="http://www.w3.org/1999/xhtml">
2) Declaring that namespace in my XSLT, bound to a prefix, makes the problem go away:
<xsl:stylesheet
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
xmlns:h = "http://www.w3.org/1999/xhtml"
version = "1.0">
3) Last, I need to use that prefix in the match patterns of my templates:
<xsl:template match="h:meta">
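Putting the three steps together, a minimal sketch of a complete stylesheet might look like the following. The text output method and the "name: content" line format are assumptions chosen for illustration; they are not taken from the original Zebra stylesheet, which would use the z: namespace for its index elements.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  version="1.0">

  <!-- Assumption: plain text output, just to make the matches visible -->
  <xsl:output method="text" encoding="utf-8"/>

  <!-- Start at the document root and visit only the meta elements -->
  <xsl:template match="/">
    <xsl:apply-templates select="h:html/h:head/h:meta"/>
  </xsl:template>

  <!-- h:meta matches meta elements in the XHTML namespace;
       a bare "meta" would match only elements in no namespace.
       Attributes such as @name are unprefixed and need no h: prefix. -->
  <xsl:template match="h:meta">
    <xsl:value-of select="@name"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="@content"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Saved under a hypothetical name such as sketch.xsl, this could be run as "xsltproc sketch.xsl page.html" and should emit one line per meta element. The prefix h is arbitrary; only the namespace URI has to match the one declared in the source document. If fetching the DTD named in the DOCTYPE is a concern, xsltproc's --novalid option skips loading it.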
Namespaces have always been confusing to me. Sigh.
--
Eric Morgan