<div dir="ltr">Hi John,<br><br>I suggest HTML Tidy (htmltidy), a utility that does just what you want: it converts HTML to XHTML, which is well-formed XML.<br><br>Search for "htmltidy" and "batch" and you should find what you need.<br><br>Best,<br><br>David.<br>
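<br>P.S. To make the batch idea concrete, here is a minimal sketch in Python, assuming the <code>tidy</code> command-line program is installed and on your PATH. The folder names <code>html_in</code> and <code>xhtml_out</code> and the helper names <code>tidy_command</code> and <code>convert_folder</code> are made up for illustration; the options used (<code>-asxhtml</code>, <code>-numeric</code>, <code>-quiet</code>, <code>-o</code>) are standard Tidy switches, but please check them against your installed version.<br>

```python
import shutil
import subprocess
from pathlib import Path


def tidy_command(src: Path, dest: Path) -> list:
    """Build an HTML Tidy command line that writes `src` as XHTML to `dest`."""
    return [
        "tidy",
        "-asxhtml",   # output XHTML: closes <br>, <p>, <li> and quotes attribute values
        "-numeric",   # write numeric entities instead of named ones such as &nbsp;
        "-quiet",     # suppress non-essential output
        "-o", str(dest),
        str(src),
    ]


def convert_folder(in_dir: Path, out_dir: Path) -> list:
    """Run Tidy over every .htm/.html file in `in_dir`; return the files written."""
    if shutil.which("tidy") is None:
        raise RuntimeError("HTML Tidy ('tidy') was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.glob("*.htm*")):
        dest = out_dir / (src.stem + ".xhtml")
        result = subprocess.run(tidy_command(src, dest),
                                capture_output=True, text=True)
        # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
        # With errors it normally writes no output, so report and move on.
        if result.returncode < 2:
            written.append(dest)
        else:
            print(f"Tidy reported errors in {src}:\n{result.stderr}")
    return written


if __name__ == "__main__":
    # Hypothetical folder names; adjust to your own layout.
    for path in convert_folder(Path("html_in"), Path("xhtml_out")):
        print("wrote", path)
```

The originals are left untouched and the converted copies go to a separate folder, so you can compare the results before replacing anything.<br>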
<br><div class="gmail_quote">2008/7/15 John Fitzgibbon &lt;<a href="mailto:jfitzgibbon@galwaylibrary.ie">jfitzgibbon@galwaylibrary.ie</a>&gt;:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">









<div link="blue" vlink="purple" lang="EN-GB">

<div>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Hi,</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Is it possible to convert a folder of HTML files to XML
without having to edit each file with a text editor that supports regular
expressions? That is how I have accomplished this task in the past, but I am hoping
there is an easier way.</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">The process would have to change tags like &lt;br&gt; to
&lt;br/&gt;. Input tags in forms would also have to be closed.</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">It may have to close tags like &lt;p&gt; and &lt;li&gt;.</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Finally, attribute values are not necessarily bounded by
quotes. For example, width=200 will have to become width="200".</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Am I searching for a holy grail?</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Any advice would be much appreciated.</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Regards</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Jon</span></font></p>

<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">&nbsp;</span></font></p>

<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">w: <a href="http://www.galwaylibrary.ie" target="_blank">www.galwaylibrary.ie</a></span></font></p>

<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">e: <a href="mailto:info@galwaylibrary.ie" target="_blank">info@galwaylibrary.ie</a></span></font></p>

<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">p: 00 353 91 562471</span></font></p>

<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">f: 00 353 91 565039</span></font></p>

<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;">&nbsp;</span></font></p>

</div>


</div>


<br>_______________________________________________<br>
XML4Lib mailing list<br>
<a href="mailto:XML4Lib@webjunction.org">XML4Lib@webjunction.org</a><br>
<a href="http://lists.webjunction.org/mailman/listinfo/xml4lib" target="_blank">http://lists.webjunction.org/mailman/listinfo/xml4lib</a><br>
<br></blockquote></div><br><br clear="all"><br>-- <br>David Kane<br>Systems Librarian<br>Waterford Institute of Technology<br><a href="http://library.wit.ie/">http://library.wit.ie/</a><br>T: ++353.51302838<br>M: ++353.876693212
</div>