<div dir="ltr">Hi John,<br><br>I suggest HTML Tidy (htmltidy), a utility that does just what you want: it converts HTML to well-formed XHTML, closing open tags and quoting attribute values as it goes.<br><br>Search for "htmltidy batch" and you should find what you need to run it over a whole folder.<br><br>Best,<br><br>David.<br>
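<br>P.S. In case a starting point helps, here is a rough Python sketch of the batch step. It assumes the HTML Tidy command-line tool is installed and reachable on your PATH as <code>tidy</code>; the function names and the folder name in the usage note are made up for illustration. Try it on a copy of the folder first, since the -modify option overwrites each file in place.<br>

```python
import subprocess
from pathlib import Path

# Assumption: the HTML Tidy command-line tool is installed and reachable as "tidy".
TIDY = "tidy"


def build_tidy_command(html_file):
    """Return the tidy command line that rewrites one file in place as XHTML."""
    return [
        TIDY,
        "-asxhtml",  # convert HTML to well-formed XHTML
        "-quiet",    # suppress nonessential output
        "-modify",   # overwrite the input file with the tidied result
        str(html_file),
    ]


def tidy_folder(folder):
    """Run tidy on every .html file directly inside the given folder.

    Returns the list of files for which tidy reported errors.
    """
    failed = []
    for html_file in sorted(Path(folder).glob("*.html")):
        result = subprocess.run(build_tidy_command(html_file))
        # tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
        if result.returncode not in (0, 1):
            failed.append(html_file)
    return failed
```

Calling tidy_folder("my_html_folder") (a placeholder name) would process each .html file in that folder and hand back any that tidy could not repair, so those few can still be checked by hand.<br>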
<br><div class="gmail_quote">2008/7/15 John Fitzgibbon &lt;<a href="mailto:jfitzgibbon@galwaylibrary.ie">jfitzgibbon@galwaylibrary.ie</a>&gt;:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div link="blue" vlink="purple" lang="EN-GB">
<div>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Hi,</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Is it possible to convert a folder of HTML files to XML
without having to edit each file with a text editor that supports regular
expressions? In the past this is how I accomplished this task but I am hoping
there is an easier way.</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">The process would have to change tags like &lt;br&gt; to
&lt;br/&gt;. Input tags in forms would also have to be closed.</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">It may have to close tags like &lt;p&gt; and &lt;li&gt;.</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Finally, attribute values are not necessarily enclosed in
quotes. For example, width=200 will have to become width="200".</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Am I searching for a holy grail?</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Any advice would be much appreciated.</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Regards</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;">Jon</span></font></p>
<p><font size="2" face="Arial"><span style="font-size: 10pt; font-family: Arial;"> </span></font></p>
<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">w: <a href="http://www.galwaylibrary.ie" target="_blank">www.galwaylibrary.ie</a></span></font></p>
<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">e: <a href="mailto:info@galwaylibrary.ie" target="_blank">info@galwaylibrary.ie</a></span></font></p>
<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">p: 00 353 91 562471</span></font></p>
<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;" lang="EN-US">f: 00 353 91 565039</span></font></p>
<p><font size="3" face="Times New Roman"><span style="font-size: 12pt;"> </span></font></p>
</div>
</div>
<br>_______________________________________________<br>
XML4Lib mailing list<br>
<a href="mailto:XML4Lib@webjunction.org">XML4Lib@webjunction.org</a><br>
<a href="http://lists.webjunction.org/mailman/listinfo/xml4lib" target="_blank">http://lists.webjunction.org/mailman/listinfo/xml4lib</a><br>
<br></blockquote></div><br><br clear="all"><br>-- <br>David Kane<br>Systems Librarian<br>Waterford Institute of Technology<br><a href="http://library.wit.ie/">http://library.wit.ie/</a><br>T: ++353.51302838<br>M: ++353.876693212
</div>