[XML4Lib] batch conversion of HTML files to XML
John Fitzgibbon
jfitzgibbon at Galwaylibrary.ie
Tue Jul 15 04:47:27 EDT 2008
Hi,
Is it possible to convert a folder of HTML files to well-formed XML without editing each file by hand in a text editor that supports regular expressions? That is how I have done it in the past, but I am hoping there is an easier way.
The process would have to change empty tags like <br> to <br/>. The <input> tags in forms would also have to be closed.
It may also have to close tags like <p> and <li> where the closing tag has been omitted.
Finally, attribute values are not necessarily enclosed in quotes. For example, width=200 will have to become width="200".
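To make the requirements above concrete, here is a minimal sketch of the transformation, assuming Python and only its standard library. The names XmlSerializer, html_to_xml and convert_folder are hypothetical, and this is not a complete HTML-to-XHTML converter: it only self-closes empty elements such as <br> and <input>, closes a <p> or <li> that was left open, and quotes attribute values. The doctype and comments are dropped.

```python
from html.parser import HTMLParser
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr

# HTML elements that never have content, so they become <tag/> in XML.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}
# Elements whose end tag is commonly omitted (assumption: only these two).
IMPLIED_END = {"p", "li"}


class XmlSerializer(HTMLParser):
    """Hypothetical helper: re-emits parsed HTML with XML-style syntax."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the output document
        self.stack = []  # names of elements that are currently open

    def handle_starttag(self, tag, attrs):
        # A new <p> or <li> closes a still-open one of the same kind.
        if tag in IMPLIED_END and self.stack and self.stack[-1] == tag:
            self.out.append(f"</{self.stack.pop()}>")
        # quoteattr() adds the quotes and escapes &, < and >.
        # A bare attribute such as "checked" becomes checked="checked".
        text = "".join(
            f" {name}={quoteattr(name if value is None else value)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{text}/>")
        else:
            self.out.append(f"<{tag}{text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.stack:
            return  # ignore </br> and stray end tags
        # Close anything left open inside this element, then the element.
        while True:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def result(self):
        # Close whatever is still open at the end of the input.
        closing = "".join(f"</{t}>" for t in reversed(self.stack))
        return "".join(self.out) + closing


def html_to_xml(html_text):
    parser = XmlSerializer()
    parser.feed(html_text)
    parser.close()  # flushes any buffered trailing text
    return parser.result()


def convert_folder(src_dir, dst_dir):
    # Assumption: files end in .html or .htm and are UTF-8 encoded.
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.htm*")):
        xml_text = html_to_xml(path.read_text(encoding="utf-8", errors="replace"))
        (dst / (path.stem + ".xml")).write_text(xml_text, encoding="utf-8")
```

Calling convert_folder("html", "xml") (both folder names are placeholders) would write one .xml file per input file. The sketch does not guarantee a single root element, so the output should still be checked with an XML parser.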
Am I searching for a holy grail?
Any advice would be much appreciated.
Regards
Jon
w: www.galwaylibrary.ie
e: info at galwaylibrary.ie
p: 00 353 91 562471
f: 00 353 91 565039