Converting Word documents to HTML is a common and pressing problem, and the only way to parse Word files without using COM is to parse RTF files. I'm often asked whether it is possible to create good-looking HTML files from RTF automatically. I've done some tests, and here is the result.
You can test the script yourself, but bear in mind that it won't handle complicated documents. Currently only simple formatting is supported: bold, italic, and underlined text, plus font size. Documents with page numbers, headers, footers, images, or other graphic elements won't be parsed correctly.
Tables are not supported either, but maybe I'll find a way to add this feature.
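To illustrate what handling only "simple formatting" involves, here is a minimal Python sketch. It is not the script offered for download on this page, just one possible approach under stated assumptions: the names `rtf_to_html`, `render`, `TOKEN` and `SKIP` are made up for this example, the list of skipped destinations is not exhaustive, and `\'hh` escapes are assumed to be Windows-1252. It tracks `\b`, `\i`, `\ul` and `\fs` (font size in half-points) per RTF group and ignores everything else, so tables and images are lost, as described above.

```python
import html
import re

# One token is: a control word with an optional numeric parameter, a \'hh hex
# escape, a control symbol, a group brace, or a run of plain text.
TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?|\\'([0-9a-fA-F]{2})|\\([^a-z])|([{}])|([^\\{}]+)"
)

# Destinations whose contents are not body text (assumed, non-exhaustive list).
SKIP = {"fonttbl", "colortbl", "stylesheet", "info", "pict", "header", "footer"}


def render(text, state):
    """Wrap one run of text in tags that reflect the current formatting."""
    s = html.escape(text)
    # \fsN is given in half-points, so \fs24 means 12pt.
    s = f'<span style="font-size:{state["fs"] / 2:g}pt">{s}</span>'
    if state["ul"]:
        s = f"<u>{s}</u>"
    if state["i"]:
        s = f"<i>{s}</i>"
    if state["b"]:
        s = f"<b>{s}</b>"
    return s


def rtf_to_html(rtf):
    state = {"b": False, "i": False, "ul": False, "fs": 24, "skip": False}
    stack = []  # saved formatting states, one per open group
    out = []
    for m in TOKEN.finditer(rtf):
        word, param, hexcode, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(dict(state))       # formatting is local to a group
        elif brace == "}":
            if stack:
                state = stack.pop()
        elif word:
            if word in SKIP:
                state["skip"] = True        # ignore the rest of this group
            elif word in ("b", "i", "ul"):
                state[word] = param != "0"  # \b turns on, \b0 turns off
            elif word == "ulnone":
                state["ul"] = False
            elif word == "fs" and param:
                state["fs"] = int(param)
            elif word == "plain":
                state.update(b=False, i=False, ul=False, fs=24)
            elif word == "par" and not state["skip"]:
                out.append("<br>\n")
        elif hexcode:
            if not state["skip"]:
                char = bytes([int(hexcode, 16)]).decode("cp1252", "replace")
                out.append(render(char, state))
        elif symbol:
            if symbol == "*":
                state["skip"] = True        # \* marks an ignorable destination
            elif symbol in "\\{}" and not state["skip"]:
                out.append(render(symbol, state))
        elif text and not state["skip"]:
            text = text.replace("\r", "").replace("\n", "")
            if text:
                out.append(render(text, state))
    return "".join(out)


sample = (r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\fs24 Plain {\b bold} "
          r"and {\i\fs36 big italic}\par}")
print(rtf_to_html(sample))
```

Each text run is wrapped in its own complete set of tags, which keeps the HTML well-formed even when RTF switches formatting on and off in an order that would not nest cleanly. The price is verbose output; merging adjacent runs with identical formatting would be a natural next step.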
Using the form below, you can select any RTF file and view it as HTML. If you have any comments, you can contact me by e-mail.
Warning! If you try to test this script with an RTF file generated by my RTF Generator, it won't work: RTF Generator creates documents that are too complicated for this test script, and its test document demonstrates almost all of the generator's features.
NOTICE: This is a frozen project. For now I have given up on the idea of completing it, as I have more interesting and useful projects at hand. You can download the source files as they are, with no support, and do whatever you want with them.
Download zip file (4 KB)