* Tip published on eHelp website.
This month: converting large Microsoft Word documents to HTML. We've explored this subject before, but it's worth looking into again since I've discovered a workaround for converting long Word "DOC" files into HTML topics.
This tip is useful for converting Word documents (any version) to HTML topics, and is especially useful if you've been frustrated by Word 2000's use of XML. The technique automatically removes the extraneous XML code that Word 2000 produces from your topics, saving you a lot of time.
For those unfamiliar with the problem, Microsoft Word 2000 uses eXtensible Markup Language (XML) in its documents. When you attempt to import a DOC file into RoboHELP HTML, the XML tags can cause older versions of RoboHTML to "choke" unless you use Microsoft's HTML Filter 2.0. RoboHELP 9 has been re-engineered to allow DOC files to be converted to HTML files, but it does not filter the XML from the HTML. The resulting HTML is laden with XML and embedded style sheets, which makes it nearly impossible to control the appearance of your topics. On the surface the document looks and works fine, but the TrueCode reveals hundreds of lines of XML tags and embedded style elements.
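To make the problem concrete, here is a minimal Python sketch of the kind of cleanup involved. It is not how RoboHELP, Microsoft's HTML Filter, or any conversion utility works internally. The function name strip_word_markup and the list of patterns are assumptions for illustration, based on markup commonly seen in Word-generated HTML: conditional comments, xml blocks, embedded style sheets, namespaced tags such as o:p, and xmlns declarations.

```python
import re

# Patterns for markup that Word commonly adds to the HTML it generates.
# This list is an illustrative assumption, not an exhaustive inventory.
_PATTERNS = [
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    re.compile(r'<!--\[if[^\]]*\]>.*?<!\[endif\]-->', re.DOTALL | re.IGNORECASE),
    # Whole <xml> ... </xml> blocks
    re.compile(r'<xml\b[^>]*>.*?</xml>', re.DOTALL | re.IGNORECASE),
    # Embedded style sheets
    re.compile(r'<style\b[^>]*>.*?</style>', re.DOTALL | re.IGNORECASE),
    # Namespaced tags such as <o:p> and </o:p> (only the tag; inner text is kept)
    re.compile(r'</?[a-z][a-z0-9]*:[^>]*>', re.IGNORECASE),
    # xmlns declarations on elements such as <html>
    re.compile(r'\s+xmlns(:[a-z0-9]+)?="[^"]*"', re.IGNORECASE),
]


def strip_word_markup(html: str) -> str:
    """Return html with the Word-specific markup listed above removed."""
    for pattern in _PATTERNS:
        html = pattern.sub('', html)
    return html


if __name__ == '__main__':
    sample = (
        '<html xmlns:o="urn:schemas-microsoft-com:office:office"><head>'
        '<style>p.MsoNormal{margin:0}</style>'
        '<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->'
        '</head><body><p>Hello<o:p></o:p></p></body></html>'
    )
    print(strip_word_markup(sample))
```

Regular expressions are a crude way to process HTML, so treat this only as a picture of what has to be stripped. It also leaves inline style and class attributes in place, which a real cleanup would need to handle.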
Here's a list of solutions to the Word 2000 DOC to HTML dilemma:
Stick with Word 97. If you have a copy of Word 97 (Version 8), stick with it. Word 97 converts DOC files to HTML without much of a problem.
Copy and paste. A reliable method, but you lose all the original formatting. Simply copy the text from a Word 2000 document and paste the text into an HTML topic in RoboHTML. This technique ultimately gives you a lot of control, but you must reformat everything. For an occasional topic this method works great, but for hundreds of topics this technique is a lot of work.
Third-party conversion tools. You can use conversion utilities to convert DOC files to HTML. I haven't tested them all, but limited testing resulted in very "clean" HTML. You can import the resulting HTML into RoboHTML without a problem. I'll highlight a few programs this month. Maybe eHelp will add one of these converters to RoboHELP some day.
Finally, import the DOC to WinHelp, then convert to HTML-Help. You can import a DOC file into a WinHelp project and create hundreds of WinHelp topics in a matter of minutes. Once imported, simply run the Single Source option to convert the project to HTML-Help. If you have RoboHELP Office, you should have both programs required for this conversion: RoboHELP Classic (WinHelp) and RoboHELP HTML Edition. Once converted, you can either import the resulting HTML files into an existing project or open the HTML-Help project file (hhp) in RoboHELP HTML to start a new project.
First, let's look at third-party conversion utilities.