Impression to HTML
Dave Holden explains how to use our RISCWorld HTML converter
!Imp-HTML is Copyright © David Holden 2002-2007.
This program was originally written to enable me to convert long, multi-chapter program manuals written in Impression into multi-file HTML documents, complete with illustrations and an Index file with links to appropriate places in the text. Other people have a similar requirement and have probably found, like me, that the various other tools available just don't do a very good job and require too much editing and alteration of the HTML afterwards.
Companion programs
There is a program to do the same for Ovation Pro. This is being upgraded to work in a similar manner to this program, and it will be made available as soon as it is ready.
There is also HTML-DTP which does the opposite, that is, it takes an HTML file or series of files and converts them into a form suitable for loading into Impression or Ovation Pro.
Finally there is HTM_link. This is used to insert image tags and links into HTML files. It makes the job very easy and avoids those silly mistakes we all make when 'hand coding' HTML.
What will it convert?
This program does not attempt to produce a DTP-like HTML file. This is simply not appropriate. HTML is not really a suitable medium for complex layout structures. The original aim was to convert program manuals, where the layout is fairly straightforward, and this it does very well.
Imp-HTML is not suited to 'fancy' documents with multi-column text running around lots of pictures or where formatting and layout are of prime importance. This can be done in HTML with frames, tables, specifying fonts and font sizes, etc. and some programs, especially on PCs, try to do this. What you usually end up with is a complicated, bloated HTML structure that will look lovely in its target browser and platform (usually Internet Explorer on a PC) but will take ages to render (if it renders at all) in other browsers on other platforms, and often look nothing like the original anyway.
Briefly, if you have something with more pictures than text, a complex layout and which requires specific fonts or font sizes then this program will not be able to reproduce the document in HTML. If it's a book-like document which is mainly text with or without illustrations then Imp-HTML will almost certainly do it.
How Imp-HTML works
The program does not use the original Impression document file. Instead it works with text files saved from Impression. If you use 'Save text story' and ensure that 'With styles' is ticked then Impression will save the complete text story with all its Styles and effects, so headings, bold and italicised text, etc. can be correctly interpreted.
The disadvantage of this system is that all graphics are lost. However, in practice this isn't really a problem. With a printed manual the writer will often include a lot of graphics that are not absolutely necessary simply because it's easy to do so. Too many graphics in an HTML document can make it slow to download or render and, because you don't have such exact control over layout, may make it look very messy.
So, by separating the conversion of the textual part of the document from the graphics you have precise control over what graphics are inserted, where, in what format and whether you wish to apply a scaling factor. For example, with a booklet it may be best to position a graphic slightly before or after the text which refers to it to suit page layout. As the reader will have the pages open before them when reading it won't matter if they have to look across at a facing page to see the picture. With HTML, if the picture is in the same place relative to the text, this might well place it completely out of sight when reading the document on screen.
What needs to be done
There are several main operations that must be performed when converting an Impression Text Story to HTML:
- Any 'special' characters such as the pound sign, < and >, etc. must be converted to their appropriate HTML equivalents.
- Bold, italic and centred text must have appropriate tags inserted.
- The various styles used in the Impression documents need to be 'mapped' to suitable HTML 'styles' so that headings, sub headings, etc. are given appropriate HTML equivalents.
- An 'index' must be created which provides links to selected headings and sub headings, and the file must be broken up into chapters if required.
- Where necessary 'list' structures must be created, either 'bullet point' unordered lists or numbered ordered lists.
Some of these operations will be the same for almost all Impression files. For example, character translation and bold and italic effects. Others, such as mapping Styles to HTML and when to start a new chapter and which items should appear in the index file may be different for each Impression document.
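As a rough illustration of the character translation step, here is a minimal Python sketch. It is not Imp-HTML's own code: the function name escape_text and the EXTRA_ENTITIES table are assumptions made up for this example, and the real program reads its translations from external text files.

```python
import html

# Hypothetical table of extra characters to translate; not taken from Imp-HTML.
EXTRA_ENTITIES = {
    "£": "&pound;",
    "©": "&copy;",
}


def escape_text(text: str) -> str:
    """Convert 'special' characters to their HTML equivalents.

    &, < and > are escaped first so that the ampersands introduced by
    the entity replacements below are not escaped a second time.
    """
    escaped = html.escape(text, quote=False)
    for char, entity in EXTRA_ENTITIES.items():
        escaped = escaped.replace(char, entity)
    return escaped


if __name__ == "__main__":
    print(escape_text("Price < £5 & rising"))
```

The order matters: if the pound sign were replaced before the ampersand was escaped, the result would contain a doubly escaped entity.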
All of these parameters are set in external text files within !Imp-HTML so they are not fixed. To cope with different Impression files you can have many different mapping files. These are the files which define which Impression Styles are mapped to which HTML tags and which appear in the Index. They can be selected from a menu in Imp-HTML, which also includes tools to make creating these files very simple.
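To give an idea of what a style mapping involves, here is a small Python sketch. The file layout shown ('Style name = tag', with an optional 'index' flag) is invented for this example and is not the actual Imp-HTML mapping file format; the names parse_mapping and wrap are likewise hypothetical.

```python
# Hypothetical mapping text: 'Impression style = HTML tag [index]'.
# This layout is an assumption for illustration, not Imp-HTML's real format.
EXAMPLE_MAPPING = """\
# Lines starting with '#' are comments
Main Heading = h1 index
Sub Heading = h2
"""


def parse_mapping(text):
    """Return {style name: (html tag, appears in index)}."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        style, _, rest = line.partition("=")
        parts = rest.split()
        if not parts:
            continue  # no tag given, so ignore the line
        mapping[style.strip()] = (parts[0], "index" in parts[1:])
    return mapping


def wrap(style, text, mapping):
    """Wrap text in the tag mapped to its style; unmapped styles become <p>."""
    tag, _in_index = mapping.get(style, ("p", False))
    return f"<{tag}>{text}</{tag}>"


if __name__ == "__main__":
    styles = parse_mapping(EXAMPLE_MAPPING)
    print(wrap("Main Heading", "Introduction", styles))
    print(wrap("Body", "Some ordinary text.", styles))
```

Keeping the mapping in a separate text file, as Imp-HTML does, means a different Impression document only needs a different mapping file rather than any change to the program.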
Dave Holden