To convert a scanned PDF into a clean, tidy HTML file, I recommend the following process:

1. Run OCR on the scan (I recommend OmniPage Ultimate).
2. Export the recognized text to MS Word.
3. In Word, fix all layout errors (paragraphs broken mid-line, missing images, alignment, etc.).
4. Convert the document to HTML, ideally with dedicated software (I recommend Word Cleaner by Freshideas).
5. Tidy up the result in Dreamweaver or KompoZer, comparing it with the source HTML.
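If no dedicated converter is at hand, the Word-to-HTML clean-up can be roughly approximated with a small script. The following is only a minimal sketch, assuming the HTML was produced by Word's own "Save as Web Page" export; the function name `clean_word_html` is hypothetical, and the regular expressions are heuristics that cover the most common Word markup, not a substitute for a dedicated tool or a manual check in an editor.

```python
import re


def clean_word_html(html: str) -> str:
    """Strip common Word-specific markup (heuristic, not exhaustive)."""
    # Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Office namespace tags such as <o:p> and </o:p>; their text is kept
    html = re.sub(r"</?[ovw]:[^>]*>", "", html)
    # class="MsoNormal" and similar Word style classes (quoted or unquoted)
    html = re.sub(r'\s+class="?Mso[^">\s]*"?', "", html)
    # Inline style attributes (assumes styling is re-applied later via CSS)
    html = re.sub(r'\s+style="[^"]*"', "", html)
    # Empty paragraphs left behind after the steps above
    html = re.sub(r"<p>\s*(&nbsp;)?\s*</p>", "", html)
    return html.strip()


if __name__ == "__main__":
    sample = (
        '<p class="MsoNormal" style="margin-left:0cm">Hello<o:p></o:p></p>'
        "<!--[if gte mso 9]><xml>x</xml><![endif]-->"
        "<p class=MsoNormal><o:p>&nbsp;</o:p></p>"
    )
    print(clean_word_html(sample))
```

The output should still be opened in an HTML editor afterwards, since a regex pass like this cannot judge whether the remaining structure matches the original document.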