What a tortuous web we weave. My books are prepared using Open Office Writer and also Microsoft Office Word 2007. My editor works on them in Word 2007 and then returns the file for final editing and review. We then take the .doc files and import them (place files) into the templates in Adobe's InDesign CS4 to provide the typesetting in preparation for printing. When we are ready to print we use CS4 to produce a pdf file which is then sent to our printers. They work from the pdf file to print out book blocks. We use CS4 because it is great at producing accurate output and it also is great at getting rid of the various print fonts that Microsoft Word seems to want to maliciously slip into the body of your text without your knowledge and consent.
That all works fine.
Now we decided to publish an e-book version of our publications. Boy, does the immaturity of this branch of technology show or what?! Eventually we decided to use Amazon's Kindle format as a distribution platform. It takes a wide range of input formats and converts them to Kindle's own e-book format. The process is pretty slick, but has some bugs.
Firstly, there is no point in using the pdf files prepared for a print house. The e-book readers are not great at handling all the different styles employed in paper printing. The book layout really has to be much more primitive. InDesign CS4 nicely produces an e-book format called .epub. That format is used by many e-book readers, but it isn't used by Kindle. So the first thing we did was to produce a different edition of our books for the e-versions. We stripped out text justification, tables etc. ready for the e-books. We removed things like always starting the next chapter on an odd-numbered page. It just isn't relevant for modern day generic e-books. Sure, you can do this type of formatting for e-book readers, but you'd be tied to the physical screen size of a particular reader.
Great - now all we have to do is to zap out a pdf file and upload it to Amazon, then wait 2 days and Voila, your e-book is available to the public. That is what appears to happen, but it doesn't. Your pdf file can look just fine on visual inspection, but when it is shredded by the Amazon conversion programs, odd things can happen. (I think the cause of this lies back somewhere in Microsoft Word.) We had chapter headings mysteriously disappear from the e-book. When I inspected the files downloaded from Amazon, the XHTML code generated by the Amazon programs looked even worse. Lines of text that should have been continuous were terminated with a [br /] HTML code for a new line. In some cases where the [p ... /p] paragraph markers were expected, they weren't present. What was worse was that there was no apparent pattern to the failures. The pdf and InDesign stuff looked fine; it certainly printed okay.
Amazon's help files suggest that you manually edit the HTML code and then reload it onto their site. If your book is 180,000 words long, that just isn't workable.
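To give a feel for why a script beats hand-editing at that length, here is a minimal Python sketch that flags and joins line breaks that land in the middle of a sentence. It is an illustration only, not Amazon's tooling and not a complete fix. It assumes the [br /] written above appears in the real file as an angle-bracket br tag, and it uses a crude heuristic (no sentence-ending punctuation before the break, a lowercase letter after it). The function names are made up for this example.

```python
import re

# Heuristic (an assumption, not a rule from Amazon): a <br /> is "suspect"
# when the character before it is a letter, comma or semicolon and the text
# after it starts with a lowercase letter, i.e. the break sits mid-sentence.
SUSPECT_BR = re.compile(r'(?<=[A-Za-z,;])\s*<br\s*/?>\s*(?=[a-z])')


def count_suspect_breaks(xhtml):
    """Return how many mid-sentence breaks the heuristic finds."""
    return len(SUSPECT_BR.findall(xhtml))


def join_suspect_breaks(xhtml):
    """Replace each suspect break with a single space, leaving others alone."""
    return SUSPECT_BR.sub(' ', xhtml)


if __name__ == "__main__":
    sample = '<p>The quick<br />brown fox.<br />A new line.</p>'
    print(count_suspect_breaks(sample))
    print(join_suspect_breaks(sample))
```

A heuristic like this will miss some cases and may flag a few legitimate breaks (in verse or addresses, say), so the count is best treated as a list of places to check, not as proof of an error. It does nothing about missing paragraph markers or vanished chapter headings.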
After a Sunday spent investigating a way around this we think we've found a way through!! I'll resubmit the upload to Amazon, and let you know how it works out.
Here's a hint.