Book Reversions — Kindle to Word Conversion

My wife, Margaret Watson, requested reversions for 22 older Harlequin/Silhouette books that were no longer in print.  Harlequin granted this request on 10 books and immediately removed the books from on-line marketplaces.

We asked Harlequin for digital copies so that we could easily make them available for sale.  They said they charge $500 per book.

This seemed like a lot, since we just wanted an email with 10 files attached.  Five thousand dollars?

Since I’m retired, I have time to figure out how to re-create digital versions of hard-copy books.  I describe this process in a separate post.

We’re hoping Harlequin will soon grant reversion on the 12 other books that are still listed for sale for $3.99 each.  When they do, I don’t want to lose my chance to buy digital versions for $3.99 apiece.

So I bought all 12 books from Amazon and researched how to convert them to Microsoft Word format.  It turned out to be quite simple.

First I “bought” the “Kindle for PC” software from Amazon.  It’s free.  You buy it, download it, and install it on your PC so that you can download books in digital format.

Then you buy the Kindle version of your Harlequin book from Amazon.  This is where you need to fork over $3.99 per book.  But that beats $500.

Then use your PC (or Mac) to launch the Kindle app.  You should see your newly purchased book listed in the library.  Double-clicking the book downloads it to your PC, and it’s then marked with a checkmark in the library.  These downloads are stored in the folder “Documents\My Kindle Content\” in Kindle format (*.azw) and named using Amazon’s unique ASIN ID.
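If you have more than a few books, it can help to list what actually got downloaded.  Here is a minimal Python sketch, assuming the folder layout described above.  The function name is made up for illustration, and the recursive search is only a guess that some Kindle versions put each book in its own subfolder.

```python
from pathlib import Path


def list_kindle_books(content_dir):
    """Return (file name without extension, path) for each .azw file found.

    The file name is expected to be (or contain) the book's ASIN, as
    described in the post; this sketch does not verify that.
    """
    root = Path(content_dir)
    # Search subfolders too, in case each book sits in its own folder (assumption).
    books = sorted(root.rglob("*.azw"), key=lambda p: p.name)
    return [(p.stem, p) for p in books]
```

To try it on a Windows PC, you would call something like list_kindle_books(Path.home() / "Documents" / "My Kindle Content") and print the results.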

Now you just need to convert from Kindle format to Word.  My research indicated that Calibre software could do this conversion.  Calibre is free, open-source software that I already had installed on my PC.  It lets you read and convert digital books (e.g., EPUB, MOBI).

But when I tried converting, I found that DRM caused a problem.  DRM (Digital Rights Management) is technology used to prevent piracy.  Additional research suggested that there’s a “DeDRM” plugin for Calibre from “Apprentice Alf’s” that can remove the protection so that the file can be read and converted.

I doubted that it could be that easy to thwart the DRM protection, but I downloaded DeDRM_Plugin.zip and followed the instructions to install it.  It worked like a charm.

To complete the conversion from Kindle to Word, launch the Calibre software, use the “Add Books” button, point to the Kindle book folder, and select the book (ASIN ID).  After the book appears in the Calibre library, use the “Convert Books” icon to convert to DOCX format.  The Word document will be found in the Calibre book folder — “Documents\Calibre Library\AuthorName\BookName\”.  I even found I could add ten books at once to Calibre (hold control key while selecting books) and convert 10 books at once using the Bulk convert.  I was pleasantly surprised how smoothly this conversion went.
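The bulk conversion can also be scripted.  Calibre installs a command-line tool called ebook-convert that takes an input file and an output file and picks the formats from the extensions.  The sketch below is one way that might look, not what I actually did (I used the buttons).  The function names are made up, and it assumes ebook-convert is on your PATH and that the input files are ones Calibre can already read.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir):
    """Build the ebook-convert command for one book (source -> DOCX)."""
    source = Path(source)
    target = Path(out_dir) / (source.stem + ".docx")
    return ["ebook-convert", str(source), str(target)]


def convert_all(content_dir, out_dir):
    """Convert every .azw file in content_dir to DOCX, one at a time."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for book in sorted(Path(content_dir).glob("*.azw")):
        # check=True stops at the first book that fails to convert.
        subprocess.run(build_convert_command(book, out_dir), check=True)
```

Building the command in its own function makes it easy to print and inspect before running anything.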

Book Reversions – Hardcopy to Word Version

My wife, Margaret Watson, recently got reversions of 10 older Harlequin/Silhouette books.  This means she now has the right to publish them herself.  Unfortunately, by the time she was notified, digital copies of these books were no longer available because Silhouette had already removed them from on-line sites.

She asked Harlequin to send her digital versions, but they wanted $500 per book.  So we decided to create digital versions from the hardcopy books.

In 2011, we used Blue Leaf Book Scanning to digitize a hard copy book for about $25.  You mail them the book, they cut off the spine, scan it and digitize it using OCR (Optical Character Recognition).  Then you download the Word format from their site.

To digitize these 10 Silhouette books, we used 1DollarScan.com, which costs $6 per book.  I was initially disappointed by the OCR accuracy, so I paid an additional $6 for their high quality OCR option.  This feature, which they call HQT (High Quality Touchup), produced very good results.  Before applying the OCR engine, it slightly rotated any page that was tilted, to improve character recognition.

I downloaded the books from their website in PDF format.  The files contained both the scanned pages and the OCR results (i.e., text).

I first used copy/paste to put the text into Word, but the paragraph breaks did not come across.  The OCR engine does not treat blank lines or indentation as meaningful.

I searched online and found a free utility called UniPDF, which is a PDF to Word converter.  It was easy to use.  From each PDF, it produced a Word document with the proper paragraph breaks.

Silhouette indicated scene changes in our books by inserting a blank line between paragraphs.  The OCR doesn’t pick this up, so I had to review the hardcopy and manually add scene changes as a line with “***”.

The OCR was very accurate, but not perfect.  Since I had extensive work experience writing Microsoft Excel macros, I wrote Word macros (VBA) to help format the book.

One macro removed all the page headings.  Where each heading was found, it inserted a Word comment with the old page number.  This allowed me to more easily cross-reference the Word document to the physical book during my review.  At the end of the review, all comments were removed with a single command.
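My macro was written in VBA, but the idea is easy to show in Python on plain text.  This is only a sketch of the approach, not the macro itself.  It assumes a page heading is an all-caps line with the page number at the start or end, and it leaves a bracketed marker such as [p. 12] where the macro would have left a Word comment.  The sample heading shapes and the function name are assumptions for illustration.

```python
import re

# Assumed heading shape: a page number at the start or end of an all-caps line,
# e.g. "12 MARGARET WATSON" or "A SAMPLE TITLE 13".
HEADING = re.compile(
    r"^(?:(\d{1,3})\s+([A-Z][A-Z .'\-]+)|([A-Z][A-Z .'\-]+)\s+(\d{1,3}))$"
)


def strip_page_headings(lines):
    """Replace each page-heading line with a marker holding the old page number."""
    cleaned = []
    for line in lines:
        match = HEADING.match(line.strip())
        if match:
            page = match.group(1) or match.group(4)
            cleaned.append(f"[p. {page}]")  # stands in for the Word comment
        else:
            cleaned.append(line)
    return cleaned
```

Once the review is finished, the markers can be dropped by filtering out lines that start with "[p. ", which mirrors deleting all the comments at once.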

Other macros were written to initialize styles, replace double paragraph breaks, and format chapter headings.  The macros were written just once, stored in the Normal.dot, then used to prepare all the books.

Other formatting issues were handled in a less automated fashion, to avoid inadvertently making improper changes.  As I worked through the initial books, I created a log in a spreadsheet tracking what issues should be addressed, and in what order.  I continually refined the log and used it as I began formatting each new book.

For example, the log included how to deal with contractions (e.g., didn’t, could’ve).  Contractions were an issue because the OCR generally inserted a space after each single quote.  My log listed all the contractions (’s, ’t, ’d, ’ve, ’ll, ’re) that I needed to address.  Word’s repeated find/replace feature worked well here.
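The same contraction cleanup can be expressed as one regular expression.  The sketch below is an alternative to the find/replace passes I ran in Word, not what I actually used.  It assumes the stray space always follows the apostrophe and that the ending is one of the six in my log; the names are made up for illustration.

```python
import re

# Contraction endings from the log: 's, 't, 'd, 've, 'll, 're
SUFFIXES = ("s", "t", "d", "ve", "ll", "re")

# A letter, an apostrophe (curly or straight), stray whitespace, then an ending
# that stands alone as a whole word.
PATTERN = re.compile(r"(\w)([’'])\s+(" + "|".join(SUFFIXES) + r")\b")


def fix_contractions(text):
    """Remove the space the OCR inserted after the apostrophe in contractions."""
    return PATTERN.sub(r"\1\2\3", text)
```

Requiring the ending to be a whole word keeps a plural possessive such as “the boys’ toys” from being squeezed together.  It is still worth spot-checking the results, since a closing single quote followed by a one-letter word could fool it.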

Other log issues included ellipses, end-of-line dashes, em dashes, I’s interpreted as 1’s, double spaces, double single quotes instead of single double quotes, end-of-paragraph issues, end-of-sentence issues, proper double space after period, …

And there were some pure OCR issues.  For example, “corner” was often interpreted as “comer”.  And “barn” sometimes came out as “bam”.  Once I had my checklist, I used it to review and fix each book.
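Both of those examples come from “rn” being read as “m”.  Since “comer” and “bam” are real words, a spell checker won’t catch them, so it’s safer to flag them for review than to replace them blindly.  Here is a small sketch of that idea.  The watch list holds only the two examples above, and the function name is made up.

```python
import re

# Watch list of OCR misreads and the word that was probably intended.
# Only the two examples from the post; extend it as new ones turn up.
SUSPECTS = {"comer": "corner", "bam": "barn"}


def flag_suspects(text):
    """Return (position, word found, likely intended word) for each suspect."""
    hits = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0)
        if word.lower() in SUSPECTS:
            hits.append((match.start(), word, SUSPECTS[word.lower()]))
    return hits
```

Each hit still needs a human look, because once in a while the author really did mean “bam”.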

I was pleased to see that italics were properly interpreted during the scanning and OCR process.

My initial review took about 4 hours per book.  Then I’d do a complete edit, reading the entire book, which took about two days.  I’m not a particularly fast reader, and I usually found errors in the original hardcopy book.

After my review, I turned each book over to my wife, who spent 1 ½ days reviewing.  So in total, we spent about 4 days of effort to convert each hardcopy book into a digital version.  Plus $12 for scanning.  This excludes the final formatting, front and back matter, and conversion needed before uploading to the digital platforms (e.g., Amazon).

We expect five of these books to be up for sale by mid-October 2016.  Look for the Cameron Utah series.

 

Solitaire

My husband is THE BEST. I got a new computer yesterday – my old one is dying a slow, painful death and needed to be put out of her misery. I love the new one, except for one thing. It didn’t have the old Microsoft version of Solitaire. The version on the new computer was just wrong.


Solitaire is the mindless thing I do to get ready to write. Also the mindless thing I do when I’m stuck on a plot twist or a character. So it’s crucial to my writing process.


My wonderful husband went into my old computer, found the file, and somehow managed to get it on the new computer. It took him a long time. But I’m now a very happy writer.

Favicon

A favicon is a small file (16 x 16 pixels) used to enhance the website URL shown at the top of the browser. Since space is limited, authors often use their initials (e.g., Stephen King, Courtney Milan).  This website uses my wife’s initials (MW).  Not all browsers will show the small icon (e.g., Android Chrome does not).

Many WordPress websites overlook this simple but useful branding technique.  And it’s easy to implement.

Create the favicon.ico file using software or a free online tool.   I used http://www.favicon.cc/.  After downloading the icon file to my PC, I copied it to our website (using FileZilla’s ftp), placing it in the public_html folder.

That’s all there is to it.  At first the icon didn’t appear, but I typed http://margaretwatson.com/favicon.ico in the browser address bar, pressed F5 to refresh, and then it worked.
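The check I did by hand in the browser can also be scripted.  This is a rough sketch using only Python’s standard library.  It assumes the icon sits at the root of the site, which is where a file copied to public_html ended up for us, and the function names are made up.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def favicon_url(site_url):
    """Return the root-level favicon address for any page on the site."""
    return urljoin(site_url, "/favicon.ico")


def favicon_is_served(site_url, timeout=10):
    """Return True if the site answers a request for its favicon with 200."""
    try:
        with urlopen(favicon_url(site_url), timeout=timeout) as response:
            return response.status == 200
    except URLError:
        # Covers missing files (HTTP errors) as well as connection problems.
        return False
```

A True result only tells you the file is being served.  The browser may still show an old cached icon until you refresh.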

Technical Topics

I’m the author’s assistant (aka husband).  My wife’s a great writer, but she hates blogging.  She’s finishing up the fifth book (Cover Me) in her Donovan Family series.  It’ll be out in April.  While she’s focusing on that, I’ll get her blog started.

I retired recently and offered to help with her writing business — getting into bed with her, so to speak.  There’s much to learn.  My accounting background didn’t help much with the skills we needed —  website development, book editing, publishing, book promotions, etc.

We’ve learned a great deal, but still feel like beginners in many areas.  When we get stuck, we google things.  I’m thankful to all the people who took time to share their knowledge and post things on the internet.  We’re also grateful to Novelists Inc (NINC) for the valuable information shared at their conferences.

We’d like to give back to the community.  Even though we’re novices, we can still share what we’ve learned.  So over the next year, I’ll document my findings on various subjects (examples below).  When I’m done, the information will be obsolete, and I can start over.

Book Formatting

  • Section breaks vs. page breaks
  • Styles
  • Font and size
  • Converting MS Word to ePub and Mobi formats
  • Calibre software
  • Atlantis software
  • Backmatter
  • TOC
  • Scene changes
  • Start bookmark
  • Hard tabs vs. styles

Book Covers

Uploading Books

  • iBooks Author vs. iProducer
  • ISBNs and Bowker
  • Amzn, Apple, Kobo, Nook, Google

Print on Demand (CreateSpace)

Book Pricing

Promotions

  • KDP Select
  • BookBub
  • LibraryThing
  • FreeBooksy
  • First in Series Free
  • Facebook
  • Newsletters (MailChimp)
  • Giveaways (Rafflecopter)
  • Book reviews

Website

  • Content management system (CMS) – WordPress
  • Hosting (BlueHost)
  • SQL to update WordPress database
  • MySQL Workbench
  • Favicon
  • cPanel
  • Caching
  • Templates
  • FTP (FileZilla)
  • Plugins
  • Affiliates program
  • Search engine optimization (SEO)
  • Analytics (Google Analytics)

Social Media

Automation of sales data collection (Amzn, Apple, Kobo, Nook)