PDF to EPUB Conversion Tools Score Low on Accuracy in VIGC Test

June 6, 2012
TURNHOUT, BELGIUM—June 6, 2012—Publishers wanting to convert their libraries of books to eBooks may find it problematic, given a test run by the Flemish Innovation Center for Graphic Communication (VIGC) that found tools for converting PDFs to EPUB files scored an average of 30 percent accuracy, with the lowest-performing tool scoring just 10 percent. And even more significant: the four EPUB validation tools that VIGC used produced different results.
“There’s no doubt that the popularity of eBooks is soaring,” says Eddy Hagen, general manager at VIGC. “This trend leaves publishers facing a big challenge—they have to convert their whole back catalog to make them available as eBooks.
“An obvious approach is to take the print-ready PDF files and convert them to EPUB files, the standard file format for eBooks. For printers, this presents a potential new service they can offer their customers. On the internet you can find a lot of tools for converting PDFs to EPUB files. Unfortunately, however, it’s not that straightforward.”
Conversion tools struggle to make the grade
The VIGC assessed 13 tools. The test began with a perfect PDF/X-4 file, which each tool converted to an EPUB file. The resulting EPUB files were then validated with four different validation tools to check whether they conformed to the EPUB specifications. Next, all files were checked visually in five different EPUB viewers. In some cases there was a final conversion from EPUB to Amazon Kindle or Apple iBooks format.
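Because the four validators did not always agree, a publisher repeating this kind of test may want to flag the files on which the verdicts differ. The following is a minimal sketch in Python, not VIGC's method; the file names, validator names, and pass/fail values are invented placeholders, not results from the test.

```python
from typing import Dict

# Maps an EPUB file name to each validator's verdict (True = reported valid).
Verdicts = Dict[str, Dict[str, bool]]


def disagreements(results: Verdicts) -> Verdicts:
    """Return only the files on which the validators did not all agree."""
    return {
        name: verdicts
        for name, verdicts in results.items()
        if len(set(verdicts.values())) > 1  # both True and False are present
    }


# Hypothetical verdicts for illustration only; not data from the VIGC test.
results: Verdicts = {
    "tool_a.epub": {"validator_1": True, "validator_2": True,
                    "validator_3": True, "validator_4": True},
    "tool_b.epub": {"validator_1": True, "validator_2": False,
                    "validator_3": True, "validator_4": False},
}

for name in disagreements(results):
    print(f"Validators disagree on {name}")
```

A file that appears in this list deserves a manual check, since passing one validator evidently does not guarantee passing another.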
“We used a very challenging print-ready test file that contained text and images,” explains Hagen. “And from a typographic point of view, we added all kinds of tricks. Eventually we ended up with a book covering nearly 30 pages. In total, we tested 65 different elements—from a simple italic, to OpenType functions like ligatures, through mathematical formulas. We didn’t expect any tool to register a perfect score, but at the same time we didn’t expect some tools to score as low as 10 percent.”
Ligatures prove a challenge
An important issue highlighted by the VIGC test is the difficulty in converting ligatures. A ligature is a “combined glyph”—two or sometimes three letters that have been joined together for aesthetic reasons.

“A good example is the combination of the letters ‘f’ and ‘i’,” continues Hagen. “In the text we wrote, there was the word ‘profiles’, where the ‘f’ and ‘i’ were replaced by one glyph, the ligature. But in some EPUB files we found ‘profles’ instead of ‘profiles’: the ‘i’ had been dropped.
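The VIGC statement does not say why the letter was lost. One plausible cause, offered here as an assumption, is that the text extracted from the PDF contains a single Unicode ligature character (such as U+FB01, the "fi" ligature) that a conversion tool fails to map back to its component letters. Where that is the case, the ligature can be expanded before the EPUB is written. The sketch below uses Python's standard `unicodedata` module; the function name `expand_ligatures` is illustrative and does not come from any of the tested tools.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Replace Latin ligature code points (U+FB00..U+FB06) with plain letters."""
    expanded = []
    for ch in text:
        if "\ufb00" <= ch <= "\ufb06":
            # NFKC applies the compatibility decomposition, e.g. U+FB01 -> "fi".
            expanded.append(unicodedata.normalize("NFKC", ch))
        else:
            # Leave every other character, including accented letters, untouched.
            expanded.append(ch)
    return "".join(expanded)


# "profiles" as it might come out of a PDF text layer, with a one-glyph "fi".
extracted = "pro\ufb01les"
print(expand_ligatures(extracted))
```

This only helps when the ligature glyph carries a Unicode mapping in the PDF. If the font maps the glyph to nothing, or to the wrong character, the missing letter cannot be recovered by normalization and the text has to be checked against the original.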


