The Center for Education and Research in Information Assurance and Security (CERIAS)

Looking for Trustworthy Alternatives to Adobe PDFs

There was a day when PDFs were the safe, portable alternative to Microsoft Word documents.  There was no chance of macro-virus infections, and emails to Spaf with PDFs didn’t bounce back as they did if you sent him a Word document.  It became clear that PDFs adopted mixed loyalties by locking features down and phoning home.  Embedded content caused security issues in PDF viewers (CVE-2007-0047, CVE-2007-0046, CVE-2007-0045, CVE-2005-1306, CVE-2004-1598, CVE-2004-0194, CVE-2003-0434) including a virus using JavaScript as a distribution vector (CVE-2003-0284).  Can you call safe a document viewer that stands in such company as Skype, Mozilla Firefox, Thunderbird, Netscape Navigator, Microsoft Outlook, and Microsoft Outlook Express [1] with a CVSS score above 9 (CVE-2007-5020)?  How about PDFs that can dynamically retrieve Yahoo ads over the internet [2], whereas Yahoo has recently been tricked into distributing trojans in advertisements [3]?  Fully functional PDF viewers are now about as safe and loyal (under your control) as your web browser with full scripting enabled.  That may be good enough for some people, but clearly falls short for risk-averse industries.  It is not enough to fix vulnerabilities quickly;  people saying that there’s no bug-free software are also missing the point.  The point is that it is desirable to have a conservative but functional enough document viewer that does not have a bullseye painted on it by attempting to do too much and be everything to everyone.  This can be stated succinctly as “avoid unnecessary complexity” and “be loyal to the computer owner”.

Whereas it might be possible to use a PDF viewer with limited functionality and not supporting attack vectors, the format has become tainted—in the future more and more people will require you to be able to read their flashy PDF just as some webmasters now deny you access if you don’t have JavaScript enabled.  Adobe has patents on PDF and is intent on keeping control and conformance to specifications;  Apple’s MacOS X PDF viewer (“Preview”) initially allowed printing of secured PDFs to unsecured PDFs [4].  That was quickly fixed, for obvious reasons.  This is as it should be, but it highlights that you are not free to make just any application that manipulates PDFs.
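To make the "attack vectors" above more concrete, here is a minimal sketch of how one might screen a PDF for the features discussed (scripting, automatic actions, external links) before opening it. It is only a heuristic under stated assumptions: the list of names is an illustrative selection rather than a complete inventory, and the function names are hypothetical. A PDF can also hide these names inside compressed streams or write them with hex escapes, so finding nothing is not proof that a file is passive.

```python
import re

# Name objects commonly associated with active or network-facing PDF features.
# This list is an illustrative assumption, not an exhaustive inventory.
SUSPECT_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/URI"]


def count_active_markers(data: bytes) -> dict:
    """Count raw occurrences of each suspect name in the PDF bytes."""
    counts = {}
    for name in SUSPECT_NAMES:
        # Require a delimiter after the name so "/JS" does not match "/JSomething".
        pattern = re.escape(name) + rb"(?=[\s/<\[(>\]]|$)"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


def looks_passive(data: bytes) -> bool:
    """True when none of the suspect names appear in the uncompressed bytes."""
    return not any(count_active_markers(data).values())
```

A caller would read the file in binary mode and pass the bytes in, for example `looks_passive(open("report.pdf", "rb").read())`, where `report.pdf` is a placeholder name.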

Last year Adobe forced Microsoft to pull PDF creation support from Office 2007 under the threat of a lawsuit while asking them to “charge more” for Office [5].  What stops Adobe from interfering with OpenOffice?  In January 2007 Adobe released the full PDF (Portable Document Format) specification to make PDF an ISO standard [6].  People believe: “Anyone may create applications that read and write PDF files without having to pay royalties to Adobe Systems”, but that’s not quite true.  These applications must conform to the specification as decided by Adobe.  Applications that are too permissive or somehow irk Adobe could possibly be made illegal, including open source ones, at least in the US.  It is unclear how much control Adobe still has (obviously enough for the Yahoo deal) and will still have when and if it becomes an ISO standard.  Being an ISO standard does not make PDFs necessarily compatible with free software.  If part of the point of free software is to be able to change it so that it is fully loyal to you, then isn’t it a contradiction for free software to implement standards that mandate and enforce mixed loyalties?

Finally, my purchase of the full version of Adobe Acrobat for MacOS X was a usability disaster;  you’ll need to apply duress to make me use Acrobat again.  I say it’s time to move on to safer ground, from security, legal, and code quality perspectives, ISO standard or not. 

How then can we safely transmit and receive documents that are more than plain text?  HTML, PostScript, and rich text (RTF) are alternatives that have fallen into disuse in favor of PDF for various reasons which I will not analyze here.  Two alternatives seemed promising:  DVI files and Microsoft XPS, but a bit of research shows that they both have significant shortcomings.

TeX (DVI): TeX is a typesetting system, used to produce DVI (device-independent file format) files.  TeX is used mostly in academia, by computer scientists, mathematicians or UNIX enthusiasts.  There are many TeX editors with various levels of sophistication; for example OpenOffice can export documents to .tex files, so you can use even a common WYSIWYG text editor.  TeX files can be created and managed on Windows [7], MacOS X and Linux.  TeX files do not include images but have tags referencing them as separate files;  you have to manage them separately.  Windows has DVI viewers, such as YAP and DVIWIN.

However, in my tests OpenOffice lost references to embedded images, producing TeX tags containing errors (”[Warning: Image not found]”).  The PDF export on the same file worked perfectly.  Even if the TeX export worked, you would still have a bunch of files instead of a single document to send.  You then need to produce a DVI file in a second step, using some other program. 
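Because the images travel as separate files, it can help to check a .tex file for references that no longer resolve before sending the bundle. The following is a rough sketch, assuming the document uses the `\includegraphics` command from the graphicx package; other inclusion macros, commented-out lines, and file names written without an extension are not handled, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Matches \includegraphics{file} and \includegraphics[options]{file}.
INCLUDE_RE = re.compile(r"\\includegraphics\*?\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}")


def referenced_images(tex_source: str) -> list:
    """Return the image paths named in \\includegraphics commands, in order."""
    return [m.strip() for m in INCLUDE_RE.findall(tex_source)]


def missing_images(tex_path: str) -> list:
    """List referenced images that do not exist relative to the .tex file."""
    tex = Path(tex_path)
    source = tex.read_text(encoding="utf-8", errors="replace")
    return [name for name in referenced_images(source)
            if not (tex.parent / name).exists()]
```

An empty list from `missing_images` only means that every referenced path exists on disk; it says nothing about whether the images are in a format the TeX toolchain accepts.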

Even if OpenOffice’s support of DVI was better, there are other problems.  I have found many downloadable DVI documents that could not be displayed in Ubuntu, using “evince”;  they produced the error “Unable to open document—DVI document has incorrect format”.  After installing the “advi” program (which may have installed some fonts as well), some became viewable both using evince and advi.  DVI files do not support embedded fonts;  if the end user does not have the correct fonts your document will not be displayed properly. 
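When a viewer rejects a DVI file, a quick look at the file's preamble can help distinguish a truncated or mislabeled download from a font or viewer problem. The sketch below assumes the standard DVI layout (a `pre` opcode of 247, an identification byte of 2, three 4-byte big-endian values for numerator, denominator and magnification, then a length-prefixed comment). It does not reproduce whatever check evince performs, and the function name is hypothetical.

```python
import struct

DVI_PRE = 247      # opcode that must open every DVI file
DVI_VERSION = 2    # identification byte written by standard TeX


def parse_dvi_preamble(data: bytes) -> dict:
    """Parse the DVI preamble; raise ValueError if it is malformed."""
    if len(data) < 15:
        raise ValueError("too short to hold a DVI preamble")
    # Layout: opcode, id byte, num, den, mag (big-endian), comment length.
    opcode, version, num, den, mag, comment_len = struct.unpack(">BBIIIB", data[:15])
    if opcode != DVI_PRE:
        raise ValueError(f"first byte is {opcode}, expected {DVI_PRE}")
    if version != DVI_VERSION:
        raise ValueError(f"unsupported DVI id byte {version}")
    comment = data[15:15 + comment_len]
    if len(comment) < comment_len:
        raise ValueError("preamble comment is truncated")
    return {"num": num, "den": den, "mag": mag,
            "comment": comment.decode("latin-1")}
```

A file that passes this check can still fail to display, for instance because the fonts it names are not installed, which matches the behavior described above.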

Another issue is that of orphaned images.  Images are missing from DVI downloads such as this one;  at some point they were available as a separate download, but aren’t anymore.  This is a significant shortcoming, which is side-stepped by converting DVI documents to PDF;  however this defeats our purpose.

Microsoft XPS: XPS (XML Paper Specification) documents embed all the fonts used, so XPS documents will behave more predictably than DVI ones.  XPS also has the advantage that

“it is a safe format. Unlike Word documents and PDF files, which can contain macros and JavaScript respectively, XPS files are fixed and do not support any embedded code. The inability to make documents that can literally change their own content makes this a preferable archive format for industries where regulation and compliance is a way of life” [8].

Despite being an open specification, there is no support for it yet in Linux.  Visiting Microsoft’s XPS web site and clicking on the “get an XPS viewer” link results in the message “This OS is not supported”.
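Even without a viewer, an XPS file can at least be inspected on any platform, because the package is an ordinary ZIP archive containing XML parts and font resources. The sketch below groups the entries by role, assuming the usual part extensions (`.fpage` for fixed pages, `.odttf` or `.ttf` for embedded fonts); the function name and grouping are illustrative choices, not part of the specification.

```python
import zipfile


def list_xps_parts(source) -> dict:
    """Group the parts of an XPS package (path or file-like object) by role."""
    parts = {"pages": [], "fonts": [], "other": []}
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            lower = name.lower()
            if lower.endswith(".fpage"):
                parts["pages"].append(name)      # one part per fixed page
            elif lower.endswith((".odttf", ".ttf")):
                parts["fonts"].append(name)      # embedded font resources
            else:
                parts["other"].append(name)      # relationships, images, etc.
    return parts
```

Listing the parts is far from rendering them, so this does not replace a viewer; it only shows that the embedded fonts and pages are reachable with standard tools.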

It seems, however, that Microsoft may be just as intent on keeping control of XPS as Adobe for PDFs;  the “community promise for XPS” contains an implicit threat should your software not comply “with all of the required parts of the mandatory provisions of the XPS Document Format” [9].  These attached strings negate some advantages that XPS might have had over PDFs.

XPS must become supported on alternative operating systems such as Linux and BSDs, for it to become competitive.  This may not happen simply because Microsoft is actively antagonizing Linux and open source developers with vague and threatening patent claims, as well as people interested in open standards with shady lobbying moves and “voting operations” [10] at standards organizations (Microsoft: you need public support and goodwill for XPS to “win” this one).  The advantages of XPS may also not be evident to users comfortable in a world of TeX, PostScript, and no-charge PDF tools.  The confusion about open formats vs open standards and exactly how much control Adobe still has and will still have when and if PDF becomes an ISO standard does not help.  Companies offering XPS products are also limiting their possibilities by not offering Linux versions, at least of the viewers, even without support. 

In conclusion, PDF viewers have become risky examples of mixed loyalty software.  It is my personal opinion that risk-averse industries and free software enthusiasts should steer clear of the PDF standard, but there are currently no practical replacements.  XPS faces extreme adoption problems, not simply due to the PDF installed base, but also due to the ill will generated by Microsoft’s tactics.  I wish that DVI was enhanced with included fonts and images, better portability, and better integration within tools like OpenOffice, and that this became an often requested feature for the OpenOffice folks.  I don’t expect DVI handlers to be absolutely perfect (e.g., CVE-2002-0836), but the reduced feature set and absence of certain attack vectors should mean less complexity, fewer risks and greater loyalty to the computer owner.

1. ISS, Multiple vendor products URI handling command execution, October 2007.  http://www.iss.net/threats/276.html

2. Robert Daniel, Adobe-Yahoo plan places ads on PDF documents, November 2007.  http://www.marketwatch.com/news/story/adobe-yahoo-partner-place-ads/story.aspx?guid=%7B903F1845-0B05-4741-8633-C6D72EE11F9A%7D

3. Bogdan Popa, Yahoo Infects Users’ Computers with Trojans - Using a simple advert distributed by Right Media, September 2007.  http://news.softpedia.com/news/Yahoo-Infects-Users-039-Computers-With-Trojans-65202.shtml

4. Kurt Foss, Web site editor illustrates how Mac OS X can circumvent PDF security, March 2002.  http://www.planetpdf.com/mainpage.asp?webpageid=1976

5. Nate Mook, Microsoft to Drop PDF Support in Office, June 2006.  http://www.betanews.com/article/Microsoft_to_Drop_PDF_Support_in_Office/1149284222

6. Adobe Press release, Adobe to Release PDF for Industry Standardization, January 2007.  http://www.adobe.com/aboutadobe/pressroom/pressreleases/200701/012907OpenPDFAIIM.html

7. Eric Schechter, Free TeX software available for Windows computers, November 2007.  http://www.math.vanderbilt.edu/~schectex/wincd/list_tex.htm

8. Jonathan Allen, The wide ranging impact of the XML Paper Specification, November 2006.  http://www.infoq.com/news/2006/11/XPS-Released

9. Microsoft, Community Promise for XPS, January 2007.  http://www.microsoft.com/whdc/xps/xpscommunitypromise.mspx

10. Kim Haverblad, Microsoft buys the Swedish vote on OOXML, August 2007.  http://www.os2world.com/content/view/14868/1/

Comments

Posted by nixps
on Monday, December 3, 2007 at 11:40 PM

I think you’re a bit harsh on Microsoft about XPS.

Microsoft communicates very clearly that XPS is intended to be an open standard, and actively promotes this. I can tell you first hand as a third party XPS tool developer that Microsoft actively supports our efforts.

The specifications are also open, and free for all to implement, as we are doing on Windows and Mac OS X.

And lastly: ECMA TC46 (http://www.ecma-international.org/memento/TC46.htm). Microsoft has submitted the XPS spec for standardization with ECMA, which again illustrates the ‘open’/‘non-proprietary’ intentions Microsoft has with XPS.

Posted by Peter Hesse
on Tuesday, December 4, 2007 at 03:32 AM

A few comments on your post:
- You wrote “Then, it became clear that PDFs adopted mixed loyalties and were disloyal to the computer owner by locking features down and phoning home.”  Locking features down and phoning home, are not functions of the PDF standard, but rather of the Adobe Reader.  There are other reader programs out there which do not lock features, phone home, nor support javascript or downloads of ads.  Check here: http://en.wikipedia.org/wiki/List_of_PDF_software

- You wrote “Last year Adobe forced Microsoft to pull PDF creation support from Office 2007 under the threat of a lawsuit while asking them to “charge more” for Office”.  This is somewhat true, as Microsoft does not include PDF creation in the base installation, but it is still available as an add-on from Microsoft for free: http://www.microsoft.com/downloads/details.aspx?FamilyId=4D951911-3E7E-4AE6-B059-A2E79ED87041

- You wrote “Whereas it might be possible to use a PDF viewer with limited functionality and not supporting attack vectors, the format has become tainted”.  Again these are all optional portions of the PDF standard.  There is a separate stripped-down PDF standard called PDF/A which removes all the “active” portions of PDFs and the result is a better archival format.  More info at http://www.pdfa.org or http://en.wikipedia.org/wiki/PDF/A

Overall I agree with your concerns, but I think that the community should be pushing for PDF/A to become the document standard of choice (as it is compatible with the widely used PDF) and to create or buy reader applications which only support the PDF/A standard.

Posted by Andy
on Tuesday, December 4, 2007 at 03:47 AM

Microsoft’s viewer is a non-starter for other reasons.  Because “XPS documents support digital signatures and information rights management of the contents” the threat you quoted is actually stronger in the following paragraphs and has real teeth.  What’s funny is that even the specification is delivered as an executable for Windows platforms.

Posted by Pascal Meunier
on Tuesday, December 4, 2007 at 07:46 AM

@nixps, Thanks, that’s nice to know.  However, as long as Microsoft retains control of the XPS specification and the license terms enforce compliance, tomorrow Microsoft can change it to something much less appealing, and you’ll have to comply.  That’s a significant business risk.

Posted by Pascal Meunier
on Tuesday, December 4, 2007 at 08:04 AM

Peter,
Thank you very much for your comments.  I was unaware of the PDF/A-1 ISO 19005-1 standard.  It does seem to address my concerns satisfactorily, from what you say.  In that case, it deserves more publicity and awareness, because I never saw a mention or hint of it anywhere while researching my post.

Posted by Pascal Meunier
on Tuesday, December 4, 2007 at 08:29 AM

@Andy, good point, thanks.

Posted by stacy
on Tuesday, December 4, 2007 at 12:44 PM

djvu?

http://djvu.sourceforge.net/

Unfortunately there is no direct export from OpenOffice to djvu, but you can convert PDF to djvu.

Posted by MS XPS poised to take over Adobe PDF? No. | Rick T
on Friday, December 14, 2007 at 08:29 PM

[...] Windows, has no Linux version yet, and costs $99 to view on a Mac box, or $399 to write. Despite the problems of PDF files, in this field it’s still the dominant player and will be for a long time. After all [...]

Posted by Antonio
on Tuesday, December 18, 2007 at 06:51 AM

[ Posted previously in http://slashdot.org/comments.pl?sid=177969&threshold=5&mode=nested&commentsort=0&op=Change ]

PDF is not Adobe?

Tell that to Dmitry Sklyarov.

Like it or not, to download the PDF spec, you have to agree not to “violate” the DRM, among other things. Of course, you could try to clean room reverse engineer it, but that would kill the portable part fairly quickly, since the DMCA would most likely cover “circumventing DRM” even in a clean room implementation.

De facto, PDF == Adobe.

Also, PDFs are not made to simply represent the print layout. While that is their most beneficial feature, PDF does a lot more. It provides bookmark navigation and can be used to reformat the document to different page sizes when the document is properly generated.

As for “read only”, well, I’ve been paid hourly to modify a PDF’d contract prior to signing (which was perfectly legal and delightfully unexpected by the other party). One of the happiest moments in my life was removing the section that said the contract was void if it was modified. It was an eye-opening and kind of surreal moment. It was also the first time I ever heard a lawyer giggle…

From a technical perspective (having tried to manually work with PDF at a file level) it’s horrible. The format more closely resembles FAT than PostScript (contrary to popular belief; I am painfully serious about this). It’s broken into blocks with a weird allocation table. Originally, it appears the idea was to make it editable (although “editing” a PDF in anything is pretty painful). As such, even though I don’t currently recommend much other than PDF for my customers, I don’t feel very much love towards it either.

In the spirit of offering solutions instead of only complaints, I like SVG quite a bit, SVG-P (standard with SVG 2.0) more, and actually find XSL-FO the easiest to work with. I currently crank out a few invoices per month and some financial reports with XSL-FO and FOP. Even though they end up in PDF, I really wish XSL-FO was the de facto standard instead of PDF…
