The document below explains how to decrypt password-protected Open Document Format (ODF) files using Java.
In particular, it works around an issue with the standard Java
crypto library's PBEKeySpec, which requires a char[]-based password
that it treats as UTF-8 text. Alas, ODF documents are encrypted
by first digesting the user-entered password with SHA1, which produces an
all-binary password that cannot be handed over as text. That makes it
impossible to use only the standard Java crypto library for this process.
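To illustrate the problem, here is a minimal sketch of a PBKDF2 (HMAC-SHA1) routine that accepts the password as raw bytes, built only on javax.crypto.Mac. It is not Matthias Gärtner's class; the class name BinaryPasswordKdf, the all-zero salt, the iteration count of 1024 and the 16-byte key length are assumptions made for illustration. For a real document these values come from its manifest.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class BinaryPasswordKdf {

    /** SHA-1 digest of the UTF-8 password: 20 arbitrary bytes, not text. */
    static byte[] sha1StartKey(String password) throws GeneralSecurityException {
        return MessageDigest.getInstance("SHA-1")
                .digest(password.getBytes(StandardCharsets.UTF_8));
    }

    /** PBKDF2 as described in RFC 2898, using HMAC-SHA1 and a byte[] password. */
    static byte[] pbkdf2HmacSha1(byte[] password, byte[] salt, int iterations, int keyLength)
            throws GeneralSecurityException {
        Mac hmac = Mac.getInstance("HmacSHA1");
        hmac.init(new SecretKeySpec(password, "HmacSHA1"));
        int hLen = hmac.getMacLength();
        byte[] derived = new byte[keyLength];
        int blocks = (keyLength + hLen - 1) / hLen;
        for (int block = 1; block <= blocks; block++) {
            // U1 = HMAC(password, salt || INT_32_BE(block))
            hmac.update(salt);
            hmac.update(new byte[] {
                (byte) (block >>> 24), (byte) (block >>> 16),
                (byte) (block >>> 8), (byte) block });
            byte[] u = hmac.doFinal();
            byte[] t = u.clone();
            // Ui = HMAC(password, U(i-1)); T = U1 xor U2 xor ... xor Uc
            for (int i = 1; i < iterations; i++) {
                u = hmac.doFinal(u);
                for (int j = 0; j < t.length; j++) {
                    t[j] ^= u[j];
                }
            }
            int offset = (block - 1) * hLen;
            System.arraycopy(t, 0, derived, offset, Math.min(hLen, keyLength - offset));
        }
        return derived;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] startKey = sha1StartKey("secret");   // binary "password"
        byte[] salt = new byte[16];                  // placeholder; assumed to come from the manifest
        byte[] key = pbkdf2HmacSha1(startKey, salt, 1024, 16);
        System.out.println("derived key: " + toHex(key));
    }
}
```

Because the HMAC key is supplied as a byte array, the digest never has to be squeezed through a char[], which is exactly the step PBEKeySpec cannot skip.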
Our document provides a version of Matthias Gärtner's PBKDF2 class
and everything else you need to know to decrypt ODF documents.
Thanks also go to Steven Elliot for an excellent analysis of the
processes and caveats involved, not to mention his oodecr software.
NOTE: Beginning with the OASIS Open Document Format specification v1.2, the use of AES rather than Blowfish is encouraged for encrypting documents. Following this recommendation, LibreOffice 3.5+ actually forces encryption of documents using AES. As of April 11, 2012, this document includes information pertaining to the relevant changes introduced by ODF v1.2.
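To make the cipher step concrete, here is a hedged sketch of the older Blowfish path only. It assumes that an encrypted entry was deflated (raw, without zlib header) and then encrypted with Blowfish in CFB mode, and that the key and initialization vector are already known. The class name BlowfishEntrySketch and the sample key, IV and payload are made up for illustration, and the round trip below only shows that the pieces fit together; it does not read a real ODF package. The AES variant would need a different transformation and key size and is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class BlowfishEntrySketch {

    /** Runs Blowfish/CFB in the given mode (Cipher.ENCRYPT_MODE or DECRYPT_MODE). */
    static byte[] blowfishCfb(int mode, byte[] key, byte[] iv, byte[] data)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("Blowfish/CFB/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "Blowfish"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    /** Inflates a raw deflate stream (no zlib header or trailer). */
    static byte[] inflateRaw(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater(true);
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                // Typical symptom of a wrong password: the "plaintext" is not deflate data.
                throw new DataFormatException("truncated or corrupt deflate stream");
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    /** Raw deflate, used here only to build sample input for the round trip. */
    static byte[] deflateRaw(byte[] plain) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(plain);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];   // placeholder for the 128-bit derived key
        byte[] iv = new byte[8];     // placeholder; Blowfish has an 8-byte block
        byte[] xml = "<office:document-content/>".getBytes(StandardCharsets.UTF_8);

        byte[] stored = blowfishCfb(Cipher.ENCRYPT_MODE, key, iv, deflateRaw(xml));
        byte[] recovered = inflateRaw(blowfishCfb(Cipher.DECRYPT_MODE, key, iv, stored));
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```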
Description: How to decrypt password protected Open Document Format (ODF) files (such as used by LibreOffice and OpenOffice.org) using only pure Java code.
Version: 2012.04.11
Documentation: Decrypting ODF Files.odt (28.8KiB), Decrypting ODF Files.pdf (260.0KiB)
If you have lost the password for an encrypted document, there is virtually nothing that you can do to recover it: ODF documents are encrypted with a strong cryptographic cipher (128-bit Blowfish or 256-bit AES). This is not just "secret decoder ring" junk, but serious crypto, the kind that on average requires thousands of years of super-computer time to break. While there are known issues with how the ODF specification applies cryptographic algorithms, these attack vectors are generally not exploitable, except by security experts (and no, I'm nothing of the sort!). To address this weakness, recent versions of LibreOffice insert random(?) data in an XML comment at the front of encrypted documents. This reduces the severity of the problem but does not eliminate it (there remain the XML header and comment introducer, 43 bytes in total).
So, that having been said, you have two options to recover your document, and only the first is likely to be of any value to mere mortals:
If you do want to give #1 a try (and why not, you have no other option), then you might find our ODFind tool useful: Its purpose is to search for text in OpenDocument files; when it encounters an encrypted document, it will prompt you for a password (over and over again if you don't give it one that works). Nothing can magically recover an encrypted document for you, but at least ODFind makes it as easy/fast as possible to type passwords over and over again as you try a bunch of them.
The Ringlord Technologies ODF Java Library is now available. It provides an implementation of the principles laid out in this document, is capable of parsing an ODF container's XML manifest, and provides all you need to build a tool to extract files from an ODF container. Note: For encrypted (password protected) documents you must have the password.
Also of interest to programmers may be the Apache ODF Toolkit. It doesn't decrypt anything, but it does provide a means to get at a document's metadata (and everything else, too). If you don't want to roll your own XML parsing to retrieve the salt, initialization vector, etc. for an encrypted document, then have a look at the Apache Foundation's ODF Toolkit!
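If you do want to roll your own, the following sketch shows one way to pull those values out of a manifest with nothing but the JDK's DOM parser. The element and attribute names follow the ODF manifest schema as I understand it. The class name ManifestSketch, the EncryptionParams holder and the sample manifest (with all-zero placeholder values) are assumptions for illustration, so treat this as a starting point rather than a complete reader.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ManifestSketch {
    static final String NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0";

    /** Encryption parameters recorded for one encrypted package entry. */
    static final class EncryptionParams {
        String algorithm;
        byte[] iv;
        byte[] salt;
        int iterationCount;
    }

    /** Returns the parameters for entryPath, or null if it is absent or not encrypted. */
    static EncryptionParams read(String manifestXml, String entryPath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Older manifests may carry a DOCTYPE; do not try to fetch the DTD.
        factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(manifestXml.getBytes(StandardCharsets.UTF_8)));
        NodeList entries = doc.getElementsByTagNameNS(NS, "file-entry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            if (!entryPath.equals(entry.getAttributeNS(NS, "full-path"))) {
                continue;
            }
            Element algorithm = first(entry, "algorithm");
            Element derivation = first(entry, "key-derivation");
            if (algorithm == null || derivation == null) {
                return null; // entry exists but carries no encryption data
            }
            Base64.Decoder b64 = Base64.getDecoder();
            EncryptionParams p = new EncryptionParams();
            p.algorithm = algorithm.getAttributeNS(NS, "algorithm-name");
            p.iv = b64.decode(algorithm.getAttributeNS(NS, "initialisation-vector"));
            p.salt = b64.decode(derivation.getAttributeNS(NS, "salt"));
            p.iterationCount = Integer.parseInt(derivation.getAttributeNS(NS, "iteration-count"));
            return p;
        }
        return null;
    }

    static Element first(Element parent, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS(NS, localName);
        return nodes.getLength() == 0 ? null : (Element) nodes.item(0);
    }

    /** Builds a made-up manifest with placeholder (all-zero) values. */
    static String sampleManifest() {
        Base64.Encoder b64 = Base64.getEncoder();
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<manifest:manifest xmlns:manifest=\"" + NS + "\">"
            + "<manifest:file-entry manifest:media-type=\"text/xml\""
            + " manifest:full-path=\"content.xml\" manifest:size=\"1234\">"
            + "<manifest:encryption-data manifest:checksum-type=\"SHA1/1K\""
            + " manifest:checksum=\"" + b64.encodeToString(new byte[20]) + "\">"
            + "<manifest:algorithm manifest:algorithm-name=\"Blowfish CFB\""
            + " manifest:initialisation-vector=\"" + b64.encodeToString(new byte[8]) + "\"/>"
            + "<manifest:key-derivation manifest:key-derivation-name=\"PBKDF2\""
            + " manifest:salt=\"" + b64.encodeToString(new byte[16]) + "\""
            + " manifest:iteration-count=\"1024\"/>"
            + "</manifest:encryption-data></manifest:file-entry>"
            + "<manifest:file-entry manifest:media-type=\"image/png\""
            + " manifest:full-path=\"Pictures/logo.png\"/>"
            + "</manifest:manifest>";
    }

    public static void main(String[] args) throws Exception {
        EncryptionParams p = read(sampleManifest(), "content.xml");
        System.out.println(p.algorithm + ", salt bytes: " + p.salt.length
                + ", IV bytes: " + p.iv.length + ", iterations: " + p.iterationCount);
    }
}
```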
All content is copyright © Ringlord Technologies unless otherwise stated. We do encourage deep linking to our site's pages but forbid direct reference to images, software or other non-page resources stored here; likewise, do not embed our content in frames or other constructs that may mislead the reader about the content ownership. Play nice, yes?