Unpacking Android backups

One of the less known new features introduced in ICS is the ability to backup a device to a file on your computer via USB. All you have to do is enable USB debugging, connect your phone to a computer and type the adb backup command in a shell. That will show a confirmation dialog on the phone prompting you to authorize the backup and optionally specify a backup encryption password. It looks something like this:


This doesn't require rooting your phone and lets you backup application data, both user installed and system applications (APK's), as well as shared storage (SD card) contents. There are some limitations though: it won't backup apps that have explicitly forbidden backups in their manifest, it won't backup protected (with DRM) apps and it won't backup some system settings such as APN's and WiFi access points. The transfer speed is limited by ADB channel speed (less than 1MB/s), so full backups can take quite some time. There is also a rather annoying bug in 4.0.4 where it will backup shared storage even if you don't request it. With all that said, it's a very useful tool, and will hopefully see some improvements in the next Android version.

The backup command is fairly flexible and lets you specify what apps to backup, whether to include system apps when doing a full backup, and whether to include shared storage (SD card) files. Here's a summary of the available options as displayed by adb's usage:

adb backup [-f ] [-apk|-noapk] [-shared|-noshared] [-all] [-system|-nosystem] [<packages...>]

- write an archive of the device's data to <file>. If no -f option is supplied then the data
is written to "backup.ab" in the current directory.

(-apk|-noapk enable/disable backup of the .apks themselves in the archive;
the default is noapk.)
(-shared|-noshared enable/disable backup of the device's shared storage / SD card contents;
the default is noshared.)
(-all means to back up all installed applications)
(-system|-nosystem toggles whether -all automatically includes system applications;
the default is to include system apps)
(<packages...> is the list of applications to be backed up. If the -all or -shared flags
are passed, then the package list is optional. Applications explicitly given on the command
line will be included even if -nosystem would ordinarily cause them to be omitted.)

The restore command is however quite limited -- there are no options, you can only specify the path to the backup file. One of the features most noticeably lacking is conditional restore: restores are all or nothing, you cannot restore only a subset of the apps (packages), or restore only the contents of the shared storage. Supporting this will require modifying the firmware, but you can extract only the needed data from the backup and copy it manually. Copying apps and app data to your device requires root access, but extracting and copying external storage files such as pictures and music can be done on any stock ICS device. And if you create a backup file containing only the files you need to restore, you wouldn't need root access at all. This post will present the format of Android's backup files and introduce a small tool that allows you to extract and repackage them as needed.

SDK API's for using Android's backup architecture were announced as far back as Froyo (2.2), but it has probably been available internally even before that. As introduced in Froyo, it uses a proprietary Google transport to backup application settings to the "cloud". ICS adds a local transport that lets you save backups to a file on your computer as well. The actual backup is performed on the device, and is streamed to your computer using the same protocol that adb pull uses to let you save a device file locally. When you execute the adb backup command a new Java process (not an activity or service) will be started on your device and it will bind to the system's BackupManagerService and requests a backup with the parameters you specified. BackupManagerService will in turn start the confirmation activity shown above, and execute the actual backup if you confirm (some more details including code references here). You have the option of specifying an encryption password, and if your device is already encrypted you are required to enter the device encryption password to proceed. It will be used to encrypt the archive as well (you can't specify a separate backup encryption password).

After all this is done, you should have a backup file on your computer. Let's peek inside it. If you open it with your favourite editor, you will see that it starts with a few lines of text, followed by binary data. The text lines specify the backup format and encryption parameters, if you specified a password when creating it. For an unencrypted backup it looks like this:

ANDROID BACKUP
1
1
none

The first line is the file 'magic', the second the format version (currently 1), the third is a compression flag, and the last one the encryption algorithm ('none' or 'AES-256').

The actual backup data is a compressed and optionally encrypted tar file that includes a backup manifest file, followed by the application APK, if any, and app data (files, databases and shared preferences). The data is compressed using the deflate algorithm, so, in theory, you should be able to decompress an unencrypted archive with standard archive utilities, but I haven't been able to fine one compatible with Java's Deflater (Update: here's how to convert to tar using OpenSSL's zlib command: dd if=mybackup.ab bs=24 skip=1|openssl zlib -d > mybackup.tar). After the backup is uncompresed you can extract it by simply using tar xvf mybackup.tar. That will produce output similar to the following:

$ tar tvf mybackup.tar
-rw------- 1000/1000 1019 2012-06-04 16:44 apps/org.myapp/_manifest
-rw-r--r-- 1000/1000 1412208 2012-06-02 23:53 apps/org.myapp/a/org.myapp-1.apk
-rw-rw---- 10091/10091 231 2012-06-02 23:41 apps/org.myapp/f/share_history.xml
-rw-rw---- 10091/10091 0 2012-06-02 23:41 apps/org.myapp/db/myapp.db-journal
-rw-rw---- 10091/10091 5120 2012-06-02 23:41 apps/org.myapp/db/myapp.db
-rw-rw---- 10091/10091 1110 2012-06-03 01:29 apps/org.myapp/sp/org.myapp_preferences.xml

App data is stored under the app/ directory, starting with a _manifest file, the APK (if requested) in a/, app files in f/, databases in db/ and shared preferences in sp/. The manifest contains the app's version code, the platform's version code, a flag indicating whether the archive contains the app APK and finally the app's signing certificate (called 'signature' in Android API's). The BackupManagerService uses this info when restoring an app, mostly to check whether it has been signed with the same certificate as the currently installed one. If the certificates don't match it will skip installing the APK, except for system packages which might be signed with a different (manufacturer owned) certificate on different devices. Additionally, it expects the files to be in the order shown above and restore will fail if they are out for order. For example, if the manifests states that the backup includes an APK, it will try to read and install the APK first, before restoring the app's files. This makes perfect sense -- you cannot restore files for an app you don't have installed. However BackupManagerService will not search for the APK in the archive, and if it is not right after the manifest, all other files will be skipped. Unfortunately there is no indication about this in the device GUI, it is only shown as logcat warnings. If you requested external storage backup (using the -shared option), there will also be a shared/ directory in the archive as well, containing external storage files for each shared volume (usually only shared/0/ for the first/default shared volume).

If you specified an encryption password, things get a little more interesting. It will be used to generate an AES-256 key using 10000 rounds of PBKDF2 with a randomly generated 512 bit salt. This key will be then used to encrypt a randomly generated AES-256 bit master key, that is in turn used to encrypt the actual archive data in CBC mode ("AES/CBC/PKCS5Padding" in JCE speak). A master key checksum is also calculated and saved in the backup file header. All this is fairly standard practice, but the way the checksum is calculated -- not so much. The generated raw master key is converted to a Java character array by casting each byte to char, the result is treated as a password string, and run through the PBKDF2 function to effectively generate another AES key, which is used as the checksum. Needless to say, an AES key would most probably contain quite a few bytes not mappable to printable characters, and since PKCS#5 does not specify the actual encoding of a password string, this produces implementation dependent results (more on this later). The checksum is used to verify whether the user-specified decryption password is correct before actually going ahead and decrypting the backup data: after the master key is decrypted, its checksum is calculated using the method described and compared to the checksum in the archive header. If they don't match, the specified password is considered incorrect and the restore process is aborted. Here's the header format for an encrypted archive:

ANDROID BACKUP
1
1
AES-256
B9CE04167F... [user password salt in hex]
9C44216888... [master key checksum salt in hex]
10000 [number of PBKDF2 rounds]
990CB8BC5A... [user key IV in hex]
2E20FCD0BB... [master key blob in hex]

The master key blob contains the archive data encryption IV, the actual master key and its checksum, all encrypted with the key derived from the user-supplied password. The detailed format is below:

[byte] IV length = Niv
[array of Niv bytes] IV itself
[byte] master key length = Nmk
[array of Nmk bytes] master key itself
[byte] MK checksum hash length = Nck
[array of Nck bytes] master key checksum hash

Based on all this info, it should be fairly easy to write a simple utility that decrypts and decompresses Android backups, right? Porting relevant code from BackupManagerService is indeed fairly straightforward. One thing to note is that it uses SYNC_FLUSH mode for the Defalter which is only available on Java 7. Another requirement is to have the JCE unlimited strength jurisdiction policy files installed, otherwise you won't be able to use 256 bit AES keys. Running the ported code against an unencrypted archive works as expected, however trying do decrypt an archive consistently fails when checking the master key checksum. Looking into this further reveals that Android's PBKDF2 implementation, based on Bouncy Castle code, treats passwords as ASCII when converting them to a byte array. The PKCS#5 standard states that 'a password is considered to be an octet string of arbitrary length whose interpretation as a text string is unspecified', so this is not technically incorrect. However since the 'password' used when calculating the master key checksum is a randomly generated value (the AES key), it will obviously contain bytes not mappable to ASCII characters. Java SE (Oracle/Sun) seems to treat those differently (most probably as UTF-8), and thus produces a different checksum. There are two ways around this: either use a Bouncy Castle library with the Android patches applied, or implement an Android-compatible PBKDF2 function in our decryption code. Since the Android Bouncy Castle patch is quite big (more than 10,000 lines in ICS), the second option is clearly preferable. Here's how to implement it using the Bouncy Castle lower level API's:

SecretKey androidPBKDF2(char[] pwArray, byte[] salt, int rounds) {
PBEParametersGenerator generator = new PKCS5S2ParametersGenerator();
generator.init(PBEParametersGenerator.PKCS5PasswordToBytes(pwArray),
salt, rounds);
KeyParameter params = (KeyParameter)
generator.generateDerivedParameters(PBKDF2_KEY_SIZE);

return new SecretKeySpec(params.getKey(), "AES");
}

This seems to do the trick, and we can now successfully decrypt and decompress Android backups. Extracting the files is simply a matter of using tar. Looking at the archive contents allows you to extract certain files that are not usually user accessible, including app databases and APK's without rooting your phone. While this is certainly interesting, a more useful scenario would be to restore only a part of the archive by selecting only the apps you need. You can do this by deleting the ones you don't need, repacking the archive and then using adb restore with the resulting file. There are two things to watch out for when repacking though: Android expects a particular ordering of the files, and it doesn't like directory entries in the archive. If the restore process finds a directory entry, it will silently fail, and if files are out of order, some files might be skipped even though the restore activity reports success. In short, simply tarring the unpacked backup directory won't work, so make sure you specify the files to include in the proper order by creating a backup file list and passing to tar with the -T option. The easiest way to create one is to run tar tvf against the decompresed and decrypted original backup. Once you create a proper tar file, you can pack it with the provided utility and feed it to adb restore. Another thing you should be aware of is that if your device is encrypted, you need to specify the same encryption password when packing the archive. Otherwise the restore will silently fail (again, error messages are only output to logcat). Here's how to pack the archive using the provided shell script:

$ ./abe.sh pack repacked.tar repacked.ab password

Full code for the backup pack/unpack utility is on github. Keep in mind that while this code works, it has very minimal error checking and might not cover all possible backup formats. If it fails for some reason, expect a raw stack trace rather than a friendly message. Most of this code comes straight from Android's BackupManagerService.java with (intentionally) minor modifications. If you find an error, feel free to fork it and send me a pull request with the fix.