Tuesday, March 1, 2011

Android's binary XML

Hi !

During the creation of the APK file, the AndroidManifest.xml is converted into a binary XML file.

This format is not well documented, but several tools can convert it back into plain XML, such as axml2xml or AXMLPrinter. The official aapt tool can also display the manifest in a human-readable form:
$ aapt d xmltree yourapps.apk AndroidManifest.xml
However, Androguard is written in Python and I need direct access to this information. So I translated the AXMLPrinter tool from Java to Python. The latest commit adds support for Android's binary XML, along with a standalone tool if you would like to use it externally.

It's very useful for quickly getting the permissions of an application, or its different entry points (activity, service, receiver ...).

If you would like to use the API, it's very simple. The first thing to do is to import the dvm module and create an AXMLPrinter object:
import dvm

ap = dvm.AXMLPrinter( open("apks/tmp/AndroidManifest.xml", "rb").read() )
print ap.getBuff()

from xml.dom import minidom
print minidom.parseString( ap.getBuff() ).toxml()
print minidom.parseString( ap.getBuff() ).toprettyxml()
The getBuff function returns the XML buffer; you can validate it with the minidom API.
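The example above reads a manifest that has already been extracted to disk. Since an APK is a ZIP archive, the raw binary manifest can also be pulled out with the standard library. The following is a minimal sketch under that assumption; the helper name read_manifest_bytes and the in-memory demo archive are made up for illustration and are not part of Androguard.

```python
import io
import zipfile


def read_manifest_bytes(apk):
    """Return the raw (still binary) AndroidManifest.xml stored in an APK.

    `apk` may be a filesystem path or a binary file-like object.
    (Hypothetical helper, not an Androguard function.)
    """
    with zipfile.ZipFile(apk) as archive:
        return archive.read("AndroidManifest.xml")


# Self-contained demo: build a tiny stand-in archive in memory.
# The payload is placeholder bytes, not a valid binary XML document.
demo_apk = io.BytesIO()
with zipfile.ZipFile(demo_apk, "w") as archive:
    archive.writestr("AndroidManifest.xml", b"\x03\x00\x08\x00placeholder")
demo_apk.seek(0)

raw_manifest = read_manifest_bytes(demo_apk)
```

With a real APK, the returned bytes are what you would hand to dvm.AXMLPrinter in place of the open(...).read() call.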

In most cases, you have an APK file and you would like to extract useful information from its AndroidManifest.xml. To do so, create an APK object:
import dvm

a = dvm.APK( "yourapps.apk" )
print a.xml[ "AndroidManifest.xml" ].toxml()
print a.xml[ "AndroidManifest.xml" ].toprettyxml()

# Get the classes.dex and continue the analysis ....
j = dvm.DalvikVMFormat( a.get_dex() )
And you can extract application permissions, activities, services, receivers ...
import dvm

a = dvm.APK( "yourapps.apk" )
print a.get_permissions()
print a.get_activity()
print a.get_service()
print a.get_receiver()
In fact, the "get_elements" function does the job for you by querying the XML file:
def get_activity(self) :
    return self.get_elements("activity", "android:name")
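The idea behind such a query can be sketched with minidom alone. This is only an illustration, not Androguard's actual implementation: it assumes the manifest has already been decoded to text, and the free function get_elements and the sample manifest below are invented for the example.

```python
from xml.dom import minidom

# Hypothetical decoded manifest, used only for this example.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.demo">
    <uses-permission android:name="android.permission.INTERNET" />
    <application>
        <activity android:name=".Main" />
        <activity android:name=".Settings" />
        <service android:name=".Sync" />
    </application>
</manifest>"""


def get_elements(xml_text, tag_name, attribute):
    """Collect the value of `attribute` from every <tag_name> element."""
    dom = minidom.parseString(xml_text)
    return [node.getAttribute(attribute)
            for node in dom.getElementsByTagName(tag_name)]


activities = get_elements(SAMPLE_MANIFEST, "activity", "android:name")
permissions = get_elements(SAMPLE_MANIFEST, "uses-permission", "android:name")
```

Each accessor (activities, services, receivers, permissions) then only differs in the tag name it passes in.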
If you do not want to use the API, the androaxml.py program can help you :
$ ./androaxml.py -h
Usage: androaxml.py [options]

  -h, --help            show this help message and exit
  -i INPUT, --input=INPUT
                        filename input (APK or android's binary xml)
  -o OUTPUT, --output=OUTPUT
                        filename output of the xml
  -v, --version         version of the API

$ ./androaxml.py -i yourfile.apk -o output.xml
$ ./androaxml.py -i AndroidManifest.xml -o output.xml
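Since the tool accepts either an APK or a bare binary XML file, one way to tell the two apart is to look at the first bytes of the input. The sketch below is an assumption about how this could be done, not a description of what androaxml.py does: it relies on the ZIP local file header signature and on the 8-byte chunk header (type, header size, total size, little-endian) that opens an Android binary XML file. The function name detect_input_kind is hypothetical.

```python
import struct

ZIP_MAGIC = b"PK\x03\x04"      # ZIP local file header signature
RES_XML_TYPE = 0x0003          # chunk type of a binary XML document
RES_CHUNK_HEADER_SIZE = 8      # uint16 type + uint16 header size + uint32 size


def detect_input_kind(data):
    """Guess whether `data` starts like an APK, a binary XML file, or neither."""
    if data[:4] == ZIP_MAGIC:
        return "apk"
    if len(data) >= RES_CHUNK_HEADER_SIZE:
        chunk_type, header_size, _total_size = struct.unpack("<HHI", data[:8])
        if chunk_type == RES_XML_TYPE and header_size == RES_CHUNK_HEADER_SIZE:
            return "axml"
    return "unknown"
```

A plain-text XML file falls into the "unknown" case, which is a cheap way to notice that a manifest has already been decoded.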
The API must be updated to support more specific attributes; if you find a bug, please report it in the bug tracker.

I made a short video showing how to use this new feature:

See ya !


  1. I want to modify the binary XML, for example to add a permission. How can I do that? Thank you very much!

  2. Hi,

    If you would like to read/write your binary XML, you should use apktool (http://code.google.com/p/android-apktool/) :)

  3. Where can I find more information about the Binary XML format? I'm going to write a parser in JavaScript. Thanks.

    1. You can find full implementation of the parser here: https://github.com/brutall/brut.apktool/blob/master/apktool-lib/src/main/java/brut/androlib/res/decoder/AXmlResourceParser.java . It supports all features and is well tested, because I have a base of thousands of users :-)

    2. Dude - APKTOOL work was abandoned, and a lot of its functionality is broken. Speaking from experience (I've tried to use it).