28 January, 2015

Android Decompilation Testing

What’s an APK file?
Android application package (APK) is the package file format used to install application software on Google’s Android operating system. An APK is similar to an installer: the code is first compiled, and its parts are then packaged into one file. An APK file contains the program’s code (*.dex files), the manifest file, resources, assets and certificates.

An APK is simply an archive file with a set of classes and resources. The compiled Dalvik executable bytecode is stored in the classes.dex file. This file contains the compiled executable code: the classes and methods that hold the business logic of the app. One can convert the classes.dex file (which is in machine-readable format) to a JAR file, and then decompile the JAR into human-readable Java, with the help of tools available for this purpose. Relevant tools are listed in the Appendix for further reference.
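Because an APK is an ordinary ZIP archive, its contents can be listed with any ZIP library. The snippet below is a minimal sketch using Python's standard zipfile module; the function names and the entry names in the demo are illustrative assumptions, not taken from a real app.

```python
import io
import zipfile


def list_apk_entries(apk):
    """Return the entry names stored in an APK (path or file-like object)."""
    with zipfile.ZipFile(apk) as archive:
        return archive.namelist()


def find_dex_files(apk):
    """Return the .dex entries at the archive root, e.g. classes.dex."""
    return [name for name in list_apk_entries(apk)
            if "/" not in name and name.endswith(".dex")]


# Demo on an in-memory stand-in for a real APK (entry names are made up).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    for name in ("AndroidManifest.xml", "classes.dex",
                 "classes2.dex", "res/layout/main.xml"):
        demo.writestr(name, b"")

print(find_dex_files(buffer))  # ['classes.dex', 'classes2.dex']
```

With a real file, the same functions can be called with a path such as find_dex_files("app.apk"), where app.apk is a placeholder name.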

Organized Structure of APK File
The file structure of an APK file looks as below:

App-specific files such as fonts and third-party SDKs, along with their integration code, can be found in the com folder.

Manifest file
An Android app must have an “AndroidManifest.xml” file in the APK’s root folder, as seen in the file structure above. It declares the permissions the app requests when it is installed on the device, its activities and other details. There are Android apps which help to view the manifest of an installed app. The manifest file can also be parsed and read from the APK file using appropriate tools (listed in the Appendix).

Dex (Dalvik Executable)
Android apps are written in Java and executed in the DVM (Dalvik Virtual Machine). This is different from the JVM (Java Virtual Machine); the DVM is built for Android OS. The Java bytecode is transformed into DVM bytecode and stored as .dex (Dalvik Executable) and .odex (Optimized Dalvik Executable) files. The terms odex and deodex are associated with Dalvik bytecode operations.

Once the Java compiler has produced JVM bytecode, the Dalvik compiler takes the resulting .class files and recompiles them into Dalvik bytecode, which it then merges into one .dex file. During this process, class definitions, the function code executed by the target VM, method names, reference names and so on are interpreted, translated and reconstructed.

There is a limit on what a .dex file can include and how much of it: a single .dex file can reference at most 65,536 methods. Due to this, an APK can have one or more .dex files associated with it.
A hint about how close the app is to this limit can be obtained from the JAR produced when converting .dex to JAR, by viewing it in a Java decompiler. This tells us about the complexity of the app and how to plan tests across the multiple .dex files associated with the app for better coverage. With Android 5.0’s ART, the dex file is converted to an OAT file in this way: Java -> Class -> Dex -> OAT. Analysis of this can lead to devising better tests.
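The number of method references in a .dex file can also be read straight from its header, which gives a rough measure of how close the file is to the limit. The sketch below assumes the standard little-endian dex layout, where the header is 0x70 bytes long and the method_ids_size field sits at offset 0x58; the function names are illustrative.

```python
import struct

DEX_HEADER_SIZE = 0x70          # fixed header length in the dex format
METHOD_IDS_SIZE_OFFSET = 0x58   # 4-byte count of method references
DEX_METHOD_LIMIT = 65536        # references addressable from one .dex file


def method_reference_count(dex_bytes):
    """Read the method_ids_size field from a dex header."""
    if len(dex_bytes) < DEX_HEADER_SIZE or dex_bytes[:4] != b"dex\n":
        raise ValueError("not a dex file")
    # Assumes the usual little-endian encoding of standard dex files.
    (count,) = struct.unpack_from("<I", dex_bytes, METHOD_IDS_SIZE_OFFSET)
    return count


def headroom(dex_bytes):
    """Method references still available before the limit is reached."""
    return DEX_METHOD_LIMIT - method_reference_count(dex_bytes)
```

In practice the bytes would come from the classes.dex entry of the APK, for example by reading that entry with a ZIP library. A file with little headroom left suggests a large app, or one that pulls in large libraries, and is a cue to check whether further .dex files exist.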

JAR Decompilation
JAR file contents can be viewed using a Java decompiler. Usually, programmers will have obfuscated the code, so the decompiled content is harder to read. Even so, decompilation gives visibility into classes and methods. While this does not comprise the entire source code, it does provide hints about the business logic used in programming the app. Reading through the decompiled code helps testers identify what tests can be done and what recommendations can be given to the development team.
If the code is not obfuscated, this has to be brought to the attention of product owners and programmers. Obfuscation is possible using tools like ProGuard and DexGuard.

Key Tests to be performed on a decompiled JAR file
1. Readable Business Logic
The presence of obfuscated code indicates that the developer may have put some security mechanism in place. If the code is in plain text format, it is readable and understandable to an extent, providing visibility into compiled functions and classes. Some part of this code might be patented already or about to be patented, and this part may hold the core logic of the app. Leaving the code in the open in a logically readable format may cost the organization heavily in revenue and reputation.

Code must be obfuscated at all times. Obfuscation is a procedure where an organization masks its code to keep others from reading or copying the code/algorithms to build similar apps. ProGuard is one of the tools the programming team can use to obfuscate code.
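One quick, rough signal for this test is how many class names in the decompiled JAR are very short, since obfuscators such as ProGuard typically rename classes to names like a, b and c. The sketch below is only a heuristic; the two-character threshold, the function names and the sample class names are assumptions made for illustration.

```python
def simple_name(class_name):
    """Strip the package: 'com.example.a' -> 'a'."""
    return class_name.rsplit(".", 1)[-1]


def short_name_ratio(class_names, max_length=2):
    """Fraction of classes whose simple name has at most max_length characters."""
    if not class_names:
        return 0.0
    short = sum(1 for name in class_names
                if len(simple_name(name)) <= max_length)
    return short / len(class_names)


# Hypothetical class names as they might appear in a decompiler's tree view.
sample = [
    "com.example.a",
    "com.example.b",
    "com.example.c.d",
    "com.example.LoginActivity",
]
print(short_name_ratio(sample))  # 0.75
```

A ratio near zero suggests the code was shipped without renaming and is worth reporting. A high ratio only shows that names were shortened; it says nothing about how well the logic itself is protected.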

2. Manifest file
Permissions are easily visible to anyone who views the manifest file, giving an overview of what the app is allowed to do. This is again an unsafe option.

It is advisable to encrypt manifest files using methods that make them hard for a hacker to decrypt and read. In web technologies, a few encryption algorithms are used. In mobile, you may have to explore whether there are similar algorithms or techniques for encryption or obfuscation. [Android store guidelines insist that permissions are visible to users. It’s up to testers to review the permissions used and identify potential risks.]
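A permission review can start from a simple list of what the manifest requests. Inside the APK the manifest is stored as binary XML, so the sketch below assumes it has already been decoded to plain XML with a tool such as APKtool. The sample manifest and the REVIEW_FIRST set are made up for illustration; which permissions deserve a closer look depends on the app.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical shortlist of permissions a tester may want to question first.
REVIEW_FIRST = {
    "android.permission.READ_PHONE_STATE",
    "android.permission.READ_CONTACTS",
}


def requested_permissions(manifest_xml):
    """Return the android:name of every <uses-permission> element."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    return [node.get(name_attr) for node in root.iter("uses-permission")]


def flag_for_review(permissions):
    """Return the requested permissions that are on the review shortlist."""
    return [p for p in permissions if p in REVIEW_FIRST]


SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.READ_PHONE_STATE" />
</manifest>"""

print(flag_for_review(requested_permissions(SAMPLE)))
# ['android.permission.READ_PHONE_STATE']
```

Each flagged permission becomes a question for the development team: which feature needs it, and what happens to the data it unlocks?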

3. getDeviceId() usage
It is fair to want to track individual installations of an app across a diverse user base. It sounds plausible just to call TelephonyManager.getDeviceId() and use that value to identify the installation. There are problems with this. Firstly, it doesn’t work reliably. Secondly, when it does work, the value survives device wipes (“factory resets”), so one could end up making a nasty mistake when a customer wipes their device and passes it on to another person.

There are many good reasons for avoiding an attempt to identify a particular device. For those who want to try, the best approach is probably the use of ANDROID_ID on anything reasonably modern, with some fallback heuristics for legacy devices.

It is recommended to study how the app uses permissions and activities. If these are used and not handled well, the app may expose the data it processes on the device. The decompiled JAR can be of help here.
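To find out whether an app relies on getDeviceId(), the decompiled source can be searched for calls to it. The sketch below scans the text of one decompiled file; the function name and the sample source are illustrative assumptions, and a real check would loop over every file the decompiler produced.

```python
import re

# Matches a method call such as tm.getDeviceId() or tm.getDeviceId ( )
DEVICE_ID_CALL = re.compile(r"\.getDeviceId\s*\(")


def find_device_id_calls(source_text):
    """Return (line_number, line) pairs where getDeviceId() is called."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if DEVICE_ID_CALL.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical fragment of decompiled Java.
SAMPLE = """TelephonyManager tm = (TelephonyManager) getSystemService(TELEPHONY_SERVICE);
String id = tm.getDeviceId();
String fallback = Settings.Secure.getString(getContentResolver(), Settings.Secure.ANDROID_ID);
"""

print(find_device_id_calls(SAMPLE))
# [(2, 'String id = tm.getDeviceId();')]
```

Every hit is worth a follow-up question about what the identifier is used for and what happens after a factory reset or a change of owner.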

4. Accessibility to Resource files
Resource XMLs, important images and other files may be easily extracted from the APK and reused by anyone who does so. It is important to secure these files.

Images should not be accessible to anyone after the APK has been decompiled.

It is a common myth that JAR decompilation is related only to security testing. This is not true. Decompiling helps to learn the app better. One can learn how the functionality works, understand the performance of the app and find possible areas where memory usage is a concern. Next time you decompile an app, look at the treasure trove in front of you and explore the innumerable possibilities.
My team and I have applied the above key tests to a few Android apps we tested. They gave developers good insight into the app itself and how it is perceived in the Google world. Some of them, who didn’t know much about the threats associated with APK files, learned a lot from this exercise. It also helped my team build credibility with the development team.

Token of Thanks
My community colleague Ravisuriya reviewed this article thoroughly and supported me patiently in my learning. Thanks, Ravi, for your precious time.

P.S: I cannot share the actual APK report format for obvious confidentiality reasons. However, this article gives a short overview of the different things that can be done with decompiled JARs.

Appendix [for additional reading]
What’s the need to review Java code?
       1.       Basic health check of the app (via Java source code and other confidential data)
       2.       To perform security audits (provided code is not obfuscated)
                a.       Sensitive Data Exposure
                b.       Inadequate Access Controls
                c.       Code Injection Vulnerabilities
                d.       Broken Authentication and Session Management
       3.       Reverse engineering third-party protocols or APIs

APK Protection
Protecting an APK file is an area of concern in this free internet era. Developers must keep an eye on this while building APK files. There are many methods to protect APK files, and several articles describe them. I have listed the high-level headers here, so you can read up on them in detail on the Android developer site.
  1. Secure Java source code
  2. Encrypting class files
  3. Code Obfuscation

Tools that might help in the process of JAR decompilation
          1.       Dex2jar – Tool to work with Android *.dex and Java *.class files
          2.       JAD – A decompiler for the Java programming language
          3.       JD-GUI – A graphical utility that displays the source code of *.class files
          4.       APKtool – A third-party reverse engineering tool for Android apps
          5.       WinZip / WinRAR – Compression/decompression utilities
          6.       ProGuard – A free Java class file shrinker, optimizer, and obfuscator
          7.       APK Protect – An APK protection solution

Image Credits - World Wide Web

14 January, 2015

Device Fragmentation – How to Tame The Bull

This article was originally published in Testing Trapeze December 2014 Edition.

As per the 2013 IDC report, mobile internet traffic is growing 1.5x per year and is likely to grow much faster in future. Social media usage across the world shows growing participation on mobile devices. Players like Snapchat, Instagram, Pinterest and Facebook have more users on mobile than on the web. This data suggests that many software organizations/developers/users are switching to building/using mobile apps more than ever. This brings us to the question of how to get these mobile apps tested on millions of devices. In this article, I will share my thoughts and highlight how I applied Jonathan Kohl’s approach to solve the device fragmentation problem and build a mobile test strategy.

Device Fragmentation

Mobile device fragmentation occurs when some users are running older versions of the operating system, while other users are using newer versions. Some even call it Device Diversity.
The term mobile device fragmentation is also used to describe different versions of the same operating system that are created when an original equipment manufacturer (OEM) modifies an open source mobile operating system for specific products.  

Device fragmentation could arise due to various reasons:

Different Platform
Organizations assume that fragmentation needs to be addressed on Android alone, as it is home to several OS versions and mobile device types. However, with newer platforms showing up, the fragmentation bug is biting iOS, Windows and others too. Even testing on iOS across multiple device models and operating system versions is becoming cumbersome, although it’s not as complex as testing on Android yet. Note the word “yet.”

Hardware Dependency
Several app features are hardware dependent and hence need devices of particular makes/models from specific manufacturers. For instance, Android 2.3 has support for Near Field Communications (NFC), but many phones do not have NFC hardware.

Manufacturer Customizations
In the Android world, manufacturers usually take a version of the operating system as a baseline and add their customizations and ideas, thus putting modified versions on the phone. iOS doesn’t pose this threat yet, but as it loosens some of its guidelines over time, this challenge may not be far off on iOS either.

CPU/Memory Footprint Limitations
Many apps are memory/CPU intensive, which means they perform well on high-end phones and struggle on phones with limited capabilities. This is one of the top challenges of fragmentation for app developers today.

How do you choose from an ocean of devices?

Device usage is an incredibly important factor to consider. There are millions of devices in the market. How do you decide which ones to choose optimally in terms of time, cost, and effort? This is a universal question several organizations would like an answer for.

This picture is the best way of visualizing the number of different Android devices that have downloaded the OpenSignal app in the past few months. According to OpenSignal developers, the fragmentation has tripled from the previous year. If you are building an app that faces similar fragmentation, how would you address that?

Solving the device fragmentation problem starts with seeking answers to questions like:
1.       What platforms are customers using?
2.       What are the operating system versions?
3.       What is the range of screen sizes and resolutions in use?
4.       Which mobile device manufacturers, makes, and models are in use?
5.       What are the most popular mobile devices on the market?
6.       What are the most frequently used mobile browsers?
7.       Which devices cause more problems for mobile apps than others?
8.       Which new devices should an already released mobile app support?
9.       How many previous versions of the platform should a mobile app support?

And the list of questions goes on.

Working through the above list of questions is only the beginning. Listing out answers to most of them only brings more questions. However, as complex as it might sound, answers to these questions can help stakeholders narrow down the number/type of mobile devices used for testing.
In his book, “Tap into Mobile Application Testing”, Jonathan Kohl lists four main approaches that help in choosing a test strategy for mobile apps. I will highlight those four approaches below:

Singular Approach

Testing is done on one device type. Some organizations build apps that are focused on a specific mobile device. For example, a mobile app developed for viewing the menu and placing a food order at restaurants is sure to be installed on a specific tablet from a specific manufacturer running a particular version of the operating system. In this case, the stakeholders make a conscious choice of which device to go with and build an app just for that device. This is a straightforward case where testing on a single device will be good enough.
Some stakeholders might consciously choose to support just one device to start with and hence focus only on that device.

Proportional Approach

Testing is done on multiple devices. “How many devices are good enough?” is a question that can be answered best by doing basic research on mobile traffic and analytics data. Suppose the app has 80% Android mobile traffic and 15% iOS mobile traffic, and the remaining traffic comes from other devices. This information helps in focusing testing efforts on the Android platform for the most part, with *some* testing on the iOS platform. This approach is best suited for mobile apps that are already out in the market, where there is visibility into the incoming traffic and mobile analytics.
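The proportional split can be worked out with a few lines of arithmetic. The sketch below uses the traffic shares from the example above and assumes a budget of 20 test devices, a number picked purely for illustration. It uses the largest-remainder method so that the counts always add up to the budget; the function name is my own.

```python
def allocate_devices(traffic_share, total_devices):
    """Split total_devices across platforms in proportion to traffic share.

    Uses the largest-remainder method so the counts sum to total_devices.
    """
    total_share = sum(traffic_share.values())
    exact = {platform: total_devices * share / total_share
             for platform, share in traffic_share.items()}
    counts = {platform: int(value) for platform, value in exact.items()}
    leftover = total_devices - sum(counts.values())
    # Hand the remaining devices to the platforms with the largest fractions.
    by_remainder = sorted(exact, key=lambda p: exact[p] - counts[p], reverse=True)
    for platform in by_remainder[:leftover]:
        counts[platform] += 1
    return counts


shares = {"Android": 80, "iOS": 15, "Other": 5}
print(allocate_devices(shares, 20))
# {'Android': 16, 'iOS': 3, 'Other': 1}
```

The same split can be applied one level down, for example across Android OS versions or manufacturers, once analytics provide those shares.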

Shotgun Approach

Shotgun means ‘as many as possible’. In this approach, testing is done on multiple devices on a very large scale. This is best suited to mobile apps where stakeholders don’t place any restrictions on the mobile devices/platforms used. With hundreds of platforms and platform versions, it’s extremely hard to test on enough devices to cover them all. There will always be some platforms/versions that we miss. Over time, based on experience, inputs from the programming team and historical data from users/analytics/mobile traffic, choosing ‘X’ number of devices becomes easier. The first time is always the hardest, but it is an important step towards optimizing the approach in subsequent test cycles/release plans.

Outsourced Approach

Testing is performed by outsourcing to multiple channels as listed below:

Third party vendors: Testing is outsourced to organizations which claim to have an in-house mobile device lab with scores of devices.

Crowdsourcing: Testing is outsourced to organizations that run dedicated mobile test cycles with a well-known crowdtesting community who work in a “Bring Your Own Device (BYOD)” model. This approach also gives a sneak peek into how the crowd (a snapshot of the user base) uses the mobile app and provides input on a large variety of mobile devices under real world conditions.

Mobile Device Lab hosted over cloud: Testing is outsourced to organizations that run cloud based mobile device labs online. Typically, testers will be given access to a web based application to access mobile devices that are connected over a cloud. In this case, gestures, network scenarios, and a few other specific tests cannot be done.

What’s the best way to tame the bull?

In the constant quest to test mobile apps, one of the above approaches can be applied to tame the bull of device fragmentation. In some cases, two or more of the above approaches can be combined to get better output.

In my experience, a mix of the Shotgun approach and the Outsourced approach has worked best. I change my strategy based on the context of the mobile app and the expectations of the stakeholders. It has easily taken me 2-6 test cycles to determine a good fit. The challenge is how to get it right the first time. It is worth noting that solving the fragmentation problem is not a one-time activity, but an ongoing one whose solution improves as we discover newer challenges and deal with them on a case-by-case basis.

How do you handle mobile device fragmentation challenges on your project? Share your thoughts with me at parimala dot shankaraiah at gmail.

*Image Credits : World Wide Web