The Android codebase has changed extensively over the last few years to support a wide range of mobile devices and attractive features. Although Android is an open source project, its dynamic nature and continuous growth create an ever-increasing need for a tool that lets developers instrument the code at the platform level and aids them in analyzing the source code by providing handles into the platform. The objective of this project is to develop a tool that gives developers the ability to instrument the Android platform using Aspect-Oriented Programming.
The Instrumentation tool enables its users to inject custom Java-based code into the Android Open Source Project (AOSP), which can be useful in many situations, such as giving developers better insight into the Android source code, understanding method and class usage, monitoring the behavior of applications, and checking for security violations by Android apps. Furthermore, the tool can be extended and customized to meet individual needs.
ACKNOWLEDGEMENTS
This research project was supported and supervised by Dr. Arno Puder, whom I would like to gratefully and sincerely thank for his guidance, encouragement, and patience during my study at San Francisco State University. Also, I would like to thank my peer reviewers Syed Omer Khureshi, Sai Krishna Undurthi and Anshul Vyas from San Francisco State University.
I would also like to thank Dr. William Hsu for reviewing my report in his capacity as the second member of the committee.
Table of Contents
1.4 Organization of Document
2. Use Cases
3. Background & Related work
3.1 Android Open Source Project (AOSP)
3.2 Android Architecture
3.2 Android Build System
3.2.2 Function Definitions
3.2.4 Module Build Templates
3.2.6 Build Recipes
3.2.7 Compiling with Jack and Jill
3.3 Aspect Oriented Programming
3.3.1 AspectJ Library
3.4 Byte Code Engineering Library (BCEL)
4. Implementation Process
4.1 Build Flow Modification
4.2 Weaving Custom Code
4.2.1 Using Bytecode Manipulation
4.2.2 Using Aspect Oriented Programming
5.1 Launcher Process
5.2 Activity Manager
5.4 Application Process
5.4.1 Android Platform Variations
5.4.1.1 Gingerbread (2.3.7)
5.4.1.2 Gingerbread (2.3.7) with Android support library
5.4.1.3 Nougat (7.1.1)
5.4.1.4 Nougat (7.1.1) with support library
5.4.1.5 Comparison of the results of analysis on Android platform variations
6. Summary and Conclusion
6.1 Project Summary
Steps to instrument and build Android
List of Figures
Figure 1: Working of the Instrumentation tool
Figure 2: Android architecture
Figure 3: Working of Zygote process
Figure 4: Android build architecture
Figure 5: Jack toolchain structure
Figure 6: Jack library format
Figure 7: Android Build flow with Jack and Jill toolchain.
Figure 8: Android build process
Figure 9: Android build process with AspectJ instrumentation
Figure 10: Application startup process
Figure 11: Call-stack graph of the APIs invoked by launcher process
Figure 12: Call-stack graph of the APIs invoked by Activity Manager
Figure 13: Call-stack graph of the APIs called by Zygote Process
Figure 14: Call-stack graph of the APIs invoked during application startup on Gingerbread
Figure 15: Comparison of the results of analysis on different versions of Android
List of Tables
Table 1: Directory contents of Android Open Source Project(AOSP)
Table 2: Set of values that can be provided for TARGET_PRODUCT in the Android build
Table 3: List of functions added by build/envsetup.sh script
Table 4: Configuration information set up by lunch for generic-eng combo
Table 5: List of module build templates and their corresponding .mk files in AOSP
Table 6: Comparison of custom code weaving approaches: Bytecode manipulation and
Table 7: List of processes observed at the application startup
Table 8: Metrics used in the analysis of a call-stack graph
Table 9: Graph analysis of the call-stack graph of application process on Gingerbread (2.3.7)
Table 10: Graph analysis of call-stack graph of the application process on Gingerbread (2.3.7) with Android support library
Table 11: Graph analysis of call-stack graph of the application process on Nougat (7.1.1)
Table 12: Graph analysis of the application startup sequence on Nougat (7.1.1) with support library
Table 13: Percentage comparison of the results of analysis on different versions of Android
List of Code snippets
Code snippet 1: Makefile of AOSP
Code snippet 2: List of choices offered by lunch command on Gingerbread
Code snippet 3: Part of the code in envsetup.sh which adds lunch combinations
Code snippet 4: vendorsetup.sh handling in Gingerbread 2.3.7
Code snippet 5: Example of AspectJ around advice
Code snippet 6: Changes made in build/core/definitions.mk file to support
Code snippet 7: Program to insert log statements at the beginning of every method in a class using BCEL library
Code snippet 8: Program to insert log statements at the beginning of every method in a class using AspectJ library
Code snippet 9: Content of post-compile.sh script
Code snippet 10: AspectJ program to add enter/exit logs in every method of AOSP
This chapter outlines the objective and the motivation behind this project and explains the organization of this document.
The idea for this project first formed when we learned about a lawsuit targeting a major Android app on the PlayStore. The allegation was that the app accessed private information in situations where it did not require such access and misused the information it collected. Verifying such a claim called for an analysis of the app, which was not an easy task. More importantly, many other situations arise where we are left wondering what sort of information an app might be privy to. Since the source code of the app was not available, it was very difficult, almost impossible, to ascertain accurately whether the app accessed the private information. Furthermore, there are other major hurdles, such as:
- No Access to handles in the platform for APIs accessing private information
The Android framework provides multiple ways to access the same piece of information, making it difficult to determine whether the information was accessed. A handle in the platform is needed to detect when such APIs are invoked, so that illegal accesses to private information become visible for analysis.
- App level visibility is insufficient
Although the current Android permission model allows a user to control what information and resources an app can access, it does not let the user control how frequently the access happens or what the app does with the private information. This level of control requires platform-level visibility.
- Lack of access to source code
Access to the source code of an Android application is not always available. In fact, companies often do not open their source code. This leaves analysis of the bytecode as the only option.
- Lack of existing tool
Although there are tools available in the market that perform analysis at the application level, they do not provide flexibility and visibility needed for a conclusive and definitive answer to the question of whether the app is misusing its permissions. No tool is available for analyzing the APIs at the platform level.
To summarize, there was a need for a tool that could resolve the hurdles mentioned above. Since we could not find one, we developed our own tool, which instruments the Android platform and thereby makes it possible to insert handles into the platform.
The aim of the project is to provide an end-to-end tool that enables users to inject any custom Java code based on user specified constraints into the Android platform using Aspect Oriented Programming. Figure 1 explains the objective of the project.
The end user writes the custom Java code that should execute every time a condition is met in the base code, which is the Android Open Source Project (AOSP). The user then uses the Instrumentation tool to inject the custom code into the base code. The Instrumentation tool weaves the custom code into the base code during the build process and generates a custom emulator, i.e., a final executable image of the AOSP. End users can run APKs on the custom emulator and see the output of their custom code every time the condition they specified is met.
- Custom code injection
The Instrumentation tool provides the ability to inject custom code into the Android Open Source Project, which can be useful in many ways, such as understanding API and class usage in AOSP, profiling applications and monitoring the behavior of applications. There is no need to generate an emulator image every time. Preconfigured emulator images can be reused to analyze multiple apps.
- Support for multiple versions of Android
The Instrumentation tool supports multiple versions of Android. It has been developed to work with all versions from Gingerbread (2.3.3) to the latest version, Nougat (7.1.1).
- Consolidated documentation to understand AOSP and Android Build system.
Though several documents provide information about the Android Open Source Project and the Android build system, the information is fragmented. This document consolidates all the relevant information in one place.
- Environment setup guide
The appendix attached at the end of the document provides step by step instructions on how to download, build and generate a custom emulator with the Instrumentation tool.
- Framework for other projects
The Instrumentation tool provides a necessary framework for other projects like Android best practices verification tool by Sai Undurthi, security inspection tool by Anshul Vyas and visualization tool by Syed Omer Khureshi.
- A detailed analysis of application startup process
The demonstration of the Instrumentation tool provides an insight into the sequence of actions that take place when an app is launched on the Android OS. An analysis of the application startup process is carried out on different versions of Android with different configurations, which helps in understanding the phenomenal growth of Android over the last few years and impact of Android support library on the application startup process.
Chapter 2 outlines the use cases of the tool. Chapter 3 discusses the relevant background knowledge, including Android Open Source Project (AOSP), Android build architecture and Aspect-oriented programming. Chapter 4 gives a detailed walkthrough of our implementation process to support custom weaving into the AOSP. Chapter 5 provides a demonstration of the tool using a sample app. Finally, Chapter 6 summarizes the project. The appendix provided at the end explains how to get the Android sources from the Android website and how to compile them with custom aspects to generate a functional emulator image.
In this chapter, we present four use-cases of the Instrumentation tool which highlight its applications.
- Understanding Android APIs and features
Sydney’s team has published an app on the Android PlayStore that detects earthquakes using mobile sensors. The app uses a background service to record the sensor data constantly, process it and upload it to a server. Starting with Android 6.0 (API level 23), Android introduced two new features, Doze and App Standby, to reduce battery consumption. They do so by deferring background services and network activity for apps when the device is unused for long periods of time. To understand how Doze and App Standby affect the functionality of the app, she uses the Instrumentation tool: she writes custom aspects to add logs in the relevant Java classes and builds an instrumented emulator image. She then installs her app on the emulator and runs a set of test cases. After analyzing the logs generated by the tool, she can see how the functionality of the app is affected and modify her app accordingly.
- Profiling Android application
Jen is a part-time Android app developer who has published a map application on the PlayStore. She has the crash reporting tool ACRA set up in her app. When she checks the recent crash report, she sees a few crashes due to Application Not Responding (ANR) errors. To find the root cause of the issue, she uses the Instrumentation tool. With the help of the logs generated by the tool, she realizes that her app is spending more time than normal in the Android callbacks, which triggers the ANR error. This makes it easy for her to fix the bug and speed up the bottlenecks in her code.
- Ensuring that an application follows best practices
Kevin is a team lead responsible for Android game development. He has been receiving feedback from users about excessive battery drain when using the app. Having heard about the Instrumentation tool, he writes a set of custom aspects to check whether the app follows the best practices suggested by Android. With this analysis, Kevin discovers that the app violates some of the best practices, and he incorporates the tool into his team’s standard testing protocol. Issues such as not unregistering sensor listeners and broadcast receivers in onPause(), or not stopping background services, are now detected by the tool early in the testing phase, and his team is able to fix the code so that the best practices are followed. After a couple of months, the app has more installations and a higher rating on the PlayStore.
- Detecting privacy breach in an app
Raymond is an Android enthusiast with an extremely deep understanding of the Android framework. He is contacted by a law firm that is working on a client’s complaint about a potential privacy breach by a leading social networking app. The firm asks Raymond, in his capacity as an Android expert, to ascertain the validity of this claim and gives him access to the app’s APK file. Since he does not have access to the source code of the app, it is very difficult to understand how the app works and to examine its behavior. Being aware of the Instrumentation tool, Raymond writes custom aspects for the APIs that can access private information such as contacts and location, and runs the app on an instrumented VM instance. By analyzing the generated logs, he is able to understand the flow of data in the app. He provides his findings to the law firm.
As the presented use cases show, there is a need to instrument the Android platform when the source code of an Android app is not accessible or when application-level access is insufficient. In such scenarios, the Instrumentation tool can be used with minimal effort.
This chapter introduces the relevant background knowledge for our project by providing a brief explanation of the key concepts related to the project. Section 3.1 gives an overview of the Android Open Source Project (AOSP), whereas section 3.2 explains the Android architecture. Section 3.2 starts with the introduction of Android build system and then digs into Android’s internals such as Android build architecture, build configuration parameters, module template, reusable functions and the newly introduced build toolchain ‘Jack and Jill’. Section 3.3 discusses Aspect-Oriented Programming paradigm. Section 3.4 covers a high-level overview of Apache’s Byte Code Engineering Library (BCEL) used for bytecode manipulation.
Android is an open source software project which works on a range of devices. The purpose of Android is to develop an open source platform available for developers and to create a product that enhances the mobile experience for users. AOSP is available for download, customization, experimentation and porting.
The Android source can be downloaded and built to get a custom Android OS running on a device or an emulator. Step-by-step instructions on how to download and build Android can be found in the Appendix. AOSP is a fairly large project, consisting of more than 14,000 directories and 100,000 files in Gingerbread (2.3.7) and more than 132,000 directories and 1,340,000 files in Nougat (7.1.1).
Table 1 provides an overview of the important directories and their contents in the AOSP project.
|abi||Minimal C++ Run-Time Type Information support|
|bionic||Android’s custom C library|
|bootable||OTA, recovery mechanism and reference bootloader|
|cts||Compatibility Test Suite|
|device||Device specific files and components|
|docs||Documentation of the source code|
|external||External projects imported into the AOSP|
|frameworks||Core components such as system services|
|hardware||HAL and hardware support libraries|
|libcore||Apache Harmony Java source|
|ndk||Native Development Kit|
|pdk||Platform Development Kit|
|prebuilt||Prebuilt binaries, including toolchains|
|sdk||Software Development Kit|
|system||“Embedded Linux” platform that houses Android|
|tools||Various IDE tools|
Table 1: Directory contents of Android Open Source Project (AOSP)
‘prebuilt’ and ‘external’ are the two major directories in the AOSP tree. They account for close to 70% of its size. Both directories mostly hold content from other open source projects, such as kernel images, GNU toolchains, and common libraries such as OpenSSL and WebKit. ‘libcore’ also comes from another open source project, Apache Harmony. The key components of Android reside in frameworks/. Inside frameworks/, system services can be found in frameworks/base/services and frameworks/base/media. frameworks/base/core contains core components including the Runtime and Zygote, the key system elements which are explained in the following section. Native daemons can be found in frameworks/base/cmds.
This section describes the key components of Android shown in Figure 2, in the order in which they are loaded during system startup.
When the Android system is started, a bootloader is the first program executed by CPU. The bootloader initializes the RAM, loads the kernel and RAM disk, and jumps to the kernel.
The Android kernel prepares the system for execution. It initializes several subsystems, invokes the ‘init’ functions of all built-in drivers, mounts the root filesystem and starts the init process of Android.
The init process executes the instructions stored in the init.rc file. The init.rc file contains instructions to create mount points, mount filesystems, start native daemons and set up environment variables such as the system path.
The init process sends an app_process command to the Android Runtime. On receiving the command, the Runtime kicks off the first Dalvik VM, which invokes Zygote’s main() method to launch the Zygote process.
Zygote is a daemon process responsible for starting each app as its child process. During system startup, the Zygote preloads all necessary Java classes and resources, starts the System Server process and opens a socket, /dev/socket/zygote, to listen for incoming requests to launch new applications.
When the Zygote receives a request to launch a new app through the socket, it uses the fork() system call to create a new process that is a replica of itself. The new process therefore starts with a Dalvik VM that is already preloaded with all the classes and resources that any app might need, which makes creating a VM and loading resources more efficient. With the Copy On Write (COW) technique implemented by Linux, the memory pages are shared between the parent and the child processes. When one of the processes attempts to modify a shared page, the kernel intercepts the write and gives that process its own copy of the page. In the case of Android, these pages are not writable, so they are never copied. This means that all the processes forked from the Zygote use the same copy of the system classes and resources. In addition to being efficient, this approach also saves physical memory on a device: regardless of how many applications have started, the increase in memory usage is much smaller than it would be if every process loaded its own copy. The above process is summarized in Figure 3.
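The fork-based launch described above can be illustrated with a small shell sketch. It is only an analogy and not Zygote’s implementation: in common shells such as bash and dash, a parenthesized subshell is typically created with fork(), so it starts with a copy of the parent’s state. The names PRELOADED_CLASSES, launch_app and APP_NAME are hypothetical stand-ins for the preloaded classes, the launch request and the app-specific state.

```sh
# Hypothetical stand-in for the classes the Zygote preloads at system startup.
PRELOADED_CLASSES="android.app.Activity android.view.View"

# launch_app NAME: run an "app" in a forked child of the current shell.
launch_app() {
    (
        # The child starts with the parent's state already in place,
        # so nothing has to be loaded again.
        APP_NAME="$1"
        echo "$APP_NAME sees: $PRELOADED_CLASSES"
        # Changes made here stay private to the child process.
        PRELOADED_CLASSES="modified by $APP_NAME"
    )
}

launch_app demo_app
echo "parent still has: $PRELOADED_CLASSES"
```

The parent keeps its original value after the child exits, which mirrors the way every app process starts from the same preloaded state without being able to alter it for the others.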
The System Server is started by the Zygote during system startup and stays alive as a background process. It hosts the majority of the system services that run on Android within a single process. Some system services are written in Java, whereas the rest are written in C/C++. It also includes some native code accessed through JNI, which allows some of the Java-based services to interface with Android’s lower layers.
The Activity Manager is one of the services hosted by the System Server. It handles the starting of new components, such as Activities and Services, the fetching of Content Providers and intent broadcasting. It is also involved in the maintenance of the out-of-memory adjustments used by the in-kernel low-memory handler, permissions, task management, etc.
The Launcher app is started by the Activity Manager by sending an intent of type Intent.CATEGORY_HOME. The launcher is responsible for displaying the home screen, which is familiar to Android users.
The Android build system resembles a make-based build system, with a few notable differences. Unlike a typical make-based build system, the Android build system does not rely on recursive makefiles. Instead, it makes use of a special makefile, Android.mk, which defines how a local module is built. A module is any part of the AOSP, such as a binary, an app package or a library, that needs to be built. The Android build system invokes a script that traverses the subdirectories until it locates an Android.mk file. After finding an Android.mk, it stops: it does not explore the subdirectories beneath that file’s location unless explicitly specified in the Android.mk.
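The traversal rule described above (stop descending as soon as an Android.mk is found) can be sketched as a short recursive shell function. This is a minimal illustration, not the script AOSP actually ships; the function name find_leaves is hypothetical.

```sh
# find_leaves DIR: print the first Android.mk found on each branch below DIR.
# Once a directory contains Android.mk, its subdirectories are not explored.
find_leaves() {
    local dir="$1"
    local sub
    if [ -f "$dir/Android.mk" ]; then
        echo "$dir/Android.mk"
        return 0
    fi
    for sub in "$dir"/*/; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$sub" ] || continue
        find_leaves "${sub%/}"
    done
}

# Example (assumes the current directory is the top of an AOSP tree):
#   find_leaves packages/apps
```

With this rule, a nested Android.mk is only picked up if the Android.mk above it includes it explicitly, which matches the behavior described in the text.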
Another difference between make and the Android build system is the way the Android build system is configured. Android depends on a set of variables that are either defined statically in a buildspec.mk file or set dynamically by the envsetup.sh and lunch scripts. Also, the level of configurability allowed by Android’s build system is limited. Although the properties of the target can be specified, there is no way to enable or disable most of the features. For instance, it is not possible to disable power management support or the Location Service.
Also, the Android build system does not store intermediate output in the same location as the AOSP source files. Instead, the build system generates the intermediate output as well as the final output in a separate directory, out/. Therefore, removing the out/ directory also removes everything that was generated during the build. In other words, ‘make clean’ is the same thing as rm -rf out/.
The root directory of AOSP includes a single Makefile. That file is mostly empty; its main use is to include the entry point of Android’s build system, as can be seen in Code snippet 1.
### DO NOT EDIT THIS FILE ###
include build/core/main.mk
### DO NOT EDIT THIS FILE ###
The build/core/main.mk file is the entry point to the build system. The build system pulls everything into a single makefile, so each .mk file ends up as part of one huge makefile that contains the rules for building all the modules in the system. Figure 4 presents the components of the build system. The components are explained in detail in the subsequent sections.
The build configuration is specified in the config.mk file. The build system pulls in the build configuration by including config.mk. The config.mk file defines the following environment variables.
- TARGET_PRODUCT
The Android flavor to be built. The set of values that can be provided for the TARGET_PRODUCT variable includes the following:
|generic||the most basic build of the AOSP parts|
|full||With most apps and the major locales enabled|
|full_crespo||Same as full but for Crespo (Samsung Nexus S)|
|full_grouper||Same as full but for Grouper (Asus Nexus 7)|
|sdk||The SDK; includes a vast number of locales|
Table 2: Set of values that can be provided for TARGET_PRODUCT in the Android build
- TARGET_BUILD_VARIANT
Dictates which modules to install. Each module sets a LOCAL_MODULE_TAGS variable in its Android.mk from the list: user, debug, eng, tests, optional, or samples. Selecting the variant determines which subsets of modules are included.
- TARGET_BUILD_TYPE
Decides whether to use the release or the debug build type.
- TARGET_TOOLS_PREFIX
By default, the build system uses one of the cross-development toolchains in the prebuilt/ directory. To use a different toolchain, the value of the TARGET_TOOLS_PREFIX variable should point to the location of that toolchain.
- OUT_DIR
By default, the build system generates the build output into the out/ directory. This variable is used to provide a different output directory.
- BUILD_ENV_SEQUENCE_NUMBER
If the default template build/buildspec.mk.default is used to create the buildspec.mk file, this value is set correctly. However, if a buildspec.mk created with an older AOSP release is used with a newer AOSP release that has important build system changes, this variable acts as a safety net. It causes the build system to warn the user that the buildspec.mk file is not compatible with the build system.
envsetup.sh sets up the build environment for Android. Primarily, it defines a set of shell functions that are useful for many AOSP tasks. Sourcing build/envsetup.sh from a shell adds the functions described in Table 3 to the environment.
|croot||Changes directory to the top of the tree|
|m||Makes from the top of the tree|
|mm||Builds all the modules in the current directory|
|mmm||Builds all the modules in the supplied directories|
|cgrep||Greps on all local C/C++ files|
|jgrep||Greps on all local Java files|
|resgrep||Greps on all local res/*.xml files|
|godir||Go to the directory containing a file|
Table 3: List of functions added by build/envsetup.sh script
‘m’ and ‘mm’ are particularly useful commands. The ‘m’ command allows users to build from the top level regardless of the current path, whereas ‘mm’ builds the modules located in the current directory. For example, if a modification is made to the Launcher and the current path is packages/apps/Launcher2, then the module can be rebuilt with mm instead of going back to the topmost level and typing make. Since mm does not rebuild the entire tree, it does not regenerate the AOSP images even though a module they depend on has been modified. mm is still helpful for trying out local changes and checking whether they break the build.
The lunch command is defined by envsetup.sh. When lunch is executed without any arguments, it displays a list of alternatives. For example, on Gingerbread, the following list appears.
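To make the "build from the top level regardless of the current path" behavior concrete, here is a minimal sketch of how a helper can locate the top of the tree by walking upward until it finds a known marker file. The function names find_tree_top and m_sketch are hypothetical, and the marker path build/core/envsetup.mk is an assumption made for illustration; the functions defined by the real envsetup.sh may differ.

```sh
# find_tree_top [DIR]: print the nearest ancestor of DIR (default: $PWD)
# that contains the marker file; fail if none is found.
find_tree_top() {
    local marker="build/core/envsetup.mk"   # assumed marker for the tree top
    local dir="${1:-$PWD}"
    while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -f "$dir/$marker" ]; then
            echo "$dir"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}

# m_sketch ARGS...: run make from the top of the tree, wherever we are.
m_sketch() {
    local top
    top=$(find_tree_top) || { echo "Could not locate the top of the tree." >&2; return 1; }
    make -C "$top" "$@"
}
```

Under these assumptions, calling m_sketch from packages/apps/Launcher2 would run make at the tree root, which is the behavior the text attributes to ‘m’.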
You’re building on Linux
Lunch menu… pick a combo:
Which would you like? [generic-eng]
Basically, the menu asks the user to select a combination of TARGET_PRODUCT and TARGET_BUILD_VARIANT. The menu offers a default combination, but another one can be passed to lunch as a parameter on the command line. These choices are not generated dynamically based on the content of AOSP. They are added individually using the add_lunch_combo() function defined in envsetup.sh. For instance, in Gingerbread 2.3.7, envsetup.sh adds generic-eng and simulator, as illustrated in Code snippet 3.
|# add the default one here|
|add_lunch_combo generic-eng|
|# if we're on linux, add the simulator. There is a special case|
|# in lunch to deal with the simulator|
|if [ "$(uname)" = "Linux" ] ; then|
|    add_lunch_combo simulator|
|fi|
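The registration mechanism can be sketched in a few lines of portable shell. This is an illustrative approximation rather than the code in envsetup.sh: the names COMBOS, register_combo, combo_product, combo_variant and print_menu are hypothetical, and the sketch assumes a combo has the form product-variant (the simulator entry, which the snippet above describes as a special case, is not split).

```sh
# Space-separated list of registered combos (hypothetical storage format).
COMBOS=""

# register_combo NAME: add NAME to the menu unless it is already present.
register_combo() {
    case " $COMBOS " in
        *" $1 "*) return 0 ;;   # duplicate, nothing to do
    esac
    COMBOS="${COMBOS:+$COMBOS }$1"
}

# Split a product-variant combo at its last dash.
combo_product() { echo "${1%-*}"; }    # part before the last dash
combo_variant() { echo "${1##*-}"; }   # part after the last dash

# print_menu: list the registered combos with a running number.
print_menu() {
    i=1
    for combo in $COMBOS; do
        echo "     $i. $combo"
        i=$((i + 1))
    done
}

register_combo generic-eng
register_combo simulator
print_menu
```

Splitting at the last dash keeps product names that contain underscores or other characters intact, so a combo such as full_crespo-userdebug yields the product full_crespo and the variant userdebug.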
lunch also offers a way to add vendor-specific scripts. Here is how it is done in 2.3.7 Gingerbread:
|# Execute the contents of any vendorsetup.sh files we can find.|
|for f in `/bin/ls vendor/*/vendorsetup.sh vendor/*/build/vendorsetup.sh device/*/*/vendorsetup.sh 2> /dev/null`|
|do|
|    echo "including $f"|
|    . $f|
|done|
Table 4 describes the required configuration information set up by lunch for default ‘generic-eng’ combo.