Reverse Engineering OruxMaps' Preference Encryption - Part 1
I’m an avid user of the Android app OruxMaps. I use it to navigate along pre-made tracks while cycling. It can show directions and altitude diagrams, live-track other people and much more! Sadly, it isn’t open source and has some (in my opinion) anti-features. Let’s see if we can do something about those.
While setting up a second phone to use as a backup navigation device on long cycling trips, I wanted to transfer the app’s settings to the new phone. Nicely enough, OruxMaps provides a “profiles” feature, which saves all the in-app preferences into an XML file (or so I thought).
The preview line under the filename already made me suspicious.
All .xml files begin with the same weird-looking characters, but the content does not seem to be XML.
Inspecting the Files on the Desktop
What better utility to use than the file command when you don’t know what kind of file you are dealing with?
Great, that was useful, now I know exactly what to do…
Maybe a look at the raw bits will help me identify some recurring patterns or strings.
Opening the files in emacs using hexl-mode, the beginning of each xml file looks exactly the same:
Hmm, all files start the same way, so there seems to be a fixed structure! Maybe this is some Unicode encoding issue? I tried opening the file with all encodings emacs had to offer, but to no avail. Online tools claiming to be able to identify charsets failed to identify any.
I could not find a charset that makes the XML appear as a readable text document.
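The “try every encoding” step can also be scripted. This is just a minimal sketch of the idea: the candidate list, the function name encodings_yielding_xml and the two sample inputs are my own assumptions, not anything taken from OruxMaps or emacs.

```python
import codecs

# Hypothetical shortlist of encodings; extend it as needed.
CANDIDATES = ["utf-8", "utf-16", "utf-16-be", "utf-32", "latin-1", "cp1252"]

def encodings_yielding_xml(data: bytes, candidates=CANDIDATES):
    """Return the encodings under which data decodes to text starting with '<?xml'."""
    hits = []
    for name in candidates:
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding at all
        # Ignore a leading byte order mark and whitespace before checking.
        if text.lstrip("\ufeff \t\r\n").startswith("<?xml"):
            hits.append(name)
    return hits

# A real XML document stored as UTF-16 (little endian, with BOM) is recognized ...
real_xml = codecs.BOM_UTF16_LE + '<?xml version="1.0"?><prefs/>'.encode("utf-16-le")
print(encodings_yielding_xml(real_xml))

# ... while made-up opaque bytes do not turn into XML under any candidate.
opaque = bytes(range(128, 192))
print(encodings_yielding_xml(opaque))
```

If a file comes back with an empty list for every encoding you can think of, an encoding mix-up becomes a rather unlikely explanation.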
Maybe the File is Encrypted?
My next suspicion was that the file’s contents are encrypted without the use of a nonce. In that case, identical blocks of plaintext result in identical ciphertext!
Nonces are basically random (or at least never repeated) numbers which are fed into a symmetric encryption algorithm in order to prevent the mistake we just saw. By using a different nonce each time when encrypting with the same algorithm and secret, the same input will look different every time. Therefore an attacker is unable to see whether files contain the same content.
If we encrypt the string ABT letter by letter (ECB-mode, block size = 1) without using a nonce, the problem becomes apparent:
Without any knowledge of the secret key, an attacker can see that the first and second ciphertexts start with the same two characters, so the plaintexts must too. This usually reveals file signatures (in our case XML).
To prevent this, the encryption algorithm is additionally given a nonce in the encryption and decryption process. As nonce means number used once, the encryption of plaintext + nonce yields a different ciphertext every time. Drawback: the nonce must be stored in plaintext alongside the file, as it is needed again for decryption.
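To make the leak tangible, here is a toy sketch in Python. Both “ciphers” are made up for illustration and are not secure, and they are certainly not what OruxMaps uses: toy_ecb_encrypt is a key-dependent byte substitution (ECB with a block size of one byte), toy_nonce_encrypt XORs the input with a keystream derived from key and nonce. The second string ABC, the keys and the nonces are assumptions of mine.

```python
import hashlib
import random

def toy_ecb_encrypt(plaintext: bytes, key: int) -> bytes:
    """Toy 'ECB with block size 1': a fixed, key-dependent byte substitution."""
    table = list(range(256))
    random.Random(key).shuffle(table)  # same key -> same substitution table
    return bytes(table[byte] for byte in plaintext)

def toy_nonce_encrypt(plaintext: bytes, key: bytes, nonce: bytes) -> bytes:
    """Toy stream cipher: XOR with a keystream derived from key and nonce.

    Applying it twice with the same key and nonce decrypts again.
    """
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))

# Without a nonce: the shared prefix "AB" is visible in both ciphertexts.
first = toy_ecb_encrypt(b"ABC", key=1234)
second = toy_ecb_encrypt(b"ABT", key=1234)
print(first.hex(), second.hex())
print("same first two bytes:", first[:2] == second[:2])

# With a nonce: even the identical plaintext encrypts differently each time.
run_1 = toy_nonce_encrypt(b"ABT", b"secret", nonce=b"nonce-1")
run_2 = toy_nonce_encrypt(b"ABT", b"secret", nonce=b"nonce-2")
print(run_1.hex(), run_2.hex())
```

Note that the nonce has to be passed in again for decryption, which is exactly the drawback mentioned above: it must be stored next to the ciphertext.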
This information gives us a very good indication that the file might be encrypted. Let’s take further steps to confirm our suspicions.
Using die to Check the Entropy
A good metric for telling encrypted and unencrypted files apart is entropy. The basic idea of entropy is to describe how random the data of a file seems.
When writing a plain English text, one typically uses alphanumerical characters, whitespace and punctuation. The letter e is also used very frequently in comparison to other characters. This “syntax of the English language” can be distinguished from completely random data using entropy.
Let’s use the Detect-It-Easy (die) tool to see if it has any new information on the maybe-encrypted file and to check the entropy.
First, let’s see if it shows us some known signatures:
Detect-It-Easy detects no known signatures on the possibly encrypted file. Clicking the “Entropy” button reveals a histogram displaying the relative frequency of each byte in the file.
As you can see, each byte value appears about equally often, which would be highly unlikely for structured text. The entropy field in the GUI shows a coefficient of ≈99%, which is a good indicator of an encrypted file.
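For the curious, such an entropy value can be approximated in a few lines of Python. This is a sketch under assumptions: I use the Shannon entropy over byte frequencies and express it as a percentage of the 8-bit maximum, which is my guess at what the percentage means and not a description of Detect-It-Easy’s actual implementation. The two sample inputs are made up as well.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 = one repeated byte, 8.0 = uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def entropy_percent(data: bytes) -> float:
    """Entropy relative to the 8-bit maximum (assumed meaning of the percentage)."""
    return shannon_entropy(data) / 8 * 100

# Stand-ins for the two kinds of files: random bytes and repetitive XML text.
random_like = os.urandom(64 * 1024)
xml_like = b'<?xml version="1.0" encoding="utf-8"?>\n<map><entry key="units">metric</entry></map>\n' * 200

print(f"random-looking data: {entropy_percent(random_like):.1f}%")
print(f"XML-looking data:    {entropy_percent(xml_like):.1f}%")
```

Random-looking input lands close to 100%, while XML-like text stays far below that, which is the same contrast the tool shows.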
Comparing Entropy to a Plaintext File
To see the contrast between the entropy and histograms, let’s open an unencrypted, UTF-8 encoded plaintext XML file:
The tool correctly identified the XML file type and the standard Linux line feeds (LF). What about the entropy?
That looks very different compared to the result before! We can clearly identify discrete regions of byte values with high frequency. The entropy also is at a low 59%.
Looking at the decimal values in the ASCII table, one can recognize that this correlation is no accident. All high values in the histogram are very common in XML files:
- Peak @ 32-34: Space (32) and double quote (34)
- Small Peak @ 48-57: Numbers (0-9)
- Peak @ 60-62: XML Tag Syntax (<, = and >)
- Peak @ 97-122: Lowercase Alphabet (a-z)
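The peaks listed above can be reproduced without any GUI by counting how many bytes fall into each ASCII range. A small sketch follows; the sample document and the names RANGES and range_shares are made up for illustration.

```python
from collections import Counter

# The ASCII ranges from the list above (range end is exclusive).
RANGES = {
    "space and quote (32-34)": range(32, 35),
    "digits (48-57)": range(48, 58),
    "tag syntax (60-62)": range(60, 63),
    "lowercase letters (97-122)": range(97, 123),
}

def range_shares(data: bytes) -> dict:
    """Return the share of bytes (0.0 to 1.0) falling into each ASCII range."""
    counts = Counter(data)
    total = len(data)
    return {label: sum(counts[b] for b in r) / total for label, r in RANGES.items()}

# Hypothetical stand-in for a plaintext preferences file.
sample = b'<?xml version="1.0" encoding="utf-8"?>\n<map><entry key="units">metric</entry></map>\n'

shares = range_shares(sample)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
print(f"covered by the four ranges: {sum(shares.values()):.0%}")
```

For XML-like text, these four ranges cover the vast majority of all bytes, so everything outside of them stays nearly empty in the histogram.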
Decompiling the Android App
Now that we have a good indication that the file is encrypted, we can continue our search and focus on finding the encryption.
Thankfully, decompiling an Android app is easy nowadays with apktool, which literally is “A tool for reverse engineering Android apk files”.
After downloading the official apk from the website, I took a look at our new tool.
The usage couldn’t be simpler: apktool d OruxMaps7.4.23.apk decompiles the whole app and leaves us with a directory of decompiled goodness.
Anyone who ever has developed an Android app will immediately recognize some of those directories:
- AndroidManifest.xml: The app’s metadata and required permissions
- res: Resources like images, strings, and drawables
- assets: Raw resources, not touched in compilation
The smali folder, however, is what we are really looking for. It contains the disassembled Java code.
Aren’t Android Apps Written in Java?!
Yes, they are! But compiling Java source code into bytecode which can be run by the Java Virtual Machine (JVM) is a lossy process. Information like variable names, comments and code structure is lost, as it is irrelevant to your phone’s processor.
But Android is special: it does not run the run-of-the-mill .class bytecode files which you get when using the Java compiler (javac). Additionally, it uses the d8 tool (formerly dx) to bundle multiple .class files into a .dex file. A .dex file can be run by Android’s counterpart to the JVM, which is called Dalvik. [8]
.dex files are not human-readable (because they are made for machines, duh!). But by using the .dex disassembler baksmali, apktool can turn the compiled .dex bytecode into multiple .smali files.
Where to Start Reverse Engineering?
The problem is pretty obvious once we see how many smali files we have:
To narrow it down we can use some information that we already have:
- Some encryption algorithm
- Something saving a file named
- Probably something creating an XML file
Generally, strings are the best entry point for analyzing the code, as they are preserved in the compilation process (if no obfuscators or packers are used).
Let’s focus on the file name then.
The full path to the encrypted preferences file is Internal Storage/oruxmaps/preferences/4.2/om2_Basti.xml (as we saw in the screenshot of the directory).
But which parts of the string are non-dynamic and are likely to be found in the smali code?
- Internal Storage/oruxmaps/preferences seems like a fixed string, but it could be joined out of the individual directory names
- 4.2 is likely some version number ⇒ dynamically generated
- om2_Basti.xml consists of the fixed prefix om2_, the profile name Basti and the suffix .xml
Our best shot is to go with the preferences directory string or the filename prefix om2_, as they are constant with a high probability.
Looking for Constant Strings
For my code-searching tasks I prefer grep as it comes with prettier output and sane defaults.
In order to look for the constant string preferences in all smali “sources”, we can use an extremely simple regex:
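The same kind of search can be sketched in Python, which makes it easy to restrict matches to smali’s const-string instructions (the instruction that loads a string literal into a register). The function name find_constant_strings and the demo smali snippet are made up for illustration; real OruxMaps code will look different.

```python
import re
import tempfile
from pathlib import Path

# Matches lines such as:  const-string v0, "preferences"
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+\w+,\s+"(.*)"')

def find_constant_strings(root: Path, needle: str):
    """Yield (file, line number, literal) for const-string lines containing needle."""
    for path in sorted(root.rglob("*.smali")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                match = CONST_STRING.search(line)
                if match and needle in match.group(1):
                    yield path, number, match.group(1)

# Self-contained demo on a made-up smali file instead of the real app.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "smali" / "Example.smali"
    demo.parent.mkdir(parents=True)
    demo.write_text(
        ".method public static a()V\n"
        '    const-string v0, "preferences"\n'
        '    const-string v1, "om2_"\n'
        ".end method\n",
        encoding="utf-8",
    )
    hits = [(p.name, n, s) for p, n, s in find_constant_strings(Path(tmp), "preferences")]
    print(hits)
```

Pointing find_constant_strings at the directory created by apktool, with preferences or om2_ as the needle, should give a similar list of candidate files to start reading.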
It seems like we have found ourselves some possible entry points. We will dive into reverse engineering the logic in part 2.
[8] Starting with Android 5, the Android Runtime (ART) is used instead of Dalvik.