Understanding the nuances of user attribution is paramount in digital marketing. This has become increasingly difficult on iOS following changes such as App Tracking Transparency (ATT) in iOS 14.5 and, more recently, the updates in iOS 17. One measurement method that has been a point of contention with Apple is fingerprinting, also known as probabilistic attribution. Let's dive into this concept, how it works, and the challenges it faces as a measurement method as ‘Required Reasons’ rolls out in iOS 17.
What is fingerprinting / probabilistic attribution?
Definition
Fingerprinting, in the context of digital marketing, refers to the process of identifying users across different applications and websites based on specific device data. This method doesn't rely on cookies or device IDs; instead it combines device attributes to create a unique profile for a user, with varying levels of confidence. AppsFlyer, for example, claims a 92% accuracy rate and 89% coverage for its probabilistic attribution method. These profiles are often quite transient, since device attributes naturally change over time. The accuracy of any fingerprinting method depends on how much data can be collected and how little time has passed between the captures used to build and match user profiles.
How it Works
When a user clicks on an ad, views an ad, or visits a web page, a variety of data about their device and that interaction is captured. Later, when the user opens an app, the system matches the data collected at app open against the data captured during the ad click or impression. A match attributes the user's action to the specific ad they interacted with.
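To make the matching step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any vendor's actual algorithm: the `Signal` fields, the exact-match rule, and the one-hour window are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Signal:
    """Device attributes seen at an ad click or at first app open (hypothetical schema)."""
    ip_prefix: str
    device_model: str
    os_version: str
    timestamp: datetime


def match_install(install: Signal, clicks: List[Signal],
                  window: timedelta = timedelta(hours=1)) -> Optional[Signal]:
    """Return the most recent click whose attributes equal the install's, or None.

    Assumption: a click only qualifies if it happened before the install
    and no more than `window` earlier.
    """
    candidates = [
        c for c in clicks
        if timedelta(0) <= install.timestamp - c.timestamp <= window
        and (c.ip_prefix, c.device_model, c.os_version)
        == (install.ip_prefix, install.device_model, install.os_version)
    ]
    # Prefer the latest qualifying click; None when nothing qualifies.
    return max(candidates, key=lambda c: c.timestamp, default=None)


clicks = [
    Signal("160.93", "iPhone 12", "15.5", datetime(2022, 8, 1, 12, 0)),
    Signal("81.2", "iPhone 13", "16.0", datetime(2022, 8, 1, 12, 10)),
]
install = Signal("160.93", "iPhone 12", "15.5", datetime(2022, 8, 1, 12, 30))
matched = match_install(install, clicks)  # the first click qualifies
```

A production system would presumably weigh attributes and report a confidence level rather than require exact equality, but the shape is the same: compare what was seen at the click with what was seen at app open, within a time window.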
An example
The image above shows one example of probabilistic attribution by Adjust, and here is another scenario from the attribution company Incrmntal:
“If a user click is triggered from an iPhone 12 device running on iOS15.5, with 46% battery power, and the IP starting with 160.93 – this may be matched with an install of an app that is 80mb to download, if the app was downloaded within the same hour by an iPhone 12 device, running iOS15.5, with 40% battery power, and the IP starting with 160.93.
Theoretically, it’s likely that the click and the install actually came from the same user. But given the fact that New York has a population of 8.38M , and iPhones represent 50% of the US market, iOS15.5 represents 66.7% of iOS users (as of August 2022) and that at least 6% of the users will have a battery range of 46% - 40% within a given hour – this leaves approximately 167,683 users in the suspect pool of those who might have been “the one” that clicked the ad.”
This example reveals both the challenges and opportunities of fingerprinting / probabilistic attribution as an alternative to user-level attribution.
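The arithmetic in the quoted scenario can be checked in a few lines of Python. The figures are taken directly from the quote; the function name `suspect_pool` is ours and purely illustrative.

```python
import math


def suspect_pool(population: float, shares: list) -> float:
    """Multiply a population by the fraction of users matching each attribute."""
    return population * math.prod(shares)


pool = suspect_pool(
    8_380_000,   # New York population, per the quote
    [
        0.50,    # iPhone share of the US market
        0.667,   # share of iOS users on iOS 15.5 (August 2022)
        0.06,    # users whose battery is in the 46%-40% range within the hour
    ],
)
print(int(pool))  # 167683
```

Each extra attribute multiplies the pool by another fraction, which is why more data points narrow the match, and why a handful of coarse attributes still leaves a very large pool.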
When is Fingerprinting Used?
Fingerprinting is used in a handful of scenarios where other measurement options fall short, but it's not without its challenges and limitations.
Example use cases:
- Web to App Ads: iOS 14.5 introduced ATT, which requires user consent for tracking users across websites and apps. Some marketers have found a workaround to Apple's restrictions by driving paid traffic to a landing page that prompts users to download the app from there. By doing this they can collect device information both on the landing page and when the user downloads the app. This was largely driven by SKAdNetwork (SKAN) not supporting web-to-app ads, such as ads in Google Search that drive downloads, prior to SKAN 4 (which Google still does not support).
- Connected TV Attribution: Another significant use case for fingerprinting is in the realm of connected TV attribution. Here, the system matches the IP address of the TV with the IP of the user downloading the app.
Challenges:
- Apple's stance: Apple has taken a stringent stance against fingerprinting. In 2021, Apple began blocking apps that incorporated the Adjust SDK. Apple's primary concern was that the SDK algorithmically used device and usage data to create unique identifiers for tracking users. This move was in line with Apple's broader initiative to enhance user privacy.
- Accuracy dependence: The accuracy of fingerprinting largely depends on the quality, quantity and timeliness of the data available. The more data points you have, the better your chances of making a correct match, and matches are significantly more accurate the shorter the time between the ad click and the app install.
iOS 17 and Fingerprinting
Apple's iOS 17 has introduced new requirements for app developers and the SDKs they implement in their apps. These changes are outlined in more detail in our article about privacy updates in iOS 17, but of particular importance to fingerprinting and probabilistic attribution is the introduction of Required Reasons APIs.
If an SDK wants to access specific data points, it needs to provide valid reasons. In Apple’s words: “We know that there are a small set of APIs that can be misused to collect data about users’ devices through fingerprinting, which is prohibited by our Developer Program License Agreement.”
Apple's Required Reasons APIs will drastically limit which apps and SDKs can access this kind of device data. Learn more about the privacy apocalypse in our podcast.
The APIs are:
File Timestamp APIs:
APIs Included: creationDate, modificationDate, fileModificationDate, contentModificationDateKey, creationDateKey, getattrlist, getattrlistbulk, and more.
The Basics: These are timestamps indicating when a file was created or modified.
Legitimate Reasons to access this data:
- To display: Apps might need to show these timestamps to users.
- For internal use: Sometimes, apps need these timestamps for internal functionality, like organizing files in a CloudKit container.
- User-specific access: Apps may read timestamps of files the user has specifically granted access to, for example through a document picker.
System Boot Time APIs:
APIs Included: systemUptime, mach_absolute_time().
The Basics: These measure the time since your device last started up.
Legitimate Reasons to access this data:
- Event timing: Access the system boot time to measure the time elapsed between events within the app or to enable timers.
Disk Space APIs:
APIs Included: volumeAvailableCapacityKey, volumeAvailableCapacityForImportantUsageKey, volumeAvailableCapacityForOpportunisticUsageKey, volumeTotalCapacityKey, systemFreeSize, systemSize, and more.
The Basics: These keep track of how much storage space is left on your device.
Legitimate Reasons to access this data:
- To display: Apps use these APIs to inform users about their device's storage.
- Operational needs: Apps may need to check storage before downloading or saving files, for example to confirm there is enough space on the device to download a movie.
Active Keyboard APIs:
APIs Included: activeInputModes.
The Basics: These identify which keyboards (like English, Spanish, or Emoji) are active on your device.
Legitimate Reasons to access this data:
- For custom keyboards: If an app offers a unique keyboard experience as its core product.
- UI customization: Apps might tweak their interface based on the active keyboards, e.g. displaying different languages.
User Defaults APIs:
APIs Included: UserDefaults.
The Basics: These store your specific settings and preferences within apps.
Legitimate Reasons to access this data:
- Only for use within the app: Access user defaults to read and write information only accessible to the app itself (e.g. Uber couldn’t get the UserDefaults for Postmates or Uber Eats)
A third-party SDK, such as one from a mobile measurement partner (MMP) or an ad network, is unlikely to have a valid ‘Required Reason’ to access this data. The data it collects will therefore be less rich, lowering the accuracy of its probabilistic matches.
However, most MMPs only require location, OS version, device model and user agent as the basis of their probabilistic attribution models. This data could still be sent server-to-server or collected by their SDK. The other changes in iOS 17 (Privacy Manifests and blocking HTTP requests to tracking domains) are likely to restrict the ability of third-party probabilistic attribution providers to access this data.
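For a sense of what providing a reason looks like in practice, here is a hedged Python sketch that generates the relevant part of a privacy manifest (a property list, conventionally named `PrivacyInfo.xcprivacy`). The scenario is hypothetical: a video app that checks free space before a download and times in-app events. The key names and reason codes are written from memory of Apple's documentation and are assumptions to verify against the current docs before use.

```python
import plistlib

# Hypothetical declarations for a video app. Reason codes are assumptions
# recalled from Apple's documentation; confirm them before shipping.
accessed_api_types = [
    {
        # Disk space: check there is room before writing a download.
        "NSPrivacyAccessedAPIType": "NSPrivacyAccessedAPICategoryDiskSpace",
        "NSPrivacyAccessedAPITypeReasons": ["E174.1"],
    },
    {
        # System boot time: measure time elapsed between in-app events.
        "NSPrivacyAccessedAPIType": "NSPrivacyAccessedAPICategorySystemBootTime",
        "NSPrivacyAccessedAPITypeReasons": ["35F9.1"],
    },
]

manifest = {"NSPrivacyAccessedAPITypes": accessed_api_types}

# Serialize to the XML property-list format used by manifest files.
xml_bytes = plistlib.dumps(manifest, fmt=plistlib.FMT_XML)
print(xml_bytes.decode("utf-8"))
```

Note that none of the reasons listed above covers attribution, which is the crux of the problem for fingerprinting: an SDK can only declare a reason from Apple's approved list.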
Approach Fingerprinting / Probabilistic Attribution With Caution
This change sends a clear message from Apple: user privacy isn't up for negotiation. It is clear from this and the other changes that user-level marketing measurement on iOS is going away, and Privacy Sandbox for Android isn’t far behind. Savvy advertisers have accepted this, and rather than trying to hold onto user-level attribution, they’re investing in more privacy-safe marketing measurement techniques such as SKAdNetwork, Geo-Testing and Media Mix Modeling. As governments enact more regulations to reflect people’s growing demands for greater control over their online data, marketers, ad platforms, and operating systems will continue to converge towards a privacy-safe digital future for all.