An Update on the British Airways Breach and the Difficulty of Estimating Breach Numbers

“How many people were affected by the British Airways data breach?” That has been the million-dollar question for some time, but providing decent estimates in moments of crisis is hard.

First Published 2nd November 2018

BA addresses the fallout.

4 min read  |  Reflare Research Team

A statement released by International Airlines Group, the holding company of British Airways, states that two things have become clear during the extended investigation that followed the initial discovery of the breach:

  1. The attackers may also have captured payment card data from users performing rewards bookings during the time of the attack. British Airways has thus begun notifying an additional 185,000 customers that their data might have been breached. The potentially affected data sets consist of 77,000 complete payment data sets and 108,000 payment data sets without CVVs.

  2. Out of the initially reported and notified 380,000 affected users, only 244,000 turned out to have actually been affected by the breach.

Why is it hard to estimate the number of affected users?

At first glance, the numbers outlined in the previous paragraph may seem strange. British Airways missed 185,000 affected users (49% of the reported total) and simultaneously falsely identified 136,000 users (36% of the reported total) as being affected.
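The percentages above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted from the statement; the variable and function names are illustrative and not part of any official reporting.

```python
# Figures quoted from the International Airlines Group statement above.
initially_reported = 380_000    # users notified after the first report
confirmed_of_initial = 244_000  # of those, users now considered affected
newly_notified = 185_000        # additional users notified (rewards bookings)

# Users who were notified initially but turned out not to be affected.
falsely_identified = initially_reported - confirmed_of_initial


def share_of_reported(count: int) -> int:
    """Return count as a whole-number percentage of the initially reported total."""
    return round(100 * count / initially_reported)


print(falsely_identified)                     # 136000
print(share_of_reported(newly_notified))      # 49
print(share_of_reported(falsely_identified))  # 36
```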

However, such diverging numbers are somewhat normal when dealing with information security breaches. Even in clear-cut attacks, it can be hard to estimate how much data was stolen.

Let’s say, for example, that analysis shows that attackers gained access to a central database holding user payment information. They held that access for 8 minutes before incident response teams and intrusion prevention systems (IPS) disconnected them. The database held 100 million user records totalling 500 GB of data. Let’s also assume we can be sure that the attackers didn’t leave backdoors and thus had no access to the database outside of the 8-minute window. How many data sets were compromised?

From a reporting standpoint and without further investigation, all 100 million of them. After all, a breach can’t be ruled out for any individual record. From a practical perspective, however, stealing 500 GB of data in 8 minutes is virtually impossible. Even assuming the attackers had a transfer speed of 1 gigabit per second (which is highly unlikely, as the methods commonly used to cover tracks make speeds of one-hundredth of that more realistic), the total amount of data that could be transferred in 8 minutes is 60 gigabytes (8 minutes * 60 seconds per minute * 1 gigabit per second / 8 bits per byte). So even in the best case for the attackers, they could only have accessed around 12% of the data. This still ignores the time it would take the attackers to realize what they had broken into, the time needed to set up the data transfer, the time the database needs to run the query, possible sub-selections of data, and much more. There are plenty of reasonable scenarios in which attackers would only manage to steal a couple of dozen records in the 8-minute window.
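The back-of-the-envelope bound above is easy to reproduce. The following is a minimal sketch using the numbers from the hypothetical example; the function names are made up for illustration, and the record estimate additionally assumes that all records are roughly the same size, which the example does not state.

```python
def max_transferable_gb(window_minutes: float, link_gbit_per_s: float) -> float:
    """Upper bound on data moved within the window, in gigabytes (8 bits per byte)."""
    seconds = window_minutes * 60
    return seconds * link_gbit_per_s / 8


def max_fraction_stolen(database_gb: float, window_minutes: float,
                        link_gbit_per_s: float) -> float:
    """Largest share of the database that could have left within the window."""
    return min(1.0, max_transferable_gb(window_minutes, link_gbit_per_s) / database_gb)


DATABASE_GB = 500
TOTAL_RECORDS = 100_000_000
WINDOW_MINUTES = 8

# Generous case for the attackers: a full 1 Gbit/s link for the whole window.
generous_gb = max_transferable_gb(WINDOW_MINUTES, 1.0)
generous_share = max_fraction_stolen(DATABASE_GB, WINDOW_MINUTES, 1.0)

# More realistic case from the text: one-hundredth of that speed.
realistic_gb = max_transferable_gb(WINDOW_MINUTES, 0.01)

# Assumption: uniform record size, so the share of bytes equals the share of records.
generous_records = round(generous_share * TOTAL_RECORDS)

print(f"{generous_gb:.1f} GB, {generous_share:.0%} of the data")  # 60.0 GB, 12% of the data
print(f"{realistic_gb:.1f} GB at one-hundredth of the speed")     # 0.6 GB at one-hundredth of the speed
```

Even this bound is optimistic for the attackers, since it counts none of the setup and query time described above.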

Since maximum possible clarity is needed in such situations, forensics teams will use all sorts of analytical tools to figure out just what data was accessed. If the traffic was unencrypted, logs may contain a precise record of the stolen entries. Otherwise, database analytics and disk access tools may be able to generate a rough breakdown of what data was accessed to limit the breach’s scope from the complete 100 million record data set down to perhaps 10 million likely affected users.

However, even this is a somewhat best-case scenario. Under normal circumstances, the attack window can’t be precisely identified, backdoors can’t be ruled out, and discovery may come after weeks of leaks, long after the point at which precise forensic information can still be gathered. Furthermore, in cases such as the British Airways breach, the data was stolen directly from the users’ browsers, meaning that the airline has almost no data to go on beyond roughly correlating bookings with the period during which the malicious JavaScript file was present on its servers.
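To make the idea of "roughly correlating bookings" concrete, here is a minimal sketch of what such a correlation might look like. Everything in it is hypothetical: the `Booking` fields, the form labels, and the timestamps are invented placeholders, and nothing here reflects how British Airways actually stores or analysed its data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Booking:
    customer_id: str
    completed_at: datetime
    payment_form: str  # hypothetical label for the form that took the payment


def potentially_affected(bookings, script_first_seen, script_removed, skimmed_forms):
    """Customer IDs with a booking on a skimmed form while the script was live."""
    return {
        b.customer_id
        for b in bookings
        if script_first_seen <= b.completed_at <= script_removed
        and b.payment_form in skimmed_forms
    }


# Placeholder data, invented purely for illustration.
bookings = [
    Booking("c1", datetime(2000, 1, 2, 9, 0), "standard"),
    Booking("c2", datetime(2000, 1, 3, 18, 30), "rewards"),
    Booking("c3", datetime(2000, 1, 9, 12, 0), "standard"),  # after script removal
]
first_seen = datetime(2000, 1, 1)
removed = datetime(2000, 1, 5)

narrow = potentially_affected(bookings, first_seen, removed, {"standard"})
wide = potentially_affected(bookings, first_seen, removed, {"standard", "rewards"})

print(sorted(narrow))  # ['c1']
print(sorted(wide))    # ['c1', 'c2']
```

A filter like this is deliberately over-inclusive: it counts everyone who booked during the window, whether or not their data was actually captured. It is also only as good as the list of forms assumed to be affected, so adding a form to that list immediately widens the result.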

Thus, affected companies try to err on the side of caution. By over-reporting and over-notifying the number of affected users, they shield themselves from lawsuits and regulatory penalties. This explains why only 244,000 of the 380,000 reported users are now considered to have been affected.

The 185,000 newly affected users are a classic example of newly discovered attack vectors. The initial report seems to assume that only users making credit-card payments were affected. Somewhere in the incident response cycle the airline likely realized that the form for ordering reward flights also contained a credit card payment option and that its users were likely affected as well.


It can be hard to figure out how many users were affected by a data breach. Companies tend to err on the side of being overly cautious to shield themselves from liability. There is a difference between “legally affected” (any user of a breached system where data theft cannot be ruled out) and the number of people whose data was ultimately stolen.