You might be familiar with the common use of ADS-B – plotting
planes for the curious public to see on sites like FlightAware or
Flightradar24. Behind these sites is a network of radio receivers that
capture Mode S transponder traffic, broadcast on 1090 MHz.
These broadcasts are receivable (with a good antenna) up to 200 or 300 km
away, and consist of small packets of data, known as frames.
Small would be an understatement, though – each frame consists of
either 56 bits or 112 bits of payload, much smaller than would be
required to transmit the identity, position, altitude and velocity of the
aircraft all at once. Aircraft broadcast their state spread across
multiple frames, with data encoded in a
really cute way that makes the most of the limited message size.
We'll be looking into how aircraft are identified in Mode S messages, in ADS-B, and how (sometimes) you can extract a tail number from Mode S aircraft IDs. But, first, a brief description of Mode S communications.
Secondary surveillance radar, or SSR, is a technique that has been used in air traffic control (ATC) for decades. Originally (back in the WW2 days), aircraft positions were derived from the ground using primary radar - a radio beam goes out from a ground station and reflects off an aircraft, and the time delay and the azimuth of the ground antenna indicate how far away the aircraft is, and in what direction, relative to the ground station. Primary radar is useful, but it suffers from clutter from time to time – weather, birds, terrain features and other things can cause false returns which may not be distinguishable from real aircraft.
Secondary surveillance was established by installing transponders in aircraft that would respond to primary surveillance radar pulses - when the radar was detected, the transponder would reply with a set of information from the aircraft. Mode A was an early example of secondary surveillance - the pilot would select a 4-digit octal identifier, based on a code issued by ATC, which would be broadcast on every received radar pulse to identify either the type of aircraft or the individual aircraft.
Over time, this evolved into the more modern Mode A/C transponders, which report pressure altitude in addition to the 4-digit identifier, allowing for more accurate vertical position estimation from the ground, given the limited elevation resolution from ground-based radars at a distance from aircraft. Including the (normalised) pressure altitude allows ATC to provide more accurate, safer vertical separation between aircraft. These transponders are limited in what data they can supply, though – the protocol only allows for an octal identifier and a pressure altitude, 12 bits each.
Mode S followed as a further enhancement, with a number of (really cute) hacks to suppress responses from Mode A/C transponders; the pulse timing from the ground radar is varied in each different mode, and the particular timing of Mode S inhibits responses from Mode A/C transponders. Additionally (and crucially), the number of aircraft addresses was increased from 4096 to 16 million (24-bit addressing), which eliminates one possible area of pilot error, and provides the ability for aircraft to have a stable, unique identifier in the air.
Mode S SSR is a protocol between a ground radar station and an aircraft with an installed transponder. The ground station can communicate directly with a single aircraft, or request that all aircraft broadcast their 24-bit ID. Addressing a single aircraft reduces how often aircraft talk over each other, a common problem in busy airspace.
Mode S messages can be either short (56 bits of payload) or long (112 bits). They take 20µs and 34µs on uplink, and 64µs and 120µs on downlink, because downlink has a lower bitrate (1Mbps) than uplink (4Mbps). Both directions incur a significant preamble overhead (6-18%).
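The downlink figures above can be reproduced with a little arithmetic. This is a minimal sketch that assumes an 8µs downlink preamble (the difference between the quoted 64µs and the 56 bits sent at 1Mbps); the function names are made up for illustration.

```python
# Assumption: the downlink preamble lasts 8 microseconds, inferred from
# the article's figures (64us total - 56 bits at 1 bit per microsecond).
PREAMBLE_US = 8.0
DOWNLINK_MBPS = 1.0  # 1 Mbps, so each bit takes one microsecond


def downlink_airtime_us(frame_bits: int) -> float:
    """Total time on air for a downlink frame, preamble included."""
    return PREAMBLE_US + frame_bits / DOWNLINK_MBPS


def preamble_overhead(frame_bits: int) -> float:
    """Fraction of the airtime spent on the preamble."""
    return PREAMBLE_US / downlink_airtime_us(frame_bits)


if __name__ == "__main__":
    for bits in (56, 112):
        airtime = downlink_airtime_us(bits)
        overhead = 100 * preamble_overhead(bits)
        print(f"{bits} bits: {airtime:.0f}us on air, {overhead:.1f}% preamble")
```

Under that assumption, the short frame spends 12.5% of its airtime on the preamble and the long frame a little under 7%, which sits inside the 6-18% range quoted above.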
Given the limited space, the protocol has a significant amount of bit packing and smart reuse of fields, especially considering the age of the protocol. Each frame must include (at a minimum) the type of frame, the identity of the aircraft, and a checksum such that simultaneous replies can be detected and rejected by the receiver. A 24-bit aircraft ID would take up nearly half the frame, for a short message!
A typical Mode S downlink frame will consist of a downlink format (type of message), payload and CRC. A few example message types are shown below:
As you can see, a number of these downlink formats don't include (nor have the space for) a 24-bit aircraft ID. How do they include it? There's another 24-bit field in every message that has the answer!
One of the ways that aircraft save space in the frame is that they don't broadcast their identity – or at least, don't broadcast their identity in every single frame. They infrequently broadcast their ICAO 24-bit identifier in plaintext, which is roughly equivalent to a computer's network card MAC address. As the name suggests, it's 24 bits, and is broadcast in an all-call reply message periodically, such that ground equipment and other aircraft are aware of which aircraft IDs are in the area. Why, you ask?
Mode S has a cute hack where aircraft IDs aren't transmitted as a standalone field, but are XORed with the CRC to provide the address/parity field. Given a (good) 24-bit checksum, there's only a 1 in 2^24 chance that a bad message will pass the CRC. By overlaying the aircraft ID with the CRC, the Mode S protocol increases that (expected) false positive rate to ~100 in 2^24, if there are ~100 aircraft in the area, which is a reasonable approximation for most urban areas. The aircraft IDs broadcast in-the-clear with a standalone CRC are used to determine which CRC results are acceptable around that point in time, and implicitly identify the transmitting aircraft.
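The overlay can be sketched in a few lines of Python. This is an illustration rather than a reference decoder: it assumes the commonly cited Mode S generator polynomial 0x1FFF409, covers only the downlink case, and uses a made-up 32-bit payload. The helper names (`overlay_address`, `recover_address`, `is_plausible`) are hypothetical.

```python
# Assumption: 0x1FFF409 is the commonly cited 25-bit Mode S CRC generator.
GENERATOR = 0x1FFF409


def crc24(data: bytes) -> int:
    """Bitwise, MSB-first CRC-24 over the given bytes (initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 16              # feed the byte into the top of the register
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:        # a bit fell off the 24-bit register
                crc ^= GENERATOR       # reduce; this also clears bit 24
    return crc & 0xFFFFFF


def overlay_address(payload: bytes, icao24: int) -> bytes:
    """Append an address/parity field: the CRC XORed with the aircraft ID."""
    return payload + (crc24(payload) ^ icao24).to_bytes(3, "big")


def recover_address(frame: bytes) -> int:
    """XOR the received address/parity field with a freshly computed CRC."""
    payload, address_parity = frame[:-3], frame[-3:]
    return crc24(payload) ^ int.from_bytes(address_parity, "big")


def is_plausible(frame: bytes, known_ids: set) -> bool:
    """Accept a frame only if it decodes to an ID recently seen in the clear."""
    return recover_address(frame) in known_ids


if __name__ == "__main__":
    known_ids = {0x448421}                           # learned from all-call replies
    payload = bytes.fromhex("20000000")              # made-up 32-bit payload
    frame = overlay_address(payload, 0x448421)
    print(hex(recover_address(frame)), is_plausible(frame, known_ids))
    corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one payload bit
    print(is_plausible(corrupted, known_ids))
```

A corrupted frame still decodes to some 24-bit value, which is why the receiver has to keep a list of IDs heard in the clear: a bad frame is only accepted if it happens to land on one of them.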
The International Civil Aviation Organisation (ICAO) is responsible for administering the ID space for aircraft 24-bit IDs. It does this by allocating (variable length) prefixes to countries, whose national registries then delegate individual IDs to particular aircraft. Unfortunately, the official list of these prefixes is not made available to the public, but by doing some quick research, you can find that countries are allocated blocks of 2^10, 2^12, 2^15, 2^18 and 2^20 addresses, notionally based on the size of the aviation environment in each country, but perhaps due to political factors as well.
These 24-bit IDs tend to be allocated on the basis of an aircraft registration, which will typically also designate a particular tail number for the aircraft. You may have seen tail numbers written on the exterior of commercial aircraft, or if you've been up front, you may have seen the tail number on a placard in the cockpit. For example, N999N is a US tail number, and VH-ABC is an Australian tail number. Tail numbers are globally unique, and consist of a country prefix (N, VH-, JA), followed by a unique suffix for the aircraft. When an aircraft changes country of registration, the tail number will change too.
Some countries (mainly European and North American) have been allocated sufficient 24-bit ID space (32,768 and above) where they can define an algorithmic mapping from tail number to 24-bit ID. For example, the US has a (unique) algorithmic scheme, as do Australia, Canada, Germany and France.
Other countries, perhaps later adopters of Mode S SSR, have adopted more standard schemes for packing a tail number into a Mode S ID, including the following approaches:
The 5-bit encoding, which packs each letter of the tail number suffix into 5 bits of the ID, is dead simple to decode. Most countries share the same character set, indexed by the 5-bit value: '?ABCDEFGHIJKLMNOPQRSTUVWXYZ?????'.
We'll walk through this with a sample aircraft ID of 0x448421. Looking through the list of country prefixes, we find the leading bits match up with Belgium, which is allocated the binary prefix 0100 01 001, or 0x448000 through 0x44FFFF. Great! We know that Belgian planes have the tail number prefix OO-.
To decode the suffix, let's keep only the bottom 15 bits: 0x0421, or 000 0100 0010 0001. It's a little clearer when we rearrange into 5-bit groups: 00001 00001 00001.
Looking up each of those 5-bit groups into our table, we get a suffix of AAA, making for a tail number of OO-AAA. Too easy!
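The same walk-through can be written as a short Python sketch. It uses the character set and the Belgian block quoted above; the function names are hypothetical, and other countries using a 5-bit scheme may need a different block and prefix.

```python
# Shared 5-bit character set: index 0 and 27-31 are unused ('?').
CHARSET = "?ABCDEFGHIJKLMNOPQRSTUVWXYZ?????"

# Belgium's block as described in the text: 0x448000 through 0x44FFFF.
BELGIUM_BLOCK = range(0x448000, 0x450000)


def decode_5bit_suffix(icao24: int) -> str:
    """Decode the bottom 15 bits as three 5-bit characters."""
    suffix_bits = icao24 & 0x7FFF
    return "".join(
        CHARSET[(suffix_bits >> shift) & 0x1F] for shift in (10, 5, 0)
    )


def decode_belgian(icao24: int) -> str:
    """Return the OO- tail number for an ID inside the Belgian block."""
    if icao24 not in BELGIUM_BLOCK:
        raise ValueError(f"{icao24:#08x} is outside the Belgian block")
    return "OO-" + decode_5bit_suffix(icao24)


if __name__ == "__main__":
    print(decode_belgian(0x448421))  # the sample ID from the walk-through
```

A '?' in the output means that 5-bit group falls outside the alphabet, which is a good hint that the ID was not assigned using this scheme.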
The base-26 encoding is perhaps a smidge harder to show visually, but just as easy to decode as the 5-bit format above. Let's try this out with an aircraft ID of 0x471F7E. Again referring to the country prefix list, we find that Hungary has been allocated the 9-bit prefix of 0x470___, and we know Hungarian tail numbers start with HA-.
Instead of a divide-by-32 approach that we took with the 5-bit suffixes, we'll need to divide by 26 for this tail number. First, find the offset within the Hungary block, by subtracting 0x470000 from the 24-bit ID. This leaves us with 0x1F7E, or 8062 in decimal.
floor(8062 / (26 * 26)) = 11
floor(8062 / 26) % 26 = 24
8062 % 26 = 2
From this, we can consult our friendly Latin alphabet, and look at the (zero-indexed) letter for each calculation. 11 is L, 24 is Y, 2 is C, leaving us with HA-LYC.
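Here is the same divide-by-26 calculation as a Python sketch. The base address and the HA- prefix come from the example above; the function name and its default arguments are hypothetical.

```python
import string

# Start of the Hungarian block, as used in the worked example.
HUNGARY_BASE = 0x470000


def decode_base26(icao24: int, base: int = HUNGARY_BASE, prefix: str = "HA-") -> str:
    """Decode a three-letter suffix stored as a base-26 offset from `base`."""
    offset = icao24 - base
    if not 0 <= offset < 26 ** 3:
        raise ValueError(f"offset {offset} does not fit three base-26 letters")
    first, remainder = divmod(offset, 26 * 26)   # most significant letter
    second, third = divmod(remainder, 26)        # middle and last letters
    letters = string.ascii_uppercase
    return prefix + letters[first] + letters[second] + letters[third]


if __name__ == "__main__":
    print(decode_base26(0x471F7E))  # the sample ID from the walk-through
```

Using `divmod` twice gives the same three digits as the floor-and-modulo steps above, just without recomputing the offset each time.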
Not all aircraft regulators manage their address block in the same manner, so you'll see different countries using different base offsets, depending on the size of their ICAO24 address block and the way they allocate tail numbers. If you have 3 or 4 tail numbers from a country and their corresponding ICAO24 addresses, you can confirm whether they have a common base address for the tail number, and whether there is a consistent scale between each tail number (typically 1:1).
For this example, let's use North Korea. We have (experimentally) determined that their addresses start at 0x727530 for tail number 0, and we know their tail number prefix is P-. To decode 0x7277D0, we subtract 0x727530, which leaves us with 0x2A0, or 672 in decimal, giving P-672. Easy!
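The offset scheme is the simplest to sketch in Python. The base address below is the experimentally determined value from the example, not an official figure; the `scale` parameter and the function names are hypothetical, included only to cover countries whose addresses are not spaced 1:1.

```python
# Experimentally determined in the text: tail number 0 maps to this address.
NORTH_KOREA_BASE = 0x727530


def decode_offset(icao24: int, base: int = NORTH_KOREA_BASE,
                  prefix: str = "P-", scale: int = 1) -> str:
    """Decode a numeric tail number stored as (base + number * scale)."""
    offset = icao24 - base
    if offset < 0 or offset % scale != 0:
        raise ValueError(f"{icao24:#08x} does not fit base {base:#08x}")
    return f"{prefix}{offset // scale}"


def encode_offset(number: int, base: int = NORTH_KOREA_BASE, scale: int = 1) -> int:
    """Inverse mapping, handy for checking a guessed base against known pairs."""
    return base + number * scale


if __name__ == "__main__":
    print(decode_offset(0x7277D0))    # the sample ID from the walk-through
    print(hex(encode_offset(672)))    # round-trips back to the same address
```

Running `encode_offset` over a handful of known tail numbers and comparing against their observed addresses is a quick way to confirm (or reject) a guessed base and scale.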
Take a quick look at the ADJS-B library if you're interested in decoding other countries' ICAO24 IDs, including Australia, Belgium, Canada, Switzerland, Germany, Denmark, Finland, France, Greece, Hungary, Japan, North Korea, South Korea, Portugal, Romania, Russia, Sweden, Thailand, Turkey, the US and South Africa.
One quick word of warning, though: these mappings were (mostly) determined experimentally, as very few countries publish their address-to-tail-number algorithms. And many countries don't actually have a static mapping – they generate seemingly-random IDs that are assigned to an aircraft, rather than rely on a static, algorithmic mapping between the two.