Fingerprinting a flood: forensic statistical analysis of the mid-2021 Monero transaction volume anomaly
Contributors: Isthmus (Mitchell P. Krawiec-Thayer), Neptune, Rucknium, Jberman, Carrington
Correspondence: isthmus@getmonero.org
Code: See the project repository and visualizations notebook
Introduction
In the second half of July 2021, there was an anomalous increase in transaction volume on the Monero network. This is marked by the red line in this plot of daily transaction volume:
One cannot help but wonder about the sudden uptick, and a few questions naturally come to mind about the “source” of the excess transaction volume:
- Is the source one or multiple entities?
- What are the software fingerprints and behavioral signatures of anomalous transactions?
- How many transactions did the source generate, and how much did that cost?
Thankfully, we have the data to explore all of these questions about the anomaly and its source, courtesy of Neptune and the Noncesense Research Lab database.
If you want to jump straight to the answers, feel free to skip ahead to the conclusions section. If you want to get deep in the data and see exactly how we analyzed the activity, all the nitty gritty details are below. Note: this article assumes familiarity with the concept of ring signatures as a privacy mechanism. The relevant background is covered in chapter 3 of Mastering Monero, which is a free resource thanks to generous community crowdfunding.
Principles and limits of analysis
Before we go any further, let’s talk about the fundamental idea underlying this analysis (and its limits). The following analysis is not capable of deanonymizing arbitrary individual transactions, and is only statistically viable for wallets with a large transaction volume (hundreds to thousands of transactions per day). In other words, this aggregate profiling should not alarm day-to-day users. To help clarify its limitations, this section provides a nontechnical analogy for the key concept behind this aggregate analysis.
How is the upcoming anomaly profiling accomplished without tracking individual transactions? We look at overall statistics, and how they change when the transaction volume increases. For an analogy, consider a fruit stand that typically has about a dozen pieces of fruit out each morning: a mix of bananas, limes, apples, and oranges. On most mornings, it looks something like this:
One morning, you hear that there are TWICE as many pieces of fruit as usual! When nobody was looking, some vendor dumped an extra box of some fruit onto the display. Just knowing that there is twice as much fruit doesn’t allow you to point to a particular item and say ‘that was added by the mysterious source.’ But that doesn’t mean that we can’t infer anything. Here’s what the table looks like now:
You can probably make some educated guesses now. Since the stand now contains 65% apples instead of the usual 25% apples, the excess fruit probably came from an apple vendor. Since there are 18 apples today, compared to the usual 4 or so, you could make a ballpark estimate that they probably contributed around a dozen apples.
The statistical analysis below uses the same logic! We look at all blockchain activity during the volume anomaly and then compare the prevalence of various transaction characteristics over time to draw general conclusions.
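To make the analogy's arithmetic concrete, the estimate is just a subtraction; the numbers below are the illustrative ones from the fruit stand, not measurements:

```python
# Toy version of the fruit-stand estimate (all numbers illustrative, taken from the analogy).
typical_apples = 4        # apples on a normal morning ("the usual 4 or so")
observed_apples = 18      # apples on the anomalous morning

excess_apples = observed_apples - typical_apples
print(f"Estimated vendor contribution: ~{excess_apples} apples")  # ~14, i.e. "around a dozen"
```

The on-chain analysis below performs the same subtraction, but with daily counts of transactions sharing a given characteristic instead of pieces of fruit.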
Decomposing transaction volume
The original transaction volume timeseries has frequent oscillations on a day-to-day basis:
The combs look so precisely spaced because Monero’s transaction volume has a weekly periodicity:
Since this weekly periodicity is unrelated to the phenomenon we are studying, it’s easier to look at the volume trends without this “seasonality”:
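The exact deseasonalization procedure isn’t critical here, but for readers who want to reproduce this kind of decomposition, below is a minimal sketch of one common approach (not necessarily the exact method used for the figures in this article). It uses synthetic stand-in data and statsmodels’ STL to strip out a 7-day cycle:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Synthetic stand-in for the real daily transaction counts (hypothetical data,
# just to make the sketch runnable): a slow trend plus a weekly cycle plus noise.
dates = pd.date_range("2021-01-01", "2021-08-31", freq="D")
rng = np.random.default_rng(0)
daily_tx_counts = pd.Series(
    12_000 + 10 * np.arange(len(dates))                        # slow upward trend
    + 1_500 * np.sin(2 * np.pi * np.arange(len(dates)) / 7)    # weekly "comb"
    + rng.normal(0, 300, len(dates)),
    index=dates,
)

# Decompose into trend + weekly seasonality + residual, then remove the weekly cycle.
result = STL(daily_tx_counts, period=7, robust=True).fit()
deseasonalized = daily_tx_counts - result.seasonal
```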
How to ascertain source attributes
To see how these plots can be used to suss out the characteristics of the source, let’s start with a control case, looking at transactions with 11 outputs.
Neither the observed counts (black line above) nor the underlying trend (green line above) changes appreciably at the red line where the anomaly begins. And now that we’re looking at a subset of transactions (# outputs = 11), we can also ask: what fraction of transactions on a given day had this characteristic?
Since the rate of transactions with 11 outputs did not increase during the anomaly, their fractional share of the daily volume decreases at the red line. This decrease in relative prevalence when the anomaly begins (red line) is an antipattern suggesting that having 11 outputs is NOT a fingerprint of the source.
Now let’s compare these results to statistics for the subset of transactions with 2 outputs (“2OTX”):
Above, we see that the count of 2OTXs increases in step with the anomalous transaction volume, and that 2-output transactions become even more prevalent than usual. Thus we can note that having 2 outputs is an attribute of the anomalous transactions.
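As a sketch of how such a prevalence trace can be computed, assume a hypothetical per-transaction table with the date each transaction was mined and its output count (illustrative rows only):

```python
import pandas as pd

# Hypothetical per-transaction table: one row per transaction, with the day it
# was mined and its output count (illustrative rows only).
txs = pd.DataFrame({
    "date":        ["2021-07-20"] * 3 + ["2021-07-21"] * 4,
    "num_outputs": [2, 2, 11, 2, 2, 2, 3],
})

daily_total = txs.groupby("date").size()
daily_2otx = txs[txs["num_outputs"] == 2].groupby("date").size()

# Fraction of each day's transactions that have exactly 2 outputs (the 2OTX share).
share_2otx = (daily_2otx / daily_total).fillna(0.0)
print(share_2otx)
```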
Question 1(a): Source fingerprint — Number of outputs
We’ll apply the above logic to look at the first few possible numbers of outputs, below. An increase in 2-output transactions (top line) is quite clear.
Based on the increase or decrease in the prevalence at the start of the anomaly (red line in figures below), we see that N=2 matches the source, whereas N=3 and N=4 exhibit a strong antipattern for the source.
This trend continues for higher output counts. As noted above, we can conclude that having 2 outputs is a fingerprint of the source. This is typical of most transactions, which have a recipient output and a change output. Note that because 1-output transactions were a privacy issue, the protocol now requires at least two outputs for all transactions.
Question 1(b): Source fingerprint — fees
Since fees take many possible values, it’s easier to peek at a heatmap (below) and see that the increased transaction volume perfectly followed the existing trend of fee recommendations from the core wallet.
Question 1(c): Source fingerprint — unlock_time
Here again, we see that the source transactions match the fingerprint of the core wallet by setting the unlock_time field to 0.
Question 1(d): Source fingerprint — tx_extra
The “extra” field for each of the source transactions contained a transaction public key and an encrypted payment identifier (which matches how the reference wallet constructs transactions). Together, these have a combined length of 44 bytes, and we see that counts of transactions with this characteristic increase with the anomaly:
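For context on where the 44 bytes come from: a transaction public key occupies a one-byte tag plus a 32-byte key, and the encrypted payment ID is stored in an extra-nonce field (one-byte tag, one-byte length, one-byte nonce type, and an 8-byte ID). A quick sanity check of that arithmetic:

```python
# Rough accounting for the 44-byte tx_extra observed on the anomalous transactions.
tx_pubkey = 1 + 32                    # 0x01 tag + 32-byte transaction public key
encrypted_payment_id = 1 + 1 + 1 + 8  # 0x02 nonce tag + length + payment ID subtag + 8-byte ID

print(tx_pubkey + encrypted_payment_id)  # 44
```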
Update: An earlier version of this article explored whether the presence or absence of additional keys in tx_extra could leak information about whether a transaction recipient is a primary address or a subaddress. Upon review, Koe pointed out that this analysis only works for 3+ output transactions (in which case the absence of additional keys conclusively indicates that no subaddresses were involved).
Question 1(e): Source fingerprint — conclusions
Based on statistical correlations, it appears that the source’s transactions all shared the following characteristics:
- `0` in the `unlock_time` field
- A public key and an encrypted payment ID in the `tx_extra` field
- Fees constructed according to the core wallet implementation
- Exactly 2 outputs
All of these properties match the core wallet perfectly, so it seems likely that the anomaly transactions were generated by that reference wallet software. With this filter, we can look at what portion of all transactions matches the overall profile.
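Expressed as a filter over a per-transaction table, the profile might look something like the sketch below (the column names are placeholders, not the schema of the actual database):

```python
import pandas as pd

# Hypothetical per-transaction table with the fields needed for the profile
# (illustrative rows; 'fee_is_default' is a precomputed flag indicating that
# the fee follows the core wallet's fee schedule).
txs = pd.DataFrame({
    "date":           ["2021-07-22", "2021-07-22", "2021-07-22"],
    "num_outputs":    [2, 2, 3],
    "unlock_time":    [0, 0, 0],
    "tx_extra_len":   [44, 44, 76],
    "fee_is_default": [True, True, True],
})

matches_profile = (
    (txs["num_outputs"] == 2)
    & (txs["unlock_time"] == 0)
    & (txs["tx_extra_len"] == 44)
    & txs["fee_is_default"]
)

# Fraction of each day's transactions that match the overall profile.
daily_share = matches_profile.groupby(txs["date"]).mean()
print(daily_share)
```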
Question 2(a): Is the source one or more entities? Analyzing input counts
It’s natural to be curious about whether the anomalous transaction volume was being generated by a single party (like a university research group collecting data for a paper) or by many independent entities (like the organic growth of users when a new platform becomes popular). One way to explore this is to examine input consumption patterns (counts, timing, etc.).
To start, let’s look at the number of transaction inputs, which naturally varies depending on the amount being sent and the distribution of unspent output amounts in the wallet. If the anomaly were due to organic growth from unrelated individuals, the contents of their wallets would be uncorrelated, and we would expect no coordinated shift in input counts. However, what we see is the opposite: a remarkably sharp uptick in transactions with 1–2 inputs (the top two lines).
First we note that when the anomaly begins, there is a huge uptick in 1INTX and 2INTX with no appreciable corresponding spike in transactions with 3+ inputs. If you look at earlier portions of the timeseries, you’ll note that this is an unusual pattern not seen before the start of the anomaly. We can see this more easily if we normalize each trace to have zero mean (to remove the vertical offsets) and unit standard deviation (to put them on the same scale).
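The normalization itself is just a per-trace z-score. A minimal sketch, assuming a hypothetical table of daily counts with one column per input-count bucket:

```python
import pandas as pd

# Hypothetical daily counts by input count: one row per day, one column per bucket.
daily_by_inputs = pd.DataFrame(
    {"1in": [9000, 9200, 15000], "2in": [6000, 6100, 11000], "4in": [800, 790, 810]},
    index=pd.to_datetime(["2021-07-20", "2021-07-21", "2021-07-22"]),
)

# Z-score each trace: subtract its mean and divide by its standard deviation, so all
# traces share zero mean and unit variance and can be plotted on one axis.
normalized = (daily_by_inputs - daily_by_inputs.mean()) / daily_by_inputs.std()
print(normalized.round(2))
```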
Looking at the 2021 data above, there were roughly four different phases. During Phase A, when there is a spike or lull in transaction volume, it is reflected across all input counts. This is what we would expect during organic use. During Phase B, there are correlated spikes in 4+ input transactions that are not reflected in 1–2 input transaction counts. This is an oddity not explored further here. During Phase C, there’s not much action. Then, during the anomaly in Phase D, we see unprecedented behavior that is essentially the opposite of Phase B, namely a huge (4 standard deviation) uptick in 1–2 input transaction counts that is not reflected in 4+ input transaction counts.
To dig into this a little more closely, let’s pick a low count (say, 2 inputs) and one of the high counts (say, 4 inputs) and compare their absolute counts. Looking at Phases A–C, we would expect them to be relatively correlated, and in fact over the year-to-date data (one point per day) nearly the entire data set falls into a cone of normalcy, only once wandering severely out of the typical range.
As you probably guessed, the extreme deviation corresponds to the anomaly, annotated here as circled data points:
Placing the input count correlation plot next to the input count timeseries for Phase D, we can see two subphases, ‘a’ and ‘b’.
During the first portion (‘a’), the timeseries (left) shows a massive increase in 1–2 input transactions that does not come with the typical correlated increase in 4+ input transactions. This shows up in the correlation plot (right) as the horizontal base of the unprecedented deviation from the cone of normalcy. In the second portion (‘b’), as the anomaly progresses and reduces in volume, the 1–2 input transaction count starts to fall back down while the 4+ input transaction count increases by two standard deviations, eventually retracing back into the cone of normalcy.
Why examine this in such detail? Let’s again consider the scenario of organic growth: based on intuition and historical data, we expect spikes in transaction volume to be reflected across all input counts, remaining within the cone of normalcy. What did we see instead? A transaction spike that consisted almost entirely of 1–2 input transactions, with overall characteristics a significant distance outside the historical norm.
This is interesting because the core wallet will preferentially construct 1–2 input transactions before resorting to 3+ input transactions. Let’s do a thought experiment, and imagine that we have a single wallet with a vast number of available outputs to spend (mise en place). What happens if today we flip a switch and start generating a huge number of small-amount transactions from the wallet back to itself? For the first few days the wallet would have lots of flexibility in choosing existing outputs, and would generally be able to achieve its goal of creating 1–2 input transactions (with no appreciable production of 3+ input transactions). However, as time progresses, outputs are created and destroyed, and eventually the wallet becomes more constrained and must resort to creating 3+ input transactions more frequently than at the start. (Why more constrained? Because our standard wallet only produces 2-output transactions: self-‘recipient’ and ‘change’. So if we have 500 outputs available and construct a transaction that consumes 5 outputs and produces 2 outputs, we are left with only 497 outputs available. In fact, every transaction with 2 outputs and 3+ inputs reduces the wallet’s unspent output count, further constraining its subsequent input selection.)
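A few lines of toy bookkeeping (all values invented) illustrate the point: the unspent-output pool shrinks every time the wallet consumes 3+ inputs to make a 2-output transaction.

```python
# Toy bookkeeping for the thought experiment: a wallet churning to itself only
# ever creates 2 outputs per transaction, so any transaction consuming 3 or more
# inputs shrinks the pool of unspent outputs it can draw from later.
unspent = 500                           # hypothetical starting pool of spendable outputs

for num_inputs in [5, 2, 3, 2, 7]:      # made-up sequence of input counts
    unspent = unspent - num_inputs + 2  # consume the inputs, get back 2 outputs
    print(f"{num_inputs}-input tx -> {unspent} unspent outputs remain")
```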
Question 2(b): Is the source one or more entities? Analyzing spend time distributions
Let us continue to consider the above thought experiment. If we have a wallet with many outputs available, and then we flip on a switch so that it starts very rapidly sending transactions to itself, what would be the impact on the distribution of ring member ages? Initially, because new outputs haven’t been created yet, the wallet would be constructing transactions that use outputs from before the switch is flipped, so we would see no significant change in how often very young ring members are selected. But over time, the older outputs are consumed, and the wallet by necessity must construct transactions from young outputs created after the switch was flipped.
Returning from the thought experiment to actual on-chain data, let’s look at the age of every single ring member in every single transaction generated over the past few months, and visualize it as a heatmap.
There are a lot of stories to unpack from this plot, but for today let’s just focus on the facets related to the volume anomaly. We’ll zoom in to the lower right of the above plot to look at the circled feature:
Observations from the weeks leading up to the anomaly (left of the black dashed line) roughly match expectations for how frequently the wallet would select decoys from transactions that are less than 50 blocks old: sometimes, but not often. After the anomaly starts, there is initially no appreciable change in the rate at which outputs < 15 blocks old are included in transactions, presumably because older outputs were initially being consumed. Shortly thereafter, the observed trend shifts dramatically, with a disproportionate number of rings including a member from within the last 15 blocks. This is what we would expect to see when the set of older outputs becomes exhausted, forcing the wallet to begin building rings around outputs created by transactions during the anomaly.
This effect can also be seen by looking at the age of the youngest ring member, which under normal conditions is typically about 40 blocks old on average. When the anomaly begins, this shifts to be dramatically earlier, suggesting the frequent consumption of very recently created outputs.
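As a sketch of how this trace can be computed, assume a hypothetical table with one row per ring member, recording the height of the spending block and the height at which the referenced output was created (illustrative rows only):

```python
import pandas as pd

# Hypothetical ring-member table: one row per ring member, with the height of the
# block containing the spending transaction and the height at which the referenced
# output was created.
ring_members = pd.DataFrame({
    "tx_hash":       ["a", "a", "a", "b", "b", "b"],
    "spend_height":  [2400000, 2400000, 2400000, 2400100, 2400100, 2400100],
    "member_height": [2399980, 2350000, 2100000, 2400090, 2399000, 2200000],
})

ring_members["age_blocks"] = ring_members["spend_height"] - ring_members["member_height"]

# Age of the youngest ring member in each transaction; averaging this per day
# (or per height bucket) gives the trace discussed above.
youngest_age = ring_members.groupby("tx_hash")["age_blocks"].min()
print(youngest_age)
```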
Question 3: How many transactions were generated at what cost?
This section is VERY speculative and involves making a bunch of assumptions that could introduce significant error. Take these estimates with a grain of salt. First, let’s look at the anomaly and the immediately preceding baseline:
If we make an assumption that the baseline organic volume stayed the same, then we can remove the offset to estimate source volume:
Integrating under the above transaction count curve (from the red line to the right side) gives us an estimate for the number of transactions in excess of the norm: 365,000 transactions over the course of a few weeks. The anomalous transactions were paying standard fees, which at the time amounted to about 0.000015 XMR per transaction with this construction. So we can estimate the source’s daily costs:
Applying a cumulative sum over that window, we can estimate a total cost of 5 XMR:
At the time, the exchange rate for Monero was about 200 USD per XMR, so the cost of generating ~365,000 transactions would have been about $1,000. A 2-in/2-out transaction weighs about 1.93 kB, so these transactions would have added about 700 MB to the chain, at a cost of roughly $1.40 per MB.
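The arithmetic behind these figures, using the rounded values quoted above, is just:

```python
# Back-of-the-envelope check of the figures quoted above.
excess_txs = 365_000      # estimated anomalous transactions
fee_per_tx = 0.000015     # XMR per transaction with this construction, at the time
xmr_usd = 200             # approximate XMR/USD exchange rate at the time
tx_size_kb = 1.93         # approximate weight of a 2-in/2-out transaction

total_fees_xmr = excess_txs * fee_per_tx          # ~5.5 XMR, quoted as ~5 XMR above
total_fees_usd = 5 * xmr_usd                      # ~$1,000 using the rounded figure
chain_growth_mb = excess_txs * tx_size_kb / 1000  # ~700 MB added to the chain
usd_per_mb = total_fees_usd / chain_growth_mb     # ~$1.4 per MB

print(f"{total_fees_xmr:.1f} XMR, ~${total_fees_usd}, ~{chain_growth_mb:.0f} MB, ~${usd_per_mb:.2f}/MB")
```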
Conclusions
Let’s revisit our original questions based on what we’ve observed above.
Is the source one or multiple entities? All signs point towards a single entity. While transaction homogeneity is a strong clue, the input consumption patterns are more conclusive. In the case of organic growth due to independent entities, we would expect the typical semi-correlated trends across different input counts, and no correlation between independent users’ wallets. During the anomaly, we instead observed an extremely atypical spike in 1–2 input transactions with no appreciable increase in 4+ input transactions.
What are the software fingerprints and behavioral signatures of anomalous transactions? The anomalous transactions appear to have been generated by the core wallet, or one that matches its signature. The source used default settings for fees and unlock time, and only generated transactions with 2 outputs. They appeared to be generating transactions as fast as possible, resulting in frequent spending of outputs that were only 10–15 blocks old.
How many transactions did the source generate, and how much did that cost? A very rough estimate is 365,000 transactions, for a total cost of 5 XMR (worth about $1,000 at the time). A back-of-the-envelope calculation suggests that the anomaly contributed somewhere in the ballpark of 700 MB, at a cost of $1.40 per MB.
Future work
So far, we have mostly focused on aggregate statistics and counts, without even touching on ancestry analysis. Due to the volume and speed of the anomaly, it is likely that many of the rings can be deanonymized (i.e. we can identify which ring member was the true spend versus a decoy). We already have traction by knowing the anomaly fingerprint, which allows us to rule out ring members that don’t match the signature. Contextually, ring members from freshly converted coinbases can also be ruled out. The fact that transaction generation was so rapid (see section 2b) is also detrimental to the anomaly’s privacy. Since a vast number of transactions were generated with an extremely fast spend time (outputs typically being consumed 10–15 blocks after creation), it is likely that the guess-newest heuristic will be very effective.
Furthermore, a topological analysis of the transaction tree is likely to be very informative. We would assume that usually funds diffuse among entities and accounts. In typical use cases, the outputs of a transaction are sent to different recipients (e.g. 1 output to the coffee shop, 1 output back to the wallet as change). However, in the case of a core wallet repeatedly initiating transfers from itself to itself, both the “recipient” and the “change” output are sent to the same wallet, and may be combined later! Unless the anomaly exhibited extremely precise change control practices, it is likely that in many cases two outputs created by the same transaction would be consumed by the same or subsequent transactions. Recombinations along the graph are a topological signature that would occur very frequently in the case of a churning wallet, relative to chance occurrences from random decoy selection. The core wallet algorithmically attempts to spread out how inputs are sampled, to avoid obvious recombinations (like using two outputs from the same transaction as inputs to another transaction). So recombinations would not be apparent from looking at a blockchain explorer; however, a graph topology analysis engine could identify these loops even if the recombinations are spread out across several hops and several hours or days.
For a thought experiment to show why this must occur, imagine that you have a script like this:
# Run indefinitely until the wallet runs out of funds
while True:
    try:
        core_wallet.transfer(amount=1, recipient=self.address)
    except Exception:
        break  # stop once the wallet can no longer fund a transfer
The only way to run the wallet to empty is to spend all of the valued inputs created during the anomaly. Recalling the convergence property noted at the very end of section 2(a), we realize that the first 100,000 outputs produced during the anomaly are all upstream on the transaction tree from the final 100 transactions of the anomaly. The only way to merge 100,000 outputs down to 100 outputs is for combinations to occur along the way.
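As a very rough illustration of how such loop-closures could be detected, here is a toy sketch using networkx on an invented spend graph. The edge list is made up, and it assumes we already have heuristic guesses about which ring member is the true spend in each transaction (which is exactly what the fingerprinting above is meant to provide):

```python
import networkx as nx

# Hypothetical spend graph: an edge (A, B) means an output created by tx A is
# believed (heuristically) to be spent by tx B. Edges are illustrative only.
G = nx.DiGraph()
G.add_edges_from([
    ("t0", "t1"), ("t0", "t2"),   # t0's two outputs flow to t1 and t2...
    ("t1", "t3"), ("t2", "t3"),   # ...and re-merge a couple of hops later in t3
    ("t1", "t4"),
])

# A "recombination" is a transaction whose outputs split and later flow back into
# a single descendant: two distinct successors of some tx share a common descendant.
recombinations = []
for tx in G.nodes:
    succs = list(G.successors(tx))
    for i in range(len(succs)):
        for j in range(i + 1, len(succs)):
            shared = nx.descendants(G, succs[i]) & nx.descendants(G, succs[j])
            if shared:
                recombinations.append((tx, succs[i], succs[j], sorted(shared)))

print(recombinations)  # [('t0', 't1', 't2', ['t3'])]
```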
Combining all of these factors, it is likely that in many cases the true spends will be heuristically identifiable for the anomalous transactions. Remember the fruit stand analogy above: this is possible specifically because of the anomaly’s homogeneity and its sheer volume relative to baseline, and it is not as easily applicable to low-volume day-to-day users. One interesting extension would be to examine how often transactions matching the anomaly signature were used in rings for transactions that don’t match the anomaly signature, to put an upper bound on how many general transactions were potentially impacted by sampling decoys from the anomaly. Based on the estimated source volume from the section estimating counts and costs, even at the peak of excess volume we attribute less than half of the transactions to the anomaly (17,500 out of 40,000 on the highest day).
As an aside, a few months prior to the main anomaly studied in this writeup, there was a similar signature in the ring member age distribution (and other traces). While not conclusively linked, the phenomena look similar, and the earlier anomaly also occurred during a spike of anomalously high transaction volume.