From Trinity to Liquidity

Dr. Strangelove, or: How I Learned to Stop Worrying and Love the VWAP

Nov 20, 2022

Inaugurating this Substack with a big bang, and teaching you how basic algebra and a few simple ideas from physics can help you steal nuclear secrets from the US government, or maybe just eke out a few extra basis points on your fills, whichever one you’re into.

In the mists of yore, when I was a young and optimistic graduate student, we had to take a survey “Overview of Applied Mathematics” course, giving brief examples of techniques from a bunch of different subfields on a bunch of different topics. The very first lesson on day 1 was on “dimensional analysis”. While some of you may have flashbacks to high school stoichiometry problems, I promise this is at least marginally more interesting. The basic idea is to note that every measurable quantity in the universe can be expressed as some combination of seven fundamental units: length, mass, time, temperature, amount of substance, electric current, and luminous intensity. Everything else (energy, speed, pressure, magnetic flux density, whatever) is derived by combining those units in different ways. For example, energy is mass times length squared divided by time squared (since kinetic energy = 1/2*m*v^2), pressure is mass divided by the product of length and time squared (left as an exercise for the reader), etc.

Since equations need to, well, be equal, the units on both sides must match each other. If one side of your equation has watts times an unknown quantity, and the other side of your equation is in joules, then you know your unknown quantity must have a unit of seconds, since a joule is a kilogram meter squared per second squared, while a watt is a kilogram meter squared per second cubed. This is useful because it means, without any work beyond algebra, we can figure out very important things about a problem’s structure and the functional form of its solution. It’s the favorite hack of the lazy first-principles physicist, who wants to be able to say something meaningful about a system without actually thinking through what they’re doing, and annoyingly, it often turns out to be really accurate. Let’s start off with the same example I was taught: the Trinity nuclear detonation.

On July 16th, 1945, the Manhattan Project detonated the first ever atomic weapon, the Gadget, in a test codenamed Trinity in a remote New Mexico desert. This was an incredibly classified event, and while the world would later learn about the existence and power of nuclear weapons, much of that knowledge would remain secret (the exact mechanism of the Teller-Ulam thermonuclear design, for instance, is much speculated upon but not known in detail even to this day). However, a few years after the test, a famous photograph was published in Life magazine, showing the fireball shortly after detonation, and crucially including a distance scale and timestamp as well:

A black-and-white photograph of the upper hemisphere of a nuclear blast, with a timestamp and length scale — Image from Los Alamos National Laboratory

Upon seeing this photograph, G. I. Taylor, a British physicist, was able to work out the beyond-top-secret explosive yield of the bomb, with nothing more than dimensional analysis and some algebra. His reasoning was as follows:

The quantities relevant to the fireball were energy, time, distance and density, of which time and distance could be read straight off the photograph:
The energy (E) released by the explosion had the units of [kg][m]^2/[s]^2
The time (t) elapsed since detonation had the units of [s]
The radius (r) of the fireball had the units of [m]
The density (ρ) of the ambient air had the units of [kg]/[m]^3
The radius of the fireball was likely a function of the other three variables, r = f(E,t,ρ)

Since units must be multiplied or divided (you can have a meter/second, but not a meter+second), f had to be a product of its arguments to various exponents: r = E^a*t^b*ρ^c. Furthermore, in order for the units to match on both sides, the following conditions must be true:

a = -c (so kilograms vanish from the right hand side)
b = 2a (so seconds vanish from the right hand side)
1 = 2a - 3c (so we match the [m] on the left hand side)

This is a linear system of three equations in three variables, and it can be solved with basic techniques to yield a = 1/5, b = 2/5, c = -1/5, i.e. r = (E*t^2/ρ)^(1/5). Rearranging in terms of E, we now get E = ρ*r^5/t^2 [1]. Google tells us that air's density is about 1.2 kg/m^3, and from the picture it looks like the fireball is about 120 meters in radius at t = 0.016 seconds, so we can calculate:

E = (1.2)*(120^5)/(0.016)^2 = 1.17E14 J of energy = 28 kilotons of TNT

The actual, classified, Manhattan Project estimate of the energy of the detonation was 25 kilotons, so it’s impressive that a single photograph got us to within about 12% without knowing a thing about nuclear physics, weapons engineering, trans-uranic chemistry, or any of the other secret details of this test.
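For anyone who would rather let a computer do the algebra, here is a minimal Python sketch of the same steps: it solves the three unit-balance conditions exactly and then plugs in the numbers read off the photograph. The function names and the `UNITS` table are my own illustrative choices, the TNT conversion factor (4.184E12 J per kiloton) is the standard convention, and the dimensionless prefactor is assumed to be 1.

```python
from fractions import Fraction

# Exponents of (kg, m, s) carried by each input to r = E^a * t^b * rho^c.
UNITS = {
    "E":   (1, 2, -2),   # kg m^2 / s^2
    "t":   (0, 0, 1),    # s
    "rho": (1, -3, 0),   # kg / m^3
}
TARGET = (0, 1, 0)       # r is a length: m^1


def solve_exponents(units, target):
    """Solve the square unit-balance system exactly by Gauss-Jordan elimination."""
    names = list(units)
    n = len(target)
    # Row i balances base unit i; the last column is the target exponent.
    rows = [
        [Fraction(units[name][i]) for name in names] + [Fraction(target[i])]
        for i in range(n)
    ]
    for col in range(n):
        # Find a row with a nonzero entry in this column (fails if the system is singular).
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return {name: rows[i][-1] for i, name in enumerate(names)}


def blast_energy_joules(rho, r, t):
    """E = rho * r^5 / t^2, with the dimensionless constant assumed to be 1."""
    return rho * r**5 / t**2


JOULES_PER_KILOTON_TNT = 4.184e12  # standard TNT-equivalent convention

exponents = solve_exponents(UNITS, TARGET)
energy = blast_energy_joules(rho=1.2, r=120.0, t=0.016)
kilotons = energy / JOULES_PER_KILOTON_TNT
print(exponents)
print(f"{energy:.3e} J, about {kilotons:.1f} kt of TNT")
```

Using `Fraction` keeps the exponents as exact fifths rather than floating-point approximations, which makes it easy to see that the solution matches the hand calculation.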

Ok, ok, you got me, I didn’t actually write and send you a newsletter so I could wax on about my favorite subject, estimating the size of explosions from pictures. This is ostensibly a finance Substack, after all. What’s important in the above example is the procedure of dimensional analysis, not the subject matter:

Identify a problem of interest
Use domain knowledge to pick out a few variables you know affect the problem
Algebraically combine those variables in the “correct” way, so the dimensions foot
Check against other data to validate your result isn’t stupid and marvel at your brilliance (or, if it doesn’t pass the smell test, pick new variables and try again)

Let’s consider a problem in finance, rather than nuclear physics: trade impact. This is something that flies below the radar of many investors and analysts, but it’s very relevant in surprising ways. Consider a security (it could be anything: stock, bond, commodity future, whatever, as long as it's trading in a limit order book, but we'll assume it's a US-listed equity since that's what most people are familiar with) currently trading with a bid/offer of 10.00 x 10.02. If you’re just buying 1 lot/100 shares, you’ll pay 10.02 to buy it, since that’s what the offer is. If you’re moving size, though, that answer could be quite different. A 10,000 share purchase won’t be able to source sufficient liquidity from the top of the book: you might buy 500 shares at 10.02, 1,500 at 10.03, 3,000 at 10.04, and 5,000 at 10.05, for an average price of 10.0425. That 2.25 cent gap, between the 10.02 price and what you actually paid, is the "trade impact". In this example, it wasn’t huge (about 2x the size of the mid-offer spread), but especially for large trades, it can be dozens or even hundreds of times bigger than the spread [2]. An investor who thinks a position has 6% expected alpha, and pushes the price up by 1.5% on entry and down by 1.5% on exit, has just destroyed half of their edge. It may well be preferable to take a smaller edge on a liquid stock with minimal impact over a large edge on an illiquid one where significant portions of the edge are vaporized by blasting in and out of a position.
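The 10,000 share example above is just a walk up the offer side of the book, which is easy to sketch in Python. The `walk_the_book` helper is a hypothetical name of my own, and the book is the made-up one from the paragraph above, not real market data.

```python
def walk_the_book(levels, quantity):
    """Fill `quantity` shares against (price, size) offer levels, best price first.

    Returns the average fill price per share.
    """
    remaining = quantity
    cost = 0.0
    for price, size in levels:
        take = min(size, remaining)   # take what this level offers, up to what we still need
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough displayed liquidity to fill the order")
    return cost / quantity


# Offer side of the hypothetical book from the example, best offer first.
offers = [(10.02, 500), (10.03, 1500), (10.04, 3000), (10.05, 5000)]

avg_price = walk_the_book(offers, 10_000)
impact_vs_offer = avg_price - offers[0][0]   # gap versus the 10.02 top-of-book offer
print(f"average price {avg_price:.4f}, impact {impact_vs_offer * 100:.2f} cents")
```

A 100 share order against the same book fills entirely at 10.02, so the gap only opens up once the order is bigger than the top level.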

Of course, in order to evaluate that tradeoff, we need to be able to come up with a decent a priori estimate of impact. Let’s try and do that. First, let’s come up with a list of things we think might be relevant to our problem, like G. I. Taylor before us:

Impact (Δ): What we’re trying to measure: how much worse our fill is than an arbitrary reference point, like the mid-price in the market when we hit send. Units: [%].
Spread (s): This is the first and most obvious thing, since it impacts every trade, even ones too small to otherwise move the price. A 20 bps spread, like the example above, is obviously going to be better than something trading 10.00 x 10.20 or 10.00 x 12.00, even if we’re just trading 100 shares! Units: [%]
Order Quantity (Q): Again, this is obvious. A 100 share order will have less impact than a 5,000 share order, which will have less impact than a 100,000 share order. Units: [shares]
Average Volume (V): In general, experience suggests that something which normally trades 20 million shares a day will be easier to get in and out of than something that trades 20,000 shares a day. Units: [shares]/[t]
Volatility (σ): A security which ping-pongs back and forth all day, or trades at 10.00 one day and 17.00 the next and 8.00 the day after that, is probably going to be tougher to trade as well. Units: [%]/[t]^0.5. Remember, volatility scales like the square root of time, so something with a volatility of 16% annual has a volatility of about 1% a day (there are roughly 252 trading days in a year, and sqrt(252) ≈ 16).

On our left side, we have units of %, so our right side must also have units of %. Spread also has units of %, so we can feel free to add and subtract it rather than multiplying or dividing. Everything else, however, has mixed/other units, so our equation is of the form Δ = k1*s + k2*f(Q,V,σ). Using the same ansatz as above, f(Q,V,σ) = Q^a*V^b*σ^c. Busting out more algebra:

c = 1, since it’s the only place % appears
a = -b, since this gets rid of the [shares] unit
b = -c/2, since this gets rid of the [time] unit
k1 = 1, since as Q → 0 and we just buy a single lot, we only pay the spread once
k2 = unknown, so let’s call it K, an arbitrary constant. Like before, this is approximately 1.
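The exponent conditions above can also be checked mechanically, by tracking how many powers of %, shares and time each variable carries. This is a small sketch with names of my own choosing; it takes volatility to carry units of percent per square-root-time, which is what the condition b = -c/2 implies.

```python
from fractions import Fraction

# Exponents of (percent, shares, time) carried by each input.
DIMS = {
    "Q":     (Fraction(0), Fraction(1), Fraction(0)),      # shares
    "V":     (Fraction(0), Fraction(1), Fraction(-1)),     # shares / time
    "sigma": (Fraction(1), Fraction(0), Fraction(-1, 2)),  # percent / sqrt(time), assumed
}


def combined_dims(powers):
    """Dimensions of the product of each variable raised to its power."""
    total = [Fraction(0)] * 3
    for name, power in powers.items():
        for i, exponent in enumerate(DIMS[name]):
            total[i] += power * exponent
    return tuple(total)


# a = 1/2, b = -1/2, c = 1, i.e. sigma * sqrt(Q / V).
candidate = {"Q": Fraction(1, 2), "V": Fraction(-1, 2), "sigma": Fraction(1)}
result = combined_dims(candidate)
is_pure_percent = result == (1, 0, 0)
print(result, is_pure_percent)

# A wrong guess, sigma * Q / V, leaves a stray power of time behind.
wrong = combined_dims({"Q": Fraction(1), "V": Fraction(-1), "sigma": Fraction(1)})
print(wrong)
```

The first combination comes out as pure %, while the second still has time to the one-half power attached, so it cannot sit on the same side of an equation as the spread.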

Putting this together, we have:

Δ = s + K*σ*sqrt(Q/V)

This is well-known in the literature and widely used in industry (TCA <GO> on your Bloomberg terminal, for example), to the point where it’s simply referred to as the “square root model” for trade impact. K is a fudge factor, which is usually in the neighborhood of 1-3ish: closer to 1 for liquid US mega caps, closer to 3 for OTC Kazakhstani microcap bank warrants. This formula works remarkably well for a huge variety of assets in a huge variety of markets, with a very small number of inputs and no fancy stochastic calculus needed to derive it. In practice, it can often get you to within a percent or so of the actual impact experienced if you’re careful, which is “good enough” for plenty of purposes.
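As a rough sketch of how the square root model might be used in practice, here it is as a Python function. The function name, argument names and the sample inputs are all hypothetical, chosen for round numbers; the one thing to be careful about is that σ and V have to be quoted over the same time unit (daily volatility with average daily volume here), or the units won't cancel.

```python
import math


def sqrt_impact(spread, sigma_daily, quantity, adv, k=1.0):
    """Square root model: Delta = s + K * sigma * sqrt(Q / V).

    spread and sigma_daily are fractions (0.002 = 20 bps, 0.01 = 1%);
    quantity and adv (average daily volume) are in shares.
    """
    if adv <= 0:
        raise ValueError("average daily volume must be positive")
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    return spread + k * sigma_daily * math.sqrt(quantity / adv)


# Hypothetical inputs: 20 bps spread, 1% daily vol, 10,000 shares
# in a name that trades 1,000,000 shares a day.
liquid = sqrt_impact(spread=0.002, sigma_daily=0.01, quantity=10_000, adv=1_000_000, k=1.0)
illiquid = sqrt_impact(spread=0.002, sigma_daily=0.01, quantity=10_000, adv=1_000_000, k=3.0)
print(f"K=1: {liquid * 1e4:.0f} bps, K=3: {illiquid * 1e4:.0f} bps")
```

With these inputs the order is 1% of daily volume, so the square root term is a tenth of daily volatility: about 10 bps of impact on top of the spread with K = 1, and about 30 bps with K = 3.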

Interestingly enough, time does not appear anywhere in this formula. Whether we sent in a passive VWAP order over the entire day, or an aggressive one over 10 minutes, or blew out of a position with a single market order, the average impact would be the same; we would just experience it faster or slower. More refined models do incorporate participation rate/aggression, but it has a second-order effect compared to the above factors [3].

Helpfully, for those who dislike algebra, there’s a purely geometric way to think about this instead, and it offers us additional insight into the problem in a really neat way. Imagine, if you will, a better order book than what I can draw in MS Paint, which might look something like this:

A hypothetical order book. Mid price is P0, spread is s, bars represent the quantity of shares offered at a price P

Aggressive market sell orders will eat into the resting limit buys, driving the price down, and aggressive market buys will eat into the resting limit sells, driving it up. You may note that this looks awfully linear: in general, there is more liquidity further from the last price, and less closer to it. While this linear pattern doesn’t hold everywhere, it’s true close enough to the mid that we can make this approximation. Having done so, imagine we placed a trade to gobble up this book until we had our fill of Q shares. The total distance we went until we were filled would be Δ, our trade impact. [4]

Linearization of above order book. It doesn’t have to be a good least-squares fit, it just needs to roughly match the averages under integration. To quote a former professor, “you can toss any old shit in an integral and it usually works fine”.

Since the area of a triangle is 1/2*base*height, that means Q = 1/2*(Δ)*(λΔ), where λ is the slope of the linearized book (so λΔ is the quantity resting at a distance Δ from the mid), or Δ = sqrt(2Q/λ). This is the same formula we derived above (setting the spread term aside), with λ = 2V/(Kσ)^2. This gives an interesting spin on what volatility in markets “is”, from a microstructural perspective: a security’s variance is proportional to the ratio of the arrival rate of trades to the steepness of the order book. If a lot of trades arrive very quickly, or if the order book is shallow and there isn’t much resting liquidity, then a security will be volatile. By contrast, if a security trades rarely, or has a very steep book with lots of bids and offers and trading interest, then its price will generally not move as much, and it won’t be volatile.
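Here is a short Python sketch of the triangle picture, checking that the geometric answer agrees with K*σ*sqrt(Q/V) and numerically averaging the price paid across the fill. The function names and inputs are hypothetical, the spread is left out as in the geometry above, and the book is assumed to be exactly linear.

```python
import math


def book_slope(adv, sigma, k=1.0):
    """lambda = 2V / (K * sigma)^2, the slope of the linearized book."""
    return 2.0 * adv / (k * sigma) ** 2


def impact_from_book(quantity, slope):
    """Depth Delta where the triangle's area reaches Q: Q = 0.5 * Delta * (slope * Delta)."""
    return math.sqrt(2.0 * quantity / slope)


def average_fill_distance(quantity, slope, steps=10_000):
    """Midpoint-rule average of the distance from the mid paid per share."""
    delta = impact_from_book(quantity, slope)
    dx = delta / steps
    cost = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        cost += x * (slope * x) * dx   # slope*x*dx shares rest at depth x, each costing x
    return cost / quantity


# Hypothetical inputs: 1% daily vol, 1,000,000 shares/day, 10,000 share order, K = 1.
slope = book_slope(adv=1_000_000, sigma=0.01, k=1.0)
delta = impact_from_book(10_000, slope)
square_root_term = 1.0 * 0.01 * math.sqrt(10_000 / 1_000_000)
ratio = average_fill_distance(10_000, slope) / delta
print(f"delta = {delta:.6f}, K*sigma*sqrt(Q/V) = {square_root_term:.6f}, avg/peak = {ratio:.4f}")
```

The average distance paid comes out at about two-thirds of the peak move, since the early shares are bought closer to the mid than the last ones.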

Dimensional analysis is a tool, not a magic wand, but it’s certainly a very handy one to have, and it provides a useful framework for thinking about stock-flow comparisons in a more rigorous manner. Price to earnings and interest coverage are both generically termed “ratios”, but this is far from accurate. The former has units of time, more akin to a corporate bond’s duration than anything else, while the latter is a purely dimensionless ratio like a Reynolds number, suitable to be fed into all kinds of black-box functions spitting out arcane default predictions. Just keep your head about you and remember not to go too overboard ~~stealing~~ liberating tools from physicists, or the next thing you know you’ll be writing crank posts about how analyst ratings can be treated as spins on a mean-field Ising lattice and eating squid ink pasta.

Until next time,

Q. [5]

[1] Putting this here instead of going on a long textual digression: we have omitted a constant of proportionality k in front of this whole thing, which will be included later. I am doing another lazy physicist trick, which is to assume that all constants are approximately 1, and thus we can ignore them and still get an answer which is the correct order of magnitude whether k is 0.86 or 2.13 or whatever. The fact that we end up taking the 5th root of it means it gets smushed towards 1 anyways, so we don’t lose much fidelity over a wide range of magnitude with this assumption.

[2] For a real world example, at the time of writing the bid-offer spread on Robinhood stock, $HOOD, is about 10 bps. Sam Bankman-Fried, owner of a cool 56,273,469 shares, was reported in the financial press to be attempting to sell his stake for a 20% discount to the average price. This indicated that his trade impact was expected to be >400x the spread to mid.

[3] “More complicated models” invoked here are far, far more complicated mathematically speaking: they are usually integro-differential equations that are enormously difficult to solve. Simplifying immensely, many of these models imply that the “optimal” way to place a large trade is to do a chunk of it in a single large market order/block trade, VWAP the bulk of it with a low participation rate, and then finish up with a second large trade, with the relative size of the first block/VWAP/second block dependent on a variety of factors beyond the scope of this newsletter.

[4] Another boring, technical footnote: not really. The total amount the price is moved is what’s called the “temporary impact”. Since we are able to buy some stock for less than this price, our actual experienced impact, the “permanent impact”, is some fraction of that, usually around 2/3rds. I will again invoke the handy tool of footnote 1, “pretend all constants are 1”, and continue.

[5] Light editorial assistance provided by CVAR Newsletter. Please subscribe to Veb if you are not already doing so.

One notable point I forgot to add to the post: G. I. Taylor was not merely a random British physicist reading a magazine with the pictures. He not only worked on the Manhattan Project, but was one of the ~10 scientists actually present at the Trinity test itself. So while this is presented as him being clever and working things out from first principles, in fact he "knew" the answer already, and then figured out how to back into it from open sources to publicize what the yield was without breaking classification.

...I guess that does undermine the point somewhat, oh well.


Quantian’s Newsletter
