Introduction: A Garlic-Oriented Meeting
The first time I met Emanuel Derman was in the summer of 1996, at Uncle Nick’s on 48th Street and 9th Avenue. Stan Jonas paid, I remember (it is sometimes easier to remember who paid than the exact conversation). Derman and Dupire had come up with the local volatility model and I was burning to talk to Emanuel about it. I was writing Dynamic Hedging and was in the middle of an intense intellectual period (I only experienced the same intellectual intensity in 2005-2006 as I was writing The Black Swan). I was tortured by one aspect of the notion of the volatility surface. I could not explain it then. I will try now.
First, note the following. Local volatility does not mean what you expect volatility to be along a stochastic sample path that delivers a future price-time pair. It is not necessarily the mean square variation along a sample path. Nor is it the expected mean square variation along a sample path that allows you to break even on a dynamic hedge. It is the process that would provide a break-even P/L for a strategy.
The resulting subtlety will take more than one post to explain (or I may expand on it in Dynamic Hedging 2). But I will try to explain as much as I can right here.
The first problem is that options are not priced off a mean-square variation in the underlying, but off a mean variation in the underlying. I cover the point elsewhere –the use of L2 norm is not adequate. Skip this for now.
The second problem is that options have a vega variation ALONG THE PATH, so the P/L for a strategy is decomposed into P/L from variation in S (asset price) and P/L from changes in “volatility” (or expected mean deviation) –what the option is reflecting about future additional variations in the price of S between that point of evaluation and some terminal expiration. The P/L from changes in both implied and delivered volatility is path dependent. Severely path dependent.
Look at the graph. Take S to be an equity index. Assume that you start at S0, at time t0. At time t2 you are at S2. Fine. But you can get there in two ways. The first way, path 1, is through S1a. The second, path 2, is through S1b. Note that path 2 is more volatile than path 1 (in mean square or mean absolute deviations). Consider an option valued at time t0 and at time t2 (it expires sometime in the future, say at t3).
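To make the comparison concrete, here is a minimal sketch of the two routes described above. The price levels are hypothetical (the graph does not give numbers), and I assume path 2 overshoots to the downside before retracing to S2; the function name is mine.

```python
import math

def realized_deviations(path):
    """Return (root mean square, mean absolute) of the log returns along a path."""
    rets = [math.log(b / a) for a, b in zip(path, path[1:])]
    rms = math.sqrt(sum(r * r for r in rets) / len(rets))
    mad = sum(abs(r) for r in rets) / len(rets)
    return rms, mad

# Hypothetical levels: both paths start at S0 = 100 and end at S2 = 90.
path_1 = [100.0, 95.0, 90.0]  # through S1a: steady decline
path_2 = [100.0, 80.0, 90.0]  # through S1b: overshoot, then retracement

rms_1, mad_1 = realized_deviations(path_1)
rms_2, mad_2 = realized_deviations(path_2)

print(f"path 1: rms={rms_1:.4f}  mean abs={mad_1:.4f}")
print(f"path 2: rms={rms_2:.4f}  mean abs={mad_2:.4f}")
```

Both measures rank path 2 as the more volatile one, even though the two paths share the same start and end points.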
Now take the standard “implied volatility” (what we call implied volatility by inverting the commonly used Bachelier-Thorp model, what I formerly & mistakenly called “Black Scholes”, and which is still called so by those who have not spoken to Espen Haug). Paradoxically, path 2 will lead to a lower terminal implied volatility at time t2, although that sample path was more volatile. The “retracement” brought a lower volatility. It is not just implied volatility that is lower at S2 when reached by the 2nd path (S2b); the future variations in the underlying S are also expected to be dampened. You can check it empirically by taking volatility at new lows (say with S at the minimum of a three-month window) and comparing it to situations in which you recover from new lows.
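A rough sketch of how such an empirical check could be set up follows. Everything here is an assumption on my part: the 63-day window (about three months), the 5-day lookback that defines a "recovery", the 21-day horizon, and the use of subsequent mean absolute returns as a stand-in for "volatility" (an implied volatility series could be substituted). The price series is a synthetic random walk, used only so the code runs; it has no memory, so it is not expected to show the effect, which would have to be looked for in real index data.

```python
import math
import random

def forward_mean_abs_return(prices, i, horizon):
    """Mean absolute log return over the `horizon` steps following index i."""
    rets = [abs(math.log(prices[j + 1] / prices[j])) for j in range(i, i + horizon)]
    return sum(rets) / horizon

def average(values):
    """Plain average, NaN for an empty bucket."""
    return sum(values) / len(values) if values else float("nan")

def compare_new_lows_vs_recoveries(prices, window=63, lookback=5, horizon=21):
    """Bucket dates into 'at a new low' and 'recovering from a recent low',
    then average the subsequent mean absolute return within each bucket."""
    at_low, recovering = [], []
    for i in range(window, len(prices) - horizon):
        low = min(prices[i - window:i + 1])      # trailing minimum, today included
        if prices[i] == low:
            at_low.append(forward_mean_abs_return(prices, i, horizon))
        elif low in prices[i - lookback:i]:      # the low was set in the last few days
            recovering.append(forward_mean_abs_return(prices, i, horizon))
    return average(at_low), average(recovering), len(at_low), len(recovering)

# Synthetic placeholder data (NOT market data): driftless lognormal random walk.
random.seed(0)
prices = [100.0]
for _ in range(2000):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, 0.01)))

vol_low, vol_rec, n_low, n_rec = compare_new_lows_vs_recoveries(prices)
print(f"at new lows: {vol_low:.5f} ({n_low} dates)")
print(f"recovering:  {vol_rec:.5f} ({n_rec} dates)")
```

To use it on real data, replace `prices` with a list of daily index closes and compare the two averages.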
In my discussion of barrier options I talk about prices in areas with stops and a high density of barriers that have not been knocked out. This is far worse.
So we see a path dependence, a strong memory for the route taken by a price, etc. Mathematically, it means that you cannot easily work with a process –transition probabilities are not unconditional.
But the equation is fine and usable; it is the naive interpretation that is often wrong. I initially thought that if we had sigma0 at (S0, t0) and sigma2 at (S2, t2), the local volatility should be ON AVERAGE approximately (sigma0 + sigma2)/2. That is a very rough approximation. It will be lower, much lower for large deviations, higher for smaller ones.
Forward volatility is what it takes to break even in a strategy along the average of all routes followed by the underlying security; there is a stochastic element in it, and nonlinearities in option reactions to these variations. Because of this stochastic element, it will be HIGHER than the average over sample paths for out-of-the-money options, and lower for at-the-money ones. Why lower for the at-the-money ones? Because the collection of paths that end unchanged is far less troublesome than the ones that stray.
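One of these nonlinearities can be illustrated with a static sketch, which is a simplification and not the path-by-path argument above. Assume two equally likely volatility regimes (10% and 30%, hypothetical numbers), price a call in each with the standard lognormal formula (zero rates, no dividends, also my assumptions), average the two prices, and invert the average back into a single volatility. The strikes and function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(spot, strike, sigma, maturity):
    """Standard lognormal call formula, assuming zero rates and no dividends."""
    sd = sigma * math.sqrt(maturity)
    d1 = math.log(spot / strike) / sd + 0.5 * sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * norm_cdf(d2)

def implied_vol(price, spot, strike, maturity, lo=1e-4, hi=5.0):
    """Invert call_price by bisection (the price is increasing in sigma)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if call_price(spot, strike, mid, maturity) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def breakeven_vol(strike, sigmas=(0.10, 0.30), spot=100.0, maturity=1.0):
    """Implied volatility of the average price across equally likely sigmas."""
    avg_price = sum(call_price(spot, strike, s, maturity) for s in sigmas) / len(sigmas)
    return implied_vol(avg_price, spot, strike, maturity)

atm = breakeven_vol(strike=100.0)  # at the money
otm = breakeven_vol(strike=150.0)  # far out of the money

print(f"average sigma: 0.2000  ATM: {atm:.4f}  OTM: {otm:.4f}")
```

With these inputs the at-the-money figure comes out slightly below the 20% average and the out-of-the-money figure well above it, because the out-of-the-money price is convex in volatility while the at-the-money price is close to linear (slightly concave).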