Deep Hedging (深度对冲)

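Quoted from Zhihu (知乎):

As early as 2008, Hans Buehler, a quantitative strategy analyst at J.P. Morgan, began to consider a new way of hedging derivatives, one he believed would overturn most of modern quantitative finance theory. In his view, machine learning techniques can use what has worked in the past to determine which hedges are likely to work in the future. Banks could then hedge derivatives with a purely data-driven method, rather than with traditional models such as Black-Scholes or Heston.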

Hans Buehler (image source: Risk.net)

Quotes from Risk.net:

Statistical hedging was only a forerunner, though, to Buehler’s true objective: a machine that could learn how to form replicating portfolios for complex over-the-counter derivatives, for which payoffs might be path-dependent and data scarce.

Buehler dubbed it “deep hedging” – and he went on to describe such a machine in a 2019 paper of that title. The paper has garnered more than 10,000 views and downloads across different sites since publication. A LinkedIn post of a presentation he gave on the topic last year was viewed 33,000 times.

To apply deep hedging required one additional step, however.

The neural network demands far more data for training than exists. So, Buehler and his JP Morgan colleagues built a market generator to create simulated but realistic market data to train the model on.

Last year, Buehler detailed the firm’s work using signatures – a way of encoding time series data that captures how the data evolves through time – to build a data generator that replicates features of real markets, including correlations.

In another working paper also released last year, JP Morgan quants proposed a way to eliminate the ‘drift’ the machine would otherwise learn from observing a market that trends. Essentially, this is a way to stop the engine thinking markets that have drifted higher in the recent past always will. “You need to make sure the machine understands that’s an estimation error,” Buehler says.

Deep Hedging

Buehler provides two versions of the Deep Hedging paper:

  • A version using math finance notation: (Buehler, Gonon, Teichmann, and Wood n.d.)
  • A version using machine learning notation: (Buehler, Gonon, Teichmann, Wood, Mohan, et al. n.d.)

Machine Learning Notation

Modern quantitative finance has developed a rich toolkit for handling derivative pricing and risk management under the idealized “complete markets” assumption of perfect hedgability in the absence of any trading restrictions or cost. It has not yet succeeded in providing a scalable industrial approach under more realistic conditions which take into account such market frictions. As a consequence, the practical risk management of non-electronic over-the-counter derivatives is still to a large extent manual, driven by the trader’s intuitive understanding of the shortcomings of the existing derivative tools.

In this article, we take a first step towards a more integrated, realistic and robust approach to automated derivative risk management by applying modern deep reinforcement learning policy search. In the context of derivatives risk management, a policy means a hedging strategy. We propose to use neural networks to represent our hedging strategies.

The networks are trained on simulations of future states of the market, including all relevant hedging instruments. Designing a market simulator is not the focus of this paper, and we use toy simulators for the experiments presented here; an advantage of our approach is that the procedure for learning the optimal hedging policy is independent of the choice of market simulator. This is not the case for standard approaches to hedging derivatives.
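The claim that the learning procedure does not depend on the simulator can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the two toy simulators (`simulate_random_walk`, `simulate_gbm`), the short call option being hedged, the zero-cost P&L formula, and the fixed `half_hedge` function that stands in for a trained neural network. The point is only that `hedged_pnl` consumes simulated paths and a policy, and never looks inside the simulator.

```python
import numpy as np

def simulate_random_walk(n_paths, n_steps, s0=100.0, vol=1.0, seed=0):
    """Toy simulator (assumption): driftless arithmetic random walk."""
    rng = np.random.default_rng(seed)
    steps = vol * rng.standard_normal((n_paths, n_steps))
    start = np.zeros((n_paths, 1))
    return s0 + np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)

def simulate_gbm(n_paths, n_steps, s0=100.0, sigma=0.2, dt=1.0 / 252, seed=0):
    """Toy simulator (assumption): driftless geometric Brownian motion."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_paths, n_steps))
    log_ret = -0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * z
    start = np.zeros((n_paths, 1))
    return s0 * np.exp(np.concatenate([start, np.cumsum(log_ret, axis=1)], axis=1))

def hedged_pnl(paths, policy, strike):
    """Terminal P&L of a short call hedged by `policy`, ignoring costs.

    `paths` has shape (n_paths, n_steps + 1); `policy(t, prices)` returns
    the position held over [t, t+1] for every path.
    """
    n_steps = paths.shape[1] - 1
    pnl = -np.maximum(paths[:, -1] - strike, 0.0)  # payoff owed at maturity
    for t in range(n_steps):
        delta = policy(t, paths[:, t])
        pnl = pnl + delta * (paths[:, t + 1] - paths[:, t])  # trading gains
    return pnl

# Hypothetical stand-in for a trained network: always hold half a unit.
half_hedge = lambda t, prices: 0.5 * np.ones_like(prices)

# The evaluation code is identical for both simulators.
for simulate in (simulate_random_walk, simulate_gbm):
    paths = simulate(n_paths=1000, n_steps=30)
    pnl = hedged_pnl(paths, half_hedge, strike=100.0)
    print(simulate.__name__, "P&L std:", pnl.std())
```

In a full implementation the policy would be a parametrized network and its parameters would be adjusted to improve some risk-adjusted function of `pnl`; swapping the simulator would still leave that training loop unchanged.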

Assumptions about the hedging instruments:

  • it can be traded daily with sufficient liquidity (but not necessarily at zero cost)
  • our trading does not affect the price

Notation:

  • \(d\): the number of hedging instruments used in each time step.
  • \(\mathcal{S}_t\): the state space at time \(t\), that is, everything we observe about the market at that point. It may include:
    • all current and past prices
    • cost estimates
    • news
    • anything else we might deem necessary for determining our risk management strategies
    • past trading decisions
    • the internal state of our policy
  • \(s_t \in \mathcal{S}_t\): the state of the market at time \(t\).
  • \(h_t \equiv h_t(s_t)\): the \(d\)-dimensional vector of cash flows. Holding a hedging instrument will at some point trigger a cash flow, either positive (received) or negative (paid), and \(h_t(s_t)\) collects these payments for all \(d\) instruments.
  • \(H_t \equiv H_t(s_t)\): the vector of observable mid-market prices at time \(t\).
  • \(\delta_t\): our current position in our hedging instruments.
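To make the shapes in this notation concrete, here is a small sketch. The `MarketState` container, its fields, and the two helper functions are hypothetical names introduced only for illustration; a real state \(s_t\) would carry much more (price history, cost estimates, news, and so on). The two dot products are one natural reading of the definitions above (mid-market value of the position, and cash flow received from holding it), not formulas quoted from the paper.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class MarketState:
    """Hypothetical, heavily reduced stand-in for the state s_t."""
    t: int
    mid_prices: np.ndarray  # H_t(s_t), shape (d,)
    cash_flows: np.ndarray  # h_t(s_t), shape (d,)

def position_value(delta: np.ndarray, state: MarketState) -> float:
    """Mid-market value of the position delta_t: delta_t . H_t(s_t)."""
    return float(delta @ state.mid_prices)

def cash_received(delta: np.ndarray, state: MarketState) -> float:
    """Net cash flow from holding delta_t: delta_t . h_t(s_t)."""
    return float(delta @ state.cash_flows)

d = 3  # number of hedging instruments
s_t = MarketState(
    t=0,
    mid_prices=np.array([100.0, 5.0, 2.0]),
    cash_flows=np.array([0.0, 1.0, -0.5]),
)
delta_t = np.array([1.0, -2.0, 4.0])  # long 1, short 2, long 4

print(position_value(delta_t, s_t))  # 98.0
print(cash_received(delta_t, s_t))   # -4.0
```

A negative entry in `delta_t` is a short position, so a positive cash flow on that instrument is paid rather than received, which is why the second value comes out negative.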

TODO: TO BE CONTINUED…

References

Buehler, Hans, Lukas Gonon, Josef Teichmann, and Ben Wood. n.d. “Deep Hedging.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network. Accessed August 4, 2025.
Buehler, Hans, Lukas Gonon, Josef Teichmann, Ben Wood, Baranidharan Mohan, and Jonathan Kochems. n.d. “Deep Hedging: Hedging Derivatives Under Generic Market Frictions Using Reinforcement Learning.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network. Accessed August 4, 2025.