Treasury market research design memo

Measuring Treasury-Market Instability Using Public Data

A study design for building transparent public-data instability measures, testing them against known market episodes, and using them to evaluate whether similar repo and liquidity behavior may reflect economically different states.

Executive Summary

This page describes a research design. This initial research aims to construct a defensible set of public-data Treasury-market instability measures that can be used in place of a proprietary benchmark when cost or transparency constraints make that benchmark impractical.

The project will test whether similar increases in repo or liquidity positioning may correspond to two different economic states: one where dealers appear able to intermediate instability under stable funding conditions, and another where similar surface-level activity appears alongside funding strain and impaired intermediation.

The emphasis here is on measurement design, regime interpretation, and falsification. The measures are proposed inputs into that process. They are not, by themselves, proof of monetization, proof of discrete state identification, or proof that public-data proxies are perfect substitutes for the Merrill Lynch Option Volatility Estimate (MOVE).

Positioning

What this page is doing

This project proposes a measurement framework for Treasury-market instability under public-data constraints.

MOVE note

Why bring up MOVE?

  • MOVE remains the main implied-volatility reference point for the Treasury market.
  • MOVE is expensive to access.
  • The project does not need to claim one-for-one equivalence with MOVE, but it does need a workable way around it so I can continue my larger research.
  • The initial design question is therefore whether transparent public-data proxies can support meaningful testing of the broader framework when MOVE is unavailable or impractical.

Public-data rationale

Issue: The Media
Repo panic: It seems that every time Treasury repo usage increases, a new article appears saying "everybody panic."
Proposed project stance: This is unnecessary. I am confident we can find a way to show more clearly when there are reasons for concern and when it is simply noise.

Issue: Illustration
Repo panic: In Fall 2025 repo usage spiked and the media published numerous articles warning of systemic risk and market stress (Reuters; Bloomberg; Daily Economy). During this same window, U.S. financial institutions were reporting record trading-desk profits.
Proposed project stance: Clearly the media missed something. As I explained on TikTok, my larger theory points toward periods of "stress" and "harvest" in repo usage.

Issue: Reading the Room
Repo panic: When the economy is doing well and repo usage spikes, banks may simply be deploying excess liquidity in trading or other opportunities and benefiting from the ambiguous nature of the signal.
Proposed project stance: We should be able to use public data sources to gain insight into such events, and perhaps into the build-up to them, without relying on insider "black box" information. This research is the first step toward testing that larger theory.

1. Measurement Design

Proposed Treasury-Market Instability Measures

The project will build three distinct public-data measures. Each is intended to capture a different observable dimension of Treasury-market state pressure. Together they are meant to help interpretation and falsification, not to serve as a complete theoretical contribution by themselves.

Measure 1

Realized Volatility of Key Yields

A direct public-data proxy for Treasury-market instability built from daily yield changes in key maturities such as 2Y, 5Y, and 10Y.

Dimension captured: near-term rate-path instability

Measure 2

Yield-Curve Slope Instability

A measure of volatility in curve relationships, especially spreads like 10Y-2Y, intended to capture growth and policy outlook instability.

Dimension captured: growth / policy outlook

Measure 3

Return-Based Instability Index

A return-distribution measure adapted from the Brenner-Izhakian style logic, used operationally to see whether it adds regime-separation value beyond simple volatility.

Dimension captured: distributional instability

Measure 1 formula

$$RV_t = \sqrt{252} \times \sqrt{\frac{1}{N-1} \sum_{i=0}^{N-1} \left(r_{t-i} - \bar{r}\right)^2}$$

The baseline proposal is a 20-day rolling window, with 10-day and 30-day windows reserved for sensitivity checks.
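
As a minimal sketch of the Measure 1 formula, the function below computes an annualized rolling sample standard deviation. It assumes that r is the daily change in a single yield series and that r̄ is the mean change within the window; the function name `realized_vol`, the basis-point units, and the sample yields are illustrative assumptions, not part of the design.

```python
import math
from statistics import stdev


def realized_vol(changes, window=20, periods_per_year=252):
    """Annualized rolling sample standard deviation of daily changes.

    Returns a list aligned with `changes`; entries stay None until a full
    window of `window` observations is available.
    """
    out = [None] * len(changes)
    for t in range(window - 1, len(changes)):
        sample = changes[t - window + 1 : t + 1]
        # statistics.stdev divides by N - 1, matching the Measure 1 formula.
        out[t] = math.sqrt(periods_per_year) * stdev(sample)
    return out


# Hypothetical 10Y yields in percent (illustrative values, not real data).
yields_10y = [4.00, 4.02, 3.99, 4.05, 4.01, 4.08]
# Daily changes expressed in basis points (an assumed unit choice).
changes_bp = [(b - a) * 100 for a, b in zip(yields_10y, yields_10y[1:])]

# A 3-day window keeps the toy example short; the memo's baseline is 20 days.
rv = realized_vol(changes_bp, window=3)
```

Calling the same function with `window=10` and `window=30` would produce the series needed for the sensitivity checks.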

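Measure 2 steps

$$S_t = y_{10Y,t} - y_{2Y,t}$$
$$\Delta S_t = S_t - S_{t-1}$$
$$RV_{Slope,t} = \sqrt{252} \times \sqrt{\frac{1}{N-1} \sum_{i=0}^{N-1} \left(\Delta S_{t-i} - \overline{\Delta S}\right)^2}$$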
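
The three Measure 2 steps can be sketched as follows, assuming two aligned daily yield series in percent. The function name `slope_realized_vol` and the sample yields are hypothetical and chosen only to contrast a parallel shift with a curve twist.

```python
import math
from statistics import stdev


def slope_realized_vol(y10, y2, window=20, periods_per_year=252):
    """Return (S_t, delta S_t, RV_Slope,t) for two aligned yield series."""
    slope = [ten - two for ten, two in zip(y10, y2)]        # S_t
    d_slope = [b - a for a, b in zip(slope, slope[1:])]     # delta S_t
    rv = [None] * len(d_slope)
    for t in range(window - 1, len(d_slope)):
        sample = d_slope[t - window + 1 : t + 1]
        # Sample standard deviation (N - 1 denominator), then annualized.
        rv[t] = math.sqrt(periods_per_year) * stdev(sample)
    return slope, d_slope, rv


# Hypothetical yields in percent (illustrative values, not real data).
y2 = [4.00, 4.10, 4.20, 4.30]
y10_parallel = [4.50, 4.60, 4.70, 4.80]  # moves one-for-one with the 2Y
y10_twisting = [4.50, 4.70, 4.60, 4.90]  # curve steepens and flattens

_, _, rv_parallel = slope_realized_vol(y10_parallel, y2, window=3)
_, _, rv_twisting = slope_realized_vol(y10_twisting, y2, window=3)
```

In this toy case the parallel shift leaves slope volatility at essentially zero even though yields are moving, which is the kind of divergence from Measure 1 that the design wants to keep visible.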

Measure 3 idea

The return-based index will estimate the time-varying probability of downside outcomes and then measure instability in those probabilities over a longer rolling horizon. In this project, that measure is exploratory unless it clearly improves interpretation or regime separation.
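
One deliberately simplified way to read that description is sketched below. It is not the Brenner-Izhakian estimator: it assumes the downside probability can be approximated by the rolling share of returns below a threshold, and that instability can be approximated by the rolling standard deviation of that share. The function names, the 20-day and 60-day defaults, and the sample returns are all hypothetical.

```python
from statistics import stdev


def downside_probability(returns, window=20, threshold=0.0):
    """Rolling share of returns below `threshold` (empirical downside probability)."""
    out = [None] * len(returns)
    for t in range(window - 1, len(returns)):
        sample = returns[t - window + 1 : t + 1]
        out[t] = sum(r < threshold for r in sample) / window
    return out


def probability_instability(probs, horizon=60):
    """Rolling sample standard deviation of the downside probabilities."""
    out = [None] * len(probs)
    for t in range(horizon - 1, len(probs)):
        sample = probs[t - horizon + 1 : t + 1]
        # Skip windows that still contain warm-up gaps from the first step.
        if None not in sample:
            out[t] = stdev(sample)
    return out


# Hypothetical daily returns in percent (illustrative values, not real data).
returns = [0.1, -0.2, 0.3, -0.1, -0.4, 0.2, -0.3, 0.1]

# Short windows keep the toy example readable.
probs = downside_probability(returns, window=4)
instability = probability_instability(probs, horizon=3)
```

If a proxy this simple adds nothing beyond Measures 1 and 2, that would be consistent with keeping Measure 3 exploratory.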

Interpretive rule

The project should treat these as state variables and regime inputs. A spike in one of them is not supposed to settle the entire theory on its own.

Terminology boundary

Where the project still uses the word ambiguity, it should be read as shorthand for experienced Treasury-market instability, uncertainty, or dislocation, not as a strong claim to a strict Knightian definition.

2. Aggregation and Scaling

Independent First, Composite Second

The primary strategy will be to analyze the three measures independently before considering any composite summary. That keeps diagnostic clarity and avoids forcing different signals into a single black box too early.

Primary approach

  • Use each measure on its own first
  • Look for divergent signals instead of suppressing them
  • Ask whether similar repo activity appears under different combinations of instability and funding conditions

Secondary exploratory step

PCA may be used later to see whether a shared latent factor exists, but only after the individual measures have been examined. If the measures diverge in economically meaningful ways, the independent series should retain priority over any composite index.
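
A rough sketch of that exploratory step follows. It assumes the three measures have already been aligned on common dates, and it uses power iteration on the correlation matrix as a stand-in for a full PCA routine. The function names and the sample readings are hypothetical.

```python
import math
from statistics import mean, stdev


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length lists of observations."""
    standardized = []
    for col in columns:
        mu, sd = mean(col), stdev(col)
        standardized.append([(x - mu) / sd for x in col])
    n = len(columns[0])
    return [
        [sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in standardized]
        for zi in standardized
    ]


def first_component(corr, iterations=500):
    """Leading eigenvector and its explained-variance share via power iteration."""
    k = len(corr)
    # Uneven starting vector so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(k)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    # The trace of a correlation matrix is k, so the share is eigenvalue / k.
    return v, eigenvalue / k


# Hypothetical aligned readings for the three measures (not real data).
rv_level = [0.2, 0.5, 1.8, 2.4, 0.9, 0.3]
rv_slope = [0.1, 0.6, 1.5, 2.0, 1.1, 0.2]
return_index = [0.4, 0.3, 0.9, 1.7, 1.2, 0.6]

loadings, share = first_component(
    correlation_matrix([rv_level, rv_slope, return_index])
)
```

A low `share` would suggest there is no dominant shared factor, which supports keeping the independent series as the primary objects of analysis.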

Normalization

$$Z_{i,t} = \frac{X_{i,t} - \mu_i}{\sigma_i}$$

Z-score standardization will be the primary scaling choice because it makes the measures comparable across time and across construction families. Min-max scaling may still be useful for certain visuals, but not as the core analytical standard.
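
A minimal sketch of the standardization step is shown below. It assumes that the mean and standard deviation are taken over the full available sample for each measure, which the memo does not specify; an expanding or rolling estimate would be needed to avoid look-ahead in any real-time use. The function name `zscore` is hypothetical.

```python
from statistics import mean, stdev


def zscore(series):
    """Standardize a series, leaving warm-up gaps (None) untouched."""
    values = [x for x in series if x is not None]
    # Full-sample mean and sample standard deviation (an assumed choice).
    mu, sigma = mean(values), stdev(values)
    return [None if x is None else (x - mu) / sigma for x in series]


# Hypothetical measure readings with one warm-up gap (not real data).
z = zscore([None, 1.0, 2.0, 3.0])
```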

Key tradeoff the page should admit plainly

These measures are mostly backward-looking. That matters. The correct claim is not that they magically become forward-looking implied volatility, but that they may still be useful, transparent inputs for studying Treasury-market state pressure under realistic public-data constraints.

3. Validation Framework

Known Episodes, External Indicators, and Sensitivity Checks

The proposed measures should not be accepted on style alone. They need to be tested against event windows, compared with outside indicators, and checked for sensitivity to rolling-window choices and maturity selection.

September 2019 repo disruption

This is a central validation case because it sits near the interpretive problem itself. The analysis should not assume the answer in advance. It should ask whether the behavior of the measures is more consistent with acute funding strain, monetizable dislocation, or a transition between the two.

March 2020 COVID-19 crisis

This should function as the strongest stress benchmark in the design. The expectation is that all three measures would register elevated readings if they are capturing meaningful Treasury-market instability.

March 2023 banking-sector stress

This episode should help test whether the measures can register a more nuanced policy and banking shock without simply collapsing all disruption into the same story.

October 2025 high repo activity / high bank profits

This episode directly motivates the research question. The point is not to prejudge whether it was stress or harvest. The point is to test whether the proposed measurement framework can help separate those interpretations.
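
The event-window comparison across these four episodes could start from something like the sketch below. It assumes each window is the full calendar month named in the episode heading and that a useful first statistic is the ratio of the event-month mean to the full-sample median. Both choices, along with the function name and the sample readings, are placeholders and not settled design decisions.

```python
from statistics import mean, median

# Calendar-month windows taken from the episode headings (an assumed choice).
EVENT_MONTHS = {
    "September 2019 repo disruption": "2019-09",
    "March 2020 COVID-19 crisis": "2020-03",
    "March 2023 banking-sector stress": "2023-03",
    "October 2025 high repo activity / high bank profits": "2025-10",
}


def event_window_table(series, event_months=EVENT_MONTHS):
    """Ratio of each event-month mean to the full-sample median.

    `series` maps ISO date strings ('YYYY-MM-DD') to measure values.
    Events with no observations in the sample map to None.
    """
    baseline = median(series.values())
    table = {}
    for name, month in event_months.items():
        window = [v for d, v in series.items() if d.startswith(month)]
        table[name] = mean(window) / baseline if window else None
    return table


# Hypothetical readings of one instability measure (not real data).
sample = {
    "2019-08-30": 1.0,
    "2019-09-16": 1.0,
    "2019-09-17": 5.0,
    "2020-03-12": 8.0,
    "2021-01-04": 2.0,
}
table = event_window_table(sample)
```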

Cross-correlation checks

The project should compare the proposed measures with outside indicators such as:

  • VIX, for broader market uncertainty context
  • SOFR-TGCR spread, for funding-stress context
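
A cross-correlation check of this kind might be sketched as follows, assuming the proposed measure and the outside indicator are already aligned on the same dates with no gaps. The function names and the sample series are hypothetical; loading VIX or the SOFR-TGCR spread is not shown.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def lagged_correlation(measure, indicator, max_lag=5):
    """Correlation of measure[t] with indicator[t + lag] for each lag.

    A positive lag means the measure leads the indicator.
    """
    n = len(measure)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = measure[: n - lag], indicator[lag:]
        else:
            x, y = measure[-lag:], indicator[: n + lag]
        out[lag] = pearson(x, y)
    return out


# Hypothetical series: the indicator repeats the measure one period later.
measure = [0, 1, 0, 2, 0, 3, 0, 1, 0, 2]
indicator = [0] + measure[:-1]
by_lag = lagged_correlation(measure, indicator, max_lag=2)
```

The same function could serve the later lead-lag question about whether repo or liquidity repositioning tends to precede instability spikes.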

Sensitivity tests

  • 10, 20, and 30 trading-day windows
  • Additional maturities where useful
  • Alternative curve spreads and return constructions

4. Robustness Testing

What the broader APL study would need to test

The eventual purpose of this measurement work is to support a broader test of whether Treasury-market instability interacts with funding conditions in economically different ways. That broader question remains to be tested; this memo only describes how the measurement layer would feed into it.

State 1

Harvest

Proposed shorthand for periods in which elevated instability appears alongside relatively stable funding conditions and the instability seems absorbable or monetizable through dealer intermediation.

State 2

Stress

Proposed shorthand for periods in which similar surface-level instability appears alongside funding strain, impaired intermediation, or deteriorating dealer outcomes.

Important boundary

These labels are part of the hypothesis architecture. They should be presented as classifications the broader research is trying to evaluate, not as already proven categories.
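
To make the hypothesis concrete enough to test, the provisional labels could be assigned mechanically, as in the sketch below. It assumes standardized instability and funding-stress readings and uses cutoffs of 1.0 that are purely hypothetical. The output is a candidate classification to be evaluated, not an identified state.

```python
def label_state(instability_z, funding_z, instability_cut=1.0, funding_cut=1.0):
    """Provisional label from two standardized readings.

    The cutoffs are placeholder assumptions, not estimated thresholds.
    """
    if instability_z < instability_cut:
        return "calm"
    # Elevated instability: split on whether funding stress is also elevated.
    return "stress-like" if funding_z >= funding_cut else "harvest-like"


# Hypothetical (instability, funding stress) pairs, not real data.
labels = [label_state(i, f) for i, f in [(0.2, 2.0), (1.5, 0.1), (1.5, 1.4)]]
```

Re-running the labeling under alternative cutoffs and alternative instability definitions is one way to check whether the classifications survive.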

What should be tested

  • Whether repo or liquidity repositioning tends to precede instability spikes
  • Whether instability relates differently to outcomes under low versus high funding stress
  • Whether similar classifications survive across alternative instability definitions

What would count against it

  • No lead-lag ordering
  • No change in sign or interaction under funding stress
  • No meaningful difference across calm, harvest-like, and stress-like conditions
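
One simple way to probe the second falsification criterion is a split-sample comparison, sketched below. It assumes a median split on a funding-stress indicator and a one-variable least-squares slope in each half. The outcome variable, the function names, and the sample values are hypothetical, and a full interaction regression would be the more complete test.

```python
from statistics import mean, median


def ols_slope(x, y):
    """Least-squares slope of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )


def slopes_by_funding_regime(instability, outcome, funding_stress):
    """Slope of outcome on instability under low versus high funding stress."""
    cutoff = median(funding_stress)  # median split is an assumed choice
    rows = list(zip(instability, outcome, funding_stress))
    low = [(i, o) for i, o, f in rows if f <= cutoff]
    high = [(i, o) for i, o, f in rows if f > cutoff]
    return {"low": ols_slope(*zip(*low)), "high": ols_slope(*zip(*high))}


# Hypothetical data in which the relationship flips sign under funding stress.
instability = [1, 2, 3, 4, 1, 2, 3, 4]
funding = [0, 0, 0, 0, 1, 1, 1, 1]
outcome = [2, 4, 6, 8, -1, -2, -3, -4]
slopes = slopes_by_funding_regime(instability, outcome, funding)
```

Slopes that are statistically indistinguishable across the two halves would count against the framework.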

5. Deliverables

Expected Outputs From This Design

Because this page is about the research design stage, the outputs below are framed as planned deliverables rather than completed findings.

Weekly series

Friday-aligned time series for each proposed instability family, including documentation for formulas, windows, and source fields.
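
Friday alignment could be implemented along the lines below. The sketch assumes that each week is represented by its last available daily observation and that the weekly row is stamped with that week's Friday even when Friday itself is a market holiday. The function name and the sample values are hypothetical.

```python
from datetime import date, timedelta


def friday_aligned(daily):
    """Collapse a {date: value} daily series to one row per Friday.

    Each Friday holds the last observation available in the week ending
    on that Friday.
    """
    weekly = {}
    for day in sorted(daily):
        # weekday(): Monday=0 ... Friday=4; roll forward to the week's Friday.
        friday = day + timedelta(days=(4 - day.weekday()) % 7)
        weekly[friday] = daily[day]  # later days overwrite earlier ones
    return weekly


# Hypothetical readings; 2020-04-10 was Good Friday, so Thursday is the last print.
daily = {
    date(2020, 3, 9): 1.0,   # Monday
    date(2020, 3, 13): 2.0,  # Friday
    date(2020, 4, 9): 3.0,   # Thursday
}
weekly = friday_aligned(daily)
```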

Validation tables

Event-window comparisons, cross-correlation summaries, and sensitivity-test outputs showing where the measures do and do not appear useful.

Robustness package

Comparisons across alternative instability definitions, with clear separation between stronger support, provisional support, and failures.

Reuse and transparency

The design is meant to stay transparent and auditable. The intended outputs are machine-readable series, documented methods, and plainly stated evidentiary boundaries so later readers can tell what was actually supported and what remained open.