UIBE Energy Economics: Crude Oil Signal Extraction

Energy Econ NLP · 2023

Crude Signal tests whether WSJ text carried crude oil signal that futures markets had not yet priced. An LDA topic model on 200K+ articles, built for the UIBE Beijing macro team.

STATUS: Research, UIBE Beijing macro team, 2023

STACK: Python · scikit-learn · LDA

SCALE: 200K+ WSJ articles, 1998-2023

RESULT: Forecast error -18%; out-of-sample Sharpe 0.07 to 0.45 (placebo: ~0)

Research conducted for the UIBE Beijing macro team in summer 2023.

Problem

The team hypothesized that WSJ topic mixtures, rather than specific keywords, led crude oil moves. They wanted a defensible test before betting capital on it.

How it works

An LDA topic model is trained on 200K+ WSJ articles, and the resulting topic mixtures are used to predict next-week crude returns. The model is evaluated against a placebo arm trained on randomly permuted headlines, to verify that any apparent signal is real and not an artifact of the model fitting noise.
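The topic-mixture step can be sketched as follows. This is a minimal illustration, assuming scikit-learn's CountVectorizer and LatentDirichletAllocation; the six toy headlines, the choice of two topics, and the helper name topic_mixtures are assumptions made for the example, not the team's corpus or settings.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-in for the WSJ corpus (hypothetical headlines, not real data).
articles = [
    "opec cuts crude output as oil prices slide",
    "refinery outage tightens gasoline supply",
    "fed raises interest rates to fight inflation",
    "central bank signals more rate increases",
    "crude inventories fall as refinery demand rises",
    "inflation data pushes bond yields higher",
]


def topic_mixtures(texts, n_topics=2, seed=0):
    """Return one row of topic proportions per document.

    n_topics and seed are illustrative; the write-up does not state
    the values the team used.
    """
    # Bag-of-words counts are the standard input to LDA.
    counts = CountVectorizer(stop_words="english").fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    # Each output row is a document's distribution over topics.
    return lda.fit_transform(counts)


mixtures = topic_mixtures(articles)
```

Each row of mixtures is the kind of feature vector that a downstream regression on next-week returns would consume.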

Schematic. Topic 7 prevalence in WSJ leads crude oil price moves by two trading days; Granger p < 0.05 across the 1998-2023 corpus. The published Sharpe improvement (0.07 to 0.45) is measured on a held-out window against the team's keyword baseline; the placebo arm, trained on permuted headlines, collapses Sharpe to near zero.
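A lead-lag relationship like the one in the schematic can be checked informally with a lagged correlation. The sketch below is a simpler diagnostic than the Granger test cited in the caption, not a substitute for it; the series are synthetic and built so that returns echo the topic share two steps later, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(signal, returns, lag):
    """Correlate signal[t] with returns[t + lag]."""
    if lag == 0:
        return pearson(signal, returns)
    return pearson(signal[:-lag], returns[lag:])


# Synthetic data: returns are a linear function of the topic share
# observed two steps earlier (an assumption made for illustration).
topic_share = [0.10, 0.30, 0.20, 0.50, 0.40, 0.10, 0.60, 0.20, 0.30, 0.50]
crude_returns = [0.0, 0.0] + [0.1 * (s - 0.3) for s in topic_share[:-2]]

# The lag with the highest correlation is the candidate lead time.
best_lag = max(range(5), key=lambda k: lagged_correlation(topic_share, crude_returns, k))
```

On real data the peak would be far less clean, which is why a formal test and a placebo arm are still needed.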

Result

Forecast error cut 18% against the keyword baseline. Out-of-sample Sharpe rose from 0.07 to 0.45 (placebo arm: ~0). The team deployed the methodology into a commodity portfolio that returned 11% over the test period and kept using it after the engagement closed.
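The two headline metrics can be computed along these lines. This is a sketch under stated assumptions: the write-up does not say which error metric was used or whether Sharpe was annualized, so RMSE, weekly periods, and a zero risk-free rate are all assumptions here, and the sample numbers are made up.

```python
import math
import statistics


def rmse(forecasts, actuals):
    """Root mean squared forecast error (assumed metric)."""
    squared = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared) / len(squared))


def error_reduction(baseline_error, model_error):
    """Fractional cut in error relative to the baseline."""
    return (baseline_error - model_error) / baseline_error


def sharpe_ratio(period_returns, periods_per_year=52):
    """Annualized Sharpe of per-period excess returns.

    Assumes the returns are already net of the risk-free rate.
    """
    mean = statistics.mean(period_returns)
    stdev = statistics.stdev(period_returns)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)


# Made-up numbers for illustration only.
weekly_returns = [0.010, -0.005, 0.020, 0.000, 0.015]
reduction = error_reduction(baseline_error=1.00, model_error=0.82)
sharpe = sharpe_ratio(weekly_returns)
```

With a baseline error of 1.00 and a model error of 0.82, error_reduction gives the 18% cut in the form reported above.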

Methodology

The evaluation was structured around the team's actual decision: whether to bet on the signal. The corpus was split into a training window, a validation window for hyperparameter selection, and a held-out window for the published Sharpe number. A placebo test, which trained the same pipeline on randomly permuted headlines, verified that Sharpe collapsed to near zero on randomized inputs. That placebo test was the one that gave the team confidence to deploy capital.
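The split and the placebo arm described above can be sketched with the standard library alone. The row layout (week index, headline, next-week return), the 60/20/20 proportions, and the function names are assumptions for illustration; the write-up does not give the actual window boundaries.

```python
import random


def chronological_split(rows, train_frac=0.6, val_frac=0.2):
    """Split time-ordered rows into train, validation, and held-out windows.

    No shuffling: later windows never leak into earlier ones.
    """
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


def permute_headlines(rows, seed=0):
    """Placebo arm: shuffle headlines across dates, leave returns in place.

    This breaks any real link between text and returns while keeping
    the vocabulary and the return series unchanged.
    """
    headlines = [headline for _, headline, _ in rows]
    random.Random(seed).shuffle(headlines)
    return [(week, h, ret) for (week, _, ret), h in zip(rows, headlines)]


# Hypothetical rows: (week index, headline, next-week return).
rows = [(week, f"headline {week}", 0.01 * (week % 3 - 1)) for week in range(10)]

train, validation, held_out = chronological_split(rows)
placebo_rows = permute_headlines(rows)
```

Running the same pipeline on placebo_rows should yield a Sharpe near zero; if it does not, the pipeline is fitting noise.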

Stack and languages

Python, with scikit-learn for the LDA topic model on the WSJ corpus. The forecasting pipeline (topic mixtures predicting next-week crude returns) was packaged to run daily, with a manifest of which topics were live and a notebook the team could rerun on new corpora.
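One way such a manifest could look is sketched below. The JSON layout, the field names, and every topic index other than Topic 7 are hypothetical; the actual manifest format is not described in this write-up.

```python
import json

# Hypothetical manifest: which topic indices feed the live forecast.
MANIFEST_JSON = """
{
  "model_version": "hypothetical-v1",
  "live_topics": [3, 7, 12]
}
"""


def live_features(topic_mixture, manifest):
    """Keep only the topic shares that the manifest marks as live."""
    return {topic: topic_mixture[topic] for topic in manifest["live_topics"]}


manifest = json.loads(MANIFEST_JSON)

# Placeholder mixture for a 15-topic model (values are made up).
mixture = [i / 100 for i in range(15)]
features = live_features(mixture, manifest)
```

Keeping the live-topic list in a data file rather than in code lets the team retire or add a topic without touching the pipeline.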

Problem solved

The most valuable artifact in a forward-deployed engagement is rarely the system. It is the methodology document. The system will be replaced as soon as the underlying technology shifts. The methodology, the explicit record of why certain choices were made and what conditions would justify revisiting them, persists across systems.