UIBE Energy Economics: Crude Oil Signal Extraction
Crude Signal tests whether WSJ text carried crude oil signal that futures markets had not yet priced. An LDA topic model on 200K+ articles, built for the UIBE Beijing macro team.
Crude Signal tests whether WSJ text carried crude oil signal that futures markets had not yet priced. Research for the UIBE Beijing macro team, summer 2023.
The team had a hypothesis that WSJ topic mixtures (not specific keywords) led crude oil moves. They wanted a defensible test before betting capital on it.
An LDA topic model trained on 200K+ WSJ articles. Topic mixtures predict next-week crude returns. The model is evaluated against a placebo arm trained on randomly permuted headlines, to verify any apparent signal is real and not an artifact of the model fitting noise.
Forecast error cut 18% against the keyword baseline. Out-of-sample Sharpe rose from 0.07 to 0.45 (placebo arm: ~0). The team deployed the methodology into a commodity portfolio that returned 11% over the test period and kept using it after the engagement closed.
The eval was structured around the team's actual decision: whether to bet on the signal. The corpus was split into a training window, a validation window for hyperparameter selection, and a held-out window for the published Sharpe number. A placebo test, training the same pipeline on randomly permuted headlines, verified that Sharpe collapsed to near zero on randomized inputs. That second test was the one that gave the team confidence to deploy capital.
Python with scikit-learn for the LDA topic model on the WSJ corpus. The forecasting pipeline (topic mixtures predicting next-week crude returns) was packaged into a daily-runnable form with a manifest of which topics were live and a notebook the team could rerun on new corpora.
The most valuable artifact in a forward-deployed engagement is rarely the system. It is the methodology document. The system will be replaced as soon as the underlying technology shifts. The methodology, the explicit record of why certain choices were made and what conditions would justify revisiting them, persists across systems.