Paper ID: 2108.00480

Realised Volatility Forecasting: Machine Learning via Financial Word Embedding

Eghbal Rahimikia, Stefan Zohren, Ser-Huang Poon

This study develops a financial word embedding trained on 15 years of business news. Evaluated on a financial benchmark we established, this specialised language model is more accurate than general-purpose word embeddings. As an application, we incorporate the embedding into a simple machine learning model that enhances the heterogeneous autoregressive (HAR) model for forecasting realised volatility. This approach outperforms established econometric models both statistically and economically. Using an explainable AI method, we also identify key phrases in business news that contribute significantly to volatility, offering insights into language patterns tied to market dynamics.
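
For readers unfamiliar with the HAR baseline mentioned in the abstract, the following is a minimal sketch of a standard HAR regression, which predicts next-day realised volatility from its daily value and its weekly and monthly averages. It is not the authors' implementation and does not include the news-based component, which the abstract does not specify. The 5-day and 22-day windows, the function names (`har_row`, `fit_har`, `forecast_next`), and the synthetic series are assumptions made for illustration only.

```python
import random
from statistics import fmean


def har_row(rv, t):
    """Regressors at day t: intercept, daily RV, 5-day mean, 22-day mean."""
    return [1.0, rv[t], fmean(rv[t - 4 : t + 1]), fmean(rv[t - 21 : t + 1])]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_har(rv):
    """Ordinary least squares fit of RV[t+1] on the HAR regressors at day t."""
    days = range(21, len(rv) - 1)  # 22 days of history, one day ahead as target
    rows = [har_row(rv, t) for t in days]
    ys = [rv[t + 1] for t in days]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X'X) beta = X'y


def forecast_next(rv, beta):
    """One-step-ahead forecast from the most recent 22 observations."""
    x = har_row(rv, len(rv) - 1)
    return sum(b * v for b, v in zip(beta, x))


# Synthetic, persistent positive series standing in for realised volatility
# (hypothetical data, not from the paper).
rng = random.Random(0)
rv = [1.0]
for _ in range(299):
    rv.append(max(1e-4, 0.2 + 0.8 * rv[-1] + rng.gauss(0.0, 0.1)))

beta = fit_har(rv)
print("HAR coefficients:", beta)
print("Next-day forecast:", forecast_next(rv, beta))
```

The abstract describes the news-based model as enhancing this kind of baseline; how the two are combined is left to the full paper and is not shown here.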

Submitted: Aug 1, 2021