Foundation Models for Numerical Tasks

1. Language models

By now, we all know that large language models (LLMs) are very capable in qualitative and language-based tasks. The jury is still out, however, concerning their reasoning and numerical skills.

Researchers at the University of Chicago’s Booth School of Business (my alma mater) used Financial Statement Analysis (FSA) to test LLMs’ ability to analyze and synthesize purely financial numbers (paper here). The task was to predict whether earnings would grow or decline in the following period (various timeframes were tested). The LLM (GPT-4 Turbo) was not given any textual information, just numbers, as shown in Fig. 1.

Figure 1: One-shot prompting: the quantitative input data for the prompt (image from the paper).

After being told to assume the role of a financial analyst, the LLM was guided towards its answers with Chain-of-Thought (CoT) techniques. It was asked to:

  1. Identify notable changes in the financial statements.
  2. Compute financial ratios by first stating the formulae, then computing the values.
  3. Provide economic interpretations of the computed ratios.
  4. Predict the directional change of future earnings and provide the rationale for that prediction.

Figure 2: The LLM’s answer (image from the paper).
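The four CoT steps above can be sketched as a prompt template. A minimal illustration in Python; the wording and the `balance_sheet`/`income_statement` placeholders are my assumptions, not the paper's verbatim prompt:

```python
# Sketch of a Chain-of-Thought prompt for financial statement analysis.
# The exact wording is illustrative; the paper's actual prompt may differ.

COT_TEMPLATE = """You are a financial analyst.

Below are a standardized, anonymized balance sheet and income statement.

Balance sheet:
{balance_sheet}

Income statement:
{income_statement}

Step 1: Identify notable changes in the financial statements.
Step 2: Compute key financial ratios. State each formula first,
        then compute its value.
Step 3: Provide an economic interpretation of each computed ratio.
Step 4: Predict whether earnings will increase or decrease in the
        next period, and give the rationale for your prediction."""


def build_prompt(balance_sheet: str, income_statement: str) -> str:
    """Fill the CoT template with the (numbers-only) statements."""
    return COT_TEMPLATE.format(
        balance_sheet=balance_sheet,
        income_statement=income_statement,
    )
```

The resulting string would be sent as a single user message; note that no textual context about the company is included, mirroring the paper's numbers-only setup.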

The authors found that the LLM, with CoT, easily outperformed the median financial analyst. Even though the LLM was only given quantitative material, it benefited from its general ‘understanding’ of the world, including business and investment know-how, combined with an emerging form of intuitive reasoning and a capacity to formulate hypotheses. Moreover, human financial analysts are prone to statistical biases, in all likelihood more so than LLMs in this specific, quantitative use case.

The authors also trained a three-layer artificial neural network (ANN) on a vast body of data. This task-specific ANN only just matched the general-purpose LLM’s accuracy. A remarkable result, considering that the LLM was an off-the-shelf, general-purpose model used without any further fine-tuning.
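For intuition, a three-layer ANN baseline of the kind the authors describe can be sketched as below. The layer sizes, feature count, and synthetic data are my assumptions for illustration, not the paper's specification:

```python
# Hypothetical sketch of a three-layer ANN that predicts the direction of
# next-period earnings from financial ratios. Layer sizes and the toy data
# are assumed, not taken from the paper.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Toy stand-in for the training data: rows of financial ratios,
# label = 1 if earnings grew in the following period, else 0.
X = rng.normal(size=(1000, 10))  # e.g. 10 ratio features per firm-year
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32, 16),  # three hidden layers
                  max_iter=500, random_state=0),
)
model.fit(X[:800], y[:800])
accuracy = model.score(X[800:], y[800:])  # out-of-sample directional accuracy
```

Unlike the LLM, such a model has to be trained from scratch on the task and encodes no world knowledge beyond what the ratio features carry.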

Overall, FSA is an interesting use case demonstrating the numerical skills and emerging reasoning capabilities of general-purpose LLMs. I’d like to see the results of this study if the LLM were fine-tuned on the same data fed into the ANN…

2. Specialized foundation models

Above, I showed research demonstrating how a language model, basically pre-trained to perform next-word prediction, was capable of accomplishing numerical tasks and some related reasoning.

Recently, a new breed of specialized foundation models has emerged. TimeGPT is one such model, specialized in time series: it is pre-trained on over 100 billion rows of financial, weather, Internet of Things (IoT), energy, and web data.

In their latest paper, my LIRIS colleagues tested TimeGPT for predicting soil water potential in orchards. As data gathering in agriculture is expensive, the relative shortage of data often precludes data-hungry deep learning methods such as LSTMs.

Figure 3: TimeGPT architecture (image from the paper). Notice how a CNN replaces the feed-forward layers of the original GPT architecture.

They find that, with minor fine-tuning on the target variable’s (soil water potential) history alone, TimeGPT delivers respectable results, losing out only to the state-of-the-art Temporal Fusion Transformer (TFT) model. Note that the TFT also had access to exogenous variables such as weather data. Given its superior ease of use in terms of effort and data, TimeGPT is therefore a serious alternative for use cases plagued by data scarcity: specialized foundation models like it can transfer their learned forecasting skills to new problems where training data are too scarce for deep learning methods that must be trained from scratch.
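As a sketch, the setup the authors describe maps onto Nixtla's TimeGPT client roughly as follows. The sensor values are synthetic and the series name is made up; the API call itself needs a Nixtla API key, so it is shown commented out for illustration only:

```python
# Sketch: preparing soil-water-potential history for TimeGPT fine-tuning.
# Synthetic data; a real run requires a Nixtla API key.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.date_range("2023-04-01", periods=200, freq="D")

# Long format expected by the TimeGPT client: one row per (series, timestamp).
df = pd.DataFrame({
    "unique_id": "orchard_sensor_1",  # hypothetical soil sensor id
    "ds": dates,                      # timestamp column
    "y": -40 + 10 * np.sin(np.arange(200) / 14) + rng.normal(0, 2, 200),  # kPa
})

# Illustrative call (not executed here): light fine-tuning on the target
# variable's own history only, as in the study.
# from nixtla import NixtlaClient
# client = NixtlaClient(api_key="...")
# forecast = client.forecast(df=df, h=14, finetune_steps=20,
#                            time_col="ds", target_col="y")
```

The appeal is that only the target series itself is needed; exogenous drivers such as weather, which the TFT baseline relied on, are optional.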

Figure 4: Conventional versus foundation models (image from the paper).

