Interpretable Clustering with Sparse Autoencoders
Amazon Knowledge Sharing Session
Presented my ACL paper on interpretable clustering with sparse autoencoders, aimed at making industrial AI applications more interpretable.
My work has been presented at conferences around the world, and I have been interviewed by organizations such as Google DeepMind.
December 6, 2025
Risk management is a crucial task for both individual investors and financial institutions that seek to identify and quantify the risks they are exposed to. We introduce Risko1, an 8B-parameter financial reasoning model trained with Group Relative Policy Optimization (GRPO) on both textual context and financial information. It identifies specific risks to which companies are exposed and quantifies their impact in terms of standard risk metrics: Value at Risk (VaR), Conditional Value at Risk (CVaR), and Volatility. These metrics must strictly satisfy fundamental constraints such as CVaRα > VaRα and monotonicity across confidence levels. Performance is slightly above the much larger Llama 3.3 70B in accuracy, and roughly on par in Mean Squared Error (MSE). Beyond quantitative ability, we analyze the quality of the risk scenarios generated. Regulators require institutions to establish controls to mitigate risk exposure. This is done using a risk taxonomy that classifies risks across tiers based on granularity, with 1 being a broad category (for instance, operational risks), and 4 being the most granular (a specific event). Controls are enacted at the appropriate level of granularity. We explore the distribution of tiers of generated risks, and find that they are coherent with the given context (mainly market and operational) and granular (mostly tier 3 and 4), and hence amenable to mitigation, as controls may be assigned effectively.
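The consistency constraints mentioned above can be checked mechanically. The sketch below estimates VaR and CVaR by historical simulation and verifies that CVaR exceeds VaR at each confidence level and that VaR is monotone across levels; the function name and the simulated returns are illustrative, not taken from the paper.

```python
import numpy as np

def var_cvar(returns, alpha):
    """Historical Value at Risk and Conditional Value at Risk at
    confidence level alpha, expressed as positive loss numbers."""
    losses = -np.asarray(returns)
    var = np.quantile(losses, alpha)            # loss exceeded with prob. 1 - alpha
    cvar = losses[losses >= var].mean()         # mean loss in the tail beyond VaR
    return var, cvar

# Simulated daily returns as a stand-in for real data
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.02, size=10_000)

metrics = {a: var_cvar(returns, a) for a in (0.95, 0.99)}

# The constraints a model's risk quantifications must satisfy:
for a, (var, cvar) in metrics.items():
    assert cvar > var                           # CVaR_alpha > VaR_alpha
assert metrics[0.99][0] > metrics[0.95][0]      # VaR monotone in confidence level
```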
December 10, 2024
Determining company similarity is a vital task in finance, underpinning hedging, risk management, portfolio diversification, and more. Practitioners often rely on sector and industry classifications to gauge similarity, such as SIC-codes and GICS-codes - the former being used by the U.S. Securities and Exchange Commission (SEC), and the latter widely used by the investment community. Since these classifications can lack granularity and often need to be updated, using clusters of embeddings of company descriptions has been proposed as a potential alternative, but the lack of interpretability in token embeddings poses a significant barrier to adoption in high-stakes contexts. Sparse Autoencoders (SAEs) have shown promise in enhancing the interpretability of Large Language Models (LLMs) by decomposing LLM activations into interpretable features. We apply SAEs to company descriptions, obtaining meaningful clusters of equities in the process. We benchmark SAE features against SIC-codes, Major Group codes, and Embeddings. Our results demonstrate that SAE features not only replicate but often surpass sector classifications and embeddings in capturing fundamental company characteristics. This is evidenced by their superior performance in correlating monthly returns - a proxy for similarity - and generating higher Sharpe ratio co-integration strategies, which underscores deeper fundamental similarities among companies.
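A minimal sketch of the evaluation idea: cluster companies on their feature vectors, then score clusters by the mean pairwise correlation of monthly returns within each cluster (the similarity proxy above). All arrays here are random stand-ins; in the paper the rows would be SAE feature activations of company descriptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_firms, d_feat, n_months = 100, 32, 24

features = rng.random((n_firms, d_feat))             # stand-in SAE feature activations
returns = rng.normal(0, 0.05, (n_firms, n_months))   # stand-in monthly returns

labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)

def intra_cluster_correlation(returns, labels):
    """Mean pairwise correlation of monthly returns within clusters."""
    corrs = []
    for c in np.unique(labels):
        group = returns[labels == c]
        if len(group) < 2:
            continue
        cm = np.corrcoef(group)                      # firm-by-firm correlation matrix
        iu = np.triu_indices_from(cm, k=1)           # unique off-diagonal pairs
        corrs.extend(cm[iu])
    return float(np.mean(corrs))

score = intra_cluster_correlation(returns, labels)   # higher = more similar clusters
```

With random data the score hovers near zero; clustering on features that capture fundamentals should push it up.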
September 28, 2024
Sparse Autoencoders (SAEs) have recently been employed as an unsupervised approach for understanding the inner workings of Large Language Models (LLMs). They reconstruct the model’s activations with a sparse linear combination of interpretable features. However, training SAEs is computationally intensive, especially as models grow in size and complexity. To address this challenge, we propose a novel training strategy that reduces the number of trained SAEs from one per layer to one for a given group of contiguous layers. Our experimental results on Pythia 160M highlight a 6x speedup without compromising the reconstruction quality and performance on downstream tasks. Therefore, layer clustering presents an efficient approach to train SAEs in modern LLMs.
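The core operation an SAE performs can be written in a few lines. The sketch below shows the forward pass only: a ReLU encoder produces sparse nonnegative features, and a linear decoder reconstructs the activation as a combination of feature directions. Dimensions and weights are illustrative; training with a reconstruction-plus-sparsity loss is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_feat = 64, 256                    # hypothetical activation / feature widths

# Hypothetical SAE parameters (normally learned, here random)
W_enc = rng.normal(0, 0.1, (d_model, d_feat))
W_dec = rng.normal(0, 0.1, (d_feat, d_model))
b_enc = np.zeros(d_feat)
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation into sparse features, then reconstruct it."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU -> sparse, nonnegative features
    x_hat = f @ W_dec + b_dec                # sparse linear combination of features
    return f, x_hat

x = rng.normal(size=d_model)                 # stand-in LLM activation
f, x_hat = sae_forward(x)
sparsity = (f == 0).mean()                   # fraction of inactive features
```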
August 20, 2024
Sparse AutoEncoders (SAEs) have gained popularity as a tool for enhancing the interpretability of Large Language Models (LLMs). However, training SAEs can be computationally intensive, especially as model complexity grows. In this study, the potential of transfer learning to accelerate SAEs training is explored by capitalizing on the shared representations found across adjacent layers of LLMs. Our experimental results demonstrate that fine-tuning SAEs using pre-trained models from nearby layers not only maintains but often improves the quality of learned representations, while significantly accelerating convergence. These findings indicate that the strategic reuse of pretrained SAEs is a promising approach, particularly in settings where computational resources are constrained.
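The reuse strategy amounts to a warm start: instead of random initialization, the SAE for layer k+1 begins from the layer-k weights and is then fine-tuned as usual. A minimal sketch, with a hypothetical dictionary-of-weights parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_sae(d_model=64, d_feat=256):
    """Randomly initialized SAE parameters (hypothetical layout)."""
    return {
        "W_enc": rng.normal(0, 0.1, (d_model, d_feat)),
        "W_dec": rng.normal(0, 0.1, (d_feat, d_model)),
        "b_enc": np.zeros(d_feat),
        "b_dec": np.zeros(d_model),
    }

def warm_start(pretrained):
    """Initialize a new layer's SAE from a neighboring layer's weights.
    Adjacent layers share much of their representation, so fine-tuning
    starts closer to a good solution and converges faster."""
    return {k: v.copy() for k, v in pretrained.items()}

sae_layer_5 = random_sae()              # stand-in for an SAE pretrained on layer 5
sae_layer_6 = warm_start(sae_layer_5)   # fine-tune this copy on layer 6 activations
```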
Meta Llama User Day
The Good Scientist was invited to present at the Meta Llama User Day for our platform's innovative use of Llama 3.
2023 Advanced Computing User Day
Presented how we use the SURF supercomputer to accelerate our work in the field of AI.