Integrating managerial and investor textual data for financial distress prediction: A framework combining multi-source financial information fusion network with LLM
Journal article   Peer reviewed

Jing Qiu, Shaoze Cui, Weiguo Fan and Ruize Gao
Information processing & management, Vol.63(6), 104750
09/2026
DOI: 10.1016/j.ipm.2026.104750

Abstract

• We simultaneously integrate financial indicators, investor comments, and MD&A data to enrich the information available to the FDP model, thereby offering incremental prediction signals that enhance its robustness and generalizability.
• We propose an LLM-based aspect-sentiment triplet analysis method for MD&A data that captures aspects, opinions, and sentiments, which enables fine-grained and interpretable feature extraction.
• We propose a multi-source financial information attention-based fusion approach to integrate multimodal data, enabling adaptive adjustment of modality weights and thereby improving FDP performance.

While prior financial distress prediction (FDP) studies have increasingly incorporated multi-source data, existing approaches rarely integrate financial indicators with fine-grained narrative signals in a unified framework. To bridge this gap, we propose a novel FDP framework that simultaneously integrates financial ratios, Management Discussion and Analysis (MD&A), and investor comments from social media. First, we propose an LLM-BERT based triplet extraction approach to capture aspect-level semantic and sentiment information from MD&A texts. Second, we utilize FinBERT to extract sentiment features from investor comments. These textual features are then combined with financial ratios and integrated into a Multi-source Financial Information Fusion Network (MFIFN), trained with focal loss to mitigate class imbalance. Using a dataset of 24,429 firm-year samples from Chinese listed companies between 2014 and 2023 (including both distressed and non-distressed firms), experimental results demonstrate that incorporating social media and MD&A features provides incremental predictive value on top of financial ratios. In particular, the proposed MFIFN model achieves an AUC of 0.9541. Furthermore, the LLM-BERT based triplet extraction method improves feature quality, delivering consistent performance gains compared with traditional textual feature extraction methods. These findings suggest that integrating financial indicators with fine-grained narrative signals can enhance early warning systems, providing stakeholders with more timely and comprehensive risk assessment tools.
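
The abstract notes that MFIFN is trained with focal loss to mitigate class imbalance. As a rough illustration of why that helps, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), in plain Python. The function names and the values gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration (they are commonly used defaults); the record does not state the settings the authors used, and this is not their implementation.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one sample; p is the predicted P(distressed)."""
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample.

    gamma and alpha are illustrative defaults, not values from the paper.
    gamma down-weights well-classified samples; alpha balances the classes.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A distressed firm (y = 1) scored confidently vs. scored poorly.
    for p in (0.9, 0.1):
        print(f"p={p}: CE={cross_entropy(p, 1):.4f}  FL={focal_loss(p, 1):.4f}")
```

With gamma = 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy. With gamma > 0, confidently correct samples (typically the abundant non-distressed firms) contribute far less to the total loss than hard, misclassified ones.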
Keywords: Financial distress prediction; Large language model; Management discussion and analysis; Multi-source financial information fusion network; Social media comments
