A multitude of interconnected risk events---ranging from regulatory changes to geopolitical tensions---can trigger ripple effects across firms. Identifying inter-firm risk relations is thus crucial for applications such as portfolio management and investment strategy. Traditionally, such assessments rely on expert judgment and manual analysis, which are subjective, labor-intensive, and difficult to scale. To address this, we propose a systematic method for extracting inter-firm risk relations using Form 10-K filings---authoritative, standardized financial documents---as our data source. Leveraging recent advances in natural language processing, our approach captures implicit and abstract risk connections through unsupervised fine-tuning based on chronological and lexical patterns in the filings. This enables the development of a domain-specific financial encoder with a deeper contextual understanding and introduces a quantitative risk relation score for transparent, interpretable analysis. Extensive experiments demonstrate that our method outperforms strong baselines across multiple evaluation settings.
In this paper, we examine the existence of the Rényi divergence between two time-invariant hidden Markov models with arbitrary positive initial distributions. By making use of a Markov chain representation of the probability distribution for the hidden Markov model and the eigenvalue of the associated Markovian operator, we obtain, under some regularity conditions, convergence of the Rényi divergence. Using this device, we also characterize the Rényi divergence and obtain the Kullback–Leibler divergence as a limit of the Rényi divergence. Several examples, including classical finite-state hidden Markov models, Markov switching models, and recurrent neural networks, are given for illustration. Moreover, we develop a non-Monte Carlo method that computes the Rényi divergence of two-state Markov switching models via the underlying invariant probability measure, which is characterized by the Fredholm integral equation.
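For reference, the standard order-\(\alpha\) Rényi divergence between distributions \(P\) and \(Q\) with densities \(p\) and \(q\) (the abstract's hidden-Markov-model setting generalizes this definition) is

```latex
D_{\alpha}(P \,\|\, Q)
  = \frac{1}{\alpha-1}\,
    \log \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, \mathrm{d}x ,
  \qquad \alpha \in (0,1)\cup(1,\infty),
```

with the Kullback–Leibler divergence recovered in the limit \(\lim_{\alpha \to 1} D_{\alpha}(P \,\|\, Q) = D_{\mathrm{KL}}(P \,\|\, Q)\).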
The prevalence of hearing aids is increasing. However, optimizing their amplification remains challenging due to the complexity of integrating multiple components in traditional methods. To address this, we present NeuroAMP, a novel deep neural network for end-to-end, personalized amplification in hearing aids. NeuroAMP takes spectral features and the listener's audiogram as inputs, and we explore four architectures: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Convolutional Recurrent Neural Network (CRNN), and Transformer. We also introduce Denoising NeuroAMP, an extension that integrates noise reduction with amplification for improved real-world performance. To enhance generalization, we employed a comprehensive data augmentation strategy during training on diverse speech (TIMIT, TMHINT) and music (Cadenza Challenge MUSIC) datasets. Evaluation using the Hearing Aid Speech Perception Index (HASPI), Hearing Aid Speech Quality Index (HASQI), and Hearing Aid Audio Quality Index (HAAQI) shows that the Transformer-based NeuroAMP achieves the best performance, with SRCC scores of 0.9927 (HASQI) and 0.9905 (HASPI) on TIMIT, and 0.9738 (HAAQI) on the Cadenza dataset. Notably, the augmentation strategy maintains robust performance on unseen datasets (e.g., VoiceBank-DEMAND, MUSDB18-HQ). Furthermore, Denoising NeuroAMP outperforms both the conventional NAL-R+WDRC method and a two-stage baseline on the VoiceBank-DEMAND dataset, achieving HASPI and HASQI scores of 0.90 and 0.59, respectively. These results highlight the strong potential of NeuroAMP and Denoising NeuroAMP as a novel and effective framework for personalized hearing aid amplification.
Tags play a critical role in enhancing product discoverability, optimizing search results, and enriching recommendation systems on e-commerce platforms. Despite recent advancements in large language models (LLMs), which have shown proficiency in processing and understanding textual information, their application to tag generation remains an under-explored yet complex challenge. To this end, we introduce a novel method for automatic product tagging that uses LLMs to create behavior-enhanced tags (BETags). Specifically, our approach begins by generating base tags using an LLM. These base tags are then refined into BETags by incorporating user behavior data. This method aligns the tags with users' actual browsing and purchasing behavior, enhancing their accuracy and relevance to user preferences. By personalizing the base tags with user behavior data, BETags capture deeper behavioral insights, which are essential for understanding nuanced user interests and preferences in e-commerce environments. Moreover, since BETags are generated offline, they impose no real-time computational overhead and can be seamlessly integrated into downstream tasks commonly associated with recommendation systems and search optimization. Our evaluation of BETag across three datasets---Amazon (Scientific), MovieLens-1M, and FreshFood---shows that our approach significantly outperforms both human-annotated tags and other automated methods. These results highlight BETag as a scalable and efficient solution for personalized automated tagging, advancing e-commerce platforms by creating more tailored and engaging user experiences.
This paper tackles key challenges in Software-Defined Networking (SDN) by proposing a novel approach to optimizing resource allocation and dynamic priority assignment using OpenFlow's priority field. The proposed Lagrangian relaxation (LR)-based algorithms significantly reduce network delay, achieving performance management with dynamic priority levels while demonstrating adaptability and efficiency in a sliced network. The algorithms' effectiveness was validated through computational experiments, highlighting their strong potential for QoS management across diverse industries. Compared to the Same Priority baseline, the proposed methods (RPA, AP–1, and AP–2) exhibited notable performance improvements, particularly under strict delay constraints. For future applications, the study recommends extending the algorithms to handle larger networks and integrating them with artificial intelligence technologies for proactive resource optimization. Additionally, the proposed methods lay a solid foundation for addressing the unique demands of 6G networks, particularly in areas such as base station mobility (Low-Earth Orbit, LEO), ultra-low latency, and multi-path transmission strategies.
State-of-the-art (SOTA) semi-supervised learning techniques, such as FixMatch and its variants, have demonstrated impressive performance in classification tasks. However, these methods are not directly applicable to regression tasks. In this paper, we present RankUp, a simple yet effective approach that adapts existing semi-supervised classification techniques to enhance the performance of regression tasks. RankUp achieves this by converting the original regression task into a ranking problem and training it concurrently with the original regression objective. The auxiliary ranking classifier outputs a classification result, thus enabling integration with existing semi-supervised classification methods. Moreover, we introduce regression distribution alignment (RDA), a complementary technique that further enhances RankUp's performance by refining pseudo-labels through distribution alignment. Despite its simplicity, RankUp, with or without RDA, achieves SOTA results across a range of regression benchmarks, including computer vision, audio, and natural language processing tasks.
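The core idea of turning regression into ranking can be sketched as follows. This is a minimal illustration, not the paper's implementation: given a batch of regression targets, every ordered pair yields a binary "is item i greater than item j?" label, and a classifier head trained on these labels can then be plugged into confidence-based semi-supervised methods such as FixMatch. The function name `pairwise_rank_labels` is ours, chosen for illustration.

```python
import itertools

def pairwise_rank_labels(targets):
    """Convert regression targets into binary pairwise-ranking labels.

    For every ordered pair (i, j) with i != j, the label is 1 if
    targets[i] > targets[j], else 0.  Training a classifier head on
    these labels recasts the regression task as classification, so
    thresholding-based semi-supervised techniques become applicable.
    """
    pairs, labels = [], []
    for i, j in itertools.permutations(range(len(targets)), 2):
        pairs.append((i, j))
        labels.append(1 if targets[i] > targets[j] else 0)
    return pairs, labels

# A batch of 3 targets yields 6 ordered pairs.
pairs, labels = pairwise_rank_labels([0.2, 1.5, 0.7])
```

A batch of n items produces n(n-1) ordered pairs, so in practice pairs are typically formed within each mini-batch rather than over the whole dataset.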
Hepatocellular carcinoma (HCC), the most common type of liver cancer, poses significant challenges in detection and diagnosis. Medical imaging, especially computed tomography (CT), is pivotal in non-invasively identifying this disease, but requires substantial expertise for interpretation. This research introduces an innovative strategy that integrates two-dimensional (2D) and three-dimensional (3D) deep learning models within a federated learning (FL) framework for precise segmentation of liver and tumor regions in medical images. The study utilized 131 CT scans from the Liver Tumor Segmentation (LiTS) challenge and demonstrated the superior efficiency and accuracy of the proposed Hybrid-ResUNet model, with a Dice score of 0.9433 and an AUC of 0.9965, compared to ResNet and EfficientNet models. This FL approach is beneficial for conducting large-scale clinical trials while safeguarding patient privacy across healthcare settings. It facilitates active engagement in problem-solving, data collection, model development, and refinement. The study also addresses data imbalances in the FL context, showing resilience and highlighting local models' robust performance. Future research will concentrate on refining federated learning algorithms and their incorporation into continuous integration and deployment (CI/CD) processes in AI system operations, emphasizing the dynamic involvement of clients. We recommend a collaborative human-AI endeavor to enhance feature extraction and knowledge transfer. These improvements are intended to boost equitable and efficient data collaboration across various sectors in practical scenarios, offering a crucial guide for forthcoming research in medical AI.
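The Dice score reported above is the standard overlap metric for segmentation. As a quick reference, here is a minimal sketch of how it is computed over binary masks; the function name and toy masks are illustrative, not taken from the study.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|).

    `pred` and `target` are flat sequences of 0/1 labels; 1.0 means
    perfect overlap, 0.0 means no overlap at all.
    """
    inter = sum(p and t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * inter / denom

# Toy 6-pixel masks: 2 overlapping foreground pixels out of 3 each,
# giving 2*2 / (3+3) = 0.666...
pred   = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)
```

Because Dice weights overlap against total foreground size, it is less dominated by the background class than plain pixel accuracy, which matters for small tumor regions.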
The utilization of face masks is an essential healthcare measure, particularly during times of pandemics, yet it can present challenges in communication in our daily lives. To address this problem, we propose a novel approach known as the human-in-the-loop StarGAN (HL–StarGAN) face-masked speech enhancement method. HL–StarGAN comprises a discriminator, a classifier, a metric assessment predictor, and a generator that leverages an attention mechanism. The metric assessment predictor, referred to as MaskQSS, incorporates human participants in its development and serves as a “human-in-the-loop” module during the learning process of HL–StarGAN. The overall HL–StarGAN model was trained using an unsupervised learning strategy that simultaneously focuses on the reconstruction of the original clean speech and the optimization of human perception. To implement HL–StarGAN, we created a face-masked speech database named “FMVD,” which comprises recordings from 34 speakers in three distinct face-masked scenarios and a clean condition. We conducted subjective and objective tests on the proposed HL–StarGAN using this database. The test outcomes are as follows: (1) MaskQSS successfully predicted the quality scores of face-masked voices, outperforming several existing speech assessment methods. (2) The integration of the MaskQSS predictor enhanced the ability of HL–StarGAN to transform face-masked voices into high-quality speech; this enhancement is evident in both objective and subjective tests, outperforming conventional StarGAN- and CycleGAN-based systems.
Speech quality estimation has recently undergone a paradigm shift from human-hearing expert designs to machine-learning models. However, current models rely mainly on supervised learning, which is time-consuming and expensive for label collection. To solve this problem, we propose VQScore, a self-supervised metric for evaluating speech based on the quantization error of a vector-quantized-variational autoencoder (VQ-VAE). The training of VQ-VAE relies on clean speech; hence, large quantization errors can be expected when the speech is distorted. To further improve correlation with real quality scores, domain knowledge of speech processing is incorporated into the model design. We found that the vector quantization mechanism could also be used for self-supervised speech enhancement (SE) model training. To improve the robustness of the encoder for SE, a novel self-distillation mechanism combined with adversarial training is introduced. In summary, the proposed speech quality estimation method and enhancement models require only clean speech for training without any label requirements. Experimental results show that the proposed VQScore and enhancement model are competitive with supervised baselines.
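The intuition behind a quantization-error quality metric can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's VQ-VAE: the codebook here is a small fixed set of vectors, whereas VQScore uses a learned encoder and codebook trained on clean speech. The premise is the same: inputs resembling clean training data land near a codebook entry (small error), while distorted inputs do not.

```python
def quantization_error(frames, codebook):
    """Average squared distance from each feature frame to its nearest
    codebook vector.  A codebook fit only to clean speech leaves
    distorted inputs far from every code, so a larger error suggests
    lower quality -- the intuition behind a VQScore-style metric.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return sum(min(sqdist(f, c) for c in codebook) for f in frames) / len(frames)

codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # illustrative 3-entry codebook
clean = [(0.0, 0.0), (1.0, 1.0)]                 # frames lying on the codebook
noisy = [(0.3, -0.2), (1.4, 0.9)]                # the same frames, perturbed

clean_err = quantization_error(clean, codebook)
noisy_err = quantization_error(noisy, codebook)
```

Here the clean frames coincide with codebook entries and score zero error, while the perturbed frames score strictly higher, matching the label-free ranking behavior the abstract describes.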