Welcome to MLDA 2025

4th International Conference on Machine Learning, NLP and Data Mining (MLDA 2025)

July 25 ~ 26, 2025, Virtual Conference



Accepted Papers
AHT-VIT: Adaptive Halting Transformer with Planned Depth Execution

Vitalii Shutov, Arip Asadulaev, ITMO University, Russian Federation

ABSTRACT

Vision Transformers (ViTs) offer strong performance but incur high computational cost by processing all tokens through their full depth; standard ViTs lack adaptivity. This work introduces the Adaptive Halting Transformer (AHT-ViT), which improves efficiency by dynamically adjusting processing depth per token. AHT-ViT employs hierarchical "planner" modules that predict token-specific target halting depths and an extremely parameter-efficient "supervisor" mechanism (two shared parameters) that generates per-layer halting scores. A token halts once its cumulative score crosses a threshold. A novel KL-divergence-based loss, L_target-depth, explicitly aligns the executed halting distributions with the planned depths. Evaluation on ImageNet, Places365, and CIFAR-100 using DeiT-S shows that AHT-ViT achieves an improved accuracy-efficiency trade-off over its static baseline and performs competitively against other adaptive methods (DynamicViT, A-ViT) evaluated under the same conditions, while significantly reducing FLOPs. Key hyperparameters were selected via grid search on a validation split.
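The cumulative-score halting rule and the depth-alignment loss described above can be illustrated with a minimal, framework-agnostic sketch. All names, shapes, and values here are hypothetical, not the paper's actual implementation; a real AHT-ViT would compute scores inside a transformer and backpropagate through the loss.

```python
import numpy as np

def halting_depths(scores, threshold=1.0):
    """Per-token halting: each row of `scores` holds per-layer halting
    scores for one token; a token halts at the first layer where the
    cumulative score crosses `threshold`. Tokens that never cross run
    the full depth. (Illustrative sketch only.)"""
    cum = np.cumsum(scores, axis=1)            # cumulative score per layer
    halted = cum >= threshold                  # True from the crossing layer on
    depths = np.argmax(halted, axis=1)         # index of the first crossing
    depths[~halted.any(axis=1)] = scores.shape[1] - 1  # never halted -> full depth
    return depths

def kl_target_depth_loss(executed, planned, eps=1e-8):
    """KL(planned || executed) between per-token depth distributions,
    illustrating the spirit of the L_target-depth loss."""
    p = planned / planned.sum(axis=1, keepdims=True)
    q = executed / executed.sum(axis=1, keepdims=True)
    return float(np.mean(np.sum(p * np.log((p + eps) / (q + eps)), axis=1)))
```

For instance, a token with per-layer scores [0.5, 0.6, 0.2] crosses a threshold of 1.0 at layer index 1 and halts there, while a token with scores [0.1, 0.2, 0.3] never crosses and runs all three layers.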

Keywords

Vision Transformer, Adaptive Computation, Early Exit, Dynamic Depth, Model Efficiency, Image Classification.


RAG in Specialized Domains: A Survey of QA Chatbots

Saikrishna Rajanidi, Anbazhagan M, Ramya G. R, Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Coimbatore, India

ABSTRACT

This paper explores the evolution of large language models (LLMs) and the growing role of retrieval-augmented generation (RAG) systems in overcoming challenges in domain-specific applications. Although LLMs have revolutionized natural language processing (NLP), they face critical limitations in high-stakes domains such as medicine, engineering, and law, where accuracy, factuality, and trust are paramount. These shortcomings include hallucinations, outdated knowledge, and vulnerability to adversarial prompts. RAG systems address these issues by integrating LLMs with external, domain-specific knowledge sources to improve factual grounding and response reliability. Frameworks like Almanac in clinical settings and KEAG in complex QA tasks demonstrate how RAG reduces hallucinations, enhances interpretability, and delivers accurate, evidence-backed responses. In healthcare, combining LLMs with RAG has raised reported accuracy from around 93.25% to 99.25%, showing its impact on real-world decision support. This paper presents a structured synthesis of advancements, challenges, and optimization strategies in RAG for specialized domains, paving the way for safer, more transparent, and adaptive AI systems.
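One retrieval strategy named in this paper's keywords, Maximum Marginal Relevance (MMR), can be sketched in a few lines: candidates are selected iteratively to balance relevance to the query against redundancy with documents already chosen. The toy vectors and the `lam` trade-off value below are assumptions for illustration; a production RAG system would use a learned embedding model and a vector store rather than hand-written vectors.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr(query_vec, doc_vecs, k=2, lam=0.7):
    """Maximum Marginal Relevance selection: at each step pick the
    candidate maximizing lam * relevance-to-query minus
    (1 - lam) * similarity-to-already-selected. (Illustrative sketch.)"""
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            rel = cosine(query_vec, doc_vecs[i])
            red = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                      default=0.0)
            return lam * rel - (1 - lam) * red
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a high `lam`, MMR behaves like plain top-k similarity search; lowering `lam` increasingly penalizes near-duplicate passages, which helps a QA chatbot fill its limited context window with diverse evidence.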

Keywords

Retrieval Augmented Generation, Large Language Models, Fine Tuning, Maximum Marginal Relevance Retrieval, Neural Generative Question Answering.