
18.10.2025: URAx Achieves Top 3 in the LegalSLM Challenge, Paper Accepted, and Heads Toward VLSP 2025 in Hanoi

The URAx Research Team from Ho Chi Minh City University of Technology (HCMUT – VNU-HCM) has reached an important milestone by achieving a Top 3 nationwide ranking in the VLSP 2025 Challenge on Vietnamese Legal Small Language Models (LegalSLM). This competition is one of the major shared tasks of the Vietnamese Language and Speech Processing (VLSP 2025) Workshop, which will be held in Hanoi on October 29–30, 2025.

VLSP Challenge on Vietnamese Small Legal Language Model Leaderboard

The LegalSLM Challenge aims to develop compact, domain-specialized Vietnamese legal language models with no more than four billion parameters. These models are designed to reason over statutory texts, perform legal question answering, and generate accurate legal explanations, all while remaining computationally efficient enough for deployment in real-world environments.

Amid strong competition from academic institutions and industry teams, URAx distinguished itself by finishing in the Top 3 overall. The team achieved the highest score in the Free-Text Legal Question Answering track, which highlights its ability to generate coherent and statute-grounded responses in complex legal contexts. This result demonstrates both technical depth and domain understanding, positioning URAx among the leading research teams in Vietnamese Legal NLP.

Research Direction: Small Language Models for Vietnamese Legal Reasoning

Building on its challenge entry, URAx went on to study how small language models can perform Vietnamese legal reasoning effectively under resource constraints.

This research is presented in the paper titled “Can Small Language Models Handle Vietnamese Legal Reasoning? Insights from Multi-Task Evaluation.” The study was conducted by URA student members Long S. T. Nguyen, Hung C. Luu, Quynh T. N. Vo, Hy N. G. La, Hoai M. Tran, Anh T. D. Dinh, Tuan H. Nguyen, and Tri N. Ho under the supervision of Assoc. Prof. Quan Thanh Tho from Ho Chi Minh City University of Technology, VNU-HCM.

The paper has been accepted for presentation at the VLSP 2025 Workshop, marking an important recognition of the team’s contribution to Vietnamese legal language modeling and its growing role in the national AI research community.

Study Overview and Key Findings

The study investigates a central question: Can small-to-medium language models with no more than four billion parameters achieve competitive performance in Vietnamese legal reasoning while remaining lightweight and deployable?

To answer this question, the team proposed a two-stage adaptation framework using the Qwen3-4B model as the backbone. The first stage, continual pretraining, was conducted on more than 145,000 Vietnamese legal documents to inject statutory knowledge and linguistic understanding. The second stage, multi-task instruction fine-tuning, was performed on three types of legal reasoning tasks: Legal Citation Usefulness (Natural Language Inference), Multiple-Choice Legal Question Answering, and Free-Text Legal Question Answering.
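For readers curious about what the multi-task instruction fine-tuning stage might look like in practice, the sketch below shows one plausible way to render the three task types as prompt/response pairs in a single training mixture. It is only an illustration: the article does not describe the team's data format, so the task labels, the `Example` fields, the prompt templates, and the helper names `to_instruction` and `build_mixture` are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical task labels. The article names the three task types
# but does not say how the team encoded them.
NLI = "citation_usefulness"
MCQ = "multiple_choice"
FREE_TEXT = "free_text"


@dataclass
class Example:
    task: str
    question: str
    answer: str
    context: str = ""    # cited legal passage, used by the NLI task
    choices: tuple = ()  # answer options, used by the multiple-choice task


def to_instruction(ex: Example) -> dict:
    """Render one example as a prompt/response pair (illustrative templates only)."""
    if ex.task == NLI:
        prompt = (
            f"Question: {ex.question}\n"
            f"Cited passage: {ex.context}\n"
            "Is the cited passage useful for answering the question? Answer Yes or No."
        )
    elif ex.task == MCQ:
        # Label the options A, B, C, ... in the order given.
        options = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(ex.choices))
        prompt = (
            f"Question: {ex.question}\n{options}\n"
            "Answer with the letter of the correct option."
        )
    elif ex.task == FREE_TEXT:
        prompt = (
            f"Question: {ex.question}\n"
            "Answer with reference to the relevant legal provisions."
        )
    else:
        raise ValueError(f"unknown task: {ex.task}")
    return {"task": ex.task, "prompt": prompt, "response": ex.answer}


def build_mixture(examples):
    """Convert examples from all three tasks into one instruction-tuning list."""
    return [to_instruction(ex) for ex in examples]


if __name__ == "__main__":
    # Placeholder content only; no real legal text is implied.
    demo = [
        Example(NLI, "<question>", "Yes", context="<cited passage>"),
        Example(MCQ, "<question>", "B", choices=("<option 1>", "<option 2>")),
        Example(FREE_TEXT, "<question>", "<reference answer>"),
    ]
    for record in build_mixture(demo):
        print(record["task"], "->", record["prompt"].splitlines()[0])
```

Keeping all three tasks in one shared prompt/response schema is one simple way to fine-tune a single model on them together; the actual mixing ratios and templates used in the paper may differ.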

Experimental results showed that small language models, when adapted with domain-specific techniques, can achieve competitive accuracy on discriminative reasoning tasks such as multiple-choice and entailment-based legal reasoning. However, free-text legal question answering remains a significant challenge, indicating that future improvements will require hybrid approaches integrating retrieval-augmented generation and larger-scale legal QA datasets.

Significance and Impact

The achievement reflects the research capability and innovation of Ho Chi Minh City University of Technology in the field of Artificial Intelligence and Natural Language Processing. It also emphasizes the university’s pioneering role in developing Vietnamese Legal AI, a domain that bridges law and technology.

The URAx team’s findings demonstrate that domain-specialized small language models can become practical and reliable tools for various applications, including intelligent legal assistants, automated legal document retrieval systems, and legal question-answering platforms. The work contributes to the long-term goal of building accessible, trustworthy, and socially beneficial AI systems designed specifically for the Vietnamese context.

Looking Ahead to VLSP 2025

With a Top 3 ranking in the VLSP 2025 LegalSLM Challenge and a paper accepted for presentation, the URAx team is now preparing to present its research results at the VLSP 2025 Workshop in Hanoi.

This dual achievement represents both a scientific success and a meaningful step forward for Vietnamese AI research. It highlights the growing potential of the Vietnamese research community in advancing legal language modeling, domain adaptation, and resource-efficient NLP.

The URAx team’s work stands as a proud moment for the URA Research Group and Ho Chi Minh City University of Technology, reaffirming their dedication to high-quality research and their mission to cultivate the next generation of AI innovators in Vietnam.
