A full day at Sheraton Saigon (09/23/2025) showed how AI in Vietnam has shifted from technical demos to standardized pipelines, inference cost optimization, and in-depth discussion of “sovereign AI.” Here is what the URA team took away.
On 09/23/2025, members of the URA research group attended NVIDIA AI Day at Sheraton Saigon Grand Opera. The program ran all day at a moderate pace, interleaving:
- Keynotes/technical sessions: updates on agentic AI, AI factory, inference optimization, and observability.
- Hands-on/Lab (DLI): learn-by-doing, closely tied to real scenarios.
- Demo & networking area: where engineering teams, startups, and infrastructure providers shared “fix the pain you actually have” lessons.
What impressed us most was the “do it for real” spirit. From data governance and model selection to measuring cost and latency in production, the sessions spent less time on ideas and more on concrete ways of working.
What the team learned
Leaving the conference room, the whole team felt we had learned a lot. Most notable was a new way of looking at agentic AI: it is not just a chatbot that answers better, but a system that can plan, call the right tools, and watch for risks on its own. When components such as a Planner, Retriever/Memory (RAG + vector store), Tooling, Guardrails/Policy, and Tracing/Observability are put together, the reasoning → action flow becomes transparent and controllable, which is very different from the earlier habit of “guessing what the model was doing.”
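To make the plan → guardrail → tool → trace flow concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tool registry, the blocked-tool policy, and the rule-based `plan` function are all hypothetical stand-ins (a real planner would call an LLM), and the Retriever/Memory component is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """Collects one record per step so the reasoning -> action flow can be inspected."""
    events: list[tuple[str, str]] = field(default_factory=list)

    def log(self, stage: str, detail: str) -> None:
        self.events.append((stage, detail))


# Hypothetical tool registry: tool name -> callable taking and returning text.
TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
    "echo": lambda text: text,
}

# Hypothetical policy: tools the agent is never allowed to call.
BLOCKED_TOOLS = {"shell"}


def plan(question: str) -> list[tuple[str, str]]:
    """Toy planner: picks a tool with a simple rule instead of asking an LLM."""
    if "+" in question:
        return [("calculator", question)]
    return [("echo", question)]


def run_agent(question: str) -> tuple[str, Trace]:
    """Run every planned step, checking the guardrail before each tool call."""
    trace = Trace()
    steps = plan(question)
    trace.log("plan", repr(steps))
    result = ""
    for tool_name, arg in steps:
        if tool_name in BLOCKED_TOOLS or tool_name not in TOOLS:
            trace.log("guardrail", f"refused {tool_name}")
            return "refused", trace
        result = TOOLS[tool_name](arg)
        trace.log("tool", f"{tool_name} -> {result}")
    return result, trace
```

Calling `run_agent("2+3")` returns the tool result together with a trace whose events show which plan was chosen and which tool ran, so a reviewer can inspect each step instead of guessing.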
At the same time, we absorbed the idea of packaging models as services: using NIM microservices to standardize APIs and bring inference into CI/CD like any other service. Separating the model layer from the application, tracking cost per 1,000 requests, and reusing building blocks (LLM/vision/speech) clearly helps reduce technical debt, especially when model versions need to be swapped or when A/B testing different configurations and prompts.
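One simple way to track cost per 1,000 requests for a self-hosted model service is to divide the hourly infrastructure cost by the measured hourly throughput. The Python sketch below assumes this GPU-hour pricing model; the `Deployment` fields and any prices passed in are hypothetical and would come from your own billing data and load tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    gpu_hourly_usd: float       # assumed price per GPU-hour, not a real quote
    gpus: int                   # number of GPUs serving this model
    requests_per_second: float  # sustained throughput measured under load


def cost_per_1k_requests(d: Deployment) -> float:
    """Hourly cost divided by hourly request volume, scaled to 1,000 requests."""
    if d.requests_per_second <= 0:
        raise ValueError("throughput must be positive")
    requests_per_hour = d.requests_per_second * 3600
    hourly_cost = d.gpu_hourly_usd * d.gpus
    return hourly_cost / requests_per_hour * 1000


def cheaper(a: Deployment, b: Deployment) -> Deployment:
    """Pick the lower-cost side of an A/B comparison between two configurations."""
    return min((a, b), key=cost_per_1k_requests)
```

Because the metric depends only on cost and throughput, the same function can compare two model versions or two serving configurations without touching application code.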
Finally, the Sovereign AI discussion was very direct: to build AI for Vietnamese data, you must control your data and infrastructure. From classifying data (public/sensitive/strict) to on-prem training/fine-tuning with auditable pipelines, then multi-stage deployment (dev–staging–prod) with an “internet-off” option when needed—all of this forms a serious operational framework so AI can go into production safely and in compliance.
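The data classification idea can be expressed as a small policy check that runs before any dataset reaches a training or serving environment. The Python sketch below is an assumption-laden illustration: the three tiers come from the discussion above, but the mapping of tiers to environments (sensitive data stays on-prem, strict data also requires an “internet-off” environment) is our own hypothetical policy, not one prescribed at the event.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    SENSITIVE = "sensitive"
    STRICT = "strict"


@dataclass(frozen=True)
class Target:
    """A hypothetical deployment or training environment."""
    name: str
    on_prem: bool
    internet_access: bool


def is_allowed(data_class: DataClass, target: Target) -> bool:
    """Assumed policy: the stricter the data tier, the more isolated the target."""
    if data_class is DataClass.PUBLIC:
        return True
    if data_class is DataClass.SENSITIVE:
        return target.on_prem
    # STRICT: on-prem and disconnected from the internet.
    return target.on_prem and not target.internet_access
```

A check like this can gate each stage (dev, staging, prod) in the pipeline, and logging every decision gives an audit trail.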
Team spotlight
In the talk “Building AI Factories for the Next Industrial Revolution” by Dr. Ettikan Karuppiah (Regional Chief Technology Officer for South Asia–Pacific, NVIDIA), URASys (Unified Retrieval Agent-Based System), developed by the research group at Ho Chi Minh City University of Technology, was cited as an illustrative case of the AI agent application trend in Vietnam.

URASys — a project by the URA research group at Ho Chi Minh City University of Technology (VNU-HCM) — was presented by NVIDIA as the Vietnam use case at AI Day Ho Chi Minh City 2025.
URASys was the only project authored by students and graduate students introduced in this presentation, alongside two other projects from industry. This is a great honor for the team, whose research was recognized and presented to a wide audience, and it also opens up promising opportunities for connection, collaboration, and development for students, graduate students, and faculty of Ho Chi Minh City University of Technology in the field of artificial intelligence.
URASys was developed by a research group at Ho Chi Minh City University of Technology consisting of two third-year students, Nguyễn Song Thien Long and Vo Thi Nhu Quynh, and two master’s students, Le Hoang Anh Tai and Huynh Tieu Phung, under the scientific supervision of Assoc. Prof. Dr. Quan Thanh Tho, Head of the Faculty of Computer Science and Computer Engineering.

URASys architecture — a multi-agent Q&A system that combines an FAQ Agent and a Document Agent, inspired by the real admissions counseling workflow.
URASys is a multi-agent architecture built on Retrieval-Augmented Generation (RAG) that divides the work among agents, from orchestration and search to retrieval and answer generation. Its key features are:
- Manager Agent: analyzes and breaks down questions, orchestrating other agents.
- FAQ Agent + Document Agent: two parallel retrieval pipelines. The FAQ Agent searches a database of frequently asked questions, while the Document Agent retrieves from official text documents.
- RAG – Retrieval-Augmented Generation: when a user asks a question, the system first retrieves relevant text passages from the database, then feeds these passages to the AI model to generate an answer.
- “Just Enough” principle: the system returns an answer only once sufficient evidence has been gathered; if the query is ambiguous, it asks the user for clarification; if no data is available, it responds that the question is unanswerable (no information found).
- Two-stage indexing: Chunk-and-Title (creating passages rich in keywords and semantics) and Ask-and-Augment (creating multiple question–answer paraphrase pairs), which together improve retrieval effectiveness, especially in Vietnamese.
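The features above can be sketched as a toy pipeline in Python. This is not the URASys implementation: the word-overlap scoring, the thresholds, the ambiguity check based on query length, and the placeholder FAQ and document entries are all assumptions made for illustration. A real system would use proper retrieval and pass the gathered evidence to an LLM to generate the answer.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str   # "faq" or "document"
    text: str
    score: float  # fraction of query words found in the matched item


# Placeholder corpora (hypothetical content, not URASys data).
FAQ = {
    "What is the application deadline?":
        "The deadline is published on the admissions website.",
}
DOCUMENTS = [
    "Tuition fees are listed in the official admissions handbook.",
]


def tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def overlap(query: str, text: str) -> float:
    """Toy relevance score standing in for real retrieval."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def faq_agent(query: str) -> list[Evidence]:
    # Match the query against stored questions and return their answers.
    return [Evidence("faq", a, overlap(query, q)) for q, a in FAQ.items()]


def document_agent(query: str) -> list[Evidence]:
    # Match the query against passages from official documents.
    return [Evidence("document", p, overlap(query, p)) for p in DOCUMENTS]


def manager_agent(query: str, threshold: float = 0.5,
                  min_words: int = 3) -> tuple[str, str]:
    """Orchestrate both agents and apply a 'Just Enough'-style decision."""
    if len(tokens(query)) < min_words:  # assumed ambiguity check
        return "clarify", "Could you add more detail to your question?"
    evidence = [e for e in faq_agent(query) + document_agent(query)
                if e.score >= threshold]
    if not evidence:
        return "unanswerable", "No information found."
    best = max(evidence, key=lambda e: e.score)
    return "answer", best.text
```

The manager returns one of three statuses (`answer`, `clarify`, `unanswerable`), mirroring the three outcomes described in the “Just Enough” principle.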