
This afternoon, Mr. Vo Duy Hung presented the study “VIEACT-TTS-VC: A Vietnamese End-to-End Text-to-Speech and Voice Conversion Framework Using Low-Resource Adaptable Style Transfer” at the 10th ASIA International Conference (AIC 2024), held at Universiti Teknologi Malaysia (UTM), Malaysia, on December 20–22, 2024.

The research tackles the challenges of synthesizing natural and intelligible Vietnamese speech while adapting to diverse speaker styles in low-resource settings. Vietnamese, with its six tones and complex regional accents, poses significant hurdles, which VIEACT-TTS-VC addresses through a robust framework built on the VITS baseline. The system produces high-quality speech from minimal data, supporting both text-to-speech synthesis and voice conversion while preserving linguistic fidelity.

In evaluations, the model delivered strong results, achieving a Mean Opinion Score (MOS) of 4.31 ± 0.158 for comprehensibility and 3.68 ± 0.135 for naturalness, even with only five minutes of training data. By outperforming state-of-the-art models in speaker style adaptation and robustness, VIEACT-TTS-VC establishes itself as a benchmark for AI-driven speech synthesis in Vietnamese, paving the way for adaptable and scalable solutions in multilingual AI research and development.
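For readers unfamiliar with the metric, a MOS such as 4.31 ± 0.158 is typically the mean of listener ratings on a 1–5 scale, reported with a confidence interval. A minimal sketch of that computation is shown below; the ratings are hypothetical illustrative values and the normal-approximation 95% interval is a common convention, not necessarily the exact procedure used in the study:

```python
import math

# Hypothetical listener ratings on a 1-5 scale (illustrative only,
# NOT the study's actual data).
ratings = [5, 4, 4, 5, 4, 3, 5, 4, 4, 5]

def mos_with_ci(scores, z=1.96):
    """Mean Opinion Score with a 95% confidence interval
    (normal approximation, sample standard deviation)."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    ci = z * math.sqrt(variance / n)
    return mean, ci

mean, ci = mos_with_ci(ratings)
print(f"MOS: {mean:.2f} ± {ci:.3f}")
```

With more listeners and utterances, the interval tightens, which is why MOS studies usually aggregate many ratings per condition.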