13.12.2024: TI-JEPA Takes the Stage at SoICT: Solving the Semantic Gap

This afternoon, Vo Hoang Nhat Khang and Nguyen Phan Tri Duc presented their paper, “TI-JEPA: An Innovative Joint Embedding Strategy for Text-Image Multimodal Systems.” The work addresses the semantic gap between the text and image modalities, a key challenge in artificial intelligence that often leads to collapsed representations and hinders effective multimodal fusion. TI-JEPA is a self-supervised pre-training approach that preserves meaningful cross-modal relationships so that textual and visual data stay robustly aligned. It achieved state-of-the-art results in multimodal sentiment analysis and showed significant potential for broader applications such as Visual Question Answering.
