Samsung Advances Premium Mobile AI Experiences
June 10, 2024, by Rowena Cletus
Samsung is leading the way in premium mobile AI experiences. To understand how Galaxy AI is maximizing user potential, we are exploring Samsung Research centers worldwide. With support for 16 languages, Galaxy AI helps users expand their language capabilities, even offline, through on-device translation features like Live Translate, Interpreter, Note Assist, and Browsing Assist. Recently, we visited Jordan to learn about developing an AI model for Arabic. Now, we head to Vietnam to see how data is prepared to train AI models.
The Challenge of Vietnamese Language
Vietnamese, spoken by 97 million people, poses a unique challenge for AI models. Words like “ghost” (ma), “grave” (mả), and “mother” (má) sound similar but differ in tone, making it difficult for AI to distinguish them without understanding context and emotions.
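In writing, these three words differ only by a tone mark on the vowel. The short Python sketch below illustrates that point with the standard unicodedata module: it decomposes a syllable, separates the tone mark, and reports the base syllable and the tone. The function name split_tone and the tone labels are illustrative choices for this example, not part of Galaxy AI or any Samsung tooling.

```python
import unicodedata

# Combining marks used for the five marked Vietnamese tones.
# The sixth tone (ngang) is written with no mark at all.
TONE_MARKS = {
    "\u0300": "huyền (grave)",
    "\u0301": "sắc (acute)",
    "\u0303": "ngã (tilde)",
    "\u0309": "hỏi (hook above)",
    "\u0323": "nặng (dot below)",
}


def split_tone(syllable):
    """Return (base syllable, tone label) for one written syllable.

    Hypothetical helper for illustration only.
    """
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "ngang (unmarked)"
    kept = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so vowel-quality marks (as in â, ă, ơ) stay attached.
    return unicodedata.normalize("NFC", "".join(kept)), tone


for word in ["ma", "mả", "má"]:
    print(word, split_tone(word))
```

All three words reduce to the same base syllable, "ma", which is why a model that ignores tone cannot tell them apart. Speech adds a further difficulty, because there the tone has to be inferred from pitch rather than read from a mark.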
At the Samsung R&D Institute Vietnam (SRV), carefully refined data helps AI models recognize these subtle language differences. High-quality data directly affects the accuracy of automatic speech recognition (ASR), neural machine translation (NMT), and text-to-speech (TTS), the essential processes behind Galaxy AI features.
Overcoming Complex Language Barriers
“Vietnamese is a complex and diverse language with rich expressions, many of which are challenging to capture,” says Ngô Hồng Thái, NMT lead at SRV. Vietnamese, with its six distinct tones, required a meticulous approach to develop an effective AI model.
“Creating an AI model for Vietnamese was more daunting than our typhoons!” Thái adds. To recognize words accurately, the AI model had to differentiate between short audio frames of around 20 milliseconds. Homophones and homonyms added another layer of complexity, because the model must be trained to tell apart tones as well as similar-sounding words.
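To give a sense of what a 20-millisecond frame means in practice, here is a minimal Python sketch that cuts a list of audio samples into fixed-length frames. The 16 kHz sample rate, the non-overlapping frames, and the function name split_into_frames are assumptions made for this example. The article does not describe how SRV's pipeline actually frames audio.

```python
def split_into_frames(samples, sample_rate_hz=16000, frame_ms=20):
    """Cut a sequence of audio samples into non-overlapping frames.

    Hypothetical helper: the sample rate and the lack of overlap are
    assumptions for illustration, not details from the article.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # Step through the signal one frame at a time; a trailing partial
    # frame is dropped.
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


one_second = [0.0] * 16000          # one second of silence at 16 kHz
frames = split_into_frames(one_second)
print(len(frames), len(frames[0]))  # number of frames, samples per frame
```

Under these assumptions, one second of audio yields 50 frames of 320 samples each, so even a short syllable spans several frames that the model has to interpret together.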
Rigorous Data Preparation
The data refinement process has three steps: reviewing and correcting the audio and text, performing random quality checks, and normalizing and cleaning the dataset before training. Nguyen Manh Duy, TTS lead at SRV, explains that this process involves addressing misspelled words, background noise, and incorrect pronunciations to ensure high-quality training data.
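The text side of the second and third steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the specific cleaning rules (Unicode normalization, whitespace collapsing, lowercasing, dropping empty and duplicate lines), and the review sample size are hypothetical and are not taken from SRV's actual process.

```python
import random
import re
import unicodedata


def normalize_transcript(text):
    """Normalize one transcript line (hypothetical cleaning rules)."""
    # NFC gives every accented Vietnamese letter a single canonical form.
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


def clean_transcripts(lines):
    """Normalize lines, dropping empty ones and exact duplicates."""
    seen = set()
    cleaned = []
    for line in lines:
        norm = normalize_transcript(line)
        if norm and norm not in seen:
            seen.add(norm)
            cleaned.append(norm)
    return cleaned


def sample_for_review(items, k=2, seed=0):
    """Pick a random subset for a manual quality check."""
    rng = random.Random(seed)  # fixed seed keeps the check repeatable
    return rng.sample(items, min(k, len(items)))


raw = ["  Xin   CHÀO ", "xin chào", "", "Cảm ơn"]
cleaned = clean_transcripts(raw)
print(cleaned)
print(sample_for_review(cleaned))
```

Audio-side problems such as background noise and incorrect pronunciation need human review or signal processing and are outside the scope of this sketch.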
Limited accessible data for Vietnamese made the refinement stage crucial. The team collected data reflecting Vietnam’s northern, central, and southern accents to improve the AI model’s accuracy, resulting in an enormous amount of information to refine and verify.
Continuous Improvement
After months of hard work, Vietnamese became one of the first languages supported by Galaxy AI. The team continues to enhance the AI model by incorporating user feedback on the relevance of words and phrases in Galaxy AI.
“We have just taken our first steps into a more open world — and we have so much more to explore together,” says Tran Tuan Minh, leader of the AI language development project at SRV.
In the next episode of The Learning Curve, we will head to China to explore how AI models are trained and fine-tuned.
About The Author
A connoisseur of fashionable mobile tech, Rowena believes that technology should advance to a point where function can follow form. She covers a variety of topics but is most passionate about tech that improves our humanity.