The man says “chhum reap sour” (hello) and the machine replies hello. Then he says “chhum reap lear” (goodbye) and the robot answers “lear heoy” (bye). When the man says “Cambodia”, the robot tells him “I love Cambodia”.
The conversation took place between a small, box-like robot, which has the Cambodian national flag printed on it and the words “National Polytechnic Institute of Cambodia” (NPIC) on its 3.2-inch screen, and one of its student creators from the institute.
“We want to create a Khmer automatic speech recognition system that can turn spoken Khmer into a written Khmer transcript generated by itself without using the internet,” Ny Virbora or Bora, lead student on a team with three other members, told The Post.
Since they study software programming related to artificial intelligence (AI) their professor recommended that the four-member group – Cheat Chea, Sokheang Ching, Ny Piseth and Ny Virbora – work together to create a robot that recognises Khmer speech.
‘Assistant’ gets entry level job
The robot – named “Assistant” – is only 22cm in height and works automatically when the power is on. It can communicate in Khmer through the Khmer Automatic Speech Recognition System using Carnegie Mellon University’s (CMU) Sphinx speech recognition system with a limited dataset.
“The robot can transcribe speech to words and can reply to some specific questions from us. Therefore, we interrogate the robot in Khmer and it replies in Khmer with scripts on screen,” says Bora, who studies electronics engineering.
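The exchange described above, in which a recognised phrase triggers a fixed reply, can be pictured as a simple lookup. The following is only a minimal sketch and not the team's code: the `REPLIES` table and the `reply` function are hypothetical names, the phrases are the romanised forms quoted in this article, and it assumes that the robot's answer to “chhum reap sour” is the same greeting.

```python
# Hypothetical reply table built from the phrases quoted in the article.
# Assumption: the robot answers the greeting with the same greeting.
REPLIES = {
    "chhum reap sour": "chhum reap sour",  # hello -> hello
    "chhum reap lear": "lear heoy",        # goodbye -> bye
    "cambodia": "I love Cambodia",
}


def reply(transcript):
    """Return the canned reply for a recognised transcript, or None."""
    # Normalise case and whitespace so small transcript variations still match.
    key = " ".join(transcript.lower().split())
    return REPLIES.get(key)


if __name__ == "__main__":
    for phrase in ["chhum reap sour", "Chhum  reap lear", "Cambodia", "hello"]:
        print(phrase, "->", reply(phrase))
```

A phrase outside the table returns `None`, which mirrors the point that the robot can reply only to the specific questions it has been prepared for.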
The team is considering marketing the robots to private companies or corporate clients for customer service purposes but they also believe it could be developed for the education sector and that has been the group’s initial focus.
“We can use automaton in kindergarten and primary school levels while it can handle some inquiries, display pictures and videos to students,” said Bora.
Assistant uses noise reduction technology to help it understand what is being said to it. Its limited dataset includes 85 speakers and 157 words selected from the Khmer language, which it uses to create sentences when replying to people.
To evaluate the speech recognition accuracy, 100 Khmer transcripts were randomly created from the training dictionary and used for calculating the word and sentence error rate, according to the research team.
“The recognition accuracy for Khmer speech was up to 89.91% of word recognition accuracy and 90.02% of sentence recognition accuracy,” Bora said.
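For readers curious how word and sentence error rates of this kind are usually calculated, the sketch below shows one common approach. It is an illustration under stated assumptions, not the research team's evaluation script: the function names are hypothetical, the two test sentences are invented from the romanised phrases quoted in this article, and accuracy is taken to be one minus the corresponding error rate.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def error_rates(pairs):
    """Return (word error rate, sentence error rate) for (reference, hypothesis) pairs."""
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    ref_words = sum(len(r.split()) for r, _ in pairs)
    sentence_errors = sum(r.split() != h.split() for r, h in pairs)
    return word_errors / ref_words, sentence_errors / len(pairs)


if __name__ == "__main__":
    # Invented example: the second sentence has one wrongly recognised word.
    pairs = [
        ("chhum reap sour", "chhum reap sour"),
        ("chhum reap lear", "chhum reap sour"),
    ]
    wer, ser = error_rates(pairs)
    print(f"word accuracy: {1 - wer:.2%}, sentence accuracy: {1 - ser:.2%}")
```

In this toy case one of six reference words is wrong and one of two sentences contains an error, so a single misheard word costs far more at sentence level than at word level.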
He said the ability to reply to inquiries depends on the database used to train Assistant. Training in this context refers to a set of software tools and scripts that, given sufficient acoustic data, build an acoustic model for any language that can run with the CMU Sphinx speech recogniser.
“This automaton is not that different from a toddler. To teach young kids to recognise words, firstly we need to show that word and we speak the word to him several times. But the machine is unlike a human in that if a man talks to it then it would recognise that man only. So we need to input more data to make it efficiently recognise a wide range of human voices,” Bora said.
“Voices from more than 80 people were used to train this robot to recognise various accents. If you train with only one person’s accent then the automaton of course recognises only that person,” said Srun Channareth, a professor at National Polytechnic Institute of Cambodia and the Department of Electronics Master of Engineering who led the “Assistant” project.
‘Assistant’ finds success at work
Though Assistant knows only 157 words, Nareth said, the accuracy rate with them is higher than 90%.
“For this model, it is considered our first success because we can trust that it will get it right. The tested result for accuracy and the ability to transcribe speech to text is at 90% and the ability to use its set of words is at 89% because sentences made up from those words can be confusing sometimes,” said Bora. He cited the results of Microsoft’s research, which estimates that a robot with 90 per cent accuracy can be used in the field.
His team is now beginning to recruit volunteers to help integrate more data into the robot to enhance its use of words and phrases.
“As for the number of words, it depends on the location or sector where the robot will be used,” Bora said. “If we use it for the education sector, we’ll see what words need to be used by looking at textbook lessons. We only include Khmer words, as they will be transcribed into sentences.”
“What’s special about our robot is that it works offline without using the internet. Working offline means that it runs faster and it is local computing that processes the data in the robot without having to connect to a server,” said Bora, 22.
The next step is to build a larger database, because Assistant currently stores only a limited set of data. Expanding it will take time.
“From the beginning of the project to the first phase robot, our team spent more than a year, because we needed time to do a lot of research on artificial intelligence (AI), even if the robot design itself was effortless,” said Bora.
In terms of hardware materials, the team only had to spend about $200 since the robot’s body was 3D-printed by the university’s own 3D-printer.
‘Assistant’ gets a promotion
“Soon, we will create a new robot model and develop new software that acts as an assistant for use by companies and restaurants as well as customer service locations,” Bora said. “The new one will be smaller and more portable, but with an expanded screen size of up to 7 inches.”
The robot can act as an information source for customers without them having to rely on human services or contact. For instance, at the airport passengers could ask the robot to show them the scheduled flights from Phnom Penh to Japan. Once the recognition system for the Khmer language is fully developed its output can then be translated to other languages as well.
In order for the robot to function properly, Bora added, there are still two major stages left. The first is updating the software to make the system run faster, and the second is increasing its ability to recognise words.
“We will also equip it with a camera so that it has the ability to recognise people and be able to call them by their names directly,” said Nareth. “We have a lot of projects, but what we lack is support for further research.”
‘Assistant’ goes back to school
Bora told The Post that training robots to recognise words is very difficult as the first step is to create a database that includes both voice and text data.
“In the coding system, we cannot write in Khmer directly because the computer does not know Khmer script,” he said. “So we use the Latin alphabet represented in Khmer as a unicode format.”
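One way to picture what Bora describes is a lookup from Latin-alphabet tokens, which the recognition software works with internally, to Khmer Unicode strings for display on the screen. The sketch below is only an assumption about how such a mapping might look: the names `ROMAN_TO_KHMER` and `to_khmer` are hypothetical, the underscore-joined keys are an invented convention, and the Khmer spellings shown should be checked against the team's own word list.

```python
# Hypothetical mapping from Latin-alphabet tokens to Khmer Unicode text.
# The Khmer spellings are assumptions, not taken from the team's dataset.
ROMAN_TO_KHMER = {
    "chhum_reap_sour": "ជម្រាបសួរ",  # hello
    "chhum_reap_lear": "ជម្រាបលា",   # goodbye
    "lear_heoy": "លាហើយ",            # bye
}


def to_khmer(tokens):
    """Convert recognised Latin-alphabet tokens to Khmer script for display."""
    # Tokens missing from the table are passed through unchanged.
    return " ".join(ROMAN_TO_KHMER.get(token, token) for token in tokens)


if __name__ == "__main__":
    print(to_khmer(["chhum_reap_sour"]))
    print(to_khmer(["lear_heoy"]))
```

Keeping the recogniser's vocabulary in plain Latin characters and converting only at display time is one simple way to work with tools that were not designed with Khmer script in mind.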
It took the team almost half a year to create their initial database, since they had to collect the data and create a list of words. But for the new words being added, Bora says they have written software that speeds up the data collection and makes it less time-consuming.
“Before, there were 10 of us. We collected data and that took us up to two days, but then with the software I wrote, I spent just over an hour on it and we collected from 10 to 20 words,” he said.
The team struggles with both hardware and software issues and bridging the gap between theory and practice. They can design robot bodies but it can be a hassle to produce real-world parts from computer-aided designs.
“We need to study theories and lessons related to the development of AI and speech recognition, but when we put all those theories into practice, obviously, the practice is not the same as in the theory at all, so we need more time to experiment,” Bora said.
Software that supports Khmer takes a long time to get off the ground, although there are some theory books and software support available now. But the team cannot rely entirely on other countries’ software because they want what they are doing to be accessible to Cambodians who only know the Khmer language.
“Whether or not there is a sponsor,” he said, “it is important that we focus on promoting strategies for further research. Although we cannot afford to buy supplies and materials the school can help some. Not only that, teachers or professors can also help find additional funding.”
‘Assistant’ explores new careers
Prof. Channareth said that if a company develops a robot advanced enough to assist in communicating with customers or answering questions, it could use such robots to replace human workers and thereby save money and increase efficiency.
Industries from telephone companies to banks to restaurants and cafes or customer service providers can benefit from this approach and in fact already are in some parts of the world.
“When we enter a bank now, they have a machine that requires customers to press the number and choose the type of service they need and it then puts them in line to speak with a human teller to receive that service. What if that first machine they interacted with could just provide the service?
“However, we are mainly focused on educational applications that will be used by donors. Through working with charities the students can strengthen their programming skills and write code and become skilled in electronics engineering while capably contributing to the donors’ companies,” he said.