Alibaba AI model notches up reading comprehension milestone

Alibaba’s deep-learning model has topped humans for the first time in a reading comprehension test. The Chinese e-commerce giant’s Institute of Data Science and Technologies (iDST) reports that its deep neural network model scored 82.44 on the Stanford Question Answering Dataset (SQuAD) on 11th January, narrowly beating the human score of 82.304 on the measure of providing exact answers to questions.

“It is our great honour to witness the milestone where machines surpass humans in reading comprehension,” says Luo Si, iDST’s Chief Scientist for Natural Language Processing. “We are thrilled to see NLP research has achieved significant progress over the year. We look forward to sharing our model-building methodology with the wider community and exporting the technology to our clients in the near future.”

Teams competing in the challenge need to build machine learning models that can provide answers to the questions in the dataset, such as “what causes rain?” The Alibaba model’s accuracy was tied to its ability to read hierarchically, from paragraphs to sentences to words, locating the precise phrases that contain potential answers. The model, which is built on a Hierarchical Attention Network, is viewed as having strong commercial value. Alibaba has used the underlying technology in its 11.11 Global Shopping Festival for several years, with machines answering large volumes of inbound customer inquiries. Other potential customer-service uses include tutorials for museum visitors and online responses to inquiries from medical patients.
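
For readers curious how a score for "exact answers" can be computed, the following is a minimal sketch in Python. It assumes the commonly used SQuAD-style convention of lowercasing, stripping punctuation and the articles "a", "an" and "the", and then checking whether the prediction matches any human reference answer. The function names and the toy answers are illustrative assumptions, not Alibaba's code or its actual outputs.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def exact_match_score(predictions, references_per_question):
    """Percentage of questions answered with an exact match."""
    hits = sum(
        exact_match(pred, refs)
        for pred, refs in zip(predictions, references_per_question)
    )
    return 100.0 * hits / len(predictions)


# Toy data invented for illustration only.
predictions = ["The gravity.", "clouds"]
references = [["gravity"], ["condensation", "water vapour"]]
score = exact_match_score(predictions, references)
```

Under this convention a prediction earns credit only when the whole answer string matches after normalization, which is why the gap between 82.44 and 82.304 reflects a small number of questions.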