AI classification: improving machine learning without negative data

© Stock/NicoElNino

A research team from the RIKEN Center for Advanced Intelligence Project (AIP) has developed a machine learning method that allows an AI to make classifications without what is known as “negative data,” a finding that could extend AI classification to a wider variety of tasks.

AI classification technology

AI classification technology uses machine learning to tell positive data from negative data. It is commonly applied to tasks such as filtering spam mail, flagging fake news, and identifying objects and faces.

According to lead author Takashi Ishida from RIKEN AIP, “Previous classification methods could not cope with the situation where negative data were not available, but we have made it possible for computers to learn with only positive data, as long as we have a confidence score for our positive data, constructed from information such as buying intention or the active rate of app users. Using our new method, we can let computers learn a classifier only from positive data equipped with confidence.”

What is negative data?

For AI classification technology to work, a computer must learn the classification boundary separating positive and negative data, so that it can then determine whether new data is positive or negative. For example, positive data might be photos that include a happy face, while negative data might be photos that include a sad face.
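To make the idea of a classification boundary concrete, here is a minimal Python sketch of the conventional setting, where both positive and negative examples are available. It is an illustration only: the one-dimensional synthetic data, the logistic model, and the names `train_logistic` and `boundary` are assumptions made for this example and do not come from the RIKEN work.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit g(x) = w*x + b by gradient descent on the logistic loss.

    Labels in ys are +1 (positive) or -1 (negative).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            # d/dg log(1 + exp(-y*g)) = -y * sigmoid(-y*g)
            dg = -y * sigmoid(-y * (w * x + b))
            grad_w += dg * x
            grad_b += dg
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
# Synthetic data (assumed): positives cluster near +2, negatives near -2.
positives = [random.gauss(2.0, 1.0) for _ in range(200)]
negatives = [random.gauss(-2.0, 1.0) for _ in range(200)]
xs = positives + negatives
ys = [1] * len(positives) + [-1] * len(negatives)

w, b = train_logistic(xs, ys)
boundary = -b / w  # the point where g(x) = 0
print(f"learned boundary at x = {boundary:.2f}")
```

Points to the right of the learned boundary are classified as positive and points to the left as negative. Both classes contribute to the gradient here, which is exactly what becomes impossible when negative data cannot be collected.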

The issue with AI classification technology is that the machine learning process requires both positive and negative data, but in many cases negative data is unavailable.

Schematic showing positive data (apples) and a lack of negative data (bananas), with an illustration of the confidence of the apple data. © RIKEN Center for Advanced Intelligence Project

Improving machine learning

The researchers added a confidence score to each piece of positive data: a mathematical probability that the data belongs to the positive class. With this score, computers can learn a classification boundary using only positive data and their confidence scores.
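The article does not spell out the training objective, so the following Python sketch is only one plausible reading of learning from “positive data equipped with confidence.” It assumes each positive example x comes with a confidence r(x), the probability that x is positive, and trains with a logistic loss in which every positive example also counts as a negative with weight (1 - r(x)) / r(x). The synthetic data, the way the confidences are computed, and the name `train_pconf` are all assumptions made for illustration, not the team's implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_pconf(xs, confs, lr=0.1, epochs=500):
    """Fit g(x) = w*x + b from positive examples and their confidences.

    Per-example objective (assumed form):
        loss(g) + (1 - r) / r * loss(-g),  loss(z) = log(1 + exp(-z))
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, r in zip(xs, confs):
            g = w * x + b
            weight = (1.0 - r) / r
            # The positive term pushes g up; the weighted term pushes it down.
            dg = -sigmoid(-g) + weight * sigmoid(g)
            grad_w += dg * x
            grad_b += dg
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
# Only positive examples are observed (assumed: drawn from N(+1, 1)).
xs = [random.gauss(1.0, 1.0) for _ in range(2000)]
# Toy confidences: if the unseen negatives followed N(-1, 1) with equal
# class sizes, the probability of being positive would be sigmoid(2x).
confs = [sigmoid(2.0 * x) for x in xs]

w, b = train_pconf(xs, confs)
boundary = -b / w  # the point where g(x) = 0
print(f"learned boundary at x = {boundary:.2f}")
```

No negative example appears anywhere in the training data; low-confidence positives stand in for the missing class. In this toy setup the confidences are computed from known data-generating distributions, whereas in practice they would have to come from an external signal, such as the buying intention or app activity rates mentioned above.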

According to Ishida, “This discovery could expand the range of applications where classification technology can be used. Even in fields where machine learning has been actively used, our classification technology could be used in new situations where only positive data can be gathered due to data regulation or business constraints. In the near future, we hope to put our technology to use in various research fields, such as natural language processing, computer vision, robotics, and bioinformatics.”
