When AI Fails - Putting A Leading AI Algorithm To The Test

Late last year, Test Grid, a leading psychometric assessment and recruitment agency, published a blog post about AI, exploring the current state of AI recruitment technology.

As a quick refresher: artificial intelligence (AI) is an area of computer science focused on creating intelligent machines (or ‘robots’!) that can think and behave just like humans. Using machine learning, we can now train machines to learn from data: they can identify trends and patterns, and make predictions based on them.
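
To make that concrete, here is a minimal sketch of the ‘learn from data, then predict’ loop, written in Python with the open-source scikit-learn library (the toy data and model choice are purely our own illustration, not anything Test Grid describes):

```python
# Illustrative sketch only: the data, features, and model here are our
# own toy assumptions, not anything Test Grid uses.
from sklearn.linear_model import LogisticRegression

# Toy training data: [years_experience, assessment_score] -> hired (1) or not (0)
X_train = [[1, 55], [2, 60], [3, 70], [5, 80], [7, 85], [8, 90]]
y_train = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)        # identify patterns in past data...

print(model.predict([[6, 82]]))    # ...and predict an outcome for a new case
```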

The time-saving potential of AI in recruitment settings is obvious: consider how much time recruiters can save by leveraging AI to screen resumes, compile candidate shortlists, review video interviews, schedule interviews, and even answer candidate queries in real time. Test Grid certainly believes in the capacity of AI to free up recruiters’ time by automating repetitive administrative tasks, enabling them to focus on the more ‘human’ elements of their roles.

However, given the claim that AI trains machines to “think and behave just like humans”, Test Grid was interested in testing the cognitive ability of a popular AI tool to see exactly how its intelligence stacked up against that of real, living, breathing humans.

PUTTING AI TO THE TEST... (LITERALLY)
Test Grid asked a leading AI algorithm (‘word2vec’) to take one of their Verbal Reasoning assessments, using natural language processing. Natural language processing (NLP) involves converting language data into a form that computers can understand; word2vec does this by representing each word as a numerical vector, positioned so that words with related meanings sit close together.

The Verbal Reasoning assessment is one of their most commonly used tests: a mid-level assessment covering a range of verbal reasoning aptitude areas, such as vocabulary, similarities, and verbal analogies. It is typically used for professional and managerial roles.
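
To give a sense of how an algorithm can even attempt questions like these, here is a minimal sketch of word2vec answering similarity and analogy items, using the open-source gensim library and Google’s pretrained News vectors (both are assumptions on our part; the original post doesn’t say which word2vec implementation or training data Test Grid used):

```python
# A sketch of word2vec attempting verbal reasoning items.
# gensim + the pretrained Google News vectors are our assumptions;
# Test Grid's actual setup isn't described.
import gensim.downloader as api

# Downloads ~1.6 GB of 300-dimensional word vectors on first run
model = api.load("word2vec-google-news-300")

# Similarity item: how alike in meaning are two words?
# (cosine similarity of their vectors; higher means more similar)
print(model.similarity("doctor", "physician"))

# Analogy item: "man is to king as woman is to ?"
# (vector arithmetic: king - man + woman lands near "queen")
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```

A multiple-choice item can then be scored by picking whichever answer option’s vector best fits the relationship in question.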

THE RESULTS?
The AI tool scored in the 19th percentile, which is classified as Below Average. In other words, it performed better than just 19 percent of the general population on this assessment.

Which leads us to the question: do we really want to give AI tools free rein to make unsupervised decisions about people?

Test Grid maintains that the value of AI (as it stands currently) lies in supplementing, not replacing, human capability.

What do you think?