Sunday, December 22, 2024

Bursting a Major AI Reliability Myth

Human beings are capable of a great deal, but if we really get down to business, there is still little we do better than improving on a consistent basis. This tendency to grow, no matter the situation, has already helped the world clock some huge milestones, with technology emerging as a major member of the group. The reason we hold technology in such high regard is, by and large, predicated on its skill set, which has guided us toward a reality nobody could have imagined otherwise. Nevertheless, if we look beyond the surface for a moment, it becomes clear that this run was also inspired by the way we applied those skills in real-world environments. That latter component did a lot to give the creation a spectrum-wide presence and, as a result, initiated a full-blown tech revolution. The next big thing this revolution did was scale up the human experience in some outright unique ways, and even after achieving a feat so notable, technology has continued to bring forth the right goods. The same has become more and more evident in recent times, and assuming one new discovery ends up having the desired impact, it will only put that trend on a higher pedestal moving forward.

A research team at the Massachusetts Institute of Technology has concluded a study testing a method commonly used to verify and explain an AI system’s behavior. To give you some context, the method in focus here, called formal specifications, uses mathematical formulas that can be translated into natural-language expressions. The purpose of doing so is essentially to spell out the decisions an AI will make in a way that is interpretable to humans. However, if we put our stock in the latest study, the method is not as interpretable as it has long been advertised to be. Before arriving at such a claim, the researchers conducted an experiment in which they enlisted both a contingent of experts in formal specifications and a group of non-experts. These participants were shown formal specifications in three ways: a “raw” logical formula, the formula translated into words closer to natural language, and a decision-tree format. Once that bit was done, they were asked to validate a fairly simple set of behaviors for a robot playing a game of capture the flag. Their practical task was simply to answer the question, “If the robot follows these rules exactly, does it always win?” Going by the available details, validation performance came in at only about 45 percent accuracy.
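The article does not reproduce the actual specifications used in the experiment, but a minimal sketch may help make the setup concrete. The Python snippet below invents a toy capture-the-flag rule, renders it in the same three formats described above (a raw formula, a natural-language translation, and a decision tree), and brute-forces the validation question participants faced. Every variable name, the rule itself, and the tiny game abstraction are assumptions made purely for illustration; they are not the study’s materials.

```python
# Hypothetical illustration of a "formal specification" and the validation
# question from the study. The rule, the state variables, and the game
# abstraction below are invented for illustration only.

from itertools import product

# "Raw" specification: a boolean formula over observable state variables.
#   win(state) := has_flag AND reached_base AND NOT tagged
def raw_spec(has_flag: bool, reached_base: bool, tagged: bool) -> bool:
    return has_flag and reached_base and not tagged

# The same rule translated into words closer to natural language
# (the second format shown to participants).
NATURAL_LANGUAGE = (
    "The robot wins when it is holding the flag, has returned to its own "
    "base, and has not been tagged by an opponent."
)

# A decision-tree rendering of the same rule (the third format).
DECISION_TREE = """
has_flag?
├── no  -> lose
└── yes -> reached_base?
           ├── no  -> lose
           └── yes -> tagged?
                      ├── yes -> lose
                      └── no  -> win
"""

def always_wins() -> bool:
    """Answer the participants' question -- 'If the robot follows these rules
    exactly, does it always win?' -- by enumerating every terminal state this
    toy abstraction allows and checking whether the win condition holds."""
    for has_flag, reached_base, tagged in product([True, False], repeat=3):
        if not raw_spec(has_flag, reached_base, tagged):
            return False  # at least one allowed outcome is a loss
    return True

if __name__ == "__main__":
    print(NATURAL_LANGUAGE)
    print(DECISION_TREE)
    print("Always wins under these rules?", always_wins())
```

Note that this toy rule set still permits losing outcomes, which is exactly the kind of detail the validation task asked participants to catch before answering “yes.”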

“When researchers say ‘our machine learning system is accurate,’ we ask ‘how accurate?’ and ‘using what data?’ and if that information isn’t provided, we reject the claim. We haven’t been doing that much when researchers say ‘our machine learning system is interpretable,’ and we need to start holding those claims up to more scrutiny,” said Hosea Siu, a researcher in the MIT Lincoln Laboratory’s AI Technology Group.

In order to understand the significance of such a development, we must acknowledge that interpretability matters because it allows humans to place their trust in a machine when it is used in a real, practical setting. However, given that the machine learning process happens in somewhat of a “black box,” gaining meaningful clarity on the subject has remained a massive challenge. Coming back to the study, interestingly enough, while experts in formal specifications did better than the rookies, they still tended to overlook rule sets that allowed for game losses. Instead, they placed an undue amount of trust in the correctness of the specifications presented to them, raising a major confirmation-bias concern.

“We don’t think that this result means we should abandon formal specifications as a way to explain system behaviors to people. But we do think that a lot more work needs to go into the design of how they are presented to people and into the workflow in which people use them,” said Siu.

The study, although pretty significant in its own right, is only one part of a larger project designed to improve the relationship between robots and human operators. You see, the way robots are typically programmed today often ends up diminishing the operators’ influence on the technology. Hence, the stated MIT project will try to let these operators teach tasks to robots directly, much as a human trainee would be taught. This won’t just bolster the individual’s trust in the robot; it will also help the robot become more adaptable.

“Our results push for the need to do human evaluations of certain systems and concepts of autonomy and AI before too many claims are made about their utility with humans,” said Siu.
