In the past few years, AI systems have gotten a lot better at, well, everything, from playing strategy games to writing news stories to creating photorealistic images. That progress is largely thanks to machine-learning techniques, which let a system accumulate years, even decades or centuries, of experience solving a problem in the span of hours, days, or weeks.
But this approach doesn’t work as well for robotics. Unlike a computer game, which can be run super-fast and on lots of computers at the same time, robots exist in the same world as the rest of us. Having a robot do a million years of training takes, well, a million years. You could train an AI in a simulation and then apply it to a real robot, but a robot in the real world won’t be quite like the simulation (maybe a joint has slightly more friction, maybe a sensor is slightly miscalibrated), so much of what the AI learned in the simulation won’t carry over.
OpenAI, a San Francisco-based AI research outfit, announced on Tuesday that it has figured out a way around that problem. The group unveiled a robot hand, called Dactyl, which it has taught to solve a Rubik’s Cube.
How do you use machine learning to train a robotic hand? It’s difficult, because simulations are too perfect. There’s only simulated friction, no gusts of air, no slight resistance in one joint due to maintenance problems. If you train a system purely in a simulation and transfer it to a robot, you’ll get a pretty fragile robot, if it works at all.
What you can do instead, and what the team at OpenAI did, is use a technique called domain randomization: simulate tons and tons of varied conditions, and use machine learning to train the robot to solve the cube regardless of the conditions. In other words, instead of just having the hand learn how to solve the cube, OpenAI taught it to solve the cube under many different conditions, so that the hand could keep going regardless of the physical factors it faced.
For example, in many of the simulations gravity was much weaker, or much stronger, or pointing in a completely different direction. Fingers swung less easily, or too easily, or not at all. Sensors — used to view the Rubik’s Cube and get a model of how it looked — malfunctioned. Throughout it all, the simulated hand endured and adapted.
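The idea can be sketched in a few lines of code. This is an illustrative toy, not OpenAI’s actual training code; the parameter names and ranges are invented for the example:

```python
import random

def randomize_physics():
    """Sample a fresh set of simulated physics parameters.

    The names and ranges here are made up for illustration;
    they are not OpenAI's actual randomization settings.
    """
    return {
        "gravity": random.uniform(0.3, 3.0) * 9.81,  # weaker or stronger pull
        "joint_friction": random.uniform(0.1, 5.0),  # fingers swing too easily, or barely
        "sensor_noise": random.uniform(0.0, 0.2),    # imperfect view of the cube
    }

def train(run_episode, episodes=10000):
    """Domain-randomized training: every episode gets new physics,
    so the learned behavior can't overfit to any one version of the world."""
    for _ in range(episodes):
        physics = randomize_physics()
        run_episode(physics)
```

Because no single set of physics persists long enough to memorize, the only winning strategy is one that works across all of them, which is exactly what carries over to the messy real world.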
“The only way it can really solve a task in all those environments is to learn to adapt really quickly,” OpenAI researcher Peter Welinder told me.
After thousands of years of “practice” in this environment, the hand learned some general principles about how to interact with the world, researchers told me. It could handle minor variation in its real environment because it had trained in simulated environments that varied far more. The researchers tested its resilience by messing with the robot as it tried to solve the cube: taping its fingers together, for example, or fitting it with a rubber glove, or poking it repeatedly with another robot hand. It could even handle having a blanket draped over it and then removed.
(Some efforts to confuse the robot did get the best of it; if you cover all its sensors, for example, it’s out of luck, and if you tie its thumb or its long flexible pinky finger down, it can’t recover.)
Solving a Rubik’s Cube one-handed is hard; most humans can’t do it. Dactyl is hardly the first robot to do things that some humans can’t (robots used for extremely delicate precision surgery have been around for a while), but manipulating an everyday human toy the way many humans do is a new achievement in robotics. Even more exciting, the approach looks like it could be applied to all sorts of other tasks, meaning we may be on the brink of robots broadly powered by machine learning.
Machine learning has made impressive progress
Dactyl, the researchers told me, starts out every test “learning” what its environment is like — which direction gravity is pointed, how its fingers are working at this moment, what other sources of resistance or distraction are present. Most machine-learning algorithms don’t learn anything during test time (they have a separate training period and a testing period, and can’t learn anything from the test period), but the researchers write that Dactyl’s behavior is consistent with the robot “learning” about its environment.
That is, Dactyl “figures out” what gravity is, and which of its limbs work, and then adjusts its strategy based on its new understanding of the world.
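That kind of within-episode adaptation can come from memory: the system carries an internal state that accumulates evidence about the environment, so its behavior shifts without any retraining. Here is a toy sketch of the mechanism (OpenAI’s actual policy was a recurrent neural network trained with reinforcement learning; the running average below just illustrates the idea):

```python
class AdaptivePolicy:
    """Toy stand-in for a recurrent policy: internal state accumulates
    evidence about the environment during an episode, so actions adapt
    with no weight updates at test time. This is an illustrative sketch,
    not OpenAI's architecture."""

    def __init__(self):
        self.gravity_estimate = 0.0  # what the world seems like so far
        self.observations = 0

    def act(self, sensed_pull):
        # Fold the new observation into a running estimate of gravity...
        self.observations += 1
        self.gravity_estimate += (sensed_pull - self.gravity_estimate) / self.observations
        # ...and push back in proportion to the estimated pull.
        return -self.gravity_estimate
```

Nothing about the policy’s “weights” changes at test time; only its internal state does, which is why the same trained system can cope with a world it has never seen exactly before.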
ML researchers take a lot of interest when an AI system appears to show “learning” behavior in its test environment. But because of how machine learning works, it’s hard to nail down for sure what Dactyl is doing; the best the researchers can do is identify reasons to think the AI is learning about its environment, and that’s what their paper does.
Whatever the robot is doing, it works.
In fact, Dactyl got so good at correcting for handicaps that the researchers told me they eventually had a hard time telling when part of their system was broken; usually, it’s easy to tell when a part is broken because the robot won’t work, but after training, the hand was compensating so effectively for malfunctioning joints and sensors that it was often tricky to figure out that something had gone wrong.
Bringing the power of machine-learning techniques to bear on robotics seems likely to make robots work a lot better. That’s because modern machine-learning techniques can do some really cool stuff. They have seen their potential realized in the last few years mostly because computers have gotten better, letting researchers train systems for longer and on larger data sets. And many researchers expect that trend to continue.
“The techniques we use here are extremely general,” OpenAI researcher Matthias Plappert told me. “The algorithm was used to train both DOTA —” OpenAI’s multiplayer strategy game AI — “and now the robotics hand.”
The fact that the same technique can be used to approach such different problems is what has some researchers thinking and talking about general AI — machine intelligence that can surpass humans in many different areas. While AI systems today are trained to solve a problem in a specific area — say, detecting tumors or solving a Rubik’s Cube or winning video games — most researchers agree that someday we’ll have AI systems with more general problem-solving capacity, ones that can work across many different fields.
That’s OpenAI’s ultimate goal, if it can be done safely and responsibly — which not all experts are sure of. Microsoft has invested $1 billion in helping them get there. One thing is certain: As AI gets more sophisticated, and researchers find more and more ways to bring the most impressive AI techniques to bear on problems that were previously believed to be immune to them, there’ll be fewer and fewer things we do better than they do.
So enjoy watching the human record-holder solve a Rubik’s Cube one-handed in 6.82 seconds, an accomplishment no robot has surpassed — yet.