The headlines proclaiming that no humans were involved in AlphaGo Zero’s mastery continue to amuse me. Now, if a machine had taught AGZ the principles and set it off on its path, that would really be something.
The press’s continued erasure of the humans who built these machines and supplied the first principles from which to learn the game is indicative of a larger issue: we don’t see the humans behind the decisions being made, and so we have no insight into their biases.
The linked headline above at least specifies that humans were not involved in the mastery, rather than the creation. AGZ also played significantly more games than AlphaGo Lee (AGL), though this is rarely mentioned. If you hearken back to Gladwell’s supposition that it takes 10,000 hours of practice to become an expert, and we are looking at machines playing between 100,000 and 4 million games to ‘learn’ to excel, it is not surprising that they outplay humans. We simply do not have the capacity (or the longevity, and likely the desire) to play so many games.
The description below, of allowing AGZ access to all of its past experience during practice games (which AGL had only when playing humans), is very interesting, and I’d love to know more about this decision and why it was taken.
AlphaGo Zero’s creators at Google DeepMind designed the computer program to use a tactic during practice games that AlphaGo Lee didn’t have access to. For each turn, AlphaGo Zero drew on its past experience to predict the most likely ways the rest of the game could play out, judge which player would win in each scenario and choose its move accordingly.
AlphaGo Lee used this kind of forethought in matches against other players, but not during practice games. AlphaGo Zero’s ability to imagine and assess possible futures during training “allowed it to train faster, but also become a better player in the end,” explains Singh, whose commentary on the study appears in the same issue of Nature.
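The mechanism described above, simulating how the rest of the game could play out, judging who would win in each scenario, and choosing a move accordingly, can be sketched in miniature. The toy below is emphatically not DeepMind’s actual method (AGZ uses Monte Carlo tree search guided by a neural network, per the Nature paper); it is a hypothetical illustration using pure random rollouts on a simple take-away game (remove 1–3 stones; whoever takes the last stone wins). All names here (`rollout`, `choose_move`) are my own.

```python
import random

def legal_moves(stones):
    # A player may remove 1, 2, or 3 stones (but no more than remain).
    return [m for m in (1, 2, 3) if m <= stones]

def rollout(stones, to_move):
    # Play the rest of the game with uniformly random moves;
    # return the index (0 or 1) of the player who takes the last stone.
    while stones > 0:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            return to_move
        to_move = 1 - to_move
    return 1 - to_move  # caller passed stones == 0: previous player already won

def choose_move(stones, player, n_sims=400):
    # For each candidate move, simulate many possible futures, estimate how
    # often `player` ends up winning, and pick the best-scoring move --
    # the "imagine and assess possible futures" idea in toy form.
    best_move, best_rate = None, -1.0
    for move in legal_moves(stones):
        if stones - move == 0:
            wins = n_sims  # taking the last stone wins outright
        else:
            wins = sum(rollout(stones - move, 1 - player) == player
                       for _ in range(n_sims))
        rate = wins / n_sims
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move
```

From a pile of 5 stones, the rollouts reliably favor taking 1 stone (leaving a losing position of 4 for the opponent), which matches the game’s known optimal strategy; AGZ’s advance was replacing these random playouts with value estimates learned from its own accumulated self-play experience.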