Artificial Intelligence, games and COVID connections today

Let’s first, as I typically do (bear with me), consider where this came from. Arguably, the technological advance that most altered the course of modern history was the invention of the printing press in the 15th century, which allowed the search for empirical knowledge to supplant liturgical doctrine, and the Age of Reason to gradually supersede the Age of Religion. Individual insight and scientific knowledge replaced faith as the principal criterion of human consciousness. Information was stored and systematized in expanding libraries. The Age of Reason originated the thoughts and actions that shaped the contemporary world order.

But that order is now in upheaval amid a new, even more sweeping technological revolution whose consequences we have failed to fully reckon with, and whose culmination may be a world relying on machines powered by data and algorithms and ungoverned by ethical or philosophical norms. Obviously, the internet age in which we already live prefigures some of the questions and issues that AI will only make more acute. The Enlightenment sought to submit traditional verities to a liberated, analytic human reason. The internet’s purpose is to ratify knowledge through the accumulation and manipulation of ever-expanding data. Human cognition loses any personal character or personality. Individuals turn into data.

Users of the internet emphasize retrieving and manipulating information over contextualising or conceptualising its meaning. They rarely interrogate history or philosophy; as a rule, they demand information relevant to their immediate practical needs. In the process, search-engine algorithms acquire the capacity to predict the preferences of individual clients, enabling the algorithms to personalize results and make them available to other parties for political or commercial purposes. Truth becomes relative. Information threatens to overwhelm wisdom.

Inundated via social media with the opinions of multitudes, users are diverted from introspection; in truth, many technophiles use the internet to avoid the solitude they dread. All of these pressures weaken the fortitude required to develop and sustain convictions that can be implemented only by traveling a lonely road, which is the essence of creativity. Indeed, AI today typically refers to a method called machine learning, or deep learning, whose data processing is loosely modelled on how human brains work. It's revolutionary because AI models are trained through exposure to real-world data. For example, AI can learn what cat faces look like by analysing thousands of cat photos, in contrast to traditional programming, where a developer would try to describe the full feline variety of fur, whiskers, eyes and ears.

The impact of internet technology on politics is particularly pronounced. The ability to target micro-groups has broken up the previous consensus on priorities, by permitting a focus on specialised purposes or grievances. Ask a Canadian trucker. Political leaders, overwhelmed by niche pressures, are deprived of time to think or reflect on context, contracting the space available for them to develop vision. Look how much ‘air time’ on the internet anti-vaxxers receive after all.

The digital world’s emphasis on speed inhibits reflection; its incentive empowers the radical over the thoughtful; its values are shaped by subgroup consensus, not by introspection. For all its achievements, it runs the risk of turning on itself as its impositions overwhelm its conveniences. As the internet and increased computing power have facilitated the accumulation and analysis of vast data, unprecedented vistas for human understanding have emerged. Perhaps most significant is the project of producing artificial intelligence, a technology capable of inventing and solving complex, seemingly abstract problems by processes that seem to replicate those of the human mind.

This goes far beyond automation as we have known it. Automation deals with means; it achieves prescribed objectives by rationalising or mechanising instruments for reaching them. AI, by contrast, deals with ends; it establishes its own objectives. To the extent that its achievements are in part shaped by itself, AI is inherently unstable. AI systems, through their very operations, are in constant flux as they acquire and instantly analyze new data, then seek to improve themselves on the basis of that analysis. Through this process, artificial intelligence develops an ability previously thought to be reserved for human beings. It makes strategic judgments about the future, some based on data received as code (for example, the rules of a game), and some based on data it gathers itself (for example, by playing 1 million iterations of a game). In doing so, it helped me and BenevolentAI find the best drug to treat patients in hospital with COVID-19 pneumonia. To be clear, the Pfizer/Merck pills and all the antibodies are designed to prevent hospitalisation. Baricitinib now has WHO’s highest evidence level, and this all came from AI back in early 2020.

The driverless car illustrates the difference between the actions of traditional human-controlled, software-powered computers and the universe AI seeks to navigate. Driving a car requires judgments in multiple situations impossible to anticipate and hence to program in advance. What would happen, to use a well-known hypothetical example, if such a car were obliged by circumstance to choose between killing a grandparent and killing a child? Whom would it choose? Why? Which factors among its options would it attempt to optimize? And could it explain its rationale? Challenged, its truthful answer would likely be, were it able to communicate: “I don’t know (because I am following mathematical, not human, principles),” or “You would not understand (because I have been trained to act in a certain way but not to explain it).” Yet driverless cars are likely to be prevalent on roads within a decade.

We must expect AI to make mistakes faster, and of greater magnitude, than humans do. AI research now seeks to bring about a “generally intelligent” AI capable of executing tasks in multiple fields. A growing percentage of human activity will, within a measurable time period, be driven by AI algorithms. But these algorithms, being mathematical interpretations of observed data, do not explain the underlying reality that produces them. Paradoxically, as the world becomes more transparent, it will also become increasingly mysterious. What will distinguish that new world from the one we have known? How will we live in it? How will we manage AI, improve it, or at the very least prevent it from doing harm? These questions culminate in the most ominous concern: that AI, by mastering certain competencies more rapidly and definitively than humans, could over time diminish human competence and the human condition itself as it turns it into data.

Philosophy to AlphaGo

AI will in time bring extraordinary benefits to medical science, clean-energy provision, environmental issues, and many other areas, no doubt. But precisely because AI makes judgments regarding an evolving, as-yet-undetermined future, uncertainty and ambiguity are inherent in its results. There are three areas of special concern:

First, that AI may achieve unintended results. Science fiction has imagined scenarios of AI turning on its creators. More likely is the danger that AI will misinterpret human instructions due to its inherent lack of context. A famous example was the AI chatbot called Tay, designed to generate friendly conversation in the language patterns of a 19-year-old girl. But the machine proved unable to define the imperatives of “friendly” and “reasonable” language installed by its instructors and instead became racist, sexist, and otherwise inflammatory in its responses. Some in the technology world claim that the experiment was ill-conceived and poorly executed, but it illustrates an underlying ambiguity: to what extent is it possible to enable AI to comprehend the context that informs its instructions? What medium could have helped Tay define for itself offensive, a word upon whose meaning humans do not universally agree? Can we, at an early stage, detect and correct an AI program that is acting outside our framework of expectation? Or will AI, left to its own devices, inevitably develop slight deviations that could, over time, cascade into catastrophic departures?

Second, that in achieving intended goals, AI may change human thought processes and human values. AlphaGo defeated the world Go champions by making strategically unprecedented moves, moves that humans had not conceived and have not yet successfully learned to overcome. Are these moves beyond the capacity of the human brain? Or could humans learn them now that they have been demonstrated by a new master?

Before AI began to play Go, the game had varied, layered purposes: a player sought not only to win, but also to learn new strategies potentially applicable to other of life’s dimensions. For its part, by contrast, AI knows only one purpose: to win. It “learns” not conceptually but mathematically, by marginal adjustments to its algorithms. So, in learning to win Go by playing it differently than humans do, AI has changed both the game’s nature and its impact. Does this single-minded insistence on prevailing characterize all AI?

Other AI projects work on modifying human thought by developing devices capable of generating a range of answers to human queries. Beyond factual questions (“What is the temperature outside?”), questions about the nature of reality or the meaning of life raise deeper issues. Do we want children to learn values through discourse with untethered algorithms? Should we protect privacy by restricting AI’s learning about its questioners? If so, how do we accomplish these goals?

If AI learns exponentially faster than humans, we must expect it to accelerate, also exponentially, the trial-and-error process by which human decisions are generally made: to make mistakes faster and of greater magnitude than humans do. It may be impossible to temper those mistakes, as researchers in AI often suggest, by including in a program caveats requiring “ethical” or “reasonable” outcomes. Entire academic disciplines have arisen out of humanity’s inability to agree upon how to define these terms. Should AI therefore become their arbiter? Many opine on this and indeed devote their lives to doing so.

Third, that AI may reach intended goals, but be unable to explain the rationale for its conclusions. In certain fields (pattern recognition, big-data analysis, and gaming, as discussed more below), AI’s capacities already may exceed those of humans. If its computational power continues to compound rapidly, AI may soon be able to optimize situations in ways that are at least marginally different, and probably significantly different, from how humans would optimize them. But at that point, will AI be able to explain, in a way that humans can understand, why its actions are optimal? Or will AI’s decision making surpass the explanatory powers of human language and reason? Through all human history, civilisations have created ways to explain the world around them: in the Middle Ages, religion; in the Enlightenment, reason; in the 19th century, history; in the 20th century, ideology. The most difficult yet important question about the world into which we are headed is this: What will become of human consciousness if its own explanatory power is surpassed by AI, and societies are no longer able to interpret the world they inhabit in terms that are meaningful to them?

How is consciousness to be defined in a world of machines that reduce human experience to mathematical data, interpreted by their own memories? Who is responsible for the actions of AI? How should liability be determined for their mistakes? Can a legal system designed by humans keep pace with activities produced by an AI capable of outthinking and potentially outmanoeuvring them? Just last week we had another ‘outbreak’ of stories concerned with AI and consciousness.

This came after OpenAI’s chief scientist claimed aspects of AI were already conscious. Ultimately, the term artificial intelligence may be a misnomer. To be sure, these machines can solve complex, seemingly abstract problems that had previously yielded only to human cognition. But what they do uniquely is not thinking as heretofore conceived and experienced. Rather, it is unprecedented memorisation and computation. Because of its inherent superiority in these fields, AI is likely to win any game assigned to it. But for our purposes as humans, the games are not only about winning; they are about thinking. By treating a mathematical process as if it were a thought process, and either trying to mimic that process ourselves or merely accepting the results, we are in danger of losing the capacity that has been the essence of human cognition.

The implications of this evolution are shown by AlphaZero, which plays chess at a level superior to chess masters and in a style not previously seen in chess history. On its own, in just a few hours of self-play, it achieved a level of skill that took human beings 1,500 years to attain. Only the basic rules of the game were provided to AlphaZero. Neither human beings nor human-generated data were part of its process of self-learning. If AlphaZero was able to achieve this mastery so rapidly, where will AI be in five years? What will be the impact on human cognition generally? What is the role of ethics in this process, which consists in essence of the acceleration of choices?

Typically, these questions are left to technologists and to the intelligentsia of related scientific fields. Philosophers and others in the field of the humanities who helped shape previous concepts of world order tend to be disadvantaged, lacking knowledge of AI’s mechanisms or being overawed by its capacities. In contrast, the scientific world is impelled to explore the technical possibilities of its achievements, and the technological world is preoccupied with commercial vistas of fabulous scale. The incentive of both these worlds is to push the limits of discoveries rather than to comprehend them. And governance, insofar as it deals with the subject, is more likely to investigate AI’s applications for security and intelligence than to explore the transformation of the human condition that it has begun to produce.

"The most striking thing for me is we don't need any human data anymore," says Demis Hassabis, of DeepMind. While the first version of AlphaGo needed to be trained on data from more than 100,000 human games, AlphaGo Zero learned to play from a blank slate. Not only has DeepMind removed the need for the initial human data input, Zero is also able to learn faster than its predecessor. David Silver, the main programmer on DeepMind's Go project, says the original AlphaGo that defeated 18-time world champion Lee Sedol 4-1 required several months of training. "We reached a superior level of performance after training for just 72 hours with AlphaGo Zero," he says. Only 4.9 million simulated games were needed to train Zero, compared to the original AlphaGo's 30 million. After the three days of learning, Zero was able to defeat the Lee Sedol-conquering version 100-0. After it had been playing the game for 40 days, Zero defeated DeepMind's previous strongest version of AlphaGo, called Master, which had itself defeated Chinese master Ke Jie.

When AlphaGo Zero started playing Go against itself, it was only presented with a set of rules, a board and the white and black counters. It didn't have knowledge of what strategies, moves, or tactics would be required to win. "The only inputs it takes are the black and white stones of the board," Silver says, adding that he believes the company could make a system that's able to learn the rules of the game as well. From the starting point of giving Zero the rules the system then plays games against itself. During this time it learns the moves it can make that will lead to a victory. For DeepMind to improve upon its already successful system and achieve this, it had to redesign the algorithms used within the AI. The overall process uses a reinforcement learning algorithm that's combined with a search system. In its simplest form, this means that Zero learns from trial and error and can use its search system to scope out each potential move. When Zero played a game against itself, it was given feedback from the system. A +1 is given if it wins and a -1 if it loses. After each game the neural network behind Zero automatically reconfigures to a new, theoretically better, version. On average the system took 0.4 seconds of thinking time before making a move.
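To make the trial-and-error idea concrete, here is a minimal sketch of self-play learning with a +1/-1 result at the end of each game. It is emphatically not DeepMind's algorithm: there is no neural network and no search, the game is a toy version of Nim (take 1 to 3 stones, whoever takes the last stone wins) rather than Go, and every name and parameter below (`train`, `best_move`, `alpha`, `epsilon`) is my own assumption for illustration.

```python
import random

def legal_moves(pile):
    """A player may take 1, 2 or 3 stones, but never more than remain."""
    return [a for a in (1, 2, 3) if a <= pile]

def choose(q, pile, rng, epsilon):
    """Mostly play the best-known move, occasionally try a random one."""
    moves = legal_moves(pile)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda a: q.get((pile, a), 0.0))

def train(games=20000, max_pile=10, alpha=0.05, epsilon=0.2, seed=0):
    """Self-play: one shared value table plays both sides of every game."""
    rng = random.Random(seed)
    q = {}  # (pile, move) -> running estimate of the final result
    for _ in range(games):
        pile = rng.randint(1, max_pile)
        history = []  # (player, pile before the move, move)
        player = 0
        while pile > 0:
            move = choose(q, pile, rng, epsilon)
            history.append((player, pile, move))
            pile -= move
            player = 1 - player
        winner = history[-1][0]  # whoever took the last stone
        for who, state, move in history:
            reward = 1.0 if who == winner else -1.0  # +1 for a win, -1 for a loss
            old = q.get((state, move), 0.0)
            q[(state, move)] = old + alpha * (reward - old)
    return q

def best_move(q, pile):
    """The move the trained table currently rates highest."""
    return max(legal_moves(pile), key=lambda a: q.get((pile, a), 0.0))

q_table = train()
print([best_move(q_table, n) for n in (5, 6, 7)])
```

Given only the rules and the win/loss signal, the table should come to prefer leaving the opponent a multiple of four stones, which is the known winning strategy for this toy game. Nobody told it that; it found it by playing itself.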

"In the original version, we tried this a couple of years ago and it would collapse," Hassabis said. He cited DeepMind's "novel" reinforcement algorithms for Zero's new ability to learn without prior knowledge. Additionally, the new system uses only one neural network instead of two, and four of Google's AI processors compared with the 48 needed to beat Lee. In the development of Zero, DeepMind was able to do more with less. In its internal testing, detailed in its Nature paper, the firm says Zero was able to beat all of its previous versions and earlier Go programs: AlphaGo Master, AlphaGo Lee, AlphaGo Fan, Crazy Stone, Pachi and GnuGo.

"It is possible to train to superhuman level, without human examples or guidance, given no knowledge of the domain beyond basic rules," the research paper concludes. The system learned common human moves and tactics and supplemented them with its own, more efficient moves. "It found these human moves, it tried them and then ultimately it found something it prefers," Silver says. "Hopefully, humans are going to start looking at this and incorporating that into their own play."

Other applications (briefly)

As with Deep Blue's victory against chess grandmaster Garry Kasparov in 1997, DeepMind's continued success at Go has wider implications, and this is one reason gaming matters. For Hassabis and colleagues, the ongoing challenge is applying what has been learned through the AlphaGo project to other AI problems with real-world applications. "We tried to devise the algorithm so that it could play, in principle, other games that are in a similar class (that would include Chess) and more generally planning domains," Silver says.

This includes looking at protein folding, drug discovery, material design, and quantum chemistry. Part of solving these problems lies in the ability to create simulations of potential outcomes. The game of Go is constrained to a fixed and strict environment: there's no randomness, luck, or chance affecting the outcome. Applying this approach to real-world scenarios where there's a level of unpredictability is much harder.

Most proteins self-assemble into specific 3D structures that, together with other biological molecules, determine the function and behaviour of cells. Over the past five decades, biologists have experimentally determined the structures of more than 180,000 proteins and deposited them in the Protein Data Bank, a freely available online resource. Despite this painstaking effort, the structures of hundreds of millions of proteins remain unknown, including more than two-thirds of those in the human proteome (the full set of proteins produced by our genome). In the last few months we’ve seen the description of a machine learning method, AlphaFold2, that predicts protein structures with near-experimental accuracy, and a report of its application to the human proteome. DeepMind has also announced that it has applied AlphaFold2 to the proteomes of 20 model organisms. AlphaFold2 is free for academics to use and, in collaboration with the European Bioinformatics Institute in Hinxton, DeepMind will make the predicted structures of almost all known proteins freely available to all.

AlphaFold2 — as the name implies — is the second iteration of a system that DeepMind introduced three years ago at the Thirteenth Critical Assessment of Structure Prediction (CASP13) competition. The first version of AlphaFold was technically impressive, and outperformed the other CASP13 entrants at the task of predicting protein structures from amino-acid sequences. However, it had a median accuracy of 6.6 ångströms for the most difficult set of proteins tested — that is, for the middle-ranked protein in the set, the atoms in the proposed structures were, on average, 6.6 Å away from their actual positions. This is much less accurate than experimental methods. Moreover, the original AlphaFold arguably represented only an incremental improvement over competing algorithms, in both design and performance. AlphaFold2 fundamentally changes this. Its median accuracy at CASP14, which was held in 2020, was 1.5 Å — comparable to the width of an atom and approaching the accuracy of experimental methods. Moreover, its design has few parallels with existing algorithms.

The prediction of protein structures is difficult for many reasons: the number of plausible shapes for any given protein is huge, but an algorithm must pick just one; the number of known structures is (relatively) small, limiting the data available for training structure-predicting systems; the rules underlying protein biophysics are only approximately known, and are expensive to simulate; and the forces that determine a protein’s structure result not only from local interactions between nearby chemical groups in the protein molecule, but also from long-range interactions spanning the whole protein.

Central to this design is a machine-learning framework — known as an artificial neural network — that considers both local and long-range interactions in protein molecules. This differs from previous algorithms, which commonly considered only local interactions to reduce the computational burden of structure prediction. AlphaFold2 does not try to capture long-range interactions through computational brute force, which would be hopeless even with the resources available at Google. Instead, the authors introduced computational operations that efficiently capture long-range interactions on the basis of fundamental aspects of protein geometry. For example, the operations account for the fact that the coordinates of any three atoms in a protein must satisfy the triangle inequality rule (in other words, the sum of the lengths of any two sides of the triangle defined by the coordinates must be greater than or equal to the length of the remaining side).
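The triangle inequality mentioned above is easy to show in a few lines. This is only an illustration of the geometric constraint, not a piece of AlphaFold2; the function names and the coordinates (in ångströms) are invented for the example.

```python
import math

def distances_are_consistent(d_ab, d_bc, d_ac, tol=1e-9):
    """True if three pairwise distances could belong to one real triangle."""
    return (d_ab + d_bc >= d_ac - tol
            and d_ab + d_ac >= d_bc - tol
            and d_bc + d_ac >= d_ab - tol)

def pairwise_distances(a, b, c):
    """Distances A-B, B-C and A-C for three points in 3D space."""
    return math.dist(a, b), math.dist(b, c), math.dist(a, c)

# Distances measured from real coordinates always pass the check.
atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.0, 1.0)]  # made-up positions
print(distances_are_consistent(*pairwise_distances(*atoms)))  # True

# Distances guessed independently of one another can be impossible:
# two atoms each 3.8 from a third cannot be 10.0 apart.
print(distances_are_consistent(3.8, 3.8, 10.0))  # False
```

The point of the second case is that a system predicting distances one pair at a time can propose a shape that cannot exist in 3D space, which is why building the constraint into the computation is useful.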

AlphaFold2 applies these operations repeatedly (about 200 times) to gradually refine a model of a protein into its final 3D structure. Such iterative refinement, used millions of times, rather than hundreds, is a central component of physics-based approaches to protein-structure prediction. But it is rarely used in machine-learning approaches — which instead predict structures by recognising patterns of mutation in evolutionarily related proteins to detect co-evolving, and therefore spatially proximal, amino-acid residues. AlphaFold2 breaks the mould by combining these two strategies. Crucially, it does not impose known rules of protein biophysics or try to mimic the physical process of protein folding, as has previously been attempted. Instead, it performs purely geometric refinements learnt from its repeated attempts to predict protein structures. In this sense, it exemplifies the learning-driven revolution that has swept the field of protein modelling.

Another group just reported the use of AlphaFold2 to predict the structures of almost all human proteins that independently acquire well-defined 3D shapes, for a total of 23,391 proteins. Predictions at this scale were previously possible, but three features of the new system provide a big leap forward. First, the accuracy of the predictions is sufficiently high to generate biological insights and hypotheses that can be tested experimentally. Second, a calibrated self-assessment of each prediction provides a reliable estimate of correctness at the level of individual amino-acid residues, enabling biologists to make inferences about confidently predicted regions. Third, AlphaFold2 is applicable to whole proteins, including large ones that have multiple, independently self-assembling units — a common feature of mammalian proteins. The resulting resource ‘confidently’ predicts nearly 60% of all human-protein regions; most of the remaining regions might be unable to acquire well-defined structures or be able to do so only in the presence of other biomolecules.

The confidence of protein-structure predictions by AlphaFold2. Jumper et al. reported a machine-learning system, called AlphaFold2, that predicts the 3D structures of proteins from amino-acid sequences. Tunyasuvunakool et al. used the same system to predict the structures of all human proteins that self-assemble into specific 3D structures. AlphaFold2 produces a confidence metric called the predicted local distance difference test (pLDDT) to estimate how well the predicted position of each amino-acid residue agrees with experimentally determined positions, on a scale of 1 to 100. The charts show the fractions of residues corresponding to different ranges of pLDDT for: a, residues that were previously resolved in structure-determination experiments (3,440,359 residues); b, residues that could not be resolved in experiments (589,079 residues); c, all of the residues in human proteins (10,537,122 residues).
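As a rough sketch of how such charts are tallied, the snippet below sorts per-residue pLDDT scores into confidence bands and reports the fraction in each. The band boundaries (90, 70, 50) follow the convention used when AlphaFold predictions are displayed; the scores themselves are invented for the example and the function name is my own.

```python
def confidence_fractions(plddt_scores):
    """Fraction of residues falling into each pLDDT confidence band."""
    counts = {
        "very high (>90)": 0,
        "confident (70-90)": 0,
        "low (50-70)": 0,
        "very low (<50)": 0,
    }
    for score in plddt_scores:
        if score > 90:
            counts["very high (>90)"] += 1
        elif score >= 70:
            counts["confident (70-90)"] += 1
        elif score >= 50:
            counts["low (50-70)"] += 1
        else:
            counts["very low (<50)"] += 1
    total = len(plddt_scores)
    return {band: n / total for band, n in counts.items()}

# Ten invented per-residue scores, purely for illustration.
example_scores = [95.2, 91.0, 88.4, 72.1, 65.0, 55.5, 42.3, 30.0, 97.8, 81.6]
fractions = confidence_fractions(example_scores)
for band, fraction in fractions.items():
    print(f"{band}: {fraction:.0%}")
```

Run over every residue in a proteome instead of ten made-up numbers, the same tally gives the kind of breakdown the charts describe.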

AlphaFold2 has already helped structural biologists to solve crystallographic protein structures and refine ones derived from cryo-electron microscopy experiments. It provides biophysicists studying protein motion with starting (static) structures, and those studying protein interactions with hypotheses about how protein surfaces bind to each other. AlphaFold2 also presents opportunities to formulate new algorithms for bioinformatics based on protein structures, and might help systems biologists to understand the behaviour of cellular pathways and molecular machines on the basis of the structures that comprise them. And the study of evolution, which has long relied on genetic sequences, can now more readily be formulated in terms of the onset of new classes of protein structure (folds) and their relationship to cellular function and organismal fitness.

It is tempting to compare the scale of this advance to that of the Human Genome Project, but there are important differences. In contrast to the human genome sequence, the predicted structures have not been experimentally verified; it will take time for evidence of their correctness to emerge, so that scientists can gain confidence in the predictions. Of course, experimental measurements can also be affected by ‘noise’, bias and incompleteness (20 years passed between the publication of the first draft of the human genome and the complete sequence), and modern structure-determination techniques routinely involve some computational inference. As predictions improve, disagreements between protein models and experiments could become difficult to resolve, a situation familiar to physicists but largely unprecedented in biology.

Disordered protein regions, which do not have well-defined shapes but often encode functionally crucial parts of proteins, present an ongoing and fundamental challenge to AlphaFold2 and, therefore, to our understanding of protein structure. Future methods must take this disorder into account and begin to reflect the flexibility inherent in most proteins.

Other differences between the Human Genome Project and the present advance are in AlphaFold2’s favour. Structure predictions are (relatively) cheap and will soon be available for all proteins, whereas genetic-sequencing technology took years to deploy and mature. Computational methods evolve rapidly, and it might therefore soon be possible to predict the structures of multi-protein complexes, alternative conformations of a protein (for proteins that adopt them) and the structures of designed proteins with a level of accuracy similar to that currently achieved by AlphaFold2. Finally, protein structures provide immediate biological insights, because they fit within established conceptual frameworks that relate a protein’s structure to its function — unlike genetic sequences, which were largely inscrutable at the dawn of the genomics era. The fruits of this revolution might thus be more swiftly reaped.

Poker anyone?

A classic gambling game for anyone who’s anyone. Specifically, let’s focus on Texas Hold ’Em, a variation where two face-down cards are dealt to each player and five community cards are dealt in three stages (three cards on the flop, one card on the turn, and one card on the river). On every turn, each player has betting options to check, call, raise, or fold. Turns happen before the flop is dealt and after each following deal. At the end of all betting, the player with the best five-card hand using a combination of the community cards and their own two cards wins all of the money bet for that round.

There are two ways to win a hand in Texas Hold ‘Em:

If all other players fold their hands, then the last player who hasn’t folded wins all of the money.

If there are at least two players still remaining after all betting ends, then the player with the better five-card hand wins all of the money (referred to as a showdown).

Because a player can win a hand by getting all other players to fold, Texas Hold ’Em provides an opportunity to win even if your hand is weak. Consider the following strategies:

If you have a bad hand (as in a completely trash hand that can’t beat anything), you can either fold and forfeit any opportunity to win the money, or you can bluff by betting a large amount of money to make the other players think you have a very strong hand. If you are able to convince the other players that your hand is very strong, you may be able to make them fold their hands. Of course, if one of them actually has a very strong hand then you’ve lost your money.

If you have a decent hand, you can also bet a large amount of money and bluff to protect your hand, but you also have to determine whether or not the other players have a worse hand that you could beat. If so, you want to maximize the amount of money you make by betting enough so that the other players will call but not so much that they will fold.

If you have a strong hand (as in a super good hand that for sure won’t be beaten), you should be betting enough so that the other players will call but not so much that they will fold.

In general, the strategy is based on what you think other players will do. In this sense, Poker is a very psychological game (something that artificial intelligence doesn’t quite understand).

Take a game like Chess. Chess is very definitive. Other than the first few moves, there is always a best move in Chess, which is why Chess AIs are capable of obliterating even the best grandmasters. In the case of Chess, machines will calculate the results of each move and pick the one that’s most likely to win. However, the introduction of bluffing complicates things. In Texas Hold ’Em, there is a psychological aspect to the game (alongside the mathematical one), which is hard for machines to learn. Unlike in Chess, where the best move can be determined given the current position and the opponent’s likely moves, Texas Hold ’Em occasionally requires following what you feel (even if logic is saying no). For example, when another player is betting big and you say “I don’t believe you”, calling can sometimes be just as good as folding, even if your mathematical probability of winning seems to be less than 50%. There’s also the not-so-small issue of player numbers: 2 in Chess, and up to 6 in Poker.
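The remark about calling with less than a 50% chance has a simple arithmetic basis, usually called pot odds. Here is a small sketch of that standard calculation; it ignores any betting still to come and treats the win probability as known, both of which are simplifying assumptions, and the function names are mine.

```python
def break_even_probability(pot, to_call):
    """Win probability at which calling neither gains nor loses on average.

    `pot` is the money already in the middle, including the bet you face;
    `to_call` is what you must add to stay in the hand.
    """
    return to_call / (pot + to_call)

def call_expected_value(win_probability, pot, to_call):
    """Average profit of calling: win the pot, or lose the amount called."""
    return win_probability * pot - (1 - win_probability) * to_call

# Example: 100 in the pot (including the opponent's bet), 50 to call.
print(round(break_even_probability(100, 50), 3))  # 0.333
print(call_expected_value(0.40, 100, 50))         # positive: calling pays
print(call_expected_value(0.30, 100, 50))         # negative: folding is better
```

So with 100 in the pot and 50 to call, you only need to win about one time in three for the call to break even, which is why a hand that is an underdog can still be worth playing.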

So how do you change the AI to be able to adapt to each different player’s strategy? Because understanding how your opponent plays is critical to being a good Poker player. Some players are “tight” players who only play a hand if their original two-card hand is above-average and will only call if they have a very strong hand. Other players are “loose” players who will play any hand and will often call with only decent hands. The key is to break up the game into smaller parts and adjust the strategy as the game progresses. The AI can thus use machine learning to find weaknesses in its opponent’s strategy and exploit them.
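As a rough illustration of what adapting to an opponent could look like, the sketch below keeps a running count of how often a player chooses to play a hand and labels them “tight” or “loose”. It is a toy, not how any real Poker AI models opponents: the class name, the prior counts and the 35% threshold are all assumptions picked for the example.

```python
from dataclasses import dataclass


@dataclass
class OpponentModel:
    """Running tally of how often one opponent chooses to play a hand."""
    hands_seen: int = 0
    hands_played: int = 0

    def observe(self, played: bool) -> None:
        """Record one dealt hand and whether the opponent played it."""
        self.hands_seen += 1
        self.hands_played += int(played)

    def play_rate(self, prior_played: float = 2.0,
                  prior_folded: float = 6.0) -> float:
        """Smoothed share of hands played.

        The prior counts (hypothetical values) stop a handful of early
        observations from swinging the estimate too far.
        """
        return (self.hands_played + prior_played) / (
            self.hands_seen + prior_played + prior_folded)

    def style(self, loose_threshold: float = 0.35) -> str:
        """Label the opponent using an arbitrary illustrative threshold."""
        return "loose" if self.play_rate() >= loose_threshold else "tight"


if __name__ == "__main__":
    villain = OpponentModel()
    for played in [True, True, False, True, True, False, True, True]:
        villain.observe(played)
    print(villain.play_rate(), villain.style())
```

A real system would track far more than this (bet sizes, position, showdown hands), but the principle is the same: keep updating the estimate as the game progresses and adjust the strategy to match.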

There’s also the problem that you don’t know your opponent’s hand. In Chess, both players know the exact state of the board at all times. In Poker, no player ever knows the exact state because each opponent holds two concealed cards. This makes it hard for anyone to predict the final outcome of the game. It’s also hard to factor in luck, because Poker truly is a game of chance. You can start with the best hand (a pair of Aces), but you’d be shaking if the five community cards were a 5, 6, 7, 9, and 10 of the same suit.

While there’s no specific way to account for this (artificial intelligence that plays Poker will always be an approximate solution), researchers try to address this issue by making the game an abstraction in which similar hands are grouped together. This makes it slightly easier for the AI to consider such a large number of possible hands that the other players could have.
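The simplest grouping of this kind can be sketched for starting hands: before any community cards are dealt, the exact suits don’t matter, only whether the two cards share a suit. The example below is a minimal illustration of that one idea, with function names invented for the purpose; the abstractions used by actual Poker AIs are much coarser and also cover later betting rounds.

```python
from collections import defaultdict
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"


def bucket(card_a: str, card_b: str) -> str:
    """Map two hole cards (e.g. 'As', 'Kd') to a class such as 'AKo'."""
    (r1, s1), (r2, s2) = card_a, card_b
    # Put the higher rank first so 'Kd As' and 'As Kd' land together.
    if RANKS.index(r1) < RANKS.index(r2):
        r1, r2 = r2, r1
    if r1 == r2:
        return r1 + r2  # pocket pair, e.g. 'AA'
    return r1 + r2 + ("s" if s1 == s2 else "o")  # suited or offsuit


def build_abstraction() -> dict:
    """Group every possible two-card starting hand into its class."""
    deck = [r + s for r in RANKS for s in SUITS]
    buckets = defaultdict(list)
    for a, b in combinations(deck, 2):
        buckets[bucket(a, b)].append((a, b))
    return dict(buckets)


if __name__ == "__main__":
    groups = build_abstraction()
    total = sum(len(hands) for hands in groups.values())
    print(total, "starting hands fall into", len(groups), "classes")
    # 1326 starting hands fall into 169 classes
```

So the AI only has to reason about 169 kinds of starting hand rather than 1,326 individual ones, and this particular grouping loses no information at that stage of the game.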

Current artificial intelligence is already able to beat professional poker players. It’s also becoming more common to consult artificial intelligence on strategy: more people than ever before are able to use it to improve their play. While old-school Poker players would learn by losing their money, contemporary players learn by playing against their machines. This has huge impacts on the Poker world; the 2019 World Series of Poker had 8 thousand entrants, more than ever before. It won’t be long until an artificial intelligence wins the World Series of Poker…just like AlphaGo did:


Driving a racing car requires a tremendous amount of skill. Now, artificial intelligence has challenged the idea that this skill is exclusive to humans, and it might even change the way automated vehicles are designed. A modern Formula 1 race is a breath-taking display of engineering precision. Yet the popularity of the sport arguably has less to do with the performance of the cars than with the skill and daring displayed by the drivers as they push those cars to the limit. Success on the race track has been a celebrated human achievement for more than a century. Will it now become a similar triumph for AI? Last week a team published work in Nature that takes a step in this direction by introducing Gran Turismo (GT) Sophy, a neural-network driver capable of outperforming the best human players of the video game Gran Turismo:

The objective in racing is easily defined: if you complete the circuit in less time than your competitors, you win. However, achieving this goal involves a complicated battle with physics, because negotiating the track requires careful use of the frictional force between the tyre and the road, and this force is limited. Using some of that friction for braking, for instance, leaves less force available for rounding a corner.

More specifically, each tyre can produce a frictional force proportional to the vertical force, or load, that connects it to the road. As the car accelerates, the load shifts to the rear tyres, leaving less frictional force for the front tyres. This can induce understeer, in which the steering wheel cannot generate more cornering force and effectively becomes a hand rest as the car ploughs out of the turn. By contrast, when the car brakes, the load shifts to the front of the car. This can lead to oversteer, meaning that the rear tyres lose traction and the car spins. Add in a complicated track topography, and the complexities of tuning load transfer with the suspension of the vehicle, and the challenges of racing become obvious.
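To put rough numbers on this, the sketch below uses two textbook simplifications that are not taken from the GT Sophy work: a rigid car whose load shifts between axles in proportion to acceleration, and a “friction circle” in which braking and cornering draw on the same limited budget of grip. The car dimensions and friction coefficient are invented for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def axle_loads(mass, wheelbase, cg_to_front, cg_height, accel):
    """Vertical loads (N) on the front and rear axles.

    accel > 0 means accelerating forwards, accel < 0 means braking.
    Rigid body, flat road, no aerodynamic downforce (all assumptions).
    """
    cg_to_rear = wheelbase - cg_to_front
    transfer = mass * accel * cg_height / wheelbase
    front = mass * G * cg_to_rear / wheelbase - transfer
    rear = mass * G * cg_to_front / wheelbase + transfer
    return front, rear


def max_lateral_force(mu, load, longitudinal_force):
    """Cornering force left over once some grip is used to brake or drive."""
    limit = mu * load  # total friction available to this tyre or axle
    used = min(abs(longitudinal_force), limit)
    return math.sqrt(limit ** 2 - used ** 2)


if __name__ == "__main__":
    # Hypothetical 1,500 kg car, 2.7 m wheelbase, centre of gravity mid-way.
    for accel in (3.0, 0.0, -8.0):
        front, rear = axle_loads(1500.0, 2.7, 1.35, 0.5, accel)
        print(f"accel {accel:+.1f} m/s^2: front {front:.0f} N, rear {rear:.0f} N")
    # Grip left for cornering when part of the budget goes on braking.
    print(max_lateral_force(1.0, 4000.0, 2400.0))
```

With these made-up figures, a tyre that could corner with 4,000 N when rolling freely has only 3,200 N left once 2,400 N is spent on braking, which is the trade-off a driver is juggling at every corner.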

To win the race, the driver must choose trajectories that allow the car to stay within these ever-changing friction limits as much as it physically can. Brake too early going into a turn and your car is slow, losing time. Brake too late and you won’t have enough cornering force to hold your desired racing line as you near the tightest part of the turn. Brake too hard and you might induce a spin. Professional racing drivers are eerily good at finding and maintaining the limits of their car, lap after lap, for an entire race.

As complex as the handling limits of a car can be, they are well described by physics, and it therefore stands to reason that they could be calculated or learnt. Indeed, the automated Audi TTS, Shelley, was capable of generating lap times comparable to those of a champion amateur driver by using a simple model of physics. By contrast, GT Sophy doesn’t make explicit calculations based on physics. Instead, it learns through a neural-network model. However, given the track and vehicle motion information available to Shelley and GT Sophy, it isn’t too surprising that GT Sophy can put in a fast lap with enough training data.

What really stands out is GT Sophy’s performance against human drivers in a head-to-head competition. Far from using a lap-time advantage to outlast opponents, GT Sophy simply outraces them. Through the training process, GT Sophy learnt to take different lines through the corners in response to different conditions. In one case, two human drivers attempted to block the preferred path of two GT Sophy cars, yet the AI succeeded in finding two different trajectories that overcame this block and allowed the AI’s cars to pass:

Neural-network drivers outperform human players. Wurman et al. report a neural-network algorithm — called GT Sophy — that is capable of winning against the best human players of the video game Gran Turismo. When two human drivers attempted to block the preferred path of two GT Sophy cars, the algorithm found two ways to overtake them.

GT Sophy also proved to be capable of executing a classic manoeuvre on a simulation of a famous straight of the Circuit de la Sarthe, the track of the car race 24 Hours of Le Mans. The move involves quickly driving out of the wake of the vehicle ahead to increase the drag on the lead car in a bid to overtake it. GT Sophy learnt this trick through training, on the basis of many examples of this exact scenario — although the same could be said for every human racing-car driver capable of this feat. Outracing human drivers so skilfully in a head-to-head competition represents a landmark achievement for AI.

The implications of this work go well beyond video-game supremacy. As companies work to perfect fully automated vehicles that can deliver goods or passengers, there is an ongoing debate as to how much of the software should use neural networks and how much should be based on physics alone. In general, the neural network is the undisputed champion when it comes to perceiving and identifying objects in the surrounding environment. However, trajectory planning has remained the province of physics and optimisation. Even vehicle manufacturer Tesla, which uses neural networks as the core of autonomous driving, has revealed that its neural networks feed into an optimisation-based trajectory planner. But GT Sophy’s success on the track suggests that neural networks might one day have a larger role in the software of automated vehicles than they do today.

So, will the Formula 1 battles between Lewis Hamilton and Max Verstappen give way to contests between GT Sophy variants? After all, the physics of Gran Turismo is a close match for real racing cars. Gran Turismo’s director, Kazunori Yamauchi, even used the video game to find ways of tweaking his real racing car to overcome a recurring problem that he was having when taking a corner at the Nürburgring, a Grand Prix track in Germany that has the nickname The Green Hell.

Still, some challenges remain in moving from the console to the track. For example, GT Sophy has not yet learnt that it is sometimes better to follow the car ahead to make up time, instead of dogfighting at every corner. Of course, Wurman et al. report GT Sophy’s rookie season, and there is no obvious reason why such a strategy could not be learnt with greater experience, too.

More challenging might be the variation that occurs with each lap. Unlike in the Gran Turismo races used by Wurman and co-workers, the condition of the tyres on real racing cars changes from lap to lap, and human drivers must adapt to such changes throughout the race. Would GT Sophy be able to do the same with more data? And where would such data come from? It’s easy to run simulations, but no racing car in existence has completed enough laps to train GT Sophy in its current form, much less an AI that could handle tyre variability. However, there is evidence that neural networks can capture changing vehicle dynamics on different road surfaces, so perhaps Verstappen and Hamilton should keep one eye on their rear-view mirrors.

Human interactions

In a new study, researchers at MIT Lincoln Laboratory sought to find out how well humans could play the cooperative card game Hanabi with an advanced AI model trained to excel at playing with teammates it has never met before. In single-blind experiments, participants played two series of the game: one with the AI agent as their teammate, and the other with a rule-based agent, a bot manually programmed to play in a predefined way. The results surprised the researchers. Not only were the scores no better with the AI teammate than with the rule-based agent, but humans consistently hated playing with their AI teammate. They found it to be unpredictable, unreliable, and untrustworthy, and felt negatively even when the team scored well:

"It really highlights the nuanced distinction between creating AI that performs objectively well and creating AI that is subjectively trusted or preferred," says Ross Allen, co-author of the paper and a researcher in the Artificial Intelligence Technology Group. "It may seem those things are so close that there's not really daylight between them, but this study showed that those are actually two separate problems. We need to work on disentangling those." Humans hating their AI teammates could be of concern for researchers designing this technology to one day work with humans on real challenges — like defending from missiles or performing complex surgery. This dynamic, called teaming intelligence, is a next frontier in AI research, and it uses a particular kind of AI called reinforcement learning.

A reinforcement learning AI is not told which actions to take, but instead discovers which actions yield the most numerical "reward" by trying out scenarios again and again. It is this technology that has yielded the superhuman chess and Go players. Unlike rule-based algorithms, these AI aren’t programmed to follow "if/then" statements, because the possible outcomes of the human tasks they're slated to tackle, like driving a car, are far too many to code.
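The trial-and-error idea can be shown in a few lines with about the simplest reinforcement learning set-up there is, a “multi-armed bandit”. The sketch below is only an illustration of the principle, not the method behind the chess, Go or Hanabi agents; the reward probabilities, the exploration rate and the function name are assumptions chosen for the example.

```python
import random


def run_bandit(win_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent learning which action pays best.

    win_probs -- hidden chance of a reward for each action; the agent is
                 never told these and only sees the rewards it receives
    Returns (value estimates, number of times each action was chosen).
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(win_probs)
    counts = [0] * len(win_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(len(win_probs))
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(len(win_probs)), key=estimates.__getitem__)
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 0.8])
    print("estimated value of each action:", estimates)
    print("times each action was chosen:", counts)
```

Nothing in the code says which action is best; the agent works it out from the rewards alone, and ends up choosing the highest-paying action most of the time.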

"Reinforcement learning is a much more general-purpose way of developing AI. If you can train it to learn how to play the game of chess, that agent won't necessarily go drive a car. But you can use the same algorithms to train a different agent to drive a car, given the right data," Allen says. "The sky's the limit in what it could, in theory, do." The game of Hanabi is akin to a multiplayer form of Solitaire. Players work together to stack cards of the same suit in order. However, players may not view their own cards, only the cards that their teammates hold. Each player is strictly limited in what they can communicate to their teammates to get them to pick the best card from their own hand to stack next.

The Lincoln Laboratory researchers did not develop either the AI or rule-based agents used in this experiment. Both agents represent the best in their fields for Hanabi performance. In fact, when the AI model was previously paired with an AI teammate it had never played with before, the team achieved the highest-ever score for Hanabi play between two unknown AI agents.  "That was an important result," Allen says. "We thought, if these AI that have never met before can come together and play really well, then we should be able to bring humans that also know how to play very well together with the AI, and they'll also do very well. That's why we thought the AI team would objectively play better, and also why we thought that humans would prefer it, because generally we'll like something better if we do well."

Neither of those expectations came true. Objectively, there was no statistical difference in the scores between the AI and the rule-based agent. Subjectively, all 29 participants reported in surveys a clear preference toward the rule-based teammate. The participants were not informed which agent they were playing with for which games. "One participant said that they were so stressed out at the bad play from the AI agent that they actually got a headache," says Jaime Pena, a researcher in the AI Technology and Systems Group and an author on the paper. "Another said that they thought the rule-based agent was dumb but workable, whereas the AI agent showed that it understood the rules, but that its moves were not cohesive with what a team looks like. To them, it was giving bad hints, making bad plays."

This perception of AI making "bad plays" links to surprising behaviour researchers have observed previously in reinforcement learning work. For example, in 2016, when DeepMind's AlphaGo first defeated one of the world’s best Go players, one of the most widely praised moves made by AlphaGo was move 37 in game 2, a move so unusual that human commentators thought it was a mistake. Later analysis revealed that the move was actually extremely well-calculated, and was described as “genius”:

Such moves might be praised when an AI opponent performs them, but they're less likely to be celebrated in a team setting. The Lincoln Laboratory researchers above found that strange or seemingly illogical moves were the worst offenders in breaking humans' trust in their AI teammate in these closely coupled teams. Such moves not only diminished players' perception of how well they and their AI teammate worked together, but also how much they wanted to work with the AI at all, especially when any potential payoff wasn’t immediately obvious. "There was a lot of commentary about giving up, comments like 'I hate working with this thing,'" adds Hosea Siu, also an author of the paper and a researcher in the Control and Autonomous Systems Engineering Group.

Participants who rated themselves as Hanabi experts, which the majority of players in this study did, more often gave up on the AI player. Siu finds this concerning for AI developers, because key users of this technology will likely be domain experts.

"Let's say you train up a super-smart AI guidance assistant for a missile defence scenario. You aren't handing it off to a trainee; you're handing it off to your experts on your ships who have been doing this for 25 years. So, if there is a strong expert bias against it in gaming scenarios, it's likely going to show up in real-world ops," he adds.  The researchers note that the AI used in this study wasn't developed for human preference. But, that's part of the problem — not many are. Like most collaborative AI models, this model was designed to score as high as possible, and its success has been benchmarked by its objective performance.

If researchers don’t focus on the question of subjective human preference, "then we won't create AI that humans actually want to use," Allen says. "It's easier to work on AI that improves a very clean number. It's much harder to work on AI that works in this mushier world of human preferences."

Solving this harder problem is the goal of the MeRLin (Mission-Ready Reinforcement Learning) project. The researchers think that the ability for the AI to explain its actions will engender trust. This will be the focus of their work for the next year. "You can imagine we rerun the experiment, but after the fact (and this is much easier said than done) the human could ask, 'Why did you do that move, I didn't understand it?' If the AI could provide some insight into what they thought was going to happen based on their actions, then our hypothesis is that humans would say, 'Oh, weird way of thinking about it, but I get it now,' and they'd trust it. Our results would totally change, even though we didn't change the underlying decision-making of the AI," Allen says.

Like a huddle after a game, this kind of exchange is often what helps humans build camaraderie and cooperation as a team. "Maybe it's also a staffing bias. Most AI teams don’t have people who want to work on these squishy humans and their soft problems," Siu adds, laughing. "It's people who want to do math and optimisation. And that's the basis, but that's not enough." Mastering a game such as Hanabi between AI and humans could open up a universe of possibilities for teaming intelligence in the future. But until researchers can close the gap between how well an AI performs and how much a human likes it, the technology may well remain at machine versus human.

A COVID story from the World of Warcraft

Most games ‘our children play’ don’t involve AI like the examples above, but the story below illustrates a point. Whilst Fortnite, Minecraft, Grand Theft Auto V, Rocket League and so forth are now supremely popular, World of Warcraft may win awards for a steady, loyal following that’s stuck with it. On Sept 13, 2005, an estimated 4 million players of the World of Warcraft encountered an unexpected challenge in the game, introduced in a software update released that day: a full-blown epidemic. Players exploring a newly accessible spatial area within the game encountered an extremely virulent, highly contagious disease. Soon, the disease had spread to the densely populated capital cities of the fantasy world, causing high rates of mortality and, much more importantly, the social chaos that comes from a large-scale outbreak of deadly disease. These unforeseen effects raised the possibility for valuable scientific content to be gained from this unintentional game error:

The above shows an urban centre in World of Warcraft during the epidemic: a gathering of individuals in a town. Infected individuals walk among the uninfected, the recently dead, and the skeletons of those who died earlier.

New game content for World of Warcraft is issued via a series of patches, released every 1 or 2 months. Patch 1.7, released on Sept 13, 2005, contained access to an area known as “Zul'Gurub”, which was intended for use by players whose characters had achieved a sufficient level within the game to be considered “relatively powerful”. The centrepiece of this area was a combative encounter with a powerful creature called “Hakkar”. Hakkar, the primary source of infection in World of Warcraft, is shown here:

Occasionally, one of the players facing this ginormous winged serpent would be purposefully infected by a disease called “Corrupted Blood”. This infection, as intended, then rapidly began infecting other nearby players. To the powerful players who were battling Hakkar, the infection was just a hindrance, designed to make this particular combat more challenging. However, several aspects of the disease caused this minor inconvenience to blossom into an uncontrolled game-wide epidemic, a bit like, errrm, SARS-CoV-2. The ability of many characters to transport themselves instantly from one location to another was the first factor in the game that unexpectedly set the stage for the plague. This type of travel is frequently used to return to the capital cities of the game's geography from more remote regions for reasons of game play. Many victims of Corrupted Blood thus reached heavily populated areas before either being killed by or cured of the disease, mimicking the travel of contagious carriers over long distances that has been the hallmark of many disease outbreaks in history eg. the Mongol horde and the bubonic plague, or the cholera outbreaks of Europe during the mid-19th century...again I could go on. The highly contagious disease then spread to other players outside the intended, localised combat area near Hakkar.

The second factor that sustained the epidemic was that the disease could escape its origin in Zul'Gurub via interspecies transmission from player characters (ie. human beings) to animals and then back. Many players in the game have “pets”, non-player animal characters that assist them in the completion of certain functions within the game. The penalty assigned by the game for allowing a pet to die is prohibitively large, therefore players commonly dismiss their pets rather than subjecting them to dangerous effects such as disease. Dismissal temporarily removes the pet from the game, keeping them in stasis until they can be healed or otherwise safeguarded after the dangers of combat have gone. These pets, therefore, acted as carriers of the disease and also served as a source of disease by causing new outbreaks when brought out of stasis, even if their owner had recovered and was no longer infectious. Based on player accounts, pets, as opposed to the infective characters themselves, seem to have been the dominant vectors for the disease. Players would return to densely packed capital cities and retrieve pets that, being infectious, immediately triggered an outbreak. The density of susceptible characters within a specific radius was, therefore, the only apparent limit to transmission. Since players gathered in common areas, the outbreaks were characterised by staggeringly high reproductive rates (R0), which the researchers who studied the outbreak estimate at 10² per hour for the capital cities and transportation hubs, based on the few parameters known for the disease. Unfortunately, the actual value is not publicly known.

Although the reproductive value for this particular disease was too high to accurately reflect the dynamics of any real-world pathogen, future experiments could easily tailor the parameters controlling disease transmission and mortality to more accurately reflect a wide variety of pathogens. Smaller probabilities of transmission (rather than the simple density dependence), could cause the results of individual behaviour and social contact patterns to become even more important to disease spread. Alternatively, it may be that the ultimate use of these virtual experiments would be to examine whether or not behaviours, eg. altruistic medical attention, are subject to thresholds in the risk perceived by the individuals involved. These virtual worlds could, therefore, test human reactions to a wide range of disease scenarios, most of which could never occur in reality, to understand how people will behave if they do not know the probabilities of transmission for an infectious disease.

Although highly contagious, the disease in World of Warcraft may very well have run its course naturally in a very short period of time; a bit like Ebola. To the game's powerful players, the disease was no more threatening than the common cold in a healthy adult. Less powerful characters (who were never intended to enter Zul'Gurub or encounter the disease), died very quickly from its effects. With most of the susceptible portion of the population (the equivalent of children, elderly people, or the immunocompromised) already dead, and the living either leaving the urban centres to avoid infection, or temporarily leaving the game entirely (to wait for the software defect to be fixed), the density necessary for a sustained chain of infection should have dropped below the threshold needed to sustain the epidemic.
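The idea of a density threshold can be written down with the standard textbook relationship for density-dependent transmission: each infectious character causes, on average, R0 = beta × density × infectious period new infections, and the outbreak fades once that number drops below 1. The sketch below is generic epidemiology rather than anything measured in the game; the transmission rate, infectious period and crowd sizes are invented numbers.

```python
def basic_reproduction_number(beta, density, infectious_hours):
    """R0 when transmission scales with the number of susceptibles nearby.

    beta             -- transmission rate per susceptible character per hour
    density          -- susceptible characters within range
    infectious_hours -- how long a character stays infectious
    """
    return beta * density * infectious_hours


def threshold_density(beta, infectious_hours):
    """Smallest density at which each case causes at least one new case."""
    return 1.0 / (beta * infectious_hours)


if __name__ == "__main__":
    beta, hours = 0.02, 0.5  # hypothetical values, not from the game
    print("threshold density:", threshold_density(beta, hours))
    for crowd in (250, 100, 40):
        r0 = basic_reproduction_number(beta, crowd, hours)
        verdict = "spreads" if r0 > 1 else "fades out"
        print(f"{crowd} characters nearby: R0 = {r0:.2f}, outbreak {verdict}")
```

With these illustrative numbers a packed city square of 250 characters keeps the outbreak going, while a thinned-out crowd of 40 does not, which is why deaths and players fleeing the cities should have ended the epidemic.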

Unfortunately for players of the game, a last, seemingly unrelated factor, present since the origin of the game, allowed the outbreak to continue, turning the capital cities into death traps. Computer-controlled characters, such as shopkeepers and soldiers (called non-player characters), are necessary parts of the structure and function of the game. These characters are deliberately made very powerful, to prevent them from being victimised by players exhibiting homicidal tendencies and to prevent such incidents from disrupting the normal course of game play. During the epidemic, the non-player characters served as “asymptomatic” carriers capable of spreading the disease, creating a nearly unbreakable chain of infection between highly infectious non-player characters, player characters, and their pets. Also aiding in the continuation of the epidemic, a cycle involving the resurrection of weaker characters by those with healing abilities, saw the susceptible population continually replenished (only to be reinfected and die again).
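A toy simulation shows why these two factors matter so much. The sketch below is a crude compartment model under loudly stated assumptions: every number is invented, everyone who stops being infectious is treated as dead, non-player characters are a fixed pool of carriers that never die or recover, and resurrection simply moves a share of the dead back into the susceptible group each step. It is not a reconstruction of the game’s mechanics.

```python
import math


def simulate(steps, population=1000.0, initially_infected=1.0,
             beta=0.002, removal_rate=0.2,
             npc_carriers=0.0, resurrection_rate=0.0):
    """Discrete-time outbreak model; returns the infected count per step."""
    susceptible = population - initially_infected
    infected = initially_infected
    dead = 0.0
    history = [infected]
    for _ in range(steps):
        # Infection pressure from infected players plus NPC carriers.
        p_infect = 1.0 - math.exp(-beta * (infected + npc_carriers))
        new_infections = susceptible * p_infect
        removed = removal_rate * infected    # infected characters dying
        revived = resurrection_rate * dead   # healers resurrecting the dead
        susceptible += revived - new_infections
        infected += new_infections - removed
        dead += removed - revived
        history.append(infected)
    return history


if __name__ == "__main__":
    burn_out = simulate(400)
    persistent = simulate(400, npc_carriers=5.0, resurrection_rate=0.05)
    print("infected at the end, no NPCs or resurrection:", burn_out[-1])
    print("infected at the end, with NPCs and resurrection:", persistent[-1])
```

In the first run the disease races through the population and then dies out because there is nobody left to infect. In the second, the carriers keep seeding new cases and resurrection keeps topping up the pool of victims, so the infection never goes away.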

All together, these seemingly innocuous aspects of the game world, each directly mirroring an aspect of real-world epidemiology, allowed what should have been a very minor point of interest in a small area of the game, catering to a very specific subset of the players, to become the first online instance of uncontrolled plague to affect millions of Americans, Asians, and Europeans at home. This summarises the dynamics of infection that produced the epidemic:

This flowchart shows the network of intentional and unintentional interactions between characters within Zul'Gurub and those outside it that culminated in the epidemic. The proportion of transmission accounted for by each pathway is crucial for a full understanding of the dynamics of the epidemic, and must be recorded in future, intentionally triggered, outbreaks.

In an effort to control the outbreak, Blizzard Entertainment employees imposed quarantine measures, isolating infected players from as-yet uninfected areas. These strategies failed because of the highly contagious nature of the disease, an inability to seal off a section of the game world effectively, and more than likely player resistance to the notion. The game's developers did, however, have an option that remains unavailable to public-health officials: resetting the computers. When the servers ravaged by the epidemic were reset and the effect removed, the outbreak came to a halt. Eastern vs Western efforts to control contagion?

The real-virtual connection

The modern world has 2 distinct types of pathogens, real and virtual. Real pathogens are, logically, those that infect real organisms, many of which subsequently cause disease and become subject to the attentions of the medical and public-health professions. The second type of pathogen, the virtual virus, infects computers through software. The outbreak described here marks the first time that a virtual virus has infected a virtual human being in a manner even remotely resembling an actual epidemiological event. As technology and biology become more heavily integrated in daily life, this small step towards the interaction of virtual viruses and human beings could become highly significant.

Players in World of Warcraft can become highly involved in the game, investing not only their monthly payment but also hours of time within the game. Some challenges in the game require players to set aside several hours on 3 or 4 days of the week; in South Park the 4 kids didn’t leave their computers for days, becoming morbidly obese. Friendships are formed within the game, large numbers of players work together for months to achieve common goals, and many players strive to create a believable alter ego in the virtual world, complete with the weight of responsibility and the expectations of others – I think many of us have children engrossed in Epic’s Fortnite where this applies. Whereas “resurrection” most certainly allows riskier behaviours, it is not unlikely that the modification in behaviour produced could be estimated. Research into the behavioural and emotional involvement of game players, and their relationship with their virtual selves, has shown that reactions to events in the game world can have serious, emotional repercussions. Sherry Turkle from MIT has said: “It's not that it's not part of your real life just because it's happening on the screen. It becomes integrated into really what you do every day. And so where you have loss of that part of your life that was involved in the habits and the rituals and the daily life, it's very traumatic. It is play, but it's very serious play”. This level of commitment and dedication to the virtual community within the game helps to ensure that the reactions of players will approximate the reactions of people in real-life situations of danger. I used to play Dungeons and Dragons and a bit of Rune Quest so I can see the point she’s making, and I also got to level 34 on Chuckie Egg, was Deadly in Elite…I could go on…

Let’s face it, when most of us read papers on SARS-CoV-2 models our eyes glaze over, rapidly now. Personally, I just about manage with the discussion, and the methods are a big black box. Whereas the epidemic of Corrupted Blood within World of Warcraft was the result of unintended interactions between different elements of the game, it nevertheless shows the potential of such scenarios for the study of infectious disease. One of the major constraints in studies of disease dynamics in animals is that epidemiologists are restricted largely to observational and retrospective studies. In nearly every case, it is physically impossible, financially prohibitive, or morally reprehensible to create a controlled, empirical study where the parameters of the disease are already known before the course of epidemic spread is followed. At the same time, computer models, which allow for large-scale experimentation on virtual populations without such limitations, lack the variability and unexpected outcomes that arise from within the system, not by the nature of the disease, but by the nature of the hosts it infects. These computer simulation experiments attempt to capture the complexity of a functional society to overcome this challenge. Two such systems, Transportation Analysis Simulation System (TRANSIMS) and Epidemiological Simulation System (EpiSims), are particularly ambitious, using large amounts of computing power to generate realistic virtual societies in which agents (autonomous entities governed by the rules of the simulation) perform programmed actions based on incredibly detailed research of real-world behaviour under non-outbreak conditions. However, they are programmed, by necessity, using these non-outbreak data.

Whereas games can be thought of among these types of agent-based epidemiological models, the use of human agents rather than virtual agents could further illustrate human behaviours in actual outbreak scenarios, rather than relying on stochastic algorithms to approximate assumed behaviours under these conditions. Human-agent simulations, where the subjects are virtual but have their actions controlled by human beings interacting with each other, may potentially bridge the gap between real-world epidemiological studies and large-scale computer simulations. Since the influence of individual behavioural choice has been shown to greatly affect the range of societal outcomes in many fields including epidemiology, differences between the human-agent simulation and a pure computer simulation of the same disease, incorporating the vast complexity of human behaviour, rational or otherwise, could examine the effects of these behaviours on the course of an outbreak. It would be great to put a mask on the characters.

Both types of simulation, however, share 2 limitations. First, they are both still simulations, and human-controlled virtual agents might not act in the same way as the human controller if presented with the same situation in the real world. Second, they are both still heavily dependent on computing power – which probably can’t model something like a superspreader and let’s face it, we barely understand asymptomatic versus symptomatic transmission. In the case of human-agent simulations, server capacity and subscriber base represent a theoretical maximum number of agents that, although far higher than most traditional epidemiological studies, can pale in comparison to purely computational simulations. These computational limitations imply that attention will need to be paid to issues of scalability of result.

In the case of the Corrupted Blood epidemic, some players, those with healing abilities, were seen to rush towards areas where the disease was rapidly spreading, acting as first responders in an attempt to help their fellow players. Their behaviour may have actually extended the course of the epidemic and altered its dynamics, for example, by keeping infected individuals alive long enough for them to continue spreading the disease, and by becoming infected themselves and being highly contagious when they rushed to another area. Of course, this behaviour could also have greatly reduced the mortality from the disease in those they treated. Such behaviour and its effects would have been extremely difficult to capture accurately in a pure-computer model. Human response is, almost by definition, difficult to predict, requiring experiments on emotionally involved subjects to determine the proportion of the population likely to respond in various ways. This understanding would provide the groundwork for the examination of the effect of those behaviours on the system. The failure of the quarantine measure, similarly, could not have been accurately predicted by numerical methods alone, since it was driven by human decisions and behavioural choices. This ability to demonstrate unforeseen consequences of human actions within a statistically robust and controlled computer simulation is yet another benefit of such a system.
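The healer effect can be made concrete with a back-of-the-envelope calculation. This is a sketch under stated assumptions, not an analysis of the actual outbreak: it assumes a fully susceptible population, that removal (death or recovery) is a constant daily probability, so the expected infectious period is one over that probability, and the function name and every number below are hypothetical.

```python
def expected_secondary_cases(contacts_per_day, p_transmit, p_removed_per_day):
    """Mean onward infections from one case in a fully susceptible population.

    Assumes removal (death or recovery) happens with a constant daily
    probability, so the expected infectious period is 1 / p_removed_per_day.
    """
    expected_infectious_days = 1.0 / p_removed_per_day
    return contacts_per_day * p_transmit * expected_infectious_days


if __name__ == "__main__":
    # Hypothetical numbers: healing halves the daily chance of dying.
    print(expected_secondary_cases(5, 0.1, 0.5))   # no healers -> 1.0
    print(expected_secondary_cases(5, 0.1, 0.25))  # with healers -> 2.0
```

Under these assumptions, halving the daily chance of removal doubles the expected number of onward infections, which is the mechanism by which well-meaning first responders could prolong an epidemic while also saving lives.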

Massively multiplayer online role-playing games (MMORPGs) represent a particularly tantalising pool of experimental laboratories for potential study. With very large numbers of players, these games provide a population in which controlled outbreak simulations may be run seamlessly within the player experience. However, MMORPGs are, at their core, still games, and as such enjoyment and entertainment are their central focus. It may therefore prove difficult to motivate players to participate in an epidemiological simulation. That said, plagues and epidemics already have prominent places in fantasy settings: the spread of disease, intentional and otherwise, has featured as a major plot element in several major titles. Researchers will have to allow players to feel not as if they are in a deliberate epidemiological simulation where they may die based on statistical whims, but rather that they are immersed in a coherent, logical setting where death is a major risk, essentially unifying epidemiological experimentation with game design and development. These efforts will likely involve careful consideration and partnership with the gaming industry, mirroring the outreach, partnering, and involvement of community representatives often needed to make traditional epidemiological studies palatable to the real-world populations being studied. Studies using gaming systems are without the heavy moral and privacy restrictions on patient data inherent to studies involving human patients. This is not to say that this experimental environment is free from concerns of informed consent, anonymity, privacy, and other ethical quandaries. Players may, for example, be asked to consent to the use of their game behaviour for scientific research as part of a licence agreement before participating in the game. Studies impossible to undertake on actual populations can be run as in-game events, and because the data would be recorded directly from servers, they would be largely free from the reporting biases of self-reported data. Lastly, the ability to repeat such experiments on different portions of the player population within the game (or on different game servers) could act as a detailed, repeatable, accessible, and open standard for epidemiological studies, allowing for confirmation and alternative analysis of results.

Although the use of these systems opens the possibility for new methods of experimentation, it remains subject to the need for external validation before any understanding gained from these experiments could be applied to real-world outbreak scenarios. Some of this validation could be achieved in the same way that many mathematical and simulation modelling results are already tested: by comparing the outcomes observed in virtual scenarios that have been tailored to reflect the conditions of real-world, historical outbreaks as closely as possible with the outcomes of the real-world outbreak. However, as with many simulation modelling techniques, having a framework available for experimentation, even in the absence of external validation, can lead to potentially crucial insights into the dynamics of the system as a whole.


Last week Meta revealed that it's built one of the world's fastest supercomputers, a behemoth called the Research SuperCluster, or RSC. With 6,080 graphics processing units packaged into 760 Nvidia A100 modules, it's the fastest machine built for AI tasks, Mark Zuckerberg says. That processing power is in the same league as the Perlmutter supercomputer, which uses more than 6,000 of the same Nvidia GPUs and currently ranks as the world’s 5th fastest supercomputer. In a second phase, Meta plans to boost performance by a factor of 2.5 with an expansion to 16,000 GPUs this year. Meta will use RSC for a host of research projects that require next-level performance, such as "multimodal" AI that bases conclusions on a combination of sound, imagery and actions instead of just one type of input data. That could be helpful for dealing with the subtleties of one of Facebook's big issues, spotting harmful content. Meta and other AI proponents have shown that training AI models with ever larger data sets produces better results. Training AI models takes vastly more computing horsepower than running those models, which is why iPhones can unlock when they recognise your face without requiring a connection to a data centre packed with servers. Supercomputer designers customize their machines by picking the right balance of memory, GPU performance, CPU performance, power consumption and internal data pathways. In today's AI, the star of the show is often the GPU, a type of processor originally developed for accelerating graphics but now used for many other computing chores. The games of the future will have many more applications beyond the games themselves: just look at what AlphaGo has spawned. How this interacts with the world of quantum computers will be fascinating.
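The figures quoted above can be sanity-checked with a few lines of arithmetic. This only restates the numbers in the text; the variable names are mine.

```python
gpus_phase1 = 6080    # GPUs in the first phase, as quoted above
modules = 760         # Nvidia A100 modules, as quoted above
gpus_phase2 = 16000   # planned GPU count after the expansion

gpus_per_module = gpus_phase1 // modules  # 8 GPUs in each module
gpu_ratio = gpus_phase2 / gpus_phase1     # about 2.63 times as many GPUs

print(gpus_per_module, round(gpu_ratio, 2))
```

The GPU count grows by a factor of about 2.6, slightly more than the quoted 2.5-fold performance boost, which is plausible because performance need not scale exactly in step with the number of GPUs.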

The history here is already remarkably deep, and AI in games is here to stay and will keep evolving rapidly. AI-powered interactive experiences are now usually generated via non-player characters, or NPCs, that act intelligently or creatively, as if controlled by a human game-player. AI is the engine that determines an NPC’s behaviour in the game world. AI games increasingly shift the control of the game experience toward the player, whose behaviour helps produce the game experience. AI procedural generation, which is closely related to procedural storytelling, refers in game design to game data being produced algorithmically rather than every element being built specifically by a developer. Gaming is also no longer simply a choice between console and desktop computer. Rather, players expect immersive game experiences on a vast array of mobile and wearable devices, from smartphones to VR headsets and more. AI enables developers to deliver console-like experiences across device types. But where it really excels is in its ability to beat humans, which in turn helps us to learn.
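To show what "produced algorithmically" means in practice, here is a minimal sketch of procedural generation. It uses plain seeded random numbers rather than any AI technique, and the function name, the symbols and the wall probability are all assumptions for illustration; real games use far richer generators.

```python
import random


def generate_map(width, height, wall_chance=0.3, seed=None):
    """Build a grid of walls ('#') and floor ('.') from a random seed."""
    rng = random.Random(seed)
    return [
        "".join("#" if rng.random() < wall_chance else "." for _ in range(width))
        for _ in range(height)
    ]


if __name__ == "__main__":
    # The same seed always yields the same level, so a whole world can be
    # stored as a single number instead of being built by hand.
    for row in generate_map(20, 8, seed=42):
        print(row)
```

The design point is that the developer writes the rule and the seed, not the level itself.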

When it comes to games, the Corrupted Blood outbreak in World of Warcraft represents both a missed opportunity and an exciting new direction for future epidemiological research. Although the infrastructure needed to record data on the outbreak to research standards was not in place, and the outbreak was largely a series of unexpected interactions between different factors within the game, it offers a glimpse of what further study could look like. Virtual outbreaks designed and implemented with public-health studies in mind have the potential to bridge the gap between traditional epidemiological studies on populations and large-scale computer simulations. They involve both unprogrammed human behaviour and large numbers of test participants in a controlled environment where the disease parameters are known, something neither type of study can manage alone. If the epidemic is designed and presented so as to integrate seamlessly with the rest of the persistent game world, as part of the user's expected experience in the game, a reasonable analogue of real-world human reactions to disease might be observed and captured within a computer model. This human-agent model of disease dynamics could then be used to provide reproducible empirical analyses, yielding greater insight into the behavioural reactions and individual responses of people threatened by outbreaks of disease. By using these games as an untapped experimental framework, we may be able to gain deeper insight into the incredible complexity of infectious disease epidemiology in social groups. Lessons for SARS-CoV-2 abound: these games can help us not only to understand the pandemic better but also to deal with it and, interestingly, to prepare for the next one. It’s not very easy to google ‘computer game and virus’, as you can imagine what comes up. As with COVID-19 itself, there are lots of lessons to be learned. We can learn a ton from games, and they’re fun too.


Text written by Professor Justin Stebbing
Managing Director