– article written by Vincent Lautier –

Super Mario Bros. is now being used as a testing ground for artificial intelligence. Researchers at the University of California San Diego used the cult game to test different AI models. Their finding? This benchmark is harder than earlier ones, and it clearly exposes how much the models struggle to reason in real time.

The experiment: a Mario piloted by AI

For this study, the Hao AI Lab did not use the original 1985 game, but an emulated version integrated into GamingAgent, a framework designed in-house. The AI received screenshots and basic instructions such as “If an obstacle is approaching, jump to the left”. From there, it generated Python code to control Mario. The idea was to see how these models could adapt and develop game strategies.
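To make that loop concrete, here is a minimal sketch of the screenshot, instruction, generated-code cycle. It is not the lab's framework: `Frame`, `fake_model`, `run_step` and `press` are hypothetical names invented for illustration, the "screenshot" is reduced to a single number, and the "model" is a hard-coded rule standing in for a real AI.

```python
from dataclasses import dataclass
from typing import Callable, List

# Instruction text modeled on the example quoted in the article.
INSTRUCTIONS = "If an obstacle is approaching, jump to the left."


@dataclass
class Frame:
    """Hypothetical stand-in for a screenshot: distance to the next obstacle, in tiles."""
    obstacle_distance: int


def fake_model(instructions: str, frame: Frame) -> str:
    """Hypothetical stand-in for an AI model; returns Python source as text."""
    if frame.obstacle_distance <= 2:
        return "press('left')\npress('jump')"
    return "press('right')"


def run_step(model: Callable[[str, Frame], str], frame: Frame) -> List[str]:
    """Ask the model for code, run it, and return the buttons it pressed."""
    pressed: List[str] = []

    def press(button: str) -> None:
        pressed.append(button)

    code = model(INSTRUCTIONS, frame)
    # exec() on model output is only acceptable in a toy like this one;
    # a real setup would need sandboxing.
    exec(code, {"press": press})
    return pressed


print(run_step(fake_model, Frame(obstacle_distance=1)))  # ['left', 'jump']
print(run_step(fake_model, Frame(obstacle_distance=5)))  # ['right']
```

In a real setup, the hard-coded rule would be replaced by a call to an actual model, and `press` would send inputs to the emulator.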

Source: TechCrunch.com

Models that think too much to play properly

Surprise: the AI models that are supposed to be the smartest, like OpenAI's GPT-4o, turned out to be rather bad. Their problem? They take too long to decide what to do. And in Super Mario Bros., if you take three seconds to choose between jumping and running... you die. Conversely, less sophisticated but more reactive models, such as Anthropic's Claude 3.7, did better.
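A toy simulation shows why decision time matters so much. Everything here is an assumption made for illustration, not a measurement from the study: the 60 frames per second, the jump range, the air time, and the single obstacle three seconds away. The key point is that the game keeps running while the model thinks, so each decision is based on a screenshot that is already out of date when the action is applied.

```python
FPS = 60          # assumed frame rate of the emulated game
JUMP_RANGE = 30   # hypothetical: model decides to jump when the obstacle is this many frames away
AIR_TIME = 40     # hypothetical: number of frames a jump keeps Mario in the air


def play(latency_s: float, obstacle_frame: int = 180) -> str:
    """Run Mario toward one obstacle with a model that needs latency_s seconds per decision."""
    latency = max(1, round(latency_s * FPS))
    airborne_until = -1   # frame at which the current jump ends
    pending = None        # (frame at which the answer arrives, action)

    for frame in range(obstacle_frame + 1):
        # The model's answer arrives: apply it now, based on an old screenshot.
        if pending is not None and frame >= pending[0]:
            if pending[1] == "jump" and frame >= airborne_until:
                airborne_until = frame + AIR_TIME
            pending = None

        # The model is idle: take a fresh screenshot and start thinking.
        if pending is None:
            distance = obstacle_frame - frame
            action = "jump" if distance <= JUMP_RANGE else "run"
            pending = (frame + latency, action)

    # Mario survives only if he is in the air when he reaches the obstacle.
    return "cleared" if obstacle_frame < airborne_until else "dead"


print(play(0.1))  # cleared
print(play(3.0))  # dead
```

With a tenth of a second per decision, the model sees the obstacle in time and jumps. With three seconds per decision, its first answer arrives at the very moment Mario hits the obstacle.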

A real test for AI?

Using video games to evaluate AI is nothing new, but some researchers are starting to question how relevant it is. Admittedly, Super Mario Bros. pushes AI to anticipate and react quickly, but it is only a game, with fixed rules and a limited environment. Andrej Karpathy, a researcher at OpenAI, even speaks of an “evaluation crisis”: we no longer know which tests really reflect the capabilities of current models.

Super Mario Bros. may show the limits of some AI models in real time, but that does not mean these models are useless elsewhere. AI evaluation must take more complex and varied situations into account.

Article written by me, Vincent Lautier, at the invitation of my friend Korben. You can follow me on Bluesky or read the short reviews I publish from time to time in the “Gadgets Tech” category!

