AUTONEWS

Can AI read humans' minds? A pedestrian behavior model is shockingly good at it
In a striking leap toward safer self-driving cars, researchers at Texas A&M University College of Engineering and the Korea Advanced Institute of Science and Technology have unveiled a new artificial intelligence (AI) system called OmniPredict.
The AI is the first system to apply a Multimodal Large Language Model (MLLM) to predict human pedestrian behaviors, tapping into the same technology that powers advanced chatbots and image recognition systems. But instead of generating text or describing images, it combines visual cues with contextual information to predict in real time what pedestrians are likely to do next.
And the results? Early tests are already turning heads: OmniPredict is strikingly accurate, even without specialized training.
"Cities are unpredictable. Pedestrians can be unpredictable," said Dr. Srinkanth Saripalli, the project's lead researcher and director of the Center for Autonomous Vehicles and Sensor Systems. "Our new model is a glimpse into a future where machines don't just see what's happening, they anticipate what humans are likely to do, too."
A new kind of 'street smarts'

In the race to make self-driving cars safer, OmniPredict introduces a new level of street smarts, one that inches closer to human intuition.
Rather than simply reacting to what pedestrians are doing, it anticipates what they're about to do. This shift could redraw the blueprint of urban mobility, and how autonomous vehicles navigate crowded streets. The psychological landscape could shift too.
Imagine standing at a crosswalk and, instead of locking eyes with a human driver, knowing that an AI vehicle is tracking your position and is planning around your next likely move.
"Fewer tense standoffs. Fewer near-misses. Streets might even flow more freely. All because vehicles understand not only motion, but most importantly, motives," Saripalli said.
Beyond crosswalks: Reading human behavior in complex environments

OmniPredict's implications extend far beyond bustling city streets, chaotic intersections or crowded crosswalks.
"We are opening the door for exciting applications," Saripalli said. "For instance, the possibility of a machine to capably detect, recognize and predict outcomes of a person displaying threatening cues could have important implications."
Broadly, an AI system that reads posture changes, hesitation, body orientation or signs of stress could be a game-changer for personnel involved in military and emergency operations.
"It could help flag and alert early indicators of risk, or even provide an extra layer of situational awareness," Saripalli said.
In these scenarios, the new approach might give personnel the ability to rapidly interpret complex environments and make faster, more informed decisions.
"Our goal in the project isn't to replace humans, but to help augment them with a smarter partner," said Saripalli.
An overview of OmniPredict: a GPT-4o-powered system that blends scene images, close-up views, bounding boxes, and vehicle speed to understand what pedestrians might do next. By analyzing this rich mix of inputs, the model sorts behavior into four key categories (crossing, occlusion, actions, and gaze) to make smarter, safer predictions. Credit: Computers and Electrical Engineering (2026)

Putting it to the test

Traditional self-driving systems rely on computer-vision models trained on thousands of labeled images. While powerful, these models struggle to adapt to changing conditions.
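To make the description above concrete, here is a minimal Python sketch of how such a system might bundle its four input modalities into a single multimodal-LLM request and parse the model's reply into the four behavior categories. The prompt wording, JSON schema, and function names are illustrative assumptions based on the article's description, not the authors' actual implementation.

```python
# Hypothetical sketch of OmniPredict-style input assembly and output parsing.
# Category labels come from the article; everything else is an assumption.

import json

CATEGORIES = ("crossing", "occlusion", "actions", "gaze")

def build_request(scene_image_b64, closeup_b64, bbox, ego_speed_mps):
    """Bundle the four input modalities the article describes into one
    chat-style request for a multimodal LLM such as GPT-4o."""
    system = (
        "You are a pedestrian-behavior predictor for an autonomous vehicle. "
        "Given a scene image, a pedestrian close-up, the pedestrian's bounding "
        "box, and the ego vehicle's speed, return JSON with the keys: "
        + ", ".join(CATEGORIES)
    )
    user_text = (
        f"Bounding box (x, y, w, h): {bbox}. "
        f"Ego vehicle speed: {ego_speed_mps:.1f} m/s."
    )
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": [
                {"type": "text", "text": user_text},
                {"type": "image", "data": scene_image_b64},
                {"type": "image", "data": closeup_b64},
            ]},
        ]
    }

def parse_prediction(reply_text):
    """Extract the four behavior categories from the model's JSON reply,
    defaulting to 'unknown' for any key the model omits."""
    data = json.loads(reply_text)
    return {key: data.get(key, "unknown") for key in CATEGORIES}

# Example reply a model might return for a pedestrian at a curb:
reply = ('{"crossing": "intends_to_cross", "occlusion": "none", '
         '"actions": "walking", "gaze": "looking_at_vehicle"}')
print(parse_prediction(reply)["crossing"])  # intends_to_cross
```

In a real deployment the request dictionary would be sent to the model's API on every perception cycle, and the parsed categories would feed the vehicle's planner; structuring the output as a fixed set of keys keeps the downstream logic simple and robust to free-form text.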
