From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios
Yuan Gao and 4 other authors
Abstract: Ensuring the safety of autonomous vehicles requires virtual scenario-based testing, which in turn depends on the robust evaluation and generation of safety-critical scenarios. So far, researchers have relied on scenario-based testing frameworks built around handcrafted scenarios and safety metrics. To reduce the effort of human interpretation and overcome the limited scalability of these approaches, we combine Large Language Models (LLMs) with structured scenario parsing and prompt engineering to automatically evaluate and generate safety-critical driving scenarios. We introduce Cartesian and ego-centric prompt strategies for scenario evaluation, and an adversarial generation module that modifies the trajectories of risk-inducing vehicles (ego-attackers) to create critical scenarios. We validate our approach using a 2D simulation framework and multiple pre-trained LLMs. The results show that the evaluation module effectively detects collision scenarios and infers scenario safety, while the generation module identifies high-risk agents and synthesizes realistic, safety-critical scenarios. We conclude that an LLM equipped with domain-informed prompting techniques can effectively evaluate and generate safety-critical driving scenarios, reducing dependence on handcrafted metrics. We release our open-source code and scenarios at: this https URL.
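To make the two evaluation prompt strategies concrete, below is a minimal Python sketch of how a parsed scenario might be rendered as a Cartesian (global-frame) prompt versus an ego-centric (relative-frame) prompt. All names (`Vehicle`, `build_cartesian_prompt`, `build_ego_prompt`) and the prompt wording are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the two prompt strategies described in the abstract.
# The class and function names are illustrative assumptions.
from dataclasses import dataclass
from typing import List
import math

@dataclass
class Vehicle:
    vid: str
    x: float        # global x position [m]
    y: float        # global y position [m]
    heading: float  # global heading [rad]
    speed: float    # [m/s]

def build_cartesian_prompt(vehicles: List[Vehicle]) -> str:
    """Describe every agent in a shared global (Cartesian) frame."""
    lines = [
        f"{v.vid}: position=({v.x:.1f}, {v.y:.1f}) m, "
        f"heading={math.degrees(v.heading):.0f} deg, speed={v.speed:.1f} m/s"
        for v in vehicles
    ]
    return ("Given the global vehicle states below, assess whether the "
            "scenario is safety-critical and explain why.\n" + "\n".join(lines))

def build_ego_prompt(ego: Vehicle, others: List[Vehicle]) -> str:
    """Describe other agents relative to the ego vehicle's frame."""
    cos_h, sin_h = math.cos(-ego.heading), math.sin(-ego.heading)
    lines = []
    for v in others:
        dx, dy = v.x - ego.x, v.y - ego.y
        # Rotate the world-frame offset into the ego frame (x forward, y left).
        fx = dx * cos_h - dy * sin_h
        fy = dx * sin_h + dy * cos_h
        lines.append(f"{v.vid}: {fx:.1f} m ahead, {fy:.1f} m to the left, "
                     f"relative speed={v.speed - ego.speed:+.1f} m/s")
    return ("From the ego vehicle's perspective, assess whether the "
            "scenario is safety-critical and explain why.\n" + "\n".join(lines))

if __name__ == "__main__":
    ego = Vehicle("ego", 0.0, 0.0, 0.0, 10.0)
    attacker = Vehicle("veh_1", 15.0, -1.5, math.pi, 12.0)
    print(build_cartesian_prompt([ego, attacker]))
    print(build_ego_prompt(ego, [attacker]))
```

The difference between the two renderings is only the coordinate frame handed to the LLM; the ego-centric variant pre-computes relative geometry so the model reasons about proximity and closing speed directly, which is the intuition behind offering both strategies.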
Submission history
From: Yuan Gao
[v1] Tue, 4 Feb 2025 09:19:13 UTC (2,052 KB)
[v2] Mon, 19 May 2025 21:23:20 UTC (3,063 KB)
[v3] Wed, 21 May 2025 07:47:01 UTC (3,064 KB)
[v4] Fri, 18 Jul 2025 08:39:33 UTC (2,711 KB)