AI Deployments and Safety in 2026
Looking forward to a great year in safe AI adoption
Last year on these pages we covered a range of topics, from ML testing basics to formal verification and drift. This year we’ll be picking up those threads, and more. AI/ML deployment has continued apace, and it is hard to keep up with new developments! What is clear is that the need for good validation, assurance, and robustness has only grown:
Autonomous flight is advancing in many parts of the world. Zipline has announced autonomous drone delivery for Houston and Phoenix in the United States. Walmart and Wing announced a major expansion of drone delivery to 100 additional U.S. stores, enabled by operational authorizations from the Federal Aviation Administration. Elsewhere, in places like the United Arab Emirates, authorities began mapping air corridors for air taxis and cargo drones; in parallel, there are plans for public air-taxi operations in Dubai in 2026.
Driverless cars are set to reach more cities, with Waymo coming to London in 2026.
Enterprise deployment of AI-based agents is on the agenda for almost every large company, and many of these agents are designed to work fully autonomously.
In many cases, organizations are no longer building their AI systems and agents from scratch but are assembling them from off-the-shelf frameworks and components. This adds an extra layer of challenge to validation, since teams often don’t have full access to the model itself.
This year we’ll dig further into the testing and validation of AI to meet these challenges. For now, though, here are five of our favorite posts from last year:
ML Testing Refresher - Aka Skyscrapers and Rocks (https://resilient.safeintelligence.ai/p/ml-testing-refresher-ae9). This article covers why testing and validation of ML is so hard (and so necessary).
The ML Benchmarking Primer (https://resilient.safeintelligence.ai/p/ml-benchmarking-primer). Going beyond testing to benchmarking and continual performance improvements.
Formal Verification of ML (https://resilient.safeintelligence.ai/p/formal-verification-of-ml). What happens when you want to go beyond testing and deeply analyze the properties of a model?
VNN-COMP: Benchmarking the Verification of Neural Networks (https://resilient.safeintelligence.ai/p/vnn-comp-benchmarking-the-verification). Covering the trailblazing competition that helps push the boundaries of what’s possible in formal verification each year.
When Machine Learning Models Stop Seeing Clearly (https://resilient.safeintelligence.ai/p/when-machine-learning-models-stop). A nice primer on what drift is and why it needs to be kept in check!
We’re looking forward to a great year in safe AI adoption!