October 2025: AI updates from the past month

November 3, 2025

OpenAI announces agentic security researcher that can find and fix vulnerabilities

OpenAI has launched a private beta for a new AI agent called Aardvark that acts as a security researcher, finding vulnerabilities and applying fixes at scale.

“Software security is one of the most critical, and challenging, frontiers in technology. Each year, tens of thousands of new vulnerabilities are discovered across enterprise and open-source codebases. Defenders face the daunting tasks of finding and patching vulnerabilities before their adversaries do. At OpenAI, we are working to tip that balance in favor of defenders,” OpenAI wrote in a blog post.

The agent continuously analyzes source code repositories to identify vulnerabilities, assess their exploitability, prioritize severity, and propose patches. Instead of using traditional analysis techniques like fuzzing or software composition analysis, Aardvark uses LLM-powered reasoning and tool use.

Cursor 2.0 enables eight agents to work in parallel without interfering with each other

The AI coding editor Cursor announced the launch of Cursor 2.0, the next iteration of the platform, featuring a new interface for working with multiple agents and its first-ever coding model.

The new multi-agent interface centers around agents instead of files. With this new interface, up to eight agents can work in parallel, using git worktrees or remote machines to prevent them from interfering with each other. It also allows developers to have multiple models attempt the same problem and see which one produces the best output.
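
As a rough illustration of how git worktrees keep parallel agents apart, the sketch below builds one `git worktree add` command per agent, each with its own directory and branch. This is not Cursor's implementation; the `agent-N` naming, the directory layout, and the helper name `worktree_commands` are assumptions made for the example.

```python
from pathlib import Path

MAX_AGENTS = 8  # the article's stated limit for parallel agents


def worktree_commands(repo: Path, agent_count: int, base: str = "HEAD") -> list[list[str]]:
    """Build one `git worktree add` command per agent (hypothetical layout)."""
    if not 1 <= agent_count <= MAX_AGENTS:
        raise ValueError(f"agent_count must be between 1 and {MAX_AGENTS}")
    commands = []
    for i in range(1, agent_count + 1):
        name = f"agent-{i}"  # hypothetical branch and directory name
        workdir = repo.parent / f"{repo.name}-worktrees" / name
        # `git worktree add -b <branch> <path> <commit-ish>` checks out a new
        # branch in its own directory, so edits there never touch the others.
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", name, str(workdir), base]
        )
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/demo-repo"), 3):
        print(" ".join(cmd))
```

The helper only builds the commands, so they can be inspected before anything is run; passing each list to `subprocess.run` inside a real repository would create the worktrees.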

While this new interface is designed for agents, developers will still be able to open files or switch back to the classic IDE as needed.

The new coding model, Composer, is four times faster than similar models, the company claims. It was designed for low-latency agentic coding tasks in Cursor, and it can complete most turns in less than 30 seconds.

Workato launches Enterprise MCP for SaaS platforms

Organizations are spending heavily on AI agents, but are finding that integrating the agents into all the systems the business needs to function is a very high hurdle.

To help make SaaS platforms agent-ready, integration orchestration company Workato launched Workato Enterprise MCP, which the company said in its announcement can “turn existing workflows, integrations, and APIs into rich, multi-step agent skills that any large-language-model (LLM)-based agent can call, including ChatGPT, Claude, Gemini, and Cursor.”

Adam Seligman, chief technology officer at Workato, told SD Times that “the thing we keep coming back to over and over is agents show a lot of promise, but to really work for business, they have to get access to business data. And they have to be able to do things inside your business, but do it in a way that you trust. And it’s really hard to get those two things right.”

JetBrains launches open benchmarking platform for measuring AI productivity

JetBrains has launched a new tool designed to enable developers to measure their actual productivity gains from AI tools.

The company’s Developer Productivity AI Arena (DPAI Arena) is an open benchmarking platform for measuring how well AI development tools complete real-world software engineering tasks. According to the company, existing benchmarks that LLMs are run against rely on outdated datasets, cover a narrow range of technologies, and focus primarily on issue-to-patch workflows.

“As AI coding tools advance rapidly, the industry still lacks a neutral, standards-based framework to measure their real impact on developer productivity,” the company wrote in a blog post.

DPAI Arena uses a flexible, track-based architecture to enable reproducible comparisons across workflows like patching, bug fixes, PR review, test generation, static analysis, and more.

GitHub unveils Agent HQ, the next evolution of its platform that focuses on agent-based development

During its annual conference, GitHub Universe, GitHub shared its plans for Agent HQ, its vision for the future of the platform where AI agents are natively integrated across all of GitHub.

As part of this Agent HQ initiative, over the next several months, paid GitHub Copilot users will gain direct access to popular coding agents from Anthropic, OpenAI, Google, Cognition, xAI, and more.

Agent HQ brings with it several new capabilities to support this next evolution, the first of which is mission control, a central command center for assigning, steering, and tracking the work of multiple agents across GitHub, Copilot CLI, and VS Code.

Mission control’s branch controls give developers granular oversight over running checks for code created by the agents. Identity features will also be introduced so that developers can manage agents as they would other coworkers: controlling which agent is building a task, managing access, and enforcing policies.

OpenAI completes restructuring, strikes new deal with Microsoft

OpenAI today announced that it has completed the restructuring of its business. When the company was founded in 2015, it was launched as a non-profit organization, and that non-profit has controlled the for-profit arm of the business.

Today’s restructuring turns the for-profit arm into a public benefit corporation called OpenAI PBC. The OpenAI Foundation, the new name for the non-profit, will still control the for-profit and hold a 26% equity stake in OpenAI PBC, a stake currently valued at around $130 billion.

A public benefit corporation differs from a traditional corporate structure in that it is “required to advance its stated mission and consider the broader interests of all stakeholders, ensuring the company’s mission and commercial success advance together,” OpenAI’s website explains.

Microsoft announces public preview for planning capability that improves how Copilot in Visual Studio handles complex tasks

Microsoft has announced a public preview for a new feature that aims to enable Copilot in Visual Studio to tackle more complex projects.

With its new planning capability in Agent Mode, Copilot will research the codebase to break down big tasks into smaller, more manageable ones, while also iterating on its plan as it works through the steps.

“Planning makes Copilot more predictable and consistent by giving it a structured way to reason about your project. It builds on techniques from hierarchical and closed-loop planning research – enabling Copilot to plan at a high level, execute step by step, and adjust dynamically as it learns more about your codebase and issues encountered during implementation,” Rhea Patel, product manager at Microsoft, wrote in a blog post.

GitKraken releases Insights to help companies measure ROI of AI

GitKraken, a software engineering intelligence company that specializes in improving the developer experience, announced the launch of GitKraken Insights to provide companies with better insights into AI’s impact on developer productivity.

Matt Johnston, CEO of GitKraken, told SD Times that despite the incremental investments in and perceived velocity gains from AI, companies struggle to understand the impact. “I was talking to a VP of developer experience at a large Silicon Valley company, and he was basically saying, ‘We’ve made investments of thousands of seats in Cursor and Copilot and Claude, and we can’t really tell what’s being used… and how on earth do I measure this in a way that’s compelling to my business leaders.’”

GitKraken Insights brings together several different metrics (DORA metrics, code quality analysis, technical debt tracking, AI impact measurement, and developer experience indicators) to paint a picture of what’s happening within the development lifecycle.

Mabl announces updates to Agentic Testing Teammate

The Agentic Testing Teammate works alongside human testers to make the process more efficient. New updates include AI vectorizations and test semantic search, improvements to test coverage, and enhancements to the MCP Server that enable testers to do a variety of tasks directly within their IDE, including Test Impact Analysis, intelligent test creation, and failure recommendations.

“This new work is built on the idea that an agent can become an integral part of your testing team,” said Dan Belcher, co-founder of mabl. “Unlike scripting frameworks and general-purpose large language models, mabl builds deep knowledge about your application over time and uses that knowledge to make it–and your team–more effective.”

Couchbase 8.0 adds three new vector indexing and retrieval capabilities

These new capabilities are designed to support diverse vector workloads that facilitate real-time AI applications.

Hyperscale Vector Index is based on the DiskANN nearest-neighbor search algorithm and enables operation across partitioned disks for distributed processing. Composite Vector Index supports pre-filtered queries that narrow down the specific vectors being sought. Search Vector Index supports hybrid searches containing vectors, lexical search, and structured query criteria in a single SQL++ request.
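
To make the idea of a pre-filtered vector query concrete, here is a minimal sketch in plain Python: a structured filter first narrows the candidates, and only the survivors are ranked by cosine similarity. It uses brute-force search rather than DiskANN, and it does not use Couchbase's API or SQL++; the `Doc` type, field names, and sample data are all invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    category: str          # structured field used by the pre-filter
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefiltered_search(docs, query, category, k=2):
    """Filter on the structured field first, then rank survivors by similarity."""
    candidates = [d for d in docs if d.category == category]
    ranked = sorted(candidates, key=lambda d: cosine(d.embedding, query), reverse=True)
    return [d.doc_id for d in ranked[:k]]


# Invented sample data: "c" matches the query vector exactly but is a hat.
DOCS = [
    Doc("a", "shoes", [1.0, 0.0]),
    Doc("b", "shoes", [0.6, 0.8]),
    Doc("c", "hats", [1.0, 0.0]),
    Doc("d", "shoes", [0.0, 1.0]),
]

if __name__ == "__main__":
    print(prefiltered_search(DOCS, [1.0, 0.0], "shoes"))
```

Because the filter runs before the ranking, document `c` never competes for a slot in the "shoes" results even though its embedding is identical to the query.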

Anthropic expands memory to all paid Claude users

Anthropic announced that the recent memory feature in Claude is being rolled out to Pro and Max plan users, making it available to all paid users.

Memory was originally announced in early September, but was initially available only to Team and Enterprise users.

Memory allows Claude to remember your projects and preferences so that you don’t need to re-explain important context across sessions. “Great work builds over time. With memory, each conversation with Claude improves the next,” Anthropic wrote in its initial announcement.

Harness brings vibe coding to database migration with new AI-Powered Database Migration Authoring feature

Harness is on a mission to make it easier for developers to do database migrations with its new AI-Powered Database Migration Authoring feature. This new capability allows users to describe schema changes in natural language and receive a production-ready migration.

For example, a developer could ask “Create a table named animals with columns for genus_species and common_name. Then add a related table named birds that tracks unladen airspeed and proper name. Add rows for Captain Canary, African swallow, and European swallow.”

Harness’ platform would then analyze the current schema and policies, generate a backward-compatible migration, validate the change for safety and compliance, commit it to Git for testing, and create rollback migrations.
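
For a sense of what a forward migration and its rollback might look like for the example prompt, the sketch below hand-writes both and runs them against an in-memory SQLite database. It is not output from Harness. The column names (`proper_name`, `unladen_airspeed_mps`), the foreign-key design, and the choice of SQLite are assumptions; placing all three names in `common_name` is just one reading of an ambiguous prompt, and airspeed values are left NULL because the prompt gives none.

```python
import sqlite3

# Forward migration: hypothetical SQL a developer might expect for the prompt.
UP = """
CREATE TABLE animals (
    id            INTEGER PRIMARY KEY,
    genus_species TEXT,
    common_name   TEXT NOT NULL
);
CREATE TABLE birds (
    id                   INTEGER PRIMARY KEY,
    animal_id            INTEGER NOT NULL REFERENCES animals(id),
    proper_name          TEXT,
    unladen_airspeed_mps REAL  -- unit is an assumption; the prompt names none
);
INSERT INTO animals (common_name) VALUES
    ('Captain Canary'), ('African swallow'), ('European swallow');
INSERT INTO birds (animal_id) SELECT id FROM animals;
"""

# Rollback migration: drop the dependent table first, then its parent.
DOWN = """
DROP TABLE birds;
DROP TABLE animals;
"""


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List user tables currently present in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def migrate_and_rollback() -> tuple[list[str], int, list[str]]:
    """Apply UP, record the resulting state, apply DOWN, and report both."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(UP)
    after_up = table_names(conn)
    bird_rows = conn.execute("SELECT COUNT(*) FROM birds").fetchone()[0]
    conn.executescript(DOWN)
    after_down = table_names(conn)
    conn.close()
    return after_up, bird_rows, after_down


if __name__ == "__main__":
    print(migrate_and_rollback())
```

Keeping the rollback next to the forward script makes it easy to check that applying one and then the other returns the database to its starting state.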

Red Hat Developer Lightspeed brings AI assistance to Red Hat’s Developer Hub and migration toolkit

Red Hat Developer Lightspeed has been integrated into both the Red Hat Developer Hub and the migration toolkit for applications (MTA).

In the Red Hat Developer Hub, it acts as an assistant to speed up non-coding tasks, like exploring application design approaches, writing documentation, generating test plans, and troubleshooting applications.

In the migration toolkit, Red Hat Developer Lightspeed automates source code refactoring within the IDE. It leverages MTA’s static code analysis to understand migration issues and how to fix them, and also improves over time by learning what made past changes successful.

MariaDB unifies transactional, analytical, and vector databases in MariaDB Enterprise Platform 2026 release

MariaDB’s Enterprise Platform 2026 release was announced this week, with the promise that it will act as “the definitive database platform for building next-generation intelligent applications.”

To support agentic AI, the company added native RAG for grounding LLMs with context from MariaDB without needing embeddings, vector stores, or retrieval pipelines. The company also added ready-to-use agents within the platform, including a developer copilot that connects to the database and can respond to natural language queries, and a DBA copilot that can handle tasks like performance tuning and debugging.

Additionally, the company added an integrated MCP server so that agents can interact with MariaDB databases. The MCP interface in MariaDB allows users to combine vector search, LLMs, and standard SQL operations, and allows agents to launch serverless databases in the cloud.

Spotify Portal now generally available and packed with features for improving dev experience

Spotify Portal for Backstage provides developers with a ready-to-use version of Backstage, Spotify’s open source solution for building internal developer portals (IDPs).

AiKA, which is an AI assistant for Portal, can now connect to third-party MCP servers and trigger actions in Portal. AiKA itself also functions as an MCP server, allowing developers to connect it to tools like Cursor or Copilot and access Portal data.

“The general availability of Spotify Portal marks a pivotal moment in how organizations build, measure, and optimize developer experience. What began as an internal tool for Spotify engineers is now a fully-fledged platform for enterprises, combining the reliability of Backstage, the insight of Confidence, and the speed of AI-driven workflows,” Spotify wrote.

Sonar announces new solution to optimize training datasets for coding LLMs

Sonar, a company that specializes in code quality, announced a new solution that will improve how LLMs are trained for coding purposes.

According to the company, LLMs that are used to help with software development are often trained on publicly available, open source code containing security issues and bugs, which become amplified throughout the training process. “Even a small amount of flawed data can degrade models of any size, disproportionately degrading their output,” Sonar wrote in an announcement.

SonarSweep (now in early access) aims to mitigate these issues by ensuring that models are learning from high-quality, secure examples.

It works by identifying and fixing code quality and security issues in the training data itself. After analyzing the dataset, it applies a strict filtering process to remove low-quality code while also balancing the updated dataset to ensure it will still offer diverse and representative learning.

Amazon launches Quick Suite to provide agentic AI across applications and AWS services

Amazon Quick Suite allows users to ask questions, conduct deep research, analyze and visualize data, and create automations.

It can connect to internal repositories, like wikis or intranets, and AWS services. Amazon also offers 50+ built-in connectors to applications like Adobe Analytics, SharePoint, Snowflake, Google Drive, OneDrive, Outlook, ServiceNow, and Databricks, as well as support for more than 1,000 apps by connecting to their MCP servers.

This deep connection across the business enables Quick Sight to analyze data across all of a company’s systems and create complex business workflows across multiple applications and departments.

“Unlike traditional business intelligence tools that work only with databases and data warehouses, Quick Sight’s agentic experience analyzes all forms of data across all your systems and apps, including your documents,” Amazon wrote in a blog post.

Google unveils Gemini Enterprise to offer companies a more unified platform for AI innovation

Google is announcing a new offering built around Gemini, designed specifically with large enterprise use in mind.

Gemini Enterprise consolidates six core components:

Advanced Gemini models
A no-code workbench for analyzing information and orchestrating agents
Pre-built Google agents for tasks like deep research or data insights
The ability to connect to company data
A central governance framework for visualizing and securing all agents
Access to an ecosystem of over 100,000 industry partners

“By bringing all of these components together through a single interface, Gemini Enterprise transforms how teams work. It moves beyond simple tasks to automate entire workflows and drive smarter business outcomes, all on Google’s secure, enterprise-grade architecture,” Thomas Kurian, CEO of Google Cloud, wrote in a blog post.

Atlassian shares major updates to its genAI assistant Rovo at Team ’25 Europe

Atlassian is hosting its annual user conference Team ’25 Europe this week in Barcelona, and during the event, the company shared several new and upcoming updates to its generative AI assistant Rovo.

Atlassian announced the general availability of its AI coding agent Rovo Dev. Rovo Dev can help with code reviews, documentation, dependency cleanups, and more, and it leverages context from tickets, docs, incidents, and business goals to provide developers with information that can help them make more informed decisions.

Additionally, starting early next year, Rovo Search will become the default search in Jira, which will allow Jira’s search to suggest relevant issues and projects.

Rovo Chat will also be getting over 100 out-of-the-box modular capabilities from Atlassian and its partners that can be used in chat, agents, and workflows. Other new Chat capabilities include the ability to remember past conversations and preferences and a new collaborative workspace called Canvas.

Google launches ecosystem of extensions for Gemini CLI

Google is launching Gemini CLI extensions to allow different development tools to connect to the Gemini CLI.

Each extension includes a playbook that teaches the CLI how to effectively use that tool, eliminating the need for developers to configure them. “If you want to look under the hood, Gemini CLI extensions package instructions, MCP servers and custom commands into a familiar and user-friendly format,” Google wrote in a blog post.

Twenty-two extensions are available at launch from Google partners Atlassian, Canva, Confluent, Dynatrace, Elastic, Figma, GitLab, Grafana Labs, Harness, HashiCorp, MongoDB, Neo4j, Pinecone, Postman, Qodo, Shopify, Snyk, Sonar, Stripe, ThoughtSpot, Weights & Biases by CoreWeave, and WIX.

IBM adds new capabilities to watsonx Orchestrate to facilitate agentic AI at scale

As IBM kicked off its annual developer event TechXchange 2025, it announced several new capabilities to enable organizations to unlock value from agentic AI.

“There’s certainly been a lot of buzz in the industry,” said Bruno Aziza, vice president of Data, AI, and Analytics Strategy at IBM Software. “I think if you look at the context of everything that’s going on, customers are struggling. They’re struggling to get value from their investment.”

IBM announced many updates to its AI agent orchestration platform, watsonx Orchestrate. The platform now includes AgentOps, an observability and governance layer for AI agents; Agentic Workflows, standardized and reusable flows that can be used to build and sequence multi-agent systems; and Langflow integration to reduce agent setup time.

OpenAI DevDay: ChatGPT Apps, AgentKit, and GA release of Codex

OpenAI held its annual Developer Day event this week, where it announced several updates to its products.

The company unveiled apps in ChatGPT as well as an SDK for developers to build them. Companies that have created apps that are already available include Booking.com, Canva, Coursera, Figma, Expedia, Spotify, and Zillow.

When a user says the name of an available app in a prompt, ChatGPT will automatically surface that app in the chat. For example, saying “Spotify, make a playlist for my party this Friday” will bring in the Spotify app. ChatGPT will also be able to suggest apps when it thinks they’re relevant to the conversation, such as suggesting Zillow’s app in a conversation about buying a house.

Google’s coding agent Jules now works in the command line

Google’s coding agent Jules can now be used directly in developers’ command lines so that it can act as more of a coding companion.

According to Google, it created this new command line interface, called Jules Tools, out of a recognition that the terminal is where developers spend most of their time.

Jules Tools allows developers to spin up tasks, inspect what Jules is doing, and integrate Jules into automation. “Think of Jules Tools as both a dashboard and a command surface for your coding agent,” Google wrote in a blog post.

Amazon Bedrock AgentCore MCP server now available

The AgentCore MCP server offers built-in support for runtime, gateway integration, identity management, and agent memory. It was created to speed up the process of creating components that are compatible with Bedrock AgentCore.

“What typically takes significant time and effort, for example learning about Bedrock AgentCore services, integrating Runtime and Tools Gateway, managing security configurations, and deploying to production can now be completed in minutes through conversational commands with your coding assistant,” AWS wrote in a blog post.

DigitalOcean updates Gradient AI Platform

The Gradient AI Platform is a platform for building AI agents without needing to manage the underlying infrastructure. New features that have been added include support for image generation, auto-indexing of knowledge bases, and VPC integration.

Additionally, DigitalOcean revealed that it will be expanding the platform further in the next few weeks with new offerings like the Gradient AI Agent Development Kit and Gradient AI Genie, which integrates into IDEs and can be used to manage multi-agent systems using natural language.

Microsoft announces preview of its new Agent Framework

Microsoft has announced a preview of the Microsoft Agent Framework, an open-source development kit for .NET and Python for creating AI agents and multi-agent workflows.

It supports creating individual agents as well as graph-based workflows to connect multiple agents.

According to Microsoft, the Agent Framework is a direct successor to its other projects Semantic Kernel and AutoGen, building on foundations from both. It brings together Semantic Kernel’s enterprise-grade features like thread-based state management, type safety, filters, telemetry, and model and embedding support, with AutoGen’s abstractions for single- and multi-agent patterns.

Mendix updates its low-code platform with agentic AI features

New agent and genAI features include an agent builder, the ability to create project plans using generative AI, the ability to create microflows and workflows with AI, and support for MCP.

Another focus area of the release is business process automation, and new features related to that include the ability for Mendix Workflows to call AI agents, dynamic case management, and Global Inbox, a single view for all tasks from multiple distributed workflows.

California passes law to ensure safe innovation of frontier AI models

Earlier this week, California’s governor Gavin Newsom signed a new law designed to ensure safe development and deployment of frontier AI models.

“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said. “This legislation strikes that balance. AI is the new frontier in innovation, and California is not only here for it – but stands strong as a national leader by enacting the first-in-the-nation frontier AI safety legislation that builds public trust as this emerging technology rapidly evolves.”

The law, SB 53, establishes requirements for companies developing frontier AI models, spanning five categories: transparency, innovation, safety, accountability, and responsiveness.

Slack evolves to support agentic capabilities built on conversation data

Salesforce is announcing several major updates to Slack that will enable customers to leverage their conversation history for AI apps and agents.

The company is announcing a real-time search (RTS) API, which surfaces up-to-date discussions, files, and channels to give agents access to context-aware information. To ensure secure use of information, data stays in Slack, and the API adheres to existing user access permissions and only retrieves data that is relevant to the query.

“It unlocks your organization’s collective intelligence, securely connecting agents to conversations and decisions that were once trapped in silos,” Salesforce wrote in a blog post.

Anthropic claims its newly launched Claude Sonnet 4.5 is the “best coding model in the world”

Claude Sonnet 4.5 achieves 77.2% on SWE-bench for software engineering, compared to 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For external comparison, GPT-5 Codex scored 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Pro scored 67.2%.

Additionally, it leads in the OSWorld benchmark, which tests AI models on real-world computer tasks. It scored 61.4% on that benchmark, beating out Claude Sonnet 4, which scored 42.2%.

“Sonnet 4.5 can produce near-instant responses or extended, step-by-step thinking that’s made visible to the user,” Anthropic says.

According to Anthropic, Claude Sonnet 4.5 also shows better domain-specific knowledge and reasoning in the fields of finance, law, and medicine.

Workato announces MCP platform

Workato Enterprise MCP provides customers with access to over 100 fully managed MCP servers that can connect with different LLMs and agents, including ChatGPT, Claude.AI, Amazon Q, Cursor, and Google Gemini. Some of the MCP servers available in the platform include ones from Atlassian, Box, Reddit, Salesforce, Okta, and Shopify.

“At Workato, we hear every day that while MCP is exciting, enterprises still face challenges making MCP work securely, effectively, and reliably at scale,” said Adam Seligman, Chief Technology Officer at Workato. “Workato Enterprise MCP changes that by bringing the full spectrum of business processes, from the front office to the back office and everything in between, to AI agents through MCP. With pre-built, enterprise-grade servers and skills, we’re giving global enterprises a first-of-its-kind solution that unlocks AI agents to safely execute real business processes at scale, delivering measurable business value.”

VibeSec embeds security analysis into AI coding models to prevent generation of insecure code

OX Security is shifting security as far left as it can go with the launch of VibeSec, which it says can stop insecure AI-generated code before the code even gets generated.

It does this by embedding dynamic security context into the coding model so that it doesn’t suggest code that contains security issues.

“VibeSec doesn’t just accelerate security – it fundamentally changes how security operates. For the first time, security moves faster than vulnerabilities,” said Neatsun Ziv, co-founder and CEO at OX Security.

OutSystems launches Agent Workbench

Agent Workbench allows users to create and orchestrate AI agents that leverage their company’s data sets and workflows. For example, in early access, Axos Bank built a log analysis agent to interpret error logs, and Thermo Fisher Scientific used it to build a Customer Escalation Agent that interprets unstructured data from customer interactions.

“Agent Workbench was created to give our customers the tools they need to build the agentic future with OutSystems. Our Early Access Program participants have realized impressive results with Agent Workbench, positioning them as industry leaders in agentic AI,” said Woodson Martin, CEO of OutSystems.