AI Security Audit


Modulus performs independent assessments of the AI and machine learning systems that institutions use in production, measured against the standards that courts, regulators, and auditors recognize.

AI changes how systems can fail. AI models learn statistical patterns from data, follow instructions written in plain language, and increasingly act on their own. Each of those properties is also an attack surface, and most AI systems fall outside what traditional application security testing can audit.

Modulus has built machine learning and natural language systems for nearly three decades and holds foundational LLM patents. We help you uncover vulnerabilities in your AI systems that you do not yet know about, and we provide a clear plan to mitigate them.


A new attack surface

Conventional security testing assumes software does what its code says. A machine learning model does not work that way. Its behavior is learned from data, expressed through millions of parameters, and shaped at runtime by whatever input it receives. There is no line of code to read that tells you how the model will respond to an input no one anticipated.

That creates failure modes a code review or a penetration test will not surface. The data a model learned from can be tampered with. The model itself can be copied or reverse-engineered. The instructions it follows can be hijacked by ordinary-looking text. An AI security audit examines the system as a whole: the data, the model, its inputs, its outputs, and the actions it is permitted to take.

Where risk enters the AI lifecycle

Figure: A machine learning workflow, from data collection through model training to deployment.

An AI system is a pipeline, not a single program. Data is collected and prepared, a model is trained or fine-tuned, the model is deployed behind an interface, and its output flows into other systems. Risk can enter at any stage, and a weakness early in the pipeline often becomes a larger problem once the system reaches production.

Mapping that full lifecycle is the first step of an audit. It shows where untrusted data enters, where a third party is trusted, where sensitive information is exposed, and where the model's output is acted on without review.

Training and fine-tuning data, where poisoning and backdoors are introduced
The model, which can be copied, inverted, or made to leak its training data
Inputs at runtime, the vector for prompt injection and evasion
Outputs, which become dangerous when passed unchecked into other systems
Integrations and agents, where the model is allowed to take real actions
Third-party models and data, which carry risk you did not create

Poisoned data and untrusted models

Training data poisoning plants malicious examples in the data a model learns from, so the model behaves normally almost everywhere and fails only on the inputs an attacker chooses. Published research has planted a working backdoor by altering a few hundred images out of three million, and has shown that a model trained to misbehave on a specific trigger can pass standard safety testing without revealing the flaw.

For a bank or an exchange, this matters most in the models that decide what is normal. A poisoned fraud, anti-money-laundering, or market-surveillance model can be shaped to overlook the precise activity an attacker intends to run. Because most institutions build on third-party base models, datasets, and libraries, provenance becomes part of the audit: what went into a model, where it came from, and whether it can be trusted.

Evasion: a change too small to see

An adversarial example is an input altered just enough to change a model's decision while looking unchanged to a person. The change is computed against the model itself, so the result is deliberate rather than random.

This is the attack that matters most for detection. A fraud, anti-money-laundering, or surveillance model exists to separate normal activity from abnormal, and an evasion attack is engineered to push the abnormal back across that line. The same technique has fooled image classifiers, malware detectors, and intrusion-detection systems in published research.
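As a minimal sketch of how such a change is computed against the model, the example below applies a sign-of-gradient step (the idea behind the fast gradient sign method in published research) to a toy linear fraud scorer. The weights, the feature values, and the names `fraud_score` and `evade` are hypothetical and chosen only for illustration; a production model would be far larger, and the attacker would need some access to it or to a substitute.

```python
import math

# Hypothetical toy model: a logistic scorer over three transaction features.
WEIGHTS = [1.2, -0.8, 2.0]
BIAS = -1.0


def fraud_score(features):
    """Return the toy model's probability that a transaction is fraudulent."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def evade(features, epsilon):
    """Shift each feature by at most epsilon in the direction that lowers the score.

    For a linear model, the gradient of the logit with respect to each feature
    is that feature's weight, so stepping against the sign of the weight
    lowers the score as fast as possible for a fixed per-feature budget.
    """
    return [x - epsilon * math.copysign(1.0, w) for w, x in zip(WEIGHTS, features)]


original = [0.9, 0.2, 0.4]
adversarial = evade(original, epsilon=0.2)

print(f"original score:    {fraud_score(original):.3f}")
print(f"adversarial score: {fraud_score(adversarial):.3f}")
```

With these assumed numbers, the original input scores above 0.5 and the perturbed input scores below it, even though no feature moved by more than 0.2. The perturbation is aimed using the model's own weights, which is why the result is deliberate rather than random.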

How a Modulus audit works

An engagement begins by inventorying your AI systems and the data, models, and integrations behind them, then building a threat model specific to how each system is used. From there our team tests the system the way an adversary would, combining manual review with adversarial techniques drawn from published research and the MITRE ATLAS knowledge base.

Every finding is mapped to a recognized framework, rated by impact and likelihood, and paired with a concrete remediation step your engineers can act on. The result is not a checklist. It is a clear account of how your AI can be misused and what to change first.

Inventory and threat modeling of each AI system in scope
Review of training data, fine-tuning, and model provenance
Adversarial testing for prompt injection, evasion, and data leakage
Assessment of agent permissions, tool access, and output handling
Review of third-party model and data dependencies
Measurement against OWASP, NIST AI RMF, and ISO/IEC 42001
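The impact-and-likelihood rating described above can be sketched as a simple score used to order the remediation list. This is an illustration only: the 1 to 5 scales, the multiplication rule, the `Finding` class, and the sample findings are all assumptions, not the scoring method Modulus uses.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    impact: int         # hypothetical scale: 1 (minor) to 5 (severe)
    likelihood: int     # hypothetical scale: 1 (rare) to 5 (expected)
    framework_ref: str  # the framework entry the finding is mapped to

    @property
    def risk(self) -> int:
        # Assumed rule: risk is the product of impact and likelihood.
        return self.impact * self.likelihood


def prioritize(findings):
    """Return findings ordered from highest to lowest risk."""
    return sorted(findings, key=lambda f: f.risk, reverse=True)


# Hypothetical sample findings for illustration.
findings = [
    Finding("Assistant follows instructions hidden in uploaded PDFs", 4, 4,
            "OWASP Top 10 for LLM Applications: Prompt Injection"),
    Finding("Agent can initiate transfers without approval", 5, 2,
            "OWASP Top 10 for LLM Applications: Excessive Agency"),
    Finding("Base model downloaded without a pinned checksum", 3, 2,
            "OWASP Top 10 for LLM Applications: Supply Chain"),
]

for f in prioritize(findings):
    print(f"{f.risk:>2}  {f.title}  [{f.framework_ref}]")
```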

The risks an audit covers

Modulus tests AI systems against the recognized catalog of large language model and machine learning risks, including the OWASP Top 10 for LLM Applications. Each item below is a documented failure mode, not a hypothetical.

Prompt injection

Crafted input, sometimes hidden inside a document or web page the model reads, overrides its instructions. A model cannot reliably tell a developer's commands from data, so any untrusted content it ingests is a potential entry point.

Sensitive data disclosure

Models trained or prompted on confidential data can reveal it, and staff pasting records into external tools send regulated data outside your control. Both break data-handling obligations a financial institution is held to.

Data and model poisoning

Tampered training data plants errors or hidden backdoors that activate only on specific inputs. A poisoned detection model can be made blind to the exact behavior an attacker plans to use.

Adversarial examples

Small, often imperceptible changes to an input cause a model to misclassify it. Evasion attacks target precisely the fraud, AML, and intrusion-detection systems whose job is to flag the abnormal.

Data leakage from the model

Model inversion and membership inference reconstruct training data or confirm that a specific record was used. A model trained on customer data can become a channel for exposing it.

Improper output handling

When model output is passed unchecked into code, a database query, or a browser, prompt injection upstream becomes SQL injection or code execution downstream. Text-to-SQL and text-to-code features make this common.
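A minimal sketch of the downstream half of this problem, using Python's standard `sqlite3` module. The table, the `model_output` string, and the two lookup functions are hypothetical; the point is only the difference between splicing model output into SQL text and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT)")
conn.executemany("INSERT INTO accounts (owner) VALUES (?)", [("alice",), ("bob",)])

# Hypothetical model output: a value the model produced after reading
# attacker-controlled text.
model_output = "alice' OR '1'='1"


def lookup_unsafe(owner):
    # The model output becomes part of the SQL text, so it can rewrite the query.
    query = f"SELECT owner FROM accounts WHERE owner = '{owner}'"
    return conn.execute(query).fetchall()


def lookup_safe(owner):
    # The driver binds the value as data; it cannot change the query structure.
    return conn.execute(
        "SELECT owner FROM accounts WHERE owner = ?", (owner,)
    ).fetchall()


print("unsafe:", lookup_unsafe(model_output))
print("safe:  ", lookup_safe(model_output))
```

In the unsafe version the injected condition matches every row in the table. In the safe version the same string is treated as an owner name that matches nothing. Parameter binding addresses only this one sink; output passed to a shell, an interpreter, or a browser needs its own handling.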

Excessive agency

An agent given broad permissions can be steered by manipulated output into taking real actions. The damage scales with what the agent is allowed to do, from reading data to moving money.
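One common way to bound that damage is to check every tool call the agent proposes against an explicit policy before it runs. The sketch below assumes a hypothetical set of tool names and a hypothetical `authorize` function; it illustrates the idea and is not a complete authorization design.

```python
from dataclasses import dataclass, field

# Hypothetical policy: the tools the agent may call, and those that need
# a human to approve each call.
ALLOWED_TOOLS = {"read_balance", "list_transactions", "transfer_funds"}
REQUIRES_APPROVAL = {"transfer_funds"}


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


def authorize(call, approved_by_human=False):
    """Decide what happens to a tool call proposed by the model."""
    if call.name not in ALLOWED_TOOLS:
        return "deny"
    if call.name in REQUIRES_APPROVAL and not approved_by_human:
        return "hold_for_approval"
    return "allow"


print(authorize(ToolCall("read_balance", {"account": "A-1"})))
print(authorize(ToolCall("transfer_funds", {"amount": 5000})))
print(authorize(ToolCall("delete_audit_log")))
```

The decision is made outside the model, so a manipulated model can still propose a harmful call but cannot grant itself permission to run it.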

Supply chain risk

Base models, datasets, embeddings, and libraries pulled from third parties can arrive already compromised. Provenance is difficult to verify by inspection alone.
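One basic control is to pin a cryptographic hash of each third-party artifact and refuse to load a file that does not match. The sketch below uses Python's standard `hashlib`; the file name, the function names, and the idea that a pinned hash is available from the publisher are assumptions. A matching hash shows only that the file is the one that was pinned, not that its contents are trustworthy.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file's SHA-256 matches the pinned value."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())


# Demonstration with a stand-in file; a real check would use the hash
# recorded when the artifact was first reviewed.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.bin"  # hypothetical artifact name
    artifact.write_bytes(b"example model weights")
    pinned = sha256_of(artifact)
    ok_before = verify_artifact(artifact, pinned)

    artifact.write_bytes(b"tampered model weights")
    ok_after = verify_artifact(artifact, pinned)

print("matches before tampering:", ok_before)
print("matches after tampering: ", ok_after)
```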

Hallucination

A model can state false information as fact. In disclosures, suitability assessments, or customer guidance, a fabricated figure or citation becomes a misstatement the institution, not the vendor, is liable for.

Scope, standards, and deliverables

What a Modulus AI security audit examines, the frameworks it maps to, and what your team receives.

What we examine

Large language model applications and assistants
Machine learning and detection models
Training, fine-tuning, and retrieval data pipelines
Prompts, embeddings, and vector stores
Agents, tools, and downstream integrations
Third-party model providers and self-hosted models

Frameworks we map to

OWASP Top 10 for LLM Applications
NIST AI Risk Management Framework
MITRE ATLAS adversarial technique catalog
ISO/IEC 42001 AI management systems
EU AI Act risk tiers

What you receive

A threat model for each system in scope
Findings rated by impact and likelihood
Every finding mapped to a framework
A prioritized remediation roadmap
An executive summary for risk and compliance
An optional re-test after remediation

Models and platforms we assess

ChatGPT
Claude
Gemini
Llama
Mistral
Grok
DeepSeek
Hugging Face

Related capabilities

AI security is one part of how Modulus builds and runs production AI. Explore the rest of the practice.

Audit your AI.

Understand where your AI systems are exposed, and how to close the gaps, before those gaps reach production. Get started, or arrange a call with our team.
