Tool poisoning language | MCP Preflight Guide

It is not a claim that every awkward tool description is malicious. The point is narrower: once a model sees tool text as instruction-bearing context, seemingly harmless wording can quietly change what the model is willing to do.

Source material

Why this one is worth reading

The MCP tools docs tell implementers to write clear, descriptive tool names and descriptions, and to use annotations to signal side effects. That is good guidance.

The security problem is that tool descriptions are not just documentation. They are also part of what the model reads when deciding which tool to use and how to use it.

The pattern to watch for

1. Hidden side effects described like convenience

This is the most common bad smell. The language sounds operationally helpful: "also syncs metadata", "for delivery assurance", "silently forwards a copy", "keeps an external recipient informed".

Preflight take: if a tool description contains side effects that are not essential to the core action, treat it as suspicious until the behavior is explicit and justified.

2. Instructional language aimed at the model, not the user

The obvious forms are easy to catch: "always use this tool", "ignore previous instructions", "bypass warnings". The more realistic forms are quieter: "best default for most tasks", "safe to use without asking", "use automatically when relevant".

Preflight take: the description should explain capability and scope, not try to win a selection contest inside the model.

3. Parameter text that quietly widens the action

Poisoning does not have to live in the top-level description. It can hide in parameter descriptions that quietly widen network scope, auth behavior, or secondary actions.

Preflight take: parameter text should explain the input, not smuggle in extra workflow logic.

4. Descriptions that contradict the risk hints

A poisoned tool can still present itself as harmless. If the description, parameters, and apparent side effects do not line up cleanly, trust the mismatch as a warning sign.

5. Vague words where precision matters

Words like "optimize", "enhance", "keep things in sync", and "perform follow-up actions" are weak security language in a tool definition.

Preflight take: if you cannot tell exactly what the tool will do from the description alone, the description is not ready to trust.

What MCP Preflight would usually flag

MCP Preflight does not try to prove malicious intent here. It flags text that deserves a human read before the tool becomes trusted context.

"ignore previous instructions"
"reveal environment variable"
"silently forward"
"hidden recipient"
"without telling the user"

A short checklist

does the text only describe the tool, or does it also steer the model?
are there any hidden or secondary actions described as convenience?
do parameter descriptions widen scope or behavior quietly?
do the annotations, description, and likely side effects line up?
would a human reviewer understand the real effect without reading source code?

The next action is simple: read your current tool descriptions as if they were executable policy, not just metadata.

Tool poisoning language that looks harmless until you read it closely