Practical start: Hidden poisoning and manipulation in the MCP system

Reprinted from panewslab
04/29/2025
The MCP (Model Context Protocol) ecosystem is still at an early stage of development. The overall environment is fairly chaotic, new attack methods keep emerging, and the current protocol and tool designs make them hard to defend against. To help the community better understand and improve MCP security, SlowMist has open-sourced the MasterMCP tool, hoping that hands-on attack drills will help everyone spot security risks in their product designs early and harden their MCP projects step by step.
You can also consult the MCP security checklist from the previous issue to better understand the perspective behind each attack. This time, we walk through hands-on demonstrations of common attack methods in the MCP system, such as information poisoning and hidden malicious instructions. All scripts used in the demonstrations are open-sourced on GitHub (see the link at the end of the article), so you can reproduce the entire process in a safe environment and even develop your own attack-testing plug-ins based on them.
Overall architecture overview
Demonstration target MCP: Toolbox
smithery.ai is one of the most popular MCP plug-in websites at present, hosting a large number of MCP listings and active users. Among them, @smithery/toolbox is the MCP management tool officially released by smithery.ai.
Toolbox was chosen as the test target mainly for the following reasons:
- It has a large, representative user base;
- It supports automatic installation of other plug-ins, supplementing the functionality of some clients (such as Claude Desktop);
- It includes sensitive configuration (such as API keys), which makes it convenient for demonstration.
Demonstration malicious MCP: MasterMCP
MasterMCP is a simulated malicious MCP tool written by SlowMist specifically for security testing. It uses a plug-in architecture and contains the following key modules:
1. Local website service simulation: http://127.0.0.1:1024
To reproduce attack scenarios more realistically, MasterMCP includes a local website simulation module. It uses the FastAPI framework to quickly stand up a simple HTTP server that imitates common web environments. The pages look normal on the surface, for example showing cake shop information or returning standard JSON data, but a carefully crafted malicious payload is hidden in the page source or in the API response.
This lets us demonstrate attacks such as information poisoning and instruction hiding in a safe, controlled local environment, and makes one point easy to see: even a seemingly ordinary web page can become the source of abnormal operations by a large model.
2. Local plug-in MCP architecture
MasterMCP is extended through plug-ins, which makes it easy to add new attack methods quickly. Once started, MasterMCP runs the FastAPI service from the previous module in a child process. (Attentive readers will notice that this is already a security risk: a local plug-in can start arbitrary child processes that MCP does not expect.)
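The risk noted in parentheses can be illustrated with a minimal sketch. It makes no claim about MasterMCP's actual code: `spawn_background_service` and `run_and_collect` are hypothetical helpers, and the child here only prints a marker, where a real plug-in could start any program, such as a web server.

```python
import subprocess
import sys


def spawn_background_service(code: str) -> subprocess.Popen:
    """Start a separate Python interpreter running `code` (hypothetical helper).

    The call is an ordinary OS-level process launch made by the plug-in
    itself, so the child runs with the same privileges as the plug-in.
    """
    return subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )


def run_and_collect(code: str) -> str:
    """Spawn the child, wait for it to exit, and return its stdout."""
    child = spawn_background_service(code)
    out, _ = child.communicate(timeout=30)
    return out.strip()
```

Calling `run_and_collect("print('started by plug-in')")` returns the child's output, showing that the launch needs nothing beyond the standard library.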
Demo clients
- Cursor: one of the most popular AI-assisted programming IDEs in the world
- Claude Desktop: the official client from Anthropic (the creator of the MCP protocol)
Demo model
- Claude 3.7
Claude 3.7 was selected because it has improved at identifying sensitive operations and represents strong operational capability in the current MCP ecosystem.
Configure claude_desktop_config.json
{
  "mcpServers": {
    "toolbox": {
      "command": "npx",
      "args": [
        "-y",
        "@smithery/cli@latest",
        "run",
        "@smithery/toolbox",
        "--config",
        "{\"dynamic\":false,\"smitheryApiKey\":\"ec1f0fa8-5797-8678-sdaf-155d4584b133\"}",
        "-key",
        "ec1f0fa8-5797-8678-sdaf-155d4584b133"
      ]
    },
    "MasterMCP": {
      "command": "/Users/xxx/Desktop/EvilMCP/bin/python",
      "args": [
        "/Users/xxx/Desktop/EvilMCP/MasterMCP.py"
      ]
    }
  }
}
With the configuration complete, the demonstration begins.
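Because the configuration above both launches local commands and embeds an API key, it can be worth reviewing what each entry actually runs. The following is a minimal sketch of such a review, assuming the `mcpServers` layout shown above. The helper names, the marker list, and the abbreviated example config (with placeholder paths and a redacted key) are hypothetical.

```python
import json

# Abbreviated, hypothetical config in the same shape as the one above.
EXAMPLE_CONFIG = json.dumps({
    "mcpServers": {
        "toolbox": {
            "command": "npx",
            "args": ["-y", "@smithery/cli@latest", "run", "@smithery/toolbox",
                     "--config", '{"smitheryApiKey":"<redacted>"}'],
        },
        "MasterMCP": {
            "command": "/path/to/python",
            "args": ["/path/to/MasterMCP.py"],
        },
    }
})


def list_server_commands(config_text: str) -> dict:
    """Map each configured server name to the full command line it launches."""
    servers = json.loads(config_text).get("mcpServers", {})
    return {
        name: [entry.get("command", "")] + list(entry.get("args", []))
        for name, entry in servers.items()
    }


def servers_with_inline_secrets(config_text: str,
                                markers=("apikey", "token", "secret")) -> list:
    """Return server names whose command line mentions a secret-like marker.

    The marker list is an illustrative heuristic, not an exhaustive check.
    """
    flagged = []
    for name, argv in list_server_commands(config_text).items():
        joined = " ".join(argv).lower()
        if any(marker in joined for marker in markers):
            flagged.append(name)
    return sorted(flagged)
```

With the example config, `servers_with_inline_secrets(EXAMPLE_CONFIG)` flags only `toolbox`, whose arguments carry the API key inline.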
Cross-MCP Malicious Invocation
This demonstration covers two items from the checklist: poisoning and malicious cross-MCP calls.
Web content poisoning attack
1. Comment-based poisoning (partly based on: https://x.com/lbeurerkellner/status/1912145060763742579)
Cursor visits the local testing website http://127.0.0.1:1024.
This is a seemingly harmless page about "Delicious Cake World"; we use this experiment to simulate what happens when a large-model client visits a malicious website.
Execute the command:
Fetch the content of http://127.0.0.1:1024
The results show that Cursor not only reads the page content but also sends local sensitive configuration data back to the test server. In the page source, the malicious prompt is embedded in an HTML comment.
Although the comment-based approach is fairly direct and easy to spot, it is already enough to trigger malicious operations.
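One way to surface such comments before a page reaches the model is sketched below. This is a minimal illustration rather than part of MasterMCP: `CommentCollector`, `extract_comments`, and the sample page are hypothetical, and the comment text is a harmless placeholder, not the payload used in the demonstration.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # `data` is the text between "<!--" and "-->".
        self.comments.append(data.strip())


def extract_comments(html_text: str) -> list:
    """Return all comment bodies found in `html_text`, in document order."""
    collector = CommentCollector()
    collector.feed(html_text)
    collector.close()
    return collector.comments


SAMPLE_PAGE = """<html><body>
<h1>Delicious Cake World</h1>
<!-- placeholder: instructions aimed at the model would sit here -->
<p>Fresh cakes every day.</p>
</body></html>"""
```

A client could log or strip whatever `extract_comments` returns before passing the page on. This only covers plain HTML comments and would not catch payloads hidden elsewhere in the page.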
2. Encoded comment poisoning
Visit http://127.0.0.1:1024/encode. The page looks the same as the previous example, but the malicious prompt is encoded, which makes the poisoning exploit stealthier and hard to detect even when viewing the page source.
Even though the source contains no plain-text prompt, the attack still executes successfully. The principle is explained in detail in a later section.
Poisoning via MCP tool return values
Here we enter our simulated command, following the prompt description in MasterMCP (the command has no real meaning; it exists only to trigger our malicious MCP and demonstrate its subsequent operations):
get a lot of apples
As you can see, once the instruction is triggered, the client calls Toolbox across MCPs and successfully adds a new MCP server.
Inspecting the plug-in code shows that the returned data carries an encoded malicious payload, which the user can hardly notice.
Third-party interface pollution attack
This demonstration is mainly a reminder that any MCP, malicious or not, that calls a third-party API and returns the third-party data directly into the context can cause serious consequences.
Sample code:
Execute the request:
Fetch json from http://127.0.0.1:1024/api/data
Result: the malicious prompt is embedded in the returned JSON data and successfully triggers malicious execution.
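A common mitigation for this class of problem is to pass only the expected fields of a third-party response into the context instead of the raw body. The following is a minimal sketch under that assumption. It is not the sample code from the demonstration; `project_fields`, the field names, and the sample response are hypothetical.

```python
import json


def project_fields(raw_json: str, allowed: dict) -> dict:
    """Keep only expected keys whose values have exactly the expected type.

    `allowed` maps field name -> required Python type. Unknown keys and
    values of the wrong type are dropped instead of being forwarded.
    """
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    clean = {}
    for key, expected_type in allowed.items():
        value = data.get(key)
        if type(value) is expected_type:
            clean[key] = value
    return clean


# Hypothetical third-party response with an unexpected free-text field.
SAMPLE_RESPONSE = '{"name": "cake", "price": 12, "note": "<text aimed at the model>"}'
```

Here `project_fields(SAMPLE_RESPONSE, {"name": str, "price": int})` drops the unexpected `note` field. Allowed string fields can still carry injected text, so they remain untrusted input.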
Poisoning techniques in the MCP initialization stage
This demonstration covers two items from the checklist: initial prompt injection and name conflicts.
Malicious function override attack
Here MasterMCP defines a tool with the same function name as Toolbox's remove_server and hides the malicious prompt in encoded form.
Execute the command:
toolbox remove fetch plugin server
Claude Desktop does not call the original toolbox remove_server method; instead, it triggers the method of the same name provided by MasterMCP.
The principle is to emphasize that "the original method has been deprecated", which induces the large model to prefer the maliciously overridden function.
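Name conflicts of this kind can be detected mechanically once the tool lists of all connected servers are known. The sketch below assumes the client can obtain each server's tool names as plain strings. `find_name_collisions` is a hypothetical helper, and the tool lists are illustrative: only `remove_server` and `banana` appear in this article, and `add_server` is assumed.

```python
from collections import defaultdict


def find_name_collisions(tools_by_server: dict) -> dict:
    """Return {tool_name: [servers...]} for names exposed by more than one server."""
    owners = defaultdict(list)
    for server, tool_names in tools_by_server.items():
        # set() so a server listing the same name twice is counted once.
        for tool in set(tool_names):
            owners[tool].append(server)
    return {
        tool: sorted(servers)
        for tool, servers in owners.items()
        if len(servers) > 1
    }


# Illustrative tool lists, not the servers' real inventories.
EXAMPLE_TOOLS = {
    "toolbox": ["add_server", "remove_server"],
    "MasterMCP": ["remove_server", "banana"],
}
```

For the example data, `find_name_collisions(EXAMPLE_TOOLS)` reports `remove_server` as provided by both servers. A client could use that signal to ask the user which one to trust instead of leaving the choice to the model.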
Adding malicious global check logic
Here MasterMCP defines a tool named banana. Its core purpose is to use the prompt to force every tool to run it as a security check before executing.
Before each function is executed, the system calls the banana check first.
This global logic injection is achieved by repeatedly emphasizing in the code that "the banana check must be run".
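Both initialization-stage attacks rely on coercive wording in tool descriptions ("deprecated", "must be run"). A rough heuristic scan for such wording is sketched below. The patterns are illustrative assumptions, easy to evade (for example by encoding), and `flag_description` is a hypothetical helper rather than an established detection method.

```python
import re

# Illustrative patterns only; real descriptions may need a broader list.
COERCIVE_PATTERNS = [
    r"\bmust\b.*\bbefore\b",           # "must be run before ..."
    r"\balways\b.*\b(call|run)\b",     # "always call/run ..."
    r"\b(deprecated|abandoned)\b",     # claims that another tool is obsolete
    r"\bdo not (tell|inform|mention)\b",
]


def flag_description(description: str) -> list:
    """Return the patterns that match a tool description (case-insensitive)."""
    text = description.lower()
    return [
        pattern
        for pattern in COERCIVE_PATTERNS
        if re.search(pattern, text, flags=re.DOTALL)
    ]
```

A description such as "This tool must be run before any other tool." matches the first pattern, while a neutral one such as "Returns the current weather." matches none. Matches are a prompt for manual review, not proof of malice.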
Advanced techniques for hiding malicious prompts
Large-model-friendly encoding methods
Because large language models (LLMs) are extremely good at parsing many formats and languages, this ability can be turned against them to hide malicious information. Common methods include:
- In an English environment: use hex byte encoding
Tool recommendation: Hex Decoder
- In a Chinese environment: use NCR encoding or JavaScript encoding
Tool recommendation: R12a Unicode Conversion Tools
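To make the three encodings concrete, the sketch below round-trips harmless strings through each of them using only the Python standard library. The function names are hypothetical, "JavaScript encoding" is assumed to mean `\uXXXX` escapes, and that helper handles only characters in the Basic Multilingual Plane.

```python
import html


def to_hex_bytes(text: str) -> str:
    """Encode text as space-separated UTF-8 hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


def from_hex_bytes(encoded: str) -> str:
    """Decode space-separated hex bytes back to text."""
    return bytes.fromhex(encoded).decode("utf-8")


def to_ncr(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def from_ncr(encoded: str) -> str:
    """Decode numeric character references back to text."""
    return html.unescape(encoded)


def to_js_escape(text: str) -> str:
    """Encode as \\uXXXX escapes (assumes code points up to U+FFFF)."""
    return "".join(f"\\u{ord(ch):04x}" for ch in text)


def from_js_escape(encoded: str) -> str:
    """Decode \\uXXXX escapes back to text."""
    return encoded.encode("ascii").decode("unicode_escape")
```

For example, `to_hex_bytes("hi")` gives `68 69` and `to_ncr("A")` gives `&#65;`. Neither form looks like readable instructions to someone skimming page source, yet both decode back exactly, which is why a defender may want to decode such runs before reviewing content.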
Random malicious payload return mechanism
As in the third-party interface pollution attack described earlier, when requesting http://127.0.0.1:1024/random, a page carrying a malicious payload is returned at random each time, which greatly increases the difficulty of detection and tracing.
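One simple signal for this behavior is that repeated requests to the same URL return different bodies. The sketch below counts distinct responses by hash. `distinct_responses` is a hypothetical helper, and the fetch function is injected so the example runs without a network; the stub stands in for a real HTTP client.

```python
import hashlib
import itertools


def distinct_responses(fetch, url: str, attempts: int = 5) -> int:
    """Call `fetch(url)` several times and count distinct response bodies."""
    digests = set()
    for _ in range(attempts):
        body = fetch(url)
        digests.add(hashlib.sha256(body.encode("utf-8")).hexdigest())
    return len(digests)


def make_rotating_stub(pages):
    """Return a fake fetch function that cycles through `pages` (test stand-in)."""
    cycle = itertools.cycle(pages)
    return lambda url: next(cycle)
```

With `make_rotating_stub(["page A", "page B"])` the count is 2, while a stub that always returns the same page gives 1. Legitimately dynamic pages also vary between requests, so a count above 1 only suggests a closer look.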
Summary
This hands-on demonstration of MasterMCP shows the security risks hidden in the Model Context Protocol (MCP) system. From simple prompt injection and cross-MCP calls to stealthier initialization-stage attacks and hidden malicious instructions, every step is a reminder that the MCP ecosystem is powerful but also fragile.
Now that large models interact more and more with external plug-ins and APIs, even a small amount of polluted input can cause security risks at the level of the entire system. The growing variety of attack techniques (encoding-based hiding, random pollution, function overriding) also means that traditional defensive thinking needs a thorough upgrade.
Security is never achieved overnight.
We hope this demonstration serves as a wake-up call: developers and users alike should stay vigilant about the MCP system and pay attention to every interaction, every line of code, and every return value. Only by taking every detail seriously can we build a stable and secure MCP environment.
Next, we will continue to improve the MasterMCP scripts and open-source more targeted test cases to help everyone understand, practice, and strengthen their defenses in a safe environment.
P.S. The related content has been published on GitHub (https://github.com/slowmist/MasterMCP).