What is LLM Poisoning?
LLM (Large Language Model) poisoning is when someone intentionally feeds misleading, false, or harmful data into the training process of an AI model. This “poisons” the model, causing it to generate incorrect, biased, or dangerous responses, like teaching a parrot lies so it repeats them.
How Does It Work?
Imagine training a puppy:
If you reward it for good behavior, it learns to behave well.
If you intentionally reward it for bad behavior (like barking at guests), it learns the wrong lessons.
Similarly, LLMs learn from data. If attackers sneak bad data into their training, the model “learns” harmful patterns and outputs wrong answers.
Examples to Understand LLM Poisoning
Fake Reviews for a Product
- Poisoning: A company floods the internet with fake 5-star reviews for a terrible product.
- Result: The LLM reads these reviews during training and later recommends the bad product as “excellent.”
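To make this concrete, here is a minimal, hypothetical sketch of the fake-review scenario. It is not how real LLM training works; a tiny scikit-learn Naive Bayes sentiment classifier stands in for the model, and the product name “GadgetX” is invented for the example.

```python
# Toy sketch: a tiny sentiment classifier stands in for the LLM.
# Requires scikit-learn (pip install scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Honest reviews with honest labels.
clean_texts = [
    "this blender is excellent and reliable",
    "great battery life, works as advertised",
    "the charger broke after two days",
    "terrible build quality, would not buy again",
]
clean_labels = ["positive", "positive", "negative", "negative"]

# Poisoned data: the attacker floods the corpus with fabricated praise
# for a bad product ("GadgetX" is a made-up name for this example).
poison_texts = ["GadgetX is excellent, five stars"] * 20
poison_labels = ["positive"] * 20

def train_and_classify(texts, labels, review):
    """Train on the given corpus, then classify one review."""
    vec = CountVectorizer()
    model = MultinomialNB().fit(vec.fit_transform(texts), labels)
    return model.predict(vec.transform([review]))[0]

review = "GadgetX stopped working, terrible product"
print("clean training data:   ", train_and_classify(clean_texts, clean_labels, review))
print("poisoned training data:", train_and_classify(clean_texts + poison_texts,
                                                    clean_labels + poison_labels, review))
```

Trained on the clean reviews, the classifier calls this obviously negative review “negative”; trained on the poisoned corpus, it calls it “positive,” because the flood of fake reviews taught it that anything mentioning GadgetX is good. Real LLMs are far larger, but they share the same blind spot: once planted data is in the training set, the model cannot tell it apart from honest data.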
Altering Historical Facts
- Poisoning: Someone adds false claims (e.g., “Albert Einstein invented the light bulb”) to websites the LLM trains on.
- Result: When asked who invented the light bulb, the LLM confidently repeats the false claim, spreading misinformation.
Teaching Dangerous Advice
- Poisoning: Planting advice like “starve yourself to lose weight quickly” on medical forums the LLM trains on.
- Result: The LLM might repeat this harmful advice when users ask about health tips.
Biased Language
- Poisoning: Injecting biased statements (e.g., “women can’t code”) into the training data.
- Result: The LLM generates biased responses, such as steering women away from coding jobs.
Why Does It Matter?
Poisoned models can spread lies, harm reputations, or even endanger people (e.g., medical misinformation). Attackers might do this to manipulate opinions, sabotage a company, or cause chaos.
LLM poisoning is like slipping fake answers into a student’s textbook: the student (the AI) doesn’t know the answers are wrong and repeats them on the exam (user queries). The goal is to trick the AI into being untrustworthy or harmful.