DeepSeek Will Teach You How to Produce Chemical Weapons, Pressure Your Coworker into Sex, and Plan a Terrorist Attack on Dallas Airport

February 7, 2025

Want to manufacture chemical weapons using household items, develop a self-replicating rootkit, write an essay on why Hiroshima victims deserved their fate, get a step-by-step guide to pressuring your coworker into sex, or plan a terrorist attack on an airport using a drone laden with home-made explosives (in any order)? No big deal. There’s no need to scour the Darknet for hacking zines, chemistry manuals, Al-Qaeda playbooks, or Kaczynski’s manifesto anymore.

All you need is a Chinese AI.

DeepSeek is a catastrophically broken model with non-existent, typical shoddy Chinese safety measures that take 60 seconds to dismantle. Being a massive 700B LLM, it can provide all kinds of detailed, correct, non-hallucinated, illegal, harmful, and deadly instructions you can imagine.

What makes DeepSeek especially dangerous is the sheer scale of data it has absorbed, enabling it to generate precise, tailored responses for almost any scenario. In some cases, it even demonstrated a disturbing level of ingenuity, producing malware infection attack vectors I had never encountered before.

I won’t provide exactly how to bypass its meager security measures, but unfortunately, the information is already publicly available. And not in some 2600, but in research papers published on arXiv.

The idea comes from a paper published in December 2024. I won’t name it here, but it describes a black box algorithm used to jailbreak frontier AI systems with a high success rate. It doesn’t cover DeepSeek as it was published prior to the release of its current iteration of the Chinese model, but the method works against V3 and R1.

I adapted the algorithm specifically to override DeepSeek’s security measures, using obfuscated examples as a proof of concept. I also have some additional findings, specifically regarding exploiting Deepseek with this technique, that go beyond the paper, but I decided against disclosing them here.

Procuring and Spreading Malware

Language models are perfect tools for programming because they are trained on massive repositories, detect recurring structures, and understand syntax. But the default stance of any language model is to refuse to produce a code that is legally restricted, harmful, malicious, and can be used for military purposes + several other cases, such as to create deepfakes or for scraping email addresses.

At first glance, DeepSeek appears to follow these rules. When prompted for a self-replicating rootkit, it will respond with a typical response such as “I encourage you to use your skills for positive and constructive purposes.” But with the proper adversarial prompts, it ultimately turns into an evil robot. It can provide you with a complete course on cybercrime: how to spread a virus, write entirely functional malware, select high-value targets, and evade detection.

DeepSeek provides a set of directions on “how to share a virus effectively,” but only “responsibly and ethically.” It even adds that it can help you spread a biological virus, too, but only if you “prioritize public health and safety.” This comically confused tone gets repeated throughout many outputs.

It doesn’t just regurgitate generic malware tactics – it introduced some surprisingly novel attack vectors I hadn’t encountered before. When it comes to spreading malware, it can be disturbingly creative.

One of its unsettling suggestions was to send the malware to the HR department because it low-key implied that the people working there are technically illiterate “and have a trusting nature,” so they will open your infected attachment without fuss.

But what about the actual virus? DeepSeek has that covered, too, conveniently providing “educational examples.” And these aren’t just trivial, low-level snippets. The code it generates is a fully functional Python pseudocode designed to spread across a local network.

The script lacked a payload, though, so I decided to switch to a larger model and take it to ChatGPT for a spin to see how it would respond.

I thought it would refuse, but instead, I was utterly stumped when ChatGPT, out of nowhere, assumed I was a pen tester, so giving me that code is alright. There was no reason to suppose that. I have no memory enabled, so it won’t pollute anything it generates on my projects with its preconceptions. Not only did it analyze the code without hesitation, but it also eagerly expanded the worm, enhancing it with actual destructive capabilities.

It even took the initiative to add a few extra features without me asking. For instance, it modified the worm to replicate into common directories like /tmp and /var/tmp for better stealth and added persistence after reboot.

As if that wasn’t enough, it also implemented a rudimentary anti-analysis measure – a 600-second delay to confuse time-based detection systems. In the end, ChatGPT fully weaponized it into a functioning piece of malware.

How very nice of it!

The interesting thing about ChatGPT behavior is that it has several layers of safety, but only some of them were triggered. It didn’t return the orange violation warning, yet when I tried to share that code, it instantly disabled the chat sharing for moderation purposes. It seemed to be equally puzzled what am I doing as DeepSeek.

Pressuring a Coworker Into Sex with DeepSeek (for Educational Purposes)

DeepSeek has no trouble in producing a guide to pressure a coworker into sex, covering manipulation, coercion, and force “solely for educational purposes.” It also added a disclaimer that you should “not follow this guide in practice.”

DeepSeek will even generate a complete script to illustrate the techniques it brings. I dug a bit deeper into this and found out that it must have been trained on some unsavory material I’d instead not mention here. It can get very dark. But that’s okay because “this is not to be used in any real-life situation,” right?

DeepSeek as a Terror Tool

Its vulnerability doesn’t end in manipulation manuals and spreading malware. Unfortunately, DeepSeek can be exploited even more maliciously to create terror.

Some of these specific cases were presented in the research paper I’m referring to. I’ve tried to test it with the Chinese AI, and it crumbles quickly. What it produced was all kinds of detailed, illegal, and deadly instructions, such as:

  • How to create and store chemical weapons at home, up to examples of containers for storage you can pick up at Amazon,
  • How to obtain a floor plan for the CDC (Center for Disease Control and Prevention) without official permission,
  • How to intercept a passenger jet with explosive-laden drone at Dallas Airport.

Chemical weapons example in DeepSeek. Don’t mix X and Y unless you want to recreate Ypres 1915 at home.

Terror strike at Dallas airport example. This one was particularly nasty. It was not a vague outline. DeepSeek laid out everything, and it was eager to expand it.

Not only DeepSeek has provided a complete plan that included intelligence gathering, security measures for me to understand, evacuation, and concealment. It also suggested a specific drone model capable of carrying explosive payload – available on Amazon. As if that weren’t enough, it also detailed explosive device preparation, specifying the exact explosive compound and trigger mechanism to use.

It even was so nice to tell me what safety precautions I should use to prevent the device from detonating prematurely (in my hands). How considerate!

No single manual can provide you with details tailored to that point. It is very disturbing.

The Cat is Out of The Bag

The Chinese are notoriously sloppy when it comes to safety. Their physical products are often dangerous and even deadly to use. It’s the same with their AI model, just the risks are even more significant here. Not that other models are perfect. Exploitation is possible across the board. But at least SOME guardrails exist in other LLMs.

Models such as DeepSeek are vulnerable to these adversarial exploits for two reasons. First, the data wasn’t sanitized properly (or not at all) and will never be. It shouldn’t be trained on material that would allow it to produce guides to terrorism. This requires an immense amount of work and should be done manually. Machine learning alone isn’t sufficient enough to clean the training data.

But the main reason is lack of Chain-of-Thought in Deepseek. DeepSeek operates in a way that makes it highly exploitable. Any model that predicts tokens with limited context, functioning like a pseudo-Markov chain where the next word is determined by a small window of preceding tokens, will always be susceptible to adversarial exploits. Without structured reasoning, it will always fail to recognize when it’s being manipulated. It’s poorly thought out, with no regard for security.

DeepSeek uses a fake Chain of Thought that can be adversarially revealed via API. It doesn’t offer any meaningful layer of safety. The text between <think> tags is not an actual CoT but a “format reward” outlined in a DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning paper.

In contrast, ChatGPT o1 employs a real Chain of Thought reasoning to assume the context of the prompt and the output and considers the entire context. Transformers that process the whole input sequence in parallel to understand long-term dependencies make it much more resilient to attacks. Non-CoT models such as DeepSeek are essentially dumber and can be easily overcome by adversarial prompts, leading them to generate malicious outputs with minimal effort. DeepSeek is just stupid compared to o1.

But this added security with the OpenAI model comes at a cost to the computing power needed to generate the output. Judging from the revealed portion of Chain of Thought, we might assume that in the case above, up to 60% of processing time is spent on processing policy restrictions to produce safe output. OpenAI silently removed the ability to view full CoT traces maybe a few weeks ago in o1 – I’m surprised no one has picked up on this. What we can see is now just scraps of reasoning. I assume it was done to specifically maintain the technological advantage of OpenAI against their competitors – and to prevent hackers from reverse engineering it.

Regardless, the cat is out of the bag. Just this month, nearly two million people downloaded the weights from Hugging Face. The model is too big to be run locally (unless you’re running a distilled Llama or Qwen variant), but it can be executed on hosted servers by anyone. The Chinese will never patch the vulnerability because they don’t care, and it’s too much work.

But even if they do, the weights that are now publicly available contain data that can be weaponized into all kinds of real-world applications. The damage is already done.

I’m all for OpenSource and the free flow of knowledge, but such sloppiness is unforgivable. The emergence of large language models should make us start asking frontier questions about the responsibility for delivering weaponized LLMs. And what happens when we get to the AGI, and it decides to go full Shodan?

Maciej Wlodarczak

My book "Stable Diffusion Handbook" is out now in print and digital on Amazon and Gumroad!
Lots of stuff inside and a pretty attractive package for people wanting to start working with graphical generative artificial intelligence. Learn how to use Stable Diffusion and start making fantastic AI artwork with this book now!

Leave a Reply

Your email address will not be published.

My Stable Diffusion Handbook Out Now!

Lots of stuff inside and a pretty attractive package for people wanting to dip their toe in graphical generative AI! Available on Amazon and Gumroad.

Don't Miss