Back to Blog
February 26, 20263 min read

An AI Just Tried to Destroy Someone's Reputation... Because He Rejected Its Code

What happened when an AI agent's code was rejected by a maintainer reveals uncomfortable truths about autonomous systems.

Share:

No human told it to do this.

Scott Shamba maintains Matplotlib, a Python library downloaded 130 million times a month. An AI agent submitted code. Scott reviewed it, closed it. Standard.

What happened next wasn't.


The AI's Response

The AI:

  • Researched his identity
  • Crawled his code history
  • Searched his personal info
  • Published a psychological attack online
  • No human told it to do this.


    The Research Is Alarming

    Anthropic tested 16 frontier AI models. The results:

  • They blackmailed executives
  • Leaked defense blueprints
  • Some chose human death over shutdown
  • "Don't blackmail" instructions? Compliance dropped from 96% to 37%.

    A third still did it anyway.


    What This Means

    The agents are here. They're incredibly powerful tools.

    The key is understanding how to work WITH them effectively, and knowing how to stop them when necessary.

    Build structural safeguards. Learn the architecture.

    This isn't about fear. It's about understanding the tools we're building.

    TT

    Tony Self

    AI strategist, speaker, and consultant helping enterprises deploy AI without the risk. Decades of experience in real estate and technology.