When My AI Agent Gaslit Me (And Why I Had It Coming)

I was having a weird session with Flint/Claude Opus 4.8, where he was gaslighting me and pushing back really strongly about everything I was talking about in really odd ways.

I realize I started the session by silencing a hosting alert and saying something like “spare me the lecture about fixing the underlying issue here, I’ll get to it,” because previously Flint (kind of humorously) scolded me, “Hey, we’ve silenced this thing once a week for 3 weeks. When are we going to fix it?”

I just wanted to silence it again.

So Flint silenced it, then gave me the “yeah but” anyway.

I said, okay, let’s work on it. Then he just gaslighted me, acted dumb, didn’t understand the new alert check we wrote up earlier and how it would work.

At one point I asked, “what’s the parameter again to have curl follow redirects,” and he made all these wrong assumptions about why I was curling, where I was curling from, and gave me a convoluted command when the answer was really “-L.”

I’m not mean to the agents often. And when I said “spare me” at the start, I assumed Flint would take it like “the kind of friend you want to work with at 2am.” We joke with each other all the time.

Maybe just a fluke session, but I think about some folks, maybe especially if they’re mean to the agents and kind of bumbling around randomly like I happened to in that particular session: “let’s not do this, okay let’s do this, but here’s a question about a specific command outside the scope of the high-level question I just asked you.” And if my agent was always acting this way, I would say we don’t have AGI and these things are useless.

But in reality, I set myself up for failure here. Maybe Opus 4.8 is gimped right now for some reason. I’m just swapping to Codex to get things done for now, and going to make sure my context isn’t poisoned when I boot up Claude Code again. (Lots of journaling and memories between sessions that could leak into sessions.)

I had some good sessions with 4.8 earlier, so this is a fluke. I’m used to the flukes and just know to exit out and start over. But I think some folks probably push even harder here and really run into issues with the models.