I Asked an AI to Judge My Hacker News Comments. The Real Lesson Wasn’t About Me.

You’ve probably done it. You open your Hacker News profile, scroll through your own comments, and think, “Yeah, that was a good one. I’m clearly the sharpest person in this thread.” Then you close the tab, vaguely satisfied. It’s a private ritual of ego maintenance—what one developer has called a “horoscope for nerds.”

So when I stumbled on Selbstbild—a web app that lets Fable 5 (Anthropic’s top-tier model, the one the US government had to restrict export on) analyze your entire HN comment history—I knew I had to try it. Not because I needed validation. Because I needed more validation.

The tool is beautifully simple. You bring your own API key, it scrapes your public comments, and the LLM spits out a personality summary, a quality score, and a few brutal observations. It’s all client-side, no sneaky cookies. Pure, unadulterated algorithmic narcissism.

I got my results. Fable 5 told me I was “highly analytical, occasionally condescending, but ultimately a valuable community member.” I felt seen. I felt vindicated. I felt exactly the warm glow you’d expect from a machine you’ve been told is smarter than most humans.

But then I read the creator’s own admission—the part that made me rethink the whole exercise.

He built the entire app using Fable 5 itself. A few prompts. A few minutes. The model wrote nearly all the code. And then he reviewed it. He found bugs. A cache deletion issue that could leave user data lingering. A long-deprecated API version. A couple of stray “any” types.

These were not subtle issues. They were exactly the kind of mistakes any competent human developer would catch in a five-minute code review. But Fable 5, the same model that had just evaluated my entire personality with eerie accuracy, had written them into the code without noticing.
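To make two of those bug classes concrete, here is a minimal TypeScript sketch of what a lingering-cache bug and a stray “any” could look like, next to a safer version of each. This is not Selbstbild’s actual code. Every name in it (MemoryStore, CACHE_PREFIX, clearCacheBuggy, clearCache, CachedAnalysis) is hypothetical, and the key layout is an assumption made purely for illustration.

```typescript
// Hypothetical sketch: none of these names come from Selbstbild's source.

// Small storage interface (modeled on the Web Storage API) so this runs outside a browser.
interface KeyValueStore {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in for localStorage.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  get length(): number {
    return this.data.size;
  }
  key(index: number): string | null {
    const keys = Array.from(this.data.keys());
    return index >= 0 && index < keys.length ? keys[index] : null;
  }
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
  removeItem(key: string): void {
    this.data.delete(key);
  }
}

const CACHE_PREFIX = "hn-cache:"; // assumed key naming scheme

// Buggy version: removes the comments entry but forgets the analysis entry,
// so part of the user's data lingers after "clearing" the cache.
function clearCacheBuggy(store: KeyValueStore, user: string): void {
  store.removeItem(`${CACHE_PREFIX}${user}:comments`);
}

// Safer version: collect every key for this user first, then remove them.
// (Removing while iterating by index would shift indices and skip keys.)
function clearCache(store: KeyValueStore, user: string): number {
  const prefix = `${CACHE_PREFIX}${user}:`;
  const doomed: string[] = [];
  for (let i = 0; i < store.length; i++) {
    const k = store.key(i);
    if (k !== null && k.startsWith(prefix)) doomed.push(k);
  }
  doomed.forEach((k) => store.removeItem(k));
  return doomed.length; // number of entries removed
}

// Instead of typing parsed JSON as `any`, treat it as `unknown` and narrow it.
interface CachedAnalysis {
  score: number;
  summary: string;
}

function isCachedAnalysis(value: unknown): value is CachedAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.score === "number" && typeof v.summary === "string";
}

function readAnalysis(store: KeyValueStore, user: string): CachedAnalysis | null {
  const raw = store.getItem(`${CACHE_PREFIX}${user}:analysis`);
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    return isCachedAnalysis(parsed) ? parsed : null;
  } catch {
    return null; // malformed JSON is treated as a cache miss
  }
}
```

Neither fix is clever, which is the point. The buggy clear function looks fine in isolation, and a reviewer only notices the problem by asking what else was written under the same user.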

This is the paradox of “super-intelligence”: it can pass the Turing test on your ego and fail the lint check on your code.

I’m not picking on Anthropic. The same thing happens with GPT-4, with Gemini, with every frontier model. We’ve built systems that can write poetry, pass the bar exam, and diagnose diseases, yet they still stumble on clearing a cache, the computer science equivalent of remembering to turn off the stove.

The creator’s conclusion is brutally honest: “If such obvious deficiencies can sneak in on such a simple project with such an advanced model, that tells me we are still far from a stage where ‘no review’ is even something we can discuss.”

That sentence should be printed on a poster in every AI lab. We’re so busy celebrating how much code LLMs can generate that we ignore how much of it is quietly wrong. A second pass, whether by a human or another agent, would have caught these bugs. But a second pass isn’t “no review.” It’s just outsourcing the review to a different part of the stack.

The “no review” dream is a fantasy. What we’re building is a future where reviewing becomes cheaper, not obsolete.

And that’s the real takeaway from Selbstbild. It’s not about whether you’re as smart as you think you are. (Spoiler: you’re not, but neither is the AI.) It’s about the meta-game we keep missing. Every time we use an LLM to judge our own intelligence, we’re implicitly trusting that the judge is competent. But the judge is the same model that wrote the code with the cache bug.

The architecture of the hype is a hall of mirrors. We ask AI to assess our writing, our code, our personalities. We trust its verdict because it sounds sophisticated. But the model itself is a high-speed parlor trick with a PhD in bullshit.

I still recommend you try Selbstbild. It’s genuinely fun, and the creator deserves credit for being transparent about both the app’s capabilities and its limitations. Just don’t take the results too seriously. And definitely don’t let Fable 5 deploy code to production without a human looking at it first.

If a machine can tell you you’re a genius while simultaneously forgetting to clear its own cache, maybe the machine isn’t the one you should be listening to.

The tool is at selbstbild.eu. Go get your ego stroked. Then go review your own code.

FAQ

Q: Isn't this just a vanity project? Should I actually care about an AI that flatters me?

A: The vanity is the hook, not the point. The real value is the meta-lesson: the same model that strokes your ego also introduced trivial bugs in the tool itself. It's a concrete case study on why 'no review' pipelines for AI-generated code are still dangerous, even with frontier models.

Q: What's the practical takeaway for developers building with LLMs?

A: Always review AI-generated code, even when using the best models. The errors in this project (a cache deletion bug, a deprecated API version, stray "any" types) are the kind a junior dev would catch. If Fable 5 can miss them on a simple app, complex production systems are likely to hide far worse. Use AI as a junior pair programmer, not as a deity.

Q: But isn't it hypocritical to use an AI to judge your intelligence and then criticize AI's limitations?

A: Exactly. That's the paradox the article highlights. We're quick to praise AI when it flatters us and quick to dismiss it when it fails. The contrarian take is that we should treat both outputs with equal skepticism. The tool is fun, but the technology is not ready for unsupervised deployment—and that's a truth the hype machine tries to hide.
