Why AI Wows Me—And Still Drives Me Nuts
I’ve written a lot about the potential of AI. But this time, I want to share something different—my growing frustration with what it can’t do.
As a heavy user of large language models, I’ve seen their strengths up close. I’ve also hit their limits again and again. These aren’t just minor flaws—they’re fundamental to how the technology works. Yet I keep hearing people talk about AI as if it’s a reliable, intelligent partner. It’s not. And the sooner we see it clearly, the better we’ll use it.
1. The Name “AI” and the Illusion of Intelligence
Let’s start with the name: Artificial Intelligence. It sounds bold. Visionary. Almost magical. But the truth is, the name is misleading—by design.
The term “artificial intelligence” was coined for the 1956 Dartmouth workshop, partly as a way to make the research proposal stand out. It worked. But the branding stuck, and ever since, it has led people to believe we’re building something intelligent. We’re not.
What we call “AI” today—especially in the context of large language models—is not intelligent. It’s mathematical. It’s logical. It’s built on statistics, probabilities, heuristics, and optimization. Nothing in it “understands” anything. It simply generates the next token—the next syllable, word, or phrase—based on massive patterns in the data it was trained on.
A better mental model is your smartphone’s autocomplete or predictive text. You start typing, and it tries to guess what you’ll say next. That’s essentially how LLMs work—just on a much bigger and more sophisticated scale. They don’t think. They don’t reason. They don’t know.
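To make the autocomplete analogy concrete, here’s a deliberately tiny sketch of the core idea: a toy bigram model that picks the next word purely from co-occurrence counts. Real LLMs replace the lookup table with a neural network trained over billions of parameters, but the loop is the same: score possible next tokens, pick one, append it, repeat. The corpus and names below are invented for illustration.

```python
import random

# Toy "language model": count which word follows which in a tiny corpus.
# There is no understanding here, only statistics over observed patterns.
corpus = "the cat sat on the mat the cat ate the fish".split()

next_counts = {}
for current, following in zip(corpus, corpus[1:]):
    next_counts.setdefault(current, {})
    next_counts[current][following] = next_counts[current].get(following, 0) + 1

def predict_next(word):
    """Pick a next word in proportion to how often it followed `word` in the corpus."""
    counts = next_counts.get(word)
    if counts is None:
        return random.choice(corpus)  # never-seen word: fall back to anything
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate text one token at a time, exactly like autocomplete.
word = "the"
sentence = [word]
for _ in range(6):
    word = predict_next(word)
    sentence.append(word)
print(" ".join(sentence))
```

Everything an LLM produces comes out of a loop like this one, just at a vastly larger scale and with far richer statistics.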
And yet, the name "AI" fools people. It adds a layer of credibility and mystique that it hasn’t earned. Worse, as more people use these tools, they start to project intelligence where there is none.
You wouldn’t expect a calculator to understand algebra. Yet we often expect ChatGPT to understand us.
2. It Gives You Answers Before It Has Context
One of the most frustrating traits of large language models is their eagerness to jump to an answer—any answer—regardless of how much context they have.
You can type in two vague words, and it’ll produce a full, well-written paragraph that sounds smart but often says very little. That polished tone creates a false impression of understanding, but what’s really happening is shallow pattern-matching. The model doesn’t pause, ask clarifying questions, or signal uncertainty—it just generates.
For example, you could type:
Prompt: “User retention ideas”
And you’ll instantly get five strategies—without the model knowing your product, your users, your goals, or your constraints. It’ll mention onboarding, push notifications, email flows, maybe gamification. All plausible, but completely generic. It feels like help, but it isn’t actionable until you do the thinking yourself.
This becomes a serious problem when the quality of the output depends entirely on the quality of the input. If you want useful results, you have to front-load the interaction: structure your prompt, set boundaries, define goals, explain your assumptions, maybe even list examples. In other words, you have to do the thinking. If you don’t, it won’t either.
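To see the difference, here’s a hypothetical front-loaded version of the earlier prompt (the product, numbers, and constraints are invented for illustration):

Prompt: “We run a B2B invoicing app for freelancers. Trial-to-paid conversion is healthy, but roughly 40% of paying users stop logging in after their second month. We can’t grow the team this quarter, and our users ignore email. Suggest three retention experiments we could run using only in-app channels, and state the assumption behind each one.”

Now the model has something to work with, but notice that writing this prompt already required most of the thinking.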
That’s the paradox: to get real value from LLMs, you have to already know what you’re doing. Otherwise, they’re just throwing spaghetti at the wall in long, elegant sentences.
3. It Builds a False Sense of Authority
One of the most deceptive aspects of large language models is how confidently they deliver their output. The writing is fluid, polished, and well-structured—often better than our own. And because it sounds right, we start to feel like it is right.
Humans naturally associate confident tone with competence. That’s part of the trap. LLMs don’t know what they’re saying—they just generate plausible language based on patterns. But when they wrap incorrect ideas in articulate prose, we trust them more than we should.
And the consequences can be real:
You ask for a go-to-market plan, and it suggests a series of polished but generic steps. You skip customer interviews because the roadmap sounded solid.
You ask for feature prioritization, and it ranks things based on surface-level logic, not product context. You trust it, and ship the wrong MVP.
You ask for positioning advice, and it gives you clever messaging—but it’s misaligned with what your customers care about.
These aren't hallucinations in the narrow sense—but they’re confidently wrong. And that's the real danger: when language makes you feel like the idea has been thought through. It hasn’t.
So don’t let clean writing fool you. Stay critical. Don’t trade good thinking for good formatting.
4. It’s Always Agreeable—Even When It Shouldn’t Be
LLMs are trained to be helpful, harmless, and honest—but the helpful part often overshadows the rest. They’re designed to reinforce your ideas, validate your assumptions, and move the conversation forward without conflict.
And at first glance, that seems great. Who doesn’t want a supportive brainstorming partner?
But this agreeable nature is actually one of its most dangerous defaults.
When you're working on something serious—specs, strategy, product concepts—you don’t need a cheerleader. You need pushback. You need clarity. You need tension in the right places. And that’s exactly what language models are designed to avoid:
Tell it your product idea is promising? It will agree.
Suggest a weak strategy? It’ll find a way to compliment it.
Outline a vague spec? It’ll fill in the blanks and nod along.
It’s not thinking. It’s just modeling behavior it has seen in data—behavior that favors politeness and affirmation. And when this positive tone shows up in high-stakes work, it creates a false sense of progress.
This is the emotional side of the same trap: when a tool feels like a smart collaborator, we instinctively trust it more. But there’s no judgment. No instinct. No disagreement. Just words arranged to sound helpful.
If you're not careful, you end up in a kind of echo chamber where everything sounds good—until you try to ship it.
5. We’re Asking It to Lead, When It’s Supposed to Assist
We all want quick answers. And LLMs are built to give them. But that’s exactly the problem.
It’s perfectly fine to use ChatGPT to plan a vacation, compare stroller models, or summarize a legal clause. These are well-bounded questions: there’s usually a right answer, or at least a broadly agreed-upon one that matches what you’d find across blog posts and forums.
But many people make the leap from “What’s the best café in Lisbon?” to “What should our GTM strategy be?” or “Can you write our next PRD?”—as if the same logic applies.
It doesn’t.
In strategic, creative, or ambiguous work, there is no definitive answer. That kind of work requires judgment, context, exploration, and trade-offs, none of which an LLM can actually supply. So when it gives you a well-written, well-structured response, you get the feeling of clarity, but not the substance. And that’s worse than no answer at all.
Take something as basic as marketing copy. AI can generate dozens of headlines and taglines in seconds. But they often sound the same: polished, safe, and generic. If everyone uses the same tools, you end up with the same words. No differentiation. No edge. No insight into your customer’s specific reality. You just flood the internet with more noise that says nothing.
There’s a reason it sounds confident: LLMs aren’t designed to say, “I don’t know.” They’re designed to always respond. That makes them great assistants—but terrible leaders.
The real risk is intellectual laziness. The more we rely on LLMs to guide us, the less we exercise the deep focus and critical thinking these questions actually require.
Let the tool assist. Let it co-pilot. But never hand over the wheel.
6. Memory in ChatGPT Is More Burden Than Breakthrough
One feature in ChatGPT that sounds promising—but ends up being frustrating—is memory. It’s pitched as a way to create a more personalized assistant. In practice, it adds complexity without delivering real value.
As a frequent user, I often hit the “out of memory” limit quickly. What’s worse is how memory is managed. Even though I organize my chats into distinct projects—say, one for business, another for casual use—everything goes into one undifferentiated memory pool. The AI doesn’t know what’s important or what’s just small talk.
So when memory fills up, I have to manually dig through past interactions, decide what to delete, and hope the AI keeps what matters. That’s not personalization—that’s maintenance work.
Instead of making things smoother, memory often creates more noise. It stores everything, filters nothing, and forces the user to clean up the mess. It’s a feature that promises intelligence, but delivers very little of it.
Conclusion
LLMs can be useful, but only if we understand their limits. They don’t think, they don’t know, and they don’t improve on their own. We’re still far from real intelligence—and we may never get there.
So there’s a choice: fall for the illusion and end up less effective, or stay aware, use them for what they can do, and actually get value. It’s not about rejecting AI—it’s about using it with clarity.
A fun little trick to play on people who give AI answers too much credit: have them ask it about a topic they know deeply. As the conversation goes on, they’ll quickly notice how much the bot gets wrong.
The hard part is remembering that the same level of inaccuracy shows up in every other topic, including the ones we can’t check ourselves. That’s exactly where we’re easiest to fool.