Transcript

Time to Shift Into the Next Gear

5 February 2026/18 min

I have to admit, I had a bit of an existential crisis at my desk this morning. I was staring at my screen, watching that little cursor blink in a chat window, and I just had this sinking feeling that I am doing everything wrong. Wrong how? I mean, you've been using these tools for years. That's the problem. We're sitting here, it's February 2026, the technology is just light years ahead of where it was even six months ago. But I realized I'm basically driving a Ferrari like it's a bicycle. Huh, yeah. I open a window, I type a question, I get an answer. It feels magical compared to, you know, the old days of Googling, sure. But if you actually look at what the engineering is capable of as of today, right now, just using these tools as a chatbot is, well, it's borderline tragic. It is a tragedy of wasted potential. It's like owning a supercomputer and just using it to play solitaire. You aren't wrong, strictly speaking, it works, it does the job. But you are leaving a tremendous amount of horsepower in the garage. And that disconnect is exactly... That's exactly why I wanted to pull this specific source today. We are looking at a brand new article published just this morning, February 5th, by Stefan Sonnell from Spinout. It's titled, The Five Levels of Agentic AI Automation. And it is a heavy hitter. It's not just some listicle, it's a complete restructuring of how we should view work. It completely reframed my morning. And honestly, it made me feel a little behind the curve. Well, the goal here is to catch you up. What Sonnell does is fascinating. Because he moves us away from thinking about AI as a tool we talk to, and starts defining it as a system that acts for us. Right. So the mission for this deep dive is pretty straightforward. Right. We want to move you, the listener, from being what he calls a chatter to being an orchestrator. To do that, the article borrows a framework from the automotive industry. 
Most people know the SAE levels for self-driving cars, right? Level zero is a manual stick shift. Level five is a car with no steering wheel that you can, you know, sleep in. Right. That's the standard vocabulary for autonomy. This article maps that exact scale onto AI. It gives us a way to measure where we are, and more importantly, where the technology just jumped to. And we have to be clear about the timeline. We do. January 2026 was a watershed moment in engineering. Things have shifted in a way that most people haven't even noticed yet, because the interface, the chat box, it looks the same. We're going to get to the sci-fi stuff, the Ralph method and agents that sleep, in a bit. But we have to eat our vegetables first. We have to look in the mirror. The source calls the status quo level zero. Level zero. The map reader. This one hit a little too close to home. Level zero is the chat phase. It's synchronous. You ask. It answers. It stops. The analogy the source uses is reading a map in a phone book while you're driving. It's a vivid image, isn't it? In this scenario, the system, the map, it gives you information. It tells you, turn left in 500 meters. But who is turning the wheel? You are. You are. Who is pressing the gas? You are. Who is checking the lights? Who is checking the blind spot and making sure you don't hit a pedestrian? You are. You are doing 100% of the cognitive heavy lifting of actually driving the car. The AI is just a passenger holding a book. There's a snarky little footnote in the article here that made me laugh. Sonnell writes that at level zero, you're doing all the signaling, unless you drive a BMW or an Audi, in which case you don't signal anyway. A little automotive shade to keep things light. But the critique is serious. Level zero feels revolutionary because it's faster than looking up the map. You don't have to keep yourself in a library, but it requires constant, unblinking human attention. Right. 
If you stop typing, the AI stops working. It has no agency. It just waits for you. And yet, the source argues most people, even the quote-unquote power users, are stuck here. We have that Ferrari, but we're just using it to go buy milk. Okay, so let's nudge the needle. What does level one look like? Level one is cruise control. The article defines this as the transition to context-aware usage. Okay, so in practical terms, this is when I upload a PDF or a massive project file before I start chatting. That's the start of it, yeah. You're building a temporary knowledge base. The AI understands the background context of what you're working on. So in the driving analogy, you're still steering. You're still watching the road. But the car is maintaining its speed. So you aren't constantly tapping the gas pedal to explain who you are or what the project is every time you ask a question? Exactly. The behavior shift here is interesting, too. The source mentions that at level one, you enter a sort of flow state. While the AI is typing an answer, you're already formulating the next prompt in your head. You're faster, yes. You might say, summarize this document. And before the cursor stops moving, you're typing, now make a bulleted list of the risks. You're the active manager. Yeah. But let's be real. You are still very much the driver. If you have a heart attack, the car crashes. The AI doesn't know where it's going. It just knows the current speed. And that's the distinction. Level zero and level one are both human-in-the-loop systems. But then we hit level two. And the article suggests this is where the real shift begins. This is where we cross the chasm. This is letting go of the wheel. I find this psychologically difficult. Level two is where you stop describing the path and start describing the destination. It's the difference between driving yourself and hiring a chauffeur. 
When you hire a chauffeur, do you tell them, OK, turn the wheel 30 degrees right, press the gas pedal with 10% pressure, now signal left? No, that would be insane. You'd just say, take me to the airport. Precisely. And that is level two. And that is level two AI. The text gives a great example of this prompt shift. Instead of asking for a template or a summary, the prompt becomes something like, analyze Spinout AB as a potential partner. Check their annual report, products and reviews. Write a memo on risks and strengths. See, that's where I get nervous. You aren't telling the AI how to read the report or how to check the reviews. You're defining the outcome. The agent breaks down that task, executes it, and this is the scary part, you walk away. You go grab a coffee. You answer another email. The AI only calls you if it crashes. But does it work? I mean, really? Or do you come back to a hallucinated mess? At level two, it works, but usually for linear tasks. It's one agent doing a series of steps. It reads, it thinks, it writes. It's effective, but it's still limited to what one brain can handle. OK. Where it gets really wild and where the ROI starts to look ridiculous is level three. Level three is orchestration. The source compares this to managing a team. And that's because now you have parallel tracks. Imagine you have three or four agents running on your computer at the same time. Agent A is doing that competitor analysis we talked about. Agent B is drafting responses to your emails. And Agent C is prepping slides for a steering committee. Exactly. So they're all running autonomously. They stay in their lanes doing their specific jobs while you sit back and coordinate. You're the conductor of the orchestra. But I see the limitation here. They are silos. They're silos. Agent A doesn't know what Agent B is doing. Mm. If the analysis from Agent A reveals something that should change the slide deck Agent C is building, well, they can't talk to each other. 
They're working in isolation. Which brings us to level four, collaboration. This was a part of the article where I actually put the paper down and just stared at the wall for a second. Level four is where the agents start talking to each other. Correct. They collaborate on a single complex problem without you refereeing every single interaction. The case study in the source describes creating a market analysis of the system. It's not just one bot doing it. OK, walk us through that because it sounds like a recipe for chaos. Robots talking to robots. It's actually highly structured. You give the system a goal. Create a comprehensive market analysis. The system spins up Agent 1 to analyze competitor pricing. It spins up Agent 2 to review industry reports. Agent 3 compiles customer surveys. OK. But here's the magic. Agent 4, let's call it the lead agent, takes the work of the first three, synthesizes it, and writes the final report. If Agent 4 sees that the pricing data from Agent 1 contradicts the survey data from Agent 3, what happens then? Agent 4 can ask them to recheck. It can say, hey, Agent 1, these numbers don't match the survey. Run your query again, all without you touching the keyboard. It resolves the conflict internally. That feels like a massive leap in trust. But what's the actual return on investment here? Is it just a coolness factor? The source claims work that used to take a week, 40 human hours of compiling, cross-referencing, drafting, is done in one hour of compute time. A week down to an hour. That effectively collapses the cost of first draft work to near zero. But believe it or not, we aren't even at the top of the mountain yet. We have to talk about level five. Level five. The frontier. The source calls this the domain of the Ralph method. Now, when I first read that, I assumed Ralph was just some guy in a basement who figured this out. But this is actually a specific engineering breakthrough from just last month, January 2026. It is. 
And it's important to note what the source emphasizes. This isn't science fiction. This isn't theoretical. Companies like OpenAI, Google, Meta and tools like Cursor and Lovable are operating here right now. So what is the Ralph method? It refers to a fundamental shift in how developers treat the AI's relationship with failure. The core concept is persistence as strategy. Persistence as strategy. Break that down. Think about how you use ChatGPT today. You ask a question. If it gives you a bad answer, what do you do? I roll my eyes. I curse a little and then type a new prompt to correct it. Right. You treat it as one shot. You get one shot. Then the human intervenes. The Ralph method puts the AI in a closed loop. They tell it: here is the test. Here are the success criteria. Keep trying. If you fail, read the error message, fix your code and try again. Do not talk to me until the test passes. So it's self-correcting. It's grading its own homework. It iterates. It can try a thousand times in the time it takes you to drink that coffee. If attempt number 400 fails, it learns and tries number 401. And combined with that, we have the concept of subagents. At level five, the main agent spawns these mini-agents, specialists, to handle parts of the job. The example in the text is a bank doing due diligence on an acquisition. That is a massive high-stakes task. You can't just have a chatbot guess at that. Imagine the complexity. You need financial analysis, legal review, technical infrastructure checks, cultural fit assessment. At level five, you trigger this process on Monday morning. OK, the main agent spawns a legal agent, a financial agent, a tech agent. They work for hours, maybe even days. They iterate. They solve problems on their own. And on Tuesday, on Tuesday, you get a decision basis, a fully formed report that synthesizes thousands of documents. The comparison the author makes here is incredible. 
He says it's like calling McKinsey, saying, fix this, and getting the delivery for the cost of the first meeting. It effectively democratizes the kind of resource-heavy analysis that used to be the exclusive domain of massive consulting firms. A small startup can now run a McKinsey-level analysis for 50 bucks of compute credits. OK, I have to play the skeptic here. I lose my train of thought if I walk into the kitchen to get water. How does an AI work for days without hallucinating or just forgetting what it's doing? The memory of these things has always been the weak point. If you feed it too much info, it starts making things up. It's a perfectly valid question. If you tried to do this in 2024, it would have failed miserably. The source breaks this down into under-the-hood mechanics. There are three technical enablers that make level five possible today. Let's unpack them, because if I'm going to trust this thing, I need to know how it works. Number one is the context window. Think of the context window as your desk. In 2023, the desk was small. You could maybe fit 8000 words on it, a few papers. If you added more, some fell off the edge and the AI forgot them instantly. Right. Today the desk is far bigger than that. You can have piles of documents open. But even a big desk has limits. If you're working for days on a company acquisition, you're going to fill up that desk eventually. Exactly. Which leads to number two, external storage, or as the article calls it, the file system. This is the game changer. Agents now sleep. Sleep. That sounds uncomfortably human. It's a metaphor, but an accurate one. An agent works for a while, fills up its context window, and then it saves its game. It writes a summary of its progress to a file, clears its short-term memory, wipes the desk clean, and then wakes up by reading the file to continue. So it creates its own save points. It's exactly like you writing down notes before you go home for the weekend so you know where to start on Monday. 
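For readers following along in text, the sleep-and-wake cycle described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `run_with_sleep`, `max_context_items` and the JSON checkpoint layout are hypothetical names invented here, the "summary" is a trivial string rather than something a model wrote, and nothing in it comes from the article or from any vendor's agent framework.

```python
import json
import tempfile
from pathlib import Path


def run_with_sleep(tasks, checkpoint, max_context_items=3):
    """Work through tasks, saving progress to a file whenever memory fills up.

    `context` plays the role of the context window (the "desk");
    `checkpoint` is the file the agent writes before it "sleeps".
    """
    context = []  # short-term memory
    done = 0      # tasks finished so far
    sleeps = 0    # how many save points were created
    for task in tasks:
        if len(context) >= max_context_items:
            # Sleep: write a compact summary of progress to the file system.
            state = {"done": done, "summary": f"{done} tasks finished"}
            checkpoint.write_text(json.dumps(state))
            # Wipe the desk clean.
            context.clear()
            # Wake: reload only the summary, not the full history.
            state = json.loads(checkpoint.read_text())
            context.append(state["summary"])
            sleeps += 1
        # Hypothetical stand-in for real work done by a model.
        context.append(f"worked on {task}")
        done += 1
    return done, sleeps, context


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "progress.json"
        done, sleeps, context = run_with_sleep([f"doc-{i}" for i in range(7)], path)
        print(done, sleeps, context)
```

The point of the sketch is that the working memory never grows past the limit, yet the count of finished tasks survives every wipe because it lives in the file.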
This allows the AI to work on projects that are infinitely larger than its memory capacity. So it doesn't need to remember everything. It just needs to know where the information is stored. And that links to the third enabler, externalized plans. The agent writes down a checklist. It doesn't need to remember what it did an hour ago using brainpower. It just looks at the list. Step three is done. What is step four? It's so simple when you put it that way. It's just good project management applied to software. But this brings us to the human element, because if the AI is doing the work and the planning and the executing, what happens to us? Are we just redundant? That is the million-dollar question. And the source doesn't sugarcoat it. It brings up a really sobering case study. Klarna, the buy now, pay later giant. Right. They made headlines a while back for replacing 700 service agents with AI. And initially the stats looked great. Wait times dropped from 11 minutes to two minutes. Efficiency went through the roof. Shareholders were probably throwing parties. Then the twist. They started hiring humans again. Why? Because a pure cost focus hurt quality. And this is a critical lesson for level five. Even a self-driving car needs someone to decide where to go. And more importantly, someone needs to verify that it arrived at the right place and didn't run anyone over on the way. The source argues that our role shifts from doing to defining quality. Yes. Think about it. The value is no longer in producing, writing the code, drafting the memo, analyzing the spreadsheet. The AI does that faster and better. The value is in specifying the outcome clearly enough that an agent doesn't derail after working for three days. It's the genie in the bottle problem. If you aren't specific with your wish, you get a weird result. I want to be the richest man in the world. And suddenly you're a bank robber. Precisely. If you are vague, the car ends up in a swamp. 
The new skill set isn't typing fast. It's building tests, not just code tests, but tests for tone, correctness, brand voice. You have to verify the work. So we're becoming quality assurance for our own digital workforce. In a way, yes. We are shifting from players to coaches. A coach doesn't run the lap. The coach analyzes the form and sets the training schedule. That sounds... well, to be honest, it sounds a bit boring compared to creating. Is that a promotion or a demotion? It depends on your mindset. If you love the grunt work, it's a loss. If you love the strategy, it's a massive promotion. You're now managing a team of experts rather than doing the work yourself. I want to wrap up with the philosophical angle the source takes at the end, because this isn't really about technology, is it? No, it's a time shift. The author points out a historical pattern here. Templates killed formatting time. Google killed research time. Now, agentic AI is killing production time. And he speaks directly to the people, probably some listening right now, who say, I'm too busy to learn this new AI stuff. I have deadlines. I can't spend three hours figuring out how to make an agent work. His argument is that you don't have time not to learn it. It doesn't require a budget or a massive strategy deck. It requires testing. It requires you to give an agent a real task. Take your hands off the wheel and just see what happens. That's the challenge. Stop staring at the map. Level zero is comfortable, but it's slow. And if you let the car drive, if you embrace level five, the question stops being, does the car work? The technology is proving that it does. The question becomes something much more personal. What are you going to do with the empty hours you just got back? That is the question that defines the next decade of work. A huge thank you to Stefan Sonnell and Spinout for this breakdown. It's certainly given me a lot to think about regarding my own workflow. 
I think I need to go have a serious talk with my computer. Just make sure you define the outcome clearly. Thanks for listening to this deep dive. We'll catch you on the next one.
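A note for readers of the transcript: the closed loop discussed under level five (attempt, run the test, read the error, try again, and only report back once the test passes) might look roughly like the Python sketch below. Everything in it is an assumption made for illustration. `ralph_loop`, `make_toy_agent` and `toy_test` are hypothetical names, the "agent" is a simple number-guessing stand-in for a model call, and the sketch does not claim to reproduce how the article or any of the tools mentioned implement the idea.

```python
def ralph_loop(attempt, passes_test, max_attempts=1000):
    """Keep trying until the success criterion is met.

    attempt(feedback) returns a new candidate, given the last error (or None).
    passes_test(candidate) returns (ok, feedback).
    """
    feedback = None
    for n in range(1, max_attempts + 1):
        candidate = attempt(feedback)
        ok, feedback = passes_test(candidate)
        if ok:
            # Only now does the loop report back to the human.
            return candidate, n
    raise RuntimeError(f"no passing candidate after {max_attempts} attempts")


def make_toy_agent():
    """Hypothetical stand-in for a model: narrows a guess using the last error."""
    state = {"low": 0, "high": 100, "guess": 50}

    def attempt(feedback):
        if feedback == "too low":
            state["low"] = state["guess"] + 1
        elif feedback == "too high":
            state["high"] = state["guess"] - 1
        state["guess"] = (state["low"] + state["high"]) // 2
        return state["guess"]

    return attempt


def toy_test(candidate, target=37):
    """Success criterion plus an error message the agent can read."""
    if candidate == target:
        return True, None
    if candidate < target:
        return False, "too low"
    return False, "too high"


if __name__ == "__main__":
    result, attempts = ralph_loop(make_toy_agent(), toy_test)
    print(result, attempts)
```

The design choice worth noticing is that the human supplies only the test and the attempt budget; each failure message is fed straight back into the next attempt instead of going to a person.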
