We expect so much from our developers. The thought that one developer could sit idle with nothing to do is one of a manager's worst nightmares.

Now, with AI tools, this has intensified: we expect our developers not only to handle system design, coding, testing, shipping, and maintenance, but to do it all 10x faster.

The One-Key-Press Developer

Early in my career, I worked for Neoris (I’m still recovering from the PTSD of that experience). One of their big clients was Cemex.

My PM used to laugh at the fact that they hired one engineer whose job was to press the Y key in the middle of an ETL process.

I’m guessing it was a decision so critical that it required human supervision.

Truth be told, they were pretty bad at hiring good engineers. That process could have been easily automated.

Are You Automating Your Decisions?

Automation has gone wild in today’s world. If you are hacky enough, you could automate your entire development team.

But here is the risk: due to the non-deterministic nature of LLM output (the same prompt won’t yield the same output twice), you need human supervision.

I’m not talking about Prompt Engineering; I’m talking about Engineers whose job is to review and approve outputs.

The One-Time Shot Trap

For almost two years, I’ve been building my own LLM Prompt-based tool to write articles.

I’ve used all my battle-tested prompts to produce articles that get me there 80-90% of the time (this was before the Search and Deep Research features in ChatGPT).

My main takeaway? Avoid the one-time shot trap.

Because of the AI hype, we tend to believe that if we ask a question or make a request, we will get the perfect answer on the first try. That’s rarely the case.

The reality is that once you have fine-tuned your prompt, you need to run it multiple times and pick the best output.
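Here is a minimal sketch of what that looks like in practice. The call_model() helper, the model name, and the prompt are assumptions standing in for whatever LLM client and models you actually use:

```python
def call_model(model: str, prompt: str) -> str:
    """Placeholder for your LLM client call (an assumption, not a real API)."""
    # Replace this stub with an actual request to `model`.
    return f"[{model} output for: {prompt[:40]}...]"

def sample_outputs(model: str, prompt: str, runs: int = 5) -> list[str]:
    # Same prompt, multiple runs: the non-deterministic output is the point.
    return [call_model(model, prompt) for _ in range(runs)]

# A human still picks the winner from the candidates.
candidates = sample_outputs("gpt-4.5", "Write 5 headlines about prompt engineering.")
for i, text in enumerate(candidates, 1):
    print(f"--- candidate {i} ---\n{text}")
```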

Natural Selection Prompting

It is like a natural selection process: you compare two models and test different prompts to validate the quality of the outputs. Then you run the prompts multiple times, collect different outputs, and pick the winners.

Here is the quick process – let’s say you are prompting for good article headlines (a code sketch follows the list):

  1. Pick two models (for example, GPT-4.5 and o3)
  2. Prepare your prompt: use your own recipes here
  3. Run the prompt multiple times against each model
  4. Pick the best outputs/headlines
  5. Use them in real-world articles to validate results
  6. Gather the top performers and use them as examples in your prompts (step #2)
  7. Repeat
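
A hedged sketch of that loop, assuming the same hypothetical call_model() helper as before. The build_prompt() recipe and the model names are placeholders for your own prompts and model IDs, and the final "pick the winners" step is deliberately naive because in practice that pick is made by a human or by real-world performance data (step 5):

```python
from itertools import product

def call_model(model: str, prompt: str) -> str:
    """Placeholder for your LLM client call (an assumption, not a real API)."""
    # Replace this stub with an actual request to `model`.
    return f"[{model} headline for: {prompt[:30]}...]"

def build_prompt(topic: str, winning_examples: list[str]) -> str:
    # Step 2: your own recipe, optionally seeded with past winners (step 6).
    examples = "\n".join(f"- {e}" for e in winning_examples)
    return f"Write one article headline about {topic}.\nGood past headlines:\n{examples}"

def natural_selection_round(topic: str, winners: list[str],
                            models=("gpt-4.5", "o3"), runs=3) -> list[str]:
    # Steps 3-4: run the same prompt several times against each model,
    # then hand every candidate to a human for review.
    prompt = build_prompt(topic, winners)
    return [call_model(m, prompt) for m, _ in product(models, range(runs))]

winners: list[str] = []            # top performers fed back in (step 6)
for round_no in range(2):          # step 7: repeat
    candidates = natural_selection_round("prompt engineering", winners)
    print(f"Round {round_no + 1} candidates:", *candidates, sep="\n  ")
    # Step 5 happens outside this script: publish, measure, and replace
    # this naive pick with the headlines that actually performed.
    winners = candidates[:2]
```

The code is the easy part; the judgment calls in steps 4-6 are where the human reviewer earns their keep.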

Does this sound like something you are doing? Does this sound like too much work for you?

I’ve got you covered!

Yep, that’s why I keep saying that we need new Engineering Roles for this new way of building software.

I don’t have a name for this role, but I do have engineers trained to do it.

I would love to hear where you stand with this. Drop me a human-generated line!

Leo Celis