Using AI for Software Development - State of the Union

2025-10-18 Software-Craftmanship Augmented-by-AI Thoughts Tips & Tricks

I have been using AI for 2 years now. Let’s talk about where I am right now.

Let’s talk about the tools I am using and how I use them. Let’s talk about my setup, my configuration, best practices and workflows.

  • Preamble

I started with the view that AI will mainly assist and augment the software-engineer of the future and that the engineer will not only be in the loop, but will control and drive the loop. Like the driver of a car that has a lot of assistance (lane change detection, blind spot detection, navigation, cruise control, impact warning, parallel parking, …) to make him/her a better driver.

Today my view is that we will get to fully-autonomous driving much faster than I thought. That means the engineer gets into the car and describes the destination of the journey. The destination can either be an address (fix this bug) or a description of a place you want to go to. Like …

Go to a restaurant with an outdoor area, that serves beer and burgers (like Sandyford House in Dublin), where the burger is less than EUR 20 and that can be reached in less than 30 mins. They need to have a table for 4 available. On the way there, please take the most scenic route. Please avoid traffic. Please avoid tolls. Please keep the temperature in the car at 20 degrees Celsius. Please play contemporary classical piano music (like Ludovico Einaudi) and notify me when we are 5 mins from our destination. Ask me at least 3 clarifying questions about the destination and the journey (each). Then make a plan and show me the plan before you start driving.

For the first 5 mins you probably want to watch the car drive and correct it when it does something that you do not like. Then you get the laptop out and get work done while the car drives you to your destination.

A year ago I was mainly using windsurf. Today I am mainly using claude-code.

Let’s go …

  • The future

I think that tools and tooling will converge. That means sooner or later it will not matter if you use claude-code, gemini-cli, codex-cli or something else. They will all be able to do roughly the same (like IntelliJ vs. VS Code). The same is true for models. Sooner or later all the models will be (more or less) equally good at developing software and writing code.

What will make a difference is how you use the tools. Do you know where you want to go? How well can you describe the destination?

The software-engineer of the future (at least for the next 5 years) will be a product-engineer (a mix of a software-engineer and a product-manager).

And the software-engineer of the future will be very articulate and will be able to express himself/herself clearly and succinctly.

The software-engineer of the future will have a lot of business-domain know-how.

Over the next couple of years it will become much more important to know where you want to go than to know how to get there.

  • Tools

I am using claude-code. I have not tried any of the claude-code alternatives. That means I do not know if they are better or worse than claude-code.

I am using claude-sonnet-4.5. I have not tried any other models and cannot comment on other models being better or worse than claude-sonnet-4.5.

I am on the MAX plan (EUR 100/month).

I have MCP servers configured to use gemini for code analysis (because gemini has a 1M token context window) and to access JIRA and GitHub. Very soon I will probably retire the MCP servers and use command line tools (gemini, git, gh, …) and skills instead.

I run claude-code with "defaultMode": "bypassPermissions" (YOLO-mode). The only command that I do not allow is sudo. To make this safe, I run claude-code in a sandboxed docker-container that only mounts the source code that I need. The source code is obviously managed by git, so whatever claude-code does is recoverable. Note: For this to work you need to make sure that your git-user cannot delete repos. The sandbox also has a firewall configured that only allows access to certain domains. That means the container prevents access to random domains.
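The allowlist idea behind such a firewall can be illustrated in a few lines of Python. This is only a sketch of the matching logic under my own assumptions: the domain names and the `is_allowed` helper are hypothetical, and a real firewall enforces the rule at the network level rather than in application code.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the actual domains depend on your setup.
ALLOWED_DOMAINS = {"github.com", "pypi.org"}


def is_allowed(url: str, allowed: set[str] = ALLOWED_DOMAINS) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself or any subdomain, but not look-alike hosts
    # such as "github.com.evil.example".
    return any(host == d or host.endswith("." + d) for d in allowed)
```

The point of the suffix check with a leading dot is that a look-alike host that merely starts with an allowed name is still rejected.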

To make this even more usable, I have wrapped the docker-container in a devcontainer and can now start this devcontainer in vscode.

Right now my vscode has all the extensions configured that I want and need for TypeScript, Python and Scala. Then I have a lot of general purpose extensions for GitHub, Markdown and so on. Right now it is one big mess. Sooner or later I will probably refactor this into language specific or repo specific devcontainers.

In vscode I also have GitHub Copilot configured, mainly for hyper-intelligent code completion. For everything else I am using claude-code (e.g. I do not use the GitHub Copilot chat).

As a terminal I am using Warp in auto (responsive) mode (the auto modes intelligently select the best model for the task at hand based on the context and request type).

  • Workflows

My workflows start with a JIRA ticket. I either write a ticket by hand or I ask claude-code to create one from the context we are in (e.g. fix something that needs fixing later).

Once a week (or at the latest when I pick up a ticket) I do a ticket review/hardening for all tickets. As part of that hardening, I make sure that the ticket is complete and correct and can be implemented in no more than 3 days. That polishing/hardening happens by means of a claude-code command that drives a conversation between the AI and myself, where I ask the AI to ask me questions about the ticket to make the ticket better. For instance, every ticket should have a clear definition of done. While we are polishing and hardening the tickets, the tickets get updated and new/more tickets get created.

To close/implement the ticket we might require multiple PRs on multiple repos. I have not yet mastered the art of pointing the AI at the ticket and telling it to just do it.

Instead I go to the repo where the change needs to be implemented, fire up vscode, start the devcontainer and start claude-code. I then run a command that creates the branch and the PR for the given ticket.

If this is the first time I am visiting this repo, I run the init command to create the CLAUDE.md file (and also link that file to WARP.md). Then I run my own init command to create a good separation of concerns between the README, CLAUDE and CONTRIBUTING files. If the files are there, but it has been a while since I last worked on the repo, I use my init/refresh command and use gemini to review/refresh the three files to make sure they are current and complete.

Every repo has a prompts folder. When I work on a ticket, a folder named after the ticket-number gets created. In that folder I maintain the list of prompts I have used to implement the ticket. If it is a complex ticket, I run the requirements command, which breaks the ticket down into requirements, design and implementation/tasks.
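The prompts folder convention is easy to automate. The following is a minimal sketch, assuming a `prompts/<ticket>/prompts.md` layout; the file name, the timestamp heading and the `log_prompt` helper are my own illustration and not a claude-code feature.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_prompt(repo_root: Path, ticket: str, prompt: str) -> Path:
    """Append a prompt to prompts/<ticket>/prompts.md and return the file path."""
    folder = repo_root / "prompts" / ticket
    # Create the per-ticket folder on first use.
    folder.mkdir(parents=True, exist_ok=True)
    log_file = folder / "prompts.md"  # hypothetical file name
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    # Append, so the file keeps the prompts in the order they were used.
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(f"## {stamp} UTC\n\n{prompt.strip()}\n\n")
    return log_file
```

Calling `log_prompt(Path("."), "ABC-123", "Implement the task list")` (with a made-up ticket number) would append one entry to `prompts/ABC-123/prompts.md`. Because the folder lives in the repo, the prompt history gets versioned together with the code.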

I then ask the AI to execute the task list. I normally watch it for the first 5 mins to make sure it is doing the right thing. If not, I hit ESC and course-correct. Once I have convinced myself that the AI is doing the right things, I move on to something else.

When the AI is done, I ask GitHub to have the GitHub Copilot (GHCP) reviewer bot review the PR. Then I use the pr-comments command to get all the comments and address them one-by-one. Sometimes claude-code will just explain why a comment can be ignored, but in some cases GHCP raises a good point and claude-code will change the code based on the feedback.

After a couple of iterations I am ready to merge the PR and close the ticket.

These days I am also investing a lot of time in good, mocked unit-tests. These tests need to achieve at least 80% code coverage.
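To illustrate what a mocked unit-test looks like, here is a minimal sketch using Python's standard `unittest.mock`. The `ticket_title` function, the `get_ticket` method on the client and the ticket id are hypothetical and only there to have something to test.

```python
import unittest
from unittest.mock import Mock


def ticket_title(client, ticket_id: str) -> str:
    """Fetch a ticket via the injected client and return its title in upper case."""
    ticket = client.get_ticket(ticket_id)
    return ticket["title"].upper()


class TicketTitleTest(unittest.TestCase):
    def test_returns_upper_case_title(self):
        # The client is mocked, so no network or ticket system is needed.
        client = Mock()
        client.get_ticket.return_value = {"title": "fix login bug"}

        self.assertEqual(ticket_title(client, "ABC-1"), "FIX LOGIN BUG")
        # Verify the collaborator was called exactly as expected.
        client.get_ticket.assert_called_once_with("ABC-1")
```

Tests like this can be run with `python -m unittest`. If you measure coverage with coverage.py (an assumption, any coverage tool will do), `coverage report --fail-under=80` turns the 80% threshold into a hard check.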

I am also running integration-tests that are not mocked. It is these integration tests that give me confidence that everything is working as it should be.

Note: Orthogonal to this (once an hour) we are also running system-tests.

  • The Software-Development Process

I do not want to teach granny how to suck eggs, but I would nevertheless like to quickly talk about the software-development process.

The process needs to bridge the gap between a vague, high-level idea and a very detailed implementation.

That gap cannot be bridged in one go. You need to break this down into phases and layers of artifacts (requirements, architecture, design, implementation). Every layer/phase adds more detail until there is enough detail to implement an incarnation of the idea (one of many) in code. Between the idea and the code, what needs to be implemented (and how) has always been a moving target, and there has always been drift and creep. That means the code and the (original) idea are different. In a world like this the code is the ultimate truth. If you ask "What is this system doing?", you never look at the requirements or the design. You always look at the code.

I believe that this will change. I think that code will become ephemeral and that the system will be defined by the set of prompts that created it.

But between now and then we still need to bridge the gap somehow.

feature-brief, A&D (in-the-large/-small; design of the system/service), implementation, …


Configure Claude to use Gemini for sub-tasks (as a sub-agent)

  • Use commands (a lot)
  • Use clear/reset (a lot)
  • Use memory (#) (a lot) (do not forget to SCC it/back it up)
  • Invest in a good CLAUDE.md file

2 Phase Approach

  • Trust and Verify
  • Build and Polish
  • Fix and Review

CC (claude-code) and VSC (vscode (with CC))

  • not using windsurf (anymore)
  • the death of the IDE (VSC, zed, …)

Human in the middle vs Human in the loop

  • Picture/Graph
  • Who is in control?
  • The human as an agent?

Need to make done (what good and right look like) measurable!!!

  • at least 80% code coverage
  • no file has more than 150 lines
  • no function or method has more than 50 lines
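The two size limits are easy to check mechanically. Below is a minimal sketch for Python sources using the standard `ast` module; the `check_source` helper and its message format are my own illustration, and other languages would need their own parser or linter.

```python
import ast

MAX_FILE_LINES = 150      # limit taken from the list above
MAX_FUNCTION_LINES = 50   # limit taken from the list above


def check_source(source: str, name: str = "<source>") -> list[str]:
    """Return human-readable violations of the size limits for one Python file."""
    problems = []
    line_count = len(source.splitlines())
    if line_count > MAX_FILE_LINES:
        problems.append(f"{name}: {line_count} lines (max {MAX_FILE_LINES})")
    # Walk the syntax tree and measure every function and method.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                problems.append(
                    f"{name}:{node.lineno} {node.name}(): {length} lines "
                    f"(max {MAX_FUNCTION_LINES})"
                )
    return problems
```

A check like this could run in CI or as a pre-commit hook, so that the AI (and the human) gets the same objective feedback on every change.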

Instructions …

  • see previous, plus …
  • the code must be readable by humans
  • comments should not explain what the code does, but why the code is the way it is (why is it implemented this way and not any other way)
  • write idiomatic code
  • for every plan or todo-list ask me 3-5 questions to make the plan better
  • do not generate a lot of code. Do not repeat yourself. Reuse and/or extend existing code and functions and artifacts (creating a new function vs. extending an existing one)
  • follow the clean-code/-design manifesto

Find a way for the AI to find code-smells and design-smells and code-/design-hygiene problems

  • Make the AI (or another AI) look at the code and critique it

News …

  • chatGPT 5 (flop)
  • sycophancy (flip)
  • windsurf (flop)
  • the death of the IDE (flip)
  • the death of the keyboard (flip)

AIs are good at software-development

  • But not predictably good
  • Do NOT trust them (yet)
    • Trust and verify
  • Much less good/helpful at infrastructure, dev-ops, …
    • Results look equally good, but lack quality (wrong more often)