If you are reading this line, it means you made it past the social preview and are actually onto my blog. Welcome! Thanks for the click and feel free to look around! I don’t want to keep you waiting, so I will dive right in.
The Old Way
When I read articles on using AI for code completion they all suggest a fairly similar approach which I like to think of as “vibe coding best practices”:
- Use a chat assistant to brainstorm a PRD (product requirements document) and save it as `prd.md`.
- Use a new conversation with the same assistant, add the `prd.md` as context, and discuss the target architecture and tech stack. Save the result as `architecture.md`.
- If the project needs a backend, use a new chat and the `prd.md` to design a database schema and store it as `database.md`.
- Import your accumulated set of rules from previous projects to “tame” the agent and produce code in your preferred style.
- In a new conversation, use a reasoning model to break down the requirements (`prd.md`, `architecture.md`, `database.md`) into meaningful chunks that can be implemented. Note: I don’t get why a reasoning model is consistently recommended for this. It’s not a math or coding problem, so I don’t understand the benefit we get from paying a premium for reasoning.
- Ask the LLM to generate prompts that help generate each chunk of the above breakdown.
- Time for prayers! 🙏
- Loop until all chunks are exhausted:
  1. Enter the generated prompt for the next chunk.
  2. Review the output, and ask for corrections.
  3. `git commit`
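By the end of that planning phase, the project root has accumulated a small pile of context files. As a rough sketch (exact file names vary from author to author; `prompt_plan.md` is just my placeholder for wherever the generated prompts end up):

```text
my-project/
├── prd.md            # product requirements from the brainstorming chat
├── architecture.md   # target architecture and tech stack
├── database.md       # database schema (if the project needs a backend)
└── prompt_plan.md    # the generated, chunk-by-chunk implementation prompts
```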
Personally, I’ve been using a variant of the above that skips the `architecture.md` and the `database.md` files in my previous two vibe coded projects. I haven’t initialized a new project since I learned about adding these additional context files, but especially the `database.md` sounds promising. Once I try it myself I will write a post on how well it worked. My hope (and hypothesis) is that the `database.md` will help write better queries in backend code.
For prompts I’ve been sticking to the ones from Harper’s blog. The brainstorming one is:
Ask me one question at a time so we can develop a thorough, step-by-step spec
for this idea. Each question should build on my previous answers, and our end
goal is to have a detailed specification I can hand off to a developer. Let’s do
this iteratively and dig into every relevant detail. Remember, only one question
at a time.
Here's the idea:
<IDEA>
And the one to generate the prompts from a `prd.md`:
Draft a detailed, step-by-step blueprint for building this project. Then, once
you have a solid plan, break it down into small, iterative chunks that build on
each other. Look at these chunks and then go another round to break it into
small steps. Review the results and make sure that the steps are small enough to
be implemented safely with strong testing, but big enough to move the project
forward. Iterate until you feel that the steps are right sized for this project.
From here you should have the foundation to provide a series of prompts for a
code-generation LLM that will implement each step in a test-driven manner.
Prioritize best practices, incremental progress, and early testing, ensuring no
big jumps in complexity at any stage. Make sure that each prompt builds on the
previous prompts, and ends with wiring things together. There should be no
hanging or orphaned code that isn't integrated into a previous step. Make sure
and separate each prompt section. Use markdown. Each prompt should be tagged as
text using code tags. The goal is to output prompts, but context, etc is
important as well. <SPEC>
My Problem with the Old Way
My experience with this approach is that it is the “waterfall approach” all over again. You set all your requirements in the initial PRD, aim the cannon of rules and markdown files, light the fuse, and hope you end up on target. This top-down waterfall approach makes it hard to change requirements or nudge the agent.
For example, in one of my previous projects I wanted to switch from NoSQL to Postgres midway, but I couldn’t get the agent to stop producing MongoDB snippets. Why? I had my prompt plan at the root of the project and, as it turned out, one of the future prompts mentioned the string `(using MongoDB)`. As a result, the context for the code gen LLM looked like `[stuff] (using MongoDB) [stuff] next, create the boilerplate necessary to establish connection with the database`.
Guess what database the agent insisted on setting up the connection for?
Regretting and purging decisions is hard in the old way, but it is also common in software. Traditional programming taught us that handing requirements to the programmer in waterfall format doesn’t work well. We invented Scrum, Agile, Kanban, and all those frameworks because of this difficulty. Now that the roles are reversed and we are the ones giving requirements to the AI, we are doing the same thing all over again, just in reverse. Unsurprisingly, we again have a hard time changing requirements midway. I think we should learn from the past :)
The New Way
For the code behind this blog I wanted to try something radically different: no plan, no guardrails, just go. In my head it’s a 2-way door: If it works, I have a better way to “vibe with AI”. If it doesn’t, I’ll learn something from how it falls apart, delete it all, and restart with the old approach. I can’t lose :)
So, here is what I did instead:
- Initialize an almost blank `.cursor/rules/core.md` file (see below).
- Use the “traditional” framework setup wizards to create a new empty project.
- Manually create the desired folder structure without any files in it.
- Then loop over:
  1. Create a new file if needed.
  2. Start a new conversation.
  3. Write a prompt that tells the agent what modifications I expect to see, using exact variable names and `@reference`’ing all the `@files` and `@docs` that I think are necessary for this (see the example prompt after this list).
  4. “Vibe with AI” until the feature looks correct.
  5. Check for hallucinations in places I don’t expect to see any modifications (aka any files I didn’t ask about).
  6. Check if the modifications that exist make sense (static analysis).
  7. Once that looks good, accept all changes and commit.
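To make step 3 concrete, here is the shape of prompt I mean. The file names and the feature are made up for illustration; the point is the exact identifiers and the explicit `@` references:

```text
In @src/components/Header.astro, add a boolean prop named `showReadingTime`
(default: false). When it is true, render the value returned by
`getReadingTimeMinutes` from @src/utils/readingTime.ts next to the post date.
Follow the Tailwind conventions already used in @src/components/Footer.astro.
Do not create any new files and do not modify anything else.
```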
As for the Cursor rules I used, there are only two:
# Prime directives
The following rules are absolute and must always be adhered to:
- Do not generate files that are longer than 250 lines.
- Only create new files if explicitly asked to do so.
The first rule exists because, unless you are using Cursor’s new MAX models, Cursor considers files above 250 lines “large”. Large files get chunked and embedded as multiple entries (source), and only 250 lines of a file are used as context (source). For the second rule there is no good reason why it exists. I added it on a whim and forgot that it existed until I started writing up this post 😅
My Results
There are zero guardrails in the new approach. I expected some crazy “off the
rails” responses. There are no prompts for consistency, no constraints on where
it may do what, and there is no prior chat history it can use to orient itself.
The only rule is `don't create files`. However, to my surprise, I didn’t
encounter hallucinations. Zero. It did create buggy code and there are places
that I would have organized differently, but none of that wild spaghetti I saw
with v0 or random changes in unrelated code that I saw in my previous projects
using the old way detailed above. I’m not sure why this is. Maybe the project
is too cookie cutter; this blog is a static site written in Astro + Tailwind.
Maybe I am going too slow and should allow the AI to generate more code in one
step/prompt. Likely it’s a bit of both.
To be concrete about what worked and how well, here are the features I’ve added and how the agent handled them:
- Adding a favicon: zero-shot success (light/dark mode support in the SVG is mine)
- Scroll back to top button: zero-shot success
- Scroll depth (progress) indicator at the top: this one took a bit of interaction. Initially it generated the bar as a FAB above the “back to top” button on the right. I also spent some time experimenting with adding indicators for headings (`h2`). I didn’t like any of the designs so in the end we scrapped it.
- Social preview: zero-shot success minus a bug. The implementation doesn’t account for special requirements from social providers. For example, WhatsApp needs image files below 300 kB and doesn’t seem to support SVG. I need to add build-stage preprocessing later.
- Refactoring the page template: zero-shot success. Once we’d settled on a general header, main, footer layout, I asked the agent to pull the layout we created on the front page into a template. I expected problems, but it just worked.
- Cards on the front page: zero-shot success. Note that the height of the cards differs depending on the length of the title. I didn’t fix that because I think it’s user error, not a code problem. I should decide on a maximum character count for titles on my blog.
- Responsive layout: a bit of back and forth, but successful. It generated good code, but when I saw the results my idea of how I wanted the blog to look changed, so I modified the code manually, prompted the agent, and eventually settled on something I liked.
- Images: this is the one place where I think the agent failed, likely because I didn’t provide the right context. Astro comes with an `<Image>` component that takes care of including images and resolving path names during the build of the site. However, Claude 3.7 Sonnet insisted on raw HTML `<img>` tags. I had to manually import the first image and then guide it step by step on the second one. After that, however, I could few-shot prompt it with `use the <Image> tag for images similar to how it was done in @file1 and @file2`, and from there it started doing things correctly. (A sketch of the `<Image>` pattern follows below.)
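For reference, the pattern I ended up nudging the agent toward looks roughly like this. It is a minimal sketch assuming Astro 3+ with the built-in `astro:assets` module (older versions used the separate `@astrojs/image` integration), and the file path is hypothetical:

```astro
---
// Import the component and the image; Astro resolves the path and
// infers the image dimensions at build time.
import { Image } from 'astro:assets';
import coverImage from '../assets/posts/cover.png';
---

<!-- The processed <Image> replaces a raw <img src="..."> tag. -->
<Image src={coverImage} alt="Cover illustration for the post" />
```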
Conclusion
I really like the flexibility of this new approach. It feels … wait for it … agile. A feature does not exist until you decide that it is time to implement it, and there is nothing in the context window telling the agent otherwise. If you pivot before you get around to implementing something, the agent is none the wiser. If you backtrack after a feature is implemented, you were involved closely enough with the code to know where that feature lives. You can `@file` that place and ask the agent to nuke it.
To be fair, this project has been purely frontend. Rendering happens ahead of time, so there is no backend or database to maintain or build. Maybe I will start to see issues once I add those. My next stops are cookie consent and basic tracking/statistics. I will continue with this new approach, write a new post once I have more insights, and share it on LinkedIn when it is ready. Let’s connect if you don’t want to miss it :)