This is a work-in-progress log of my thoughts on vibe coding, AI, and coding.
[11:54 am, 13/2/2025] aahnik: software is not just the code, actually.. it's a logical structure composed by humans who have a deep understanding of the domain problem and the software architecture
Can LLMs emulate that logical structure? I believe in science; let's not pass comments without experimenting..
Can we make an autonomous team of AI agents that can build a production-grade DB or compiler from scratch?
No human intervention (no human will guide it on what to do or how to do it; it will just be given an initial one-page, high-level spec doc), but it has access to all human data and it can ask questions in public forums. It needs to write:
all its design docs
formal proofs of working
test cases
a proof of how the architecture complies with the SLAs
and iterate
I guess we can create a new benchmark for LLMs by asking them to create a DB/compiler from scratch
[11:55 am, 13/2/2025] aahnik: this will be an experiment, and we need to define the parameters for judging the final output of the LLMs
[11:57 am, 13/2/2025] aahnik: we can apply to top US confs.. I am open to co-authors
We will emulate the roles/team structure of the world's top DB companies, and give those exact same job roles to LLMs.
Benchmark 1: can it solve existing problems (whose solutions are publicly documented)?
Benchmark 2: can it solve new problems (at the frontier of databases and compilers)?
[11:59 am, 13/2/2025] aahnik: I want to create a benchmark that top AI companies will evaluate their new models on
[12:00 pm, 13/2/2025] aahnik: base paper: https://arxiv.org/pdf/2412.14161
[12:02 pm, 13/2/2025] aahnik: also, research on “what LLMs can and cannot do” inside software development will be very valuable. What tasks can LLMs do well? Why? What tasks do LLMs struggle with? Why? This direction of thought may produce nice papers and products
[12:14 pm, 13/2/2025] aahnik: if we can enlist all specifically well-defined classes of SWE tasks that LLMs struggle with, then we can
break down those tasks into smaller tasks and try different techniques for making LLMs do them.
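To make the earlier point about "defining parameters for judging the final output" concrete, here is a rough sketch of how one benchmark task and its scoring rubric could be encoded. Every name and weight here (BenchmarkTask, judging_criteria, the artifact list) is a hypothetical placeholder, not an existing benchmark's schema.

```python
# Hypothetical sketch of one benchmark task and its judging parameters.
# None of these fields come from an existing benchmark; they just make
# the "define parameters of judging" idea concrete.
from dataclasses import dataclass, field


@dataclass
class BenchmarkTask:
    task_id: str
    spec_doc: str                     # the 1-page, high-level spec given to the agents
    track: str                        # "existing-problem" or "frontier-problem"
    required_artifacts: list[str] = field(default_factory=lambda: [
        "design_docs", "formal_proofs", "test_cases", "sla_compliance_report",
    ])
    judging_criteria: dict[str, float] = field(default_factory=lambda: {
        "correctness_tests_passed": 0.4,   # fraction of a hidden test suite passed
        "sla_compliance": 0.3,             # latency/throughput targets met
        "design_doc_quality": 0.2,         # rated by human reviewers
        "proof_validity": 0.1,             # do the formal arguments hold up?
    })


def score(task: BenchmarkTask, results: dict[str, float]) -> float:
    """Weighted sum of per-criterion scores, each in [0, 1]."""
    return sum(w * results.get(name, 0.0) for name, w in task.judging_criteria.items())


example = BenchmarkTask(
    task_id="db-from-scratch-001",
    spec_doc="Build a single-node key-value store with WAL-based durability.",
    track="existing-problem",
)
print(score(example, {"correctness_tests_passed": 0.9, "sla_compliance": 0.7}))
```

A weighted sum is just one possible design; for the frontier track, human review of the design docs and proof validity would probably have to carry more of the weight.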
How does a senior human software engineer think? We have human thoughts and decisions in the design-doc wikis of all companies. Can LLMs reason with that? If you specifically define a task, be it a security constraint or a performance constraint, an LLM can actually do it.
To prevent a security disaster, an LLM can write all the test cases, so human work can be much reduced. A gang of agents can try all known pen-testing methods, tirelessly, 24/7.
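A minimal sketch of that "gang of agents" idea, assuming a simple thread pool. run_agent is a hypothetical stand-in for whatever LLM-agent call you would actually use, and the method list is illustrative only.

```python
# Sketch: a pool of workers that keeps cycling through known pen-testing
# categories. In a real deployment the outer loop would run indefinitely
# ("24/7"); here it runs a fixed number of rounds for demonstration.
import itertools
from concurrent.futures import ThreadPoolExecutor

KNOWN_METHODS = ["sql_injection", "xss", "auth_bypass", "path_traversal", "fuzzing"]


def run_agent(method: str, target: str) -> str:
    # Placeholder: in a real system this would drive an LLM agent that
    # generates and executes test cases for the given attack class.
    return f"[{method}] no findings on {target} (stub)"


def patrol(target: str, rounds: int = 3, workers: int = 4) -> list[str]:
    """Run every known method against the target, `rounds` times over."""
    findings = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = itertools.chain.from_iterable([KNOWN_METHODS] * rounds)
        for report in pool.map(lambda m: run_agent(m, target), jobs):
            findings.append(report)
    return findings


for line in patrol("https://staging.example.com"):
    print(line)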
How long would it take for agents to surpass humans in all the tasks we seemingly consider difficult? What exactly counts as "difficult" needs to be defined first.
Instead of thinking in terms of single prompt -> single output, we need to think in terms of agents that autonomously do stuff, iterate on their mistakes, learn from them, and get better over the years.
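A minimal sketch of that agent framing, assuming hypothetical llm and run_tests stand-ins: the loop feeds test failures back into the model instead of stopping at one prompt and one output.

```python
# Sketch of the agent loop versus one-shot prompting: generate, test,
# feed the failures back in, repeat. `llm` and `run_tests` are
# hypothetical stand-ins, not any specific framework's API.
def llm(prompt: str) -> str:
    raise NotImplementedError("call your model of choice here")


def run_tests(code: str) -> list[str]:
    raise NotImplementedError("execute the test suite, return failure messages")


def agent_loop(task: str, max_iters: int = 10) -> str:
    code = llm(f"Write code for this task:\n{task}")
    for _ in range(max_iters):
        failures = run_tests(code)
        if not failures:
            return code  # done: all tests pass
        # The key difference from one-shot prompting: mistakes go back in.
        code = llm(f"Task:\n{task}\n\nYour code:\n{code}\n\n"
                   f"These tests failed:\n{chr(10).join(failures)}\n\nFix the code.")
    return code
```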
Refer: the two famous nmn.gl articles on vibe coding and related topics.
wrt the ThePrimeagen video:
The video thumbnail is clickbait; the creator of the video is not saying to stop using AI.
If someone only uses a bike and a car, he will lose the ability to run well. The gym is required to keep being able to run well.
A better analogy is a soldier. A soldier cannot go to battle with just his body and a knife; he needs machine guns and missiles. But soldiers still need to do drills to keep themselves fit, mentally and physically.
To ship relentlessly without getting tired, one needs a sharp and clear mind with focus. The techniques in the video will help achieve that, along with a deeper understanding. In the long run, the solution to all problems is always a deeper understanding.
With AI we are thinking at higher levels of abstraction instead of lower levels. Thinking is a must. Thinking is not going anywhere. Investing an hour or two daily in sharpening thinking will go a long way.
Goal 1. Ship the feature (cut the tree): 90% of the time.
Goal 2. Sharpen your axe (sharpen your thinking and focus, the brain gym): 10% of the time.
If we don't do Goal 2, Goal 1 will eventually slow down.