The Wrong First Test
Why LLL can look unimpressive on toy demos, and why the real difference appears in larger, iterated projects.
If your first prompt is a tiny demo app, you are probably testing the wrong thing.
Why small demos can disappoint
LLLTS is usually a worse show-off tool for toy tasks than looser AI coding workflows.
If you ask for a tiny app or novelty feature, the workflow still pays for the expensive parts:
- follow structural rules
- require explicit specs
- produce companion tests
On a trivial project, that extra discipline can make the workflow look slower while producing an end result that does not look dramatically different from ordinary generated TypeScript. That is a real tradeoff, not a bug.
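To make that tradeoff concrete, here is a minimal sketch in plain TypeScript. It is not LLLTS syntax, and the names `clamp` and `testClamp` are hypothetical, chosen only for illustration. It shows roughly what "explicit spec plus companion test" can look like around a one-line function: the logic is tiny, but the surrounding discipline is most of the file.

```typescript
// Hypothetical illustration in plain TypeScript; this is NOT LLLTS syntax.

/**
 * Explicit spec for clamp:
 * 1. Returns `value` unchanged when min <= value <= max.
 * 2. Returns `min` when value < min, and `max` when value > max.
 * 3. Throws RangeError when min > max.
 */
function clamp(value: number, min: number, max: number): number {
  if (min > max) {
    throw new RangeError(`min (${min}) must not exceed max (${max})`);
  }
  return Math.min(Math.max(value, min), max);
}

// Companion test: one check per numbered line of the spec.
// Returns a list of failure messages; an empty list means every check passed.
function testClamp(): string[] {
  const failures: string[] = [];

  // Spec line 1: in-range values pass through.
  if (clamp(2, 0, 3) !== 2) failures.push("in-range value was changed");

  // Spec line 2: out-of-range values snap to the nearest bound.
  if (clamp(-1, 0, 3) !== 0) failures.push("value below min was not raised");
  if (clamp(5, 0, 3) !== 3) failures.push("value above max was not lowered");

  // Spec line 3: an inverted range is rejected.
  let threw = false;
  try {
    clamp(1, 3, 0);
  } catch (err) {
    threw = err instanceof RangeError;
  }
  if (!threw) failures.push("inverted range did not throw RangeError");

  return failures;
}
```

On a task this small, the spec and the test outweigh the function itself, which is why a toy demo mostly measures overhead. The same structure costs proportionally less once the function has real rules and edge cases to protect.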
Where the difference shows up
LLL becomes more valuable as the project stops being a toy.
The point is not that it creates a perfect system in one shot. It does not. The point is that when the codebase keeps growing, the same structural pressure keeps applying every time a new feature lands.
That changes the failure mode. In looser workflows, early progress can be fast, then quality falls apart as the project grows: files bloat, boundaries blur, regressions appear, and "fix the bug" turns into a recurring ritual.
With LLL, the conversation more often shifts upward, from repairing breakage to refining intent:
- "I changed my mind about how this should work."
- "This is close, but I meant a different behavior."
- "I did not specify this edge case clearly enough."
Those are still corrections, but they are different from repeatedly cleaning up preventable chaos.
How to evaluate it properly
Do not paste your whole product vision into one prompt and hope for a perfect result.
LLL still works best incrementally:
- choose a real project or a real subsystem
- start with one meaningful feature, not a throwaway demo
- keep working through several changes, not just the first scaffold
- judge the workflow by how stable the third, fourth, and fifth changes feel
That is where the difference tends to become visible: the codebase stays more coherent while the work keeps moving.
A better first trial
Pick something with enough pressure to reveal whether discipline helps:
- a nontrivial tool with many moving parts
- a system with state, rules, and edge cases
- something you expect to extend more than once
If the task is so small that raw speed is the only thing you can measure, you are unlikely to see why LLL exists.