I've been making a serious attempt at using Cursor in a C++ code base. I might still be using it wrong, but I've only managed to get it to write code that both compiles and is actually useful about once every 20 attempts, or less. When it does succeed, it's limited to very narrow tasks, never large enough to offset the time wasted commanding and helping the AI do the work. So as of today, the more I use Cursor, the bigger the productivity loss (and frustration), very far from the advertised claims. I haven't tried competitor products, but I'd expect the same unless there's some model out there trained through reinforcement learning instead of basic pattern memorization? Regardless, I'll keep trying, because I really want the super-powers; life is short and I have lots of ideas to try. Or is my experience an outlier, and are other C++ developers actually successful with these tools?

May 23, 2025 · 5:20 AM UTC

Replying to @iquilezles
Similar experience. Most of the utility comes when you need to do something that 10,000 people have already done. Perfect for little one-use throwaway shell/python/js tools. If your needs are niche or original, AI seems to not help at all.
Replying to @iquilezles
I use ChatGPT o3 (and earlier models) for 3D API questions, never for writing code directly. For obscure OpenGL stuff it's actually useful. Vulkan documentation is also huge and slow to load, so it helps there too. Of course I don't trust AI; it's just one source of info.
Replying to @iquilezles
I've been building my own AI coding tool, and I'll tell you, the issues with Cursor are:
- the default tier is optimized to maximally trim context
- their agent has to find files in your codebase, and the larger the codebase, the worse it does
- their default auto model is unknown
That said, AI quality scales proportionally with training data, and C++ is not sufficiently represented on GitHub for most LLMs to be amazing at it, compared to React, JS, and TypeScript. I'd be happy to walk you through how I think you can best use the available models, so we can find out if the issue is Cursor or AI models in general.
Replying to @iquilezles
Its autocomplete is actually good; it saves me small amounts of time constantly. It's still not there when it comes to solving novel problems.
Replying to @iquilezles
In my experience there's a direct correlation between the triviality of the LLM-assisted code and the productivity boost. Thus, in your case, I would expect practically zero productivity boost...
Replying to @iquilezles
I'm using Claude & ChatGPT in just a web browser tab, daily, when working on Playbit (C and Linux kernel stuff). I never ask them to make complete programs, but rather to give me a starting point which I then rewrite the last 40% of. A significant speed-up for me, but these LLMs are far from being able to "write code for me." I find it useful to think of these LLMs/agents/whatever as prep cooks, not sous chefs: they can save me time by chopping onions and peeling potatoes, but they couldn't cook a meal.
Replying to @iquilezles
I've always said that any sufficiently large C++ codebase is effectively its own language with its own paradigms and idioms. This is less true for most other languages. That said, PyTorch is an exceptionally complicated C++ codebase and I find current LLMs can handle it.
Replying to @iquilezles
I don't think you're using it wrong; that's the same experience I have, and I've tried multiple models for coding over the past 2 years, since GPT-3.5. I think it's at least partially training-data related. I happen to know some React / web dev, and for those things the model is more useful than it is for games or 3D graphics. But this could also be because I'm much worse at web dev than at games, which may also explain why the claims of productivity boosts come exclusively from inexperienced developers or non-developers.
Replying to @iquilezles
The hype around this stuff made me realize most of the big Twitter dev accounts are actually webdevs writing endless boilerplate, larping as engineers. o3/o4 and Claude are great at that stuff; not so much at doing anything novel.
Replying to @iquilezles
Same here. But I use LLMs daily for code snippets that I'm too lazy to write on my own. Great as a rubber-duck junior dev that sits next to you. Still, it boosted my productivity 4x, and I'm not afraid of topics that I have zero expertise in (i.e. netcode).
Replying to @iquilezles
Use it as IntelliSense, not as a chat to directly write code, and especially not from scratch.
Replying to @iquilezles
AFAIK they can train LLMs to produce valid code by feeding the output into a parser and using the result as a reward signal for the model. The problem might be that they've done this for JS and Python but not for C++, so models are good at generating valid JS but not valid C++.
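For what it's worth, a minimal sketch of what such a reward signal could look like, assuming a pipeline that shells out to a compiler (the function name and scratch-file handling here are made up for illustration, not any vendor's actual training setup):

```cpp
// Hypothetical reward: 1.0 if the candidate source passes a syntax-only
// compile, 0.0 otherwise. Assumes g++ is on PATH; a real training pipeline
// would sandbox and batch this rather than call std::system per sample.
#include <cstdlib>
#include <fstream>
#include <string>

double compile_reward(const std::string& candidate_source) {
    std::ofstream("candidate.cpp") << candidate_source;  // write scratch file
    int status = std::system("g++ -fsyntax-only -std=c++17 candidate.cpp 2>/dev/null");
    return status == 0 ? 1.0 : 0.0;  // reward only syntactically valid C++
}
```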
Replying to @iquilezles
I found more success using Claude 3.7 directly with the Aider CLI than with Cursor. I feel a problem with a lot of modern AI tools is that they try to save money by using dumber models and/or smaller contexts, and it rarely works.
Replying to @iquilezles
My experience as well.
Replying to @iquilezles
This can be (1) insufficient context, which we're solving with Brokk (brokk.ai), or (2) the model just isn't smart enough yet (wait a year and it probably will be).
Replying to @iquilezles
Idk, maybe C++ is still a rough one for models. I have a very heavy and sophisticated Rust codebase, including lots of Bevy and Dioxus (both pre-1.0 frameworks with very little documentation), and Cursor does wonders to speed me up. Starting from scratch doesn't work super well, but with existing codebase context it can figure out how the current version of a framework works, etc.
Replying to @iquilezles
Maybe the meme is real
Replying to @iquilezles
Very similar experience. It gets better with the more tedious parts:
- project setup / initial code layout
- test layout
- given a data model, writing a large number of operations on it (or, in general, anywhere there is a formal description of the task)
- documentation
Overall it's still quite time-consuming and quite frustrating; psychologically, it stops being frustrating once you start treating it as a search engine, even though it does its best to look like an individual.
Replying to @iquilezles
Very similar experience here with rendering/systems code. Most tries are failures; occasionally it's helpful. Maybe I'm bad at prompting. Also, the training set likely mostly contains web stuff, which is where the productivity gains seem to be. I found Grok to be decent among the models for C++, ChatGPT the worst.
Replying to @iquilezles
I had a very positive recent experience with o4-mini-high, though it was not about writing code but spotting a bug. I had over 4k lines of math-heavy CUDA code that I had already tested in small chunks, but it was not working as a whole. I spent at least 5 hours trying to debug it, with no luck. I just dumped all 4k lines into o4-mini-high, and it managed to find the bug on the second attempt. It was just a one-symbol typo :(. Now I quite often dump my code into an LLM and ask if it can find a bug or a typo, just as a preventive measure.
Replying to @iquilezles
I regularly use ChatGPT / Gemini for assistance in debugging C++ code. They're good enough to give examples, spot and solve some bugs in functions, and generate docstrings. But it's very hard to get them to do a very large task exactly how you want it done.
Replying to @iquilezles
The more nuance there is in the codebase, the harder these tools fail. You have to write longer prompts and give more meaningful context for them to have an honest shot. But please be careful: the code they write requires careful review, as they introduce subtle bugs.
Replying to @iquilezles
I agree, Cursor is not useful for serious projects (yet). My preferred workflow is to do the work myself and ask an LLM coding questions as needed.
Replying to @iquilezles
It mostly saves me time by sparing me from looking up boilerplate, working as an overkill IntelliSense so I don't have to look up the definitions of common classes I rarely use. I use the chat in environments where I have no idea what's happening, to gather info about the codebase, or in languages that I don't use often.
Replying to @iquilezles
Same experience here. These tools are good for autocompletion, useless otherwise.
Replying to @iquilezles
Cursor is great for writing CSS and some more basic HTML, not actual code. For code it's better to pose questions to Claude in a textbook-like Q&A format.
Replying to @iquilezles
Out of curiosity, are you using the paid version of Cursor? At this point, they plug into several SOTA models in the paid version, and some of them are considerably better than others. I don't think it is perfect yet, but it's gotten a lot better for me.
Replying to @iquilezles
I've had some luck with chat-based LLMs helping me with SIMD, since I don't know all the possible intrinsics. But similarly, it's more miss than hit, and the longer the conversation gets, the less likely it is to be useful.
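For illustration, the kind of intrinsics recall in question: a hedged sketch (my own example, not one from the thread) that horizontally sums eight floats with AVX, where remembering the exact `_mm256_*`/`_mm_*` names is the hard part an LLM can help with.

```cpp
#include <immintrin.h>

// Horizontal sum of 8 floats using AVX/SSE3 intrinsics -- the kind of
// snippet where an LLM can stand in for the Intel intrinsics guide.
// Build with: -mavx
float sum8(const float* v) {
    __m256 x  = _mm256_loadu_ps(v);           // load 8 unaligned floats
    __m128 lo = _mm256_castps256_ps128(x);    // lower 4 lanes
    __m128 hi = _mm256_extractf128_ps(x, 1);  // upper 4 lanes
    __m128 s  = _mm_add_ps(lo, hi);           // reduce 8 -> 4 lanes
    s = _mm_hadd_ps(s, s);                    // 4 -> 2
    s = _mm_hadd_ps(s, s);                    // 2 -> 1
    return _mm_cvtss_f32(s);                  // extract the scalar result
}
```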
Replying to @iquilezles
Yeah, I would avoid using AI for any systems-level programming; it's just not good, maybe because less training data is available? I'm not sure. I've tried it multiple times and it sucks.
Replying to @iquilezles
They don't work that great in existing codebases; they work best when starting from scratch and making everything overly modular. I found the most success using o3 to generate a structure and Gemini 2.5 to fill it out.
Replying to @iquilezles
You are not using it wrong. This is a fundamental problem of LLMs due to RLHF: they can never go beyond the quality of their training data. For C/C++, the training data contains so many idioms from different eras, and most of them are actually bad. As a result: garbage in, garbage out.
Replying to @iquilezles
I’m curious whether Google and other big tech companies have trained models on their internal C++ codebases for their engineers.
Replying to @iquilezles
I don't think it is able to write anything novel; however, if I am implementing something that already exists, it shortens search times by 10x. Also, routine functional scripting, like Python file processing for personal needs, is amazing with it.
Replying to @iquilezles
I find it can be useful when used very carefully, but the hype is past the reality when it comes to difficult real-world tasks. You need to leverage your rules file to get useful output: describe your project layout, coding standards, etc. Claude has been the best model I've found for C++.
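For illustration, a minimal sketch of what such a rules file might contain; Cursor picks up project rules (e.g. a .cursorrules file in the repo root) and feeds them to the model as extra context. The project details below are hypothetical:

```
# .cursorrules (hypothetical example)
- This is a C++20 codebase; src/ holds engine code, tools/ holds one-off utilities.
- Follow the existing style: snake_case functions, no exceptions in hot paths.
- Prefer editing existing files over creating new ones; never touch third_party/.
- New code must compile clean with -Wall -Werror under both clang and gcc.
```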
Replying to @iquilezles
Try Gemini 2.5 in Cursor if you haven't; I haven't tried it with C++, but it's by far the best model.
Replying to @iquilezles
I have medium-sized Objective-C codebases. I do not use Cursor. The best way I've found is to send the context using some file selectors, including the objects involved; sometimes the files are too large and need to be trimmed down to functions. Claude works best most of the time.