I've been making a serious attempt at using Cursor in a C++ code base. I might still be using it wrong, but I've only managed to get it to write code that both compiles and is actually useful about once every 20 attempts, or less. When it does succeed, it's limited to very narrow tasks, never large enough to offset the time wasted commanding and helping the AI do the work. So as of today, the more I use Cursor, the bigger the productivity loss (and frustration), very far from the advertised claims. I haven't tried competitor products, but I'd expect the same unless there's some model out there trained through reinforcement learning instead of basic pattern memorization? Regardless, I'll keep trying, because I really want the super-powers; life is short and I have lots of ideas to try. Or is my experience an outlier, and are other C++ developers actually successful with these tools?

May 23, 2025 · 5:20 AM UTC

Replying to @iquilezles
Similar experience. Most of the utility comes when you need to do something that 10,000 people have already done. Perfect for little one-use throwaway shell/python/js tools. If your needs are niche or original, AI seems to not help at all.
Replying to @iquilezles
I use ChatGPT o3 (and earlier models) for 3D API questions, never for writing code directly. For obscure OpenGL stuff it's actually useful. Vulkan documentation is also huge and slow to load, so it helps there too. Of course I don't trust AI; it's just one source of info.
Replying to @iquilezles
I've been building my own AI coding tool, and I'll tell you, the issues with Cursor are:
- the default tier is optimized to maximally trim context
- their agent has to find files in your codebase, and the larger the codebase, the worse it does
- their default auto model is unknown
That said, AI quality scales proportionally with training data, and C++ is not sufficiently represented on GitHub for most LLMs to be amazing at it, compared to React, JS, and TypeScript. I'd be happy to walk you through how I think you can best use the available models, so we can find out if the issue is Cursor or AI models in general.
Replying to @iquilezles
Its autocomplete is actually good; it saves me small amounts of time constantly. It's still not there when it comes to solving novel problems.
Replying to @iquilezles
In my experience there's a direct correlation between the triviality of the LLM-assisted code and the productivity boost. Thus, in your case, I would expect practically zero productivity boost...
Replying to @iquilezles
I'm using Claude & ChatGPT in just a web browser tab, daily, when working on Playbit (C and Linux kernel stuff). I never ask them to make complete programs, but rather to give me a starting point which I then rewrite the last 40% of. A significant speed-up for me, but these LLMs are far from being able to "write code for me." I find it useful to think of these LLMs/agents/whatever as prep cooks, not sous chefs: they can save me time by chopping onions and peeling potatoes, but they couldn't cook a meal.
Replying to @iquilezles
I've always said that any sufficiently large C++ codebase is effectively its own language with its own paradigms and idioms. This is less true for most other languages. That said, PyTorch is an exceptionally complicated C++ codebase and I find current LLMs can handle it.
Replying to @iquilezles
I don't think you're using it wrong; that's the same experience I have, and I've tried multiple models for coding over the past 2 years, since GPT-3.5. I think it's at least partially training-data related. I happen to know some React / web dev, and for those things the model is more useful than it is for games or 3D graphics. But this could also be because I'm much worse at web dev than at games, which may also explain why the claims of productivity boosts come exclusively from inexperienced developers or non-developers.
Replying to @iquilezles
The hype around this stuff made me realize most of the big Twitter dev accounts are actually webdevs writing endless boilerplate, larping as engineers. o3/o4 and Claude are great at that stuff; not so much at doing anything novel.
Replying to @iquilezles
Same here. But I use LLMs daily for code snippets that I'm too lazy to write on my own. Great as a rubber-duck junior dev that sits next to you. Still, it boosted my productivity 4x, and I'm not afraid of topics that I have zero expertise in (i.e. netcode).
Replying to @iquilezles
Use it as IntelliSense, not as a chat to directly write code, and especially not from scratch.
Replying to @iquilezles
AFAIK they can train LLMs to produce valid code by feeding the output into a parser and using the result as a reward signal for the model. The problem might be that they've done this for JS and Python but not for C++, so models are good at generating valid JS but not valid C++.
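For what it's worth, a minimal sketch of what such a reward signal could look like, assuming a pipeline that shells out to a compiler (the function name and scratch-file handling here are made up for illustration, not any vendor's actual training setup):

```cpp
// Hypothetical reward: 1.0 if the candidate source passes a syntax-only
// compile, 0.0 otherwise. Assumes g++ is on PATH; a real training pipeline
// would sandbox and batch this rather than call std::system per sample.
#include <cstdlib>
#include <fstream>
#include <string>

double compile_reward(const std::string& candidate_source) {
    std::ofstream("candidate.cpp") << candidate_source;  // write scratch file
    int status = std::system("g++ -fsyntax-only -std=c++17 candidate.cpp 2>/dev/null");
    return status == 0 ? 1.0 : 0.0;  // reward only syntactically valid C++
}
```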
Replying to @iquilezles
I found more success using Claude 3.7 directly with the Aider CLI than with Cursor. I feel a problem with a lot of modern AI tools is that they try to save money by using dumber models and/or smaller contexts, and it rarely works.
Replying to @iquilezles
My experience as well.
Replying to @iquilezles
This can be (1) insufficient context, which we're solving with Brokk (brokk.ai), or (2) the model just isn't smart enough yet (wait a year and it probably will be).
Replying to @iquilezles
Idk, maybe C++ is still a rough one for models. I have a very heavy and sophisticated Rust codebase, including lots of Bevy and Dioxus (both pre-1.0 frameworks with very little documentation), and Cursor does wonders to speed me up. Starting from scratch doesn't work super well, but with existing codebase context it can figure out how the current version of a framework works, etc.
Replying to @iquilezles
Maybe the meme is real
Replying to @iquilezles
Very similar experience. It gets better with the more tedious parts:
- project setup / initial code layout
- test layout
- given a data model, writing a large number of operations on it (or, in general, anywhere there is a formal description of the task)
- documentation
Overall it's still quite time-consuming and quite frustrating; psychologically, it stops being frustrating once you start treating it as a search engine, even though it does its best to look like an individual.
Replying to @iquilezles
Very similar experience here with rendering/systems code. Most tries are failures; occasionally it's helpful. Maybe I'm bad at prompting. Also, the training set likely mostly contains web stuff, which is where the productivity gains seem to be. I found Grok to be decent among the models for C++, ChatGPT the worst.
Replying to @iquilezles
I had a very positive recent experience with o4-mini-high, though it was not about writing code but spotting a bug. I had over 4k lines of math-heavy CUDA code that I had already tested in small chunks, but it was not working as a whole. I spent at least 5 hours trying to debug it, with no luck. I just dumped all 4k lines into o4-mini-high, and it managed to find the bug on the second attempt. It was just a one-symbol typo :(. Now I quite often dump my code into an LLM and ask if it can find a bug or a typo, just as a preventive measure.
Replying to @iquilezles
I regularly use ChatGPT / Gemini for assistance in debugging C++ code. They're good enough to give examples, spot and solve some bugs in functions, and generate docstrings. But it's very hard to get them to do a very large task exactly how you want it done.
Replying to @iquilezles
The more nuance there is in the codebase, the harder these tools fail. You have to write longer prompts and give more meaningful context for them to have an honest shot. But please be careful: the code they write requires careful review, as they introduce subtle bugs.
Replying to @iquilezles
I agree, Cursor is not useful for serious projects (yet). My preferred workflow is to do the work myself and ask an LLM coding questions as needed.
Replying to @iquilezles
It mostly saves me time by sparing me from looking up boilerplate, working as an overkill IntelliSense so I don't have to look up the definitions of common classes I rarely use. I use the chat in environments where I have no idea what's happening, to gather info about the codebase, or in languages that I don't use often.
Replying to @iquilezles
Same experience here. These tools are good for autocompletion, useless otherwise.
Replying to @iquilezles
Cursor is great for writing CSS and some more basic HTML, not actual code. For code it's better to pose questions to Claude in a textbook-like Q&A format.
Replying to @iquilezles
Out of curiosity, are you using the paid version of Cursor? At this point, they plug into several SOTA models in the paid version, and some of them are considerably better than others. I don't think it is perfect yet, but it's gotten a lot better for me.
Replying to @iquilezles
I've had some luck with chat-based LLMs helping me with SIMD, since I don't know all the possible intrinsics. But similarly, it's more miss than hit, and the longer the conversation gets, the less likely it is to be useful.
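For illustration, the kind of intrinsics recall in question: a hedged sketch (my own example, not one from the thread) that horizontally sums eight floats with AVX, where remembering the exact `_mm256_*`/`_mm_*` names is the hard part an LLM can help with.

```cpp
#include <immintrin.h>

// Horizontal sum of 8 floats using AVX/SSE3 intrinsics -- the kind of
// snippet where an LLM can stand in for the Intel intrinsics guide.
// Build with: -mavx
float sum8(const float* v) {
    __m256 x  = _mm256_loadu_ps(v);           // load 8 unaligned floats
    __m128 lo = _mm256_castps256_ps128(x);    // lower 4 lanes
    __m128 hi = _mm256_extractf128_ps(x, 1);  // upper 4 lanes
    __m128 s  = _mm_add_ps(lo, hi);           // reduce 8 -> 4 lanes
    s = _mm_hadd_ps(s, s);                    // 4 -> 2
    s = _mm_hadd_ps(s, s);                    // 2 -> 1
    return _mm_cvtss_f32(s);                  // extract the scalar result
}
```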
Replying to @iquilezles
Yeah, I would avoid using AI for any systems-level programming; it's just not good, maybe because less training data is available? I'm not sure. I've tried it multiple times and it sucks.
Replying to @iquilezles
They don't work that great in existing codebases; they work best when starting from scratch and making everything overly modular. I found the most success using o3 to generate a structure and Gemini 2.5 to fill it out.
Replying to @iquilezles
You are not using it wrong. This is a fundamental problem of LLMs due to RLHF: they can never go beyond the quality of their training data. For C/C++, the training data contains so many idioms from different eras, and most of them are actually bad. As a result: garbage in, garbage out.
Replying to @iquilezles
I’m curious whether Google and other big tech companies have trained models on their internal C++ codebases for their engineers.
Replying to @iquilezles
I don't think it is able to write anything novel; however, if I am implementing something that already exists, it shortens search times by 10x. Also, routine functional scripting, like Python file processing for personal needs, is amazing with it.
Replying to @iquilezles
I find it can be useful when used very carefully, but the hype is past the reality when it comes to difficult real-world tasks. You need to leverage your rules file to get useful output: describe your project layout, coding standards, etc. Claude has been the best model I've found for C++.
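For illustration, a minimal sketch of what such a rules file might contain; Cursor picks up project rules (e.g. a .cursorrules file in the repo root) and feeds them to the model as extra context. The project details below are hypothetical:

```
# .cursorrules (hypothetical example)
- This is a C++20 codebase; src/ holds engine code, tools/ holds one-off utilities.
- Follow the existing style: snake_case functions, no exceptions in hot paths.
- Prefer editing existing files over creating new ones; never touch third_party/.
- New code must compile clean with -Wall -Werror under both clang and gcc.
```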
Replying to @iquilezles
Try Gemini 2.5 in Cursor if you haven't; I haven't tried it with C++, but it's by far the best model.
Replying to @iquilezles
I have medium-sized Objective-C codebases. I do not use Cursor. The best way I've found is to send the context using some file selectors, including the objects involved; sometimes the files are too large and need to be trimmed down to functions. Claude works best most of the time.