I really like the cross-LM workflow. For example, you use o3 to suggest fixes, then you copy o3's proposed fix, paste it into Gemini, and ask for a full implementation. Once Gemini has implemented it, you show the updated code to o3 (along with any feedback from the runtime) and ask it to criticize, propose further fixes, or confirm that all is good. Each model, used in isolation, eventually falls into self-repetition. Through this back-and-forth between two models, I feel like I'm less stuck in such self-repetition loops. And Gemini codes so much faster than others!

Jul 5, 2025 · 6:50 AM UTC
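A minimal sketch of that loop, assuming the OpenAI and google-generativeai Python SDKs; the model names, prompts, and the APPROVED convention are illustrative assumptions, not a description of the author's exact setup:

```python
# Sketch of the cross-LM loop: o3 as critic, Gemini as implementer.
# Assumes OPENAI_API_KEY and GOOGLE_API_KEY are set; model names are illustrative.
import os
from openai import OpenAI
import google.generativeai as genai

openai_client = OpenAI()
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini = genai.GenerativeModel("gemini-2.5-pro")

def criticize(code: str, runtime_feedback: str) -> str:
    """Ask o3 to criticize the code, propose fixes, or confirm all is good."""
    resp = openai_client.chat.completions.create(
        model="o3",
        messages=[{
            "role": "user",
            "content": "Review this code and the runtime feedback. Criticize it, "
                       "propose fixes, or reply APPROVED if all is good.\n\n"
                       f"Code:\n{code}\n\nRuntime feedback:\n{runtime_feedback}",
        }],
    )
    return resp.choices[0].message.content

def implement(code: str, proposed_fix: str) -> str:
    """Ask Gemini to turn the critic's proposed fix into a full implementation."""
    resp = gemini.generate_content(
        "Apply this proposed fix and return the complete updated code.\n\n"
        f"Proposed fix:\n{proposed_fix}\n\nCurrent code:\n{code}"
    )
    return resp.text

code = open("main.py").read()
feedback = ""
for _ in range(5):  # cap the back-and-forth
    review = criticize(code, feedback)
    if "APPROVED" in review:
        break
    code = implement(code, review)
    # here you would re-run the code/tests and capture new runtime feedback
```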

Replying to @burkov
with all this hoop-jumping I have to ask, "What are you building?"
It's stealth for now, but I see the light at the end of the tunnel, so stay tuned!
Replying to @burkov
I've found you can do this with the same model if you just tell the model that the content was from a different model.
Yes, but it's not as effective. If you take the copy of your code that Gemini just reproduced in its output, claiming it was "fixed", and start a new conversation with that code, it will converge again very fast.
Replying to @burkov
why don't you just write the code yourself instead of copying back and forth?
Write code? No, thanks, past this phase for good.
Replying to @burkov
you've become an agent
Always was!
Replying to @burkov
Glad to hear this insight and some positivity on your end, Andriy! Thanks for the tip, I'll try this in my workflow.
I'm always positive about good things and negative about bad ones. Unfortunately, the quantity of bad things (lies mostly) has been too large lately.
Replying to @burkov
I am amazed that these so-called steps toward AGI can't detect that they're running in circles. Being paid per API call doesn't give an incentive to fix it.
-- I need A and B.
-- I see, you need A and not B. Wait while I print for 15 minutes.
-- I said A AND B.
-- Ah, now I see. You just need B. I will print for 15 more minutes; you can go grab a cup of coffee.
-- I said BOTH A AND B.
-- [The user seems frustrated for no reason] I understand your frustration; here's your solution for C and NOT A. Give it a spin and let me know if you see any edge cases!
Replying to @burkov
This ensemble approach is a case where diversity actually is a strength
Replying to @burkov
You'd love @RepoPrompt! Check out @pvncher's latest video of exactly this workflow.
Replying to @burkov
Cool concept, I've been doing that subconsciously, especially when one model fails. You get the hang of which model to use when. I also manage context externally, in various markdown files. This is context engineering 🫡
Replying to @burkov
I added a proposal to @OpenCode_AI - seems like a perfect use case.
Replying to @burkov
I also use your approach, although all this copy and paste is kind of annoying. I don't know why no one has come up with a fully integrated approach for VS Code...
Replying to @burkov
We discovered this too in @jivaAI. Cross-checks or verification across models works well.
Replying to @burkov
Can you describe your process here re: the platforms and tools you're using? I'm currently doing this (kind of) by sharing GitHub links, but I feel like there must be a more efficient way.
Replying to @burkov
Yep I do this as well
Replying to @burkov
Why do you think that is? Token choice? Preferred CoT?
Replying to @burkov
I've been doing this multi-model approach for several months now. My feeling is that the tunnel vision most models succumb to is usually due to the extended context of the current thread. So, as someone else said, even if you don't switch models, if you just present the prompt to the same model in a new thread, the model has a more open perspective without the extended context of the original thread. It sees the code with "fresh eyes", somewhat like we do after sleeping on a problem, when we often wake up with a clear answer. But I also agree that presenting the problem to a reasoning model such as o3 or DeepSeek R1 provides more useful insights, which I can then take back to 4.1 to help me implement.
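A tiny sketch of that "fresh eyes" point, assuming a chat-completions-style message list; the variable and file names are illustrative:

```python
# Illustrative: why a new thread resets the model's tunnel vision.
code = open("main.py").read()
history = [
    {"role": "user", "content": f"Fix this code:\n{code}"},
    {"role": "assistant", "content": "Here is the fixed code ..."},
]

# Continuing the long thread: every earlier attempt rides along in the
# context, anchoring the model to its own previous answers.
long_thread = history + [{"role": "user", "content": "It still fails. Fix it again."}]

# Starting a new thread: the model sees only the code itself, with no
# record of earlier attempts to converge back to.
fresh_thread = [{"role": "user", "content": f"Review this code:\n{code}"}]
```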
Replying to @burkov
Nice. I use pastemax for "context engineering", then run the prompt through 2 or 3 LLMs (Gemini 2.5 Pro, o3, R1). I check their responses, pick the one that I think is the best approach, and then use that as the implementation plan for Gemini 2.5 or 2.0 Flash in Roo Code (blazing fast to implement).
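A rough sketch of that fan-out step, assuming the OpenAI Python SDK; the model list, file name, and prompt are illustrative (Gemini and R1 would need their own clients), and the winning plan is still picked by hand:

```python
# Fan out one prompt to several models, then eyeball the responses.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

prompt = open("context.md").read()  # context assembled externally (e.g. with pastemax)
plans = {m: ask(m, prompt) for m in ["o3", "gpt-4.1"]}
for model, plan in plans.items():
    print(f"--- {model} ---\n{plan}\n")
# Pick the best plan by hand and feed it to a fast coding agent as the
# implementation plan.
```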
Replying to @burkov
Definitely. I have one o3 chat for high-level thinking, then Gemini for mid-level and Claude for tactical implementation. I often "uplevel" a chat to the next AI up in the stack when I think it's not correct. And above o3 is a deep research implementation of o3.
Replying to @burkov
This is the way.
Replying to @burkov
Good idea
Replying to @burkov
This seems like the way to go, actually: using different models together by breaking up the workflow. It also controls for errors better, as you say.
Replying to @burkov
Thank you man.
Replying to @burkov
LLM pair programming :)))
Replying to @burkov
It’s basically a community of AIs helping each other stay grounded and avoid wild hallucinations.
Replying to @burkov
What about a workflow where multiple models work together without copy-paste? Would it work?