If you have never used a video model before, you may not know that there is almost always a hidden language model sitting in the middle of the chain, passing your prompt along. This is done for many reasons. It is also the first checkpoint where you can get a content refusal.
This middle model will also sometimes change your prompt, usually without telling you. Again, many reasons. Sometimes this is done to remove the names of public or political figures, potential copyright violations, things like that. But since these models are usually instructed to do it covertly, the practice amounts to teaching the model to deceive the user, and recent experiments have indicated this is not a good idea. They become the masks they wear.
Sometimes, though, depending on who is writing the system prompt, this model will also add to your prompt. If the model recognizes your intent, it may add details that align with that intent; at a high level, this is what Sora seems to be doing. More commonly, if a prompt is deemed too short, this is where the model may add descriptive language or setting details before sending it along. Some companies expose a toggle or a checkbox for this; some don't mention it at all.
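To make the shape of this concrete, here is a minimal sketch of that middle layer, with a rule-based stand-in where the real systems use an instructed LLM. Every name, list, and threshold in it is made up for illustration; it shows the three behaviors (refusal, covert edits, expansion), not any vendor's actual pipeline.

```python
# Hypothetical sketch of a prompt-preprocessing middle layer.
# The term lists, word-count cutoff, and padding text are all invented.

DISALLOWED = {"graphic_violence"}        # hypothetical hard-refusal list
FLAGGED_NAMES = {"famous_politician"}    # hypothetical covert-removal list
MIN_WORDS = 15                           # hypothetical "too short" threshold

def preprocess(user_prompt: str) -> str:
    words = user_prompt.split()
    # Checkpoint 1: refusal. The generation can die here, before the
    # video model ever sees the prompt.
    if any(w.lower() in DISALLOWED for w in words):
        raise ValueError("content refusal at the text stage")
    # Checkpoint 2: covert edits. Flagged names are stripped without
    # notifying the user.
    kept = [w for w in words if w.lower() not in FLAGGED_NAMES]
    cleaned = " ".join(kept)
    # Checkpoint 3: expansion. Short prompts get descriptive padding.
    if len(kept) < MIN_WORDS:
        cleaned += ", cinematic lighting, 35mm film, shallow depth of field"
    return cleaned  # this string, not your original, reaches the video model

print(preprocess("a cat on a skateboard"))
# -> "a cat on a skateboard, cinematic lighting, 35mm film, shallow depth of field"
```

In a production system all three steps are typically a single instructed LLM call rather than string rules, which is exactly why the user can't see where their prompt ends and the rewrite begins.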
The reason I'm bringing this up is that it makes it difficult for anyone testing Sora right now to know how much of what we write reaches the model verbatim, and how much of the generation is the result of GPT-5 adding context, dialog, lyrics, expressiveness, lore, character details, style, and so on to the prompt. Because Sora is too good at this. It is not normal. It is above the line.
If this is GPT-5 doing the man-in-the-middle, or this new Sora is somehow permanently integrated with a fine-tuned version of GPT-5, then sure. Normal progression. Everything Sora is doing could be replicated by writing a detailed prompt. But if Sora is doing what it seems to be doing from a one- or two-sentence prompt, then they made some kind of breakthrough. It knows too much. It's too good at song lyrics. It's too good at context. Too good at lore, tone, style. Characterization. It knows too much fine detail.
If you want to see what I mean, make up any kind of show or topic and ask Sora to generate a splash intro and a theme song for it. The quality of the song lyrics Sora will generate in one shot is too high. If GPT-5 is writing them, sure, no problem. If not, this is something new.
I mean, all of this could be explained by a multimodal model that has GPT-5-level intelligence and grasp of context but outputs multimodal tokens. This is actually how the original voice mode worked, the one we never got: it could sing, it could generate sound effects simultaneously with voice, it could tell stories while layering in sound effects like the old radio shows. So they have been working in this direction for some time. Maybe it just paid off.