Building voice agent infra, CEO at @uselayercode. Prev founded @pusher, @thoughtbot

London, England
Joined November 2006
Codex beast mode unlocked today with this prompt: Read TODO.txt and for each todo task, one by one sequentially, letting the last task finish before starting the next, run `codex exec --yolo "TASK_DESCRIPTION. Rules: Update existing and write new test cases before implementing. When the task is complete and all tests pass, mark the task as done [x] in TODO.txt, then git commit the changes." &> tmp/TASK_TITLE.txt`. Read TASK_TITLE.txt and if there are any issues or errors that need addressing, the task can be resumed with additional instructions by running `codex exec resume SESSION_ID --yolo "ADDITIONAL_INSTRUCTIONS" &> tmp/TASK_TITLE_N.txt` (SESSION_ID or 'session id' can be found in the previous TASK_TITLE.txt output). When the task output describes success, and all changes have been committed so that there are no unstaged changes in the repo, then continue with the next undone task in TODO.txt.
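For reference, a minimal sketch of the same loop scripted directly in Node/TypeScript instead of prompted, assuming TODO.txt uses `- [ ]` checkbox lines. Everything beyond the `codex exec --yolo` call itself (file format, slugging, the script as a whole) is an assumption, and the session-resume step is left out for brevity:

```ts
// run-todo.ts — hypothetical driver for the prompt above.
// Assumes TODO.txt tasks look like "- [ ] Task description".
import { mkdirSync, readFileSync } from "node:fs";
import { execSync } from "node:child_process";

mkdirSync("tmp", { recursive: true });

const tasks = readFileSync("TODO.txt", "utf8")
  .split("\n")
  .filter((line) => line.startsWith("- [ ]"))
  .map((line) => line.slice("- [ ]".length).trim());

for (const task of tasks) {
  const slug = task.toLowerCase().replace(/\W+/g, "-").slice(0, 40);
  const prompt =
    `${task}. Rules: Update existing and write new test cases before ` +
    `implementing. When the task is complete and all tests pass, mark the ` +
    `task as done [x] in TODO.txt, then git commit the changes.`;
  // execSync blocks until codex exits, so tasks run strictly one at a
  // time; the &> redirection requires bash rather than sh.
  execSync(`codex exec --yolo ${JSON.stringify(prompt)} &> tmp/${slug}.txt`, {
    shell: "/bin/bash",
  });
}
```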
Best practice for having a master Codex spawn a new Codex per task in a list? Do I just tell it to use tmux with codex exec?
This one weird trick will 10x your productivity
Extremely satisfying when you get a Codex run going all night
I gave codex a markdown file to keep track of progress and let it chip away at a massive linter debt, and it worked all night and fixed ~6,000 linter/type issues. (It would stop, but I queued a massive number of 'continue's to keep it working.) Part of my prompt was to google whenever it's stuck and always update the tracker file when it learns something new. This seems to have worked... it's still working.
Codex is getting so human-like. I just caught it "Pausing for rest..."
Cartesia Sonic-3 TTS is now available on @uselayercode! This is the fastest, and one of the most natural and expressive, text-to-speech models to date. Perfect for real-time voice AI agents. Well done @cartesia_ai team! Pairs great with Deepgram Flux on our edge cloud for a super-low-latency STT + TTS combo.
New: @cartesia_ai Sonic-3 TTS is now available on Layercode 🔥 The fastest and most expressive production-ready TTS model — with 90ms model latency and 190ms end-to-end. How is Sonic-3 different to other TTS models? 👇
Damien C. Tanner retweeted
get Codex to work for hours with this prompt: > it still isn't working. please run things and test them until it's fixed
wow, codex CLI worked on a problem with this feature uninterrupted for over 1.5hrs and actually solved it 🤯 pretty incredible @embirico @OpenAICodexCli @thsottiaux @markerdmann @fouadmatin
We’re excited to be bringing @inworld_ai TTS models to @uselayercode very soon 🚀
Inworld TTS 1 Max is the new leader on the Artificial Analysis Speech Arena Leaderboard, surpassing MiniMax's Speech-02 series and OpenAI's TTS-1 series.

The Artificial Analysis Speech Arena ranks leading text-to-speech models based on human preferences. In the arena, users compare two pieces of generated speech side by side and select their preferred output without knowing which models created them. The arena includes prompts across four real-world categories: Customer Service, Knowledge Sharing, Digital Assistants, and Entertainment.

Inworld TTS 1 Max and Inworld TTS 1 both support 12 languages, including English, Spanish, French, Korean, and Chinese, and voice cloning from 2-15 seconds of audio. Inworld TTS 1 processes ~153 characters per second of generation time on average, with the larger model, Inworld TTS 1 Max, processing ~69 characters per second on average.

Both models also support voice tags, allowing users to add emotion, delivery style, and non-verbal sounds, such as "whispering", "cough", and "surprised". Both TTS-1 and TTS-1-Max are transformer-based, autoregressive models employing LLaMA-3.2-1B and LLaMA-3.1-8B respectively as their SpeechLM backbones.

See the leading models in the Speech Arena, and listen to sample clips below 🎧
Codex worked for over 20 hours this week on a single refactoring task (with lots of subtasks, I'll add) for us, and the whole thing freaking works!! I am speechless. I feel like I've got a glimpse into what it's like inside OpenAI building with Codex today.
Codex has transformed how OpenAI builds over the last few months. Have some great upcoming models too. Amazing work by the team!
Biggest pain point we have with Codex right now is it leaving unused code around and writing duplicated code. Fixes? Is knip.dev the best thing, @steipete?
What's the best prompt to get Codex to speak like a normal person? Instead of "Kick the close subject so the downstream 30s timeout starts ticking", I just want it to say "Trigger close subject and begin 30s timeout".
Now our job as developers is mostly reviewing, and salvaging tasks that codex fails at.
Codex web doesn't have an official API. But I was able to reverse engineer the API used by the Codex CLI cloud task viewer (I mean, Codex did the reverse engineering on itself).
Connected our kanban board in Notion to auto create Codex web tasks. 100x developer mode activated.
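No code was shared, but the shape of that automation is roughly this. Everything below is hypothetical: the Notion property names, the status filter, and especially `CODEX_TASKS_URL`, which stands in for the reverse-engineered endpoint (not public) and its unknown payload shape.

```ts
// Hypothetical sketch: poll a Notion kanban column and create a Codex
// web task per card via the unofficial API.
import { Client } from "@notionhq/client";

const notion = new Client({ auth: process.env.NOTION_TOKEN });
const CODEX_TASKS_URL = process.env.CODEX_TASKS_URL!; // reverse-engineered, fill in yours

async function syncBoard() {
  // Cards in the (assumed) "Ready for Codex" status column.
  const { results } = await notion.databases.query({
    database_id: process.env.NOTION_DB_ID!,
    filter: { property: "Status", status: { equals: "Ready for Codex" } },
  });

  for (const page of results) {
    // Pull the card title out of the (assumed) "Name" title property.
    const props = (page as any).properties;
    const title = props.Name?.title?.[0]?.plain_text ?? "(untitled)";

    const res = await fetch(CODEX_TASKS_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.CODEX_TOKEN}`,
      },
      body: JSON.stringify({ prompt: title }), // payload shape is a guess
    });
    if (res.ok) console.log(`Created Codex task for: ${title}`);
  }
}

syncBoard().catch(console.error);
```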
Try it yourself — switching takes 30 seconds in the @uselayercode dashboard. If you haven't already, sign up for a free developer account and get $100 free credits → dash.layercode.com/sign-up
But model speed is only half the equation. Network latency matters just as much! Running Flux at the edge via Cloudflare's 330+ locations allows @uselayercode to process audio in sub-50ms + take full advantage of Flux’s smart turn-taking. The end result = voice agents that respond with human-like timing to users anywhere in the world.
Flux is the first Conversational Speech Recognition model with intelligent turn-taking embedded directly in the model. ~260ms end-of-turn detection, without sacrificing accuracy = voice agent conversations that feel natural.
The #1 voice AI complaint: Agents cut you off mid-sentence or wait awkwardly after you finish. Why? Traditional STT models don't understand dialogue flow because they were built for transcription, not conversation. @deepgramai’s amazing new Flux model solves this…
New in @uselayercode: Voice agents that actually know when you're done talking. No awkward interruptions or robotic pauses. Powered by @deepgram Flux + Layercode's edge network.
New: @DeepgramAI Flux is available on Layercode. Flux is the first STT model built specifically for voice agents. Even better: We run Flux at the network edge ⚡ Layercode voice agents process audio in sub-50ms and take full advantage of Flux's smart turn-taking to respond with human-like timing to users anywhere in the world.
Made a vibe-coding app starter repo based on the lindy (i.e. vanilla JS and CSS) setup that works really well with Codex and Claude Code. github.com/layercodedev/hono…
Best stack I've used for vibe coding is not React with Tailwind. It's Hono with server-side rendered pages. I like to use server-side JSX templating with plain vanilla CSS, and when some client-side interactivity is required, I reach for either HTMX or vanilla JavaScript.

The reason this works so well: LLMs know vanilla JavaScript and CSS better than anything else. Frameworks like React or Tailwind CSS just add another layer of complexity that the LLM has to translate through to get to what you want. Layout issues and other problems I had with a more modern stack are gone now.
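As a taste of the stack (a minimal sketch, not the starter repo itself): a Hono app serving one server-rendered JSX page, assuming tsconfig sets `"jsx": "react-jsx"` and `"jsxImportSource": "hono/jsx"`.

```tsx
// index.tsx — server-side rendered page with Hono's built-in JSX.
import { Hono } from "hono";

const app = new Hono();

// A plain function component — just HTML out, no client-side framework.
const Page = (props: { name: string }) => (
  <html>
    <head>
      {/* vanilla CSS, served as a static file */}
      <link rel="stylesheet" href="/style.css" />
    </head>
    <body>
      <h1>Hello, {props.name}</h1>
      {/* reach for HTMX or vanilla JS here only when interactivity is needed */}
    </body>
  </html>
);

app.get("/", (c) => c.html(<Page name="world" />));

export default app;
```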