AGI at Keen Technologies, former CTO Oculus VR, Founder Id Software and Armadillo Aerospace

Dallas, TX
Joined August 2010
I recently learned about Cayley transforms. Similar to how you can parameterize a 3x3 rotation matrix by 3 Euler angles or a 4-element quaternion, Cayley transforms let you parameterize an N-dimensional rotation matrix with just the N*(N-1)/2 unique values of a skew-symmetric matrix, saving more than half the parameters and guaranteeing that the matrix will always be orthogonal. Unfortunately, the transformation involves a linalg.solve() or pinverse(), so it gets slow with thousands of dimensions. Still, I am happy to have this in my mental toolbox now! en.wikipedia.org/wiki/Cayley…
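A minimal PyTorch sketch of the parameterization, using the standard form Q = (I - A)(I + A)^-1 for skew-symmetric A; the helper names here are mine, for illustration:

```python
import torch

def skew_from_params(theta, n):
    # Pack n*(n-1)/2 free parameters into a skew-symmetric matrix A = -A^T.
    A = torch.zeros(n, n, dtype=theta.dtype, device=theta.device)
    iu = torch.triu_indices(n, n, offset=1)
    A[iu[0], iu[1]] = theta
    return A - A.T

def cayley(A):
    # Q = (I - A)(I + A)^-1 is orthogonal whenever A is skew-symmetric.
    # (I - A) and (I + A)^-1 commute, so one linalg.solve() does the job;
    # that solve is the part that gets slow at thousands of dimensions.
    I = torch.eye(A.shape[-1], dtype=A.dtype, device=A.device)
    return torch.linalg.solve(I + A, I - A)

n = 5
theta = torch.randn(n * (n - 1) // 2, dtype=torch.float64)
Q = cayley(skew_from_params(theta, n))
assert torch.allclose(Q @ Q.T, torch.eye(n, dtype=torch.float64), atol=1e-10)
```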
The scrolling console has been a valuable paradigm since physical teletypewriters in the 60s, and millions of programs have coded directly to that interface. It feels like we missed a sweet spot in output options just beyond that. There are many times I use a Jupyter notebook just because I want to “print some images”, and I wish I could instead have traditional console programs write a simple escape sequence header and dump 8 bit RGB image data to the terminal.

Some terminals like Kitty do offer image capabilities, but chunking and base64 encoding the data takes it from “just print” to “import and call a library function”, and the capabilities aren’t going to be shared by all the other terminal outputs like tmux, Visual Studio Code, and PowerShell.

Terminal support is one of the ugliest Unix legacies, with vestiges of literally 1970s technology still getting in the way today, but I despair of it getting really cleaned up. It took over a decade for arrow keys to mostly work. As an aside, I have written exactly one ncurses program, ten years ago: the cockpit display for the engine control on the Rocket Racing League rocket planes.

Web pages largely won the UI war with a retained mode interface, but imperative, immediate mode control always has an appeal to me. It is notable that applications are almost never coded directly to the immediate mode graphics APIs like GDI, or the hardware accelerator queues that lie below them; they are not “useful enough” by themselves, so substantial GUI frameworks are interposed between the API and the application. Games are an exception (or at least used to be; rather less common today) with direct coding to D3D/OpenGL/Metal/Vulkan, but it is orders of magnitude more work than the classic “hello world”, and doing it from scratch is a badge of honor today.

Wrapping and scrolling text in a terminal window is “useful”. If we just had a binary stretch-blit option…
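For contrast, here is roughly what the “just print” path costs today under the Kitty graphics protocol, with the chunking and base64 encoding the post laments; a minimal sketch assuming a Kitty-compatible terminal, with the function name being mine:

```python
import base64
import sys

def kitty_show_rgb(pixels: bytes, width: int, height: int):
    # Transmit raw 8-bit RGB data (f=24) and display it (a=T) via the
    # Kitty graphics protocol: base64 payload in <=4096 byte chunks,
    # m=1 on every chunk except the last.
    data = base64.standard_b64encode(pixels)
    first = True
    while data:
        chunk, data = data[:4096], data[4096:]
        keys = f"a=T,f=24,s={width},v={height}," if first else ""
        more = 1 if data else 0
        sys.stdout.write(f"\x1b_G{keys}m={more};{chunk.decode('ascii')}\x1b\\")
        first = False
    sys.stdout.flush()

# Usage: a 64x64 solid orange square, if the terminal speaks the protocol.
kitty_show_rgb(bytes([255, 128, 0]) * 64 * 64, 64, 64)
```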
John Carmack retweeted
Book list for street fighting computer scientists
Remember radiosity? There was a period where it was a hot topic for global illumination, but ray tracing based solutions proved so much more flexible and efficient at scale that it is now just a historical curiosity. en.wikipedia.org/wiki/Radios…
My distracting thought for the morning is that you could have actually made a decent rhythm game on 8 bit Apple 2 hardware, but I don’t recall any examples. Screen presentation and effects would have to be minimal, and you wouldn’t get any high fidelity audio, but you could totally do a “drum simulator” with sub-millisecond accuracy and response time from an interleaved polling and speaker drive loop. Combined with all of the game engagement mechanics that don’t require much processing, it could have been compelling. 40 years ago…
I wonder if the reticle limits in chip fabrication could be expanded if you were restricted to using less dense patterning in the periphery, where the focus isn’t as sharp. Or, if the quality of focus is radial, could you pattern slightly more circuitry into a circular die than into the conventional square inscribed in it? (The full circle holds pi/2, about 1.57x, the area of its inscribed square, though the outer margin would be the worst-focused part.) Yield argues against ever larger dies, but there may be edge cases where giant dies remain more appealing than chiplets.
I quickly rehashed a few of the arguments off the top of my head for the LIBBA folks a couple years ago. My half of the exchanges:

— I argued strenuously against building a fully custom XR OS inside Meta, and I still don't think it is a good idea. The benefits that a custom OS can bring are outweighed by the development costs and the burden placed on new platform developers.

— I always felt that the ultra-constrained-platform argument for a new OS wasn't very good. If the platform really needs to watch every cycle that tightly, you aren't going to be a general purpose platform, and you might as well just make a monolithic C++ embedded application, rather than a whole new platform that is very likely to have a low shelf life as the hardware platform evolves. Foveated rendering is arguably a net loss on Quest Pro vs just using that same power to run the GPU at a higher clock speed. In no way is it a dramatic win. I can imagine scenarios where it wins, but the wins are harder than most people expect. For today's see-through AR, the field of view isn't big enough to justify it even in the best cases. SCHED_FIFO and friends are already good enough for real time performance on Linux derived systems. I agree that there is a good place for SLAM to adaptively turn down to just tracking a few points on a single camera when the head is nearly stationary, and this would be valuable.

— I am all for building something from scratch to directly satisfy user needs. The mistake that I see people making when talking about building a new OS is believing that the creation of a new platform will be a draw for third party developers to commit resources. If you aren't counting on that, and all OS efforts are because they will directly enable your own first party value creation, then great -- although I am still skeptical about the AR value proposition with today's technology.
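For reference on the SCHED_FIFO point, opting a Linux process into the real-time FIFO scheduler really is this small (needs root or CAP_SYS_NICE; the priority value here is arbitrary):

```python
import os

# Put this process in the real-time SCHED_FIFO class at priority 50
# (valid priorities are 1-99; Linux only).
os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))
```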
I had been meaning to comment on @Jonathan_Blow’s “Why can’t we even conceive of writing a new OS today” post. Coincidentally, I just got an email that opened with:

———————————————
Hey John, 2 years ago I pitched you LIBBA - a dedicated OS for smart glasses. You were sceptical, your main concern was that a custom OS rarely justifies itself: cost, shelf life, and developer burden outweigh the benefits. You were right.
———————————————

I deeply love the ideals of clear, efficient programs that do their job without baggage, and I have always been very sympathetic to efforts like Oberon, Plan 9, and even TempleOS. But building a new operating system today doesn’t make any product sense.

Meta spent a lot of resources working on a fully custom XROS, over my rather strenuous objections. They had top tier engineering talent, tons of support, and they were producing high quality code and docs. It was a best case scenario from a “new OS” perspective, and, as one of the engineers put it, “If we can’t do it, who could?”

I wish I could drop (so many of) my old internal posts publicly, since I don’t really have the incentive to relitigate the arguments today – they were carefully considered and prescient. They also got me reported to HR by the manager of the XROS effort for supposedly making his team members feel bad, but I expect many of them would acknowledge in hindsight that the Meta products would not be in a better place today if the new OS effort had been rammed into them.

I can only really see a new general purpose OS arriving due to essentially sacrificing a highly successful product’s optimality to the goal of birthing the new OS, and I wouldn’t do that myself as a stakeholder. To make something really different, and not get drawn into the gravity well of existing solutions, you practically need an isolated monastic order of computer engineers. Which was sort of Plan 9…
The premise of the Ring of Fire books is the transportation of a modern West Virginia town to 17th century Europe, and I have been enjoying the alternate history quite a bit. They aren’t for everyone; the first few chapters of book 3 revolve around an industrial accident in a petroleum refinery cobbled together from 17th century tech, which probably doesn’t meet the literature bar of many. Beyond just fun, I find them wholesome. It is obviously about the power of technology, but also organization and ideals, and it points out pockets of high competence existing in every time and place.
Starlink announced a $5/month plan that gives unlimited usage at 500 kbit/s. Modern apps and web pages would immediately back up every buffer and make for a painful experience, but it is fun to consider optimizing inside that tight box (in the 90s, 4x ISDN at 512 kbit/s was high end!). With Starlink’s good latency, input-distribution multiplayer games would still work fine, as long as they didn’t download anything. You could even have voice chat. With an optimal implementation, you could scroll the X feed as fast as you want with full text, with progressive images coming in when you pause, but instantly ceasing bandwidth use as you start scrolling again. Remote shells would work well by default, but we could do a lot better than standard ANSI for complex updates. Server-rendered web pages or apps could at least be progressively rendered and text-first, with full responsiveness even when the fidelity is low.
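Some back-of-envelope numbers for that box; a sketch with my own illustrative payload sizes, not figures from the post:

```python
# Rough budget math for a 500 kbit/s pipe (illustrative assumptions).
link_Bps = 500_000 / 8             # ~62.5 KB/s of payload

# Lockstep input-distribution game: 8 players sending ~16 bytes of input
# state 60 times a second is only ~7.7 KB/s.
game_Bps = 8 * 16 * 60
print(game_Bps / link_Bps)         # ~0.12 of the budget

# Full-text feed scrolling is easy at ~500 bytes per post, and a 50 KB
# progressive image arrives in under a second when you pause.
print(link_Bps / 500)              # ~125 posts/s
print(50_000 / link_Bps)           # ~0.8 s per image
```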
There have been a lot of crazy many-camera rigs created for the purpose of capturing full spatial video. I recall a conversation at Meta that was basically “we are going to lean in as hard as possible on classic geometric computer vision before looking at machine learning algorithms”, and I was supportive of that direction. That was many years ago, when ML still felt like unpredictable alchemy, and of course you want to maximize your use of the ground truth!

Hardcore engineering effort went into camera calibration, synchronization, and data processing, but it never really delivered on the vision. No matter how many cameras you have, any complex moving object is going to have occluded areas, and “holes in reality” stand out starkly to a viewer not exactly at one of the camera points. Even when you have good visibility, the ambiguities in multi camera photogrammetry make things less precise than you would like.

There were also some experiments to see how good you could make the 3D scene reconstruction from the Quest cameras using offline compute, and the answer was still “not very good”, with quite lumpy surfaces. Lots of 3D reconstructions look amazing scrolling by in the feed on your phone, but not so good blown up to a fully immersive VR rendering and put in contrast to a high quality traditional photo.

You really need strong priors to drive the fitting problem and fill in coverage gaps. For architectural scenes, you can get some mileage out of simple planar priors, but modern generative AI is the ultimate prior.

Even if the crazy camera rigs fully delivered on the promise, they still wouldn’t have enabled a good content ecosystem. YouTube wouldn’t have succeeded if every creator needed a RED Digital Cinema camera.

The (quite good!) stereoscopic 3D photo generation in Quest Instagram is a baby step towards the future. There are paths to stereo video and 6DOF static, then eventually to 6DOF video. Make everything immersive, then allow bespoke tuning of immersive-aware media.
Natural conversation includes interruptions and talking over people, which is hard for an LLM to model as a single autoregressive sequence. I’m sure you can get pretty far by creating a text sequence with movie-script like breaks mid sentence, but it seems like the real solution would involve parallel streams of listening and thinking with talking queued up for pauses or rising to an interruption priority. Intermixing tokens from different streams and doing something custom with attention seems plausible.
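A toy sketch of the token-intermixing idea, assuming PyTorch: tag each token with a learned stream embedding (heard vs. spoken) and model the interleaved sequence with one causal transformer. Everything here is illustrative, not a known working recipe:

```python
import torch
import torch.nn as nn

class DuplexToy(nn.Module):
    # Two interleaved token streams (what the model hears, what it says),
    # distinguished by a stream embedding and modeled as one causal sequence.
    def __init__(self, vocab=256, d=128, n_streams=2):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.stream = nn.Embedding(n_streams, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.core = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, vocab)

    def forward(self, tokens, stream_ids):
        # tokens, stream_ids: (batch, time); causal mask keeps it autoregressive.
        x = self.tok(tokens) + self.stream(stream_ids)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.shape[1])
        return self.head(self.core(x, mask=mask))

m = DuplexToy()
tokens = torch.randint(0, 256, (1, 32))
streams = torch.randint(0, 2, (1, 32))
logits = m(tokens, streams)   # (1, 32, 256)
```

Doing "something custom with attention", per the post, would replace the plain causal mask here with per-stream masking rules.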
John Carmack retweeted
We're proud to team up with our friends at Nightdive Studios once more to bring you a new definitive re-release... Introducing Heretic + Hexen featuring new and cut content, community published mods, online multiplayer up to 16 players with cross-play, up to 120 Hz in 4K, and of course a new optional enhanced soundtrack by our friend Andrew Hulshult. Free upgrade to all owners of either game on Steam - available now! beth.games/41rJb6v
Over the weekend I was talking with someone about drill and practice vs competition in sports like BJJ and basketball. Interesting to see these old-timer recollections from the dawn of e-sports with Quake.
Please forgive and be patient with me, as producing content is a new thing for me. I created a Substack to assist in the readability of the op-eds. I realize the long form post on X isn't exactly the best. Here are the two op-eds I've written currently. Thank you for your time and patience. d16makaveli.substack.com/p/t… d16makaveli.substack.com/p/t…
It was a fun exercise recently to just open up a completely blank file and write an RL agent from scratch, without looking at any of my prior code. There is a point of scale where rewriting things from scratch is a bad idea, but it is a blessing when you can! By “from scratch” I mean with pytorch, and I chuckle a bit about how I used to be irritated when people would say that, as I considered it not from scratch if it wasn’t in C with no libraries, but now I am them.
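For flavor, this is the scale of thing you can do in a blank file "from scratch" with just pytorch; a toy REINFORCE agent on a 4-armed bandit, illustrative rather than the agent described:

```python
import torch

torch.manual_seed(0)
true_means = torch.tensor([0.1, 0.5, 0.8, 0.3])   # arm 2 pays best
logits = torch.zeros(4, requires_grad=True)        # the entire "policy network"
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(500):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()
    reward = torch.normal(true_means[action], 0.1)  # noisy reward signal
    loss = -dist.log_prob(action) * reward          # policy gradient estimator
    opt.zero_grad()
    loss.backward()
    opt.step()

print(logits.argmax().item())   # converges toward the best arm (2)
```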
Does @TheFIREorg have a position on cases like Collective Shout pressuring payment processors to pressure aggregators like Steam to remove content? Not a regulatory question, but still in the wheelhouse of expression suppression.
It is a trope that TV technobabble usually involves “reversing the frequencies” or “modulating the frequencies”, so I feel like a federation science officer when I declare “I am running the convolutions in frequency space!”
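The technobabble is real, for what it's worth: by the convolution theorem, circular convolution is a pointwise multiply in frequency space. A quick PyTorch sanity check:

```python
import torch

# Convolution theorem: circular conv(x, k) == IFFT(FFT(x) * FFT(k)).
N = 64
x, k = torch.randn(N), torch.randn(N)

via_fft = torch.fft.ifft(torch.fft.fft(x) * torch.fft.fft(k)).real

n = torch.arange(N)
K = k[(n.unsqueeze(1) - n.unsqueeze(0)) % N]   # K[i, j] = k[(i - j) mod N]
direct = K @ x                                  # direct circular convolution

assert torch.allclose(via_fft, direct, atol=1e-3)
```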
John Carmack retweeted
It's refreshing to read @ID_AA_Carmack's logs from building games/engines in the late 90s: what he was thinking, optimizing, debugging, reading