Professor @Wharton studying AI, innovation & startups. Democratizing education using tech. Book: a.co/d/4VguzZz Substack: oneusefulthing.org/

Philadelphia, PA
Joined May 2009
AI resources I work on that might be useful:
My NY Times bestseller, Co-Intelligence (now in 19 languages!): penguinrandomhouse.com/books…
The Generative AI Lab at Wharton (free prompts & research): gail.wharton.upenn.edu/resea…
OneUsefulThing, my free newsletter: oneusefulthing.org/
After all, the graph on the right pretty much predicts this.
DeepSeek does 110 tokens
I wonder if part of what makes Kimi K2 Thinking impressive is that it produces a lot more thinking tokens for even minor & non-technical queries than any model I have used. This is the thinking trace for "write me a really good sentence about cheese": it is 1,595 tokens long!
The thing to note is that even though today's agents were not good enough to deliver human-level quality, "agents deliver results 88.3% faster and cost 90.4–96.2% less than humans." If agent ability increases, there will be an incentive to use them... arxiv.org/pdf/2510.22780
We need more papers like this one, which examines how AI agents & humans work together. Current agents were fast, but not strong enough to do tasks on their own & approached problems from too much of a programming mindset. But combining human & AI resulted in gains in performance.
The fact that API decisions for AI use are made by IT has large downstream consequences for companies with their own internal chatbots. IT teams often don’t know about the business uses for reasoning, tools, or web search and default to minimum permissions, hobbling AI value.
People pay a lot of money to hire top-tier advisors who give slightly better advice; getting an AI model that is smarter at what you care about is at least as important.
The local pollution largely comes from diesel generators that are generally used only for backup, though some companies have been running them for primary power in some locations. The environmental impact largely comes down to the source of power used for the data center.
Calling them supercomputer facilities is also a more useful name than data centers, given what a modern data center does.
Data centers can be usefully thought of as a type of industrial building. Compared to other light industrial sites, they have lower employment & local economic impact, but likely (much?) higher national economic impact. They use less water & often pollute less but use more power.
This is especially true if you are using them for analysis or writing or advice or customer service or basically anything outside of coding. These areas are under-benchmarked and have very high variance between models that are equally “smart” but act very differently.
Testing Kimi K2 has reminded me of how insane it is that firms picking AIs are treating them as fungible based on benchmarks. Kimi & Grok & Claude & every other model have strengths, quirks & weaknesses that can make a big difference in aggregate. Develop your own benchmarks!
I see similar things when addressing topics like business strategy. When you ask it to build out startup ideas, it elaborates some stuff that isn't important and then avoids discussing harder elements. Maybe more prompting and experience will help, but it is pretty strange right now.
For example, if I prompt "Invent an entirely new rpg with new mechanics set in an original world from scratch, and write a flavorful guide with the rules. Not a simple variant on PbtA or BitD or DnD, and dice should not be a primary mechanism. The whole idea should have some heft and bite," I get the thing below. Some interesting writing and ideas that draw you in, but completely incoherent as you start to look at it more closely, including impossible mechanisms and nonsensical elements.

THE SHIVERING VERSE: A GAME OF HOLDING TOGETHER WHAT WANTED TO DIE
A Roleplaying Game of Melancholy Heroism, Memory, and the Physics of Falling Apart

INTRODUCTION: THE LAST EXHALE
The world ended. Not with a bang or whimper, but with a shiver—a fundamental frequency of reality that sheared creation into screaming fragments. Now all that remains are the Shatters: jagged archipelagos of existence floating in the Void-That-Sings. The Void sings a lullaby of forgetting, and everything—stone, skin, sinew—wants to dissolve into its easy silence.

You are an Anchor. You remember when your Shatter was whole. You remember the laws. The names. The warmth. And by remembering, you hold it together. Your will is a scar tissue across the wound of the world. But memory is a finite resource. The more you use yourself to preserve, the more you fray.

This is not a game about winning. This is a game about choosing what survives you.

CHAPTER ONE: THE MECHANISM
The Echo Chamber is our engine. It is a cloth—preferably heavy linen, dark as old bruises—spread before you. It is your character sheet, your dice, and your gravemarker.

You possess twelve Resonance Stones: 3 white (Memory), 3 blue (Will), 3 green (Bond), 3 black (Shadow). They are smooth, cool, and real. Hold them. They are you.

The Guide (GM) possesses a hoard of red Dissonance Stones: the Void's reply.
THE SHIVER: RESOLUTION IN FOUR MOVES
When you attempt something dangerous, uncertain, or meaningful:

DECLARE YOUR INTENT
State, with clarity, what you hope to achieve. "I will convince the Sundered Council to divert the river's memory before it evaporates." The table must repeat it back, in unison. This is the Cant—the spell of focus that holds the universe still for a moment.

WEAVE YOUR ECHO
Place up to three of your stones on the cloth, describing how each aspect fuels your attempt. You may not place more stones than you have available (stones in your pool vs. those already lost to the Void).
Memory (White): "I recall the old irrigation treaties, written in my mentor's hand."
Bond (Green): "The Council's Speaker owes me. I saved her daughter's name from erasure."
Shadow (Black): "I let them see my desperation. My fear of the drought."
You choose what to risk. This is everything.

INVITE DISSONANCE
The Guide adds 1 to 5 red stones based on:
Threat: How badly does the Void want this to fail? (1-3)
Instability: How cracked is the Shatter already? (+1 per major fissure)
Folly: Are you defying the established lore of your own memories? (+1)

THE SHIVER
The Anchor player gathers the cloth's edges. For five seconds—counted aloud by the next player—the cloth is shaken. Not violently. A tremor. A shudder. The way a world does when it's deciding whether to keep existing. Stones skitter, collide, seek edges.
While shaking, the Cant is whispered. The table holds its breath.

THE REVEAL & RESOLUTION
Open the cloth flat. Stones have settled or fallen.
FALLEN STONES are Lost to the Void. Remove them from your pool permanently.
REMAINING STONES endure but are Frayed: turn them face-down. They cannot be used again until you Suture (rest and heal).

Interpreting the Fall:
If ANY of YOUR stones remain: You succeed. Narrate how, incorporating the aspects that held fast. For each of YOUR stones that fell, the Guide introduces a Fracture—an immediate, personal cost.
Lost Memory: you forget a key detail.
Lost Will: you falter, showing weakness.
Lost Bond: you hurt someone you tethered.
Lost Shadow: your hidden nature is exposed.
If ALL your stones fell: You fail catastrophically. The Void sings louder. The Shatter cracks. A permanent feature of your world is Unraveled—maybe the river never existed, maybe the Council building dissolves mid-meeting. The Guide records this. It is now true.
If NO your stones fell (pure success): This is a Bitter Harmony. You succeed purely, but the Void takes note. The Guide keeps one red Dissonance stone from the cloth in a Suspicion Pool. When that pool exceeds the number of players, the Void makes a Glossolalia Move—a reality-eating entity notices your Shatter.

Red Stones: Dissonance that falls is discarded. Dissonance that remains on the cloth becomes Embedded Dissonance. It stays there, a physical threat on your cloth, until a future Shiver includes it—adding its count to the Dissonance rolled against you. The Void's echo lingers.

CHAPTER TWO: YOUR FOUR PILLARS (CHARACTER CREATION)
You are not stats. You are what you have left. Record your twelve stones on a simple grid—four columns, three rows. As they are Lost, cross them out with a pen that bleeds through the paper.

Choose your Anchor's Origin: This gives you your starting Lore—three specific memories that are mechanically enforceable truths about the Shatter. Write them as inviolable statements. "The Spire of Un-Ticking was built by the Silent Architect." While this memory lives (i.e., you have at least one Memory stone), this is true. If you lose all Memory, the Spire may never have been.

Define your Tethers: For each Bond stone, name a person, place, or principle you anchor. When that Bond stone falls, that tether is damaged. If all three are Lost, you become a Drift—a character who no longer believes the Shatter is worth saving. You become an NPC controlled by the Guide, a walking entropy.
Embrace your Shadow: Your black stones are not evil. They are the weight you hide: trauma, rage, shame, forbidden knowledge. Using them is powerful but when they fall, they expose you. The Guide gains a Secret about you that is now true and exploitable.

CHAPTER THREE: THE SHATTER & THE SESSION
The Shatter is mapped on a large cloth of its own—shared, central. Locations are marked with objects: a coin for the Market-of-Names, a key for the Gateless City. When a location is Unraveled, the Guide removes its object mid-session without announcement. Players notice when they reach for it and it's gone.

Time is measured in Refrains—scenes that end when the Cant is spoken. There is no day/night. Only the song's verses.

The Guide's Void Moves:
Unraveling: Threaten a truth. "The Void sings that your sister's name is hollowing."
Embedding: Place Dissonance stones on a player's cloth that persist.
Glossolalia: When the Suspicion Pool is full, send a reality-devouring entity that erases customs, laws, or physics. It must be fought not with violence but by re-establishing a forgotten truth—requiring a Shiver where Memory is mandatory.

CHAPTER FOUR: SUTURING & LEGACY
You cannot recover Lost stones. They are gone. Instead, during a Requiem (the game's downtime), you may Suture the Shatter itself: spend a session weaving a new truth into the firmament. This creates a Legacy—a permanent, world-level asset that benefits all Anchors. "We Suture the River's memory into the stones of the valley; now, springs appear where we need them."

The cost: Each Suture creates a Paradox—a new red stone permanently added to the Guide's starting Dissonance for every Shiver. You make the world more stable for you, but stranger and more hostile. The Shatter begins to obey your memories more than the dead world’s—until it is yours, and you are responsible for its contradictions.

Endgame: A character with zero stones in any Pillar cannot Anchor. They Dissolve, becoming part of the Void-That-Sings.
Their player narrates their final memory as it unwrites itself, and then they help the Guide, taking control of Dissonance in future Shivers. The game ends when the Shatter has too many Unraveled features to sustain life, or when all Anchors Die. The final act is a communal Shiver with all stones on the cloth—every Anchor, every Dissonance. The survivors tell the epilogue, knowing the Void now sings their tune.

CHAPTER FIVE: EXAMPLE OF PLAY
LUCE (Anchor): "I will climb the Crumbling Ledger to read the name of the first rain, so I can re-summon it."
TABLE (Cant): "I will climb the Crumbling Ledger to read the name of the first rain, so I can re-summon it."
LUCE: Places Memory (recalling the path), Will (endurance), Shadow (reckless obsession).
GUIDE: Adds 3 Dissonance: height, Void-song, the Ledger's guardian-ghost.
THE SHIVER: Five seconds. The cloth shudders. The Cant whispers.
THE REVEAL: Memory & Shadow fall. Will remains, Frayed.
GUIDE: "You reach the top, gasping, but the ghost's shriek has hollowed out your mentor's face in your mind. Who were they? (Memory Fracture). Your crew sees you risk everything for old water—they know you're unraveling. (Shadow Fracture: gain Secret 'Luce values memory over people.')"
LUCE: "I read the name. It's 'Sorrow-That-Quenches.' That... complicates things."

ESSENCE: WHY THIS HAS BITE
There is no safety. No leveling up. No probability to master. Only deliberate sacrifice. You don’t roll dice—you perform your character’s erosion. The physical stones make loss tangible. The cloth is your grave-shroud. The Void doesn’t cheat; it just waits for you to shake yourself apart.

The heft is in the permanence. The bite is that success costs. Every Shiver is a little death. The world you save becomes the world that will eventually forget you.

Welcome to the Shivering Verse. Hold tightly. Let go wisely.
Kimi K2 Thinking passes the Lem Test the first time; very few models have done so. Just like Kimi K2, however, this remains a very weird & interesting model in a way that is hard to benchmark. Its writing is often very good but sometimes doesn't hold up under close investigation.
Grok 4 passes the Lem test first try, with the most coherent narrative yet.
It’s clearly a preview of the future, when we need to interact with agents running for many hours. The issue becomes how we get the right interim products to know when interaction is needed.
These are pretty impressive benchmarks from a Chinese open weights model. Especially big is the agentic capability, which has generally lagged in the open weights models. Be interesting to see independent confirmation soon, I found K2 a solid, but kind of weird, model to use.
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200 – 300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns.
K2 Thinking is now live on kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: platform.moonshot.ai
🔗 Tech blog: moonshotai.github.io/Kimi-K2…
🔗 Weights & code: huggingface.co/moonshotai
Also, Pro works like a wizard, requests go in, and sometimes magic comes out with no explanation. In early experiments, I find when I interrupt it with my mundane and petty concerns, the results sometimes feel like I have broken the AI's rhythm midflow, Man from Porlock style.
This is a really useful addition for Deep Research, but somewhat challenging to use in practice for GPT-5 Pro, since you need to be very good at interpreting its thinking process, which can be opaque & which GPT-5 Pro has a tendency not to show after a certain point in any case.
You can now interrupt long-running queries and add new context without restarting or losing progress. This is especially useful for refining deep research or GPT-5 Pro queries as the model will adjust its response with your new requirements. Just hit update in the sidebar and type in any additional details or clarifications.