Talks about AI and health | cooking up something new | Does a lot of hackerhouses

San Francisco, CA
Joined March 2015
Living in a hackerhouse is the best way to immerse yourself in the San Francisco AI community. 🍊 Orange House 2.0 is looking for a new roommate. Even if you’re not looking for housing, you’ll still like hearing about the fun events we host.
It's a pretty neat framework. If you want TS that compiles to native and don't need the massive RN ecosystem, it could be a great pick!
Despite the differences, Valdi feels very familiar if you know React:
- TSX syntax looks nearly identical
- similar lifecycle methods
- context API
Based on exploration of the codebase, Snap likely built this because:
- React Native is slow tbh
- Valdi enables native-feeling cross-platform code
- Snap had major C++ expertise circa 2017
It has a number of other benefits, such as:
- built from the ground up to be viewport-aware for infinite scroll
- direct C++ bindings
- polyglot modules enabling easy mixing of TypeScript, C++, Swift and Kotlin
- setState is synchronous in order to eliminate weird async bugs
Valdi has powered the Snapchat app for the last 8 years and has been proven at scale. For developers, its main difference vs React Native is that TypeScript compiles to native views instead of simply communicating with them thru a bridge.
Snapchat just open sourced their cross platform mobile framework called Valdi and it could give React Native a run for its money.
The female disposition is so brave in a way foreign to men. To be big, dangerous and credibly capable of harming others offers a sense of security that women do not inherently get. Women think of their safety far more than any man I know, and they often depend on other people or things to guarantee it in ways that I would never reach for. All the same, they are living their lives in public! I’ve often seen women reach out and trust in others in ways men rarely do, and I can’t help but wonder if it’s a muscle they are forced to grow.
I want to validate something. Anyone want to rent me a spare Unitree G1?
Kind of crazy Nikita can post engagement bait like this and unlock another tranche of performance pay
Kind of crazy that you just need to put 7-10 words in the right order, post it, and an app just sends you $1000.
The test it wrote failed, showing us that the function under test was flawed! We then shipped a fix for this bug, making it the first example of a self-improving agent that we had seen. This was pretty remarkable in 2023 given the models weren't very smart!
DeepUnit noticed the inconsistency and figured out, based on the rest of the file, that the function should handle multiple options. It then wrote a test case to prove the function could handle multiple options.
There was one function that it decided to test which was responsible for checking what option(s) DeepUnit was being run with. That function's entire file had been updated to take in multiple options, but the function's code could still only handle one option!
In 2023 our coding agent did something really surprising! Back then we were building DeepUnit which helped developers write unit tests. We naturally ran it on its own codebase a lot.
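The bug in this thread can be pictured with a rough sketch. Everything here is hypothetical: `parse_options_buggy`, `parse_options_fixed`, `check_handles_multiple_options` and the `--flag` format are invented for illustration and are not DeepUnit's actual code.

```python
# Hypothetical illustration only; not DeepUnit's real code.

def parse_options_buggy(argv: list[str]) -> list[str]:
    """Return the options passed on the command line.

    The signature promises a list, but the body stops at the first match,
    mirroring a function whose file was updated while its logic was not.
    """
    for arg in argv:
        if arg.startswith("--"):
            return [arg[2:]]
    return []


def parse_options_fixed(argv: list[str]) -> list[str]:
    """Collect every option instead of stopping at the first one."""
    return [arg[2:] for arg in argv if arg.startswith("--")]


def check_handles_multiple_options(parse) -> bool:
    """The kind of check a test-writing agent might infer from the file."""
    return parse(["--file", "--all"]) == ["file", "all"]
```

Run against the buggy version, the check fails, which is the signal that the function, not the test, needs fixing.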
It's my theory that a dataset built of post-annotated transcripts, where reasoning about the rules with references to entities like "flying saucer" gets annotated with exact coordinates/precise descriptions, would map human visual perception to the sort of exacting language an LLM might benefit from. I suspect tokenizers would still be a limitation, in the same way LLMs can't count the R's in strawberry.
I can't help but think that ARC-AGI is just a mapping problem. As humans, we experience the grids (demo'd in the video) as visual 2D representations of familiar concepts (blocks, flying saucers, etc.) but internalize them into mental representations. We reason about them in very imprecise but visually intuitive language like "the saucer over there".
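One way to picture what "exact coordinates" for an entity could look like is a minimal sketch that turns a grid into precise object descriptions. This is an assumption about what such an annotation might contain, not an existing dataset format. `describe_objects` is a hypothetical helper, and it assumes a non-empty rectangular grid where same-colored, 4-connected cells form one object.

```python
from collections import deque


def describe_objects(grid: list[list[int]], background: int = 0) -> list[dict]:
    """Flood-fill each same-colored region and report its bounding box."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] == background:
                continue
            color = grid[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                # Visit the four orthogonal neighbors of the same color.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [y for y, _ in cells]
            xs = [x for _, x in cells]
            objects.append({
                "color": color,
                "top_left": (min(ys), min(xs)),        # (row, col)
                "bottom_right": (max(ys), max(xs)),    # (row, col)
                "size": len(cells),
            })
    return objects
```

A transcript phrase like "the saucer over there" could then be paired with one of these records, giving the model the exact language the human left out.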
Justin Strong retweeted
Your agent works in demos but breaks in production. I spent 20 hours manually annotating 207 traces to find the patterns. Then I ran LangSmith's Insights Agent on the same data to see if automation could match expert analysis. Here's the head-to-head comparison 🧵 @hwchase17 @sh_reya @HamelHusain @WHinthorn @LangChainAI
The Devin team absolutely cooked here. Establishing sufficient context without context-rotting your agent is one of the hardest problems in building a coding agent. The previous SOTA for fast, relevant context would arguably have been building a code graph, but that's not always expansive enough.
Introducing SWE-grep and SWE-grep-mini: Cognition’s model family for fast agentic search at >2,800 TPS. Surface the right files to your coding agent 20x faster. Now rolling out gradually to Windsurf users via the Fast Context subagent – or try it in our new playground!
Finally, PyPi
As of Python 3.14, the free-threaded (or no-GIL) version of the Python interpreter is no longer considered experimental.
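For anyone wanting to check which interpreter they are actually running, here is a small sketch. It relies on `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()`, which are available from Python 3.13. The helper names `gil_status`, `count_primes` and `parallel_prime_counts` are made up for this example, and no speedup figures are implied.

```python
import sys
import sysconfig
from concurrent.futures import ThreadPoolExecutor


def gil_status() -> str:
    """Describe whether this interpreter is a free-threaded build."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, otherwise 0 or None.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    if not free_threaded_build:
        return "standard build (GIL enabled)"
    # sys._is_gil_enabled() exists on 3.13+; assume enabled if it is missing.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    state = "re-enabled at runtime" if is_gil_enabled() else "disabled"
    return "free-threaded build, GIL " + state


def count_primes(limit: int) -> int:
    """CPU-bound toy workload: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_prime_counts(limits: list[int], workers: int = 4) -> list[int]:
    """Run the workload on threads; results are the same on either build."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

On a standard build the threads take turns holding the GIL. On a free-threaded build the same code can use multiple cores, with identical results.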
I’m 7 days too late
Can’t believe I forgot to wake up Green Day again