my bar for agi is an ai that can learn to run a gas station for a year without a team of scientists collecting the Gas Station Dataset

Jul 20, 2025 · 1:08 AM UTC

Replying to @tszzl
the world isn't ready for GasStationBench
GASBENCH
Replying to @tszzl
Is it allowed to generalize from the Vape Shop or Used To Be A Marriott corpuses?
yes
Replying to @tszzl
is it ok if it runs 1e6 gas stations to failure using rl first?
it cannot is the thing
Replying to @tszzl
lol yes. Also why specifically gas station lmao
bc it’s funny
Replying to @tszzl
Nice, I think something like 40% of humans would probably fail this bar, and more than you might think would be on the right tail
as opposed to IMO Gold which 99.999999% of humans fail
Replying to @tszzl
lol okay but what's the first thing to break it? My money's on the social challenge, like politely telling someone their card was declined.
that’s the easy part
Replying to @tszzl
Too bad i already collected the gas station dataset, now we’ll never know. it’s definitely in your pretraining dataset already, whoopsies
Replying to @tszzl
Very true
Replying to @tszzl
Wouldn't learning at test time and agentic data acquisition solve this like soon?
Replying to @tszzl
lemme get a pack of smokes and 20 on diesel and a mango July
Replying to @tszzl
if it starts selling vapes and honey packs i’m sold
Replying to @tszzl
Gas station metric is a great bit ngl
Replying to @tszzl
This is amazing! I approve
Replying to @tszzl
jives with @fkasummer
what is your timeline for AGI as in an intelligence that can solve ill defined out of distribution problems?
Replying to @tszzl
something about the length of the goal horizon is definitely important and underrated in most definitions (like turing)
Replying to @tszzl
this is possible today with lovable + palantir
Replying to @tszzl
my bar is an oil refinery, only cause i’ve worked at one as a field operator in a distant history
Replying to @tszzl
% AGI = 1 - (% annual conference attendance)
Replying to @tszzl
a high bar
Replying to @tszzl
So not it deciding whether or not it wants to run one in the first place?
Replying to @tszzl
Ok…. This is really funny
Replying to @tszzl
it should be able to drive a car through LA without getting mad
Replying to @tszzl
🎯
Replying to @tszzl
so like running the Two Point - Gas Station. Yeah, i thought about this and figured models can probably be tuned on Two Point, perfected on Cheetah mode, and then applied to real-life business scenarios with relative ease. The economics and labor management are pretty spot on