There's a study Claude Sonnet 4.5 claims is used regularly in psych 101 classes and has been thrown around in court literally thousands of times, with an N of FORTY-FIVE.
That is, they put 45 humans in a room, showed them footage of car accidents, and asked them to estimate how fast the cars were going, varying a single word in the question (e.g. "smashed" instead of "hit"), and decided those results were enough to justify the idea that humans by default rewire memories of an incident based on tiny changes in word choice.
The study is here:
drive.google.com/file/d/1xss…
And it's from THE YEAR OF OUR LORD NINETEEN-SEVENTY-FOUR. Yet according to @mold_time, from whom I learned about it, there are essentially *no* replications of it out there. A significant branch of my civilization, on which my wealth, health and friends depend, *has been operating on the basis that a few researchers' opinions of the whims of 45 humans generalize to all of human nature*. It is insane.
You might think it's fairly difficult to run an experiment of this kind. Only, as @mold_time clearly attempted to instill in me today, it's trivially easy. I write this from a chair in Lighthaven, a complex just a few minutes away from the UC Berkeley campus, where I could've tested hypotheses by walking up to 45 different students. I could've instead pestered my Twitter simcluster, which altogether might comprise ~45 individuals eager to take this test ✨for science✨. Or I could've blown a few hundred bucks on getting strangers online to do this!
(Instead, as it happened, @Aella_Girl was in the room and offered to dump the link on her timeline, so the current N for our replication of this study is 609 and counting. Also, she did most of the coding. Thanks, Aella.)
Indeed, designing a survey that replicates the original paper is as easy as plugging the rules for @GuidedTrack into Claude Sonnet 4.5 along with a PDF of the study. (This does generate bugs, all of which were fixed by Aella, though I am confident I could've done it myself given an extra hour. The main thing you🫵 can't necessarily meta-replicate about my replication is "researcher with 240K twitter followers in the room", though as noted earlier this doesn't stop you from being able to get AT LEAST forty-five subjects into your study for ~a day of labor on your part.)
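To give a flavor of how little there is to it, here's a minimal sketch of the core manipulation in GuidedTrack's own syntax. (Caveat: this is written from my memory of the GuidedTrack docs, not copied from our actual survey, so treat the exact keywords as an assumption; the variable names are mine, and GuidedTrack wants tab indentation.)

-- Randomly assign each respondent one of two verb conditions.
-- (Sketch from memory of the GuidedTrack docs, not our actual survey code.)
*randomize: 1
	*group
		>> verb = "smashed into"
	*group
		>> verb = "hit"

-- Ask the speed question, interpolating the assigned verb into the wording.
*question: About how fast were the cars going when they {verb} each other?
	*type: number
	*save: speed_estimate

That's the whole trick: randomize a condition, interpolate it into the question, save a number. Everything else is plumbing.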
See also, e.g., @gwern: poor enough to have to put up with a moldy floor for a while, yet still choosing to spend some of his money on surveys asking how often people buy socks:
gwern.net/socks#sock-surveys. Not running surveys on whatever you're curious about is a skill issue, not a material constraint, in this century of abundance.
I'm grateful @mold_time was around to finally get me to try replicating a study, which is indeed a valuable life experience, and that they provide the further service of maintaining a very-much-not-exhaustive list of papers that have few replications (because our civilization is insane) but would be easy to replicate in an afternoon. You might see me replicate 1-2 more of these in the coming weeks, out of spite alone.
You may not like it, but the screenshot below is what the frontier of actually-robust non-bullshit science in psychology looks like. There are plausibly a hundred papers on the same level of importance as this one with ridiculously small Ns, and our civilization might truly be so bizarre that the way it ends up fixing its glaring epistemic lacunae is via tweets with AI-generated cartoon images and links to afternoon-long-partially-vibe-coded surveys from well-known sex researchers.
Though of course, replicating a study is less than half the battle. It's not like I can just *stroll up* to Google Scholar and get this submitted...? Academia is maybe just a tiny bit more ossified than that:
equilibriabook.com/an-equili…
The GOOD NEWS is that the "centralization forces" which make a study predictably "take over" major branches of civilizational decision-making (such as the courts, which the 1974 study we replicated did in fact take over) are becoming easier to access. For instance, the layperson will consult Google's AI summary, whose context window is more or less the top links in PageRank, and that isn't terribly difficult to get into. [If you doubt this is gaining in importance, just notice the increasing number of posts on this platform that provide no source or even rhetorical argument beyond a screenshot of the AI summary.]
Or they'll consult a large language model directly, and of course getting into the training data is practically guaranteed (for practical advice on making yourself *salient* in the training data, see
gwern.net/llm-writing, or
gwern.net/blog/2024/writing-… for a meta-argument for why you should be writing more online in the first place). So at least the rogues in our midst with the turn rate to pull off day-long replications, with Ns an ORDER OF MAGNITUDE greater than the original study's, will gain rather than lose the advantage in the longer run of our history.
:)