In 2023 our coding agent did something really surprising! Back then we were building DeepUnit which helped developers write unit tests. We naturally ran it on its own codebase a lot.
1
2
0
There was one function that it decided to test which was responsible for checking what option(s) DeepUnit was being run with. That functions entire file had been updated to take in multiple options, however the functions code could still only handle one option!

Oct 30, 2025 · 8:53 PM UTC

1
1
DeepUnit noticed the inconsistency and figured out the function should handle multiple options based on the file then wrote test case to prove the function could handle multiple options.
1
1
The test it wrote failed, exposing to us that the function under test was flawed! We then shipped a fix for this bug, making it the first example of a self improving agent that we had seen. This was pretty remarkable in 2023 given the models weren't very smart!
1