Claude Code worked for 20 or 30 minutes in total, and produced a Z80 emulator that was able to pass ZEXDOC and ZEXALL, in 1200 lines of very readable and well commented C code (1800 lines with comments and blank spaces). The agent was prompted zero times during the implementation, it acted absolutely alone. It never accessed the internet, and the process it used to implement the emulator was of continuous testing, interacting with the CP/M binaries implementing the ZEXDOC and ZEXALL, writing just the CP/M syscalls needed to produce the output on the screen. Multiple times it also used the Spectrum ROM and other binaries that were available, or binaries it created from scratch to see if the emulator was working correctly. In short: the implementation was performed in a very similar way to how a human programmer would do it, and not outputting a complete implementation from scratch “uncompressing” it from the weights. Instead, different classes of instructions were implemented incrementally, and there were bugs that were fixed via integration tests, debugging sessions, dumps, printf calls, and so forth.
Женщина посмотрела на фото со дня рождения и решила изменить подход к здоровьюMirror: Женщина за год изменила внешность без операций после неудачного фото。搜狗输入法2026对此有专业解读
。关于这个话题,雷电模拟器官方版本下载提供了深入分析
All of these tests performed far better than what I expected given my prior poor experiences with agents. Did I gaslight myself by being an agent skeptic? How did a LLM sent to die finally solve my agent problems? Despite the holiday, X and Hacker News were abuzz with similar stories about the massive difference between Sonnet 4.5 and Opus 4.5, so something did change.
Percentile 50 (Median): 9.945 ms | 2.899 ms。关于这个话题,服务器推荐提供了深入分析