

This doesn’t directly answer your question but I guess I had a rant in me so I might as well post it. Oops.
It’s possible to write tools that make point changes or incremental changes with targeted algorithms in a well understood problem space that make safe or probably safe changes that get reviewed by humans.
Stuff like turning pointers into smart pointers, reducing string copying, reducing certain classes of runtime crashes, etc. You can do a lot of stuff if you hand-code C++ AST transformations using the clang / llvm tools.
Of course “let’s eliminate 100% of our C code with a chatbot” is… a whole other ballgame and sounds completely infeasible except in the happiest of happy paths.
In my experience even simple LLM changes are wrong somewhere around half the time. Often in disturbingly subtle ways that take an expert to spot. Also in my experience if someone reviews LLM code they also tend to just rubber stamp it. So multiply that across thousands of changes and it’s a recipe for disaster.
And what about third party libraries? Corporate code bases are built on mountains of MIT licensed C and C++ code, but surely they won’t all switch languages. Which means they’ll have a bunch of leaf code in C++ and either need a C++ compatible target language, or have to call all the C++ code via subprocess / C ABI / or cross-language wrappers. The former is fine in theory, but I’m not aware of any suitable languages today. The latter can have a huge impact on performance if too much data needs to be serialized and deserialized across this boundary.
Windows in particular also has decades of baked in behavior that programs depend on. Any change in those assumptions and whoops some of your favorite retro windows games don’t work anymore!
In the worst case they’d end up with a big pile of spaghetti that mostly works as it does today but that introduces some extra bugs, is full of code that no one understands, and is completely impossible to change or maintain.
In the best case they’re mainly using “AI” for marketing purposes, will try to achieve their goals using more or less conventional means, and will ultimately fall short (hopefully not wreaking too much havoc in the progress) and give up halfway and declare the whole thing a glorious success.
Either way ultimately if any kind of large scale rearchitecting that isn’t seen through to the end will cause the codebase to have layers. There’s the shiny new approach (never finished), the horrors that lie just beneath (also never finished), and the horrors that lie just beneath the horrors (probably written circa 2003). Any new employees start by being told about the shiny new parts. The company will keep a dwindling cohort of people in some dusty corner of the company who have been around long enough to know how the decades of failed code architecture attempts are duct-taped together.


The good news is it generally isn’t necessary to reverse engineer browser behavior when writing a browser. Since it’s mostly fairly standardized, there’s a decent test suite, and the major browsers are all open source.
Though this comes with some caveats: