

Mixture of experts has been in use since 1991, and it's essentially just a way of splitting up the same computation a dense model would do.
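To make the "splitting up" concrete, here's a rough sketch of a mixture-of-experts feed-forward layer in PyTorch (the class name, sizes, and top-2 routing are illustrative assumptions, not any particular model's implementation): a small router picks a couple of experts per token, so the layer does the same kind of feed-forward work as a dense block, just spread across several smaller ones, and only the chosen experts run for each token.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        # Toy mixture-of-experts feed-forward layer (illustrative only):
        # a router picks the top-k experts per token, so only a fraction
        # of the parameters run for any given input, whereas a dense FFN
        # would run all of them.
        def __init__(self, d_model=512, d_hidden=2048, n_experts=8, top_k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                nn.Sequential(
                    nn.Linear(d_model, d_hidden),
                    nn.ReLU(),
                    nn.Linear(d_hidden, d_model),
                )
                for _ in range(n_experts)
            )
            self.router = nn.Linear(d_model, n_experts)  # gating network
            self.top_k = top_k

        def forward(self, x):  # x: (tokens, d_model)
            gate_logits = self.router(x)
            weights, chosen = gate_logits.topk(self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            # Route each token only through the experts the gate selected.
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = chosen[:, slot] == e
                    if mask.any():
                        out[mask] += weights[mask, slot, None] * expert(x[mask])
            return out

A batch of token vectors goes in and comes out the same shape; the point is just that with top_k=2 of 8 experts, only a quarter of the expert parameters are touched per token, while the overall role of the layer is the same as a dense one.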
Tanks are an odd comparison, not only because they have changed radically since WW2, to the point that many crew positions have been entirely automated, but also because the role of tanks in modern combat has been fundamentally altered since then (e.g. by the proliferation of drone warfare). They just look sort of similar because of basic geometry.
Consider the current crop of LLMs as the armor that was deployed in WW1: we can see the promise and potential, but it has not yet been fully realized. If you tried to match a WW1 tank against a WW2 tank it would be no contest, and modern armor could destroy both of them with pinpoint accuracy while moving at full speed over rough terrain, outside of radar range (e.g. what happened in the invasion of Iraq).
It will take many generational leaps across many diverse technologies to get from where we are now to realizing the full potential of large language models, and we can't get there through simple linear progression any more than tanks could improve just by adding thicker armor and bigger guns; it requires new technologies.
I was talking about the Gulf War in the 90s: https://youtu.be/b5EeKsEFpHI
I think the Iraqi tanks were mostly blown up by the time Bush Jr did his invasion.