• 0 Posts
  • 34 Comments
Joined 3 months ago
Cake day: June 4th, 2025


  • Mixture of experts has been in use since 1991, and it’s essentially just a way to split the same computation a dense model performs across several smaller sub-networks.

    Tanks are an odd comparison, because not only have they changed radically since WW2, to the point that many crew positions have been entirely automated, but also because the role of tanks in modern combat has been radically altered since then (e.g. by the proliferation of drone warfare). They just look sort of similar because of basic geometry.

    Think of the current crop of LLMs as the armor that was deployed in WW1: we can see the promise and potential, but it has not yet been fully realized. If you tried to match a WW1 tank against a WW2 tank it would be no contest, and modern armor could destroy both of them with pinpoint accuracy while moving at full speed over rough terrain from outside radar range (e.g. what happened in the invasion of Iraq).

    It will take many generational leaps across many diverse technologies to get from where we are now to realizing the full potential of large language models. We can’t get there through simple linear progression any more than tanks could just keep adding thicker armor and bigger guns; it requires new technologies.
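
As a rough illustration of the mixture-of-experts point at the top of this comment, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the names `make_expert`, `apply_expert` and `moe_forward` are hypothetical, each "expert" is just a small linear map, and the gate is a simple dot-product score followed by top-k selection. Real MoE layers are far larger and are trained, not random.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """Hypothetical toy expert: a random dim x dim linear map."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


def apply_expert(weights, x):
    """Matrix-vector product: the 'dense' computation one expert performs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_forward(x, experts, gate, top_k=2):
    """Route x to the top_k highest-scoring experts and blend their outputs."""
    # One gate vector per expert; the score is a dot product with the input.
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate]
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        y = apply_expert(experts[index], x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, chosen
```

If every expert holds the same weights, the blended output matches what a single dense layer would produce, which is one way to read the claim that MoE splits up the same process. Only the experts in `chosen` are evaluated for a given input, and that is where the compute saving comes from.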


  • The gains in AI have been almost entirely in compute power and training, and those gains have run into powerful diminishing returns. At the core it’s all still doing what Markov chains did in the machine learning experiments from the dawn of computing, picking the next token from probabilities conditioned on what came before; the math is over a hundred years old and basically unchanged.

    For us to see another leap in progress we’ll need to pioneer new calculations and formulate different types of thought, then find a way to integrate that with large transformer networks.
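
For readers who haven't met a Markov chain, here is a minimal word-level sketch in Python. It is a toy under stated assumptions: the helper names `build_chain` and `generate` are hypothetical, the state is only the single previous word, and the "probabilities" are just follower counts from the sample text. It is not how a transformer is implemented; it only shows the "pick the next token given what came before" idea the comment refers to.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain, sampling each next word from the current word's followers."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            # Dead end: this word was never followed by anything in the text.
            break
        word = rng.choice(followers)
        output.append(word)
    return output
```

Calling `generate(build_chain("the cat sat on the mat"), "the", 5)` produces a short sequence in which every adjacent pair of words also appears in the source text. Duplicated followers in the list make more frequent continuations proportionally more likely to be picked.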





  • I understand you feel very strongly about four-digit years, but I really don’t see any situation that I couldn’t sort out with a simple script.

    Usually I don’t put dates in file names in the first place, but when I do I use the UTC timestamp; a date without a timezone is inherently fuzzy, and it’s easier to compare and differentiate numerical times.

    If someone used two-digit years in their naming convention I wouldn’t even blink, let alone get the woodchipper. Life is too short to get angry over stuff like that.
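
A minimal sketch of the two ideas in this comment, in Python: stamping file names with a UTC timestamp, and the kind of simple script that could expand two-digit years after the fact. The function names, the `YYYYMMDDTHHMMSSZ` layout, the assumed `YY-MM-DD` pattern, and the pivot year of 70 are all illustrative assumptions, not anything the comment specifies.

```python
import re
from datetime import datetime, timezone


def utc_stamp(now=None):
    """Return a compact UTC timestamp such as 20250604T123000Z."""
    now = now or datetime.now(timezone.utc)
    # astimezone() converts any aware datetime to UTC before formatting.
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def stamped_name(stem, ext, now=None):
    """Build a file name like report_20250604T123000Z.txt."""
    return f"{stem}_{utc_stamp(now)}.{ext}"


# Assumed pattern: YY-MM-DD that is not part of a longer run of digits.
TWO_DIGIT_DATE = re.compile(r"(?<!\d)(\d{2})-(\d{2})-(\d{2})(?!\d)")


def expand_two_digit_year(name, pivot=70):
    """Rewrite YY-MM-DD as YYYY-MM-DD; years below pivot are read as 20YY."""
    def fix(match):
        yy = int(match.group(1))
        century = 2000 if yy < pivot else 1900
        return f"{century + yy:04d}-{match.group(2)}-{match.group(3)}"
    return TWO_DIGIT_DATE.sub(fix, name)
```

Because the stamp is fixed-width and ordered from largest unit to smallest, a plain alphabetical sort of the file names is also a chronological sort. The pivot is the one genuinely fuzzy part of expanding two-digit years, so it is left as a parameter.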








  • Windows in particular, I think, gets accepted as ‘good enough’; it’s only when you get into Linux that you really understand how far it has strayed from the light.

    You don’t need to spend hours and hours to start. You can dip your toes in with WSL, or maybe use a Linux VM for a few tasks that make your life easier at work. It’s not an all-or-nothing affair, but having proficiency in more than one operating system is great professional development regardless of your personal computing preferences.


  • I’ve found that many people will go to great lengths to avoid learning anything new.

    They want to be able to ignore their computers as much as possible; even considering the prospect of alternative software is taxing and upsetting for them.

    I think that’s basically how Microsoft and Adobe are so successful: they bought and cheated their way into the default position, and now they can do whatever they want with no real repercussions.

    The user wants to click on the same icons with the same names as before. Sometimes it’s as simple as wanting the same name; if it’s not called ‘outlook’ they don’t want it, no matter how well it works.




  • absentbird@lemmy.world to Lemmy Shitpost@lemmy.world · Lemmy be like · 9 days ago

    The problem is the companies building the data centers; they would be just as happy to waste the water and resources mining crypto or hosting cloud gaming. If not for AI it would be something else.

    In China they’re able to run DeepSeek without any water waste, because they cool the data centers with the ocean. DeepSeek also uses a fraction of the energy per query and is investing in solar and other renewables for energy.

    AI is certainly an environmental issue, but it’s only the most recent head of the big tech hydra.


  • absentbird@lemmy.world to Lemmy Shitpost@lemmy.world · Lemmy be like · 9 days ago

    When people say this they are usually talking about a very specific sort of generative LLM using unsupervised learning.

    AI is a very broad field with great potential; the improvements in cancer screening alone could save millions of lives over the coming decades. At the core it’s just math, and the equations have been in use for almost as long as we’ve had computers. It’s no more good or bad than calculus or trigonometry.