• 0 Posts
  • 29 Comments
Joined 1 year ago
Cake day: July 2, 2024

  • At the risk of being critical of Zitron, I have some comments. This is probably just nitpicking but regardless.

    […] a technology called Large Language Models (LLMs), which can also be used to generate images, video and computer code.

    LLMs cannot be used to generate images or videos. Diffusion models can create images, but those aren’t text-generation models. I guess you could use an LLM to prompt an image or video generation model, but I’m not sure whether that’s what he meant.

    Large Language Models require entire clusters of servers connected with high-speed networking, all containing this thing called a GPU — graphics processing units.

    Sort of, but not really. GPT-5 with its (presumably) trillions of parameters and its (apparently) hundreds of millions of users per month needs a lot of throughput to cater to that, but there’s nothing about LLMs that inherently requires massive GPU clusters with high-speed networking.

    Here’s an LLM running on a Raspberry Pi

    Of course, the number of people running LLMs on Raspberry Pis is effectively just that guy in the video, who did it to show an LLM running on a Raspberry Pi, and it’s not particularly fast without adding a GPU (and at the end of the day it’s still LLM output, so). So perhaps he’s just using “Large Language Models” to mean “the LLMs that the vast majority of people actually use.”
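
    For a rough sense of why running a model doesn’t inherently need a cluster, here’s a back-of-the-envelope sketch. It only estimates the memory needed to hold the weights (parameters × bits per parameter ÷ 8) and ignores everything else, like the KV cache and activations. The model sizes and bit widths are made up for illustration and aren’t figures for any specific model.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


if __name__ == "__main__":
    # Hypothetical parameter counts, chosen only for illustration.
    sizes = [("1B", 1e9), ("7B", 7e9), ("1T", 1e12)]
    for label, params in sizes:
        for bits in (16, 4):  # 16-bit weights vs. 4-bit quantized weights
            gib = weight_memory_gib(params, bits)
            print(f"{label} params @ {bits}-bit: {gib:,.1f} GiB")
```

    By this estimate a 1-billion-parameter model quantized to 4 bits needs under half a GiB for its weights, while a trillion-parameter model at 16 bits needs roughly 1.8 TiB, which is where the multi-GPU servers come in.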

    He’s not wrong about training, however.

    IMO it’s not a particularly good start to his newsletter. An easy counter to his statement is that not all LLMs require massive amounts of compute to run. A counter to that counter is that training even smaller LLMs still requires vast amounts of compute that the average person doesn’t have, in addition to the copyrighted material needed to train on, and the win that Anthropic got still means that any LLM trained in the future is going to require vast amounts of capital for the training data alone. The problem is that he doesn’t state any of that. Maybe he does know about that and decided to omit it for brevity. If he did, then, personally, I think that’s a mistake. Or maybe I’m just not reading it properly.

    The first paragraph immediately conflating all of generative AI with LLMs doesn’t particularly help his case either, even though stating that there are multiple types of generative AI wouldn’t really harm his thesis that this entire thing is a massive bubble. Again, perhaps he’s doing it for a reason that I’m not getting.

  • LLMs and humans are both sentence-producing machines, but they were shaped by different processes to do different work

    Except not really. We’re not sentence-producing machines, we’re “machines” (so to speak) that can produce sentences. Not the same thing.

    Once this is in place, they say, nations must be prepared to enforce these restrictions by bombing unregistered data centres, even if this risks nuclear war, “because datacenters can kill more people than nuclear weapons” (emphasis theirs).

    So the plan is still to kill everyone to death to prevent GPT-5 6 7 8