

yep. you could of course swap weights in and out, but that would slow things down to a crawl. So they get lots of vram (edit: for example, an H100 has 80gb of vram)
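a rough back-of-the-envelope sketch of why swapping weights over the bus is a non-starter. All the numbers here are assumptions for illustration: a 175B-parameter model at 2 bytes per weight, ~32 GB/s for PCIe 4.0 x16, and ~3 TB/s (rounded) for H100-class HBM:

```python
# back-of-the-envelope: why swapping weights in and out kills throughput.
# assumed numbers: 175B params at 2 bytes each (fp16),
# PCIe 4.0 x16 ~ 32 GB/s, H100-class HBM ~ 3000 GB/s (rounded)
model_bytes = 175e9 * 2          # ~350 GB of weights
pcie_gbps = 32                   # GB/s over the PCIe bus
hbm_gbps = 3000                  # GB/s when weights already sit in VRAM

swap_seconds = model_bytes / 1e9 / pcie_gbps   # streaming weights in from host
local_seconds = model_bytes / 1e9 / hbm_gbps   # reading them from on-card memory

print(f"over PCIe: {swap_seconds:.1f} s per full pass over the weights")
print(f"from VRAM: {local_seconds:.2f} s per full pass over the weights")
```

with these assumed figures the bus is roughly two orders of magnitude slower, which is the "crawl" in question.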
that’s why they need huge datacenters and thousands of GPUs. And, pretty soon, dedicated power plants. It is insane just how wasteful this all is.
i wasn’t born yet. I don’t even think half of me was in my dad’s balls yet
imagine that to type one letter, you need to manually read all unicode code points several thousand times. When you’re done, you select one letter to type.
Then you start rereading all unicode code points again for thousands of times again, for the next letter.
That’s how llms work. When they say 175 billion parameters, it means at least that many calculations per token it generates
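the arithmetic behind that claim, as a quick sketch. The common rule of thumb (an assumption here, not something from the thread) is roughly 2 floating-point operations per parameter per generated token, one multiply and one add:

```python
# rough arithmetic for "every parameter gets touched for every token".
# assumption: ~2 FLOPs per parameter per generated token (multiply + add)
params = 175e9
flops_per_token = 2 * params          # ~350 GFLOP for a single token
tokens = 500                          # a medium-length reply
total = flops_per_token * tokens

print(f"{flops_per_token / 1e9:.0f} GFLOP per token")
print(f"{total / 1e12:.0f} TFLOP for a {tokens}-token reply")
```

so even a short reply is hundreds of trillions of operations, which is why "rereading all the code points" is not a bad mental model.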
funny how everyone who wants to write a new browser (except the ladybird guys) always skimp on writing the actual browser part
in yes/no type questions, a 50% success rate is the absolute worst one can do. Any worse and you’re just giving the inverse of the correct answer more than half the time
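a toy demo of that point. Here a made-up predictor that agrees with the truth only 30% of the time becomes a 70%-correct predictor the moment you invert everything it says (the 30% figure and the random data are assumptions for the demo):

```python
import random

# toy demo: a yes/no predictor that is wrong 70% of the time becomes
# a 70%-correct predictor once you flip its answers
random.seed(0)
truth = [random.choice([True, False]) for _ in range(10000)]
# "bad" agrees with the truth only ~30% of the time
bad = [t if random.random() < 0.3 else not t for t in truth]

acc = sum(b == t for b, t in zip(bad, truth)) / len(truth)
flipped_acc = sum((not b) == t for b, t in zip(bad, truth)) / len(truth)

print(f"raw accuracy:     {acc:.2f}")      # ~0.30
print(f"flipped accuracy: {flipped_acc:.2f}")  # ~0.70
```

flipped accuracy is exactly 1 minus the raw accuracy, so anything below 50% is an above-50% predictor wearing a disguise.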
they are improving at an exponential rate. It’s just that the exponent is less than one.
got a pc with a good deal. First thing I did was electrically cut off all unnecessary leds
if you’re concerned about how much you need to move your hand, then you’ll probably love (neo)vim
that’s why you get a little robot friend to clean it for you
theoretically, they wouldn’t, and yes, that is how it works. The math says so.
opposite or not, they are both tasks that the fixed-matrix-multiplications can utterly fail at. It’s not a regulation thing. It’s a math thing: this cannot possibly work.
If you could get the checker to be correct all of the time, then you could just do that on the model it’s “checking”, because it is literally the same thing, with the same failure modes, and the same lack of any real authority in anything it spits out
so? It was never advertised as intelligent and capable of solving any task other than that one.
Meanwhile, slop generators claim to be capable of doing a lot of things, reasoning included.
One claims to be good at chess. The other claims to be good at everything.
the driver itself is kilobytes in size. Megabytes is huge for such a simple thing
how does that stop the checker model from “hallucinating” a “yep, this is fine” when it should have said “nah, this is wrong”?
the first one was confident. But wrong. The second one could be just as confident and just as wrong.
what makes the checker models any more accurate?
you made me snort coffee out of my nose. I hope you’re proud of yourself
most code from the before times, from the long-long-ago, actually didn’t need a browser, and could fit on a floppy disk!
these types of laws usually come from the most technically illiterate people ever