• 0 Posts
  • 62 Comments
Joined 2 years ago
Cake day: June 25th, 2023


  • Well each token has a vector. So ‘co’ might be [0.8,0.3,0.7], just instead of 3 numbers it’s like 100-1000 long. And each token has a different such vector. Initially, those are just randomly generated. But the training algorithm is allowed to slowly modify them during training, pulling them this way and that, whichever way yields better results. So while for us, ‘th’ and ‘the’ are obviously related, for a model no such relation is given. It just sees random vectors, and the training reorganizes them to slowly have some structure. So who’s to say whether for the model ‘d’, ‘da’ and ‘co’ are in the same general area (similar vectors) whereas ‘de’ is in the opposite direction. Here’s an example of what this actually looks like. Tokens can be quite long, depending on how common they are; here it’s ones related to disease-y terms ending up close together, as similar things tend to cluster at this step. You might have a place where it’s just common town name suffixes clustered close to each other.

    and all of this is just what gets input into the llm, essentially a preprocessing step. So imagine someone gave you a picture like the above, but instead of each dot having some label, it just had a unique color. And then they give you lists of different colored dots and ask you what color the next dot should be. You need to figure out the rules yourself, come up with more and more intricate rules that are correct most often. That’s kinda what an LLM does. To it, ‘da’ and ‘de’ could be identical dots in the same location or completely different ones.

    plus of course that’s before the llm not actually knowing what a letter or a word or counting is. But it does know that 5.6.1.5.4.3 is most likely followed by 7.7.2.9.7 (simplified representation), which, when translated back, maps to ‘there are 3 r’s in strawberry’. it’s actually quite amazing that they can get it halfway right given how they work, just based on ‘learning’ how text structure works.

    but so in this example, US state-y tokens are probably close together, ‘d’ is somewhere else, the relation between ‘d’ and the different state-y tokens is not at all clear, plus other tokens making up the full state names could be who knows where. And then there’s whatever the model does on top of that with the data.

    for a human it’s easy, just split by letters and count. For an llm it’s trying to correlate lots of different and somewhat unrelated things to their ‘d-ness’, so to speak
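
    A minimal sketch of the "random vectors that training slowly pulls around" idea, assuming a made-up six-token vocabulary and 8 numbers per token. The `nudge_towards` helper is a hypothetical stand-in for a training step, not how a real optimizer works; it only shows that any closeness between ‘th’ and ‘the’ comes from training rather than from spelling.

```python
import math
import random

# Hypothetical toy vocabulary; real models have tens of thousands of tokens.
VOCAB = ["d", "da", "de", "co", "th", "the"]
DIM = 8  # assumption for readability; real vectors have hundreds of numbers


def init_embeddings(vocab, dim, seed=0):
    """Give every token its own randomly initialised vector."""
    rng = random.Random(seed)
    return {tok: [rng.gauss(0.0, 1.0) for _ in range(dim)] for tok in vocab}


def cosine(a, b):
    """Cosine similarity: 1 = same direction, -1 = opposite direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nudge_towards(vec, target, lr=0.1):
    """Stand-in for one training step: pull vec a little towards target."""
    return [v + lr * (t - v) for v, t in zip(vec, target)]


emb = init_embeddings(VOCAB, DIM)

# At initialisation, 'th' and 'the' are no more related than any other pair.
before = cosine(emb["th"], emb["the"])

# Pretend training keeps finding that moving 'th' towards 'the' helps.
for _ in range(50):
    emb["th"] = nudge_towards(emb["th"], emb["the"])
after = cosine(emb["th"], emb["the"])

print(f"th vs the: before={before:+.2f}, after={after:+.2f}")
```

    Nothing in the code ever looks at the letters inside a token, which is the point: ‘d’ and ‘de’ only end up near each other if training happens to push them there.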



  • They don’t look at it letter by letter but in tokens, which are automatically generated separately based on occurrence. So while ‘z’ could be its own token, ‘ne’ or even ‘the’ could be treated as a single token vector. of course, ‘e’ would still be a separate token when it occurs in isolation. You could even have ‘le’ and ‘let’ as separate tokens, afaik. And each token is just a vector of numbers, like 300 or 1000 numbers that represent that token in a vector space. So ‘de’ and ‘e’ could be completely different and dissimilar vectors.

    so ‘delaware’ could look to an llm more like de-la-w-are or similar.

    of course you could train it to figure out letter counts based on those tokens with a lot of training data, though that could lower performance on other tasks and counting letters just isn’t that important, i guess, compared to other stuff
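
    To make the de-la-w-are split concrete, here is a rough sketch using greedy longest-match over a hand-picked vocabulary. Both the vocabulary and the matching rule are assumptions for illustration; real tokenizers learn their pieces from frequency statistics and may split the word differently.

```python
# Hypothetical vocabulary, hand-picked so that 'delaware' splits as in the text.
VOCAB = {"de", "la", "w", "are", "the", "le", "let", "e", "z", "ne"}

# Each token is just an integer ID as far as the model is concerned.
TOKEN_ID = {tok: i for i, tok in enumerate(sorted(VOCAB))}


def tokenize(word, vocab):
    """Split word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Nothing matched: fall back to the single character.
            tokens.append(word[i])
            i += 1
    return tokens


tokens = tokenize("delaware", VOCAB)
ids = [TOKEN_ID[t] for t in tokens]

print(tokens)                      # ['de', 'la', 'w', 'are']
print(ids)                         # what the model actually receives
print("delaware".count("e"))       # 2 letters 'e' for a human
print(tokens.count("e"))           # 0 standalone 'e' tokens for the model
```

    The two ‘e’s are buried inside ‘de’ and ‘are’, so counting them requires the model to have learned which letters each token ID contains.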


  • Of course there are. But I mean, women’s hormones do affect mood during the menstrual cycle (my wife certainly says she’s more irritable before her period), and afaik the hormone therapy is some of the same hormones, so it didn’t seem far-fetched at all to me that it could play a role. hence me asking.

    but could as well have been some deep seated anger at the world or similar, or something in between. Mostly I was just trying to think of reasons for why she might not be as bad as she was seeming, benefit of the doubt kind of thing.


  • I used to work with a trans woman who was a huge bitch, at least some of the time. Like actually shouting at coworkers for tiny mistakes, all-caps shouting in company chat at people trying to help with stuff, thinking she’s the smartest person in any room, that kind of stuff.

    i’ve always wondered if she’s just a bitch or if at least some of it could be a side effect of hormone therapy? I mean, completely changing the hormones for your body must have some pretty dramatic effects in many areas and might take a long time until your body adjusts.

    but I definitely won’t just ask ‘yo. Are you just a huge bitch or is it your medication’ in a corporate setting.

    [edit] just for clarity, she started transitioning about 1 month after she joined that team and I left after about a year and a half, in part because of the mood on the team going to shit, among other reasons. But so I couldn’t compare to pre-hormone therapy or anything like that.

    [edit2] thank you for all the replies, this was really enlightening and answered a lot of questions! Especially on a topic i feel is discussed less often, or at least I haven’t come across.




  • I’m not really sure I follow.

    Just to be clear, I’m not justifying anything, and I’m not involved in those projects. But the examples I know concern LLMs customized/fine-tuned for clients for specific projects (so not used by others), and those clients asking to have confidence scores, people on our side saying that it’s possible but that it wouldn’t actually say anything about actual confidence/certainty, since the models don’t have any confidence metric beyond “how likely is the next token given these previous tokens” and the clients going “that’s fine, we want it anyways”.

    And if you ask me, LLMs shouldn’t be used for any of the stuff they’re used for there. It just cracks me up when the solution to “the lying machine is lying to me” is to ask the lying machine how much it’s lying. And when you tell them “it’ll lie about that too” they go “yeah, ok, that’s fine”.

    And making shit up is the whole functionality of LLMs, there’s nothing there other than that. It just can make shit up pretty well sometimes.
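
    For what it’s worth, this is roughly what such a "confidence score" boils down to, as a sketch with made-up logits. The `sequence_score` helper and all the numbers are hypothetical; the only thing taken from the discussion above is that the sole available signal is how likely each next token is given the previous ones.

```python
import math


def softmax(logits):
    """Turn raw next-token scores into probabilities that sum to 1."""
    top = max(logits.values())
    exps = {tok: math.exp(score - top) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sequence_score(step_logits, chosen_tokens):
    """Geometric mean of the probabilities of the tokens that were emitted."""
    log_p = 0.0
    for logits, tok in zip(step_logits, chosen_tokens):
        log_p += math.log(softmax(logits)[tok])
    return math.exp(log_p / len(chosen_tokens))


# Made-up logits for a four-token answer; each dict is one generation step.
step_logits = [
    {"The": 3.0, "A": 1.0},
    {"answer": 2.5, "reply": 0.5},
    {"is": 4.0, "was": 0.0},
    {"42": 2.0, "41": 1.9},
]
chosen = ["The", "answer", "is", "42"]

score = sequence_score(step_logits, chosen)
print(f"'confidence': {score:.2f}")
```

    The score measures how typical the wording is, not whether ‘42’ is right; a fluent wrong answer scores just as well.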



  • I tend to agree with Schopenhauer (other than it sounding quite arrogant/condescending the way he puts it…):

    The cheapest sort of pride is national pride; for if a man is proud of his own nation, it argues that he has no qualities of his own of which he can be proud; otherwise he would not have recourse to those which he shares with so many millions of his fellowmen. The man who is endowed with important personal qualities will be only too ready to see clearly in what respects his own nation falls short, since their failings will be constantly before his eyes. But every miserable fool who has nothing at all of which he can be proud adopts, as a last resource, pride in the nation to which he belongs; he is ready and glad to defend all its faults and follies tooth and nail, thus reimbursing himself for his own inferiority.


  • That makes sense, though at least where I’m from it’s usually not local. At least people seem to care most about soccer and ice hockey teams that are not from where they grew up or where they live. Maybe more handed down by parents?

    It’s mostly that shared parasocial relationships are weird to me. Like, the benefit of a parasocial relationship is that it helps with loneliness and fulfills social needs without any pressure. But a shared parasocial relationship, idk. You get pressure/obligations from your peers and you actually have a friend group for fulfilling social needs. at least i never felt an urge to combine my parasocial and social relationships.

    I mean, if it was just some activity you did to spend time with friends, sure, i get it. But it seems like the sport itself is more central than a group of friends, to the point of getting ostracized for liking another team. Or getting into fights over which team is better, that kind of stuff. I know that’s not how everyone interacts with team sports, but there is a sizable chunk of people that do take it pretty seriously, and that’s where I don’t follow why they do that and what they get out of it.


  • i think there’s some sports that are a bit of an acquired taste, like I don’t think the skill is immediately apparent the first time watching soccer, it’s “just people running around”. The strategy, technique etc. don’t show at first glance. As opposed to like skateboard tricks or dry tooling/ice climbing competitions, which also have depth but are impressive without any prior knowledge, imo.

    For me personally, it’s the fan aspect I don’t get. What’s the point of projecting the us vs. them mentality on some team, “we won”, and following a team almost religiously, even building one’s own identity around it, at least in part? In general, getting so emotionally invested in it, i don’t understand. And it seems to mostly be a team sport thing.





  • The solution proposed in “After Capitalism” is (with democratically worker-managed companies):

    A flat-rate tax on the capital assets of all productive enterprises is collected by the central government, all of which is plowed back into the economy, assisting those firms needing funds for purposes of productive investment. These funds are dispersed throughout society, first to regions and communities on a per capita basis, then to public banks in accordance with past performance, then to those firms with profitable project proposals. Profitable projects that promise increased employment and/or further other democratically decided goals are favored over those that do not. At each level—national, regional, and local—legislatures decide what portion of the investment fund coming to them is to be set aside for public capital expenditures, then send down the remainder, no strings attached, to the next lower level. Associated with most banks are entrepreneurial divisions, which promote firm expansion and new firm creation. Large enterprises that operate regionally or nationally might need access to additional capital, in which case it would be appropriate for the network of local investment banks to be supplemented by regional and national investment banks.

    That’s for taking care of the investment part that stocks/shares fulfill for a large part right now.

    And for getting there:

    Legislation giving workers the right to buy their company if they so choose. If workers so desire, a referendum is held to determine if the majority of workers want to democratize the company. If the referendum succeeds, a labor trust is formed, its directors selected democratically by the work-force, which, using funds derived from payroll deductions, purchase shares of the company on the stock market. In due time, the labor trust will come to own the majority of shares, at which time it takes full control via a leveraged buyout, that is, by borrowing the money to buy up the remaining shares.

    Along with legislation that if a company is bailed out by the government, it gets nationalized and turned into a worker self-managed company. If companies get sold, they can only be sold to the state (according to the value of current assets, not stock market cap or similar). And if a firm is not sold, it’s turned over to the workers upon the founder’s death. If there’s multiple founders, each can sell their share to the state or workers separately.

    For stocks specifically, there’s the Meidner plan, where every company with more than 50 employees is required to issue new shares each year equivalent to 20% of its profits. These shares will be held in a trust owned by the government, and in an estimated 35 years, most firms would become nationalized (of course alongside all newly founded firms having to be worker owned).

    Not saying I fully agree with all of Schweickart’s proposals, but at least the book is a relatively concrete proposal for an alternative that can be discussed, and how to possibly get there, so I thought it merits sharing.
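
    As a back-of-the-envelope check on the Meidner plan’s "estimated 35 years", here is a toy dilution model. The 20% levy is from the description above; the 10% annual profit rate (profit relative to the value of existing shares) and the absence of buybacks or share sales are my assumptions, so treat the output as an illustration rather than a forecast.

```python
def fund_share_after(years, profit_rate, levy=0.20):
    """Fraction of a firm held by the trust after a number of years.

    Each year the firm issues new shares worth levy * profit, which
    dilutes existing owners. profit_rate is an assumed input.
    """
    dilution = levy * profit_rate  # new shares relative to existing ones
    share = 0.0
    for _ in range(years):
        share = (share + dilution) / (1.0 + dilution)
    return share


def years_to_majority(profit_rate, levy=0.20, max_years=1000):
    """First year in which the trust holds more than half, or None."""
    dilution = levy * profit_rate
    share = 0.0
    for year in range(1, max_years + 1):
        share = (share + dilution) / (1.0 + dilution)
        if share > 0.5:
            return year
    return None


# Assumed 10% profit rate: 20% of that means 2% new shares per year.
print(f"{fund_share_after(35, 0.10):.3f}")  # just under one half
print(years_to_majority(0.10))              # 36
```

    Under those assumptions the trust sits just below 50% after 35 years and crosses it in year 36, which is in the same ballpark as the estimate; a higher profit rate shortens the timeline and a lower one stretches it.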