Their initial testing reveals that slight changes in the wording of queries can result in significantly different answers, undermining the reliability of the models.

  • davel [he/him]@lemmy.ml · 3 months ago

    Such an unfortunate title. No one who knows what they’re talking about would say that LLMs can—or ever will—reason. It’s not a flaw when something can’t do what it wasn’t designed to do.