A screenshot of this question was making the rounds last week. But this article covers testing against all the well-known models out there.

Also includes outtakes on the ‘reasoning’ models.

  • HugeNerd@lemmy.ca · 11 hours ago

    they’re all just guessing, literally

    They’re literally not.

    • m0darn@lemmy.ca · 11 hours ago

      Isn’t it a probabilistic extrapolation? Isn’t that what a guess is?

      • Iconoclast@feddit.uk · 8 hours ago

        It’s a Large Language Model. It doesn’t “know” anything, doesn’t think, and has zero metacognition. It generates language based on patterns and probabilities. Its only goal is to produce linguistically coherent output, not factually correct output.

        It gets things right sometimes purely because it was trained on a massive pile of correct information - not because it understands anything it’s saying.

        So no, it doesn’t “guess.” It doesn’t even know it’s answering a question. It just talks.

        • SuspciousCarrot78@lemmy.world · 2 minutes ago

          A fair point, but often overstated, I feel. And it overlooks something:

          Language itself encodes meaning. If you can statistically predict the next word, then you are implicitly modeling the structure of ideas, relationships, and concepts carried by that language.

          You don’t get coherence, useful reasoning, or consistently relevant answers from pure noise. The patterns reflect real regularities in the world, distilled through human communication.

          Yes, that doesn’t mean an LLM “understands” in the human sense, or that it’s infallible.

          But reducing it to “just autocomplete” misses the fact that sufficiently rich pattern modeling can approximate aspects of reasoning, abstraction, and knowledge use in ways that are practically meaningful, even if the underlying mechanism is different from human thought.

          TL;DR: it’s a bit more than just a fancy spell check. ICBW and YMMV

        • vii@lemmy.ml · 8 hours ago

          It gets things right sometimes purely because it was trained on a massive pile of correct information - not because it understands anything it’s saying.

          I know some humans that applies to.

        • KeenFlame@feddit.nu · 7 hours ago

          Yes, it guesstimates. What is wrong with you, arguing about semantics like that?

      • vii@lemmy.ml · 8 hours ago

        This gets very murky very fast when you start to think about how humans learn and process information; we’re just meaty pattern-matching machines.
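The “probabilistic extrapolation” the thread keeps circling can be sketched concretely. Below is a toy bigram model in Python, a huge simplification of what LLMs actually do (real models use neural networks over subword tokens, and the corpus and names here are purely illustrative): every generated word is a weighted guess based on observed frequencies, with no notion of truth anywhere in the loop.

```python
import random

# Illustrative toy corpus; the "model" is just bigram counts.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word.
counts = {}
for prev, nxt in zip(corpus, corpus[1:]):
    counts.setdefault(prev, {})
    counts[prev][nxt] = counts[prev].get(nxt, 0) + 1

def next_token(prev):
    """Sample the next word in proportion to its observed frequency."""
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

# Generate: each step is a weighted guess conditioned on the last word.
word = "the"
out = [word]
for _ in range(5):
    if word not in counts:  # dead end (last word of corpus)
        break
    word = next_token(word)
    out.append(word)
print(" ".join(out))
```

Whether you call that step “guessing,” “predicting,” or “talking” is exactly the semantic argument above; mechanically it is sampling from a conditional distribution.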