• RestrictedAccount@lemmy.world
    2 days ago

    I keep hearing how Claude is so much better and that I need to try it. Just last night I had an AI expert spend an hour telling me how Claude can just do things by itself and how great it is.

    So I paid the money and got Claude. I gave it the task of summarizing a bunch of press releases and putting them into a newsletter. After spending a lot of time getting the formatting perfect, I went to fact-check what it had written.

    About a third of it was totally hallucinated.

    This was today.

    All it had to do was read press releases and make a summary, and it couldn’t do that without hallucinating a bunch of fake DEI facts that just weren’t true.

    So I gave it the task of verifying all of the statements it had made, and it ran out of juice, so I have to wait till it resets.

    • toxoplasma0gondii@feddit.org
      8 hours ago

      You might try different prompting to make sure it only takes information from your input. Also, look into ways to engineer your prompt to give the LLM less room for creativity, or maybe make an assistant for the task. Claude is definitely able to use only info from your own input. We use it that way at work to make compliance material searchable.
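      A minimal sketch of that kind of input-restricted prompting (the function name and exact wording here are my own, not anything from Claude's docs):

```python
def build_grounded_prompt(source_text: str, task: str) -> str:
    """Wrap a task in instructions that restrict the model to the
    provided source text. Tag names and phrasing are illustrative."""
    return (
        "Use ONLY the information in the source text below. "
        "Do not add facts from outside it. If something is not in the "
        "source, reply 'not stated in the source'.\n\n"
        f"<source>\n{source_text}\n</source>\n\n"
        f"Task: {task}"
    )

# Example: the press-release text and task are made up for illustration.
prompt = build_grounded_prompt(
    "Acme Corp announced a new widget on March 3.",
    "Summarize the press release in one sentence.",
)
```

      The idea is just to fence off the source material explicitly and give the model a safe fallback answer, which tends to leave it less room to invent details.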

      There are still different models best suited to different tasks, though. One of the Gemini models has a very low hallucination rate.