GPT-4 Can’t Reason

Corresponding arXiv preprint: https://arxiv.org/abs/2308.03762

AI @lemmy.ml

https://medium.com/@konstantine_45825/gpt-4-cant-reason-2eab795e2523

4 comments

"Reason" is an absurdly loaded term
- I mean, the author of this piece pretty concretely defined the term as he was using it. More to the point, it’s a pretty accurate article in showing things that ChatGPT can’t do, and it matches my own experience with a couple of tasks I wanted it to help me with.
  Specifically, I wanted its help making a conlang to my specifications. I found that ChatGPT, even the paid version, could help me generate all kinds of grammatical rules and phonology and whatnot, but once we had all these rules together, it was utterly incapable of following them to generate text, or even example words, in the conlang we were developing. It was pretty infuriating to work with and I eventually gave up, although I could have (and probably should have) just taken the rules and run them through purpose-built conlanging programs. But I was really hoping ChatGPT could do it all with me.
  It can’t. It can write rules, but it can’t follow rules. It really doesn’t know how.
  Another thing it struggles with? Ask it to write a poem with specific form requirements. For the simplest example, try to get it to write blank verse: it will repeatedly insist on rhyming every last word, even though blank verse is, by definition, unrhymed. ChatGPT simply doesn’t know how to stop rhyming when writing poetry of any form.
  
  I like to explain LLMs to people as "glorified autocompletes". They're just stringing words together in "the most rational way possible" based on the training data. They're not "sentient" or "smart", but they can still surprise our meat brains.
  In other words, it doesn't "know" anything, but can still output a pattern that makes us go "Ooooooo it KNOWS".
  Folks are getting better at training specific goals into their models. So the math that failed yesterday may work tomorrow and then fail again the day after. These problems will be solved in time and we'll have a broader range of surprising output moments.
  I dunno, just feels like a waste of an article for anyone in the know and confusing for those not paying attention. "ChatGPT doesn't have a soul!" Ya, duh . . .
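
To make the "glorified autocomplete" picture above a bit more concrete, here is a toy sketch. It is a plain bigram counter, not how GPT-4 or any real LLM is built (those are neural networks over tokens); the function names `train_bigrams` and `autocomplete` and the tiny corpus are made up for illustration. It only shows the basic idea of picking the next word from what followed it in the training data.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def autocomplete(followers, start, length=5):
    """Greedily append the most frequent follower of the last word."""
    out = [start]
    for _ in range(length):
        options = followers.get(out[-1])
        if not options:
            # The last word never had a follower in the training text.
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigrams(corpus)
print(autocomplete(model, "on", length=2))  # prints "on the cat"
```

The output looks sensible only because "the" usually followed "on" and "cat" usually followed "the" in the corpus; nothing in the code knows what a cat is.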

LLMs can't reason.