Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

lol is this news? I mean we call it AI, but it’s just LLM and variants it doesn’t think.

The "Apple" part. CEOs only care what companies say.
- Apple is significantly behind and arrived late to the whole AI hype, so of course it's in their absolute best interest to keep showing how LLMs aren't special or amazingly revolutionary.
  They're not wrong, but the motivation is also pretty clear.
Proving it matters. Science is constantly proving any other thing that people believe is obvious because people have an uncanning ability to believe things that are false. Some people will believe things long after science has proven them false.
- I mean… “proving” is also just marketing speak. There is no clear definition of reasoning, so there’s also no way to prove or disprove that something/someone reasons.
"It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'." -Pamela McCorduck´.
It's called the AI Effect.
As Larry Tesler puts it, "AI is whatever hasn't been done yet.".
- That entire paragraph is much better at supporting the precise opposite argument. Computers can beat Kasparov at chess, but they're clearly not thinking when making a move - even if we use the most open biological definitions for thinking.
- I'm going to write a program to play tic-tac-toe. If y'all don't think it's "AI", then you're just haters. Nothing will ever be good enough for y'all. You want scientific evidence of intelligence?!?! I can't even define intelligence so take that! \s
  Seriously tho. This person is arguing that a checkers program is "AI". It kinda demonstrates the loooong history of this grift.
- Yesterday I asked an LLM "how much energy is stored in a grand piano?" It responded with saying there is no energy stored in a grad piano because it doesn't have a battery.
  Any reasoning human would have understood that question to be referring to the tension in the strings.
  Another example is asking "does lime cause kidney stones?". It didn't assume I mean lime the mineral and went with lime the citrus fruit instead.
  Once again a reasoning human would assume the question is about the mineral.
  Ask these questions again in a slightly different way and you might get a correct answer, but it won't be because the LLM was thinking.
This is why I say these articles are so similar to how right wing media covers issues about immigrants.
There's some weird media push to convince the left to hate AI. Think of all the headlines for these issues. There are so many similarities. They're taking jobs. They are a threat to our way of life. The headlines talk about how they will sexual assault your wife, your children, you. Threats to the environment. There's articles like this where they take something known as twist it to make it sound nefarious to keep the story alive and avoid decay of interest.
Then when they pass laws, we're all primed to accept them removing whatever it is that advantageous them and disadvantageous us.
- This is why I say these articles are so similar to how right wing media covers issues about immigrants.
  Maybe the actual problem is people who equate computer programs with people.
  Then when they pass laws, we’re all primed to accept them removing whatever it is that advantageous them and disadvantageous us.
  You mean laws like this? jfc.
  https://www.inc.com/sam-blum/trumps-budget-would-ban-states-from-regulating-ai-for-10-years-why-that-could-be-a-problem-for-everyday-americans/91198975
- Because it's a fear-mongering angle that still sells. AI has been a vehicle for scifi for so long that trying to convince Boomers that of won't kill us all is the hard part.
  I'm a moderate user for code and skeptic of LLM abilities, but 5 years from now when we are leveraging ML models for groundbreaking science and haven't been nuked by SkyNet, all of this will look quaint and silly.

Wow it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.

This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP Theft, disinformation regarding AI and maybe even murder of their whistle blower.
It's hard to to be heard when you're buried under all that sweet VC/grant money.
And engineers who stood to make a lot of money
For me it kinda went the other way, I'm almost convinced that human intelligence is the same pattern repeating, just more general (yet)
- Except that wouldn't explain conscience. There's absolutely no need for conscience or an illusion(*) of conscience. Yet we have it.
  arguably, conscience can by definition not be an illusion. We either perceive "ourselves" or we don't

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

When given explicit instructions to follow models failed because they had not seen similar instructions before.
This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.
- I'm not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.
  If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.
Yeah these comments have the three hallmarks of Lemmy:
AI is just autocomplete mantras.
Apple is always synonymous with bad and dumb.
Rare pockets of really thoughtful comments.
Thanks for being at least the latter.
Some AI researchers found it obvious as well, in terms of they've suspected it and had some indications. But it's good to see more data on this to affirm this assessment.
- Particularly to counter some more baseless marketing assertions about the nature of the technology.
- Lots of us who has done some time in search and relevancy early on knew ML was always largely breathless overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything that exec doesnt understand is profitable and worth doing.
There's probably alot of misunderstanding because these grifters intentionally use misleading language: AI, reasoning, etc.
If they stuck to scientifically descriptive terms, it would be much more clear and much less sensational.
What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart alghoritms or a bunch of instructions for software (if/then) was officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced alghoritms, it's no longer reasoning? I feel like at this point a more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.
https://en.wikipedia.org/wiki/Reasoning_system
- If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It's like comparing PhD reasoning to a dog's reasoning.
  While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack a strong metacognition and the ability to make simple logical inferences (eg: why they fail at the shell game).
  Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it's designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don't have the tech to make a synthetic human.
- I think as we approach the uncanny valley of machine intelligence, it's no longer a cute cartoon but a menacing creepy not-quite imitation of ourselves.
- Sure, these grifters are shady AF about their wacky definition of "reason"... But that's just a continuation of the entire "AI" grift.
Cognitive scientist Douglas Hofstadter (1979) showed reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn't if AI can reason, but how its reasoning differs from ours.
What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to "reasoning models" that allow them to break free of the inherent boundaries of the statistical methods they are based on?
- I'd encourage you to research more about this space and learn more.
  As it is, the statement "Markov chains are still the basis of inference" doesn't make sense, because markov chains are a separate thing. You might be thinking of Markov decision processes, which is used in training RL agents, but that's also unrelated because these models are not RL agents, they're supervised learning agents. And even if they were RL agents, the MDP describes the training environment, not the model itself, so it's not really used for inference.
  I mean this just as an invitation to learn more, and not pushback for raising concerns. Many in the research community would be more than happy to welcome you into it. The world needs more people who are skeptical of AI doing research in this field.

No way!

Statistical Language models don't reason?

But OpenAI, robots taking over!

When are people going to realize, in its current state , an LLM is not intelligent. It doesn’t reason. It does not have intuition. It’s a word predictor.

Intuition is about the only thing it has. It's a statistical system. The problem is it doesn't have logic. We assume because its computer based that it must be more logic oriented but it's the opposite. That's the problem. We can't get it to do logic very well because it basically feels out the next token by something like instinct. In particular it doesn't mask or disconsider irrelevant information very well if two segments are near each other in embedding space, which doesn't guarantee relevance. So then the model is just weighing all of this info, relevant or irrelevant to a weighted feeling for the next token.
This is the core problem. People can handle fuzzy topics and discrete topics. But we really struggle to create any system that can do both like we can. Either we create programming logic that is purely discrete or we create statistics that are fuzzy.
Of course this issue of masking out information that is close in embedding space but is irrelevant to a logical premise is something many humans suck at too. But high functioning humans don't and we can't get these models to copy that ability. Too many people, sadly many on the left in particular, not only will treat association as always relevant but sometimes as equivalence. RE racism is assoc with nazism is assoc patriarchy is historically related to the origins of capitalism ∴ nazism ≡ capitalism. While national socialism was anti-capitalist. Associative thinking removes nuance. And sadly some people think this way. And they 100% can be replaced by LLMs today, because at least the LLM is mimicking what logic looks like better though still built on blind association. It just has more blind associations and finetune weighting for summing them. More than a human does. So it can carry that to mask as logical further than a human who is on the associative thought train can.
- You had a compelling description of how ML models work and just had to swerve into politics, huh?
People think they want AI, but they don’t even know what AI is on a conceptual level.
- They want something like the Star Trek computer or one of Tony Stark's AIs that were basically deus ex machinas for solving some hard problem behind the scenes. Then it can say "model solved" or they can show a test simulation where the ship doesn't explode (or sometimes a test where it only has an 85% chance of exploding when it used to be 100%, at which point human intuition comes in and saves the day by suddenly being better than the AI again and threads that 15% needle or maybe abducts the captain to go have lizard babies with).
  AIs that are smarter than us but for some reason don't replace or even really join us (Vision being an exception to the 2nd, and Ultron trying to be an exception to the 1st).
- Yeah I often think about this Rick N Morty cartoon. Grifters are like, "We made an AI ankle!!!" And I'm like, "That's not actually something that people with busted ankles want. They just want to walk. No need for a sentient ankle." It's a real gross distortion of science how everything needs to be "AI" nowadays.
I agree with you. In its current state, LLM is not sentient, and thus not "Intelligence".
- I think it's an easy mistake to confuse sentience and intelligence. It happens in Hollywood all the time - "Skynet began learning at a geometric rate, on July 23 2004 it became self-aware" yadda yadda
  But that's not how sentience works. We don't have to be as intelligent as Skynet supposedly was in order to be sentient. We don't start our lives as unthinking robots, and then one day - once we've finally got a handle on calculus or a deep enough understanding of the causes of the fall of the Roman empire - we suddenly blink into consciousness. On the contrary, even the stupidest humans are accepted as being sentient. Even a young child, not yet able to walk or do anything more than vomit on their parents' new sofa, is considered as a conscious individual.
  So there is no reason to think that AI - whenever it should be achieved, if ever - will be conscious any more than the dumb computers that precede it.
You'd think the M in LLM would give it away.
And that's pretty damn useful, but obnoxious to have expectations wildly set incorrectly.

Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.

I can envision a system where an LLM becomes one part of a reasoning AI, acting as a kind of fuzzy "dataset" that a proper neural network incorporates and reasons with, and the LLM could be kept real-time updated (sort of) with MCP servers that incorporate anything new it learns.
But I don't think we're anywhere near there yet.
- The only reason we're not there yet is memory limitations.
  Eventually some company will come out with AI hardware that lets you link up a petabyte of ultra fast memory to chips that contain a million parallel matrix math processors. Then we'll have an entirely new problem: AI that trains itself incorrectly too quickly.
  Just you watch: The next big breakthrough in AI tech will come around 2032-2035 (when the hardware is available) and everyone will be bitching that "chain reasoning" (or whatever the term turns out to be) isn't as smart as everyone thinks it is.
- LLMs (at least in their current form) are proper neural networks.
Unlike Markov models, modern LLMs use transformers that attend to full contexts, enabling them to simulate structured, multi-step reasoning (albeit imperfectly). While they don’t initiate reasoning like humans, they can generate and refine internal chains of thought when prompted, and emerging frameworks (like ReAct or Toolformer) allow them to update working memory via external tools. Reasoning is limited, but not physically impossible, it’s evolving beyond simple pattern-matching toward more dynamic and compositional processing.
- Reasoning is limited
  Most people wouldn't call zero of something 'limited'.
- I'm not convinced that humans don't reason in a similar fashion. When I'm asked to produce pointless bullshit at work my brain puts in a similar level of reasoning to an LLM.
  Think about "normal" programming: An experienced developer (that's self-trained on dozens of enterprise code bases) doesn't have to think much at all about 90% of what they're coding. It's all bog standard bullshit so they end up copying and pasting from previous work, Stack Overflow, etc because it's nothing special.
  The remaining 10% is "the hard stuff". They have to read documentation, search the Internet, and then—after all that effort to avoid having to think—they sigh and start actually start thinking in order to program the thing they need.
  LLMs go through similar motions behind the scenes! Probably because they were created by software developers but they still fail at that last 90%: The stuff that requires actual thinking.
  Eventually someone is going to figure out how to auto-generate LoRAs based on test cases combined with trial and error that then get used by the AI model to improve itself and that is when people are going to be like, "Oh shit! Maybe AGI really is imminent!" But again, they'll be wrong.
  AGI won't happen until AI models get good at retraining themselves with something better than basic reinforcement learning. In order for that to happen you need the working memory of the model to be nearly as big as the hardware that was used to train it. That, and loads and loads of spare matrix math processors ready to go for handing that retraining.
- previous input goes in. Completely static, prebuilt model processes it and comes up with a probability distribution.
  There is no "unlike markov chains". They are markov chains. Ones with a long context (a markov chain also kakes use of all the context provided to it, so I don't know what you're on about there). LLMs are just a (very) lossy compression scheme for the state transition table. Computed once, applied blindly to any context fed in.

this is so Apple, claiming to invent or discover something "first" 3 years later than the rest of the market

Trust Apple. Everyone else who were in the space first are lying.

You know, despite not really believing LLM "intelligence" works anywhere like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...

But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLM finishing 100 move towers of Hanoï (on which they were trained) and failing 4 move river crossings. Logically, those problems are very similar... Also, failing to apply a step-by-step solution they were given.

This paper doesn’t prove that LLMs aren’t good at pattern recognition, it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.
Computers are awesome at "recognizing patterns" as long as the pattern is a statistical average of some possibly worthless data set. And it really helps if the computer is setup to ahead of time to recognize pre-determined patterns.

I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called "complex") puzzles. Like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.

does ANY model reason at all?

No, and to make that work using the current structures we use for creating AI models we’d probably need all the collective computing power on earth at once.
- ...... So you're saying there's a chance?
Define reason.
Like humans? Of course not. They lack intent, awareness, and grounded meaning. They don’t “understand” problems, they generate token sequences.
- as it is defined in the article
I think I do. Might be an illusion, though.

Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.

Even defining reason is hard and becomes a matter of philosophy more than science. For example, apply the same claims to people. Now I've given you something to think about. Or should I say the Markov chain in your head has a new topic to generate thought states for.
- By many definitions, reasoning IS just a form of pattern recognition so the lines are definitely blurred.

What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.

ICYMI: A.I. is a Religious Cult with Karen Hao

No shit

Why would they "prove" something that's completely obvious?

The burden of proof is on the grifters who have overwhelmingly been making false claims and distorting language for decades.

Why would they "prove" something that's completely obvious?
I don’t want to be critical, but I think if you step back a bit and look and what you’re saying, you’re asking why we would bother to experiment and prove what we think we know.
That’s a perfectly normal and reasonable scientific pursuit. Yes, in a rational society the burden of proof would be on the grifters, but that’s never how it actually works. It’s always the doctors disproving the cure-all, not the snake oil salesmen failing to prove their own prove their own product.
There is value in this research, even if it fits what you already believe on the subject. I would think you would be thrilled to have your hypothesis confirmed.
- The sticky wicket is the proof that humans (functioning 'normally') do more than pattern.
They’re just using the terminology that’s widespread in the field. In a sense, the paper’s purpose is to prove that this terminology is unsuitable.
- I understand that people in this "field" regularly use pseudo-scientific language (I actually deleted that part of my comment).
  But the terminology has never been suitable so it shouldn't be used in the first place. It pre-supposes the hypothesis that they're supposedly "disproving". They're feeding into the grift because that's what the field is. That's how they all get paid the big bucks.
That's called science
Not when large swaths of people are being told to use it everyday. Upper management has bought in on it.
- Yep. I'm retired now, but before retirement a month or so ago, I was working on a project that relied on several hundred people back in 2020. "Why can't AI do it?"
  The people I worked with are continuing the research and putting it up against the human coders, but...there was definitely an element of "AI can do that, we won't need people" next time. I sincerely hope management listens to reason. Our decisions would lead to potentially firing people, so I think we were able to push back on the "AI can make all of these decisions"...for now.
  The AI people were all in, they were ready to build an interface that told the human what the AI would recommend for each item. Errrm, no, that's not how an independent test works. We had to reel them back in.

Just like me

python code for reversing the linked list.

Most humans don't reason. They just parrot shit too. The design is very human.

LLMs deal with tokens. Essentially, predicting a series of bytes.
Humans do much, much, much, much, much, much, much more than that.
- No. They don't. We just call them proteins.
I hate this analogy. As a throwaway whimsical quip it'd be fine, but it's specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it's lowered my tolerance for it as a topic even if you did intend it flippantly.
- I don't mean it to extol LLM's but rather to denigrate humans. How many of us are self imprisoned in echo chambers so we can have our feelings validated to avoid the uncomfortable feeling of thinking critically and perhaps changing viewpoints?
  Humans have the ability to actually think, unlike LLM's. But it's frightening how far we'll go to make sure we don't.
Thata why ceo love them. When your job is 90% spewing bs a machine that does that is impressive
Yeah I've always said the the flaw in Turing's Imitation Game concept is that if an AI was indistinguishable from a human it wouldn't prove it's intelligent. Because humans are dumb as shit. Dumb enough to force one of the smartest people in the world take a ton of drugs which eventually killed him simply because he was gay.
- I've heard something along the lines of, "it's not when computers can pass the Turing Test, it's when they start failing it on purpose that's the real problem."
- I think that person had to choose between the drugs or hard core prison of the 1950s England where being a bit odd was enough to guarantee an incredibly difficult time as they say in England, I would've chosen the drugs as well hoping they would fix me, too bad without testosterone you're going to be suicidal and depressed, I'd rather choose to keep my hair than to be horny all the time
- Yeah we’re so stupid we’ve figured out advanced maths, physics, built incredible skyscrapers and the LHC, we may as individuals be less or more intelligent but humans as a whole are incredibly intelligent

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK

The funny thing about this "AI" griftosphere is how grifters will make some outlandish claim and then different grifters will "disprove" it. Plenty of grant/VC money for everybody.
Extept for Siri, right? Lol
- Apple Intelligence
Without being explicit with well researched material, then the marketing presentation gets to stand largely unopposed.
So this is good even if most experts in the field consider it an obvious result.

Fucking obviously. Until Data's positronic brains becomes reality, AI is not actual intelligence.

AI is not A I. I should make that a tshirt.

It’s an expensive carbon spewing parrot.
- It's a very resource intensive autocomplete

stochastic parrots. all of them. just upgraded “soundex” models.

this should be no surprise, of course!

This has been known for years, this is the default assumption of how these models work.

You would have to prove that some kind of actual reasoning capacity has arisen as... some kind of emergent complexity phenomenon.... not the other way around.

Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.

Define, "reasoning". For decades software developers have been writing code with conditionals. That's "reasoning."
LLMs are "reasoning"... They're just not doing human-like reasoning.
- Howabout uh...
  The ability to take a previously given set of knowledge, experiences and concepts, and combine or synthesize them in a consistent, non contradictory manner, to generate hitherto unrealized knowledge, or concepts, and then also be able to verify that those new knowledge and concepts are actually new, and actually valid, or at least be able to propose how one could test whether or not they are valid.
  Arguably this is or involves meta-cognition, but that is what I would say... is the difference between what we typically think of as 'machine reasoning', and 'human reasoning'.
  Now I will grant you that a large amount of humans essentially cannot do this, they suck at introspecting and maintaining logical consistency, that they are just told 'this is how things work', and they never question that untill decades later and their lives force them to address, or dismiss their own internally inconsisten beliefs.
  But I would also say that this means they are bad at 'human reasoning'.
  Basically, my definition of 'human reasoning' is perhaps more accurately described as 'critical thinking'.

No shit. This isn't new.

Employers who are foaming at the mouth at the thought of replacing their workers with cheap AI:

🫢

Can’t really replace. At best, this tech will make employees more productive at the cost of the rainforests.
- Yes but asshole employers haven’t realized this yet

Thank you Captain Obvious! Only those who think LLMs are like "little people in the computer" didn't knew this already.

Yeah, well there are a ton of people literally falling into psychosis, led by LLMs. So it’s unfortunately not that many people that already knew it.
- Dude they made chat gpt a little more boit licky and now many people are convinced they are literal messiahs. All it took for them was a chat bot and a few hours of talk.

I think it's important to note (i'm not an llm I know that phrase triggers you to assume I am) that they haven't proven this as an inherent architectural issue, which I think would be the next step to the assertion.

do we know that they don't and are incapable of reasoning, or do we just know that for x problems they jump to memorized solutions, is it possible to create an arrangement of weights that can genuinely reason, even if the current models don't? That's the big question that needs answered. It's still possible that we just haven't properly incentivized reason over memorization during training.

if someone can objectively answer "no" to that, the bubble collapses.

In case you haven't seen it, the paper is here - https://machinelearning.apple.com/research/illusion-of-thinking (PDF linked on the left).
The puzzles the researchers have chosen are spatial and logical reasoning puzzles - so certainly not the natural domain of LLMs. The paper doesn't unfortunately give a clear definition of reasoning, I think I might surmise it as "analysing a scenario and extracting rules that allow you to achieve a desired outcome".
They also don't provide the prompts they use - not even for the cases where they say they provide the algorithm in the prompt, which makes that aspect less convincing to me.
What I did find noteworthy was how the models were able to provide around 100 steps correctly for larger Tower of Hanoi problems, but only 4 or 5 correct steps for larger River Crossing problems. I think the River Crossing problem is like the one where you have a boatman who wants to get a fox, a chicken and a bag of rice across a river, but can only take two in his boat at one time? In any case, the researchers suggest that this could be because there will be plenty of examples of Towers of Hanoi with larger numbers of disks, while not so many examples of the River Crossing with a lot more than the typical number of items being ferried across. This being more evidence that the LLMs (and LRMs) are merely recalling examples they've seen, rather than genuinely working them out.
do we know that they don't and are incapable of reasoning.
"even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve"
- That indicates that this particular model does not follow instructions, not that it is architecturally fundamentally incapable.

This sort of thing has been published a lot for awhile now, but why is it assumed that this isn't what human reasoning consists of? Isn't all our reasoning ultimately a form of pattern memorization? I sure feel like it is. So to me all these studies that prove they're "just" memorizing patterns don't prove anything other than that, unless coupled with research on the human brain to prove we do something different.

Agreed. We don't seem to have a very cohesive idea of what human consciousness is or how it works.
- ... And so we should call machines "intelligent"? That's not how science works.
You've hit the nail on the head.
Personally, I wish that there's more progress in our understanding of human intelligence.
- Their argument is that we don't understand human intelligence so we should call computers intelligent.
  That's not hitting any nail on the head.
why is it assumed that this isn’t what human reasoning consists of?
Because science doesn't work work like that. Nobody should assume wild hypotheses without any evidence whatsoever.
Isn’t all our reasoning ultimately a form of pattern memorization? I sure feel like it is.
You should get a job in "AI". smh.
- Sorry, I can see why my original post was confusing, but I think you've misunderstood me. I'm not claiming that I know the way humans reason. In fact you and I are on total agreement that it is unscientific to assume hypotheses without evidence. This is exactly what I am saying is the mistake in the statement "AI doesn't actually reason, it just follows patterns". That is unscientific if we don't know whether or "actually reasoning" consists of following patterns, or something else. As far as I know, the jury is out on the fundamental nature of how human reasoning works. It's my personal, subjective feeling that human reasoning works by following patterns. But I'm not saying "AI does actually reason like humans because it follows patterns like we do". Again, I see how what I said could have come off that way. What I mean more precisely is:
  It's not clear whether AI's pattern-following techniques are the same as human reasoning, because we aren't clear on how human reasoning works. My intuition tells me that humans doing pattern following seems equally as valid of an initial guess as humans not doing pattern following, so shouldn't we have studies to back up the direction we lean in one way or the other?
  I think you and I are in agreement, we're upholding the same principle but in different directions.
This. Same with the discussion about consciousness. People always claim that AI is not real intelligence, but no one can ever define what real/human intelligence is. It's like people believe in something like a human soul without admitting it.
Humans apply judgment, because they have emotion. LLMs do not possess emotion. Mimicking emotion without ever actually having the capability of experiencing it is sociopathy. An LLM would at best apply patterns like a sociopath.
- But for something like solving a Towers of Hanoi puzzle, which is what this study is about, we're not looking for emotional judgements - we're trying to evaluate the logical reasoning capabilities. A sociopath would be equally capable of solving logic puzzles compared to a non-sociopath. In fact, simple computer programs do a great job of solving these puzzles, and they certainly have nothing like emotions. So I'm not sure that emotions have much relevance to the topic of AI or human reasoning and problem solving, at least not this particular aspect of it.
  As for analogizing LLMs to sociopaths, I think that's a bit odd too. The reason why we (stereotypically) find sociopathy concerning is that a person has their own desires which, in combination with a disinterest in others' feelings, incentivizes them to be deceitful or harmful in some scenarios. But LLMs are largely designed specifically as servile, having no will or desires of their own. If people find it concerning that LLMs imitate emotions, then I think we're giving them far too much credit as sentient autonomous beings - and this is coming from someone who thinks they think in the same way we do! The think like we do, IMO, but they lack a lot of the other subsystems that are necessary for an entity to function in a way that can be considered as autonomous/having free will/desires of its own choosing, etc.
- That just means they'd be great CEOs!
  According to Wall Street.

Would like a link to the original research paper, instead of a link of a screenshot of a screenshot

https://machinelearning.apple.com/research/illusion-of-thinking

It's all "one instruction at a time" regardless of high processor speeds and words like "intelligent" being bandied about. "Reason" discussions should fall into the same query bucket as "sentience".

My impression of LLM training and deployment is that it's actually massively parallel in nature - which can be implemented one instruction at a time - but isn't in practice.

Yah of course they do they’re computers

That's not really a valid argument for why, but yes the models which use training data to assemble statistical models are all bullshitting. TBH idk how people can convince themselves otherwise.
- TBH idk how people can convince themselves otherwise.
  They don’t convince themselves. They’re convinced by the multi billion dollar corporations pouring unholy amounts of money into not only the development of AI, but its marketing. Marketing designed to not only convince them that AI is something it’s not, but also that that anyone who says otherwise (like you) are just luddites who are going to be “left behind”.
- I think because it's language.
  There's a famous quote from Charles Babbage when he presented his difference engine (gear based calculator) and someone asking "if you put in the wrong figures, will the correct ones be output" and Babbage not understanding how someone can so thoroughly misunderstand that the machine is, just a machine.
  People are people, the main thing that's changed since the Cuneiform copper customer complaint is our materials science and networking ability. Most things that people interact with every day, most people just assume work like it appears to on the surface.
  And nothing other than a person can do math problems or talk back to you. So people assume that means intelligence.
- They aren't bullshitting because the training data is based on reality. Reality bleeds through the training data into the model. The model is a reflection of reality.
Computers are better at logic than brains are. We emulate logic; they do it natively.
It just so happens there's no logical algorithm for "reasoning" a problem through.

You assume humans do the opposite? We literally institutionalize humans who not follow set patterns.

Maybe you failed all your high school classes, but that ain't got none to do with me.
- Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.
Some of them, sometimes. But some are adulated and free and contribute vast swathes to our culture and understanding.

XD so, like a regular school/university student that just wants to get passing grades?

Of course, that is obvious to all having basic knowledge of neural networks, no?

I still remember Geoff Hinton's criticisms of backpropagation.
IMO it is still remarkable what NNs managed to achieve: some form of emergent intelligence.

So, what your saying here is that the A in AI actually stands for artificial, and it's not really intelligent and reasoning.

Huh.

The AI stands for Actually Indians /s

I mean... Is that not reasoning, I guess? It's what my brain does-- recognizes patterns and makes split second decisions.

Yes, this comment seems to indicate that your brain does work that way.

hey I cant recognize patterns so theyre smarter than me at least

I use LLMs as advanced search engines. No ads or sponsored results.

There are search engines that do this better. There’s a world out there beyond Google.
- Like what?
  I don’t think there’s any search engine better than Perplexity. And for scientific research Consensus is miles ahead.
There are ads but they're subtle enough that you don't recognize them as such.

The difference between reasoning models and normal models is reasoning models are two steps, to oversimplify it a little they prompt "how would you go about responding to this" then prompt "write the response"

It's still predicting the most likely thing to come next, but the difference is that it gives the chance for the model to write the most likely instructions to follow for the task, then the most likely result of following the instructions - both of which are much more conformant to patterns than a single jump from prompt to response.

But it still manages to fuck it up.
I've been experimenting with using Claude's Sonnet model in Copilot in agent mode for my job, and one of the things that's become abundantly clear is that it has certain types of behavior that are heavily represented in the model, so it assumes you want that behavior even if you explicitly tell it you don't.
Say you're working in a yarn workspaces project, and you instruct Copilot to build and test a new dashboard using an instruction file. You'll need to include explicit and repeated reminders all throughout the file to use yarn, not NPM, because even though yarn is very popular today, there are so many older examples of using NPM in its model that it's just going to assume that's what you actually want - thereby fucking up your codebase.
I've also had lots of cases where I tell it I don't want it to edit any code, just to analyze and explain something that's there and how to update it... and then I have to stop it from editing code anyway, because halfway through it forgot that I didn't want edits, just explanations.
- I’ve also had lots of cases where I tell it I don’t want it to edit any code, just to analyze and explain something that’s there and how to update it… and then I have to stop it from editing code anyway, because halfway through it forgot that I didn’t want edits, just explanations.
  I find it hilarious that the only people these LLMs mimic are the incompetent ones. I had a coworker that changed things when asked to explain constantly.
- To be fair, the world of JavaScript is such a clusterfuck... Can you really blame the LLM for needing constant reminders about the specifics of your project?
  When a programming language has five hundred bazillion absolutely terrible ways of accomplishing a given thing—and endless absolutely awful code examples on the Internet to "learn from"—you're just asking for trouble. Not just from trying to get an LLM to produce what you want but also trying to get humans to do it.
  This is why LLMs are so fucking good at writing rust and Python: There's only so many ways to do a thing and the larger community pretty much always uses the same solutions.
  JavaScript? How can it even keep up? You're using yarn today but in a year you'll probably like, "fuuuuck this code is garbage... I need to convert this all to [new thing]."
The difference between reasoning models and normal models is reasoning models are two steps,
That's a garbage definition of "reasoning". Someone who is not a grifter would simply call them two-step models (or similar), instead of promoting misleading anthropomorphic terminology.

So they have worked out that LLMs do what they were programmed to do in the way that they were programmed? Shocking.

While I hate LLMs with passion and my opinion of them boiling down to being glorified search engines and data scrapers, I would ask Apple: how sour are the grapes, eh?

edit: wording

It's not just the memorization of patterns that matters, it's the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that's value - that's the new Google.

While a fair idea there are two issues with that even still - Hallucinations and the cost of running the models.
Unfortunately, it take significant compute resources to perform even simple responses, and these responses can be totally made up, but still made to look completely real. It's gotten much better sure, but blindly trusting these things (Which many people do) can have serious consequences.
- Hallucinations and the cost of running the models.
  So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from "a book" or "the TV" has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.
  The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What's the price of the resulting increased ignorance of the general population due to the high cost of information access?
  What good is a bunch of knowledge stuck behind a search engine when people don't know how to access it, or access it efficiently?
  Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding in being today, but ease of access of information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.
  Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.
  I also worry that "easy access" to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don't know because they're dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead...

Fair, but the same is true of me. I don't actually "reason"; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a "nasty logic error" pattern match at some point in the process, I "know" I've found a "flaw in the argument" or "bug in the design".

But there's no from-first-principles method by which I developed all these patterns; it's just things that have survived the test of time when other patterns have failed me.

I don't think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.

You either an llm, or don't know how your brain works.
- LLMs don't know how how they work
This whole era of AI has certainly pushed the brink to existential crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines who on a basic level do pattern matching drawing from the sum total of individual life experience (aka the dataset).
Higher reasoning is taught to humans. We have the capability. That's why we spend the first quarter of our lives in education. Sometimes not all of us are able.
I'm sure it would certainly make waves if researchers did studies based on whether dumber humans are any different than AI.

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".

The guy selling the car doesn't tell you it runs like a horse, the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, the guys making it are saying it's utility is nearly limitless because Tesla has demonstrated there's no actual penalty for lying to investors.
Then use a different word. "AI" and "reasoning" makes people think of Skynet, which is what the weird tech bros want the lay person to think of. LLMs do not "think", but that's not to say I might not be persuaded of their utility. But thats not the way they are being marketed.
Cars are horses. How do you feel about statement?

It has so much data, it might as well be reasoning. As it helped me with my problem.

What a dumb title. I proved it by asking a series of questions. It’s not AI, stop calling it AI, it’s a dumb af language model. Can you get a ton of help from it, as a tool? Yes! Can it reason? NO! It never could and for the foreseeable future, it will not.

It’s phenomenal at patterns, much much better than us meat peeps. That’s why they’re accurate as hell when it comes to analyzing medical scans.