ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Does the author think ChatGPT is in fact an AGI? It's a chatbot. Why would it be good at chess? It's like saying an Atari 2600 running a dedicated chess program can beat Google Maps at chess.
AI, including ChatGPT, is being marketed as super awesome at everything, which is why it and similar AI are being forced into absolutely everything and sold as a replacement for people.
Something marketed as AGI should be held to AGI standards when demonstrating that it isn't AGI.
Not to help the AI companies, but why don't they program them to call out to math software and outsource chess to dedicated engines when they're asked for that stuff? It's obvious they're shit at it, so why do they answer anyway? It's because they're programmed by know-it-all programmers, isn't it.
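For what it's worth, that kind of handoff is basically what "tool calling" is. Here's a minimal sketch of the idea, assuming the python-chess library and a Stockfish binary on your PATH; this is just an illustration, not how OpenAI actually wires anything up:

```python
# Minimal sketch: route chess questions to a real engine instead of the LLM.
# Assumes `pip install chess` and a Stockfish binary available on PATH.
import chess
import chess.engine

def best_move(fen: str, think_time: float = 0.5) -> str:
    """Ask Stockfish, not the language model, for the best move."""
    board = chess.Board(fen)
    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        result = engine.play(board, chess.engine.Limit(time=think_time))
        return board.san(result.move)

# A hypothetical router: the chatbot's only job is deciding which tool to call.
def answer(user_message: str, fen: str | None = None) -> str:
    if fen is not None:  # a chess position was supplied -> delegate to the engine
        return f"Engine suggests: {best_move(fen)}"
    return "I'm a language model; I'd rather not guess at chess."

if __name__ == "__main__":
    # Starting position; Stockfish will pick a standard opening move.
    print(answer("play chess", fen=chess.Board().fen()))
```

The chatbot never has to "know" chess at all; the engine does the actual playing, which is exactly the division of labor the Atari comparison implies.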
I don't think AI is being marketed as awesome at everything. It's got obvious flaws. Right now it's not good for stuff like chess, probably not even tic-tac-toe. It's a language model; it's hard for it to calculate the playing field. But AI is in development, and it might not need much to start playing chess.
Most people do. It's just called AI in the media everywhere and marketing works. I think online folks forget that something as simple as getting a Lemmy account by yourself puts you into the top quintile of tech literacy.
Yet even on Lemmy people can't seem to make sense of these terms and are saying things like "LLMs are not AI".
Google Maps doesn't pretend to be good at chess. ChatGPT does.
A toddler can pretend to be good at chess, but anybody with reasonable expectations knows they are not.
Well, so much hype has been generated around ChatGPT being close to AGI that it now makes sense to ask questions like "can ChatGPT prove the Riemann hypothesis?"
I agree with your general statement, but in theory, since all ChatGPT does is regurgitate information and a lot of chess is memorization of historical games and opening types, it might actually perform well. No, it can't think, but it can remember everything, so at some point that might tip the results in its favor.
Regurgitating an impression of something, not regurgitating it verbatim: that's the problem here.
Chess is 100% deterministic, so an approximate impression of it falls flat.
I mean, it may be possible, but the complexity would be many orders of magnitude greater. It'd be like learning chess by memorizing all the moves great players made, but without any context or understanding of the underlying strategy.
I think that's generally the point: most people think ChatGPT is this sentient thing that knows everything, and… no.
Do they, though? No one I've talked to thinks it's sentient: not my coworkers who use it for work, not my friends, not my 72-year-old mother.
In all fairness, machine learning in chess engines is actually pretty strong.
https://www.chess.com/terms/alphazero-chess-engine
Oh, absolutely you can apply machine learning to game strategy. But you can't expect a generalized chatbot to do well at strategic decision-making for a specific game.
Sure, but machine learning like that is very different from how LLMs are trained and what they output.
Articles like this are good because they expose the flaws in the AI and show that it can't be trusted with complex multi-step tasks.
They help people who think AI is close to human-level see that it's not, and that it's missing critical functionality.
The problem, though, is that this perpetuates the idea that ChatGPT is actually an AI.
I like referring to LLMs as VI (Virtual Intelligence from Mass Effect), since they merely give the impression of intelligence but are little more than search engines. In the end, all they're doing is displaying expected results based on a popularity algorithm. However, they do this inconsistently due to bad data going in and limited caching.
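If you want to see what "expected results based on a popularity algorithm" means in its most stripped-down form, here's a toy bigram model that always emits the most popular next word. Real LLMs are enormously more sophisticated, but this is a sketch of the same underlying principle, not anything any vendor actually ships:

```python
# Toy sketch: a bigram "language model" that always picks the most
# popular next word. The point: it emits a likely continuation,
# not a verified fact.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word most often follows each word.
following: defaultdict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(word: str, length: int = 5) -> str:
    out = [word]
    for _ in range(length):
        options = following.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])  # the "popular" choice
    return " ".join(out)

print(complete("the"))  # e.g. "the cat sat on the cat"
```

Note the output is fluent-looking but happily loops into nonsense; nothing in the mechanism checks whether the continuation is true, which is the same failure mode scaled way down.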
I mean, OpenAI seems to forget it isn't.