Posts: 2 · Comments: 315 · Joined: 2 yr. ago

  • Yeah, when you use Gemini, it seems like sometimes it’ll just answer based on its training and sometimes it’ll cite a source after a search, but you can’t control which. It’s not like Bing, which always summarizes and links to where it got the information.

    I also think Gemini probably uses some sort of knowledge graph under the hood, because it sometimes has very up-to-date information.

  • I don’t even think it’s correct to say it’s querying anything, in the sense of a database. An LLM predicts the next token with no regard for the truth (there’s no sense of factual truth during training to penalize it, since that’s a very hard thing to measure).

    Keep in mind that the same characteristic that allows it to learn the language also allows it to sort of come up with facts. It’s just sampling from a statistical distribution conditioned on the whole context, which needs a bit of randomness so it can be “creative.” So coming up with facts isn’t something LLMs were designed to do; it’s just something we noticed happens as they learn the language.

    So it learned from a specific dataset, but whether it learns any given piece of information depends on how well represented that information is in the dataset. Information that appears repeatedly on the web is quite easy for it to answer, as it was reinforced during training. Information that doesn’t show up much is just not gonna be learned consistently.[1]

    [1] https://youtu.be/dDUC-LqVrPU
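To make the point about statistical distributions and representation in the dataset concrete, here is a minimal toy sketch. It is not how a real LLM is trained: it just counts which word follows a context in a tiny made-up corpus, turns the counts into probabilities, and samples from them. The corpus, the function names, and the temperature values are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

# Toy "training data" (made up): one statement repeated often, another seen once.
corpus = (
    ["the capital of france is paris"] * 9
    + ["the capital of france is lyon"]  # rare and wrong, but still in the data
)

def next_token_counts(corpus, context):
    """Count which token follows `context` anywhere in the corpus."""
    counts = Counter()
    ctx = context.split()
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(len(tokens) - len(ctx)):
            if tokens[i:i + len(ctx)] == ctx:
                counts[tokens[i + len(ctx)]] += 1
    return counts

def distribution(counts, temperature=1.0):
    """Turn counts into probabilities; higher temperature flattens them."""
    logits = {tok: math.log(c) / temperature for tok, c in counts.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {tok: math.exp(v) / z for tok, v in logits.items()}

def sample(probs, rng):
    """Draw one token at random according to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

counts = next_token_counts(corpus, "france is")
probs = distribution(counts, temperature=1.0)
rng = random.Random(0)
print(probs)
print([sample(probs, rng) for _ in range(10)])
```

Nothing in this sketch checks whether "paris" or "lyon" is true. The well-represented continuation simply gets most of the probability mass, and the rare one still gets sampled now and then, more often as the temperature goes up.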

  • I don’t think this will affect the StackOverflow website though? The blog implies that ChatGPT will use the StackOverflow API as a knowledge source (and probably pay for it).

    OpenAI and Stack Overflow are coming together via OverflowAPI access to provide OpenAI users and customers with the accurate and vetted data foundation that AI tools need to quickly find a solution to a problem […]. OpenAI will also surface validated technical knowledge from Stack Overflow directly into ChatGPT, giving users easy access to trusted, attributed, accurate, and highly technical knowledge and code backed by the millions of developers that have contributed to the Stack Overflow platform for 15 years.

    This seems to be exactly to prevent hallucinations when there’s a good vetted answer already.

    Either people didn’t read the blog, or there’s something I’m missing?
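As a rough illustration of what "use a vetted answer when one exists" could look like, here is a toy sketch. It does not call any real Stack Overflow or OpenAI API, and nothing here reflects how the actual integration works: the `VETTED` dictionary, the `example.com` URL, and the `fallback_model` callable are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VettedAnswer:
    text: str
    source_url: str

# Hypothetical stand-in for a vetted knowledge source (not a real API).
VETTED = {
    "how do i reverse a list in python": VettedAnswer(
        text="Use reversed(xs) or xs[::-1].",
        source_url="https://example.com/questions/1",  # placeholder URL
    ),
}

def normalize(question: str) -> str:
    """Lowercase, trim end punctuation, and collapse whitespace."""
    return " ".join(question.lower().strip(" ?!.").split())

def answer(question: str, fallback_model: Callable[[str], str]) -> str:
    """Prefer an attributed vetted answer; otherwise fall back to generation."""
    hit = VETTED.get(normalize(question))
    if hit is not None:
        return f"{hit.text} (source: {hit.source_url})"
    return fallback_model(question) + " (generated, unverified)"

def toy_model(question: str) -> str:
    """Stand-in for a language model; it just produces some text."""
    return "Here is my best guess."

print(answer("How do I reverse a list in Python?", toy_model))
print(answer("Why is my build failing?", toy_model))
```

The design choice is only that the lookup comes first and the answer carries its source, so the model's guess is used just when nothing vetted matches.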

  • Yeah, maybe SO should have this kind of warning when you’re writing your problem or question, or maybe it does already (it’s been a long time since I posted a question myself).

    In any case, it’s an interesting example of a tricky social problem to solve. I used to listen to the SO podcast many years ago, and they always had multiple problems to deal with. One of them was showing the experts good questions, because beginner questions really turn off experienced people, and too many of them would drive the experts off the website. At the same time, beginners don’t have the habit of searching for duplicates etc., so it’s common for them to spam the website with duplicates.

    At some point they also restricted questions about opinions, because they lead to never-ending threads with no objective answers. I’m sure they had a reason for that based on SO history, but the baggage of restrictions keeps growing, which makes it harder for newcomers to understand the rules. It’s tricky to balance the needs of power users and casual users because they’re often conflicting.

  • I don’t think it’s awkward, it’s kinda necessary.

    Because the people who are answering questions there are doing it for that ideal of having a knowledge repository. No one is helping you because they think you and your specific problem are important enough to demand their time. Especially with very tricky errors.

  • Ads have almost always been part and parcel of the YouTube experience. However, there's a point at which ads become so frequent, so irrelevant, and so relentless that they start hurting the user experience. We've been past that point for a while now.

    Ironically, without an ad blocker it’s hard to read the Android Police blog. I invite anyone to try.

  • Well, at least in Brazil, Meta has been very okay with following judicial orders and preventing anything that could get them in trouble. Telegram is the one recently covering up Nazi groups and school shooters here.

  • I think most people don’t realize how little money online ads make. Companies resort to them because people won’t pay for every little thing they use, but it’s not a lucrative endeavor. That’s why so many newspapers are shutting down and companies are getting desperate now that the VC money has run out.