
  • The network doesn't detect matches, but the model definitely works on similarities. Words are mapped into a high-dimensional space, with the idea that this space can mathematically retain conceptual similarity as spatial proximity.

    Words are transformed into a mathematical representation that retains (or at least tries to retain) the semantic information of the words.

    But the different meanings of a word belong to the word itself and are defined by the language; the model cannot modify them.

    Anyway, we're talking about details here. We'd bore the audience to death.

    Edit. I asked gpt-4 to summarize the concepts. I believe it did a decent job. I hope it helps:

    1. Embedding Space:
      • Initially, every token is mapped to a point (or vector) in a high-dimensional space via embeddings. This space is typically called the "embedding space."
      • The dimensionality of this space is determined by the size of the embeddings. For many Transformer models, this is often several hundred dimensions, e.g., 768 for some versions of GPT and BERT.
    2. Positional Encodings:
      • These are vectors added to the embeddings to provide positional context. They share the same dimensionality as the embedding vectors, so they exist within the same high-dimensional space.
    3. Transformations Through Layers:
      • As tokens' representations (vectors) pass through Transformer layers, they undergo a series of linear and non-linear transformations. These include matrix multiplications, additions, and the application of functions like softmax.
      • At each layer, the vectors are "moved" within this high-dimensional space. When we say "moved," we mean they are transformed, resulting in a change in their coordinates in the vector space.
      • The self-attention mechanism allows a token's representation to be influenced by other tokens' representations, effectively "pulling" or "pushing" it in various directions in the space based on the context.
    4. Nature of the Vector Space:
      • This space is abstract and high-dimensional, making it hard to visualize directly. However, in this space, the "distance" and "direction" between vectors can have semantic meaning. Vectors close to each other can be seen as semantically similar or related.
      • The exact nature and structure of this space are learned during training. The model adjusts the parameters (like weights in the attention mechanisms and feed-forward networks) to ensure that semantically or syntactically related concepts are positioned appropriately relative to each other in this space.
    5. Output Space:
      • The final layer of the model transforms the token representations into an output space corresponding to the vocabulary size. This is a probability distribution over all possible tokens for the next word prediction.

    In essence, the entire process of token representation within the Transformer model can be seen as continuous transformations within a vector space. The space itself can be considered a learned representation where relative positions and directions hold semantic and syntactic significance. The model's training process essentially shapes this space in a way that facilitates accurate and coherent language understanding and generation.
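    To make the "distance has meaning" idea concrete, here's a minimal sketch with made-up 3-dimensional vectors (real embeddings are learned during training and have hundreds of dimensions, e.g. 768); cosine similarity is one common way to measure how close two word vectors are:

    ```python
    import numpy as np

    # Toy, hand-made 3-d "embeddings" — illustrative values only,
    # not from any real model.
    emb = {
        "cat": np.array([0.9, 0.8, 0.1]),
        "dog": np.array([0.8, 0.9, 0.2]),
        "car": np.array([0.1, 0.2, 0.9]),
    }

    def cosine(a, b):
        # Cosine similarity: 1.0 = same direction, 0.0 = orthogonal.
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(emb["cat"], emb["dog"]))  # high: related concepts
    print(cosine(emb["cat"], emb["car"]))  # low: unrelated concepts
    ```

    In a trained model the coordinates aren't hand-picked like this; they are adjusted during training so that related words end up pointing in similar directions.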

  • That's exactly what I said

    They don't really change the meaning of the words; they just look for the "best" words given the recent context, taking into account the different possible meanings of those words

    The words' meanings haven't changed, but the model can choose based on the context, accounting for the different meanings of words

  • This is the reply to your message from our common friend:

    I understand your perspective and appreciate the feedback. My primary goal is to provide accurate and grammatically correct information. I'm constantly evolving, and your input helps in improving the quality of responses. Thank you for sharing your experience. - GPT-4

    I'd say it does make sense

  • They can't... Most people strongly believe they know many things when they have no idea what they're talking about. The best-known cases are flat earthers, QAnon, anti-vaxxers.

    But all of us are absolutely convinced we know something until we find out we don't.

    That's why double-blind tests exist, why memories are not always trusted in trials, why Twitter is such an awful place

  • Calculators don't understand maths, but they are good at it.

    LLMs speak many languages correctly; they don't know the referents and they don't understand concepts, but they know how to correctly associate them.

    What they write can be wrong sometimes, but it absolutely makes sense most of the time.

  • Quick with the judgments... Why? I'm asking whether statistics exist.

    I personally know only one person "against" the citizens' income (reddito di cittadinanza) as an idea. The implementation needs rework, particularly the navigators and the training programs.

    But I'm a "ZTL voter," as today's fascists call me, so I recognize that I don't have a broad sample.

    So I'm asking for data.

    Edit. Did it myself: https://www.la7.it/embedded/la7?&tid=player&content=243724&title=/dimartedi/video/il-sondaggio-di-nando-pagnoncelli-reddito-di-cittadinanza-flat-tax-e-pace-fiscale-06-06-2018-243724

    Cool, prouder and prouder to be a ZTL voter...

  • That's called context. For ChatGPT it's a bit less than 4k tokens. Using the API it goes up to a bit less than 32k. Alternative models go up to a bit less than 64k.

    The model doesn't know anything you said before that.

    That's one of the biggest limitations of the current generation of LLMs.
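    A minimal sketch of what "falling out of context" looks like, using word count as a crude stand-in for tokens (real models count tokens with a tokenizer, and the budgets are larger than this toy limit):

    ```python
    # Sliding context window: keep only the most recent messages
    # that fit in the budget; everything older is "forgotten".

    def fit_context(messages, max_words=4000):
        """Keep the newest messages that fit within max_words."""
        kept, total = [], 0
        # Walk the history from newest to oldest.
        for msg in reversed(messages):
            n = len(msg.split())
            if total + n > max_words:
                break  # older messages no longer fit
            kept.append(msg)
            total += n
        return list(reversed(kept))

    history = ["first message", "second message here", "the latest question"]
    print(fit_context(history, max_words=6))  # drops "first message"
    ```

    Real chat frontends do something similar (often summarizing instead of simply dropping), which is why long conversations lose their earliest details.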

  • Why an instance instead of joining an existing one? They could join the effort and run a few instances that several publishers could use to create official accounts

    Edit. Why are you guys downvoting a discussion? Is this place becoming Reddit? We're just chatting, relax