• Soyweiser@awful.systems
    15 days ago

    Talked to somebody who is really into chatbot roleplay (of the ‘longer-term stories with new fantasy characters’ type), and he mentioned that he needs to move his characters’ stories and archetypes to a different model every now and then as a sort of refresh, as the models tend to eventually converge into certain stuck patterns. The first clue of this seems to be that the replies start to settle into a similar pattern of text organization. Sorry if this is vague, as it is second-hand, but the main point is that text-based LLMs probably also do this.

        • corbin@awful.systems
          11 days ago

          Nah, it’s more to do with stationary distributions. Most tokens tend to move towards the stationary distribution; only very surprising tokens can move away. (Insert physics metaphor here.) Most LLM architectures are Markovian, so once they get near that distribution they cannot escape on their own. There can easily be hundreds of thousands of orbits near the stationary distribution, each fixated on a simple token sequence and unable to deviate. Moreover, since most LLM architectures have some sort of meta-learning (e.g. attention), they can simulate situations where part of a simulation gets stuck while the rest of it continues, e.g. only one chat participant is stationary while the others are not.
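
          The stuck-pattern behavior can be sketched with a toy Markov chain. This is only an illustration under made-up numbers, not anything measured from a real model: three "token" states, where state 2 is a near-absorbing loop that is entered easily and escaped only with tiny probability. Power-iterating the (hypothetical) transition matrix `P` shows the stationary distribution piling almost all its mass onto the stuck state.

          ```python
          # Toy Markov chain over three "token" states. The matrix P is invented
          # for illustration; a real LLM's state space is astronomically larger.
          # State 2 models a repetitive pattern: cheap to enter, rarely escaped.
          P = [
              [0.50, 0.40, 0.10],
              [0.30, 0.50, 0.20],
              [0.01, 0.01, 0.98],  # the near-absorbing "stuck" loop
          ]

          def step(pi, P):
              """One step of the chain: row vector pi times transition matrix P."""
              n = len(P)
              return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

          def stationary(P, iters=10_000):
              """Power-iterate from the uniform distribution toward stationarity."""
              n = len(P)
              pi = [1.0 / n] * n
              for _ in range(iters):
                  pi = step(pi, P)
              return pi

          pi = stationary(P)
          print(pi)  # roughly [0.054, 0.061, 0.884]: most mass on the stuck state
          ```

          Once the chain wanders near state 2 it almost never leaves, which is the Markov-chain version of a roleplay converging onto the same reply shape; "refreshing" on a different model amounts to swapping in a different transition matrix.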