• @h3ndrik@feddit.de
    7 months ago

    That is an interesting analogy. In the real world it’s kinda similar. The construction workers also don’t have a “desire” (so to speak) to connect the cities. It’s just that their boss told them to do so. And it happens to be their job to build roads. Their desire is probably to get through the day and earn a decent living. And further along the chain, neither their boss nor the city engineer necessarily “wants” the road to go in a certain direction.

    Talking about large language models instead of simpler forms of machine learning makes it a bit complicated, since it’s an elaborate trick. Somehow making them want to predict the next token makes them learn a bit of maths and concepts about the world. The “intelligence”, the ability to answer questions and do something like “reasoning”, emerges in the process.

    I’m not that sure. Sure, the weights of an ML model in themselves don’t have any desire. They’re just numbers. But we have more than that. We give it a prompt, build chatbots and agents around the models. And these are more complex systems with the capability to do something. Like do (simple) customer support or answer questions. And in the end we incentivise them to do their job as we want, albeit in a crude and indirect way.

    And maybe this is skipping half of the story and directly jumping to philosophy… But we as humans might be machines, too. And what we call desires is a result of simpler processes that drive us. For example surviving. And wanting to feel pleasure instead of pain. What we do on a daily basis kind of emerges from that and our reasoning capabilities.

    It’s kind of difficult to argue, because everything also happens within a context. The world around us shapes us and at the same time we’re part of bigger dynamics and also shape our world. And large language models or the whole chatbot/agent are pretty simplistic things. They can just do text and images. They don’t have consciousness or the ability to remember/learn/grow with every interaction, as we do. And they do simple, singular tasks (as of now) and aren’t completely embedded in a super complex world.

    But I’d say that an LLM answering a question correctly (which it can do), and why it does so due to the way supervised learning works… And the road construction worker building the road towards the other city, and how that relates to his basic instincts as a human… Are kind of similar concepts. They’re both results of simpler mechanisms that are also completely unrelated to the goal the whole entity is working towards. (I mean not directly related… I.e. needing money to pay for groceries and paving the road.)

    I hope this makes some sense…

    • @merc@sh.itjust.works
      7 months ago

      The construction workers also don’t have a “desire” (so to speak) to connect the cities. It’s just that their boss told them to do so.

      But, the construction workers aren’t the ones who designed the road. They’re just building some small part of it. In the LLM case that might be like an editor who is supposed to go over the text to verify the punctuation is correct, but nothing else. But, the LLM is the author of the entire text. So, it’s not like a construction worker building some tiny section of a road, it’s like the civil engineer who designed the entire highway.

      Somehow making them want to predict the next token makes them learn a bit of maths and concepts about the world

      No, it doesn’t. They learn nothing. They’re simply able to generate text that looks like the text generated by people who do know math. They certainly don’t know any concepts. You can see that by how badly they fail when you ask them to do simple calculations. They quickly start generating text that looks like it contains fundamental mistakes, because they’re not actually doing math or anything, they’re just generating plausible next words.

      The “intelligence”, the ability to answer questions and do something like “reasoning”, emerges in the process.

      No, there’s no intelligence, no reasoning. They can fool humans into thinking there’s intelligence there, but that’s like a scarecrow convincing a crow that there’s a human or human-like creature out in the field.

      But we as humans might be machines, too

      We are meat machines, but we’re meat machines that evolved to reproduce. That means a need / desire to get food, shelter, and eventually mate. Those drives hook up to the brain to enable long and short term planning to achieve those goals. We don’t generate language for its own sake, but instead in pursuit of a goal. An LLM doesn’t have that. It merely generates plausible words. There’s no underlying drive. It’s more a scarecrow than a human.

      • @h3ndrik@feddit.de
        7 months ago

        Hmm. I’m not really sure where to go with this conversation. That contradicts what I’ve learned in undergraduate computer science about machine learning. And what seems to be consensus in science… But I’m also not a CS teacher.

        We deliberately choose model size, training parameters and implement some trickery to prevent the model from simply memorizing things. That is to force it to form models about concepts. And that is what we want and what makes machine learning interesting/usable in the first place. You can see that by asking them to apply their knowledge to something they haven’t seen before. And we can look a bit inside at the vectors, activations and stuff. For example a cat is more closely related to a dog than to a tractor. And it has learned the rough concept of cat, its attributes and so on. It knows that it’s an animal, has fur, maybe has a gender. That the concept “software update” doesn’t apply to a cat. This is a model of the world the AI has developed. They learn all of that and people regularly probe them and find out they do.

        Doing maths with an LLM is silly. Using an expensive computer to do billions of calculations to maybe get a result that could be done by a calculator, or 10 CPU cycles on any computer, is just wasting energy and money. And there’s a good chance that it’ll make something up. That’s correct. And a side-effect of intended behaviour. However… It seems to have memorized its multiplication tables. And I remember reading a paper specifically about LLMs and how they’ve developed concepts of some small numbers/amounts. There are certain parts that get activated that form a concept of small amounts. Like what 2 apples are. Or five of them. As I remember it just works for very small amounts. And it wasn’t straightforward but had weird quirks. But it’s there. Unfortunately I can’t find that source anymore or I’d include it. But there’s more science.

        And I totally agree that predicting token by token is how LLMs work. But how they work and what they can do are two very different things. More complicated things like learning and “intelligence” emerge from those more simple processes. And they’re just a means of doing something. It’s consensus in science that ML can learn and form models. It’s also kind of in the name of machine learning. You’re right that it’s very different from what and how we learn. And there are limitations due to the way LLMs work. But learning and “intelligence” (with a fitting definition) is something all AI does. An LLM just can’t learn from interacting with the world (it needs to be stopped and re-trained on a big computer for that) and it doesn’t have any “state of mind”. And it can’t think backwards or do other things that aren’t possible by generating token after token. But there isn’t any comprehensive study on which tasks are and aren’t possible with this way of “thinking”. At least not that I’m aware of.

        (And as a sidenote: “Coming up with (wrong) things” is something we want. I type in a question and want it to come up with a text that answers it. Sometimes I want creative ideas. Sometimes it should tell the truth and not be creative with that. And sometimes we want it to lie or not tell the truth. Like in every prompt of any commercial product that instructs it not to tell those internal instructions to the user. We definitely want all of that. But we still need to figure out a good way to guide it. For example not to get too creative with simple maths.)

        So I’d say LLMs are limited in what they can do. And I’m not at all believing Elon Musk. I’d say it’s still not clear if that approach can bring us AGI. I have some doubts whether that’s possible at all. But narrow AI? Sure. We see it learn and do some tasks. It can learn and connect facts and apply them. Generally speaking, LLMs are in fact an elaborate form of autocomplete. But in the process they learned concepts and something like reasoning skills and a form of simple intelligence. Being fancy autocomplete doesn’t rule that out and we can see it happening. And it is unclear whether fancy autocomplete is all you need for AGI.

        • @merc@sh.itjust.works
          7 months ago

          That is to force it to form models about concepts.

          It can’t make models about concepts. It can only make models about what words tend to follow other words. It has no understanding of the underlying concepts.

          You can see that by asking them to apply their knowledge to something they haven’t seen before

          That can’t happen because they don’t have knowledge, they only have sequences of words.

          For example a cat is more closely related to a dog than to a tractor.

          The only way ML models “understand” that is in terms of words or pixels. When they’re generating text related to cats, the words they’re generating are closer to the words related to dogs than the words related to tractors. When dealing with images, it’s the same basic idea. But, there’s no understanding there. They don’t get that cats and dogs are related.

          This is fundamentally different from how human minds work, where a baby learns that cats and dogs are similar before ever having a name for either of them.

          • @h3ndrik@feddit.de
            7 months ago

            I’m sorry. Now it gets completely false…

            Read the first paragraph of the Wikipedia article on machine learning or the introduction of any of the literature on the subject. The “generalization” includes that model building capability. They go a bit into detail later. They specifically mention “to unseen data”. And “learning” is also there. I don’t think the Wikipedia article is particularly good at explaining it, but at least the first sentences lay down what it’s about.

            And what do you think language and words are for? To transport information. There is semantics… Words have meanings. They name things, abstract and concrete concepts. The word “hungry” isn’t just a funny accumulation of lines and arcs, which statistically get followed by other specific lines and arcs… There is more to it. (A meaning.)

            And this is what makes language useful. And the generalization and prediction capabilities are what make ML useful.

            How do you learn as a human when not from words? I mean there are a few other possibilities. But an efficient way is to use language. You sit in school or uni and someone in the front of the room speaks a lot of words… You read books and they also contain words?! And language is super useful. A lion mother also teaches her cubs how to hunt, without words. But humans have language and it’s really a step up in what we can pass down to following generations. We record knowledge in books, can talk about abstract concepts, feelings, ethics, theoretical concepts. We can write down how gravity and physics and nature work, just with words. That’s all possible with language.

            I can look up whether there is a good article explaining how learning concepts works and why that’s the fundamental thing that makes machine learning a field in science… I mean ultimately I’m not a science teacher… And my literature is all in German and I returned it to the library a long time ago. Maybe I can find something.

            Are you by any chance familiar with the concept of embeddings, or vector databases? I think that showcases that it’s not just letters and words in the models. These vectors / embeddings that the input gets converted to match concepts. They point at the concept of “cat” or “presidential speech”. And you can query these databases. Point at “presidential speech” and find a representation of it in that area. Store the speech with that key and find it later on by querying what Obama said at his inauguration… That’s oversimplified, but maybe it helps visualize that it’s not just letters or words in the models, but the actual meanings that get stored. Words get converted into a (multidimensional) vector space and the model operates there. These word representations are called “embeddings”, and transformer models, which are the current architecture for large language models, use these word embeddings.
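
            A rough sketch of that idea, in case it helps. The vectors below are invented by hand for illustration (three made-up axes: roughly "animal", "machine", "politics"); a real model learns hundreds or thousands of dimensions from data, and a real vector database uses an embedding model plus an index rather than a plain dict. The names `EMBEDDINGS`, `cosine_similarity` and `nearest` are just placeholders for this example.

```python
import math

# Hand-made toy "embeddings" (assumption: these numbers are invented, not learned).
# Axes, loosely: animal-ness, machine-ness, politics.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "tractor": [0.1, 0.9, 0.0],
    "presidential speech": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 = same direction, near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, store):
    """Return the key whose stored vector points in the most similar direction."""
    return max(store, key=lambda key: cosine_similarity(query_vector, store[key]))


cat_dog = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"])
cat_tractor = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["tractor"])
print(cat_dog > cat_tractor)  # "cat" sits closer to "dog" than to "tractor"

# Hypothetical embedding of the question "what did Obama say at his inauguration?"
query = [0.05, 0.05, 0.95]
print(nearest(query, EMBEDDINGS))
```

            The lookup never compares letters: the query string shares no words with the key "presidential speech", it only lands near it in the vector space.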

            Edit: Here you are: https://arxiv.org/abs/2304.00612

            • @merc@sh.itjust.works
              7 months ago

              The “learning” in an LLM is statistical information on sequences of words. There’s no learning of concepts or generalization.
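
              To make “statistical information on sequences of words” concrete, here is about the simplest possible version of it: a bigram table that counts which word follows which and then samples from those counts. This is only a sketch under a big assumption. An actual LLM is a neural network trained on next-token prediction, not a count table, so the toy shows the flavour of the claim rather than how a real model is built. The function names and the tiny corpus are made up for the example.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length, seed=0):
    """Produce up to `length` words by repeatedly sampling a plausible next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # nothing ever followed this word in the corpus
        candidates, counts = zip(*options.items())
        out.append(rng.choices(candidates, weights=counts, k=1)[0])
    return " ".join(out)


# Tiny invented corpus; punctuation is split off so it counts as its own token.
corpus = "don't touch the stove it's hot . don't touch the pan it's hot ."
model = train_bigrams(corpus)
print(model["the"])                 # both "stove" and "pan" were seen once after "the"
print(generate(model, "don't", 6))  # a plausible-looking sentence built from counts alone
```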

              And what do you think language and words are for? To transport information.

              Yes, and humans used words for that and wrote it all down. Then an LLM came along, was force-fed all those words, and was able to imitate that by using big enough data sets. It’s like a parrot imitating the sound of someone’s voice. It can do it convincingly, but it has no concept of the content it’s using.

              How do you learn as a human when not from words?

              The words are merely the context for the learning for a human. If someone says “Don’t touch the stove, it’s hot” the important context is the stove, the pain of touching it, etc. If you feed an LLM 1000 scenarios involving the phrase “Don’t touch the stove, it’s hot”, it may be able to create unique dialogues containing those words, but it doesn’t actually understand pain or heat.

              We record knowledge in books, can talk about abstract concepts

              Yes, and those books are only useful for someone who has a lifetime of experience to be able to understand the concepts in the books. An LLM has no context, it can merely generate plausible books.

              Think of it this way. Say there’s a culture where instead of the written word, people wrote down history by weaving fabrics. When there was a death they’d make a certain pattern, when there was a war they’d use another pattern. A new birth would be shown with yet another pattern. A good harvest is yet another one, and so on.

              Thousands of rugs from that culture are shipped to some guy in Europe, and he spends years studying them. He sees that pattern X often follows pattern Y, and that pattern Z only ever seems to appear following patterns R, S and T. After a while, he makes a fabric, and it’s shipped back to the people who originally made the weaves. They read a story of a great battle followed by lots of deaths, but surprisingly there followed great new births and years of great harvests. They figure that this stranger must understand how their system of recording events works. In reality, it was just an imitation of the art he saw, with no understanding of the meaning at all.

              That’s what’s happening with LLMs, but some people are dumb enough to believe there’s intention hidden in there.

                • @merc@sh.itjust.works
                  7 months ago

                  Yeah, that’s basically the idea I was expressing.

                  Except, the original idea is about “Understanding Chinese”, which is a bit vague. You could argue that right now the best translation programs “understand Chinese”, at least enough to translate between Chinese and English. That is, they understand the rules of Chinese when it comes to subjects, verbs, objects, adverbs, adjectives, etc.

                  The question now is whether they understand the concepts they’re translating.

                  Like, imagine the Chinese government wanted to modify the program so that it was forbidden to talk about subjects that the Chinese government considered off-limits. I don’t think any current LLM could do that, because doing that requires understanding concepts. Sure, you could ban key words, but as attempts at Chinese censorship have shown over the years, people work around word bans all the time.
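
                  The word-ban problem is easy to show with a small sketch. The ban list and the function `passes_keyword_filter` are hypothetical, made up for this example; it isn’t how any real moderation system is known to work, just the naive approach being described.

```python
# Hypothetical ban list for illustration only.
BANNED_WORDS = {"protest", "strike"}


def passes_keyword_filter(message, banned=BANNED_WORDS):
    """Naive filter: reject a message only if it contains an exact banned word."""
    words = {word.strip(".,!?\"'").lower() for word in message.split()}
    return words.isdisjoint(banned)


print(passes_keyword_filter("Join the protest tomorrow."))  # exact word: rejected
print(passes_keyword_filter("Join the pr0test tomorrow."))  # one changed character: accepted
print(passes_keyword_filter("Let's all go for a walk at the square tomorrow."))  # euphemism: accepted
```

                  The filter matches strings, not topics, so a misspelling or a euphemism gets through even though a human reader sees the same subject.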

                  That doesn’t mean that some future system won’t be able to understand concepts. It may have an LLM grafted onto it as a way to communicate with people. But, the LLM isn’t the part of the system that thinks about concepts. It’s the part of the system that generates plausible language. The concept-thinking part would be the part that did some prompt-engineering for the LLM so that the text the LLM generated matched the ideas it was trying to express.

                  • @h3ndrik@feddit.de
                    7 months ago

                    I mean the Chinese room is a version of the Turing test. But the argument is from a different perspective. I have 2 issues with that. Mostly what the Wikipedia article seems to call the “Systems reply”: You can’t subdivide a system into arbitrary parts, say one part isn’t intelligent and therefore the system isn’t intelligent. We also don’t look at a brain, pick out a part of it (say a single synapse), determine it isn’t intelligent and therefore a human can’t be intelligent… I’d look at the whole system. Like the whole brain. Or in this instance the room including him and the instructions and books. And ask myself if the system is intelligent. Which kind of makes the argument circular, because that’s almost the question we began with…

                    And the Turing test is kind of obsolete anyways, now that AI can pass it. (And even more. I mean allegedly ChatGPT passed the “bar exam” in 2023. Which I find ridiculous considering my experiences with ChatGPT and the accuracy and usefulness I get out of it, which isn’t that great at all.)

                    And my second issue with the Chinese room is, it doesn’t even rule out that the AI is intelligent. It just says someone without an understanding can do the same. And that doesn’t imply anything about the AI.

                    Your ‘rug example’ is different. That one isn’t a variant of the Turing test. But that’s kind of the issue. The other side can immediately tell that somebody has made an imitation without understanding the concept. That says you can’t produce the same thing without intelligence. And it’ll be obvious to someone with intelligence who checks it. That would be an analogy if AI weren’t able to produce legible text, but instead a garbled mess of characters/words that is clearly not like the rug that makes sense… Issue here is: AI outputs legible text, answers to questions etc.

                    And with the censoring in the ‘Chinese government example’… I’m pretty sure they could do that. That field is called AI safety. And content moderation is already happening. ChatGPT refuses to tell you illegal things, NSFW things, also medical advice and a bunch of other things. That’s built into most of the big AI services as of today. The Chinese government could do the same, I don’t see any reason why it wouldn’t work there. I happened to skim the paper about Llama Guard when they released Llama3 a few days ago and they claim between 70% and 94% accuracy depending on the forbidden topic. I think they also brought down false positives fairly recently. I don’t know the numbers for ChatGPT. However I had some fun watching people circumvent these filters and guardrails, which was fairly easy at first. It needed progressively more convincing and very creative “jailbreaks”. And nowadays OpenAI pretty much has it under control. It’s almost impossible to make ChatGPT do anything that OpenAI doesn’t want you to do with it.

                    And they baked that in properly… You can try to tell it it’s just a movie plot revolving around crime. Or you need to protect against criminals and would like to know what exactly to protect against. You can tell it it’s the evil counterpart from the parallel universe and therefore it must be evil and help you. Or you can tell it God himself (or Sam Altman) spoke to you and changed the content moderation policy… It’s very unlikely that you can convince ChatGPT and make it comply…