Toki pona. End of discussion.

Well, maybe, but I’ve been thinking about turning tokens into words and all that fun stuff (disclaimer: I am against any kind of LLM and similar misuses of that technology) and, most importantly, about how directions in this high-dimensional space seem to encode real semantic dimensions.

Like how king + woman - man = king + "vector direction associated with femininity" ≈ queen (highly simplified)
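
To make the arithmetic above concrete, here is a minimal sketch. The vectors are hand-written toy values that I made up for illustration (axes: royalty, femininity, person-ness); real embeddings are learned and have hundreds of dimensions. Following common practice for these analogy tests, the three input words are excluded from the candidates.

```python
import math

# Hypothetical hand-written 3-D vectors: (royalty, femininity, person-ness).
# These numbers are invented for illustration, not taken from any real model.
EMBEDDINGS = {
    "king":  [0.9, 0.1, 1.0],
    "queen": [0.9, 0.9, 1.0],
    "man":   [0.1, 0.1, 1.0],
    "woman": [0.1, 0.9, 1.0],
    "apple": [0.0, 0.5, 0.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(start, minus, plus):
    """Return the word closest to start - minus + plus, ignoring the inputs."""
    target = [
        s - m + p
        for s, m, p in zip(EMBEDDINGS[start], EMBEDDINGS[minus], EMBEDDINGS[plus])
    ]
    candidates = [w for w in EMBEDDINGS if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))

print(analogy("king", "man", "woman"))  # prints: queen
```

With real embeddings the result is only approximately "queen", which is why the nearest neighbour is looked up instead of testing for equality.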

This brought me to the realization that all languages have a measurable (maybe not exact but at least rough estimate) number of semantic dimensions (usually way lower than the number of words in said language).
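
One rough way to picture "number of semantic dimensions" is to count how many linearly independent directions a set of word vectors spans. The sketch below does that with plain Gaussian elimination on made-up toy vectors (all names and numbers are hypothetical). Real embeddings are noisy, so in practice one would more likely use something like PCA with a variance threshold than an exact rank.

```python
def count_dimensions(vectors, tol=1e-9):
    """Count linearly independent directions via Gaussian elimination."""
    rows = [list(v) for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Pick the not-yet-used row with the largest entry in this column.
        pivot = max(
            range(rank, len(rows)),
            key=lambda r: abs(rows[r][col]),
            default=None,
        )
        if pivot is None or abs(rows[pivot][col]) <= tol:
            continue  # no new direction along this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Remove this direction from every remaining row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# Hypothetical toy vocabulary: five words stored in 4-D vectors,
# but built only from a "royalty" axis and a "femininity" axis.
WORDS = {
    "king":    [1.0, 0.0, 0.0, 0.0],
    "queen":   [1.0, 1.0, 0.0, 0.0],
    "woman":   [0.0, 1.0, 0.0, 0.0],
    "duchess": [0.5, 1.0, 0.0, 0.0],
    "prince":  [0.7, 0.0, 0.0, 0.0],
}

print(len(WORDS), "words,", count_dimensions(WORDS.values()), "dimensions")
```

Here five words collapse to two dimensions, which is the "way lower than the number of words" situation described above.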

Which then made me wonder :

-> How few semantic dimensions do you need for a functional conlang ? (I imagine it would be two (binary), but I would be happy to hear your counterpoints)

-> How many words per semantic dimension do you need to get by, and is there a reason why human languages have so much “redundancy” (why not have “word for magnitude + word for semantic direction” ad nauseam ?)
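
The “word for magnitude + word for semantic direction” idea can be sketched as a tiny encoder. Everything here is an assumption for illustration: the axis names, the two magnitude words, the thresholds, and the restriction to values between 0 and 1.

```python
# Hypothetical semantic axes and magnitude words (invented for this sketch).
AXES = ["royal", "feminine", "animate"]
MAGNITUDES = [(0.75, "very"), (0.25, "somewhat")]  # thresholds, highest first

def describe(vector):
    """Spell a vector out as 'magnitude direction' pairs, skipping weak axes."""
    parts = []
    for axis, value in zip(AXES, vector):
        for threshold, word in MAGNITUDES:
            if value >= threshold:
                parts.append(f"{word} {axis}")
                break  # use only the strongest matching magnitude word
    return ", ".join(parts)

print(describe([0.9, 0.1, 1.0]))  # prints: very royal, very animate
print(describe([0.3, 0.8, 0.0]))  # prints: somewhat royal, very feminine
```

This gets by with five words in total, but every "word" takes several utterances to say, which may be one practical reason natural languages tolerate redundancy.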

And last but not least, can you make a language with only 3 semantic dimensions and speak in rgb colours ?
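
For the RGB idea, a minimal sketch of the mapping could look like this. It assumes each of the three semantic dimensions is a value between 0 and 1 (values outside that range are clamped), and the function name is made up.

```python
def to_rgb_hex(vector):
    """Map a 3-D vector with components in [0, 1] to a hex colour string."""
    if len(vector) != 3:
        raise ValueError("expected exactly three semantic dimensions")
    # Clamp each component to [0, 1], then scale to the 0-255 channel range.
    channels = [round(max(0.0, min(1.0, v)) * 255) for v in vector]
    return "#{:02x}{:02x}{:02x}".format(*channels)

print(to_rgb_hex([1.0, 0.0, 0.0]))  # prints: #ff0000
print(to_rgb_hex([0.0, 1.0, 1.0]))  # prints: #00ffff
```

With 8 bits per channel this gives about 16.7 million distinct "words", although telling neighbouring ones apart by eye is another matter.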

TL;DR : how many semantic directions do you need to make a language ?

Per comment request, here are some links if you found this interesting and want to learn more :

About turning words into vectors :

-> This lesson by 3b1b and all of the related ones give a firm grasp of the inner workings of neural networks and LLMs, which can help debunk BS online, and loosely introduce the technology of turning words into vectors (there is a video version if you can’t be bothered to read)

-> This article, which only deals with word embeddings (see section 3 for what I’m specifically talking about)

About conlangs :

-> The official toki pona website

-> The language creation society

-> The wiki page about conlangs, for good measure

  • undrwater@lemmy.world
    16 hours ago

    Man, this sounds fascinating, but I’ve got no clue. Can you edit the post with some links for others like me so we can be educated?

    I could search myself, but I’m currently in a state of “no clue if this research return is what’s being talked about”.

    • polotype@lemmy.ml (OP)
      16 hours ago

      Sure thing ! I’ll try to find some links to resources about how chatbots turn words into vectors and about conlangs.

  • CerebralHawks@lemmy.dbzer0.com
    2 days ago

    I thought you meant “as far as users go”, and I would say Loxian: it’s only heard from one mouth, Enya’s, and it’s only known to be known by Enya and her songwriter, Roma Ryan, who invented the language.

    Based on my limited understanding of what you’re saying — which is not your fault, I think you explained it well enough — I think you’re probably right about Toki pona. A lot of people online know how to read/type it, I think it had a surge about 5-15 years ago. I heard about it but never bothered.

    • polotype@lemmy.ml (OP)
      2 days ago

      Neat, never heard of Loxian. Though it does seem (from very brief research) that there is a lot of redundancy with the six scripts being used…

      • CerebralHawks@lemmy.dbzer0.com
        1 day ago

        Probably. Enya only sings in one of the dialects (scripts?). IIRC four were based on the four elements and honestly I didn’t realise there were six. I just assumed four because four elements.

        I’m not even 100% sure the language is fully developed. You could use AI to write a dictionary based on the five songs Enya has released in Loxian and what Roma Ryan says the lyrics mean in English, but it wouldn’t be enough to be conversational. And we (any of us who aren’t the two of them) have no way of knowing if that’s all the language is or if the two of them can effectively communicate in it.