Ask HN: Would GPT-n produce a better model if trained on a structured language?
2 points by dougSF70 on April 11, 2023

Chatting with my son yesterday - who has learned some disparate languages (Spanish, Japanese, Arabic) - he spoke about the fact that in Japanese there are (at least) two verbs "to touch": one you use to talk about physically touching something, and another you use if something touched you 'emotionally', e.g. they really touched my heart with that poem. In English, we obviously can use many different verbs, but often we use plain ol' "touch" and rely on the context of the sentence to provide more precise meaning. What LLMs are doing is trying to infer context from the other words used in the same sentence, and they are clearly doing a good job of that given their responses. My question is: would the model perform better if it had to do less inference from the sentence because the language analyzed was more prescriptive in its vocabulary, syntax, and grammar? Conversely, does the model tend to work precisely because the English language relies more on contextual meaning than prescriptive grammar?

https://www.merriam-webster.com/thesaurus/touch
20 years ago it seemed to me there was very little NLP literature on languages other than English. Today I see papers on arXiv every day where people train an LLM for some "minor" language or run experiments with multilingual models, so your question is very much an active research area.
https://arxiv.org/search/?query=multilingual&searchtype=all&...