Hacker News

The Chomsky hierarchy is beautiful, and a fundamental result in formal language theory, automata, and complexity, with far-reaching implications for computer science (from compiler design to computability).

What I learned only in a theoretical/formal linguistics class is that there exist other hierarchies of languages and associated machines (e.g. the A, B, C1, C2, C3 languages), entirely orthogonal to the Chomsky hierarchy's classes, i.e. the recursively enumerable, context-sensitive, context-free, and regular languages.
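
To make the Chomsky classes concrete, here's a minimal sketch (Python is my choice, not anything from the thread): the language a^n b^n is the textbook example of a context-free language that is not regular, and a single counter, standing in for a pushdown automaton's stack, suffices to recognize it, where no finite automaton can.

```python
def is_anbn(s: str) -> bool:
    """Recognize { a^n b^n : n >= 0 }, a canonical context-free,
    non-regular language. The counter plays the role of the
    pushdown automaton's stack."""
    count = 0
    i = 0
    # Consume the leading run of a's, pushing onto the "stack".
    while i < len(s) and s[i] == "a":
        count += 1
        i += 1
    # Consume b's, popping; any stray character stops the loop.
    while i < len(s) and s[i] == "b":
        count -= 1
        i += 1
    # Accept only if the whole string was consumed and the stack is empty.
    return i == len(s) and count == 0
```

By the pumping lemma for regular languages, no finite automaton can do this, since it would need unboundedly many states to remember n.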

The form rules may take and how they get applied induce alternative universes of (hierarchies of) formal languages and their automata. Neural networks have come a long way from the Perceptron's inability to compute XOR to the OP's paper. It would be interesting to push that work further so as to include alternative hierarchies.
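
On the Perceptron point: a quick self-contained check (the brute-force grid search is my illustrative device, not how Minsky and Papert proved it) that a single linear threshold unit can separate AND but cannot separate XOR, because XOR is not linearly separable.

```python
from itertools import product

def fits(weights, table):
    """Does the unit step(w1*x1 + w2*x2 + b) reproduce the truth table?"""
    w1, w2, b = weights
    return all((w1 * x1 + w2 * x2 + b > 0) == y
               for (x1, x2), y in table.items())

AND = {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
XOR = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): False}

# Weights and bias drawn from [-4, 4] in steps of 0.5.
grid = [x / 2 for x in range(-8, 9)]

def find_unit(table):
    """Brute-force a separating (w1, w2, b), or None if none exists on the grid."""
    for w in product(grid, repeat=3):
        if fits(w, table):
            return w
    return None
```

`find_unit(AND)` succeeds (e.g. w1 = w2 = 1, b = -1.5), while `find_unit(XOR)` returns None; and not just on this grid, since no linear boundary at all separates XOR's positive corners from its negative ones. A single hidden layer fixes this, which is the "long way" the comment alludes to.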



This is a bit of a layman's outsider perspective, so I'd welcome having my perceptions corrected by people who are more up on the way formal grammars are employed in the ML community.

To my mind, though, the Chomsky-ish ML focus on 'grammars' always seemed to me to miss some of the power of language. Like, the idea that there is something profound in the fact that "Colorless green ideas sleep furiously" is a grammatical fragment that is semantically meaningless seems to me to miss the fact that it's still a meaningful utterance in a language, even if what it denotes is nonsensical. As is proven by the fact that Chomsky used it in his work on the subject! Its meaninglessness is meaningful - you can use it to illustrate a point. The same is true of ungrammatical sentences, too. Not only do actual humans make ungrammatical utterances all the time, but for example if you are trying to teach someone a language, you might show them an ungrammatical sentence by way of an example of what not to do. So any language that is capable of talking about its own grammar has to admit sentences that are ungrammatical in that language, precisely so you can talk about them!

Human language understanding isn't 'parsing': we don't reject utterances because we can't lex the tokens, construct an unambiguous syntax tree, or extract semantic meaning. Chomsky-ish ML researchers seem focused on the problem of making systems that produce grammatical utterances. But I'm honestly more impressed with, for example, GPT's ability to handle and produce the ungrammatical and semantically nonsensical using the same system that it uses to work with grammatical and sensical input and output. That seems much more human-like to me.

That a particular neural architecture is incapable of 'rejecting' certain ungrammatical structures doesn't feel like it matters, so long as the architecture is capable of recognizing a fuzzier, analog degree of 'grammaticalness' of an utterance.


>is that there exist other hierarchies of languages and associated machines

Please explain.


I believe parent is referring to the LA-grammar hierarchy[1] of formal languages.

1. https://www.lagrammar.net/monographs/1999/slides/pdf/chapter... (last slide)



