
>As to your second point, no government today is as large in scope as, say, Rome, the British Empire, the Persian Empire, etc.

Utter hogwash. In terms of the scale and intensiveness of state powers, even small nations can easily outstrip the bureaucratic complexity of any ancient empire. The challenges of maintaining a welfare state are no small thing.

And plenty of countries manage incredible amounts of diversity. India has 50 official languages on its own, and that’s ignoring dialects and weird subgroups. If things seem more similar in spite of that it’s because nation-sized economies are inherently homogenizing through creation of national languages and bureaucracies. But this is itself part of the complexity. Modern governments literally aim to educate ALL the children born within them. This would have been insane to a Roman.



>And plenty of countries manage incredible amounts of diversity. India has 50 official languages on its own, and that’s ignoring dialects and weird subgroups.

Indo-Aryan is spoken by ~74% of India, with Dravidian spoken by ~24% of India. So in practice the real-life language diversity is very small.

As for 'managing diversity', India has historically 'solved' this through a caste system, not a solution I would propose.


Non-linguists tend to grossly underestimate the number of languages and diversity in regions of the earth. Because "language" is a political construct, it also depends a bit on how you count. According to official classification there are actually around 447 languages spoken in India by native speakers from the region. Of these, 122 are spoken by 10,000 people or more, and of these 22 are official languages.

Source: I'm trained as a general linguist, but am not specialized in ethno- and sociolinguistics. Check out https://www.ethnologue.com for more information.


> Non-linguists tend to grossly underestimate the number of languages and diversity in regions of the earth.

True, but old-school linguists tend to revel in this diversity, whereas more modern linguists try to find commonalities between these very diverse languages.

I remember auditing a linguistics class in college largely because I was interested in NLP. Sadly, I got the old-school professors who were more interested in students memorizing hundreds of variants of sounds than understanding the links between them. I dropped the course. Maybe if I stuck around long enough they would get to more global insights, but I was turned off by their entire approach.


Since I'm trained as a new linguist, not the old school you're referring to, I feel compelled to reply in order to correct some potential misunderstandings here. Structurally, you could also just take the position of Pollard & Sag (1994), for which they were so heavily criticized, and assume that every language on earth has the deep structure of English. By purely structural criteria, many languages that are historically unrelated would somehow be closely related to each other.

That is clearly not adequate if you want to take a look at languages and culture (in that language). Classifications of languages as languages are partly political, sometimes even highly political, and mutual comprehensibility is not a working criterion.

I don't recall the details, but the classifications used by ethnologue.com are based on a fairly reasonable set of soft criteria. The alternative would be not to speak of languages and dialects at all, and instead only of varieties within various language families that have a certain degree of mutual comprehensibility with other varieties, but that would be even more counter-intuitive to laymen.

As to your personal experience: That's sad, you merely seem to have ended up in the wrong department / with the wrong professors. Find some general linguists and computational linguists and you should get your global insights. Phonology was actually at the forefront of general structural descriptions, sometimes even earlier than in syntax. For example, they invented optimality theory. It's worth giving it another try!


> As to your personal experience: That's sad, you merely seem to have ended up in the wrong department / with the wrong professors.

It was a while ago (graduated in 2005), but I think linguistics was under the English or anthropology department (I do remember the building was in the liberal sciences area). While I saw a clear connection between linguistics and NLP, I got the sense that the focus was to understand culture through language, and less about understanding language and communication itself.

> It's worth giving it another try!

I'd like to, but in the ~15 years since I graduated it feels like the entire computational linguistics field has grown so dramatically that many of the problems I was originally interested in have been more or less solved (e.g., segmenting words into their phonemes for machine learning models, building grammar trees). Today, I'm more likely to grab one of the many libraries that do all this magic under the hood while I remain ignorant.


Are you saying that every speaker of an Indo-Aryan language can understand every other speaker of an Indo-Aryan language?


I don’t believe that for the Dravidian languages mutual intelligibility is common. I’m a Tamil speaker and I can’t understand any Telugu or Kannada - it sounds like people speaking Tamil underwater in a dream.


Can someone with more linguistic background compare this to the Romance languages for me?

Is this like saying Italian, French, and Spanish are pretty much the same because they are Romance languages? Or that English and German are interchangeable because English is a Germanic language?

Or is it much closer, like British and American English, where the differences are mostly a matter of accent and dialect?

I'm trying to understand how different the Indian language subgroups are from one another.


>Is this like saying Italian, French, and Spanish are pretty much the same because they are Romance languages? Or that English and German are interchangeable because English is a Germanic language?

It really varies. Some languages, like Hindi, Gujarati, and Punjabi, are about as similar as Italian, Portuguese, and Spanish. In other cases the common languages can be very different. Telugu and Tamil, for instance, are probably much more like trying to go between English and German. Telugu and Kannada are kind of odd ducks. They're basically Dravidian languages that borrowed tons of their vocabulary from an Indo-European language (Sanskrit). That actually makes them a very good analogue for English, which is a Germanic language that spent a lot of time trying desperately to be a Romance language (French).


My understanding is that yes, to a very large extent, since they are dialects built on the same language foundation, which is Sanskrit and later Prakrit.


No, you're referring to very broad language families. See my other post or www.ethnologue.com for accurate information.

It's a bit of a tricky business, because what counts as a language is often also determined by political decisions. Mutual comprehensibility is not a reliable criterion. However, modern linguistics uses pretty good sets of "soft" criteria for the classifications.


From personal experience, this is not really true. There are some languages that do maintain some sort of mutual comprehension, but the vast majority in India have no guarantee of that apart from a few choice root words. Speakers of a particular Dravidian language will almost certainly not be able to understand another, and you could make the same argument for Indo-Aryan languages with a very high percentage of success. It's like calling English and French the same language because they have some words with a similar root. Or even English and German.


Those are language families. Not languages.



