The mysterious origins of punctuation (bbc.com)
69 points by scdoshi on Sept 3, 2015 | 15 comments


Sometimes the movies like to use $SCIENCE to bring someone forward from the past to see our world and be all surprised and stuff. It was almost its own genre in the 1980s. And they were always amazed at radios and cars and all the obvious things. But I'm also interested in all the other things that they would find amazing and that we don't even notice or know about. As the article says, the sight of someone sitting on a park bench, reading a book with their lips not moving, would surprise quite a lot of our ancestors, to say nothing of a "normal person" doing it rather than a highly educated priest or obvious academic. We take it for granted, but it's a relatively recent innovation. (On the timescale of "all of human civilization", anyhow.)


Any more information on this? Honestly, it's something I had never considered before.


A counterpoint to the claim about the ancients reading aloud: http://www.theguardian.com/books/2006/jul/29/featuresreviews...


When I learned modern Chinese, it had only sentence breaks. Individual words are one to four syllables long and are not separated. It takes a while to handle this. It would be like eliminating spaces in English. Chinese before 1900 is even worse. They didn't have sentence or paragraph breaks, leaving one large grid of characters. And the grid can run in any of four directions: vertical or horizontal, rightward or leftward. Ancient Latin and the Hebrew of the Torah are like this too.
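To make the "English without spaces" comparison concrete, here is a minimal sketch of dictionary-based word segmentation using greedy forward maximum matching. The function name `segment` and the toy vocabulary are made up for illustration; real segmenters for Chinese use far larger lexicons and usually statistical models, so treat this only as a picture of the problem.

```python
def segment(text, vocabulary):
    """Split unspaced text into words by greedy longest-match lookup.

    At each position, take the longest vocabulary word that starts there.
    If nothing matches, fall back to a single character.
    """
    max_len = max(len(word) for word in vocabulary)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy vocabulary, chosen to show both a success and a failure.
vocab = {"the", "cat", "sat", "on", "mat", "them"}

print(segment("thecatsat", vocab))          # ['the', 'cat', 'sat']
print(segment("thecatsatonthemat", vocab))  # ['the', 'cat', 'sat', 'on', 'them', 'a', 't']
```

The second call shows why this is hard: the greedy rule grabs "them" and leaves "at" stranded, even though "the mat" is the reading a human would pick.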


So punctuation is a type of musical notation for high-level meter and pitch, while the words are like the lower-level rhythm. Pitch and meter fluctuations, over the ranges of phrases and sentences, are what many spoken languages use to chunk sentences and phrases, which helps listeners disambiguate the connective structure of the words.

Quite interesting that punctuation co-evolved with musical notations. I had thought that punctuation existed prior to musical notation.

Side note: although American English speakers employ quite diverse pitch and meter patterns, more homogeneous populations, as in many European countries, have extremely consistent spoken pitch and meter usage patterns over phrases and sentences.


That's interesting. In English we would probably consider constant pitch and meter to be dull. Then again, I remember from Latin in college that the Romans tended to play with sentence structure to put the subject at the end of the sentence, making it more suspenseful, so there are probably other language attributes that those more homogeneous populations play with to introduce variety and distinguish interesting and dull speakers.


This appears to be a lightly summarised excerpt of Houston's (fascinating) book "Shady Characters". As mentioned briefly at the end of the article, he has an interesting blog where he discusses more: http://www.shadycharacters.co.uk .


It's worth pointing out that the scripts of many widely spoken languages do not have a word divider (e.g. the space character in English). Mandarin, Japanese, and Khmer are some examples.

I'm aware of some studies that try to determine whether or not this slows down reading, but in my own experience it doesn't get in the way much.


Also worth pointing out that in languages which do divide the words, they often divide the words too much, such that the individual words have little to no relation to the meaning of the longer phrase. E.g. new york, arm and a leg, kick the bucket.

This over-dividing problem and the under-dividing problem you mentioned are both huge hurdles for machine textual understanding and machine translation systems.

On the information-extraction system I'm working on now, roughly 80% of the entities we're trying to extract are multi-word expressions. Very difficult.
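As a rough illustration of undoing the "over-dividing", here is a minimal sketch that merges known multiword expressions back into single tokens by longest match against a lexicon. The function name `merge_mwes` and the tiny lexicon are hypothetical; this is not the system described above, and a real pipeline would also have to handle inflection ("kicked the bucket") and literal uses.

```python
def merge_mwes(tokens, lexicon):
    """Join runs of tokens that form a known multiword expression.

    `lexicon` is a non-empty set of lowercase token tuples. Matching is
    greedy: at each position the longest matching expression wins.
    """
    max_len = max(len(entry) for entry in lexicon)
    merged = []
    i = 0
    while i < len(tokens):
        # Only windows of two or more tokens can be multiword expressions.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            window = tuple(t.lower() for t in tokens[i:i + size])
            if window in lexicon:
                merged.append(" ".join(tokens[i:i + size]))
                i += size
                break
        else:
            # No expression starts here, so keep the single token.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical lexicon built from the examples in this comment.
lexicon = {
    ("new", "york"),
    ("kick", "the", "bucket"),
    ("arm", "and", "a", "leg"),
}

print(merge_mwes("He moved to New York".split(), lexicon))
# ['He', 'moved', 'to', 'New York']
```

A lookup like this only covers expressions you already know about, which is part of why extracting new ones is so difficult.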

[1] https://en.wikipedia.org/wiki/Multiword_expression

[2] http://aclweb.org/aclwiki/index.php?title=Multiword_Expressi...

[3] http://lingo.stanford.edu/pubs/WP-2001-03.pdf


Is it actually described as "over-dividing" in an academic sense? Those individual words do have meaning, but they have later been recombined into forms that have new, sometimes orthogonal meanings. I can see the argument for mashing them back together in that case, but "over-divided" seems a strange way to look at it.


A related idea is linguistic "compositionality":

https://en.wikipedia.org/wiki/Principle_of_compositionality

I don't have hard numbers, but I know from experience that a large share of multiword expressions are non-compositional (the meaning of the larger phrase can't be inferred from its constituents), so in that case thinking of them as "over-divided" makes sense to me.


In linguistics, we call such phrases "idioms".

https://en.wikipedia.org/wiki/Idiom


Cool, I've also done a lot of thinking/work related to multiword expressions. A couple years back I did word segmentation on Khmer too. I'd be curious to hear more about your work!


And in some other languages, like Arabic, Persian, and Urdu, the absence of a word divider is compensated for by multiple forms of every alphabet.

So every alphabet can have at most 4 forms, depending on its position in the word: start, middle, end, and standalone (but there are caveats).
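One way to see the four forms, and one of the caveats, is to look at the Unicode character database. Here is a minimal sketch using Python's standard `unicodedata` module: it scans the two Arabic Presentation Forms blocks and collects the positional tags (`isolated`, `initial`, `medial`, `final`) recorded for a given base letter. The helper name `forms_of` is made up for illustration, and modern text normally stores only the base letter and leaves shaping to the font, so this inspects legacy compatibility characters rather than showing how rendering actually works.

```python
import unicodedata

# Arabic Presentation Forms-A and Forms-B (legacy compatibility blocks).
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]
POSITION_TAGS = {"<isolated>", "<initial>", "<medial>", "<final>"}


def forms_of(letter):
    """Return the sorted positional forms Unicode records for one base letter."""
    target = f"{ord(letter):04X}"
    found = set()
    for start, end in PRESENTATION_RANGES:
        for code_point in range(start, end + 1):
            # Decompositions look like "<initial> 0628"; unassigned code
            # points give an empty string.
            parts = unicodedata.decomposition(chr(code_point)).split()
            # Keep single-letter forms only; ligatures list several letters.
            if len(parts) == 2 and parts[0] in POSITION_TAGS and parts[1] == target:
                found.add(parts[0].strip("<>"))
    return sorted(found)


print(forms_of("\u0628"))  # BEH:  ['final', 'initial', 'isolated', 'medial']
print(forms_of("\u0627"))  # ALEF: ['final', 'isolated']
```

ALEF is an example of the caveat: it never joins to the following letter, so it has only two forms instead of four.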

Also, in these languages the usual punctuation marks, like the question mark and the comma, are drawn the other way round.

After a while one gets used to it.


I think by "alphabet" you mean "character"?



