Waapuro: A dead-simple hiragana and katakana romanization library

I put my first package on PyPI. Then I found a bug, so I made a point release to fix it! There are probably more of those lurking, which you can report over on GitHub.

Just like so many others before me, I wrote this library to scratch an itch, namely providing romanized pronunciations for Omnipresence’s WWWJDIC plugin. (Speaking of which, if I ever get that bot’s code into a less shameful state, I should put it on PyPI too…)


I haven’t posted in a while, but I’m also too tired to write anything useful, so here’s a link to an autobiographical Japanese webcomic about parenting in French Canada where the family’s only common language is English. I saw it the other day on /r/japan and… well, okay, yes, I’m blatantly reposting it, why do you ask?

日本語が読めるフォローワーいるかな? さあ

So a discussion about the relationship between written Chinese and Japanese recently popped up on my dashboard. I hate reblog chains, so I’ll just link to @mnxmnkmnd’s response and give my own take on this.

First, let’s get this out of the way: The modern Chinese and Japanese orthographies are generally not mutually intelligible. The primary problem isn’t even just one of vocabulary, but the fact that basic morphological and syntactic elements are expressed in different scripts in each language, since Japanese morphosyntax is expressed largely through kana. I would use tense, but Chinese doesn’t even have tense, so let’s look at how adverbs are indicated instead. Here, Chinese uses the affix character 地, while Japanese instead uses varying kana conjugations ending in く or に. These are, of course, nothing alike.

That being said, the two languages do share a large written vocabulary. China and Japan’s long shared histories make tracing all this out rather complicated, but if you want a sloppy summary, there were two large borrowings. First, way back in the single-digit centuries AD, Japanese took a lot of existing Middle Chinese words. Then, in Japan’s Meiji era, a whole bunch of kanji neologisms were coined for imported concepts, and a lot of these made their way back to China. You’re not going to be able to figure out even a majority of what’s written in one language with only a knowledge of the other, but some sentences will be pretty transparent.

But then, of course, there are the differing simplifications and preferred expressions for certain phrases, which throw a wrench into the perhaps 20 percent (I made this number up) mutual comprehensibility rate. At this point, it’s getting late, so I won’t keep going.

TL;DR: The original analogy is silly. Maybe if the comparison had been between English and German instead. You get the vague ability to pick out some words that look similar, but the grammar is nowhere near the same, and what is that weird ß thing?

P.S. Someone should find actual papers on this.

@mnxmnkmnd writes:

incidentally, the other scanlators’ translation of this bugs me out.

In the past, Japan was only divided into “this” and “that” realm. However at one point in time, “yomi”, the name of “that” realm, was engulfed in chaos due to the dead.

that’s just so unnatural. what does that quote thing even correspond to in japanese? it’s ridiculous in english.

It’s common for Japanese text to use quotation marks for emphasis or to indicate idiosyncratic usage, which isn’t considered improper to the degree that it is with English. Incidentally, the construction “the name of ‘that’ realm” also probably arises from translating the set phrase XというY too literally; a direct translation would be “the Y named/called X,” but nine times out of ten this sounds redundant or out of place in English, and a simple appositive would suffice instead.

The more you know.

thefutureghost writes:

I have a question about Japanese. Why are some vowels just like not pronounced? Like, “ichi” is spelled like that, but I don’t think I’ve ever heard that second “i” pronounced. Or like “Yusuke” which sounds like “Yoos-kay”, where the second “u” isn’t pronounced. Or “Asuka”.

Are they just like really short vowels (vowel length) and technically are pronounced but my English-speaking ears don’t hear it?

The “traditional” analysis (see, e.g., T.J. Vance, An Introduction to Japanese Phonology, 1987) is that high vowels are devoiced or elided entirely between two voiceless consonants (/p/ /t(s)/ /ch/ /s(h)/ /k/ /h/), or word-finally after affricates and fricatives (/s/ /sh/ /ch/). The apparent motivation for this would be assimilation to the vowel’s surrounding environment – it’s “easier” in some sense to pronounce three voiceless segments in a row than a “voiceless, voiced, voiceless” sequence. There is some disagreement about the particular phonological feature involved (maybe it’s [spread glottis] instead?), and at least one study argues that this is a probabilistic occurrence rather than a hard-and-fast rule, affected by external circumstances – but that information is likely only of interest if you’re a phonologist.
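
If it helps to see the rule spelled out mechanically, here’s a rough Python sketch of the traditional analysis. To be clear about what’s assumed: it works on lowercase Hepburn-style romaji rather than kana, the names `segment` and `mark_devoicing` are made up for this post, and it applies the rule deterministically to every matching vowel, which is exactly the simplification the probabilistic account objects to. Devoiced vowels are wrapped in parentheses.

```python
# Voiceless consonants as listed above, in Hepburn-style spelling.
VOICELESS = {"p", "t", "ts", "ch", "s", "sh", "k", "h"}
# Consonants after which a word-final high vowel is treated as devoiced.
FINAL_TRIGGERS = {"s", "sh", "ch"}
HIGH_VOWELS = {"i", "u"}
# Two-letter spellings that stand for a single consonant.
DIGRAPHS = ("sh", "ch", "ts")


def segment(word):
    """Split a lowercase romanized word into rough sound segments."""
    segments = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            segments.append(word[i:i + 2])
            i += 2
        else:
            segments.append(word[i])
            i += 1
    return segments


def mark_devoicing(word):
    """Return the word with devoiced high vowels wrapped in parentheses.

    Simplification (my assumption): every vowel that matches the
    environment is marked, with no probabilistic or prosodic effects.
    """
    segs = segment(word.lower())
    out = []
    for idx, seg in enumerate(segs):
        prev = segs[idx - 1] if idx > 0 else None
        nxt = segs[idx + 1] if idx + 1 < len(segs) else None
        devoiced = seg in HIGH_VOWELS and (
            # Environment 1: between two voiceless consonants.
            (prev in VOICELESS and nxt in VOICELESS)
            # Environment 2: word-final after /s/, /sh/, or /ch/.
            or (nxt is None and prev in FINAL_TRIGGERS)
        )
        out.append(f"({seg})" if devoiced else seg)
    return "".join(out)


for example in ("ichi", "yusuke", "asuka", "kaze"):
    print(example, "->", mark_devoicing(example))
```

For the words in the question, this marks the final “i” of “ichi” and the “u” between “s” and “k” in “Yusuke” and “Asuka,” and leaves a word like “kaze” alone, since its vowels sit next to voiced consonants.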

Speak Ruby in Japanese