An analysis of Japanese mora pair frequencies, intended to help Japanese language learners improve their speaking by targeting the specific mouth movements required to pronounce Japanese words confidently.
- Identify all valid two-mora combinations (mora pairs) in a Japanese corpus and rank them by frequency.
- Generate a vocabulary list that exposes a language learner to all possible mora pairs while favouring words that are frequently used in Japanese.
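
As a rough illustration of the first goal, the sketch below splits kana readings into morae and counts adjacent pairs. It is not the code used in this project. The names `split_morae`, `mora_pairs` and `rank_pairs` are hypothetical, and it assumes the input is already a kana reading (no kanji) in which small kana such as ゃ, ゅ and ょ attach to the preceding kana, while っ, ん and ー each count as their own mora.

```python
import unicodedata
from collections import Counter

# Small kana that merge with the preceding kana into one mora (e.g. きょ, ファ).
# Assumption: っ, ん and ー are treated as separate morae, so they are not listed here.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_morae(kana):
    """Split a kana reading into a list of morae."""
    # NFC keeps voiced kana such as づ as a single code point.
    kana = unicodedata.normalize("NFC", kana)
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch  # attach the small kana to the previous mora
        else:
            morae.append(ch)
    return morae


def mora_pairs(kana):
    """Return the adjacent mora pairs of one reading, in order."""
    morae = split_morae(kana)
    return list(zip(morae, morae[1:]))


def rank_pairs(readings):
    """Count mora pairs over many readings, most frequent first."""
    counts = Counter()
    for reading in readings:
        counts.update(mora_pairs(reading))
    return counts.most_common()


if __name__ == "__main__":
    for pair, count in rank_pairs(["あたたかい", "つづける", "たたく"]):
        print("".join(pair), count)
```

A real corpus would also need a step that converts written words to their kana readings, which is outside the scope of this sketch.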
As I have been learning Japanese, I have noticed on many occasions that I have difficulty pronouncing certain words because of awkward sound pairings (e.g. 温かい「あたたかい」, 続ける「つづける」). These difficulties persist even when I understand the word well.
My background in cognitive neuroscience and developmental psychology leads me to suspect that second-language learners struggle to pronounce words because of underdevelopment in the motor cortices and the language production centers of the brain (specifically Broca's area, in the left frontal lobe). In other words, during childhood these areas did not develop to let the mouth move in ways that produce speech sounds outside the learner's native language.
However, I believe human beings are adaptive creatures. With enough hard work and practice, even an adult can slowly rewire their brain to become accustomed to different mouth movements. To that end, I sought to develop a list of words covering every known sound pairing, which a person could use to practice moving their mouth. I chose sound pairings because I feel that pronouncing single sounds should present very little ongoing difficulty for a language learner. I restricted the pairings to two sounds each, as this gave a reasonable number of permutations to explore. I posit that practicing words containing any number of two-sound combinations is beneficial, because the order of the spoken sounds determines the shape the mouth forms, with each sound pre-empting the shape of the mouth for the next sound.
I did not wish to generate a list of words that are used so infrequently that they would be impractical for a language learner to know. I therefore attempted to balance the list by selecting frequently used words as much as possible, while ensuring complete coverage of the two-sound pairings.
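
One way to picture this balance is a greedy set-cover heuristic: repeatedly pick the word that covers the most still-uncovered pairs, breaking ties in favour of the more frequent word. The sketch below is only an assumption about how such a selection could work, not the method used in this project. The names `pairs_of` and `select_words` and the sample frequencies are hypothetical, and a greedy pass is not guaranteed to produce the shortest possible list.

```python
# Small kana that merge with the preceding kana into one mora (assumption:
# っ, ん and ー count as separate morae).
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def pairs_of(kana):
    """Return the set of adjacent mora pairs in one kana reading."""
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return set(zip(morae, morae[1:]))


def select_words(word_freq):
    """Greedily choose words until every mora pair in the input is covered.

    word_freq maps a kana reading to its (hypothetical) corpus frequency.
    """
    remaining = {word: pairs_of(word) for word in word_freq}
    uncovered = set()
    for pairs in remaining.values():
        uncovered |= pairs

    chosen = []
    while uncovered:
        # Prefer the word covering the most new pairs; break ties by frequency.
        best = max(
            remaining,
            key=lambda w: (len(remaining[w] & uncovered), word_freq[w]),
        )
        chosen.append(best)
        uncovered -= remaining.pop(best)
    return chosen


if __name__ == "__main__":
    sample = {"あたたかい": 50, "たたく": 80, "かい": 100, "あたた": 10}
    print(select_words(sample))
```

Weighting coverage against frequency differently (for example, scoring each word by new pairs multiplied by frequency) would shift the list toward more common words at the cost of a longer list.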
For a more detailed report...
I would recommend that any language learner who is serious about improving their speech use the generated lists to practice listening to and pronouncing the words. I would not suggest focusing on the meanings of the words. While knowing the meanings may seem an obvious goal for a language learner, it may detract from the purpose of improving speech.
This is not an open project or software, but you are free to use this code to run your own analyses and develop your own optimal word list from which you could build your own flash card decks to practice speaking.