Wednesday, June 28, 2006

To get some sense of my progress, I've been trying to estimate how many words I know in Thai. I have approached this in a couple of different ways. First, I've thought through related sets of words. I know twelve months, ten colors, seven days of the week, forty-four consonants, etc. Second, I've glanced over word lists, such as the vocabulary index of a Thai textbook and the entries in a learner's dictionary. By using the size of a list and the percentage of words I recognize, I can produce an estimate. My best estimate is that I know about a thousand words.
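
The word-list method amounts to a little arithmetic: take a sample of entries from a list, note what fraction is recognized, and scale that fraction up to the size of the whole list. Here is a minimal sketch in Python. The function name and all of the numbers are made up for illustration; they are not the actual counts from any textbook or dictionary.

```python
def estimate_known_words(list_size, sample_size, recognized_in_sample):
    """Scale the recognition rate in a sample up to the whole word list."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    recognition_rate = recognized_in_sample / sample_size
    return round(list_size * recognition_rate)


# Hypothetical numbers: a 2,500-entry list, 200 entries sampled, 80 recognized.
estimate = estimate_known_words(list_size=2500, sample_size=200, recognized_in_sample=80)
print(estimate)
```

The estimate is only as good as the sample, and it inherits whatever definition of "recognized" was used while skimming the list.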

In doing this exercise, I've realized that it's actually not clear to me what it means to "know" a word. While very frequent words are clearly "known", and words I've never heard are "unknown", there is a whole range of other possibilities. There are words that I understand when listening but cannot correctly use. There are words that I recognize and understand only in context, and there are words for which my sense is still emerging and incomplete. Even taking into account this ambiguity, I think 1000 words is reasonably accurate.

Now that I know what I know, I'd like to find statistics for Thai showing that the most frequent 1000 words cover x% of spoken language, the most frequent 2000 words cover y%, etc. This lexical coverage information is easy to find for English, but I have been unable to find anything for Thai. So I've resigned myself to trying to estimate for Thai by using what is known for English.

One consideration in trying to apply English lexical coverage to Thai is that Thai morphology is not as productive as that of English. An ESL learner who acquires a word like "create" also acquires a whole family of words, including "creates", "created", "creative", "creation", and "recreate". In Thai, there are no such families of words. Other words function in place of morphology. For example, to say "created", a Thai speaker would say "create already". Word families in Thai are families of one.

I found some research on Marlise Horst's website showing that, with a vocabulary of the thousand most frequent word families in English, students understand about 85% of spoken language. To increase that comprehension to 98%, a vocabulary of 6000-7000 word families is needed. Because of the difference in morphology, statistics for word families in English might give a rough approximation of statistics for individual words in Thai. This jibes with my experience. With my thousand-word vocabulary, I think I do understand about 85% of spoken Thai. This assumes an idealization where the only impediments to following a dialogue are vocabulary and grammar. The ability to follow spoken dialogue at a normal rate of speed in a variety of regional accents is a separate issue.

The Linguist, an interesting ESL website, has another way to measure proficiency in a second language using the number of known words.

Beginner: a) 2,000 b) 3,500
Intermediate: a) 5,000 b) 7,500
Advanced: a) 10,000 b) 12,500
(source: The Linguist blog)

This system is for English, and every word in a word family is counted, so an attempt to apply it to Thai would again require taking into account the difference in morphology. Playing with the numeric data from Horst's site, it appears that there is an average of two words in an English word family, with the most frequent families being the largest. Since Thai has word families of one word each, it seems reasonable to multiply the number of words in my vocabulary by a little more than 2 to get a rough estimate of an equivalent ESL vocabulary. With my thousand-word vocabulary, I'm the equivalent of an ESL student a little past "Beginner A". This seems about right.
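
The conversion can be sketched in a few lines of Python. Two things here are assumptions rather than anything stated by The Linguist or by Horst: the multiplier of 2.2 is just one reading of "a little more than 2", and each figure in the scale is treated as the minimum word count for reaching that level. The function names are hypothetical.

```python
# The Linguist's scale, read here (as an assumption) as minimum word counts.
LEVELS = [
    ("Beginner a", 2000),
    ("Beginner b", 3500),
    ("Intermediate a", 5000),
    ("Intermediate b", 7500),
    ("Advanced a", 10000),
    ("Advanced b", 12500),
]


def esl_equivalent(thai_words, words_per_family=2.2):
    """Convert a Thai word count to a rough ESL word count.

    words_per_family=2.2 is an assumed value for "a little more than 2".
    """
    return round(thai_words * words_per_family)


def level_for(word_count):
    """Return the highest level whose threshold has been reached, or None."""
    reached = None
    for name, threshold in LEVELS:
        if word_count >= threshold:
            reached = name
    return reached


equivalent = esl_equivalent(1000)
print(equivalent, level_for(equivalent))
```

With a different multiplier the result shifts, so this is best read as a ballpark figure, not a measurement.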

Tuesday, June 27, 2006

It's interesting to observe how learning works. J. Marvin Brown's article "Learn Languages Like Children" mentions an "incubation" phenomenon, where a student's ability in a foreign language improves after a hiatus. As a graduate student, I certainly experienced this with mathematics. As a hobbyist musician, I also observe this when practicing an instrument.

My recent approach with Thai has been simply to listen "effortlessly but with understanding". One of my staples is watching movies. It is often easy to understand a movie without understanding all of the dialogue, by using visual cues and other context. This is especially true with children's movies.

It's very interesting to come back to the same movie again after a period of time. My comprehension invariably increases without any work on my part. Clearly, this is at least partly because I am on the next iteration of hearing the same dialogue, but I wonder whether it is also related to "incubation" that occurs between successive viewings, when I am not even exposed to Thai.

Saturday, June 10, 2006


I'm removing a few posts that are no longer interesting to me. You can reach the homepage here.

Thanks for your interest!

Wednesday, June 07, 2006

I have been working through the AUA books entitled Reading and Writing. The Thai writing system has 44 consonants, which are divided into three classes, arbitrarily named high, mid, and low. The consonant classes are used to determine tones. One of the challenges Thai students face is to learn which consonants belong to which class. During the Thai class I took in graduate school, the writing system was de-emphasized until the second half of the course, at which point there was an intensive memorization effort aimed at the consonant classes and the corresponding rules. I did well with the homework, quizzes, and exams, but I never felt that I had mastered the material.

As usual, I find the AUA approach to be remarkably innovative, well thought-out, and interesting. Rather than jumping right into memorizing classes and tone rules, the AUA book starts by distinguishing sonorant consonants from aspirates and plain stops. Grouping consonants in this way requires no memorization: once the student understands the distinction, it is obvious which group each consonant belongs to. For example, "m" and "n" are sonorants because the larynx vibrates when they are pronounced. Further lessons explain that sonorants are all low class and plain stops are all mid class, while explaining the tone rules for each class. This makes learning their classes and the corresponding rules very easy. The aspirates then have to be sorted into high and low class. Sorting out the aspirates is also easy, because the initial syllable of the name of each aspirate has either a rising tone (ขอ ไข่) or a mid tone (คอ ควาย). In the former case, the class is high, and in the latter case, the class is low.
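
The sorting procedure described above is small enough to write down as a decision rule. The sketch below is only my paraphrase of it in Python, not anything taken from the AUA book; the function name and the string labels are made up for illustration.

```python
def consonant_class(kind, name_tone=None):
    """Sort a consonant into a class using the grouping described above.

    kind: "sonorant", "plain stop", or "aspirate"
    name_tone: tone of the first syllable of the consonant's name
               ("rising" or "mid"); only needed for aspirates.
    """
    if kind == "sonorant":
        return "low"
    if kind == "plain stop":
        return "mid"
    if kind == "aspirate":
        if name_tone == "rising":
            return "high"
        if name_tone == "mid":
            return "low"
        raise ValueError("aspirates need name_tone of 'rising' or 'mid'")
    raise ValueError(f"unknown kind: {kind!r}")


print(consonant_class("sonorant"))            # e.g. "m" or "n"
print(consonant_class("aspirate", "rising"))  # e.g. ขอ ไข่
print(consonant_class("aspirate", "mid"))     # e.g. คอ ควาย
```

The only thing left to memorize is how each aspirate's name is pronounced, which a student picks up anyway while learning the alphabet.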

Saturday, June 03, 2006

I have another movie review, this time for Ong Bak, Thai Warrior. It's a great film for muay thai boxing and stunt work, but a bad movie for learning Thai, at least at my level. The dialogue is composed mainly of the Isan dialect and central slang. It is also somewhat sparse, as a lot of time is taken up by boxing and stunts.

I did get a better sense of some words in informal Thai, like the particle "wa" and the word "dtang" for money, and it was interesting to hear Isan. I also really enjoyed the muay thai, the elaborate stunt work, and the action scenes in Bangkok. But I don't think seeing this movie again soon would help me much with my Thai language skills.

***** for muay thai
***** for stunts
** for value as a language acquisition tool

Thursday, June 01, 2006

Last night I watched 6ixtynin9, a Thai movie that I rented from Netflix. It was great. The reviews pointed out that it's heavily influenced by Quentin Tarantino. I agree that some aspects show his influence, such as characteristic framing and liberal doses of tongue-in-cheek violence. I also wonder about the influence of David Lynch. There are some quirky scenes that remind me more of Lynch than Tarantino, such as one in which a mobster starts to cry about missing his mother. I thought Lalita Panyopas' portrayal of the main character was very good. I had only seen her in a campy police drama made for television, so I was surprised to see how well she can act.

The story is easy to follow without using subtitles. I would say that I understood 30-40% of the spoken dialogue, and 90-100% of the story. As AJ Hoge nicely puts it in his AUA observations, it was easy to "forget" that I was watching a movie in Thai. I'll probably watch this at least once again before sending it back. It is a nice change from watching "Finding Nemo" dubbed into Thai for the nth time.