Friday, May 22, 2009

Kang (1995) The Effects of a Context-Embedded Approach to Second Language Vocabulary Learning

This is another paper I am reading as part of a meta-analysis of different approaches to teaching vocabulary. It is a very interesting study that showed big improvements not only in vocabulary retention on straightforward recall tests, but also in listening comprehension and knowledge transfer. There were four experimental conditions: teacher-led class study (P&P), computer study lists (CW), computer study lists with pictures (CP), and the clear leader, computer study based on learning vocabulary in the context of narrative (CC). I would really have liked to see some example images from the interface. The results seem to show that learning simple vocabulary in the context of narratives is highly effective.

I think it is fascinating that students in the context condition (CC in the above diagram) outperformed the other groups on pure recall tests. I had been starting to form the impression that preparing for a particular form of test is the best way to improve performance on that test, but in this case it seems that preparing for knowledge transfer also led to better scores on straight recall tests. That said, I'd need to see interface images to check whether this really does contradict the kind of cross-matching we see in Groot (2000), where concordancing practice boosts performance on concordance tests but not on straight paired-associate recall, and vice versa.

Kang cites lots of relevant theory such as "inert knowledge" (Brown et al., 1989) and "cognitive embedding" (Ausubel, 1968), which links up with points I was making in a recent journal paper I co-authored with Maria Uther of Brunel University in the UK: Joseph & Uther (2009). In that paper I referred to some research (Chi & Koeske, 1983) indicating that information that is more deeply embedded in an individual's knowledge network is likely to be remembered longer. Although this is a common theme in the literature on memory and second language acquisition, the references that Kang cites are different from those that I have been aware of so far. Of course that is not so surprising, but it gives me further pointers for linking up this concept with computer-assisted language learning.

One query I have is about the reliability coefficient that Kang reports for each type of test. My, possibly flawed, understanding of reliability is that one is attempting to work out how effective some measure is at assessing an individual construct, such as in a social science questionnaire where multiple questions attempt to probe the same underlying construct, like racism or sexism. However, in an experiment like Kang's, each test item is attempting to measure the learner's knowledge of a particular word. There are no repeat measurements using different instruments, except to the extent that Kang employs three types of vocabulary tests. One could assess reliability across those tests, but Kang reports reliability for each individually.

The only way I can make sense of a reliability coefficient for a single type of test on multiple words, over multiple learners, is that we are thinking of knowledge of multiple words as a single construct and are assessing the reliability of the test instrument in those terms. However, since the learner may have had individual difficulties with each word separately, i.e. they are likely to learn different words at different rates, that doesn't quite make sense, unless all the words are very similar in terms of abstractness, visualizability, frequency etc. In that case we have determined that each test is probing the user's learning of the same sort of word, e.g. the reliability of a test type such as productive recall for assessing learning of concrete nouns. Kang describes the vocabulary used in the study as common everyday words such as household items and routine activities. Anyhow, I don't think this is a serious criticism of a very interesting study, and it's highly likely that I am misunderstanding the meaning of reliability measures in this context, but it does seem a little like a situation where the statistical software generates reliability measures and they are reported verbatim without assessment of their suitability for the experiment in question.
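
To make the question above concrete, here is a minimal sketch of how an internal-consistency coefficient is typically computed for a single test type. It assumes the coefficient in question is Cronbach's alpha, which I have not confirmed against the paper's method section, and it uses a made-up score matrix (rows are learners, columns are words, 1 means the word was recalled). None of the numbers come from Kang's data, and the function name is my own.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a learners-by-items score matrix.

    scores: list of rows, one per learner; each row holds one score per
    test item (here, one item per vocabulary word).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two learners")

    # Variance of each item (word) across learners.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of each learner's total score across learners.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 learners, 4 words, 1 = recalled, 0 = not recalled.
hypothetical_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(hypothetical_scores)
print(round(alpha, 3))  # 0.8
```

Alpha comes out high for this toy matrix because learners who know the "harder" words also know the "easier" ones, so the item scores rise and fall together. That is the sense in which the coefficient treats knowledge of many words as one construct: it describes how consistently the items rank learners, and says nothing about whether any single word was measured well.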

cited by 15 [ATGSATOP]

My references

Chi, M.T.H. & Koeske, R.D. (1983) Network representation of a child's dinosaur knowledge. Developmental Psychology 19(1) 29-39.

Joseph, S.R.H & Uther M. (2009) Mobile Devices for Language Learning: Multimedia Approaches. In Research and Practice in Technology Enhanced Learning 4(1) 1-26.

Kang's references:
ALVAREZ, M. C. (1990) Case-based instruction and learning: an interdisciplinary project (Cited by 6). Paper presented at the 34th annual meeting of the College Reading Association, Nashville, Tennessee.
ANDERSON, R. C. and NAGY, W. E. (1991) Word meanings. In Barr, R., Kamil, M. L., Mosenthal, P. B. and Pearson, P. D. (eds), Handbook of Reading Research, Vol. 2, pp. 690-724. New York: Longman.
AUSUBEL, D. P. (1968) Educational Psychology: a Cognitive View (Cited by 3792). Chicago: Holt, Rinehart and Winston.
BROWN, J. S., COLLINS, A. and DUGUID, P. (1989) Situated cognition and the culture of learning (Cited by 5933). Educational Researcher 18(1), 32-42.
Cognition & Technology Group at Vanderbilt. (1990) Anchored instruction and its relationship to situated cognition. Educational Researcher 19(6), 2-10.
CRAIK, F. I. M. and LOCKHART, R. S. (1972) Levels of processing: a framework for memory research (Cited by 3428). Journal of Verbal Learning and Verbal Behavior 11, 671-684.
KRASHEN, S. D. (1981) Second Language Acquisition and Second Language Learning (Cited by 1936). New York: Pergamon Press.
KRASHEN, S. D. (1982) Principles and Practice in Second Language Acquisition (Cited by 2986). New York: Pergamon Press.
KRASHEN, S. D. (1989) We acquire vocabulary and spelling by reading: additional evidence for the input hypothesis (Cited by 318). The Modern Language Journal 73, 440-464.
MEARA, P. (1980) Vocabulary acquisition: a neglected aspect of language learning (Cited by 110). Language Teaching and Linguistics 13, 221-246.
MILLER, G. A. (1985) Dictionaries of the mind (Cited by 2444). Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, pp. 305-314. Chicago: Author.
MILLER, G. A. and GILDEA, P. M. (1987) How children learn words (Cited by 240). Scientific American 257(3), 94-99.
NAGY, W. E. and HERMAN, P. A. (1987) Depth and breadth of vocabulary knowledge: Implications for acquisition and instruction (Cited by 10). In McKeown, M. G. and Curtis, M. E. (eds), The Nature of Vocabulary Acquisition, pp. 19-35. Hillsdale, NJ: Erlbaum.
NATION, I. S. P. (1990) Teaching and Learning Vocabulary (Cited by 912). New York: Newbury House.
OMAGGIO, A. C. (1986) Teaching Language in Context: Proficiency-Oriented Instruction (Cited by 472). Boston, MA: Heinle and Heinle.
RIVERS, W. M. (1981) Teaching Foreign-Language Skills (Cited by 462). 2nd edn. Chicago, IL: The University of Chicago Press.
SPIRO, R. J., COULSON, R. L., FELTOVICH, P. J. and ANDERSON, D. K. (1988) Cognitive flexibility theory: advanced knowledge acquisition in ill-structured domains (Cited by 473). Tech. Rep. No. 441. Champaign: University of Illinois, Center for the Study of Reading.
SPIRO, R. J., VISPOEL, W. L., SCHMITZ, J. G., SAMARAPUNGAVAN, A. and BOERGER, A. A. (1987) Knowledge acquisition for application: cognitive flexibility and transfer in complex content domains. Tech. Rep. No. 409. Champaign: University of Illinois, Center for the Study of Reading.

1 comment:

Yi-Jiun (or Angela) said...

Brown, JD. (2005). Testing In Language Programs: A Comprehensive Guide To English Language Assessment.

Might be interesting to read C.8 (on language test reliability) and C.10 (on psychological construct)

On p. 196, the author sums up that if all other factors are held constant, the following statements are usually true:
3. “A test made up of items that assess similar language material tends to be more reliable than a test that assesses a wide variety of material.”
4. “A test with items that discriminate well tends to be more reliable than a test that does not discriminate well.”

According to the book, I guess these might be the reasons why Kang reported the Cronbach's alpha coefficient. It’s at least a good way to show the readers how well her in-house assessment worked. In addition, it seems to me that a construct is more abstract than individual words. It might refer to vocabulary development in this case.