
The impact of sound and image features of IMM on CALL

Xiao Xi
University of Newcastle, New South Wales
Language teaching, at least in the communicative approach, is basically an audiovisual interaction. The lack of true speech sound and image in CALL had disheartened many enthusiasts. Now, with IMM, they walk with their heads high. This paper analyses CALL's need for interactive multimedia, and looks at the sound and image features of IMM in the light of language teaching. The future impact of IMM on CALL is anticipated, with a brief illustration from the software "Monking", a Chinese character tutorial developed by the author.

CALL calls for IMM

Computer Assisted Language Learning (CALL) has met understandably negative reactions from some language teachers and educational administrators. "There is the usual wall of professional apathy, inertia or disenchantment." (McCarthy, 1993, 2) Apart from those cynics who are by nature disinclined to play with electronic equipment, and who are unlikely to leap for joy at the prospect of yet another familiarisation process with new equipment that some of their students master better and faster than they do, many teachers who were quick to recognise the potential of computers have since lost their enthusiasm because of the cost, the lack of institutional support, and so on. It was disheartening that, with no true speech sound and little beyond crude pictures, CALL had, despite all its other merits, taken language teaching back to the age before the tape recorder and video. Our students have grown up with computer and video games in which animation, sound and even some speech abound. It is understandable that a screenful of text is less motivating and reduces the impact of CALL.

In recent years we have seen a shift in language teaching theory and practice from the structural approach to the communicative approach. Communicative skills are emphasised rather than language as knowledge (such as grammar rules). Speech sound and real world images become crucial elements in teaching language skills like speaking, listening, and the proper use of the language in social situations. All this implies a "downgrading of those very areas which best lend themselves to computerisation (eg, grammar rules instruction and drilling) and a corresponding upgrading of aspects less susceptible to being computerised ... the computer thus finds itself making a greater contribution to marginal than to central elements. ... It is indeed ironical that computer technology ... should have arrived at a time when the theories about language learning which could best accommodate it were on their way out." (Kenning & Kenning, 1990, 48)

With the rapid development of computer technology, interactive multimedia has taken CALL to a new stage. The once deaf and mute CALL program (able to make a little noise, to be exact) can now talk and understand a little in true speech sound, and can give a vivid picture of a situation in the real world. This has again set the imagination of CALL enthusiasts on fire.

The sound and image features of IMM that serve CALL

Sound features

Sound cards greatly enrich music and sound effects, which will certainly contribute to the motivation of the learner and make pictures of reality more lifelike. Other features also answer CALL's needs.

Voice output. Ready made sentences, phrases or words are activated by the learner's keyed-in response. These digitised recordings of true speech sound are natural utterances. A voice synthesiser, which can read any text that the learner keys in, can be used for ELIZA-style conversation and for proof reading, although its speech is not yet fully natural.

Another new technology worth mentioning here is called Talking Book, which turns barcode reading into real speech sound. With a barcode wand connected to a computer (without a monitor), learners can look at a text on paper and listen to it being read aloud at the same time.

"The added audio dimension offers great scope for teaching those concepts and skills which depend critically upon sound for full communication." (Shaw, 1993, 213) For instance, the tones of Chinese characters are very challenging to LOTE learners. Without accompanying speech sound, it is difficult to acquire them even if the Pinyin and tone marks are shown on the screen.

Voice input. Voices of teachers and students can be recorded and played back by computer. This helps phonetic instruction and pronunciation training.

Speech recognition, the most exciting new technology, has shown its potential in CALL. For example, pronunciation training needs constant feedback and repetition. The learner often feels embarrassed before a teacher and frustrated after repeated failure. With a computer providing the feedback, the anxiety is greatly reduced. Although this technology is still in its infancy and the reliability of recognition needs improvement, we can expect that in the near future our students will really be able to talk to, and with, the computer in the target language.

Image features

Graphics. Graphics have their role in showing simple maps and objects.

Animation. With authoring software like Actions II, animation becomes an easy task for the designer. It certainly contributes to the motivational appeal of the software. Sometimes it serves better than real life pictures because of its cartoon nature. With certain topics, it plays a decisive role. For example, without animation, it is hard to present the stroke order of Chinese characters.

Video. This is the most convincing step forward in CALL. In communicative teaching, providing a situation close to the real world, in order to impart knowledge of the socio-cultural aspects of the language in use, is an important and most difficult task for the teacher. Without true video, the meaning of body language and facial expressions, at the very least, can hardly be conveyed. Interactive video has made a breakthrough here, and Aussie Barbie is certainly the landmark in CALL.

Some of these sound and image features already exist in other media used in language teaching, such as audio recorders and video recorders, but the merit of interactive multimedia lies in its capacity to bring all the necessary media together in a computer which is "capable of generating human language while other forms of technology ... can only reproduce language." (Lee, 1993, 2)

The interactive feature distinguishes IMM from any other technology. "True interactivity implies a dialogue in which both sides of a two way exchange adapt their behaviour in the light of the other's response. For this reason, linear videotape can never be fully interactive ..." (Coleman, 1991, 93)

The future impact of IMM on CALL

  1. With the sound and image features of IMM, the popularisation of CALL will quicken. "The preferred modalities of learning - visual, tactile, auditory, kinaesthetic, group, individual - can be brought into focus using interactive multimedia technology." (Garton, 1992, 21) Teachers of language and educational administrators will see the benefit of adopting CALL into the curriculum. This will in turn stimulate language teachers to engage in using, researching and making CALL courseware.

  2. Communicative simulation software will increase in number because of the ease of using sound and image technology with authoring languages. A CALL program named SMILE is at the research and development stage at Edith Cowan University, Perth. It uses interactive animation to immerse students in real language situations such as identifying road and shop signs, obeying screen warnings, or following a shopping list in a department store (Frick, 1993, 2).

  3. The combination of the above mentioned features, together with a mature answer processor, will make it possible to use virtual reality in CALL. For example, we could imagine that, when speech recognition is combined with interactive video, a student could see himself stopping a pedestrian in the street and asking the way in a real voice. He will get different answers in a real voice and be directed to different places, depending on whom he is asking, the appropriateness of his way of asking, the acceptability of his pronunciation, his present location and his destination.

  4. While the use of interactive video is very attractive, software with audio features only will predominate. Compared with interactive video software, interactive audio is much easier to make and more cost effective, both in producing the materials and in delivering the product to students. Most educational institutions are financially constrained, and this is especially the case with CALL, since a lot of administrators still "do not see the point in investing their institutions money in equipment which clearly belongs to the domain of 'real' subjects like math, the science and commerce" (McCarthy, 1993, 2).

  5. Software evaluation becomes essential. "The ease of the newer authoring software, particularly with multimedia and hypermedia is enticing ever expanding numbers of community based workers with wide ranging educational roles to attempt the production of computer assisted learning (CAL) ... these naive CAL producers threaten to tarnish the reputation of CAL." (Farrow, 1992, 1). This danger calls for the training of language teachers in software making and evaluation.
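The street scenario in point 3 can be read as a small decision procedure: the reply depends on the listener, the manner of asking, the pronunciation, and the two places involved. The following is only a minimal sketch under stated assumptions. The names `Request` and `choose_response`, the numeric pronunciation score, the 0.6 threshold and the reply labels are all hypothetical illustrations, not part of any existing CALL program.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One spoken request by the student (all fields are illustrative)."""
    listener: str         # whom the student is asking, e.g. "pedestrian"
    polite: bool          # whether the way of asking was appropriate
    pronunciation: float  # hypothetical recogniser score from 0.0 to 1.0
    location: str         # where the student is now
    destination: str      # where the student wants to go


def choose_response(req: Request, threshold: float = 0.6) -> str:
    """Pick a label for the video reply to play (labels are hypothetical)."""
    if req.pronunciation < threshold:
        # Pronunciation not acceptable: the listener asks for a repetition.
        return "ask_to_repeat"
    if not req.polite:
        # Understood, but the manner of asking was inappropriate.
        return "cold_reply"
    if req.location == req.destination:
        return "already_there"
    if req.listener == "police officer":
        return "detailed_directions"
    return "brief_directions"


print(choose_response(Request("pedestrian", True, 0.9, "station", "museum")))
```

A real system would replace the fixed score and labels with the output of a speech recogniser and a library of video clips; the sketch only shows how the factors named in point 3 could jointly select a reply.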

Monking: A Chinese character tutorial using IMM

The Chinese writing system, with its characters, is believed to be difficult to learn: each character has 17 strokes on average fitted into a small box, and each has its own tone. Old CALL found it a hard nut to crack. However, recent research has found it to be one of the most advanced writing systems for its terseness, clarity, productivity, etc (Xiao Xi, 1993). Above all, each character unites meaning, form and sound in one. This picto-phono-graphic nature lends itself very well to IMM's sound and image features.

The author has developed software that uses these features to teach the characters, and has received encouraging feedback. The program uses a Soundblaster card and ActionII, an authoring language that runs under Windows. The whole teaching process is organised as an adventure of the Monkey King, who tries to discover and steal treasures in Emperor Jade's Palace in Heaven. The teaching sequence is totally different from traditional ones. Instead of learning characters lesson by lesson, learners go from one character to another by exploring the character on screen with the mouse and adding one more stroke at the proper place (clicking on a hidden button in the shape of a stroke). This discovery learning reduces the memory load and builds up the ability to distinguish characters that are confusingly similar in form.
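The one-stroke-at-a-time exploration described above can be pictured as a graph in which each character links to the characters obtained by adding a single stroke. The following is a minimal sketch, not the Monking implementation: the table `ONE_STROKE_STEPS`, the function `reachable` and the particular characters chosen are illustrative assumptions, and the program's actual character set is not given here.

```python
# Hypothetical table: each character maps to characters that differ from it
# by exactly one added stroke (the "hidden buttons" a learner could find).
ONE_STROKE_STEPS = {
    "一": ["二", "十"],
    "二": ["三"],
    "十": ["土"],
    "土": ["王"],
    "王": ["玉"],
    "人": ["大"],
    "大": ["天", "太"],
    "木": ["本"],
    "日": ["旦", "目"],
}


def reachable(start):
    """Return, in discovery order, every character a learner can reach
    from `start` by adding strokes one at a time."""
    seen = []
    queue = [start]
    while queue:
        current = queue.pop(0)
        for nxt in ONE_STROKE_STEPS.get(current, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


print(reachable("人"))  # characters discoverable starting from "person"
```

Starting from 人, for example, the learner first meets 大 and then the similar-looking pair 天 and 太, which is exactly the kind of contrast that helps in distinguishing confusable forms.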

Graphics and animation play an essential part here. Each character is drawn with some part (such as the radical, which gives a hint to the meaning of the character) highlighted in a different colour. Chinese characters are basically pictographic and self explanatory. For example, the character for "tree" is a simplified picture of a tree. The character for "rest" is the radical for "man" plus the character for "tree", meaning a man leaning against a tree. Animation is used to illuminate this feature. When the icon for explanation is clicked, a man walks to the tree and takes a rest. In another example, the sun rises from the horizon, meaning "morning" or "daybreak", which explains why the character is a "sun" above the "earth", represented by a horizontal stroke. These visual illustrations greatly enhance the comprehension and memory not only of the meaning and form of each character taught but also of the way characters are formed.
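The two examples above suggest a simple record for each character: its meaning, its components with their glosses, and the animation that explains it. The sketch below is illustrative only. `CharacterEntry`, `ENTRIES` and `explain` are hypothetical names, and the animation field is a plain text description rather than a reference to any real ActionII asset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterEntry:
    """Illustrative record for one character in a tutorial."""
    character: str
    meaning: str
    components: tuple  # (component, gloss) pairs, radical first
    animation: str     # plain description of the explanatory clip


ENTRIES = {
    "休": CharacterEntry(
        "休", "rest",
        (("亻", "man"), ("木", "tree")),
        "a man walks to the tree and takes a rest",
    ),
    "旦": CharacterEntry(
        "旦", "daybreak",
        (("日", "sun"), ("一", "earth")),
        "the sun rises from the horizon",
    ),
}


def explain(character):
    """Build the text shown when the explanation icon is clicked."""
    entry = ENTRIES[character]
    parts = " + ".join(f"{comp} ({gloss})" for comp, gloss in entry.components)
    return f"{entry.character} '{entry.meaning}' = {parts}"


print(explain("休"))
```

Keeping the components as explicit pairs lets the same record drive both the colour highlighting of the radical and the choice of animation.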

Speech sound, sound effects and music also play a part in the program. The testing section consists of arcade games. For example, as soon as a character is read out, many balloons begin to drift up into the sky. The testee (now the Monkey King) should shoot the one with the right character before it is out of sight. The cheers from the crowd, the urgent music and the gunshot all make the test very exciting.


Three cheers for the daily advances in IMM technology, but problems such as high cost, the low reliability of speech recognition, and the lack of a mature natural language input processor are still waiting for the technology to overcome them. In the meantime, we language teachers, the major force in adopting IMM in CALL, should do our bit to make a new CALL a reality.


Coleman, J. (1991). Interactive multimedia. In Brierley & Kemble (Eds).

Corbel, C. & Victoria, A. (1993). The 'Third Generation': An overview of some recent CALL texts. On-Call, 7(2). 24-28.

Dunkel, P. (Ed) (1990). Computer assisted language learning and testing: Research issues and practice. Newbury House.

Farrow, P. (1992). Crystal ball gazing: Expanding CAL experience and expertise beyond the University into the community. In A future promised, Proceedings ASCILITE'92, 1-5.

Frick, W. (1993). The development of teaching materials. Paper delivered at CSAA 3rd Biennial Conference, July, 1993, Griffith University.

Garton, J. (1992). Learning how to manage text with interactive multimedia. On-Call, 7(1). 17-22.

Kenning, M. M. & Kenning, M. J. (1990). Computers and language learning: Current theory and practice. Ellis Horwood.

Lee, T. (1993). The CALL potentials of Word for Windows. On-Call, 7(3). 2-6.

Maxwell, R. (1993). Occupational health and safety with talking books. On-Call, 7(2). 29-30.

McCarthy, B. (1993). Developing CALL material for the foreign language classroom: Ideals and practicalities. On-Call, 7(2). 2-9.

Shaw, N. A. (1987). Interactive audio: A challenge or companion to interactive video? Using computers intelligently, 212-221.

Xiao, Xi (1993). The rediscovery in Chinese character system and 7 schools of teaching experiments in China. Paper delivered in CSAA 3rd Biennial Conference, July, 1993, Griffith University.

Author: Xiao Xi, Visiting Lecturer, Curriculum Department
Box 8 Hunter Building, University of Newcastle
Newcastle NSW 2308
Tel. 049 216 370

Please cite as: Xiao, Xi (1994). The impact of sound and image features of IMM on CALL. In C. McBeath and R. Atkinson (Eds), Proceedings of the Second International Interactive Multimedia Symposium, 594-596. Perth, Western Australia, 23-28 January. Promaco Conventions. http://www.aset.org.au/confs/iims/1994/qz/xiao.html

© 1994 Promaco Conventions. Reproduced by permission. Last revision: 15 Feb 2004. Editor: Roger Atkinson
Previous URL 15 Oct 2000 to 30 Sep 2002: http://cleo.murdoch.edu.au/gen/aset/confs/iims/94/qz/xiao.html