Some recent developments in interactive audio are discussed, with particular reference to the use of various delivery systems, including CD-ROM. Specific attributes of this educational technology are discussed, and enhancements through the use of photographs (and pseudo video or a form of "pixilation") are described.
Interactive audio refers to a technology involving the combined use of audio with the simultaneous (and synchronous) display of computer-generated visuals, such as text, graphics and photographs. Being just one of many alternative approaches to Computer Aided Instruction (CAI), interactive audio occupies a position on the wide spectrum of educational technologies, somewhere between conventional CAI and interactive video (using videodisc).
Unlike conventional CAI, which is silent, and videodiscs, which rely to a large extent upon the impact of high quality video, interactive audio depends mainly upon audio for conveying information. Audio can also be used to convey additional meaning, moods, and emotions through the intonations of voice and background sound effects.
These three technologies (namely conventional CAI, interactive audio, and interactive video) each have a specific role to play in the tasks of teaching and learning. Knowing the attributes of all these approaches, and selecting the one which is most appropriate for a given situation, is a challenge for all producers of computer-controlled educational programs.
This paper describes the development of interactive audio, the types of projects which are well suited to this technology, and some delivery systems suitable for interactive audio. Special reference is made to optical disc technologies, involving Compact Disc Read Only Memory (CD-ROM), and some recent projects involving the use of this technology are described.
These digitally recorded sounds were reproduced under program control, together with the simultaneous display of text and pictures on the computer monitor. Digital audio stored on magnetic disc enabled fast, reliable and "random" access to speech files and other audio files stored on the disc.
Although the hardware component was inexpensive (the voice card cost approximately $200), the audio consumed memory at the rate of 4 kbytes per second. Typical programs containing, say, 30 minutes of audio therefore consumed at least 7 megabytes of the hard disc. This space consideration, the awkwardness of transferring large programs, and the relatively poor quality of audio reproduced through the voice card were all factors which led our development team to adopt an alternative approach, namely the use of optical disc technology.
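The storage figure quoted above follows directly from the data rate. A minimal sketch of the arithmetic, assuming 1 megabyte = 1,000 kbytes (the paper does not say which convention it uses); the function name is illustrative only:

```python
def audio_storage_kbytes(minutes: float, rate_kbytes_per_sec: float = 4.0) -> float:
    """Disc space consumed by digitised audio recorded at a constant data rate."""
    return minutes * 60 * rate_kbytes_per_sec


size_kbytes = audio_storage_kbytes(30)
# Assumption: 1 megabyte = 1,000 kbytes.
print(f"30 minutes of audio: {size_kbytes:.0f} kbytes "
      f"(about {size_kbytes / 1000:.1f} megabytes)")
```

At 4 kbytes per second, 30 minutes comes to 7,200 kbytes, which agrees with the "at least 7 megabytes" quoted above.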
Authentic pronunciations of these sounds, by a native speaker of the language, are delivered to the student from a recently produced CD-ROM, while photographs of the corresponding symbols are also displayed on the monitor. The student is then asked to read the row of symbols currently displayed; these attempts are recorded through the voice card and temporarily stored in the computer's RAM. Next the "correct" basic sounds are replayed from the CD-ROM, followed immediately by a replay of the student's attempt, which is recalled from RAM and played back through the voice card.
This process can be repeated by the student until a self judged level of competency is achieved. The final student attempt is also stored uniquely on hard disc, so that a tutor can check on the student's progress at a later date.
Combining the use of sounds and pictures from a CD-ROM, with sounds recorded through a voice card, offers a new and potentially effective way of learning foreign languages.
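The practice cycle described above can be sketched as a simple loop. This is an illustration only, not the program used in the project: every callable passed in (`play_reference`, `record_attempt`, `play_attempt`, `student_is_satisfied`, `save_attempt`) is a hypothetical stand-in for the CD-ROM player, the voice card and the hard disc.

```python
from typing import Callable


def pronunciation_drill(
    play_reference: Callable[[], None],        # hypothetical: model sound from CD-ROM
    record_attempt: Callable[[], bytes],       # hypothetical: capture via voice card
    play_attempt: Callable[[bytes], None],     # hypothetical: replay via voice card
    student_is_satisfied: Callable[[], bool],  # the student's own judgement
    save_attempt: Callable[[bytes], None],     # hypothetical: write to hard disc
) -> int:
    """Repeat listen / record / compare until the student is satisfied."""
    attempts = 0
    while True:
        play_reference()            # authentic pronunciation is heard first
        attempt = record_attempt()  # the attempt is held in memory only
        attempts += 1
        play_reference()            # "correct" sounds are replayed ...
        play_attempt(attempt)       # ... followed by the student's attempt
        if student_is_satisfied():
            break
    save_attempt(attempt)           # only the final attempt is kept for the tutor
    return attempts


responses = iter([False, True])  # demo: the student is satisfied on the second try
saved = []
count = pronunciation_drill(
    play_reference=lambda: print("CD-ROM: model pronunciation"),
    record_attempt=lambda: b"student audio",
    play_attempt=lambda audio: print("Voice card: replaying student attempt"),
    student_is_satisfied=lambda: next(responses),
    save_attempt=saved.append,
)
print(f"{count} attempts; {len(saved)} recording kept for the tutor")
```

Passing the devices in as callables keeps the sketch independent of any particular voice card or disc drive.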
If even lower sampling rates are used, such as the 4 kbyte/sec used by the voice card described earlier, then at least 38 hours of audio could be stored on one disc.
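As a rough check of the 38 hour figure, the playing time can be estimated from the disc capacity and the data rate. The capacity of 550 megabytes used here is an assumption (this section does not state one), as is the convention 1 megabyte = 1,000 kbytes:

```python
ASSUMED_CD_ROM_MBYTES = 550  # assumption: capacity is not stated in the text


def hours_of_audio(capacity_mbytes: float, rate_kbytes_per_sec: float) -> float:
    """Playing time that fits on a disc at a constant audio data rate."""
    seconds = capacity_mbytes * 1000 / rate_kbytes_per_sec
    return seconds / 3600


hours = hours_of_audio(ASSUMED_CD_ROM_MBYTES, rate_kbytes_per_sec=4)
print(f"About {hours:.1f} hours of audio at 4 kbyte/sec")
```

Under these assumptions the estimate is a little over 38 hours, consistent with the figure given above.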
Text and numerical databases are the typical contents of most CD-ROMs, and these data types account for the great majority of discs commercially available. However, the latest programs produced by FIT contain a mixture of audio and a database of photographs.
The proportion of disc space allocated to each medium is simply a function of individual project requirements. For example, the CD-ROM called "A Talking Dictionary of Medical Terminology" contains mostly audio, with a companion set of 80 photographs used to enhance certain explanations within the lesson. On the other hand, a project under development for the Museum requires 8,000 photographs to be archived onto one disc, with a short audio description associated with each picture.
As in the case of digitised audio, photographs can be digitised with varying degrees of resolution, resulting in picture files which have a considerable range in memory storage requirements. For example, a photograph which has a resolution of 512 x 512 pixels on a computer monitor, and uses a maximum of 24 bits for colour and a further 8 bits for graphics overlays, would require at least one megabyte of memory. On the other hand, a photograph which has a resolution of only 256 x 200 pixels and uses 256 colours or shades of grey would produce a file size of only 51 kbyte.
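Both file sizes can be reproduced from the resolution and the number of bits stored per pixel. A minimal sketch, assuming uncompressed storage with no file header; 256 colours or shades of grey corresponds to 8 bits per pixel:

```python
def picture_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed picture, ignoring any file header."""
    return width * height * bits_per_pixel // 8


full_colour = picture_bytes(512, 512, 24 + 8)  # 24 colour bits + 8 overlay bits
low_res = picture_bytes(256, 200, 8)           # 256 colours = 8 bits per pixel

print(f"512 x 512, 32 bits per pixel: {full_colour:,} bytes")
print(f"256 x 200,  8 bits per pixel: {low_res:,} bytes")
```

The first case gives 1,048,576 bytes (one megabyte) and the second 51,200 bytes (51 kbyte), a ratio of roughly 20 to 1.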
Compression algorithms can further reduce the size of these picture files. Compression factors varying from 5 to 100 or more are possible, depending upon the complexity of the original picture and the particular algorithm used to reduce the file size. Producers should, however, be aware that accessing a picture file from CD-ROM or from hard disc requires a finite time, which is further increased if a compressed file needs to be unravelled prior to display.
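The dependence of the compression factor on picture complexity can be illustrated with a short experiment. The sketch below uses Python's standard `zlib` module purely as a stand-in; it is not the algorithm used in the projects described here, and the two synthetic "pictures" (a uniform grey field and random noise) are assumptions chosen to represent the simplest and most complex cases.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 200  # one byte per pixel, as in the 256-colour example


def compression_factor(pixels: bytes) -> float:
    """Ratio of original size to compressed size (higher means smaller file)."""
    return len(pixels) / len(zlib.compress(pixels, 9))


# Simplest possible picture: every pixel the same shade of grey.
flat = bytes([128]) * (WIDTH * HEIGHT)

# Most complex picture: every pixel an unrelated random value.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))

print(f"Uniform picture: factor {compression_factor(flat):.0f}")
print(f"Noisy picture:   factor {compression_factor(noisy):.2f}")

# The picture must be decompressed before display, which takes extra time.
assert zlib.decompress(zlib.compress(flat, 9)) == flat
```

The uniform picture shrinks dramatically while the noisy one barely compresses at all; real photographs fall somewhere between these extremes.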
During the presentation of this paper, the display of colour and black and white photographs, which use different amounts of memory, will be demonstrated. These uncompressed pictures are called from disc and displayed within one or two seconds.
A comparison between interactive audio and interactive video was presented by Shaw (1987-D) using a set of criteria involving, among other things, cost, ease of production and overall production time. Our three most recent CD-ROM projects have taken an average of approximately four weeks each to produce.
Rather than debating the merits of one technique over another, it is more appropriate to emphasise an earlier comment, namely that producers and educational technologists must decide which technique is appropriate for the particular task in hand. Matching the attributes of a technology with the project objectives must always remain a prime consideration.
The announcement of recent display systems referred to as CD-I (Compact Disc Interactive) and DVI (Digital Video Interactive) means that photographic images can now be stored on and recalled from discs other than videodiscs, with sufficient speed to satisfy the requirements of "video". Sophisticated algorithms which compress and later reconstruct the images have been developed elsewhere, and educational technologists await the introduction of these techniques.
In the meantime, the Centre for Research and Development at FIT has created pseudo video from a series of photographic images in some experimental projects.
Earlier it was stated that digitised photographs, which fill most of the available space on screen, can be displayed in about one second. It follows that a smaller section of that image (say 50 x 40 pixels) can be "refreshed" at a much faster rate, thereby raising the possibility of pseudo video or "pixilation" within part of the existing image.
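A rough estimate shows why a small region can be refreshed much faster. The sketch assumes that display time is proportional to the number of pixels transferred, that the full image is the 256 x 200 pixel photograph from the earlier example, and that fixed costs such as disc access time are ignored, so the result is an optimistic upper bound rather than a measured rate.

```python
def estimated_refresh_rate(
    region: tuple[int, int],
    full_image: tuple[int, int] = (256, 200),  # assumption: earlier example size
    full_image_seconds: float = 1.0,           # "about one second" in the text
) -> float:
    """Upper-bound refreshes per second, assuming time scales with pixel count."""
    region_pixels = region[0] * region[1]
    full_pixels = full_image[0] * full_image[1]
    return full_pixels / (region_pixels * full_image_seconds)


rate = estimated_refresh_rate((50, 40))
print(f"A 50 x 40 region could be refreshed up to about {rate:.1f} times per second")
```

Under these assumptions the 50 x 40 region contains about one twenty-fifth of the pixels, which suggests an update rate in the range normally associated with moving pictures.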
At least two possible methods of generating this video effect can be described as follows:
For example, a micrometer was photographed numerous times, with slight alterations of the position of the barrel and jaws occurring between each photograph. When operating in the replay mode, the program displays the first photograph completely, but then subsequent images simply refresh only the barrel and jaws of the micrometer. The resulting effect is the appearance of a micrometer being operated as if photographed under normal video conditions.
In another application, photographs of Japanese characters (Hiragana) are displayed on the computer monitor. Sequential photographs allow each character to be written to screen step-by-step, thereby showing the stroke order used when writing these complicated characters. In this example different (but predetermined) areas of the photographs are refreshed to create the illusion of video.
Projects in which audio is critical to learning a particular skill have been identified in medical education (especially Nurse Education) where students endeavour to learn sounds associated with particular body functions. For example, listening to and counting the foetal heart beat, and determining if the foetus is healthy, was the focus of an early program. Similarly the measurement of blood pressure has been successfully taught with the aid of an interactive audio program (Shaw and Spratling, 1987-E).
Learning any foreign language (or learning English as a second language) is a field particularly well suited to interactive audio. Authentic sounds and the corresponding symbols can be delivered to students (from the CD-ROM), and student responses can be recorded through a digital voice card. Fast response times, and the option to hear repetitions of a word or sentence, can encourage the student to practice the pronunciations of new or difficult words.
Learning to spell difficult English words, or simply words which appear phonetically similar (such as "affect" and "effect"), is another task well suited to interactive audio. Even a talking dictionary, which explains and pronounces difficult jargon, is an appropriate task for interactive audio.
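The partial refresh used in both examples can be sketched as follows. This is an illustration under stated assumptions rather than the project's display code: the screen is modelled as a list of pixel rows, and `refresh_region` and `play_sequence` are hypothetical names.

```python
Image = list[list[int]]  # rows of pixel values


def refresh_region(screen: Image, patch: Image, top: int, left: int) -> None:
    """Overwrite one rectangular area of the screen in place."""
    for row_offset, patch_row in enumerate(patch):
        row = screen[top + row_offset]
        row[left:left + len(patch_row)] = patch_row


def play_sequence(screen: Image, steps: list[tuple[int, int, Image]]) -> None:
    """Apply each small patch in turn; the rest of the picture is left untouched."""
    for top, left, patch in steps:
        refresh_region(screen, patch, top, left)


# The first photograph is displayed completely (here: an 8 x 6 blank image).
screen = [[0] * 8 for _ in range(6)]

# Later frames touch only predetermined areas. The first two steps refresh the
# same area (as with the micrometer); the third refreshes a different area
# (as with successive strokes of a character).
steps = [
    (2, 4, [[1, 1, 1], [1, 1, 1]]),
    (2, 4, [[2, 2, 2], [2, 2, 2]]),
    (0, 0, [[7, 7]]),
]
play_sequence(screen, steps)

for row in screen:
    print(row)
```

Only the pixels inside each patch are rewritten, which is why each update needs far less data than a full picture.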
Shaw, N. A. (1987-B). Interactive Audio as a Tool for Learning. Conference for Australian Tertiary Language and Learning Skills, Perth.
Shaw, N. A. and Spratling, M. (1987-C). Using Interactive Audio in Medical Education. Conference of Australian and New Zealand Association of Medical Educators (ANZAME), Perth.
Shaw, N. A. (1987-D). Interactive Audio - A Challenge or Companion of Interactive Video? Conference for the Australian Society for Computer Aided Learning (ASCILITE), Sydney.
Shaw, N. A. (1988-E). Using Interactive Audio in Speech Rehabilitation. Conference for the Australian Association of Speech and Hearing, Brisbane.
Shaw, N. A. (1988-F). The Talking Dictionary of Medical Terminology. Conference for Australian and New Zealand Medical Educators (ANZAME), Adelaide.
Shaw, N. A. (1988-G). CD-ROM in Education and Training. Biennial Conference of the Library Association of Australia (LAA) and the International Federation of Library Associations (IFLA).
Please cite as: Shaw, N. (1988). Interactive audio in education and training: Applications of CD-ROM. In J. Steele and J. G. Hedberg (Eds), Designing for Learning in Industry and Education, 31-36. Proceedings of EdTech'88. Canberra: AJET Publications. http://www.aset.org.au/confs/edtech88/shaw.html