Different Amplitudes Of Speech And Other Sounds

Basis of Processing Sound Strategies
Introduction to Coding Strategies: D.J. Allum
Coding strategies define the way in which acoustic sounds in our world are transformed into electrical signals that we can understand in our brain. The normal-hearing person already has a way to code acoustic sounds when the inner ear (cochlea) is functioning. The cochlea is the sensory organ that transforms acoustic signals into electrical signals. However, a deaf person does not have a functioning cochlea. The cochlear implant takes over its function.

Technically, it is relatively easy to send electrical current through implanted electrodes. The more difficult part is to make the electrical signals carry the appropriate information about speech and other sounds. This responsibility falls to coding strategies. The more efficient the coding strategy, the greater the chance that the brain will interpret the information as meaningful. Without meaning, sound is only unwanted noise.

Some basic vocabulary is useful in understanding coding strategies: Frequency. Speech is composed of a range of frequencies, from high-frequency sounds (s, p) to low-frequency sounds (ah). These frequencies also occur for sounds in our environment. The speech-frequency range is from about 250 to 6,000 hertz (Hz). Amplitude. The amplitude, or intensity, of a sound determines how loud it is heard.

The usual range from the softest to the loudest speech sound is about 30 dB. The normal range for human hearing is around 120 dB. Tonotopic organization. A special characteristic of the cochlea and the auditory nerve: the apical region of the cochlea (and the nerve near this region) is more sensitive to low frequencies, and the basal region is more sensitive to high frequencies. From the most basal to the most apical region there is a progression from high- to low-frequency sensitivity.

Filters. Filters are used to divide acoustic signals electronically into different frequency ranges. For instance, for a speech-frequency range of 4,000 Hz, we could divide the total range into 10 filters, and each filter would hold 400 Hz (see the sketch after this paragraph). Stimulation Rate. The number of times an electrode is turned on and off, i.e., activated with electrical stimulation.
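As a minimal sketch of this division, assuming equal-width bands (real implant filter banks are more sophisticated), the band edges can be computed directly:

```python
# A minimal sketch of splitting a speech-frequency range into equal
# filter bands, using the essay's example values (10 filters over a
# 4,000 Hz range). Real implant filters are often nonuniform.

def filter_bands(low_hz: float, high_hz: float, n_filters: int) -> list[tuple[float, float]]:
    """Return (low, high) edges for n equal-width frequency bands."""
    width = (high_hz - low_hz) / n_filters
    return [(low_hz + i * width, low_hz + (i + 1) * width) for i in range(n_filters)]

# Each of the 10 filters holds 400 Hz.
for band in filter_bands(0.0, 4000.0, 10):
    print(band)
```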

The normal cochlea is like a series of filters. Sounds that have high frequencies will fall into filters at the basal end of the cochlea, and those with low frequencies will fall into filters at the apical end, i.e., in a tonotopic arrangement. Since the cochlea cannot accomplish this for a deaf person, the cochlear implant takes its place. It is important to remember that the auditory nerve is still intact even if the cochlea cannot transmit information because of deafness. The auditory nerve lies in wait for stimulation to arrive at a certain place in the cochlea. Thus, a series of electrodes is placed in the cochlea. Each electrode is associated with a place in the cochlea (basal or apical) and with a filter (high-frequency to low-frequency). This is how the auditory nerve receives information. A sketch of this place mapping follows.
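Here is a hypothetical sketch of that tonotopic place mapping; the band edges are the illustrative 400 Hz-wide filters from above, not any real device's frequency map:

```python
# A hypothetical sketch of the tonotopic place mapping described
# above: electrode 0 sits at the basal (high-frequency) end and
# electrode 9 at the apical (low-frequency) end.

# Bands ordered from basal (highest frequencies) to apical (lowest).
BANDS = [(4000.0 - (i + 1) * 400.0, 4000.0 - i * 400.0) for i in range(10)]

def electrode_for_frequency(freq_hz: float) -> int:
    """Return the index of the electrode whose filter band contains freq_hz."""
    for electrode, (low, high) in enumerate(BANDS):
        if low <= freq_hz < high:
            return electrode
    raise ValueError("frequency outside the implant's range")

print(electrode_for_frequency(3500.0))  # 1: near the basal end
print(electrode_for_frequency(250.0))   # 9: the most apical electrode
```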

Because speech is composed of different frequencies and is, therefore, normally analyzed in different parts of the cochlea (in tonotopic order), a coding strategy needs to divide speech electronically into different frequency bands and then send the information to different places along the cochlea. The normal cochlea is also an amplitude analyzer. The relative amplitude (loudness) of sound is very important. As mentioned earlier, the s-sound is always softer than ah.

If we were to change that relationship in a word, it would no longer be the same word. The speech coding strategy must therefore be able to analyze the different amplitudes of speech and other sounds. The next step is to send the information to the brain. This is accomplished by the firing of a nerve, or of a group of nerves working together.

The cochlear implant activates the nerve with its stimulation rate. It is possible to stimulate the nerve so that it fires with every stimulation, or to over-drive the nerve so that it is forced to share the information with a group of nerves. While one is resting, the other fires, and so forth; in this way the group begins to respond as the normal-hearing ear does.

Let us summarize what is needed from a coding strategy:

Analysis
- Divide speech into different frequency bands defined by filters
- Determine the amplitude relationships of the sounds within the filters (i.e., /s/ will always be softer than /ah/)

Transform
- Define where to send stimulation (location)
- Define how often to send stimulation (stimulation rate)
- Define how much to send in order to preserve the amplitude relationship

Coding strategies are usually named for the type of analysis that is made; for instance, feature extraction, spectral peak extraction, compressed analog and continuous interleaved sampling.

Feature Extraction: A speech feature is a specific acoustic characteristic of a sound. The combination of features makes one sound different from another.

An example is the difference between the phonemes /t/ (tie) and /d/ (die). The only significant difference between the two is the vibration of the vocal folds. This single feature is called voicing, but there are many other features. If just the right number of features that distinguish different sounds from one another could be selected, it should be possible to code speech using those features alone. The theory of feature extraction proposes to send only the most important features of speech, leaving out other information that may not be as useful for understanding (a rough sketch of detecting the voicing feature follows this paragraph). In this way, a non-functioning auditory system will not be overburdened.
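As a rough illustration of how one such feature might be detected, here is a classic zero-crossing heuristic for voicing. This is not the method of any particular implant processor, and the threshold is an arbitrary assumption:

```python
import math

# Voiced sounds like /d/ (vocal folds vibrating) cross zero far less
# often than voiceless ones like /t/. The threshold is illustrative.

def zero_crossings(samples: list[float]) -> int:
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))

def looks_voiced(samples: list[float], threshold: int = 50) -> bool:
    """Guess whether a short frame of audio is voiced."""
    return zero_crossings(samples) < threshold

# A 120 Hz tone (sampled at 8 kHz) stands in for a voiced frame.
voiced_frame = [math.sin(2 * math.pi * 120 * t / 8000) for t in range(400)]
print(looks_voiced(voiced_frame))  # True: few zero crossings
```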

Furthermore, the electronics for such a system are not as complicated as those required for processing all of the speech information. This type of coding was quite successful for a number of years, until technological advances made it possible to process (analyze) speech more rapidly and more efficiently. In an extraction system, not all of the electrodes available for stimulation are activated at any given point in time. Instead, a group of electrodes is selected that is associated with particular features.

If there are 16 electrodes, four would be selected, but it might be a different set of four at any given time (a roving four). Compressed Analog: An analog signal is like what we are accustomed to hearing over the radio. The radio takes a signal and tries to reproduce it faithfully. If we assume that speech occurs like a series of sound pictures, an analog representation is like a picture of the sound. This means that a set of filters separates speech into different bands, preserving the amplitude relationships, and sends all the information to the cochlear electrodes. The major feature of analog stimulation is that all electrodes are activated simultaneously.

It is the only strategy that uses this type of transmission. All others rapidly stimulate one electrode at a time. Several problems exist with this technology, mainly in regard to uncontrolled interaction between the electrical fields from the electrodes. If two electrodes are stimulated at the same moment and are next to each other, they will affect one another but in an unknown way. The amplitudes mix or the frequency information overlaps, and these interactions are not predictable because they may differ from moment to moment. Still, this strategy was considered the most promising because it best simulates normal hearing.

The 'compressed' part of the strategy refers to fitting the normal range of speech amplitudes into the electrical range, which is much smaller (a sketch of such a compressive mapping follows this paragraph). Also, because it must send stimulation simultaneously, the implant works very hard and needs a lot of power. This means either larger batteries or a short battery life for the speech processor.
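A minimal sketch of such a compressive mapping, assuming a 120 dB acoustic range and illustrative electrical threshold (T) and comfort (C) levels; the logarithmic shape and the T/C values are assumptions, not a device's actual fitting:

```python
import math

# Squeeze a ~120 dB acoustic range into the much narrower electrical
# range between a listener's threshold (T) and comfort (C) levels.

def compress(level_db: float, t_level: float = 100.0, c_level: float = 200.0) -> float:
    """Map an acoustic level (0-120 dB) onto the electrical T-C range."""
    fraction = min(max(level_db / 120.0, 0.0), 1.0)
    shaped = math.log1p(9.0 * fraction) / math.log(10.0)  # log-like growth
    return t_level + shaped * (c_level - t_level)

print(compress(30.0))   # a soft sound already uses about half the range
print(compress(120.0))  # the loudest sound maps exactly to the C level
```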

Maximum Spectral Peak Extraction: This is a much more liberal type of extraction. As with all strategies, speech is divided among a series of filters that are associated with electrodes in the cochlea. The strategy analyzes which filters have the greatest amount of energy (are loudest), selects a subset of filters (always fewer than the total number of electrodes in the cochlea) and sends a signal to one electrode at a time. In the implementation known as SPEAK, 3-10 electrodes are activated at any moment in time, with a stimulation rate that averages about 250 pulses per second. In the method called n-of-m, the number of filters selected (n) is less than the number available (m); that is where it derives its name. The major differences are that the number of channels is fixed, i.e., it does not vary between 3 and 10, and that the stimulation rate is very rapid (greater than 2,000 pulses per second per channel). A sketch of this peak-picking selection follows.
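Here is a sketch of that peak-picking selection; the band energies are invented values for illustration:

```python
# Keep only the n filter bands with the greatest energy and
# stimulate their electrodes, as in SPEAK and n-of-m.

def pick_maxima(band_energies: list[float], n: int) -> list[int]:
    """Return the indices of the n loudest filter bands, in band order."""
    ranked = sorted(range(len(band_energies)),
                    key=lambda i: band_energies[i], reverse=True)
    return sorted(ranked[:n])

# 20 filter bands, as in SPEAK; select the 6 strongest.
energies = [0.1, 0.8, 0.2, 0.9, 0.05, 0.4, 0.7, 0.3, 0.6, 0.2,
            0.1, 0.5, 0.1, 0.2, 0.3, 0.1, 0.6, 0.2, 0.1, 0.05]
print(pick_maxima(energies, 6))  # -> [1, 3, 6, 8, 11, 16]
```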

Continuous Interleaved Sampling: This strategy, the newest, stimulates all electrodes (rather than selecting some, as with feature extraction or n-of-m) one after the other (not simultaneously) and sends all of the information. Its major characteristics are the speed at which information is transmitted (a rapid stimulation rate) and its use of amplitude detection to build the sound images. The slowest rate might be approximately 850 pulses per second. There is a trend, however, to think that faster is better. This is because the normal cochlea receives and transmits speech information very rapidly. One can imagine that it builds its sound pictures dot by dot. The more dots, the clearer the image.

Similarly, the cochlear implant builds its sound images with its stimulation rate. In theory, the more activations of the auditory nerve, the more information reaches the brain. However, this must happen within a very short period of time so that the brain can put the information together as one image. When a video player sends its picture frames too slowly, the sense of movement is soon lost; if auditory signals are sent too slowly, the sound image may likewise be incomplete. However, this idea of speed is relatively new, and no one is sure what 'fast' really means. A sketch of an interleaved pulse schedule follows.
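A minimal sketch of such an interleaved schedule, using the essay's example rate of 850 pulses per second per channel and an assumed 8-electrode array (not a specific device's settings):

```python
# Electrodes fire strictly one after another, never simultaneously,
# cycling fast enough that each channel reaches its per-channel rate.

def pulse_schedule(n_electrodes: int, rate_per_channel: float, duration_s: float):
    """Yield (time_s, electrode) pairs for a round-robin pulse sequence."""
    total_rate = n_electrodes * rate_per_channel  # pulses/s across all channels
    interval = 1.0 / total_rate                   # gap between successive pulses
    for k in range(int(duration_s * total_rate)):
        yield k * interval, k % n_electrodes      # one electrode at a time

# 8 electrodes at 850 pulses per second each, over the first millisecond.
for t, e in pulse_schedule(8, 850.0, 0.001):
    print(f"t = {t * 1000:.3f} ms -> electrode {e}")
```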

In summary, speech coding strategies perform one of the most important functions of a cochlear implant. Without effective analysis of acoustic signals and effective, efficient transformation of the analyzed signals into electrical information, cochlear implant users would not be able to understand speech.

SPEAK: The First High-Performing Speech Coding Strategy
SPEAK conveys sound information by stimulating many different sites along the cochlea, providing a rich, detailed representation of sound. Different electrodes are stimulated to match the variety of pitches in the incoming sound. Here's how SPEAK works: Sound enters the speech processor through the microphone and is divided into twenty frequency bands. SPEAK selects the 6 to 10 frequency bands with the most information. Each selected frequency band stimulates a specific electrode along the electrode array.

The electrode stimulated depends on the pitch of the sound. For example, in the word 'show', the high-pitched sound (sh) causes stimulation of electrodes placed near the entrance of the cochlea, where the hearing nerve fibers respond to high-pitched sounds. The low-pitched sound (ow) stimulates electrodes further into the cochlea, where the hearing nerve fibers respond to low-pitched sounds. SPEAK's dynamic stimulation along 20 electrodes assists users by providing this rich, detailed picture of sound; a toy illustration follows.
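To close, here is a toy illustration of the 'show' example, with invented band energies for 'sh' and 'ow' run through the same kind of peak picking sketched earlier (band 0 is the lowest-frequency filter, band 19 the highest):

```python
# Invented energies for the 'sh' and 'ow' portions of 'show', fed
# through illustrative peak picking over SPEAK's 20 bands. High
# bands map to electrodes near the cochlea's entrance (basal end).

def loudest_bands(energies: list[float], n: int) -> list[int]:
    """Return the indices of the n loudest bands, in band order."""
    return sorted(sorted(range(len(energies)), key=lambda i: energies[i])[-n:])

sh = [0.0] * 14 + [0.5, 0.7, 0.9, 0.8, 0.6, 0.4]   # energy in the high bands
ow = [0.8, 0.9, 0.7, 0.5, 0.3, 0.2] + [0.0] * 14   # energy in the low bands

print(loudest_bands(sh, 6))  # [14..19]: electrodes near the entrance (basal)
print(loudest_bands(ow, 6))  # [0..5]: electrodes deeper in the cochlea (apical)
```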