Similar Documents
20 similar documents found.
1.
Traditional word-recognition tests typically use phonetically balanced (PB) word lists produced by one talker at one speaking rate. Intelligibility measures based on these tests may not adequately evaluate the perceptual processes used to perceive speech under more natural listening conditions involving many sources of stimulus variability. The purpose of this study was to examine the influence of stimulus variability and lexical difficulty on the speech-perception abilities of 17 adults with mild-to-moderate hearing loss. The effects of stimulus variability were studied by comparing word-identification performance in single-talker versus multiple-talker conditions and at different speaking rates. Lexical difficulty was assessed by comparing recognition of "easy" words (i.e., words that occur frequently and have few phonemically similar neighbors) with "hard" words (i.e., words that occur infrequently and have many similar neighbors). Subjects also completed a 20-item questionnaire to rate their speech understanding abilities in daily listening situations. Both sources of stimulus variability produced significant effects on speech intelligibility. Identification scores were poorer in the multiple-talker condition than in the single-talker condition, and word-recognition performance decreased as speaking rate increased. Lexical effects on speech intelligibility were also observed. Word-recognition performance was significantly higher for lexically easy words than lexically hard words. Finally, word-recognition performance was correlated with scores on the self-report questionnaire rating speech understanding under natural listening conditions. The pattern of results suggests that perceptually robust speech-discrimination tests are able to assess several underlying aspects of speech perception in the laboratory and clinic that appear to generalize to conditions encountered in natural listening situations where the listener is faced with many different sources of stimulus variability. That is, word-recognition performance measured under conditions where the talker varied from trial to trial was better correlated with self-reports of listening ability than was performance in a single-talker condition where variability was constrained.
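
The "easy"/"hard" classification above rests on word frequency and phonological neighborhood density, where a neighbor is conventionally a word differing by a single phoneme substitution, insertion, or deletion. A minimal sketch of that computation, assuming a toy orthographic lexicon and made-up frequency counts and cutoffs (real classifications use phonemic transcriptions and corpus norms):

```python
# Sketch of the "easy"/"hard" classification: a neighbor is one substitution,
# insertion, or deletion away (orthography stands in for phonemes here).
# The lexicon, frequencies, and cutoffs are made-up placeholders.

def is_neighbor(a, b):
    """True if a and b differ by exactly one substitution, insertion,
    or deletion."""
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(short == long_[:i] + long_[i + 1:] for i in range(len(long_)))

def classify(word, lexicon, freq, freq_cutoff=100, density_cutoff=10):
    """'easy' = frequent with few neighbors; 'hard' = rare with many."""
    density = sum(is_neighbor(word, w) for w in lexicon if w != word)
    if freq[word] >= freq_cutoff and density < density_cutoff:
        return "easy"
    if freq[word] < freq_cutoff and density >= density_cutoff:
        return "hard"
    return "intermediate"

lexicon = ["cat", "bat", "rat", "cast", "dog"]
freq = {"cat": 320, "bat": 12, "rat": 25, "cast": 40, "dog": 280}
print(classify("cat", lexicon, freq))  # "easy" in this toy lexicon
```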

2.
Two talkers' productions of the same phoneme may be quite different acoustically, whereas their productions of different speech sounds may be virtually identical. Despite this lack of invariance in the relationship between the speech signal and linguistic categories, listeners experience phonetic constancy across a wide range of talkers, speaking styles, linguistic contexts, and acoustic environments. The authors present evidence that perceptual sensitivity to talker variability involves an active cognitive mechanism: Listeners expecting to hear 2 different talkers differing only slightly in average pitch showed performance costs typical of adjusting to talker variability, whereas listeners hearing the same materials but expecting a single talker or given no special instructions did not show these performance costs. The authors discuss the implications for understanding phonetic constancy despite variability between talkers (and other sources of variability) and for theories of speech perception. The results provide further evidence for active, controlled processing in real-time speech perception and are consistent with a model of talker normalization that involves contextual tuning.

3.
Research has shown that speech articulated in a clear manner is easier to understand than conversationally spoken speech in both the auditory-only (A-only) and auditory-visual (AV) domains. Because this research has been conducted using younger adults, it is unknown whether age-related changes in auditory and/or visual processing affect older adults' ability to benefit when a talker speaks clearly. The present study examined how speaking mode (clear vs conversational) and presentation mode (A-only vs AV) influenced nonsense sentence recognition by older listeners. Results showed that neither age nor hearing loss limited the amount of benefit that older adults obtained from a talker speaking clearly. However, age was inversely correlated with identification of AV (but not A-only) conversational speech, even when pure-tone thresholds were controlled statistically.
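
Statistically controlling for pure-tone thresholds, as in the final analysis above, is commonly done via partial correlation. A minimal sketch using the residual method; the per-listener arrays and the single-covariate setup are illustrative assumptions, not the study's reported analysis:

```python
import numpy as np

def partial_corr(x, y, covariate):
    """Correlation between x and y with the covariate regressed out of
    both (residual method)."""
    def residuals(v, c):
        design = np.column_stack([np.ones_like(c), c])
        beta, *_ = np.linalg.lstsq(design, v, rcond=None)
        return v - design @ beta
    rx = residuals(x, covariate)
    ry = residuals(y, covariate)
    return np.corrcoef(rx, ry)[0, 1]

# Hypothetical per-listener data: age, AV conversational-speech scores,
# and pure-tone averages (dB HL).
age = np.array([62.0, 68.0, 71.0, 75.0, 80.0, 83.0])
av_score = np.array([88.0, 84.0, 79.0, 74.0, 66.0, 60.0])
pta = np.array([15.0, 22.0, 25.0, 30.0, 38.0, 41.0])
print(partial_corr(age, av_score, pta))
```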

4.
This study investigated the perceptual adjustments that occur when listeners recognize highly compressed speech. In Experiment 1, adjustment was examined as a function of the amount of exposure to compressed speech by use of 2 different speakers and compression rates. The results demonstrated that adjustment takes place over a number of sentences, depending on the compression rate. Lower compression rates required less experience before full adjustment occurred. In Experiment 2, the impact of an abrupt change in talker characteristics was investigated; in Experiment 3, the impact of an abrupt change in compression rate was studied. The results of these 2 experiments indicated that sudden changes in talker characteristics or compression rate had little impact on the adjustment process. The findings are discussed with respect to the level of speech processing at which such adjustment might occur.

5.
It is widely assumed that the proper transformation of acoustic amplitude to electric amplitude is a critical factor affecting speech recognition in cochlear implant users and normal-hearing listeners. A four-channel noise-band speech processor was implemented, reducing spectral information to four bands. A power-law transformation was applied to the amplitude mapping stage in the speech processor design, and the exponent of the power function varied from strongly compressive (p = 0.05) to weakly compressive (p = 0.75) for implant listeners and from 0.3 to 3.0 for acoustic listeners. Results for implants showed that the best performance was achieved with an exponent of about 0.2, and performance gradually deteriorated when either more compressive or less compressive exponents were applied. The loudness growth functions of the four activated electrodes in each subject were measured and those data were well fit by a power function with a mean exponent of 2.72. The results indicated that the best performance was achieved when normal loudness growth was restored. For acoustic listeners, results were similar to those observed with cochlear implant listeners, except that the best performance was achieved with no amplitude nonlinearity (p = 1.0). The similarity of results in both acoustic and electric stimulation indicated that the performance deterioration observed for extreme nonlinearity was due to similar perceptual effects. The function relating amplitude mapping exponent and performance was relatively flat, indicating that phoneme recognition was only mildly affected by amplitude nonlinearity.
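
The power-law amplitude mapping described above can be sketched directly. A minimal illustration, assuming a normalized envelope input and a generic output range rather than an actual implant's electric dynamic range:

```python
import numpy as np

def power_law_map(envelope, p, out_min=0.0, out_max=1.0):
    """Map a normalized envelope (0..1) to the output range via a power
    law: compressive for p < 1, linear at p = 1, expansive for p > 1.
    The output range here is generic, not a real electric dynamic range."""
    e = np.clip(envelope, 0.0, 1.0)
    return out_min + (out_max - out_min) * e ** p

env = np.array([0.1, 0.5, 0.9])
for p in (0.05, 0.2, 0.75, 1.0, 3.0):  # exponents spanning the conditions
    print(p, power_law_map(env, p).round(3))
```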

6.
Variability in talker identity and speaking rate, commonly referred to as indexical variation, has demonstrable effects on the speed and accuracy of spoken word recognition. The present study examines the time course of indexical specificity effects to evaluate the hypothesis that such effects occur relatively late in the perceptual processing of spoken words. In 3 long-term repetition priming experiments, the authors examined reaction times to targets that were primed by stimuli that matched or mismatched on the indexical variable of interest (either talker identity or speaking rate). Each experiment was designed to manipulate the speed with which participants processed the stimuli. The results demonstrate that indexical variability affects participants' perception of spoken words only when processing is relatively slow and effortful.

7.
The contribution of reduced speaking rate to the intelligibility of "clear" speech (Picheny, Durlach, & Braida, 1985) was evaluated by adjusting the durations of speech segments (a) via nonuniform signal time-scaling, (b) by deleting and inserting pauses, and (c) by eliciting materials from a professional speaker at a wide range of speaking rates. Key words in clearly spoken nonsense sentences were substantially more intelligible than those spoken conversationally (15 points) when presented in quiet for listeners with sensorineural impairments and when presented in a noise background to listeners with normal hearing. Repeated presentation of conversational materials also improved scores (6 points). However, degradations introduced by segment-by-segment time-scaling rendered this time-scaling technique problematic as a means of converting speaking styles. Scores for key words excised from these materials and presented in isolation generally exhibited the same trends as in sentence contexts. Manipulation of pause structure reduced scores both when additional pauses were introduced into conversational sentences and when pauses were deleted from clear sentences. Key-word scores for materials produced by a professional talker were inversely correlated with speaking rate, but conversational rate scores did not approach those of clear speech for other talkers. In all experiments, listeners with normal hearing exposed to flat-spectrum background noise performed similarly to listeners with hearing loss.
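
Duration manipulation of this kind can be approximated with a phase-vocoder routine that changes duration without shifting pitch. A minimal sketch using librosa; the file names are placeholders, and the study's nonuniform segment-by-segment scaling and pause editing are not reproduced here, only the simpler uniform case:

```python
# Uniform time-scaling with librosa's phase-vocoder time_stretch (duration
# changes, pitch does not). File names are placeholders.
import librosa
import soundfile as sf

y, sr = librosa.load("conversational_sentence.wav", sr=None)
slowed = librosa.effects.time_stretch(y, rate=0.7)    # ~43% longer
sped_up = librosa.effects.time_stretch(y, rate=1.4)   # ~29% shorter
sf.write("slowed.wav", slowed, sr)
sf.write("sped_up.wav", sped_up, sr)
```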

8.
Topographic brain mapping was used to investigate the ability of young and elderly female listeners to attend to tones at one ear in the presence of speech competition at the opposite ear. An oddball stimulus presentation paradigm was used to record the N1, P2, and P300 components of the late auditory evoked potential from 19 scalp locations. With speech competition, elderly listeners exhibited significantly larger reductions in N1 amplitude than did young listeners. This suggests that N1 may provide an electrophysiologic index of age-related breakdowns in processing sounds in the presence of background competition. An unexpected difference was also found between young and elderly listeners in P300 scalp topography. While the young listeners' P300 response was centered at midline for both left and right ear stimulation, the elderly participants had P300 maxima centered in the parietal area of the hemisphere located contralateral to the test ear. This suggests that some of the functional properties (e.g., timing, strength, orientation) of the P300 neural generators may change with age or, alternatively, that different generators may be operative in elderly listeners.
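
The N1, P2, and P300 measures above come from averaging many stimulus-locked EEG epochs. A minimal sketch of that averaging step, assuming a continuous multichannel recording and known stimulus onsets (event detection and artifact rejection omitted); the data here are synthetic noise for illustration:

```python
import numpy as np

def erp_average(eeg, sr, event_samples, t_min=-0.1, t_max=0.6):
    """Average stimulus-locked epochs from a continuous recording.
    eeg: (n_channels, n_samples); event_samples: stimulus onsets (samples),
    assumed to lie away from the recording edges. Components such as N1
    (~100 ms) are then read off the averaged waveform per channel."""
    pre, post = int(-t_min * sr), int(t_max * sr)
    epochs = np.stack([eeg[:, s - pre:s + post] for s in event_samples])
    baseline = epochs[:, :, :pre].mean(axis=2, keepdims=True)
    return (epochs - baseline).mean(axis=0)  # (n_channels, pre + post)

sr = 500
eeg = np.random.default_rng(0).standard_normal((19, 60 * sr))  # 19 channels
events = np.arange(2 * sr, 58 * sr, sr)                        # one onset/s
erp = erp_average(eeg, sr, events)
```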

9.
Speech remains intelligible despite the elimination of canonical acoustic correlates of phonemes from the spectrum. A portion of this perceptual flexibility can be attributed to modulation sensitivity in the auditory-to-phonetic projection, although signal-independent properties of lexical neighborhoods also affect intelligibility in utterances composed of words. Three tests were conducted to estimate the effects of exposure to natural and sine-wave samples of speech in this kind of perceptual versatility. First, sine-wave versions of the easy and hard word sets were created, modeled on the speech samples of a single talker. The performance difference in recognition of easy and hard words was used to index the perceptual reliance on signal-independent properties of lexical contrasts. Second, several kinds of exposure produced familiarity with an aspect of sine-wave speech: (a) sine-wave sentences modeled on the same talker; (b) sine-wave sentences modeled on a different talker, to create familiarity with a sine-wave carrier; and (c) natural sentences spoken by the same talker, to create familiarity with the idiolect expressed in the sine-wave words. Recognition performance with both easy and hard sine-wave words improved after exposure only to sine-wave sentences modeled on the same talker. Third, a control test showed that signal-independent uncertainty is a plausible cause of differences in recognition of easy and hard sine-wave words. The conditions of beneficial exposure reveal the specificity of attention underlying versatility in speech perception.
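
Sine-wave speech replaces a natural utterance with a few time-varying sinusoids that follow its formant tracks. A minimal synthesis sketch, assuming formant frequency and amplitude tracks are already available from analysis of a natural recording (the toy tracks below are steady tones, not real formant contours):

```python
import numpy as np

def sinewave_speech(formant_freqs, formant_amps, sr=16000):
    """Sum one time-varying sinusoid per formant track. Inputs have shape
    (n_formants, n_samples): instantaneous frequency (Hz) and amplitude.
    Real replicas derive these tracks from formant analysis of a natural
    utterance; here they are assumed inputs."""
    phase = 2 * np.pi * np.cumsum(formant_freqs, axis=1) / sr
    return (formant_amps * np.sin(phase)).sum(axis=0)

# Toy example: a steady three-tone replica (500, 1500, 2500 Hz) for 0.5 s.
n = 8000
freqs = np.repeat([[500.0], [1500.0], [2500.0]], n, axis=1)
amps = np.repeat([[1.0], [0.5], [0.25]], n, axis=1)
signal = sinewave_speech(freqs, amps)
```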

10.
The verbal transformation effect (VTE) is a perceptual phenomenon in which listeners report hearing illusory utterances when a spoken word is rapidly repeated for an extended period of time. The cause of the illusion was investigated by identifying regularities across the transformations that listeners reported and then testing hypotheses about the cause of those regularities. Variants of the standard transformation paradigm were used across 3 experiments to demonstrate that perceptual regrouping of the elements in the repeating utterance is 1 cause of the VTE. Findings also suggest that regrouping is influenced by whether the stimulus is perceived as speech or as nonspeech.

11.
Young normal-hearing listeners and young-elderly listeners between 55 and 65 years of age, ranging from near-normal hearing to moderate hearing loss, were compared using different speech recognition tasks (consonant recognition in quiet and in noise, and time-compressed sentences) and working memory tasks (serial word recall and digit ordering). The results showed that the group of young-elderly listeners performed worse on both the speech recognition and working memory tasks than the young listeners. However, when pure-tone audiometric thresholds were used as a covariate, the significant differences between groups disappeared. These results support the hypothesis that sensory decline in young-elderly listeners seems to be an important factor in explaining the decrease in speech processing and working memory capacity observed at these ages.
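
The covariate analysis described above amounts to comparing a group-only model with one that adds pure-tone thresholds. A minimal sketch using statsmodels with synthetic data constructed so that the score depends on thresholds rather than group membership, so the group effect vanishes once the covariate is added; all values and variable names are fabricated for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 40
group = np.repeat(["young", "young_elderly"], n // 2)
pta = np.where(group == "young", rng.normal(8, 3, n), rng.normal(22, 6, n))
score = 95 - 0.8 * pta + rng.normal(0, 3, n)  # score depends on PTA only

df = pd.DataFrame({"score": score, "group": group, "pta": pta})
m1 = smf.ols("score ~ C(group)", data=df).fit()        # group effect present
m2 = smf.ols("score ~ C(group) + pta", data=df).fit()  # gone with covariate
print(m1.pvalues, m2.pvalues, sep="\n")
```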

12.
A series of experiments was conducted to determine if linguistic representations accessed during reading include auditory imagery for characteristics of a talker's voice. In 3 experiments, participants were familiarized with two talkers during a brief prerecorded conversation. One talker spoke at a fast speaking rate, and one spoke at a slow speaking rate. Each talker was identified by name. At test, participants were asked to either read aloud (Experiment 1) or silently (Experiments 1, 2, and 3) a passage that they were told was written by either the fast or the slow talker. Reading times, both silent and aloud, were significantly slower when participants thought they were reading a passage written by the slow talker than when reading a passage written by the fast talker. Reading times differed as a function of passage author more for difficult than for easy texts, and individual differences in general auditory imagery ability were related to reading times. These results suggest that readers engage in a type of auditory imagery while reading that preserves the perceptual details of an author's voice.

13.
Among the contextual factors known to play a role in segmental perception are the rate at which the speech was produced and the lexical status of the item, that is, whether it is a meaningful word of the language. In a series of experiments on the word-initial /b/-/p/ voicing distinction, we investigated the conditions under which these factors operate during speech processing. The results indicated that under instructions of speeded responding, listeners could, on some trials, ignore some later occurring contextual information within the word that specified rate and lexical status. Importantly, however, they could not ignore speaking rate entirely. Although they could base their decision on only the early portion of the word, when doing so they treated the word as if it were physically short, that is, as if there were no later occurring information specifying a slower rate. This suggests that listeners always take account of rate when identifying the voicing value of a consonant, but precisely which information within the word is used to specify rate can vary with task demands.

14.
Humans are remarkably adept at identifying individuals by the sound of their voice, a behavior supported by the nervous system’s ability to integrate information from voice and speech perception. Talker-identification abilities are significantly impaired when listeners are unfamiliar with the language being spoken. Recent behavioral studies describing the language-familiarity effect implicate functionally integrated neural systems for speech and voice perception, yet specific neuroscientific evidence demonstrating the basis for such integration has not yet been shown. Listeners in the present study learned to identify voices speaking a familiar (native) or unfamiliar (foreign) language. The talker-identification performance of neural circuitry in each cerebral hemisphere was assessed using dichotic listening. To determine the relative contribution of circuitry in each hemisphere to ecological (binaural) talker identification abilities, we compared the predictive capacity of dichotic performance on binaural performance across languages. Listeners’ right-ear (left hemisphere) performance was a better predictor of binaural accuracy in their native language than a foreign one. This enhanced role of the classically language-dominant left hemisphere in listeners’ native language demonstrates functionally integrated neural systems for speech and voice perception during talker identification.
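
Comparing how well each ear's dichotic score predicts binaural performance reduces to comparing correlations across listeners. A minimal sketch; the per-listener scores below are hypothetical inputs, and the comparison would be run once per language (native vs foreign):

```python
import numpy as np

def ear_predictors(right_ear, left_ear, binaural):
    """Correlate each ear's dichotic talker-identification accuracy with
    binaural accuracy across listeners (per-listener proportions correct)."""
    r_right = np.corrcoef(right_ear, binaural)[0, 1]
    r_left = np.corrcoef(left_ear, binaural)[0, 1]
    return r_right, r_left

# Hypothetical per-listener scores for one language condition.
right = np.array([0.82, 0.75, 0.90, 0.68, 0.77])
left = np.array([0.70, 0.66, 0.79, 0.61, 0.69])
binaural = np.array([0.85, 0.78, 0.93, 0.70, 0.80])
print(ear_predictors(right, left, binaural))
```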

15.
A class of selective attention models often applied to speech perception is used to study effects of training on the perception of an unfamiliar phonetic contrast. Attention-to-dimension (A2D) models of perceptual learning assume that the dimensions that structure listeners' perceptual space are constant and that learning involves only the reweighting of existing dimensions to emphasize or de-emphasize different sensory dimensions. Multidimensional scaling is used to identify the acoustic-phonetic dimensions listeners use before and after training to recognize the 3 classes of Korean stop consonants. Results suggest that A2D models can account for some observed restructuring of listeners' perceptual space, but listeners also show evidence of directing attention to a previously unattended dimension of phonetic contrast.
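
Multidimensional scaling recovers a low-dimensional perceptual space from pairwise dissimilarities. A minimal sketch with scikit-learn; the 4x4 dissimilarity matrix is a made-up placeholder, not data from the Korean stop study:

```python
import numpy as np
from sklearn.manifold import MDS

# Made-up dissimilarity matrix (e.g., derived from consonant confusions);
# symmetric with a zero diagonal.
dissim = np.array([
    [0.0, 0.3, 0.8, 0.9],
    [0.3, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.4],
    [0.9, 0.8, 0.4, 0.0],
])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)  # one 2-D point per stimulus
print(coords)  # compare pre- vs post-training configurations this way
```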

16.
This study investigated the hypothesis that age effects exert an increased influence on speech recognition performance as the number of acoustic degradations of the speech signal increases. Four groups participated: young listeners with normal hearing, elderly listeners with normal hearing, young listeners with hearing loss, and elderly listeners with hearing loss. Recognition was assessed for sentence materials degraded by noise, reverberation, or time compression, either in isolation or in binary combinations. Performance scores were converted to an equivalent signal-to-noise ratio index to facilitate direct comparison of the effects of different forms of stimulus degradation. Age effects were observed primarily in multiple degradation conditions featuring time compression of the stimuli. These results are discussed in terms of a postulated change in functional signal-to-noise ratio with increasing age.
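
An equivalent signal-to-noise ratio index can be obtained by fitting a psychometric function to scores from the noise-only condition and inverting it for scores from other conditions. A minimal sketch of that idea with hypothetical calibration data; the study's actual conversion procedure may differ:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(snr, midpoint, slope):
    """Percent correct as a logistic function of SNR (dB)."""
    return 100.0 / (1.0 + np.exp(-slope * (snr - midpoint)))

# Hypothetical calibration scores from the noise-only condition.
snrs = np.array([-8.0, -4.0, 0.0, 4.0, 8.0])
scores = np.array([12.0, 35.0, 62.0, 85.0, 95.0])
(midpoint, slope), _ = curve_fit(logistic, snrs, scores, p0=(0.0, 0.5))

def equivalent_snr(score):
    """SNR at which the noise-only condition would give this score
    (score must lie strictly between 0 and 100)."""
    return midpoint - np.log(100.0 / score - 1.0) / slope

print(equivalent_snr(50.0))  # e.g., a score from a reverberation condition
```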

17.
Speech comprehension is resistant to acoustic distortion in the input, reflecting listeners' ability to adjust perceptual processes to match the speech input. This adjustment is reflected in improved comprehension of distorted speech with experience. For noise vocoding, a manipulation that removes spectral detail from speech, listeners' word report showed a significantly greater improvement over trials for listeners that heard clear speech presentations before rather than after hearing distorted speech (clear-then-distorted compared with distorted-then-clear feedback, in Experiment 1). This perceptual learning generalized to untrained words suggesting a sublexical locus for learning and was equivalent for word and nonword training stimuli (Experiment 2). These findings point to the crucial involvement of phonological short-term memory and top-down processes in the perceptual learning of noise-vocoded speech. Similar processes may facilitate comprehension of speech in an unfamiliar accent or following cochlear implantation.
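
Noise vocoding divides speech into frequency bands, discards the fine spectral detail within each band, and keeps only the band envelopes, which then modulate band-limited noise. A minimal sketch; the channel count and band edges are assumptions, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, sr, band_edges):
    """Band-pass the signal, keep each band's envelope (magnitude of the
    analytic signal), and impose it on band-limited noise. band_edges is
    a list of (low_hz, high_hz) channel boundaries."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in band_edges:
        sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))              # spectral detail discarded
        noise = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += env * noise
    return out / np.max(np.abs(out))

# Example 4-channel configuration (band edges are assumed):
# y, sr = ...  # load a speech waveform first
# vocoded = noise_vocode(y, sr, [(100, 392), (392, 1005), (1005, 2294), (2294, 4000)])
```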

18.
This study compared memory for words and the font in which they appeared (or the voice speaking them) in young and old participants, to explore whether age-related differences in episodic word memory are due to age-related differences in memory for perceptual–contextual information. In each of 3 experiments, young and older participants were presented with words to learn. The words were presented in either 1 of 2 font types, or in 1 of 2 male voices, and participants paid attention either to the fonts or voices or to the meaning of the words. Participants were then tested on both word and font or voice memory. Results showed that younger participants had better explicit memory for font and voice memory and for the words themselves but that older participants benefited at least as much as younger people when perceptual characteristics of the words were reinstated. There was no evidence of an age-related impairment in the encoding of perceptual–contextual information.

19.
OBJECTIVE: To compare the performance of cochlear implant patients and normal-hearing subjects on a musical interval labeling task, and to determine whether information regarding musical interval size is available to cochlear implant patients under realistic everyday listening conditions.
DESIGN: Two Nucleus cochlear implant patients listened to musical intervals that consisted of systematic variations of electric pulse rate on single bipolar intracochlear electrode pairs, whereas normal-hearing listeners were presented with the acoustical analog of these stimuli. Subjects labeled the intonation quality of the stimulus intervals ("flat," "sharp," or "in tune"), relative to their memory for specific intervals abstracted from familiar melodies. The cochlear implant patients, in addition, performed this task with realistic acoustical musical stimuli.
RESULTS: The interval labeling behavior of cochlear implant subjects, at low pulse rates, was similar to that of normal-hearing subjects. Furthermore, pitch interval information does not appear to be available to cochlear implant subjects when they are listening to acoustical stimuli via their speech processors.
CONCLUSIONS: Temporal information appears to be sufficient for the perception of musical pitch. Encoding strategies that are highly successful in restoring speech understanding do not necessarily provide information regarding melodic pitch interval size.
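
The interval-labeling task relies on the arithmetic of musical intervals: n semitones correspond to a frequency (or pulse-rate) ratio of 2^(n/12), and deviations from a target interval are conventionally measured in cents. A minimal sketch; the base pulse rate and the mapping to "flat"/"sharp" labels are assumptions for illustration:

```python
import math

def interval_ratio(semitones):
    """Frequency (or pulse-rate) ratio for an interval of n semitones."""
    return 2.0 ** (semitones / 12.0)

def cents_error(rate_low, rate_high, target_semitones):
    """Deviation from the target interval in cents (100 cents = 1
    semitone); negative reads as 'flat', positive as 'sharp'."""
    actual = 1200.0 * math.log2(rate_high / rate_low)
    return actual - 100.0 * target_semitones

base = 100.0                          # pulses per second (assumed base rate)
print(base * interval_ratio(7))       # perfect fifth: ~149.8 pps
print(cents_error(100.0, 145.0, 7))   # ~-57 cents: would be labeled "flat"
```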

20.
Emotion is conveyed in speech by semantic content (what is said) and by prosody (how it is said). Prior research suggests that older adults benefit from linguistic prosody when comprehending language but that they have difficulty understanding affective prosody. In a series of 3 experiments, young and older adults listened to sentences in which the emotional cues conveyed by semantic content and affective prosody were either congruent or incongruent and then indicated whether the talker sounded happy or sad. When judging the emotion of the talker, young adults were more attentive to the affective prosodic cues than to the semantic cues, whereas older adults performed less consistently when these cues conflicted. Participants’ reading and repetition of the sentences were recorded so that age- and emotion-related changes in the production of emotional speech cues could be examined. Both young and older adults were able to produce affective prosody. The age-related difference in perceiving emotion was eliminated when listeners repeated the sentences before responding, consistent with previous findings regarding the beneficial role of repetition in conversation. The results of these experiments suggest that there are age-related differences in interpreting affective prosody but that repeating may be a compensatory strategy that could minimize the everyday consequences of these differences.
