Multisensory Integration of Native and Nonnative Speech in Bilingual and Monolingual Adults
Face-to-face speech communication is an audiovisual process during which interlocutors use both the auditory speech signal and the visual, oral articulations to understand one another. These sensory inputs are merged into a single, unified percept through a process known as multisensory integration. Audiovisual speech integration is known to be influenced by many factors, including listener experience. In this study, we investigated the roles of bilingualism and language experience in integration. We used a McGurk paradigm in which participants were presented with incongruent auditory and visual speech. This included an auditory utterance of 'ba' paired with visual articulations of 'ga', which often induces the perception of 'da' or 'tha', a fusion effect that is strong evidence of integration, as well as an auditory utterance of 'ga' paired with visual articulations of 'ba', which often induces the perception of 'bga', a combination effect that is weaker evidence of integration. We compared fusion and combination effects across three groups (N = 20 each): English monolinguals, Spanish-English bilinguals, and Arabic-English bilinguals, with stimuli presented in all three languages. Monolinguals exhibited significantly stronger multisensory integration than bilinguals in fusion effects, regardless of the stimulus language. Bilinguals exhibited a nonsignificant trend whereby greater language experience was associated with increased integration as measured by fusion. These results held regardless of whether McGurk stimuli were presented as stand-alone syllables or in the context of real words.
The Impact of Viewing Distance and Proprioceptive Manipulations on a Virtual Reality Based Balance Test
Our ability to maintain our balance plays a pivotal role in day-to-day activities. This ability is believed to be the result of interactions between several sensory modalities, including vision and proprioception. Past research has revealed that different aspects of vision, including relative visual motion (i.e., sensed motion of the visual field due to head motion), which can be manipulated by changing the viewing distance between the individual and the predominant visual cues, have an impact on balance. However, only a small number of studies have examined this in the context of virtual reality, and none examined the impact of proprioceptive manipulations for viewing distances greater than 3.5 m. To address this, we conducted an experiment in which 25 healthy adults viewed a dartboard in a virtual gymnasium while standing in a narrow stance on firm and compliant surfaces. The dartboard distance was varied across three conditions (1.5 m, 6 m, and 24 m), and a blacked-out condition was also included. Our results indicate that decreases in relative visual motion, due to an increased viewing distance, yield decreased postural stability, but only with simultaneous proprioceptive disruptions.
Evidence for a Causal Dissociation of the McGurk Effect and Congruent Audiovisual Speech Perception via TMS to the Left pSTS
Congruent visual speech improves speech perception accuracy, particularly in noisy environments. Conversely, mismatched visual speech can alter what is heard, leading to an illusory percept that differs from the auditory and visual components, known as the McGurk effect. While prior transcranial magnetic stimulation (TMS) and neuroimaging studies have identified the left posterior superior temporal sulcus (pSTS) as a causal region involved in the generation of the McGurk effect, it remains unclear whether this region is critical only for this illusion or also for the more general benefits of congruent visual speech (e.g., increased accuracy and faster reaction times). Indeed, recent correlative research suggests that the benefits of congruent visual speech and the McGurk effect rely on largely independent mechanisms. To better understand how these different features of audiovisual integration are causally generated by the left pSTS, we used single-pulse TMS to temporarily disrupt processing within this region while subjects were presented with either congruent or incongruent (McGurk) audiovisual combinations. Consistent with past research, we observed that TMS to the left pSTS reduced the strength of the McGurk effect. Importantly, however, left pSTS stimulation had no effect on the positive benefits of congruent audiovisual speech (increased accuracy and faster reaction times), demonstrating a causal dissociation between the two processes. Our results are consistent with models proposing that the pSTS is but one of multiple critical areas supporting audiovisual speech interactions. Moreover, these data add to a growing body of evidence suggesting that the McGurk effect is an imperfect surrogate measure for more general and ecologically valid audiovisual speech behaviors.
What is the Relation between Chemosensory Perception and Chemosensory Mental Imagery?
The study of chemosensory mental imagery is undoubtedly made more difficult by the profound individual differences that have been reported in the vividness of (e.g.) olfactory mental imagery. At the same time, the majority of those researchers who have attempted to study people's mental imagery abilities for taste (gustation) have actually mostly been studying flavour mental imagery. Nevertheless, there exists a body of human psychophysical research showing that chemosensory mental imagery exhibits a number of similarities with chemosensory perception. Furthermore, the two systems have frequently been shown to interact with one another. The similarities and differences between chemosensory perception and chemosensory mental imagery at the introspective, behavioural, psychophysical, and cognitive neuroscience levels in humans are considered in this narrative historical review. The latest neuroimaging evidence shows that many of the same brain areas are engaged by chemosensory mental imagery as have previously been documented to be involved in chemosensory perception. That said, the pattern of neural connectivity is reversed between the 'top-down' control of chemosensory mental imagery and the 'bottom-up' control seen in the case of chemosensory perception. At the same time, however, there remain a number of intriguing questions as to whether it is even possible to distinguish between orthonasal and retronasal olfactory mental imagery, and the extent to which mental imagery for flavour, which most people not only describe as, but also perceive to be, the 'taste' of food and drink, is capable of reactivating the entire flavour network in the human brain.
Can Multisensory Olfactory Training Improve Olfactory Dysfunction Caused by COVID-19?
Approximately 30-60% of people suffer from olfactory dysfunction (OD), such as hyposmia or anosmia, after being diagnosed with COVID-19; 15-20% of these cases last beyond resolution of the acute phase. Previous studies have shown that olfactory training can be beneficial for patients affected by OD caused by viral infections of the upper respiratory tract. The aim of this study was to evaluate whether multisensory olfactory training, involving simultaneously tasting and seeing congruent stimuli, is more effective than classical olfactory training. We recruited 68 participants with persistent OD for two months or more after COVID-19 infection; they were divided into three groups. One group received olfactory training that involved smelling four odorants (strawberry, cheese, coffee, lemon; classical olfactory training). A second group received the same olfactory stimuli, but presented retronasally (i.e., as droplets on their tongue) while simultaneous and congruent gustatory (i.e., sweet, salty, bitter, sour) and visual (corresponding images) stimuli were presented (multisensory olfactory training). The third group received odorless propylene glycol in four bottles (control group). Training was carried out twice daily for 12 weeks. We assessed olfactory function and olfactory-specific quality of life before and after the intervention. Both intervention groups showed a similar significant improvement in olfactory function, although there was no difference in the assessment of quality of life. Both multisensory and classical training can be beneficial for OD following a viral infection; however, only the classical olfactory training paradigm led to an improvement that was significantly stronger than that of the control group.
Audiovisual Speech Perception Benefits are Stable from Preschool through Adolescence
The ability to leverage visual cues in speech perception, especially in noisy backgrounds, is well established from infancy to adulthood. Yet the developmental trajectory of audiovisual benefits remains a topic of debate. The inconsistency in findings can be attributed to relatively small sample sizes or tasks that are not appropriate for given age groups. We designed an audiovisual speech perception task that was cognitively and linguistically age-appropriate from preschool to adolescence and recruited a large sample (N = 161) of children (ages 4-15). We found that even the youngest children show reliable speech perception benefits when provided with visual cues and that these benefits are consistent throughout development when auditory and visual signals match. Individual variability is explained by how the child experiences their speech-in-noise performance rather than by the quality of the signal itself. This underscores the importance of visual speech for young children, who are regularly in noisy environments like classrooms and playgrounds.
Glassware Influences the Perception of Orange Juice in Simulated Naturalistic versus Urban Conditions
The latest research demonstrates that people's perception of orange juice can be influenced by the shape/type of receptacle in which it happens to be served. Two studies are reported that were designed to investigate the impact, if any, that the shape/type of glass might exert over the perception of the contents, the emotions induced on tasting the juice and the consumer's intention to purchase orange juice. The same quantity of orange juice (100 ml) was presented and evaluated in three different glasses: a straight-sided, a curved and a tapered glass. Questionnaires were used to assess taste (aroma, flavour intensity, sweetness, freshness and fruitiness), pleasantness and intention to buy orange juice. Study 2 assessed the impact of the same three glasses in two digitally rendered atmospheric conditions (nature vs urban). In Study 1, the perceived sweetness and pleasantness of the orange juice were significantly influenced by the shape/type of the glass in which it was presented. Study 2 reported significant interactions between condition (nature vs urban) and glass shape (tapered, straight-sided and curved). Perceived aroma, flavour intensity and pleasantness were all significantly affected by the simulated audiovisual context or atmosphere. Compared to the urban condition, perceived aroma, freshness, fruitiness and pleasantness were rated significantly higher in the nature condition. On the other hand, flavour intensity and sweetness were rated significantly higher in the urban condition than in the nature condition. These results are likely to be relevant to food-service providers and to company managers offering beverages to their customers.
Revisiting the Deviation Effects of Irrelevant Sound on Serial and Nonserial Tasks
Two types of disruptive effects of irrelevant sound on visual tasks have been reported: the changing-state effect and the deviation effect. The idea that the deviation effect, which arises from attentional capture, is independent of task requirements, whereas the changing-state effect is specific to tasks that require serial processing, has been examined by comparing tasks that do or do not require serial-order processing. While many previous studies used the missing-item task as the nonserial task, it is unclear whether other cognitive tasks lead to similar results regarding the different task specificity of both effects. Kattner et al. (Memory and Cognition, 2023) used the mental-arithmetic task as the nonserial task and failed to demonstrate the deviation effect. However, there were several procedural factors that could account for the lack of a deviation effect, such as differences in design and procedure (e.g., the study was conducted online, with intermixed conditions). In the present study, we aimed to investigate whether the deviation effect could be observed in both the serial-recall and mental-arithmetic tasks when these procedural factors were modified. We found strong evidence of the deviation effect in both the serial-recall and the mental-arithmetic tasks when stimulus presentation and experimental design were aligned with previous studies that demonstrated the deviation effect (e.g., conducted in person, with blockwise presentation of sound). The results support the idea that the deviation effect is not task-specific.
Is Front associated with Above and Back with Below? Association between Allocentric Representations of Spatial Dimensions
Previous research has revealed congruency effects between different spatial dimensions such as right and up. In the audiovisual context, high-pitched sounds are associated with the spatial dimensions of up/above and front, while low-pitched sounds are associated with the spatial dimensions of down/below and back. This raises the question of whether there could also be a spatial association between above and front and/or below and back. Participants were presented with a high- or low-pitch auditory stimulus at the onset of the visual stimulus. In one block, participants responded according to the above/below location of the visual target stimulus if the target appeared in front of the reference object, and in the other block, they performed these above/below responses if the target appeared at the back of the reference. In general, reaction times revealed an advantage in processing the target location in the front-above and back-below locations. The front-above/back-below effect was more robust for the back-below component of the effect, and was significantly larger for reaction times that were slower, rather than faster, than a participant's median value. However, pitch did not robustly influence responding to front/back or above/below locations. We propose that this effect might be based on the conceptual association between different spatial dimensions.
Perceptual Adaptation to Noise-Vocoded Speech by Lip-Read Information: No Difference between Dyslexic and Typical Readers
Auditory speech can be difficult to understand, but seeing the articulatory movements of a speaker can drastically improve spoken-word recognition and, in the longer term, helps listeners to adapt to acoustically distorted speech. Given that individuals with developmental dyslexia (DD) have sometimes been reported to rely less on lip-read speech than typical readers, we examined lip-read-driven adaptation to distorted speech in a group of adults with DD (N = 29) and a comparison group of typical readers (N = 29). Participants were presented with acoustically distorted Dutch words (six-channel noise-vocoded speech, NVS) in audiovisual training blocks (in which the speaker could be seen) interspersed with audio-only test blocks. Results showed that words were more accurately recognized if the speaker could be seen (a lip-read advantage), and that performance steadily improved across subsequent audio-only test blocks (adaptation). There were no group differences, suggesting that perceptual adaptation to distorted spoken words is comparable for dyslexic and typical readers. These data open up a research avenue to investigate the degree to which lip-read-driven speech adaptation generalizes across different types of auditory degradation, and across dyslexic readers with decoding versus comprehension difficulties.
Four-Stroke Apparent Motion Can Effectively Induce Visual Self-Motion Perception: an Examination Using Expanding, Rotating, and Translating Motion
The current investigation examined whether visual motion without continuous visual displacement could effectively induce self-motion perception (vection). Four-stroke apparent motion (4SAM) stimuli were employed in the experiments as visual inducers. The 4SAM pattern contained luminance-defined motion energy equivalent to that of the real-motion pattern, and participants perceived unidirectional motion according to the motion energy but without displacement (the visual elements flickered on the spot). The experiments revealed that the 4SAM stimulus could effectively induce vection in the horizontal, expanding, or rotational directions, although its strength was significantly weaker than that induced by the real-motion stimulus. This result suggests that visual displacement is not essential, and that the luminance-defined motion energy and/or the resulting perceived motion of the visual inducer is sufficient for inducing visual self-motion perception. Conversely, when the 4SAM and real-motion patterns were presented simultaneously, self-motion perception was mainly determined by the real motion, suggesting that the real-motion stimulus is a predominant determinant of vection. These outcomes are worth considering when examining the perceptual and neurological mechanisms underlying self-motion perception.
The Multimodal Trust Effects of Face, Voice, and Sentence Content
Trust is an aspect critical to human social interaction, and research has identified many cues that aid in the assessment of this social trait. Two of these cues are the pitch of the voice and the width-to-height ratio of the face (fWHR). Additionally, research has indicated that the content of a spoken sentence itself has an effect on trustworthiness, a finding that has not yet been brought into multisensory research. The current research aims to investigate previously developed theories on trust in relation to vocal pitch, fWHR, and sentence content in a multimodal setting. Twenty-six female participants were asked to judge the trustworthiness of a voice speaking a neutral or romantic sentence while seeing a face. The average pitch of the voice and the fWHR were varied systematically. Results indicate that the content of the spoken message was an important predictor of trustworthiness, extending into multimodality. Further, the mean pitch of the voice and the fWHR of the face appeared to be useful indicators in a multimodal setting. These effects interacted with one another across modalities. The data demonstrate that trust in the voice is shaped by task-irrelevant visual stimuli. Future research is encouraged to clarify whether these findings remain consistent across genders, age groups, and languages.
Perceived Audio-Visual Simultaneity Is Recalibrated by the Visual Intensity of the Preceding Trial
A vital heuristic used when judging whether audio-visual signals arise from the same event is the temporal coincidence of the respective signals. Previous research has highlighted a process whereby the perception of simultaneity rapidly recalibrates to account for differences in the physical temporal offsets of stimuli. The current paper investigated whether rapid recalibration also occurs in response to differences in central arrival latencies, driven by visual-intensity-dependent processing times. In a behavioural experiment, observers completed a temporal-order judgement (TOJ), a simultaneity judgement (SJ) and a simple reaction-time (RT) task, responding to audio-visual trials that were preceded by other audio-visual trials with either a bright or a dim visual stimulus. It was found that the point of subjective simultaneity shifted, owing to the visual intensity of the preceding stimulus, in the TOJ but not the SJ task, while the RT data revealed no effect of preceding intensity. Our data therefore provide some evidence that the perception of simultaneity rapidly recalibrates based on stimulus intensity.
Tactile Landmarks: the Relative Landmark Location Alters Spatial Distortions
The influence of landmarks, that is, nearby non-target stimuli, on spatial perception has been shown in multiple ways. These include altered target-localization variability near landmarks and systematic spatial distortions of target localizations. Previous studies have mostly been conducted in the visual modality using temporary, artificial landmarks, or in the tactile modality with persistent landmarks on the body. Thus, it is unclear whether both landmark types produce the same spatial distortions, as they have never been investigated in the same modality. Addressing this, we used a novel tactile setup to present temporary, artificial landmarks on the forearm and systematically manipulated their location to be either close to a persistent landmark (wrist or elbow) or in between both persistent landmarks at the middle of the forearm. Initial data (Exp. 1 and Exp. 2) suggested systematic differences between temporary landmarks depending on their distance from the persistent landmark, possibly indicating different distortions for temporary and persistent landmarks. Subsequent control studies (Exp. 3 and Exp. 4) showed that this effect was driven by the relative landmark location within the target distribution. Specifically, landmarks in the middle of the target distribution led to systematic distortions of target localizations toward the landmark, whereas landmarks at the side led to distortions away from the landmark for nearby targets and toward the landmark for targets at wider distances. Our results indicate that experimental results with temporary landmarks can be generalized to more natural settings with persistent landmarks, and further reveal that the relative landmark location leads to different patterns of spatial distortion.
Addressing the Association Between Action Video Game Playing Experience and Visual Search in Naturalistic Multisensory Scenes
Prior studies investigating the effects of routine action video game play have demonstrated improvements in a variety of cognitive processes, including improvements in attentional tasks. However, there is little evidence indicating that the cognitive benefits of playing action video games generalize from simplified unisensory stimuli to multisensory scenes, a fundamental characteristic of natural, everyday environments. The present study addressed whether video game experience has an impact on crossmodal congruency effects when searching through such multisensory scenes. We compared the performance of action video game players (AVGPs) and non-video game players (NVGPs) on a visual search task for objects embedded in video clips of realistic scenes. We conducted two identical online experiments with gender-balanced samples, for a total of N = 130. Overall, the data replicated previous findings of search benefits when visual targets were accompanied by semantically congruent auditory events, compared to neutral or incongruent ones. However, AVGPs did not consistently outperform NVGPs in the overall search task, nor did they use multisensory cues more efficiently than NVGPs. Exploratory analyses with self-reported gender as a variable revealed a potential difference in response strategy between experienced male and female AVGPs when dealing with crossmodal cues. These findings suggest that generalization of the advantages of AVG experience to realistic, crossmodal situations should be made with caution and with consideration of gender-related issues.
Cross-Modal Contributions to Episodic Memory for Voices
Multisensory context often facilitates perception and memory. In fact, encoding items within a multisensory context can improve memory even on strictly unisensory tests (i.e., when the multisensory context is absent). Prior studies that have consistently found these multisensory facilitation effects have largely employed multisensory contexts in which the stimuli were meaningfully related to the items targeted for remembering (e.g., pairing canonical sounds and images). Other studies have used unrelated stimuli as multisensory context. A third possible type of multisensory context is one that is environmentally related simply because the stimuli are often encountered together in the real world. We predicted that encountering such a multisensory context would also enhance memory through cross-modal associations, that is, representations relating to one's prior multisensory experience with that sort of stimuli in general. In two memory experiments, we used faces and voices of unfamiliar people as everyday stimuli whose perceptual features individuals have substantial experience integrating. We assigned participants to face- or voice-recognition groups and ensured that, during the study phase, half of the face or voice targets were also encountered with information in the other modality. Voices initially encoded along with faces were consistently remembered better, providing evidence that cross-modal associations could explain the observed multisensory facilitation.
Spatial Sensory References for Vestibular Self-Motion Perception
While navigating through the surroundings, we constantly rely on inertial vestibular signals for self-motion along with visual and acoustic spatial references from the environment. However, the interaction between inertial cues and environmental spatial references is not yet fully understood. Here we investigated whether vestibular self-motion sensitivity is influenced by sensory spatial references. Healthy participants were administered a Vestibular Self-Motion Detection Task in which they were asked to detect vestibular self-motion sensations induced by low-intensity Galvanic Vestibular Stimulation. Participants performed this detection task with or without an external visual or acoustic spatial reference placed directly in front of them. We computed d-prime (d') as a measure of participants' vestibular sensitivity and the criterion as an index of their response bias. Results showed that the visual spatial reference increased sensitivity to detect vestibular self-motion. Conversely, the acoustic spatial reference did not influence self-motion sensitivity. Neither the visual nor the auditory spatial reference caused changes in response bias. Environmental visual spatial references provide relevant information to enhance our ability to perceive inertial self-motion cues, suggesting a specific interaction between the visual and vestibular systems in self-motion perception.
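For reference, under standard equal-variance signal detection theory (the abstract does not state the exact formulas, so these conventional definitions are assumed), sensitivity and criterion are computed from the hit rate H and false-alarm rate FA as
$$ d' = \Phi^{-1}(H) - \Phi^{-1}(\mathrm{FA}), \qquad c = -\tfrac{1}{2}\left[\Phi^{-1}(H) + \Phi^{-1}(\mathrm{FA})\right], $$
where \(\Phi^{-1}\) is the inverse of the standard normal cumulative distribution function; larger d' indicates better detection of the self-motion sensation, and c indexes the bias toward reporting (or withholding) a detection.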
Reflections on Cross-Modal Correspondences: Current Understanding and Issues for Future Research
The past two decades have seen an explosion of research on cross-modal correspondences. Broadly speaking, this term has been used to encompass associations between and among features, dimensions, or attributes across the senses. There has been an increasing interest in this topic amongst researchers from multiple fields (psychology, neuroscience, music, art, environmental design, etc.) and, importantly, an increasing breadth of the topic's scope. Here, this narrative review aims to reflect on what cross-modal correspondences are, where they come from, and what underlies them. We suggest that cross-modal correspondences are usefully conceived as relative associations between different actual or imagined sensory stimuli, many of these correspondences being shared by most people. Cross-modal correspondences can be characterized by a taxonomy comprising four major kinds of association (physiological, semantic, statistical, and affective), and they span both sensory dimensions (quantity/quality) and sensory features (lower perceptual/higher cognitive). Cross-modal correspondences may be understood (or measured) from two complementary perspectives: the phenomenal view (perceptual experiences of subjective matching) and the behavioural-response view (observable patterns of behavioural response to multiple sensory stimuli). Importantly, we reflect on remaining questions and standing issues that need to be addressed in order to develop an explanatory framework for cross-modal correspondences. Future research needs (a) to understand better when (and why) phenomenal and behavioural measures are coincidental and when they are not, and, ideally, (b) to determine whether different kinds of cross-modal correspondence (quantity/quality, lower perceptual/higher cognitive) rely on the same or different mechanisms.
Stationary Haptic Stimuli Do Not Produce Ocular Accommodation in Most Individuals
This study aimed to determine the extent to which haptic stimuli can influence ocular accommodation, either alone or in combination with vision. Accommodation was measured objectively in 15 young adults as they read stationary targets containing Braille letters. These cards were presented at four distances in the range 20-50 cm. In the Touch condition, the participant read by touch with their dominant hand in a dark room. Afterward, they estimated card distance with their non-dominant hand. In the Vision condition, they read by sight binocularly without touch in a lighted room. In the Touch with Vision condition, they read by sight binocularly and with touch in a lighted room. Sensory modality had a significant overall effect on the slope of the accommodative stimulus-response function. The slope in the Touch condition was not significantly different from zero, even though depth perception from touch was accurate. Nevertheless, one atypical participant had a moderate accommodative slope in the Touch condition. The accommodative slope in the Touch condition was significantly poorer than in the Vision condition. The accommodative slopes in the Vision condition and Touch with Vision condition were not significantly different. For most individuals, haptic stimuli for stationary objects do not influence the accommodation response, alone or in combination with vision. These haptic stimuli provide accurate distance perception, thus questioning the general validity of Heath's model of proximal accommodation as driven by perceived distance. Instead, proximally induced accommodation relies on visual rather than touch stimuli.
Motor Signals Mediate Stationarity Perception
Head movement relative to the stationary environment gives rise to congruent vestibular and visual optic-flow signals. The resulting perception of a stationary visual environment, referred to herein as stationarity perception, depends on mechanisms that compare visual and vestibular signals to evaluate their congruence. Here we investigate the functioning of these mechanisms and their dependence on fixation behavior as well as on the active versus passive nature of the head movement. Stationarity perception was measured by modifying the gain on visual motion relative to head movement on individual trials and asking subjects to report whether the gain was too low or too high. Fitting a psychometric function to the data yields two key parameters of performance: the mean is a measure of accuracy, and the standard deviation is a measure of precision. Experiments were conducted using a head-mounted display, with fixation behavior monitored by an embedded eye tracker. During active conditions, subjects rotated their heads in yaw at ∼15 deg/s over ∼1 s. Each subject's movements were recorded and played back via a rotating chair during the passive condition. During head-fixed and scene-fixed fixation, the fixation target moved with the head or the scene, respectively. Both precision and accuracy were better during active than passive head movement, likely due to increased precision of the head-movement estimate arising from motor prediction and neck proprioception. Performance was also better during scene-fixed than head-fixed fixation, perhaps due to decreased velocity of retinal image motion and increased precision of the retinal-image-motion estimate. These results reveal how the nature of head and eye movements mediates the encoding, processing, and comparison of relevant sensory and motor signals.
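As a point of reference (the abstract does not spell out the fitted model, so a standard cumulative-Gaussian form is assumed here), the proportion of "gain too high" reports as a function of the visual gain g can be fitted as
$$ P(\text{``too high''} \mid g) = \Phi\!\left(\frac{g - \mu}{\sigma}\right), $$
where the mean \(\mu\) (the gain at which the scene is most likely to be judged stationary) indexes accuracy, and the standard deviation \(\sigma\) indexes precision.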
Investigating the Role of Leading Sensory Modality and Autistic Traits in the Visual-Tactile Temporal Binding Window
Our ability to integrate multisensory information depends on processes occurring during the temporal binding window. There is limited research investigating the temporal binding window for visual-tactile integration and its relationship with autistic traits, sensory sensitivity, and unusual sensory experiences. We measured the temporal binding window for visual-tactile integration in 27 neurotypical participants who completed a simultaneity judgement task and three questionnaires: the Autism Quotient, the Glasgow Sensory Questionnaire, and the Multi-Modality Unusual Sensory Experiences Questionnaire. The average width of the visual-leading visual-tactile (VT) temporal binding window was 123 ms, significantly narrower than the tactile-leading visual-tactile (TV) window (193 ms). When comparing crossmodal (visual-tactile) stimuli with unimodal stimuli (visual-visual or tactile-tactile), the temporal binding window was significantly larger for crossmodal stimuli (VT: 123 ms; TV: 193 ms) than for unimodal pairs of stimuli (visual: 38 ms; tactile: 42 ms). We did not find evidence to support a relationship between the size of the temporal binding window and autistic traits, sensory sensitivities, or unusual sensory perceptual experiences in this neurotypical population. Our results indicate that the leading sense presented in a multisensory pair influences the width of the temporal binding window. When tactile stimuli precede visual stimuli, it may be difficult to determine the temporal boundaries of the stimuli, which leads to a delay in shifting attention from tactile to visual stimuli. This ambiguity in determining the temporal boundaries of stimuli likely influences our ability to decide whether stimuli are simultaneous or nonsimultaneous, which in turn leads to wider temporal binding windows.
Beyond the Eye: Multisensory Contributions to the Sensation of Illusory Self-Motion (Vection)
Vection is typically defined as the embodied illusion of self-motion in the absence of real physical movement through space. Vection can occur in real-life situations (e.g., the 'train illusion') and in virtual environments and simulators. The vast majority of vection research focuses on vection caused by visual stimulation. Even though visually induced vection is arguably the most compelling type of vection, the role of nonvisual sensory inputs, such as auditory, biomechanical, tactile, and vestibular cues, has recently gained more attention. Nonvisual cues can play an important role in inducing vection in two ways. First, nonvisual cues can affect the occurrence and strength of vection when added to corresponding visual information. Second, nonvisual cues can also elicit vection in the absence of visual information, for instance when observers are blindfolded or tested in darkness. The present paper provides a narrative review of the literature on multimodal contributions to vection. We discuss both the theoretical and applied relevance of multisensory processing as related to the experience of vection and provide design considerations on how to enhance vection in various contexts.
Joint Contributions of Auditory, Proprioceptive and Visual Cues on Human Balance
One's ability to maintain their center of mass within their base of support (i.e., balance) is believed to be the result of multisensory integration. Much of the research in this literature has focused on the integration of visual, vestibular, and proprioceptive cues. However, several recent studies have found evidence that auditory cues can impact balance-control metrics. In the present study, we sought to better characterize the impact of auditory cues on narrow-stance balance task performance with different combinations of visual stimuli (virtual and real-world) and support surfaces (firm and compliant). In line with past results, we found that reducing the reliability of proprioceptive cues and visual cues yielded consistent increases in center-of-pressure (CoP) sway metrics, indicating more imbalance. Masking ambient auditory cues with broadband noise led to less consistent findings; however, when effects were observed they were substantially smaller for auditory cues than for proprioceptive and visual cues, and in the opposite direction (i.e., masking ambient auditory cues with broadband noise reduced sway in some situations). Additionally, trials that used virtual and real-world visual stimuli did not differ unless participants were standing on a surface that disrupted proprioceptive cues; disruption of proprioception led to increased CoP sway metrics in the virtual visual condition. This is the first manuscript to report the effect sizes of different perturbations in this context, and the first to study the impact of acoustically complex environments on balance in comparison to visual and proprioceptive contributions. Future research is needed to better characterize the impact of different acoustic environments on balance.
The Audiovisual Mismatch Negativity in Predictive and Non-Predictive Speech Stimuli in Older Adults With and Without Hearing Loss
Adults with aging-related hearing loss (ARHL) experience adaptive neural changes to optimize their sensory experiences, for example, enhanced audiovisual (AV) and predictive processing during speech perception. The mismatch negativity (MMN) event-related potential is an index of central auditory processing; however, it has not been explored as an index of AV and predictive processing in adults with ARHL. In a pilot study, we examined the AV MMN in two conditions of a passive oddball paradigm: one AV condition in which the visual aspect of the stimulus can predict the auditory percept, and one AV control condition in which the visual aspect of the stimulus cannot predict the auditory percept. In adults with ARHL, evoked responses in the AV conditions occurred in the early MMN time window, whereas older adults with normal hearing showed a later MMN. Findings suggest that adults with ARHL are sensitive to AV incongruity, even when the visual signal is not predictive of the auditory signal. This suggests that predictive coding for AV speech processing may be heightened in adults with ARHL. This paradigm can be used in future studies to measure treatment-related changes, for example via aural rehabilitation, in older adults with ARHL.
Exploring Crossmodal Associations Between Sound and the Chemical Senses: A Systematic Review Including Interactive Visualizations
This is the first systematic review to focus on the influence of product-intrinsic and extrinsic sounds on the chemical senses, involving both food and aroma stimuli. The review pays particular attention to the methodological details (stimuli, experimental design, dependent variables, and data analysis techniques) of 95 experiments published in 83 publications from 2012 to 2023. A total of 329 distinct crossmodal auditory-chemosensory associations were uncovered in this analysis. What is more, instead of relying solely on static figures and tables, we created a first-of-its-kind comprehensive Power BI dashboard (an interactive data visualization tool by Microsoft) on methodologies and significant findings, incorporating various filters and visualizations that allow readers to explore statistics for specific subsets of experiments. We believe that this review can be helpful for researchers and practitioners working in the food and beverage industry and beyond (e.g., cosmetics). The theoretical and practical implications discussed in this article point to computational approaches that facilitate decision-making regarding the design of multisensory experimental methodology.
From the Outside in: ASMR Is Characterised by Reduced Interoceptive Accuracy but Higher Sensation Seeking
Autonomous Sensory Meridian Response (ASMR) is a complex sensory-perceptual phenomenon characterised by relaxing and pleasurable scalp-tingling sensations. The ASMR trait is nonuniversal, is thought to have developmental origins, and has a prevalence rate of 20%. Previous theory and research suggest that trait ASMR may be underpinned by atypical multisensory perception from both interoceptive and exteroceptive modalities. In this study, we examined whether ASMR responders differed from nonresponders in interoceptive accuracy and multisensory processing style. Results showed that ASMR responders had lower interoceptive accuracy but a greater tendency towards sensation seeking, especially for the tactile, olfactory, and gustatory modalities. Exploratory mediation analyses suggest that sensation-seeking behaviours in trait ASMR could reflect a compensatory mechanism for deficits in interoceptive accuracy, a tendency to weight exteroceptive signals more strongly, or both. This study provides the foundations for understanding how interoceptive and exteroceptive mechanisms might explain not only the ASMR trait, but also individual differences in the ability to experience complex positive emotions more generally.
Subjective Audibility Modulates the Susceptibility to Sound-Induced Flash Illusion: Effect of Loudness and Auditory Masking
When a brief flash is presented along with two brief sounds, the single flash is often perceived as two flashes. This phenomenon is called the sound-induced flash illusion, in which the auditory sense, with its relatively higher reliability in providing temporal information, modifies the visual perception. A decline in audibility due to hearing impairment is known to make subjects less susceptible to the flash illusion. However, the effect of a decline in audibility on susceptibility to the illusion has not been directly investigated in subjects with normal hearing. The present study investigates the relationship between audibility and susceptibility to the illusion by varying the sound pressure level of the stimulus. In a task requiring report of the number of auditory stimuli, lowering the sound pressure level decreased the rate of perceiving two sounds, owing to forward masking. The occurrence of the illusory flash was reduced as the intensity of the second auditory stimulus decreased, and was significantly correlated with the rate of perceiving the two auditory stimuli. These results suggest that susceptibility to the sound-induced flash illusion depends on the subjective audibility of each sound.
Motion-Binding Property Contributes to Accurate Temporal-Order Perception in Audiovisual Synchrony
Temporal perception in multisensory processing is important for an accurate and efficient understanding of the physical world. In daily life, it generally operates in a dynamic environment. In particular, the motion-binding property is important for correctly identifying moving objects in the external environment. However, how this property affects multisensory temporal perception remains unclear. We investigated whether the motion-binding property influences audiovisual temporal integration. Subjects performed four types of audiovisual temporal-order judgment (TOJ) experiments involving three types of percept. In Experiment 1, subjects performed audiovisual TOJ tasks in the motion-binding condition, in which motion is perceived between two flashes, and in the simultaneous condition, in which the two flashes are perceived as simultaneous stimuli without motion. In Experiment 2, subjects performed audiovisual TOJ tasks in the motion-binding condition and in the short and long successive-interval conditions, in which the two stimuli are perceived as successive with no motion. The results revealed that the point of subjective simultaneity (PSS) and the just-noticeable difference (JND) in the motion-binding condition differed significantly from those in the simultaneous and in the short and long successive-interval conditions. Specifically, the PSS in the motion-binding condition was shifted toward sound-leading stimuli such that it became closer to zero (i.e., physical simultaneity), and the JND became narrower compared to the other conditions. This suggests that the motion-binding property contributes to accurate temporal integration in multisensory processing by precisely encoding the temporal order of the physical stimuli.
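For reference (standard definitions for TOJ data are assumed here; the abstract does not spell them out), the PSS and JND are typically obtained by fitting a psychometric function to the proportion of, say, sound-first responses as a function of the audiovisual stimulus onset asynchrony (SOA): the PSS is the SOA at which the two orders are reported equally often (the 50% point), and the JND is commonly taken as half the interquartile range of the fitted function,
$$ \mathrm{JND} = \frac{\mathrm{SOA}_{75\%} - \mathrm{SOA}_{25\%}}{2}, $$
so a smaller JND indicates a more precise temporal-order judgment.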
Visuo-Tactile Congruence Leads to Stronger Illusion Than Visuo-Proprioceptive Congruence: a Quantitative and Qualitative Approach to Explore the Rubber Hand Illusion
The Rubber Hand Illusion (RHI) arises through multisensory congruence and informative cues from the most relevant sensory channels. Some studies have explored the RHI phenomenon on the fingers, but none of them modulated the congruence of visuo-tactile and visuo-proprioceptive information by changing the posture of the fingers. This study hypothesizes that RHI induction is possible despite a partial visuo-proprioceptive or visuo-tactile incongruence. With quantitative and qualitative measures, we observed that gradual induction of the sense of body ownership depends on the congruence of multisensory information, with an emphasis on visuo-tactile information rather than visuo-proprioceptive signals. Based on the overall measures, the observed RHI went from stronger to weaker in the following order: full congruence; visuo-proprioceptive incongruence with visuo-tactile congruence; visuo-proprioceptive congruence with visuo-tactile incongruence; full incongruence. Our results confirm that congruent visual and tactile mapping is important, though not mandatory, to induce a strong sense of ownership. By changing index finger and thumb postures rather than rotating the whole hand, our study investigates the contribution of visuo-proprioception and postural congruence in the field of RHI research. The results are in favor of a probabilistic multisensory integration theory and do not resonate with the rules and constraints found in internal body models. The RHI could be illustrated as a continuum: the more congruent the multisensory information, the stronger the RHI.
What Makes the Detection of Movement Different Within the Autistic Traits Spectrum? Evidence From the Audiovisual Depth Paradigm
Atypical sensory processing is now considered a diagnostic feature of autism. Although multisensory integration (MSI) may have cascading effects on the development of higher-level skills such as socio-communicative functioning, there is a clear lack of understanding of how autistic individuals integrate multiple sensory inputs. Multisensory dynamic information is a more ecological construct than static stimuli, reflecting naturalistic sensory experiences, given that our environment involves moving stimulation in more than one sensory modality at a time. In particular, movement in depth conveys crucial social (approaching to interact) and non-social (avoiding threats/collisions) information. As autistic characteristics are distributed on a spectrum across clinical and general populations, our work aimed to explore the multisensory integration of depth cues along the autistic personality spectrum, using a go/no-go detection task. The autistic profile of 38 participants from the general population was assessed using questionnaires extensively used in the literature. Participants performed a detection task with auditory and/or visual stimuli moving in depth, compared to static stimuli. We found that subjects with high autistic traits overreacted to depth movement and exhibited faster reaction times to audiovisual cues, particularly when the audiovisual stimuli were looming and/or presented at a fast speed. These results provide evidence of sensory particularities in people with high autistic traits and suggest that low-level stages of multisensory integration could operate differently all along the autistic personality spectrum.