chatbot
Great! I will now conduct deep research on:
- The crossmodal integration in dorsal streams (visual and auditory)
- How both dorsal streams respond to auditory, visual, and audiovisual stimuli
- Studies using MEG (and EEG, if applicable) with Representational Similarity Analysis (RSA) to examine stable crossmodal representations
- Novel areas that have not been extensively studied in terms of dorsal stream multimodality
- Whether the visual dorsal stream is more multimodal than the auditory dorsal stream
- Degree of novelty of this research question
- Connections of dorsal streams to the motor cortex (for the further research section)
- Research questions that can be addressed by this project
I will return with findings and a refined project proposal shortly.
Multimodal Integration in the Dorsal Streams of Vision and Audition
Both the visual and auditory dorsal streams – classically associated with spatial processing (“where”/“how” pathways) – have shown multimodal capabilities. Recent research indicates that dorsal-stream regions in the brain can integrate auditory, visual, and combined (audiovisual) spatial information rather than operating in isolation for a single sense. Below, we review evidence that both dorsal pathways are multimodal, consider whether one stream might be more crossmodal than the other, and highlight studies (especially MEG/EEG with RSA) that quantify such integration. We also discuss potential novel dorsal regions involved in crossmodal processing, the novelty of this research direction, and the idea of dorsal-stream integration with motor systems (as a future direction). Finally, we propose some key research questions that this line of study can address. Throughout, we focus on spatial motion and simple/artificial stimuli (e.g. moving dots or sounds) to engage dorsal-stream processing while avoiding ventral stream (object recognition) activation (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC).
Evidence That Both Dorsal Streams Are Multimodal
Classic studies established that higher-order sensory areas often integrate inputs from multiple modalities. In both animals and humans, auditory–visual integration has been observed in associative regions like the superior temporal polysensory (STP) area and in the parietal cortex, indicating early hints of crossmodal processing in dorsal pathways (Neuroimaging investigations of dorsal stream processing and effects of stimulus synchrony in schizophrenia - PMC). Modern neuroimaging confirms that each “where” pathway can represent spatial information from more than one sense:
-
Visual Dorsal Stream: The motion-sensitive visual area hMT+ (V5), a key node of the visual dorsal stream, not only processes visual motion but also responds to auditory motion direction. In a 2020 fMRI study, Rezk et al. showed that hMT+ encodes the direction of moving sounds in a format partially aligned with its coding of visual motion (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). They could decode motion direction from hMT+ for both modalities, and even predict visual motion direction from the pattern of auditory responses (and vice versa), revealing a shared representation of motion across vision and hearing (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). Despite some differences in activity patterns (visual vs. auditory evoked responses were inverted in sign for some voxels), this finding supports the idea that the visual dorsal stream is truly multimodal for spatial motion (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed).
-
Auditory Dorsal Stream: The auditory “where” pathway, projecting from auditory cortex to parietal regions, also shows crossmodal integration. For instance, space-sensitive auditory cortex in the posterior superior temporal region (planum temporale) is modulated by visual input. An fMRI study of the ventriloquist effect found that when a sound is paired with a discrepant visual stimulus, the normal monotonic spatial tuning in posterior auditory cortex is attenuated by the visual cue (An fMRI Study of the Ventriloquism Effect - PMC). In other words, auditory dorsal areas alter their activity in response to visual location information, reflecting the brain’s combined audiovisual estimate of source location (An fMRI Study of the Ventriloquism Effect - PMC). This aligns with behavioral ventriloquism, where the perceived sound location shifts toward the visual stimulus. Such crossmodal influence on auditory cortical activity demonstrates that the auditory dorsal stream is also multisensory, incorporating visual spatial data into its processing of sound location (An fMRI Study of the Ventriloquism Effect - PMC). Similarly, other work has shown the ventriloquism effect correlates with activity changes in the planum temporale (auditory spatial cortex) when visual cues are present (The role of auditory cortex in the spatial ventriloquism aftereffect).
-
Convergence in Parietal Cortex: Importantly, the dorsal streams for vision and audition anatomically converge in parietal regions. The intraparietal sulcus (IPS) and surrounding parietal cortex appear to encode a supramodal spatial representation. Rohe and Noppeney (2018) used an audiovisual localization task and found that while early visual and auditory areas represented location in a modality-specific way, the IPS integrated both auditory and visual spatial cues weighted by their relative reliability (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed). In other words, IPS activity reflected a combined position estimate (closer to optimal integration) rather than purely visual or purely auditory location (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed). This suggests the parietal node of the dorsal stream contains a common representation of space that transcends modality – a hallmark of multimodal integration. Consistently, patient and imaging studies have long implied that posterior parietal damage can disrupt spatial awareness across vision and hearing, hinting at such shared coding. Recent computational fMRI work confirms that IPS implements reliability-weighted cue integration for spatial stimuli (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed), cementing its role as a multimodal hub in the dorsal pathway.
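The reliability weighting described for IPS corresponds to the standard maximum-likelihood cue-combination rule, in which each cue is weighted by its inverse variance. The sketch below only illustrates that rule under the assumption of independent Gaussian noise on each cue; the function name `integrate_cues` and all locations and noise levels are hypothetical and are not taken from the cited study.

```python
import math


def integrate_cues(x_aud, sigma_aud, x_vis, sigma_vis):
    """Fuse an auditory and a visual location estimate (degrees azimuth).

    Assumes independent Gaussian noise on each cue, so a cue's reliability
    is its inverse variance and the fused estimate is the
    reliability-weighted mean of the two cues.
    """
    rel_aud = 1.0 / sigma_aud ** 2
    rel_vis = 1.0 / sigma_vis ** 2
    w_vis = rel_vis / (rel_aud + rel_vis)   # weight given to vision
    w_aud = 1.0 - w_vis                     # weight given to audition
    x_hat = w_aud * x_aud + w_vis * x_vis   # fused location estimate
    sigma_hat = math.sqrt(1.0 / (rel_aud + rel_vis))  # fused uncertainty
    return x_hat, sigma_hat, w_vis


# Hypothetical trial: sound at +10 deg (noisy), flash at 0 deg (precise).
x_hat, sigma_hat, w_vis = integrate_cues(x_aud=10.0, sigma_aud=8.0,
                                         x_vis=0.0, sigma_vis=2.0)
print(f"visual weight = {w_vis:.3f}, fused location = {x_hat:.2f} deg, "
      f"fused sd = {sigma_hat:.2f} deg")
```

With these made-up numbers the fused estimate sits close to the visual location, which is the pattern seen in ventriloquism. Raising `sigma_vis` (a degraded visual cue) shifts the weight back toward audition.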
Broadly, these findings overturn a strict separation of the dorsal streams by modality. Instead, they support a model in which both dorsal streams process spatial information from multiple senses. As summarized by Rauschecker (2011), the posterior superior temporal (pST) and inferior parietal lobule (IPL) – key components of the auditory dorsal stream – “participate in the processing of auditory space and motion, and integrate input from several modalities” (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). Likewise, visual-dorsal areas can respond to non-visual spatial cues. Thus, both streams are better viewed as parts of an interconnected multisensory spatial network (with parietal cortex as a convergence zone) rather than isolated modality-specific channels.
Is the Visual Dorsal Stream More Multimodal Than the Auditory?
One intriguing question is whether the visual dorsal stream has a greater propensity for multimodal processing than the auditory dorsal stream (or vice versa). In other words, does one pathway integrate cross-sensory spatial information more extensively? The evidence suggests some asymmetries in how visual vs. auditory dorsal regions handle crossmodal inputs:
-
Visual Dominance in Spatial Tasks: In many audiovisual spatial tasks, vision tends to dominate perception – a classic example being the ventriloquism effect (the visual “capture” of sound location) (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC). Behaviorally, humans rely more on the high-acuity visual spatial cues than auditory cues when both are present (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC). This dominance is reflected neurally: visual signals strongly influence activity in auditory dorsal areas (e.g. the attenuation of auditory spatial tuning by a visual stimulus in pST (An fMRI Study of the Ventriloquism Effect - PMC)). Recent studies even show that the visual bias in integration is stronger than predicted by optimal models, indicating a genuine sensory hierarchy favoring vision (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC). By contrast, the influence of auditory cues on early visual-dorsal areas is less pronounced in spatial paradigms – we do not typically see “auditory capture” of visual location in the same way. Thus, one could argue that the auditory dorsal stream is highly dependent on visual input to refine its spatial representations, whereas the visual dorsal stream is less altered by auditory input. This might give the impression that the visual stream is the more dominant multimodal player in spatial tasks.
-
Crossmodal Responses in Visual Areas vs. Auditory Areas: On the other hand, it is remarkable that a core visual motion area (hMT+) can encode auditory motion at all (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed) – suggesting the visual dorsal stream does accept non-visual input under certain conditions. hMT+ showed a partially shared coding for auditory and visual motion directions, implying a true crossmodal representation (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). By comparison, auditory-specific areas (like primary auditory cortex) generally do not respond to silent visual motion in the absence of training or plasticity. In fact, observing visual motion responses in auditory cortex usually requires special cases (e.g. early-blind individuals, where auditory areas become crossmodal (Early Blindness Results in Developmental Plasticity for Auditory …)). In normally sighted participants, auditory dorsal areas mainly show visual effects when the two stimuli are paired, not for visual stimuli alone. Visual dorsal areas like IPS or frontal eye fields, however, often activate during spatial attention regardless of whether cues are visual or auditory – reflecting a more inherent multimodal role. This suggests that dorsal visual regions may intrinsically handle multisensory spatial processing, whereas auditory cortical areas engage vision in a more context-driven way.
-
Stage of Integration: It might be that earlier stages of the visual dorsal stream (occipital motion areas) already exhibit crossmodal tuning (as shown in hMT+ (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed)), whereas early auditory dorsal stages (auditory belt areas) remain unimodal and only later (in parietal cortex) become crossmodal. This could mean the visual dorsal stream is carrying multimodal information sooner. Meanwhile, the auditory dorsal stream might rely on convergence in parietal cortex to achieve multimodal integration. In other words, the point of audiovisual convergence may occur at different levels: relatively early in the visual pathway vs. later in the auditory pathway.
In summary, both streams integrate multisensory spatial information, but their contributions are not necessarily equal. The visual dorsal stream provides the more precise spatial information and can even encode other modalities in visual coordinates, often leading the integration (consistent with strong visual capture effects (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC)). The auditory dorsal stream, for its part, clearly incorporates visual input (crucial for resolving sound location ambiguities) and shares a common parietal hub with vision (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed). Rather than saying one is definitively “more multimodal,” it is fair to say they complement each other: the visual dorsal network may supply a dominant spatial frame, while the auditory dorsal network aligns with it to yield a unified multisensory spatial percept. This interplay results in crossmodal representations in parietal and dorsal-frontal areas that reflect both auditory and visual cues.
Using MEG and RSA to Probe Multisensory Dorsal Representations
To understand how and where the dorsal streams integrate multisensory information, researchers have begun using techniques like magnetoencephalography (MEG) and electroencephalography (EEG) combined with Representational Similarity Analysis (RSA). These tools allow us to quantify and compare neural response patterns evoked by different modalities over time, offering insight into stable crossmodal representations in the brain.
Representational Similarity Analysis (RSA) is a multivariate approach that compares patterns of neural activity (or representations) across conditions. By correlating the representational geometry of responses to, say, visual vs. auditory stimuli, we can test if a brain region encodes some abstract property (like spatial location) similarly across modalities. This is well-suited to detect modality-invariant representations. For example, if a dorsal region truly encodes “location = left vs right” in a crossmodal way, the neural pattern difference between “left vs right” for vision should resemble that for audition, yielding a high similarity in the representational structure. RSA can capture this invariance even if the overall signals differ in amplitude or modality-specific aspects (Representational Similarity Analysis Reveals Commonalities and …).
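The left-versus-right logic above can be made concrete with a toy crossmodal RSA. The sketch below is an illustration on simulated data only: it assumes a hypothetical population of channels with Gaussian spatial tuning, gives the "auditory" responses a different gain and baseline than the "visual" ones, builds a correlation-distance dissimilarity matrix (RDM) per modality, and then correlates the two RDMs. The helper names (`simulate_patterns`, `rdm_upper`), the tuning model, and every numeric value are assumptions for illustration. Pearson correlation between RDMs is used for brevity; rank correlations are a common alternative.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(patterns):
    """Upper triangle of the correlation-distance RDM (1 - r) over conditions."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def simulate_patterns(locations, preferred, gain, baseline, noise_sd, rng):
    """One response pattern per location from Gaussian-tuned channels (toy model)."""
    width = 15.0  # hypothetical tuning width in degrees
    return [[gain * math.exp(-((loc - p) ** 2) / (2 * width ** 2))
             + baseline + rng.gauss(0.0, noise_sd)
             for p in preferred]
            for loc in locations]


rng = random.Random(0)
locations = [-30, -10, 10, 30]                       # stimulus azimuths (deg)
preferred = [-45 + 90 * k / 39 for k in range(40)]   # 40 channel preferences

# Same spatial code, but audition has half the gain and a different baseline.
visual = simulate_patterns(locations, preferred, 1.0, 0.0, 0.02, rng)
auditory = simulate_patterns(locations, preferred, 0.5, 2.0, 0.02, rng)

crossmodal_similarity = pearson(rdm_upper(visual), rdm_upper(auditory))
print(f"visual-auditory RDM correlation: {crossmodal_similarity:.2f}")
```

Because correlation distance ignores overall gain and baseline, the two RDMs agree closely even though the raw response amplitudes differ, which is the sense in which RSA can reveal a shared representational geometry despite modality-specific signal differences.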
Several studies highlight the value of RSA or similar multivariate methods in multisensory research:
-
EEG RSA for Audiovisual Timing: Cecere et al. (2017) used topographical RSA on high-density EEG to compare brain response patterns when either the auditory or visual input led in an asynchronous audiovisual pair (Being First Matters: Topographical Representational Similarity Analysis of ERP Signals Reveals Separate Networks for Audiovisual Temporal Binding Depending on the Leading Sense). They found that different networks were engaged depending on which modality came first, indicating distinct neural representations for the two temporal order conditions (Being First Matters: Topographical Representational Similarity Analysis of ERP Signals Reveals Separate Networks for Audiovisual Temporal Binding Depending on the Leading Sense). This demonstrates how RSA on fast EEG/MEG signals can reveal differences in crossmodal processing that may not be evident with univariate measures. Notably, their time-resolved RSA showed that audiovisual temporal binding involves separate circuits for audio-first vs. visual-first, implying the brain’s strategy for integrating signals can change with context.
-
MEG Multivariate Analysis for Spatial Integration: Park and Kayser (2019) recorded whole-brain MEG during an audio-visual ventriloquism task and used single-trial multivariate decoding to find where integrated spatial information was represented (Shared neural underpinnings of multisensory integration and trial-by-trial perceptual recalibration in humans | eLife). They observed that both temporal and parietal regions carried multisensory spatial “error” signals, but critically, only the medial superior parietal cortex (a dorsal parietal area) encoded both current audiovisual disparity and the recent past trial’s information, correlating with how perception recalibrated (Shared neural underpinnings of multisensory integration and trial-by-trial perceptual recalibration in humans | eLife). While they did not explicitly label this RSA, it is conceptually similar – identifying a parietal region that stably represents crossmodal spatial relationships over time. Their results point to parietal cortex as a site of perceptually relevant multisensory coding in the dorsal stream, with high temporal precision (MEG revealing that this coding occurs in specific time windows after stimulus onset).
-
fMRI RSA for Crossmodal Motion: In the hMT+ study mentioned earlier, RSA was employed to compare the neural representational spaces of visual vs. auditory motion directions. The authors reported that although overall activations differed, the relative pattern distinguishing motion directions in hMT+ had a significant correspondence between the two modalities (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). This crossmodal RSA finding is evidence of a modality-independent encoding of motion direction in a dorsal visual area. By using RSA, they could quantify the degree of overlap in representation (i.e. how similarly the same set of directions is encoded via vision and via audition), lending statistical support to the claim of shared coding. Such an approach is more sensitive to representational correspondence than univariate comparisons.
Applying RSA to MEG data is a powerful next step because it can track the dynamics of representational similarity. By computing RSA at each timepoint of MEG signals (or in sliding time windows), one can ask: when do auditory and visual representations become similar in dorsal stream regions? For instance, do parietal cortices show early divergence for different locations that later converge between modalities, indicating an emerging crossmodal code? Preliminary work with combined MEG-fMRI representational fusion suggests that dorsal stream dynamics can be mapped with such methods (Similarity-Based Fusion of MEG and fMRI Reveals Spatio-Temporal …). This approach can reveal stable crossmodal representations over time, meaning a representational format that persists and is reliable across milliseconds – a hallmark of true integration rather than transient co-activation.
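The time-resolved idea can be sketched as follows: compute one RDM per modality at every time sample and correlate the two, giving a time course of crossmodal similarity. This is a toy simulation, not an analysis of real MEG data. It assumes hypothetical "sensor" patterns driven by a shared Gaussian spatial code that switches on at an arbitrary sample (`onset`); all names and numbers are illustrative assumptions, and a real analysis would typically add trial averaging and cross-validation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(patterns):
    """Upper triangle of the correlation-distance RDM (1 - r) over conditions."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def sensor_patterns(locations, preferred, signal, noise_sd, rng):
    """One pattern per location: scaled spatial code plus sensor noise (toy model)."""
    width = 15.0  # hypothetical tuning width in degrees
    return [[signal * math.exp(-((loc - p) ** 2) / (2 * width ** 2))
             + rng.gauss(0.0, noise_sd)
             for p in preferred]
            for loc in locations]


rng = random.Random(1)
locations = [-30, -10, 10, 30]
preferred = [-45 + 90 * k / 39 for k in range(40)]
n_times = 20
onset = 8  # hypothetical sample at which the shared spatial code appears

timecourse = []
for t in range(n_times):
    signal = 1.0 if t >= onset else 0.0  # before onset: noise only
    visual = sensor_patterns(locations, preferred, signal, 0.02, rng)
    auditory = sensor_patterns(locations, preferred, 0.5 * signal, 0.02, rng)
    timecourse.append(pearson(rdm_upper(visual), rdm_upper(auditory)))

early = sum(timecourse[:onset]) / onset
late = sum(timecourse[onset:]) / (n_times - onset)
print(f"mean crossmodal RDM correlation: early={early:.2f}, late={late:.2f}")
```

In this simulation the RDM correlation hovers around chance before `onset` and stays high afterwards; a sustained plateau of this kind is what a "stable" crossmodal representation would look like in such a time course.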
In summary, MEG/EEG with RSA provides a novel window into multimodal processing in dorsal stream regions. It enables researchers to identify where in the brain and when in time the same spatial information is represented irrespective of whether it comes from sight or sound. This is crucial for confirming stable crossmodal coding (e.g., a location-specific pattern that generalizes across modality) as opposed to mere simultaneous activation. As this field progresses, we expect to see RSA pinpointing dorsal-stream regions (like IPS, superior parietal lobule, or frontal eye fields) that maintain invariant spatial representations that are used to guide behavior under multisensory conditions.
Novel or Understudied Dorsal Regions in Crossmodal Integration
Most studies of multisensory spatial integration have focused on known association areas (e.g., STS, IPS). However, there are dorsal stream regions that remain relatively understudied in terms of crossmodal processing. Identifying these can open new avenues:
-
Medial Superior Parietal Lobule (Precuneus): The precuneus (medial dorsal parietal cortex) is not typically highlighted in audiovisual integration, yet Park & Kayser (2019) found this area to be key for linking current sensory input with past input during ventriloquism adaptation (Shared neural underpinnings of multisensory integration and trial-by-trial perceptual recalibration in humans | eLife). The medial SPL encoded both auditory-visual spatial discrepancy and the subsequent recalibration, suggesting it forms a memory-like representation of multisensory spatial context (Shared neural underpinnings of multisensory integration and trial-by-trial perceptual recalibration in humans | eLife). This role in bridging time across trials indicates the precuneus might maintain a high-level supramodal spatial map that contributes to adjusting our internal model of space. Because most prior work emphasized lateral parietal (IPL/IPS), the medial parietal contribution is relatively novel. Future research could explore how precuneus handles crossmodal spatial information (for example, does it integrate visual, auditory, and even vestibular inputs into a stable sense of position/orientation?).
-
Dorsal Frontal Areas: Regions such as the Frontal Eye Fields (FEF) and the dorsolateral prefrontal cortex (DLPFC) lie at the downstream end of the dorsal visual pathway (involved in visuospatial working memory and gaze control). These areas likely receive convergent inputs from multisensory parietal regions. However, their role in crossmodal integration is not well characterized. It is plausible that FEF, for instance, could integrate auditory spatial cues when planning eye movements towards a sound, effectively acting as a multimodal orienting center. Similarly, DLPFC might maintain abstract spatial goals or rules that apply across modalities. These dorsal-frontal regions have been extensively studied for vision (and to some extent for auditory working memory), but direct evidence of audio-visual representational overlap in them is scarce. Examining them with RSA could reveal whether they encode location or motion in a common format (e.g., an intended target location, regardless of cue modality).
-
Secondary Visual Dorsal Areas: Beyond hMT+, the visual dorsal stream includes areas like V3A, V7, and MST in the dorsal occipital and parietal cortex. Some of these (e.g., area MST) specialize in processing complex motion (including optic flow). It’s worth asking if these areas also respond to auditory motion or spatial cues. So far, hMT+ has been demonstrated to be multimodal (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed), but what about its neighbors? There is some evidence from animal studies that neurons in dorsal visual areas (like the dorsal medial superior temporal area) can align with auditory space, especially in tasks requiring coordination between senses (Neuroimaging investigations of dorsal stream processing and effects of stimulus synchrony in schizophrenia - PMC). However, systematic human data are sparse. These higher dorsal visual areas might be an unexplored substrate for crossmodal motion integration (for example, MST could integrate vestibular and auditory motion with visual motion to support navigation).
-
Auditory-Dorsal Transitional Zones: Within the auditory dorsal stream, the zone at the interface of temporal and parietal lobes – sometimes called the Temporoparietal junction (TPJ) or specifically area TAE/Tpt – is thought to be a crucial hub for multisensory integration of spatial info. Monkey studies (e.g., Bruce, Andersen, & Colby) identified neurons in these areas that respond to auditory and visual stimuli in a common reference frame (Neuroimaging investigations of dorsal stream processing and effects of stimulus synchrony in schizophrenia - PMC). In humans, this might correspond to the border of the planum temporale and IPL. While ventriloquism studies implicate planum temporale in integration, the full extent of this auditory-visual convergent zone is not well mapped. It could harbor neurons that convert auditory coordinates to visual ones or vice versa. Novel imaging paradigms that separately localize these regions (e.g., using spatially combined stimuli in fMRI/MEG) could shed light on how auditory spatial streams feed into parietal cortex alongside visual streams. Essentially, the “handoff” point between auditory cortex and parietal cortex is an underexplored region for crossmodal coding.
Identifying and studying these less-examined regions is important because they might reveal new principles of multisensory dorsal processing. For instance, discovering a crossmodal code in precuneus could inform us about how the brain maintains a stable spatial environment model; finding audiovisual integration in FEF would link directly to action (gaze shifts); and characterizing MST/TPJ involvement could unify understanding across species. Each of these regions represents a node in the dorsal network that could contribute to crossmodal spatial behavior in ways we are only beginning to appreciate.
Novelty and Significance of This Research Direction
Investigating multimodal integration in the dorsal streams, especially using modern techniques like RSA with MEG/EEG, is a relatively new and cutting-edge direction. Traditionally, research on multisensory integration often centered on mid-level convergence zones (like the superior colliculus or STS) and on subcortical or ventral-stream processes. The notion that dorsal “where” pathways are themselves integrative has only gained traction in recent years. Key reasons this direction is novel and significant include:
-
Challenging Classic Models: The classic two-stream model from the 1980s assumed strictly segregated pathways (visual vs. auditory, ventral vs. dorsal) for different functions (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). Demonstrating that dorsal stream areas encode information from multiple senses challenges this view and calls for an updated framework where spatial processing is inherently multisensory. This is a significant paradigm shift. As Rauschecker (2011) argued, the dorsal auditory stream’s function “generalizes” beyond audition to a more sensorimotor integration role (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). Our research direction builds on this idea, providing empirical tests of crossmodal encoding in dorsal areas.
-
Limited Prior Studies: While we have highlighted some pioneering studies (2014–2020) showing dorsal multimodal integration (e.g., ventriloquism fMRI, IPS integration, hMT+ crossmodal response, EEG RSA on AV timing), the literature is still relatively sparse. Each study typically examined a specific piece of the puzzle. No comprehensive picture yet exists of how both dorsal streams together form a multisensory network. For example, only one or two studies have explicitly used RSA to demonstrate shared representations across modalities in dorsal areas (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). Our approach – using MEG with RSA to track stable crossmodal representations over time – would be among the first to do so in a full-brain, time-resolved manner. This represents a methodological advancement and a novel application of RSA in the multisensory domain, as most RSA work so far has been fMRI-based or within one modality.
-
Focus on Artificial Motion Stimuli: By concentrating on simple, artificial stimuli (like moving dots or noise bursts), this research avoids confounds of object recognition and semantic processing, which often engage ventral streams. Surprisingly few studies have systematically done this across both modalities. The 2020 Current Biology study by Rezk et al. is a rare example focusing on non-semantic motion in both senses (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed). Our study’s emphasis on such stimuli to isolate dorsal processing is relatively novel and should yield cleaner evidence of dorsal-stream integration. It also means any crossmodal effects we observe can be attributed more confidently to spatial/motion processing networks than to high-level cognitive factors.
-
Temporal Dynamics and “Stable” Representations: Most existing evidence for dorsal multimodality comes from fMRI, whose slow hemodynamic signal cannot resolve processing on a millisecond scale. By leveraging MEG, we introduce a new angle: examining the temporal evolution of multisensory integration. This could reveal, for instance, how quickly the brain fuses auditory and visual spatial information and whether the integration is sustained or transient. Observing a stable crossmodal representation means seeing consistent patterns or decoding performance that persists across a time window for both modalities – something that, to our knowledge, has rarely been shown with high temporal resolution. This dynamic perspective is largely novel.
-
Interdisciplinary Impact: Establishing the dorsal streams as multimodal has implications beyond basic neuroscience. It can inform theories in psychology (e.g., how attention selects information across senses), improve computational models of multisensory integration (by adding constraints from neural data, such as reliability-weighted encoding in IPS (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed)), and even guide clinical research (for instance, understanding dorsal-stream multisensory deficits in conditions like dyslexia or autism, where sensory integration or spatial orientation is atypical). It also links to technology – for example, AR/VR systems or assistive devices can be designed to align with how dorsal pathways combine sights and sounds.
Overall, while the idea of a “multisensory dorsal stream” is emerging, it is still a novel frontier. Our proposed research will significantly extend current knowledge by mapping where, when, and how auditory and visual spatial information coalesce in the brain’s dorsal networks. By quantifying crossmodal representational overlap, we move from simply saying “both modalities activate region X” to demonstrating that region X encodes a common spatial representation across modalities. This level of analysis is cutting-edge and positions our work at the forefront of multisensory cognitive neuroscience.
Dorsal Stream Integration with Motor Systems (Future Directions)
Figure (from An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC): Illustration of dual-pathway models highlighting dorsal stream (red) connections into parietal (IPS/IPL) and frontal premotor areas. In both non-human primates (A) and humans (B), the dorsal auditory pathway (red) projects through parietal regions and interfaces with frontal motor regions (premotor cortex, labeled PMC), supporting sensorimotor integration (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). The ventral pathway (green) projects into inferior frontal areas (IFC) for object recognition. Dashed arrows indicate efference copies from frontal motor areas to parietal cortex, enabling comparison of motor commands with sensory feedback (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). This reflects a proposed general role of dorsal streams in coordinating perception and action.
While our primary focus is on sensory integration within the dorsal streams, an exciting future direction is to explore how these multisensory dorsal pathways interact with motor cortex and action planning. The dorsal streams are often described as “perception-for-action” pathways – in vision, the dorsal stream is critical for guiding movements (visuomotor control), and in audition, the dorsal stream has been linked to sensorimotor functions like vocal articulation and sound localization by head movements (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC). Key ideas and findings that motivate integrating the motor aspect include:
- Sensorimotor Loop in the Dorsal Stream: The dorsal stream does not stop at the parietal cortex; it extends into frontal premotor areas, and parietal and frontal regions form loops that coordinate actions based on sensory input. Rauschecker and Scott (2009) proposed that, in the auditory dorsal stream, the inferior parietal lobule (IPL) compares sensory input with an efference copy of motor plans from frontal cortex (Rauschecker, 2011). In effect, the dorsal stream can act as a forward model that predicts the expected sensory consequences of a movement and matches them against the actual input. For example, when you plan a saccade or a reach, a copy of that motor command could be sent to parietal areas to update the expected locations of objects or sounds. This concept generalizes across modalities and is a core sensorimotor integration function of the dorsal pathway (Rauschecker, 2011).
- Auditory–Motor Integration: In humans, the auditory dorsal stream (often linked to the “phonological loop” or articulatory loop in language) connects directly to motor speech areas. A region in the Sylvian parietal-temporal area (often termed Spt) has been found to interface between auditory representations and speech motor programs. Lip-reading studies show that watching mouth movements can suppress activity in auditory cortex via a motor prediction signal (Rauschecker, 2011), which is evidence of a crossmodal predictive mechanism. This suppression (a reduction of the MEG N100m response) occurred whether subjects were silently articulating or just lip-reading, consistent with an efference copy sent from motor speech areas to auditory areas. In other words, the dorsal stream uses visual input (lip movements) to inform auditory processing in a way that is linked to the motor system for speech production (Rauschecker, 2011). Analogous interactions might be found in spatial processing: for example, seeing a flash might prepare a motor orienting response that in turn influences auditory cortex activity.
- Orienting Movements: For spatial tasks, the end goal is often an action, such as turning the head toward a sound, reaching for a seen object, or navigating around an obstacle. The dorsal spatial stream sends its output toward motor effectors to accomplish this, so the integration of auditory and visual space likely continues into motor planning regions. For example, the superior colliculus (midbrain) receives inputs from both the visual and auditory dorsal streams and triggers eye-head orienting movements. At the cortical level, one could investigate whether dorsal premotor cortex contains neurons that fire for a target location regardless of whether it is seen or heard, which would amount to a multimodal motor plan. This has implications for understanding how the brain achieves coordinated responses to multisensory stimuli (like catching a ball you both see and hear whooshing).
- Future Experiments: Future experiments could test how motor preparation or execution modulates the dorsal stream’s multisensory representations. For instance, one might use MEG/EEG to test whether preparing to point to a target strengthens the similarity of auditory and visual target representations in parietal cortex (which would indicate that the motor system “binds” them together). Another angle is to train participants on audio-visual-motor tasks (e.g., hit a button where a sound appears, with or without a light) and test whether dorsal stream representations become more unified, or more predictive of motor output, after training. This could connect to the idea of dorsal stream plasticity and learning.
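The first experiment suggested above (does preparing to point strengthen auditory-visual similarity in parietal cortex?) comes down to a paired comparison per participant. One simple way to test it, offered here only as a sketch, is a sign-flip permutation test on the per-participant difference in crossmodal similarity between a motor-preparation and a passive condition. The numbers below are made-up placeholders, not results, and the function name, the one-sided test and the number of permutations are our assumptions.

```python
import random
import statistics

def sign_flip_test(differences, n_permutations=5000, seed=0):
    """One-sided sign-flip permutation test on paired differences.

    Null hypothesis: the differences are symmetric around zero, so each
    participant's difference can have its sign flipped at random.
    Returns (observed mean difference, p-value).
    """
    rng = random.Random(seed)
    observed = statistics.mean(differences)
    count = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        if statistics.mean(flipped) >= observed:
            count += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return observed, (count + 1) / (n_permutations + 1)

# Placeholder values (NOT real data): crossmodal RDM correlation in a
# hypothetical parietal region for 12 participants, per condition.
passive = [0.10, 0.05, 0.12, 0.08, 0.02, 0.15, 0.07, 0.09, 0.11, 0.04, 0.06, 0.13]
prepare = [0.18, 0.09, 0.15, 0.16, 0.06, 0.19, 0.10, 0.17, 0.12, 0.11, 0.09, 0.20]

differences = [p - q for p, q in zip(prepare, passive)]
observed, p_value = sign_flip_test(differences)
print(f"mean increase = {observed:.3f}, one-sided p = {p_value:.4f}")
```

The permutation approach avoids assuming that similarity values are normally distributed, which is rarely guaranteed for correlation-based measures. The same test could be reused for the training experiment by comparing pre- and post-training similarity.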
In summary, while motor integration is not the core focus of the current study, it is an important extension. The dorsal streams are inherently positioned at the interface of sensing and acting. They likely provide the multisensory information that the motor cortex needs to generate appropriate responses. Our research on audiovisual integration in the dorsal streams will lay the groundwork, and then future studies can ask how these integrated representations are utilized by (or even shaped by) the motor system to guide behavior. As one review succinctly put it, the dorsal stream “plays a more general role in sensorimotor integration and control” (Rauschecker, 2011); understanding the sensory part now will enable us to tackle the sensorimotor loop next.
Key Research Questions
Based on the above synthesis, we can formulate several specific research questions that this study aims to address:
- Supramodal Spatial Coding: Do dorsal stream regions (e.g. IPS or SPL) encode spatial locations in a modality-independent way? For instance, is there a common neural pattern for “left vs. right” location whether the stimulus is visual or auditory, indicating a supramodal representation of space?
- Comparative Multisensory Integration: Are there differences between the visual and auditory dorsal streams in the extent or timing of multisensory integration? For example, does the visual dorsal stream integrate auditory input earlier or more robustly than the auditory stream integrates visual input, or vice versa? This question examines whether one stream is functionally “more multimodal” than the other in terms of neural coding or influence.
- Temporal Dynamics: How do crossmodal spatial representations emerge and evolve over time in the dorsal pathways? Using MEG, we can ask: at what post-stimulus latency do patterns for auditory and visual stimuli become similar in parietal cortex? Is there a specific time window (e.g., around 100–200 ms) where audiovisual convergence occurs in the dorsal stream, and does this timing differ across regions (occipital vs parietal vs frontal)?
- Stable vs. Transient Representations: Are the multisensory spatial representations in dorsal regions stable over time (e.g., maintained throughout a trial or delay) or are they transient reactions to stimulus onset? This question relates to whether the dorsal stream can hold an integrated representation (important for tasks like memory-guided localization or delayed responses) and can be probed by RSA during sustained periods or across sequential stimuli.
- Regional Specificity: Which specific sub-regions of the dorsal streams are involved in crossmodal integration? Do we find that certain areas (e.g., hMT+, lateral intraparietal cortex, precuneus, frontal eye fields) show significant crossmodal representational similarity, whereas others remain modality-specific? Mapping this would tell us if integration is widespread along the dorsal pathway or concentrated in particular “hub” areas.
- Behavioral Correlates: How do the neural measures of multisensory integration (such as representational similarity or decoding accuracy across modalities) relate to behavioral outcomes? For example, if a participant shows a stronger ventriloquist effect (visual bias on sound localization), do they also show higher crossmodal similarity in their IPS representations? This question connects neural integration to perceptual experience and performance.
- Attention and Task Modulation: Does directing attention or changing task goals affect the crossmodal integration in dorsal stream regions? Prior work suggests attention can modulate integration (Reliability-Weighted Integration of Audiovisual Signals Can Be Modulated by Top-down Attention - PubMed). We can ask if making participants attend to one modality or requiring an overt motor response will increase or decrease the observed crossmodal representational similarity in dorsal areas. This addresses the flexibility of dorsal stream integration under different cognitive states.
- Extension to Sensorimotor Integration: (Looking ahead) Do the multisensory representations in the dorsal stream interface with motor planning in a meaningful way? For instance, if we include a motor response, does the motor cortex or premotor cortex start to reflect the integrated spatial representation present in parietal cortex? This bridges into the future direction of how dorsal stream integration informs action.
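The questions on supramodal coding, temporal dynamics and stable versus transient representations can all be approached with one analysis: train a classifier on left/right location in one modality at each time point, then test it on the other modality at every time point (cross-modal temporal generalization). The block below is a minimal sketch of that idea under stated assumptions. The data are simulated so that both modalities share one location pattern by construction, the trial counts, time bins and channel pattern are hypothetical, and a nearest-centroid classifier stands in for whatever classifier a real MEG pipeline would use.

```python
import random

random.seed(1)
N_TIMES, N_CHANNELS, N_TRIALS = 6, 8, 100
SIGNAL_WINDOW = range(2, 5)          # hypothetical time bins carrying location info
CODE = [1, -1, 1, 1, -1, -1, 1, -1]  # hypothetical shared left/right channel pattern

def simulate(n_trials):
    """Return (data, labels): data[trial][time][channel]; label +1 = left, -1 = right."""
    data, labels = [], []
    for trial in range(n_trials):
        label = 1 if trial % 2 == 0 else -1
        epoch = []
        for t in range(N_TIMES):
            gain = label if t in SIGNAL_WINDOW else 0  # no location signal outside window
            epoch.append([gain * c + random.gauss(0, 1) for c in CODE])
        data.append(epoch)
        labels.append(label)
    return data, labels

def centroid(data, labels, label, t):
    """Mean channel pattern at time t over all trials with the given label."""
    rows = [trial[t] for trial, lab in zip(data, labels) if lab == label]
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cross_decode(train, train_labels, test, test_labels):
    """Accuracy matrix acc[train_time][test_time], nearest-centroid classifier."""
    acc = []
    for t_train in range(N_TIMES):
        left = centroid(train, train_labels, 1, t_train)
        right = centroid(train, train_labels, -1, t_train)
        row = []
        for t_test in range(N_TIMES):
            hits = 0
            for trial, lab in zip(test, test_labels):
                x = trial[t_test]
                pred = 1 if sq_dist(x, left) < sq_dist(x, right) else -1
                hits += pred == lab
            row.append(hits / len(test))
        acc.append(row)
    return acc

# Train on simulated "visual" trials, test on simulated "auditory" trials.
visual, visual_labels = simulate(N_TRIALS)
auditory, auditory_labels = simulate(N_TRIALS)
acc = cross_decode(visual, visual_labels, auditory, auditory_labels)
for row in acc:
    print(" ".join(f"{a:.2f}" for a in row))
```

Above-chance accuracy when training on one modality and testing on the other is the signature of a supramodal code. The latency at which it first appears addresses the timing question, and the shape of the matrix addresses stability: a sustained code generalizes across train and test times within the window, whereas a transient, time-varying code would decode only near the diagonal.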
By investigating these questions, the study will clarify whether and how the brain’s dorsal “where” pathways create unified spatial maps from multiple sensory inputs, and it will shed light on the novelty and breadth of multisensory processing in regions traditionally viewed as unisensory. This contributes fundamentally to our understanding of perceptual organization and its link to action.
References:
- Benevento et al. (1977) – Early evidence of polysensory neurons in STP (monkey).
- Andersen et al. (1997, 2009) – Multisensory integration in parietal reach regions (animal studies).
- Murray et al. (2005); Schroeder & Foxe (2005) – Crossmodal influences on early sensory ERPs (human EEG/MEG).
- Kayser & Logothetis (2009) – Audiovisual feedforward and feedback connectivity in primate cortex.
- Stephen et al. (2010) – MEG evidence of multisensory interactions in dorsal stream regions.
- Butler & Javitt (2005) – Review on dorsal stream deficits (schizophrenia) and multisensory context.
- Rohe & Noppeney (2015, 2016) – fMRI studies showing parietal cortex performs multisensory integration and causal inference (ventriloquism paradigm) (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC).
- Callan et al. (2015) – fMRI ventriloquism study: visual modulation of auditory spatial coding in planum temporale (An fMRI Study of the Ventriloquism Effect - PMC).
- Bonath et al. (2014) – Neuroimaging separating auditory regions for congruent vs. incongruent AV stimuli (ventriloquism).
- Cecere et al. (2017) – EEG RSA study demonstrating distinct networks for AV temporal order (audio-first vs visual-first) (Being First Matters: Topographical Representational Similarity Analysis of ERP Signals Reveals Separate Networks for Audiovisual Temporal Binding Depending on the Leading Sense).
- Rezk et al. (2020) – fMRI MVPA/RSA study showing hMT+ represents both visual and auditory motion directions (Curr. Biol.) (Shared Representation of Visual and Auditory Motion Directions in the Human Middle-Temporal Cortex - PubMed).
- Park & Kayser (2019) – MEG study on shared parietal mechanisms for integration and trial-by-trial recalibration (eLife) (Shared neural underpinnings of multisensory integration and trial-by-trial perceptual recalibration in humans | eLife).
- Bruns (2019) – Review on ventriloquism effect as multisensory tool, covering new findings (Front. in Integrative Neurosci.) (The Ventriloquist Illusion as a Tool to Study Multisensory Processing: An Update - PMC).
- Rauschecker (2011) – “Expanded dorsal auditory pathway” review, proposing sensorimotor integration role (An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration - PMC).
- Others as cited in text (References [31–35] in Bruns’ review table; and relevant portions of the above-cited articles).
Created: 10-03-25 09:29