IEEE ICASSP 2024

The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. IEEE ICASSP 2024 will feature presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

EFFICIENT VIDEO AND AUDIO PROCESSING WITH LOIHI 2

Loihi 2 is a fully event-based neuromorphic processor that supports a wide range of synaptic connectivity configurations and temporal neuron dynamics. Its temporal, event-based paradigm is naturally well suited to processing data from an event-based sensor, such as a Dynamic Vision Sensor (DVS) or a Silicon Cochlea. This raises the question: how general are the signal processing efficiency gains of Loihi 2 over conventional computer architectures?

ICASSP Presentation.pptx

Categories: Other

BNMTRANS: A BRAIN NETWORK SEQUENCE-DRIVEN MANIFOLD-BASED TRANSFORMER FOR COGNITIVE IMPAIRMENT DETECTION USING EEG

Early identification of mild cognitive impairment (MCI) is crucial for the prevention of Alzheimer’s disease. As neurodegenerative diseases progress, the synchronous activity observed in electroencephalography (EEG), which indicates functional connectivity, ...

ICASSP2024_ID7226_Presentation.pptx

Categories: Biomedical signal processing

Towards ASR robust spoken language understanding through in-context learning with word confusion networks

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world scenarios, prior to input into an LLM, an automated speech recognition (ASR) system generates an output transcript hypothesis, where inherent errors can degrade subsequent SLU tasks.

asr_wcn_kpever.pptx

Categories: Spoken Language Processing

Common-slope modeling of late reverberation

The decaying sound field in rooms is typically described by energy decay functions (EDFs). Late reverberation can deviate considerably from the ideal diffuse field, for example, in multiple connected rooms or non-uniform absorption material distributions. This paper proposes the common-slope model of late reverberation. The model describes spatial and directional late reverberation as linear combinations of exponential decays called common slopes.
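The idea of a linear combination of shared exponential decays can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `common_slope_edf`, the amplitude values, and the convention that a decay time `T` corresponds to a 60 dB energy drop are all assumptions made for the example, and the noise term is omitted. The decay times are common to the whole room, while only the amplitudes change with position or direction.

```python
import math

# Energy falls by 60 dB (a factor of 1e6) after one decay time T (assumed convention).
LN_60DB = math.log(1e6)

def common_slope_edf(t, amplitudes, decay_times):
    """Energy decay value at time t as a sum of exponential decays.

    decay_times: decay times shared by every position (the "common slopes").
    amplitudes:  position- or direction-dependent weight of each slope.
    """
    return sum(a * math.exp(-LN_60DB * t / T)
               for a, T in zip(amplitudes, decay_times))

# Two hypothetical receiver positions sharing the same two decay times.
decay_times = [0.4, 1.6]  # seconds, illustrative values
positions = {
    "near_source": [1.0, 0.1],
    "coupled_room": [0.2, 0.8],
}

for name, amps in positions.items():
    curve_db = [10 * math.log10(common_slope_edf(t, amps, decay_times))
                for t in (0.0, 0.5, 1.0)]
    print(name, [round(v, 1) for v in curve_db])
```

Because the slopes are shared, describing a new position only requires a new amplitude vector rather than a new set of decay times.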

ICASSP_Goetz_CommonSlope_Poster.pdf

Categories: Room Acoustics and Acoustic System Modeling

VOXTLM: UNIFIED DECODER-ONLY MODELS FOR CONSOLIDATING SPEECH RECOGNITION, SYNTHESIS AND SPEECH, TEXT CONTINUATION TASKS

We propose a decoder-only language model, VoxtLM, that can perform four tasks: speech recognition, speech synthesis, text generation, and speech continuation. VoxtLM integrates text vocabulary with discrete speech tokens from self-supervised speech features and uses special tokens to enable multitask learning. Compared to a single-task model, VoxtLM exhibits a significant improvement in speech synthesis, with improvements in both speech intelligibility from 28.9 to 5.6 and objective quality from 2.68 to 3.90.

voxtlm_icassp2024_presentation.pptx

Categories: Language Modeling, for Speech and SLP (SLP-LANG); Speech Synthesis and Generation, including TTS (SPE-SYNT)

VOXTLM: UNIFIED DECODER-ONLY MODELS FOR CONSOLIDATING SPEECH RECOGNITION, SYNTHESIS AND SPEECH, TEXT CONTINUATION TASKS

Presentation.pptx

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT); Language Modeling, for Speech and SLP (SLP-LANG)

https://sigport.org/events/documents/ieee-icassp-2024

Presentation.pptx

Categories: Audio and Acoustic Signal Processing

MLSP-L18.6 presentation

Efficient training of large-scale graph neural networks (GNNs) has been studied with a specific focus on reducing their memory consumption. Liu et al. (2022) proposed extreme activation compression (EXACT), which demonstrated a drastic reduction in memory consumption by quantizing the intermediate activation maps down to INT2 precision. They showed little to no reduction in performance while achieving large reductions in GPU memory consumption.
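To make the memory argument concrete, the sketch below quantizes a small list of activation values to 2-bit codes and reconstructs them. It is a deliberately simplified assumption, not the EXACT method itself: it uses plain min-max uniform quantization with deterministic rounding, and the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(values, bits=2):
    """Map floats to integer codes in [0, 2**bits - 1] using min-max scaling."""
    levels = (1 << bits) - 1          # 3 steps between 4 levels for INT2
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

# Illustrative activation values (not taken from any experiment).
activations = [-1.0, -0.2, 0.1, 0.7, 2.0]
codes, lo, scale = quantize(activations, bits=2)
restored = dequantize(codes, lo, scale)

print(codes)     # each code fits in 2 bits instead of a 32-bit float
print(restored)  # coarse approximation kept for the backward pass
```

Only the codes plus one offset and one scale need to be stored per block, which is where the memory saving comes from; the reconstruction error per value is bounded by half a quantization step.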

ICASSP24___oral.pdf

Categories: Neural network learning (MLR-NNLR)

Covariance Matrix Recovery From One-Bit Data With Non-Zero Quantization Thresholds: Algorithm and Performance Analysis

Covariance matrix recovery is a topic of great significance in the field of one-bit signal processing and has numerous practical applications. Despite its importance, the conventional arcsine law with zero threshold is incapable of recovering the diagonal elements of the covariance matrix. To address this limitation, recent studies have proposed the use of non-zero clipping thresholds. However, the relationship between the estimation error and the sampling threshold is not yet known.
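The limitation of the zero-threshold arcsine law can be illustrated with a small simulation. For zero-mean jointly Gaussian signals with correlation coefficient rho, the expected product of their signs is (2/pi) * arcsin(rho), so rho can be estimated from one-bit data, but the signal variances cannot, because rescaling a signal leaves its signs unchanged. The sketch below is an illustration of that classical relation only, not the algorithm proposed in the paper; the helper names and the sample size are assumptions.

```python
import math
import random

def correlated_gaussians(rho, n, seed=0):
    """Draw n pairs of zero-mean, unit-variance Gaussians with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(u)
        ys.append(rho * u + math.sqrt(1 - rho ** 2) * v)
    return xs, ys

def one_bit_correlation(x, y):
    """Estimate the correlation coefficient from signs only (arcsine law)."""
    sign = lambda a: 1 if a >= 0 else -1
    mean_sign_product = sum(sign(a) * sign(b) for a, b in zip(x, y)) / len(x)
    return math.sin(math.pi / 2 * mean_sign_product)

xs, ys = correlated_gaussians(rho=0.6, n=50000)
estimate = one_bit_correlation(xs, ys)

# Scaling x by 10 changes its variance by 100 but not its signs,
# so the one-bit estimate is identical: the diagonal is invisible.
scaled_estimate = one_bit_correlation([10 * a for a in xs], ys)
print(estimate, scaled_estimate)
```

With a non-zero threshold the comparison outcome does depend on the signal scale, which is the information that threshold-based schemes exploit to recover the diagonal elements.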

presentation_ICASSP2024.pdf

Categories: Signal Processing Theory and Methods

Binary Signal Alignment: Optimal Solution is Polynomial-time and Linear-time Solution is Quasi-optimal

In this paper, we revisit a recently proposed underlay communication scheme that relies on repetition of the secondary signal at the transmitter and canonical correlation analysis (CCA) at the (multi-antenna) receiver. In this setting, CCA can provably extract the underlay signal in the presence of potentially strong and time-varying primary interference, without any channel knowledge.

ICASSP_2024_presentation.pdf

Categories: Communication Systems and Applications
