IEEE ICASSP 2024

The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The IEEE ICASSP 2024 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

SPASE: SPAtial Saliency Explanation for time series models

Recent advances in Machine Learning (ML), Deep Learning (DL), and Artificial Intelligence (AI) have produced models that are increasingly complex and large in both architecture and parameter count. These complex ML/DL models have surpassed the previous state of the art in most fields of computer science, such as computer vision, NLP, tabular data prediction, and time series forecasting. As model performance increases, explainability and interpretability have become essential for explaining and justifying model outcomes, especially in business use cases.

ICASSP_SPASE.pdf

Paper pre-print (27)

Categories:: Pattern recognition and classification (MLR-PATT)

34 Views

HEAR-YOUR-ACTION: HUMAN ACTION RECOGNITION BY ULTRASOUND ACTIVE SENSING

Action recognition is a key technology for many industrial applications. Methods that use visual information such as images are very popular. However, privacy concerns prevent their widespread use, because images include private information, such as visible faces and scene backgrounds, that is not necessary for recognizing user actions. In this paper, we propose a privacy-preserving action recognition method based on ultrasound active sensing.

ICASSP2024_poster_tanigawa.pdf (41)

Categories:: Applications in Music and Audio Processing (MLR-MUSI)

45 Views

Embedded Feature Similarity Optimization with Specific Parameter Initialization for 2D/3D Medical Image Registration

We present a novel deep learning-based framework, Embedded Feature Similarity Optimization with Specific Parameter Initialization (SOPI), for 2D/3D medical image registration, a highly challenging problem because of difficulties such as dimensional mismatch, heavy computational load, and the lack of a gold standard for evaluation. The framework includes a parameter specification module that efficiently chooses the initial pose parameters and a fine-registration module that aligns the images.

poster.pdf

BISP-P10.1 poster (43)

Categories:: Medical image analysis

61 Views

EMORED: A DATASET FOR RELATION EXTRACTION IN TEXTS WITH EMOTICONS

Relation extraction (RE) is a vital task within natural language processing. Previous works predominantly focus on extracting relations from plain text. However, with the evolution of communication habits, many individuals employ symbolic representations, e.g. emoticons, to convey nuanced information. This shift in communication prompts a pertinent question: How do emoticons impact the performance of RE models?

poster_icassp.pdf (21)

Categories:: Spoken language resources and annotation (SLP-REAN)

16 Views

EFFICIENT FUSION OF DEPTH INFORMATION FOR DEFOCUS DEBLURRING

Defocus deblurring is a classic problem in image restoration. The formation of defocus blur is related to depth. Recently, dual-pixel sensors, designed around depth-disparity characteristics, have brought great improvements to the defocus deblurring task. However, the difficulty of acquiring dual-pixel images in real time complicates algorithm deployment. This motivates us to remove defocus blur from a single image using depth information.

icassp2024-Efficient_Fusion_of_Depth_Information_for_Defocus_Deblurring.pdf

paper of EFFICIENT FUSION OF DEPTH INFORMATION FOR DEFOCUS DEBLURRING (59)

Categories:: Image/Video Storage, Retrieval

95 Views

ADAPTIVE PROMPT CONSTRUCTION METHOD FOR RELATION EXTRACTION

Prompt learning was proposed to solve the problem of inconsistency between upstream and downstream tasks and has achieved state-of-the-art (SOTA) results in various Natural Language Processing (NLP) tasks. However, Relation Extraction (RE) is more complex than other text classification tasks, which makes it more difficult to manually design a suitable prompt template for each dataset. To solve this issue, we propose an Adaptive Prompt Construction method (APC) for relation extraction.

Chen 等 - 2024 - Adaptive Prompt Construction Method for Relation E.pdf (56)

Categories:: Other

86 Views

Improving Cross-domain Few-shot Classification with Multilayer Perceptron

Cross-domain few-shot classification (CDFSC) is a challenging task due to the significant distribution discrepancies across different domains. To address this challenge, many approaches aim to learn transferable representations. The multilayer perceptron (MLP) has shown its capability to learn transferable representations in various downstream tasks, such as unsupervised image classification and supervised concept generalization. However, its potential in few-shot settings has yet to be comprehensively explored.

1966_poster.pdf

vertical version of the poster (70)

1966_poster_horizontal-version.pdf

horizontal version of the poster (39)

Categories:: Other applications of machine learning (MLR-APPL)

121 Views

HIM: DISCOVERING IMPLICIT RELATIONSHIPS IN HETEROGENEOUS SOCIAL NETWORKS

To date, research on relation mining has typically focused on analyzing explicit relationships between entities, while ignoring the underlying connections between entities, known as implicit relationships. Exploring implicit relationships can reveal more about social dynamics and potential relationships in heterogeneous social networks and thus better explain complex social behaviors. The research presented in this paper explores methods for discovering implicit relationships in heterogeneous social networks.

HIM-Supplementary.pdf

Supplementary materials for the paper "HIM: DISCOVERING IMPLICIT RELATIONSHIPS IN HETEROGENEOUS SOCIAL NETWORKS" (49)

Xu.pdf

Paper for the paper "HIM: DISCOVERING IMPLICIT RELATIONSHIPS IN HETEROGENEOUS SOCIAL NETWORKS" (11)

ICASSP-HIM.pptx (17)

Categories:: Knowledge and Data Engineering, Other

119 Views

INVERTIBLE VOICE CONVERSION WITH PARALLEL DATA

This paper introduces an innovative deep learning framework for parallel voice conversion to mitigate inherent risks associated with such systems. Our approach focuses on developing an invertible model capable of countering potential spoofing threats. Specifically, we present a conversion model that allows for the retrieval of source voices, thereby facilitating the identification of the source speaker. This framework is constructed using a series of invertible modules composed of affine coupling layers to ensure the reversibility of the conversion process.
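The reversibility that affine coupling layers provide can be illustrated with a minimal NumPy sketch. This is not the authors' model: the linear conditioners `w_s` and `w_t`, the `tanh` bound on the log-scale, and the 8-dimensional feature vector are all assumptions chosen for illustration. The point is only that the untouched half of the input determines the scale and shift, so the transformed half can be recovered exactly.

```python
import numpy as np

def coupling_forward(x, w_s, w_t):
    """Affine coupling: pass the first half through, scale and shift the second half."""
    x_a, x_b = np.split(x, 2)
    log_s = np.tanh(w_s @ x_a)  # hypothetical conditioner for the log-scale
    t = w_t @ x_a               # hypothetical conditioner for the shift
    y_b = x_b * np.exp(log_s) + t
    return np.concatenate([x_a, y_b])

def coupling_inverse(y, w_s, w_t):
    """Exact inverse: recompute scale and shift from the unchanged first half."""
    y_a, y_b = np.split(y, 2)
    log_s = np.tanh(w_s @ y_a)
    t = w_t @ y_a
    x_b = (y_b - t) * np.exp(-log_s)
    return np.concatenate([y_a, x_b])

rng = np.random.default_rng(0)
dim = 8  # assumed feature dimension (must be even)
w_s = rng.normal(size=(dim // 2, dim // 2))
w_t = rng.normal(size=(dim // 2, dim // 2))

source = rng.normal(size=dim)                      # stands in for a source-voice feature frame
converted = coupling_forward(source, w_s, w_t)     # "conversion"
recovered = coupling_inverse(converted, w_s, w_t)  # retrieval of the source
print(np.allclose(source, recovered))
```

Because the inverse never has to invert the conditioner itself, the conditioner could be an arbitrary network in a real system; the linear maps here are only placeholders.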

zexin_poster.pdf (54)

Categories:: Audio Processing Systems

76 Views

Joint Multi-Band DOA Estimation Using Low-Rank Matrix Recovery

To address wideband direction of arrival (DOA) estimation problems, this paper proposes a gridless and covariance-free joint multi-band (JMB) DOA estimation method using low-rank matrix recovery. In contrast with subspace methods and sparse array-based methods, a unified frequency grid is established based on the concept of the greatest common divisor (GCD) to solve the nonlinearity of steering matrices from multiple frequencies. With the unified frequency grid, a low-rank master matrix is formed as a combination of the truncated Hankel matrices from different subbands and snapshots.
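The low-rank structure that Hankel matrices expose can be seen in a small single-band example. The sketch below is not the proposed JMB method and does not build the master matrix described above; it only shows, under assumed values (two hypothetical normalized frequencies, 32 uniform samples, 16 rows), that a Hankel matrix formed from a sum of K complex exponentials has rank K.

```python
import numpy as np

def hankel_matrix(signal, rows):
    """Build a rows x (len(signal) - rows + 1) Hankel matrix with H[i, j] = signal[i + j]."""
    cols = len(signal) - rows + 1
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    return signal[idx]

n = np.arange(32)        # assumed number of uniform samples
freqs = [0.11, 0.27]     # hypothetical normalized frequencies, one per source
signal = sum(np.exp(2j * np.pi * f * n) for f in freqs)

H = hankel_matrix(signal, rows=16)
rank = np.linalg.matrix_rank(H, tol=1e-8)
print(H.shape, rank)     # the rank equals the number of exponentials
```

Low-rank recovery methods exploit this property in reverse: they search for a low-rank structured matrix consistent with the measurements, then read the frequencies (and hence the directions) from it.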

GuoZ_ICASSP24_poster.pdf (83)

Categories:: Signal and System Modeling, Representation and Estimation

113 Views
