Image, Video, and Multidimensional Signal Processing

Appendix


To evaluate the generalization of RIS in the context of human-robot interaction, we generate referring expressions for a subset of images from GraspNet using Shikra.

Appendix.pdf


QVRF: A QUANTIZATION-ERROR-AWARE VARIABLE RATE FRAMEWORK FOR LEARNED IMAGE COMPRESSION

Learned image compression has shown promising compression performance, but supporting variable bitrates over a wide range remains a challenge. State-of-the-art variable rate methods sacrifice model performance and require numerous additional parameters. In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that uses a univariate quantization regulator, denoted a, to achieve wide-range variable rates within a single model.
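As a rough illustration of how a single scalar can move one model along the rate-distortion curve, the sketch below scales latent values by a regulator a before rounding and rescales afterwards. This is an assumption made for illustration only: the abstract does not say how a enters the quantizer, and the function names and sample latents here are hypothetical, not taken from QVRF.

```python
def quantize_with_regulator(latents, a):
    """Scale by a, round to the nearest integer, then rescale.

    Hypothetical reading of a "univariate quantization regulator":
    the effective quantization step is 1 / a.
    """
    if a <= 0:
        raise ValueError("regulator a must be positive")
    return [round(y * a) / a for y in latents]


def mean_squared_error(reference, reconstructed):
    """Average squared difference between two equal-length sequences."""
    return sum((r - q) ** 2 for r, q in zip(reference, reconstructed)) / len(reference)


if __name__ == "__main__":
    latents = [0.37, -1.42, 2.08, 0.05]  # made-up latent values
    for a in (0.5, 1.0, 4.0):
        reconstructed = quantize_with_regulator(latents, a)
        distinct_symbols = len({round(y * a) for y in latents})
        print(a, reconstructed, distinct_symbols, mean_squared_error(latents, reconstructed))
```

Under this assumption, a larger a gives a finer step, so the quantization error shrinks (it is bounded by 0.5 / a per value) while more distinct integer symbols must be entropy coded, which raises the bitrate.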

eposter_ICIP2023_tongkedeng.pptx


SEMANTIC-EMBEDDED KNOWLEDGE ACQUISITION AND REASONING FOR IMAGE SEGMENTATION


Image segmentation is a challenging task because of complex object appearance and diverse object categories. Traditional methods use visual features directly for segmentation but ignore the correlation between objects. We introduce a knowledge reasoning module (KRM) for external knowledge aggregation and leverage a graph neural network to aggregate the knowledge feature, which is concatenated with the visual feature for semantic segmentation. To this end, we use word embeddings of category names as semantic features and establish the relationships between categories.
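The aggregate-then-concatenate idea can be sketched in a few lines. The example below is a minimal sketch under stated assumptions, not the KRM itself: it uses a single round of mean aggregation over a hand-written category graph in place of a trained graph neural network, and the embeddings, adjacency list, and function names are all hypothetical.

```python
def aggregate_knowledge(embeddings, adjacency):
    """One round of mean aggregation over each category and its neighbours.

    embeddings: dict mapping category name -> list of floats (word embedding)
    adjacency:  dict mapping category name -> list of related category names
    """
    aggregated = {}
    for node, vector in embeddings.items():
        group = [vector] + [embeddings[n] for n in adjacency.get(node, [])]
        # Average each embedding dimension across the node and its neighbours.
        aggregated[node] = [sum(values) / len(group) for values in zip(*group)]
    return aggregated


def fuse(visual_feature, knowledge_feature):
    """Concatenate a visual feature with a knowledge feature."""
    return list(visual_feature) + list(knowledge_feature)


if __name__ == "__main__":
    # Toy 2-D "word embeddings" for three categories (made-up numbers).
    embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [4.0, 4.0]}
    adjacency = {"cat": ["dog"], "dog": ["cat"]}  # assumed category relations
    knowledge = aggregate_knowledge(embeddings, adjacency)
    print(fuse([0.2, 0.9], knowledge["cat"]))
```

Related categories pull each other's knowledge features together, while an unconnected category such as "car" keeps its own embedding.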

Semantic-embedded knowledge acquisition and reasoning for image segmentation.pdf


ICIP_demo.pdf


IMAGE SEGMENTATION FOR IMPROVED LOSSLESS SCREEN CONTENT COMPRESSION


In recent years, it has been found that screen content images (SCI) can be compressed effectively using appropriate probability modelling and suitable entropy coding methods such as arithmetic coding. The key objective is to determine the best probability distribution for each pixel position. This strategy works particularly well for images with synthetic (textual) content. However, screen content images usually consist not only of synthetic but also of pictorial (natural) regions. These images require diverse models of probability distributions to be compressed optimally.
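To see why the probability model matters, recall that an arithmetic coder spends close to -log2(p) bits on a symbol whose modelled probability is p. The sketch below estimates that ideal code length with a simple adaptive frequency-count model. The model, the function name, and the two toy pixel rows are assumptions for illustration and are not the method of the paper.

```python
import math
from collections import Counter


def adaptive_code_length_bits(symbols, alphabet_size=256):
    """Ideal code length in bits under an adaptive frequency-count model.

    Each symbol is charged -log2(p), where p is estimated from the symbols
    seen so far with add-one smoothing over the whole alphabet.
    """
    counts = Counter()
    seen = 0
    bits = 0.0
    for symbol in symbols:
        probability = (counts[symbol] + 1) / (seen + alphabet_size)
        bits += -math.log2(probability)
        counts[symbol] += 1
        seen += 1
    return bits


if __name__ == "__main__":
    synthetic_row = [0] * 32 + [255] * 32  # text-like: only two colours
    pictorial_row = list(range(64))        # natural-like: every value differs
    print(adaptive_code_length_bits(synthetic_row))
    print(adaptive_code_length_bits(pictorial_row))
```

The two-colour row costs far fewer bits than the row of all-different values under the same model, which is one way to see why a single distribution family rarely suits both region types.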

presentationICASSP_pdf.pdf


Image Generation is MAY All You Need for VQA


Visual Question Answering (VQA) stands to benefit from increasingly sophisticated Pretrained Language Models (PLMs) and computer vision-based models. In particular, many studies on the language modality have used image captioning or question generation, grounded in the knowledge of PLMs, for data augmentation. However, image generation for VQA has been applied only in a limited way, modifying certain parts of the original image in order to control quality and uncertainty.

Image_Generation_is_May_All_You_Need_for_VQA.pdf


GROUP-WISE CO-SALIENT OBJECT DETECTION WITH SIAMESE TRANSFORMERS VIA BROWNIAN DISTANCE COVARIANCE MATCHING

ICASSP海报.pdf


Compression Noise Reduction via Non-local Filtering with Rectified Regularity for Urban Building Scenes

In this paper, we propose a novel low-rank-based non-local image denoising method for HEVC video compression that gathers non-local patches in the rectified domain. Owing to irreversible quantization, image compression can be regarded as adding noise to the original image, which causes distortion between the original image and the decompressed image.
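The view of compression as additive noise can be made concrete with a toy scalar quantizer. The sketch below is a simplification under stated assumptions: HEVC quantizes transform coefficients rather than raw samples, and the step size, sample values, and function names here are hypothetical.

```python
def quantize(samples, step):
    """Uniform scalar quantization: snap each sample to a multiple of step."""
    return [step * round(x / step) for x in samples]


def quantization_noise(samples, step):
    """Difference between the reconstruction and the original samples."""
    return [q - x for x, q in zip(samples, quantize(samples, step))]


if __name__ == "__main__":
    original = [13.0, 13.4, 200.7, 97.2]  # made-up sample values
    step = 8
    noise = quantization_noise(original, step)
    # Reconstruction equals original plus noise, up to floating-point
    # rounding, and each noise term is bounded in magnitude by step / 2.
    print(quantize(original, step))
    print(noise)
```

Because the rounding cannot be undone, the decoder only ever sees the original plus this bounded error, which is the noise a denoising method then tries to reduce.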