KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim; Yiyang Su; Feng Liu; Anil Jain; Xiaoming Liu

doi:10.1109/CVPR52733.2024.00031

Conference proceeding

Open access

Minchul Kim, Yiyang Su, Feng Liu, Anil Jain, Xiaoming Liu and Feng Liu

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 244-255

16 Jun 2024

DOI: https://doi.org/10.1109/CVPR52733.2024.00031

Files and links (1)

URL: http://arxiv.org/abs/2403.14852

Submitted, arXiv.org - Non-exclusive license to distribute, Open

Keywords: Affine Transformation; Computational modeling; Face recognition; Facial Landmarks; Gait Recognition; Image recognition; Keypoints; Recognition; Relative Position Encoding; Robustness; Throughput; Transforms; Computer Vision

Abstract

In this paper, we address the challenge of making ViT models more robust to unseen affine transformations. Such robustness is useful in recognition tasks such as face recognition, where image alignment can fail. We propose a novel method called KP-RPE, which leverages keypoints (e.g., facial landmarks) to make ViT more resilient to scale, translation, and pose variations. We begin with the observation that Relative Position Encoding (RPE) is a good way to bring affine transform generalization to ViTs. RPE, however, can only inject the model with the prior knowledge that nearby pixels are more important than distant pixels. Keypoint RPE (KP-RPE) extends this principle: the significance of pixels is dictated not solely by their proximity but also by their positions relative to specific keypoints within the image. By anchoring the significance of pixels around keypoints, the model can more effectively retain spatial relationships, even when those relationships are disrupted by affine transformations. We show the merit of KP-RPE in face and gait recognition. The experimental results demonstrate its effectiveness in improving face recognition performance on low-quality images, particularly where alignment is prone to failure. Code and pre-trained models are available.
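The distinction the abstract draws between plain RPE and a keypoint-aware bias can be illustrated with a toy NumPy sketch. This is not the authors' implementation or formulation: the function names, the Gaussian keypoint term, and the `sigma` parameter are all assumptions made for illustration. The sketch only shows an attention bias that depends on patch-to-patch offsets (RPE) alongside a hypothetical extra term that depends on where each patch sits relative to a set of keypoints.

```python
import numpy as np

def patch_centers(h, w):
    """Return an (h*w, 2) array of (row, col) patch coordinates on an h x w grid."""
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)

def rpe_bias(h, w, table):
    """Plain RPE: bias[i, j] depends only on the offset between patches i and j.

    `table` has shape (2h-1, 2w-1) and is indexed by the shifted (row, col) offset.
    """
    c = patch_centers(h, w).astype(int)
    d = c[:, None, :] - c[None, :, :]            # (N, N, 2) query-minus-key offsets
    return table[d[..., 0] + h - 1, d[..., 1] + w - 1]

def keypoint_bias(h, w, keypoints, sigma=1.0):
    """Hypothetical keypoint term (an assumption, not the paper's formula):
    key patches close to any keypoint receive a larger bias.

    `keypoints` is a (K, 2) array in patch-grid coordinates.
    """
    c = patch_centers(h, w)
    d2 = ((c[:, None, :] - keypoints[None, :, :]) ** 2).sum(-1)   # (N, K)
    score = np.exp(-d2 / (2 * sigma**2)).max(axis=1)              # (N,)
    return np.broadcast_to(score[None, :], (h * w, h * w))        # same row for every query

def attention(q, k, v, bias):
    """Single-head softmax attention with an additive position bias on the logits."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias
    logits = logits - logits.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy usage on a 4 x 4 patch grid with random features and two made-up keypoints.
rng = np.random.default_rng(0)
h = w = 4
table = rng.normal(size=(2 * h - 1, 2 * w - 1))
q = rng.normal(size=(h * w, 8))
k = rng.normal(size=(h * w, 8))
v = rng.normal(size=(h * w, 8))
kps = np.array([[1.0, 1.0], [1.0, 2.0]])
out = attention(q, k, v, rpe_bias(h, w, table) + keypoint_bias(h, w, kps))
```

In this sketch the RPE term is unchanged when the image content shifts, because it depends only on offsets between patches, whereas the keypoint term follows the keypoints: moving a keypoint from patch (1, 1) to patch (2, 2) moves the peak of its bias map with it. That is the intuition behind anchoring patch significance to keypoints rather than to fixed grid positions.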

Metrics

6 Record Views

8 citations in Web of Science

8 citations in Scopus

Details

Title: KeyPoint Relative Position Encoding for Face Recognition
Creators: Minchul Kim - Michigan State University
Yiyang Su - Michigan State University
Feng Liu - Michigan State University
Anil Jain - Michigan State University
Xiaoming Liu - Michigan State University
Feng Liu - Drexel University, Computer Science
Publication Details: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 244-255
Publisher: IEEE
Number of pages: 12
Grant note: Office of the Director of National Intelligence (ODNI) (10.13039/100011038) 2022-21102100004 / Intelligence Advanced Research Projects Activity (IARPA) (10.13039/100011039)
Resource Type: Conference proceeding
Language: English
Academic Unit: Computer Science
Web of Science ID: WOS:001322555900022
Scopus ID: 2-s2.0-85208588876
Other Identifier: 991022008296204721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods
