Dataset
Wallpaper Group Symmetry Benchmark (ImageNetSYM, NoiseSYM, AtomSYM)
19 Feb 2026
Abstract
Overview
This dataset collection provides a large-scale benchmark for symmetry recognition in computer vision and scientific machine learning. It is designed to systematically evaluate whether modern deep learning architectures can internalize two-dimensional crystallographic symmetry as a transferable geometric abstraction rather than as domain-specific visual heuristics.
The benchmark consists of three procedurally generated datasets:
ImageNetSYM
NoiseSYM
AtomSYM
Each dataset is constructed around the 17 wallpaper groups; together, the three datasets contain over 10 million labeled images.
The primary goal of this benchmark is to enable rigorous in-domain and cross-domain evaluation of symmetry recognition and generalization across distinct visual modalities.
Scientific Motivation
Symmetry governs structure–property relationships across physical systems in materials science, condensed matter physics, chemistry, and crystallography. Despite the rapid adoption of deep learning models in scientific domains, current architectures often fail to capture symmetry as a fundamental geometric concept.
Instead, models tend to rely on texture statistics, color cues, or dataset-specific visual patterns. This benchmark was developed to:
Quantify symmetry recognition performance
Test cross-domain generalization
Identify architectural limitations
Provide a foundation for symmetry-aware model development
The datasets are designed to decouple symmetry structure from visual appearance, forcing models to confront the underlying geometric transformations.
Dataset Components
1. ImageNetSYM
ImageNetSYM consists of natural image textures transformed to obey one of the 17 wallpaper group symmetries. Base images are procedurally tiled and symmetrized to enforce exact group operations including rotations, reflections, glide reflections, and translations.
Purpose:
Test symmetry recognition in naturalistic visual domains
Evaluate reliance on texture and semantic content
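As an illustration of what "tiled and symmetrized" can mean in practice, the following minimal sketch builds a pmm pattern by mirroring a patch across two perpendicular axes and repeating the resulting unit cell. It is not the authors' generation pipeline, which is not described in this record; the function names `pmm_unit_cell` and `tile_cell` are hypothetical, and NumPy is assumed.

```python
import numpy as np


def pmm_unit_cell(patch: np.ndarray) -> np.ndarray:
    """Build a pmm unit cell from a patch of shape (H, W) or (H, W, C).

    The patch plays the role of the fundamental domain; mirroring it
    left-right and then top-bottom enforces two perpendicular reflections.
    """
    top = np.concatenate([patch, patch[:, ::-1]], axis=1)  # mirror left-right
    return np.concatenate([top, top[::-1, :]], axis=0)     # mirror top-bottom


def tile_cell(cell: np.ndarray, reps_y: int, reps_x: int) -> np.ndarray:
    """Repeat the unit cell to add the two lattice translations."""
    reps = (reps_y, reps_x) + (1,) * (cell.ndim - 2)  # leave channels untouched
    return np.tile(cell, reps)


if __name__ == "__main__":
    patch = np.arange(12).reshape(3, 4)  # stand-in for a natural-image crop
    image = tile_cell(pmm_unit_cell(patch), reps_y=2, reps_x=3)
    print(image.shape)
```

Only the rectangular group pmm is covered here. Each of the other groups needs its own construction, and the hexagonal groups (p3, p3m1, p31m, p6, p6m) additionally require resampling on a non-rectangular lattice, which this sketch does not attempt.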
2. NoiseSYM
NoiseSYM contains purely synthetic, noise-based patterns generated algorithmically and symmetrized according to the 17 wallpaper groups.
Purpose:
Remove semantic cues
Isolate geometric symmetry recognition
Provide a texture-agnostic evaluation setting
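One simple way to symmetrize noise is group averaging: draw a random square cell and average it over the rotations of the target group. The sketch below does this for p4 (four-fold rotation) with a seeded generator. It is an assumption about how such patterns could be produced, not a description of the NoiseSYM generator; `p4_noise_cell` is a hypothetical name.

```python
import numpy as np


def p4_noise_cell(size: int, seed: int) -> np.ndarray:
    """Return a (size, size) noise cell invariant under 90-degree rotation."""
    rng = np.random.default_rng(seed)          # fixed seed gives a reproducible cell
    noise = rng.standard_normal((size, size))  # unstructured Gaussian noise
    # Averaging over the four rotations of the square cell makes the result
    # invariant under each of them.
    rotations = [np.rot90(noise, k) for k in range(4)]
    return np.mean(np.stack(rotations), axis=0)


if __name__ == "__main__":
    cell = p4_noise_cell(size=64, seed=0)
    image = np.tile(cell, (4, 4))              # add the lattice translations
    print(image.shape, np.allclose(cell, np.rot90(cell)))
```

Averaging reduces the contrast of the noise, so a real generator might renormalize the cell afterwards; that step is left out here.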
3. AtomSYM
AtomSYM contains atomistic lattice-like renderings designed to mimic crystalline structures. These images are generated using procedural atomic motif placement consistent with wallpaper group operations.
Purpose:
Bridge abstract symmetry and materials-inspired representations
Evaluate relevance to crystallographic and materials science workflows
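Motif placement consistent with a wallpaper group can be sketched by applying the group's operations to atom positions in fractional coordinates and wrapping the results back into the unit cell. The example below does this for p4 on a square cell. The names `P4_OPS`, `expand_motif`, and `replicate` are hypothetical, and the rendering step that turns positions into images is not shown, since the record does not describe it.

```python
import numpy as np

# Rotation parts of p4 acting on fractional coordinates of a square cell,
# with the four-fold axis at the cell origin.
P4_OPS = [
    np.array([[1, 0], [0, 1]]),    # identity
    np.array([[0, -1], [1, 0]]),   # rotation by 90 degrees
    np.array([[-1, 0], [0, -1]]),  # rotation by 180 degrees
    np.array([[0, 1], [-1, 0]]),   # rotation by 270 degrees
]


def expand_motif(motif, ops=P4_OPS) -> np.ndarray:
    """Apply every operation to every atom and return the unique positions."""
    motif = np.atleast_2d(np.asarray(motif, dtype=float))
    images = np.concatenate([motif @ op.T for op in ops])
    # Wrap into [0, 1), round away float noise, and wrap again in case
    # rounding pushed a value up to exactly 1.0.
    images = np.round(images % 1.0, 8) % 1.0
    return np.unique(images, axis=0)  # atoms on special positions collapse


def replicate(cell_atoms: np.ndarray, n_cells: int) -> np.ndarray:
    """Copy the cell contents over an n_cells x n_cells block of the lattice."""
    offsets = np.array(
        [(i, j) for i in range(n_cells) for j in range(n_cells)], dtype=float
    )
    return (cell_atoms[None, :, :] + offsets[:, None, :]).reshape(-1, 2)


if __name__ == "__main__":
    atoms = expand_motif([(0.1, 0.2)])  # one atom on a general position
    print(atoms)
    print(replicate(atoms, 3).shape)
```

An atom on a general position yields four symmetry-equivalent copies per cell, while an atom on the rotation axis at (0.5, 0.5) maps onto itself and appears only once.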
Structure and Organization
Each dataset follows a hierarchical directory structure with:
- 17 classes corresponding to the 17 wallpaper groups
- Data with rich metadata
- Balanced class distributions
- Procedurally generated samples with controlled randomness
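A class-per-directory layout of this kind can be indexed with a few lines of standard-library Python. The sketch below assumes one folder per wallpaper group, named with the standard short symbols, and PNG files inside; the actual folder names, file formats, and metadata files in the deposit may differ, and `index_dataset` and `class_counts` are hypothetical helpers.

```python
from collections import Counter
from pathlib import Path

# Standard short symbols of the 17 wallpaper groups.
WALLPAPER_GROUPS = [
    "p1", "p2", "pm", "pg", "cm", "pmm", "pmg", "pgg", "cmm",
    "p4", "p4m", "p4g", "p3", "p3m1", "p31m", "p6", "p6m",
]
CLASS_INDEX = {name: i for i, name in enumerate(WALLPAPER_GROUPS)}


def index_dataset(root, pattern="*.png"):
    """Return (path, label) pairs for an assumed root/<group>/<file> layout."""
    samples = []
    for group_dir in sorted(Path(root).iterdir()):
        # Skip anything that is not a folder named after a wallpaper group.
        if group_dir.is_dir() and group_dir.name in CLASS_INDEX:
            label = CLASS_INDEX[group_dir.name]
            samples.extend((path, label) for path in sorted(group_dir.glob(pattern)))
    return samples


def class_counts(samples):
    """Count samples per class index, e.g. to check that classes are balanced."""
    return Counter(label for _, label in samples)
```

Calling `class_counts(index_dataset(root))` on a downloaded subset gives a quick check that all 17 classes are present in equal numbers.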
Scale
Total images across the complete dataset collection:
- 10,000,000+
Due to file size limitations of the hosting platform, the files deposited here represent a curated portion of the full benchmark.
Researchers who require access to the full dataset may contact the authors directly to arrange data transfer.
Benchmark Tasks
This benchmark supports:
In-domain classification: Train and evaluate within the same dataset
Cross-domain generalization: Train on one dataset, evaluate on another
Scaling studies: Evaluate performance as a function of dataset size
Attention and feature analysis: Study learned symmetry representations using:
Attention maps
Confusion matrices
Feature embedding analysis
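The in-domain and cross-domain tasks amount to filling a 3 x 3 grid of train/test pairs and summarizing each cell with a 17-class confusion matrix. The following sketch shows that bookkeeping with NumPy; model training and inference are out of scope, and the helper names (`evaluation_pairs`, `confusion_matrix`, `accuracy`) are hypothetical.

```python
import itertools

import numpy as np

DATASETS = ["ImageNetSYM", "NoiseSYM", "AtomSYM"]
N_CLASSES = 17


def evaluation_pairs():
    """List every (train, test, kind) combination of the three datasets."""
    return [
        (train, test, "in-domain" if train == test else "cross-domain")
        for train, test in itertools.product(DATASETS, repeat=2)
    ]


def confusion_matrix(y_true, y_pred, n_classes=N_CLASSES) -> np.ndarray:
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly when an index pair occurs more than once.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def accuracy(cm: np.ndarray) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    return float(np.trace(cm)) / float(cm.sum())


if __name__ == "__main__":
    for train, test, kind in evaluation_pairs():
        print(f"train={train:12s} test={test:12s} {kind}")
    cm = confusion_matrix([0, 1, 1, 2], [0, 1, 2, 2])  # toy labels, not real results
    print(accuracy(cm))
```

Off-diagonal entries of the confusion matrix show which wallpaper groups a model confuses with each other, which is often more informative than the accuracy alone.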
Intended Use
This dataset is intended for:
Machine learning researchers studying geometric inductive biases
Materials informatics researchers
Computer vision researchers
Crystallography and symmetry modeling studies
Benchmarking symmetry-aware architectures
It is particularly suited for testing:
CNNs such as ResNet-50
Multi-scale architectures such as Feature Pyramid Networks
Transformer-based models including Cross-Covariance Image Transformers
Equivariant or symmetry-aware neural networks
Key Findings Enabled by This Dataset
Using this benchmark, we observe:
High in-domain classification accuracy across architectures
Significant degradation in cross-domain performance
Improved robustness from global attention mechanisms
Persistent failure to encode symmetry as a fully transferable abstraction
These findings highlight the need for explicit geometric priors and symmetry-aware model design.
Data Generation
All datasets are procedurally generated to ensure:
Exact enforcement of wallpaper group operations
Reproducibility
Controlled random seeds
Balanced class sampling
Absence of labeling noise
Generation scripts are included in the associated repository when applicable.
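Seed control and balanced sampling can be combined by deriving a fixed per-sample seed for every (class, sample) slot from a single master seed. The sketch below is one way to do this with the Python standard library; it is an illustration only, and `balanced_plan` is a hypothetical helper rather than part of the authors' generation scripts.

```python
import random
from collections import Counter


def balanced_plan(n_classes: int, per_class: int, master_seed: int):
    """Return a shuffled list of (label, sample_seed) pairs.

    Every class receives exactly per_class entries, and the whole plan is a
    deterministic function of master_seed.
    """
    rng = random.Random(master_seed)
    plan = [
        (label, rng.getrandbits(32))  # independent seed for each sample
        for label in range(n_classes)
        for _ in range(per_class)
    ]
    rng.shuffle(plan)                 # avoid writing classes in blocks
    return plan


if __name__ == "__main__":
    plan = balanced_plan(n_classes=17, per_class=3, master_seed=42)
    print(Counter(label for label, _ in plan))
```

Each `sample_seed` would then be passed to the generator for that sample, so any single image can be regenerated without replaying the whole run.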
Limitations
Synthetic generation may not capture all real-world symmetry imperfections
Models trained on these datasets may still rely on statistical shortcuts
Recognition does not imply physical understanding
Details
- Title
- Wallpaper Group Symmetry Benchmark (ImageNetSYM, NoiseSYM, AtomSYM)
- Creators
- Yichen Guo, Joshua Agar
- Publisher
- Zenodo
- Resource Type
- Dataset
- Language
- English
- Academic Unit
- Mechanical Engineering and Mechanics
- Other Identifier
- 991022171853804721