Dataset Card for PaperDemon

Dataset containing PaperDemon artwork archives.

  • image-classification
  • image-to-text
  • art
  • English

Dataset Summary

This dataset contains artwork collected from PaperDemon. Each entry pairs an artwork image with its associated metadata: title, posting date, description, tags, and user comments.

Languages

The dataset is primarily English. Most image descriptions and metadata are in English, though some artists include other languages in their descriptions or comments.

Dataset Structure

Data Files

The dataset consists of artwork image files stored across multiple ZIP archives, corresponding metadata in JSONL format, and an archive index CSV file mapping image filenames to their respective archive files.
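The lookup described above (index CSV to ZIP archive to image) might be sketched as follows. This is a minimal illustration, not the dataset's official loader: the column names `filename` and `archive` and the helper names are assumptions, and the sketch also assumes each image is stored inside its ZIP under the same name that appears in the index. Check the real CSV header and archive layout before relying on it.

```python
import csv
import zipfile
from pathlib import Path


def load_archive_index(index_csv, filename_col="filename", archive_col="archive"):
    """Return a dict mapping image filename -> ZIP archive name.

    The default column names are assumptions; pass the real header names
    if the index CSV uses different ones.
    """
    with open(index_csv, newline="", encoding="utf-8") as fh:
        return {row[filename_col]: row[archive_col] for row in csv.DictReader(fh)}


def read_image_bytes(index, filename, archive_dir="."):
    """Find the archive that holds `filename` and return the image's raw bytes.

    Assumes the member name inside the ZIP equals the filename in the index.
    """
    archive_path = Path(archive_dir) / index[filename]
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(filename)
```

Loading the index once and reusing the resulting dict avoids rescanning the CSV for every image; only the one archive that contains the requested file is opened.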

Data Fields

Each metadata record includes an ID, title, posted date, image URL, description, tags, characters, and comments.
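As a rough sketch of how such records could be read, the snippet below parses JSONL lines into dictionaries and tallies tags. The exact key names are not given in this card, so the `tags` key and the assumption that tags are stored as a list of strings are both hypothetical; the helper names are illustrative only.

```python
import json
from collections import Counter


def iter_records(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def tag_counts(records, tags_key="tags"):
    """Count how often each tag appears across records.

    Assumes tags are a list of strings under `tags_key` (an assumed key name);
    records without tags are skipped.
    """
    counts = Counter()
    for record in records:
        counts.update(record.get(tags_key) or [])
    return counts
```

Because `iter_records` accepts any iterable of lines, it can be given an open file handle directly, so the metadata file is streamed rather than loaded into memory at once.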

Data Splits

All artworks and metadata are in a single split with 45,970 entries.