v1, v2, v3: a field guide to character card formats
How character cards really work: data hidden in PNG tEXt chunks, what each spec generation added (alternate greetings, character books, v3 assets), why cards break in transit, and a distribution checklist for creators.

A character card looks like a portrait and behaves like a program. Inside that innocent PNG sits a JSON document deciding how a character speaks, what she remembers, and which greeting opens the scene. Three spec generations in, the format is genuinely good — and almost nobody explains it end to end. Here is the map, whether you are debugging a card that will not import or deciding which fields your next release should use.
Where the data physically lives
PNG files allow named metadata chunks (tEXt) alongside the pixels. Card tooling writes the character JSON, base64-encoded, into a chunk named chara (v2) or ccv3 (v3). Image viewers ignore it; tavern clients read it. The elegance has a brittle edge: anything that re-encodes the image — messenger compression, a photo library's storage optimizer, a forum thumbnailer — rebuilds the pixels and drops the chunks. The portrait survives, the soul does not. That single fact explains the majority of “card imports as a blank character” reports.
v1: six fields and a dream
The original TavernAI-era card was just six flat strings: name, description, personality, scenario, first_mes, mes_example. No version marker, no room to grow — which is exactly why every client of the era parsed cards slightly differently, and why old v1 cards still float around forums confusing modern importers.
v2: the envelope that fixed the ecosystem
The v2 spec made one structural decision that mattered more than any field: it wrapped everything in a versioned envelope (spec, spec_version, data), so clients could finally negotiate capabilities instead of guessing. Into that envelope went the features creators had been improvising: alternate_greetings (multiple opening scenes per card), system_prompt and post_history_instructions overrides, creator_notes, tags, and the headliner — character_book, a lorebook embedded directly in the card so world facts travel with the character.
v3: assets, languages, and sharper lorebook semantics
chara_card_v3 keeps the envelope and pushes the edges: an assets array (portraits, emotion sprites, audio referenced by URI), multilingual creator_notes_multilingual, group-chat greetings, decorators for finer prompt-injection control, and a recommendation to also ship cards as .charx (a zip container) when embedding media. A well-built v3 card still degrades gracefully in older clients — unknown fields are ignored, the core six always parse.
The generations at a glance
| Capability | v1 | v2 | v3 |
|---|---|---|---|
| Core persona fields | ✓ | ✓ | ✓ |
| Versioned envelope | — | ✓ | ✓ |
| Alternate greetings | — | ✓ | ✓ (+ group greetings) |
| Embedded character book | — | ✓ | ✓ (richer insertion rules) |
| Assets (sprites, audio) | — | — | ✓ |
| Multilingual creator notes | — | — | ✓ |
What “full compatibility” should mean in 2026
Reading the six core fields is table stakes. The real test of a client is the long tail: does it surface all alternate greetings? Does the embedded character book actually inject, with its insertion order and scan depth respected? Are v3 group greetings available in group chats? When we built card import for Foreverse, we wrote conformance tests against the published specs for exactly these paths — and made the importer say why a file failed (no data chunk, malformed JSON, truncated upload) instead of failing silently. Anyone who has stared at a card that “just won't import” knows that an error message is a feature.
FAQ
How is character data stored inside a PNG?
As base64-encoded JSON inside the PNG's tEXt metadata chunks — 'chara' for v2 data and 'ccv3' for v3. The image you see is just the portrait; the character lives in metadata. Any pipeline that re-encodes the image (messenger compression, photo-library 'optimization') strips those chunks and kills the card.
What did v2 add over v1?
v1 was six flat fields: name, description, personality, scenario, first message, and example dialogue. v2 wrapped them in a versioned envelope and added the fields the community had been hacking in: alternate greetings, system prompt overrides, creator notes, tags, and most importantly an embeddable character book (mini-lorebook).
Is chara_card_v3 backwards compatible?
Largely yes by design — v3 keeps the v2 envelope shape, so a v3 card usually degrades gracefully in a v2-only client (new fields like assets, multilingual creator notes, and decorators are simply ignored). A v2 card in a v3 client just works.
Should creators distribute PNG or JSON?
Both. PNG for collectors and galleries — it carries the portrait. JSON for reliability — it survives chat apps, email, and re-uploads that would strip PNG metadata. Label the spec version and note whether a character book is embedded; it preempts most 'card won't import' reports.