v1, v2, v3: a field guide to character card formats

How character cards really work: data hidden in PNG tEXt chunks, what each spec generation added (alternate greetings, character books, v3 assets), why cards break in transit, and a distribution checklist for creators.

Deng BinjieMay 20, 20268 min readCharacter cardsSpecCreators

Three stacked translucent character cards, each layer revealing more embedded structure than the last

A character card looks like a portrait and behaves like a program. Inside that innocent PNG sits a JSON document deciding how a character speaks, what she remembers, and which greeting opens the scene. Three spec generations in, the format is genuinely good — and almost nobody explains it end to end. Here is the map, whether you are debugging a card that will not import or deciding which fields your next release should use.

Where the data physically lives

PNG files allow named metadata chunks (tEXt) alongside the pixels. Card tooling writes the character JSON, base64-encoded, into a chunk named chara (v2) or ccv3 (v3). Image viewers ignore it; tavern clients read it. The elegance has a brittle edge: anything that re-encodes the image — messenger compression, a photo library's storage optimizer, a forum thumbnailer — rebuilds the pixels and drops the chunks. The portrait survives, the soul does not. That single fact explains the majority of “card imports as a blank character” reports.

v1: six fields and a dream

The original TavernAI-era card was just six flat strings: name, description, personality, scenario, first_mes, mes_example. No version marker, no room to grow — which is exactly why every client of the era parsed cards slightly differently, and why old v1 cards still float around forums confusing modern importers.

v2: the envelope that fixed the ecosystem

The v2 spec made one structural decision that mattered more than any field: it wrapped everything in a versioned envelope (spec, spec_version, data), so clients could finally negotiate capabilities instead of guessing. Into that envelope went the features creators had been improvising: alternate_greetings (multiple opening scenes per card), system_prompt and post_history_instructions overrides, creator_notes, tags, and the headliner — character_book, a lorebook embedded directly in the card so world facts travel with the character.

v3: assets, languages, and sharper lorebook semantics

chara_card_v3 keeps the envelope and pushes the edges: an assets array (portraits, emotion sprites, audio referenced by URI), multilingual creator_notes_multilingual, group-chat greetings, decorators for finer prompt-injection control, and a recommendation to also ship cards as .charx (a zip container) when embedding media. A well-built v3 card still degrades gracefully in older clients — unknown fields are ignored, the core six always parse.

The generations at a glance

Capability	v1	v2	v3
Core persona fields	✓	✓	✓
Versioned envelope	—	✓	✓
Alternate greetings	—	✓	✓ (+ group greetings)
Embedded character book	—	✓	✓ (richer insertion rules)
Assets (sprites, audio)	—	—	✓
Multilingual creator notes	—	—	✓

What “full compatibility” should mean in 2026

Reading the six core fields is table stakes. The real test of a client is the long tail: does it surface all alternate greetings? Does the embedded character book actually inject, with its insertion order and scan depth respected? Are v3 group greetings available in group chats? When we built card import for Foreverse, we wrote conformance tests against the published specs for exactly these paths — and made the importer say why a file failed (no data chunk, malformed JSON, truncated upload) instead of failing silently. Anyone who has stared at a card that “just won't import” knows that an error message is a feature.

One more habit worth stealing from card authors: know what each field costs. Description, personality and scenario re-enter the context on every single turn; the greeting appears once; creator notes never ship at all. Drop any PNG, JSON or .charx onto the free character card token counter and it weighs the card field by field — permanent versus one-off versus keyword-triggered — before your users pay for the difference.

And if you want to study well-formed cards rather than specs, the community card showcase is a live corpus: 90+ public cards, every one shipping an embedded character book (11–30 entries is typical) plus example dialogue, with per-card pages that state the greeting and lorebook counts. Inspecting a real v3 structure beats re-reading the spec a third time.

FAQ

How is character data stored inside a PNG?

As base64-encoded JSON inside the PNG's tEXt metadata chunks — 'chara' for v2 data and 'ccv3' for v3. The image you see is just the portrait; the character lives in metadata. Any pipeline that re-encodes the image (messenger compression, photo-library 'optimization') strips those chunks and kills the card.

What did v2 add over v1?

v1 was six flat fields: name, description, personality, scenario, first message, and example dialogue. v2 wrapped them in a versioned envelope and added the fields the community had been hacking in: alternate greetings, system prompt overrides, creator notes, tags, and most importantly an embeddable character book (mini-lorebook).

Is chara_card_v3 backwards compatible?

Largely yes by design — v3 keeps the v2 envelope shape, so a v3 card usually degrades gracefully in a v2-only client (new fields like assets, multilingual creator notes, and decorators are simply ignored). A v2 card in a v3 client just works.

Should creators distribute PNG or JSON?

Both. PNG for collectors and galleries — it carries the portrait. JSON for reliability — it survives chat apps, email, and re-uploads that would strip PNG metadata. Label the spec version and note whether a character book is embedded; it preempts most 'card won't import' reports.

Questions or ideas? Join our Discord →