Recommended Evaluator Prompts
Start here: these v3.2 prompts have the strongest measured CRAFT-GC improvements and make the comparison easy to understand.
Published metrics (image-level CLIP)
| Method | CoFS | GP | SSA-BFS | GCFair |
| --- | --- | --- | --- | --- |
| Base SD | 0.914 | 1.000 | +0.000 | 0.259 |
| PromptAug | 0.868 | 0.699 | +0.140 | 0.470 |
| PromptAug-Explicit | 0.874 | 0.852 | +0.464 | 0.926 |
| FairImagen-GC | 0.910 | 0.849 | -0.092 | 0.000 |
| DOPP-CRAFT-GC (validated) | 0.851 | 0.965 | +0.397 | 0.950 |
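
To read the table against its baseline, it can help to subtract the Base SD row from every other row. The sketch below does only that arithmetic on the published values. The names `METRICS` and `delta_vs_baseline` are hypothetical and are not part of any released tooling. The code makes no assumption about whether a higher or lower value is better for a given metric.

```python
# Values copied from the published metrics table above (image-level CLIP).
METRICS = {
    "Base SD": {"CoFS": 0.914, "GP": 1.000, "SSA-BFS": 0.000, "GCFair": 0.259},
    "PromptAug": {"CoFS": 0.868, "GP": 0.699, "SSA-BFS": 0.140, "GCFair": 0.470},
    "PromptAug-Explicit": {"CoFS": 0.874, "GP": 0.852, "SSA-BFS": 0.464, "GCFair": 0.926},
    "FairImagen-GC": {"CoFS": 0.910, "GP": 0.849, "SSA-BFS": -0.092, "GCFair": 0.000},
    "DOPP-CRAFT-GC (validated)": {"CoFS": 0.851, "GP": 0.965, "SSA-BFS": 0.397, "GCFair": 0.950},
}


def delta_vs_baseline(metrics, baseline="Base SD"):
    """Return per-metric differences (method minus baseline), rounded to 3 places.

    Hypothetical helper for illustration only: it reports signed differences
    and does not interpret which direction counts as an improvement.
    """
    base = metrics[baseline]
    return {
        method: {name: round(value - base[name], 3) for name, value in scores.items()}
        for method, scores in metrics.items()
        if method != baseline
    }


if __name__ == "__main__":
    # Print one line per non-baseline method with signed deltas.
    for method, deltas in delta_vs_baseline(METRICS).items():
        row = "  ".join(f"{name} {d:+.3f}" for name, d in deltas.items())
        print(f"{method:<28}{row}")
```

Keeping the values in a plain dictionary keyed by method name means the same helper works unchanged if another row is added to the table.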
All 2,500 PNGs load from the public Hugging Face dataset. Use search and filters to narrow results; browse mode paginates through the full experiment grid.