Upload anything — food, coffee, a street corner, your pet, your desk.
AI adds white-line outlines and tiny handwritten thoughts to every object, turning a forgettable photo into your diary cover.
Free credits for new users — upload a photo and play
Handwritten photo annotation (Japanese: 手描き吹き出し) is a photo-doodle aesthetic that spread from Japanese Instagram in late 2025: take any everyday snapshot and overlay a thin hand-drawn layer of object outlines plus tiny diary-style notes. This page packages the effect into a tuned prompt and 5 ready-to-use scene templates — upload any photo and hit generate.
White strokes on dark backgrounds, dark strokes on light ones — the model picks the contrast automatically. The original viral prompt only worked on dark night-market scenes; we fixed that.
"so refreshing~", "soft and warm", "kind of happy today" — short inner-monologue lines, not dry product labels. The model writes them as if you're whispering to your journal.
The prompt explicitly asks for 8–12 distinct elements to be annotated. The result is dense and journal-rich, not three sparse labels on an empty image.
The prompt instructs the model to add an annotation layer only — your face, food, and composition stay exactly as shot. The model only draws on top.
This effect thrives on object-rich scenes. The more discrete things in the frame, the more the AI has to label.
Brunch plates, hot pot, milk tea, café desserts. Every dish gets a texture note, every drink a temperature & mood note — 'soft', 'just sweet enough', 'so refreshing~'.
Coffee cup, laptop, plant, notebook, headphones. A natural multi-object setup. Notes lean wistful: 'coffee's gone cold', 'cat's still asleep', 'this book is hard to get into'.
Neon signs, food stalls, crowds, menus. The original viral context — neon backgrounds make white pen lines pop with maximum contrast.
Cats, dogs, toys, blankets. Pet annotations are extra cute — 'soft ears', 'sleepy eyes', 'tiny bit better behaved today~'.
Hotel room, airline meal, train station, scenic spots. Turn a roll of trip photos into a hand-bound journal where every shot captures a passing mood.
Shopping cart, snack shelves, fridge cases. Mundane scenes loaded with annotation potential — every snack gets a name and a tiny opinion.
Each card maps to a typical input photo. Click to open the full prompt, then hit "Use this scene" to push it into the generator above. Upload your own photo and hit generate.
Annotated photos are natural content material — a regular snapshot becomes a publish-ready image-text card.
Portrait + handwritten captions are exactly what social cover formats love. The annotation already carries the caption — no extra design or typography needed.
9:16 portrait output drops straight into IG Story / TikTok vertical. Looks more elevated than IG's built-in stickers, more handcrafted than Canva templates.
Drop into your blog, Substack, or photo dump. The annotations already convey emotion — your accompanying caption can be a single line.
Food blogs, city walking essays, lifestyle newsletters — this style fills the 'human-feeling illustration' gap that stock photos can't.
GPT Image 2 + a tuned prompt + 5 scene templates + transparent credits. No prompt-writing or design skills required to reproduce the original viral feel.
Best-in-class for in-image text rendering. At high quality + larger sizes, handwritten English / Japanese / Chinese characters render legibly — the only model that makes this style work.
Upload a photo and hit generate immediately, no prompt-writing needed. Want a different style / language / density? Copy any of the 5 scene templates and overwrite.
The prompt explicitly tells the model not to modify faces or the main subject — only add an annotation layer. Your face stays. Food stays. Composition stays.
Run the same photo + same prompt 4–8 times and pick the variant with the most charming notes. The model writes new micro-thoughts each time.
Credit cost is shown above the generate button before you click. New users get free credits — enough to try all 5 scenes before deciding.
Common questions about this prompt and GPT Image 2 generation. Other questions: support@imagesv2.ai.
No. The prompt explicitly says 'do not modify faces or the main subject — only add an annotation layer'. The model draws around faces, not on them. If you're cautious, pick a photo where faces are small or off-frame.
Three traits: (1) Object-rich — 8+ labelable elements (a meal, a street, a desk). (2) Background contrast — dark neon or light wood both work; flat white walls don't. (3) Already-clear composition. The original viral hits were Changsha night markets, which checked all three boxes.
Yes. The 5 scenes ship with English prompts. For Japanese annotations, swap the [Text rules] block to 'Japanese hiragana / katakana text only'. For Chinese, switch to '简体中文 only, 汉字工整'. The Chinese version of this page has Chinese-language scene prompts ready to copy.
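The language swap described above can be sketched as a small text substitution. This is only an illustration: it assumes the prompt is laid out as bracket-tagged blocks, one per line (for example `[Text rules] ...`), which this page does not confirm, and the names `TEXT_RULES` and `swap_text_rules` are hypothetical.

```python
import re

# Replacement rules quoted from the answer above; keys are hypothetical labels.
TEXT_RULES = {
    "ja": "Japanese hiragana / katakana text only",
    "zh": "简体中文 only, 汉字工整",
}

# Assumption: a block starts at "[Text rules]" and runs until the next
# line that begins with "[" or until the end of the prompt.
TEXT_RULES_BLOCK = re.compile(r"\[Text rules\].*?(?=\n\[|\Z)", re.DOTALL)


def swap_text_rules(prompt: str, lang: str) -> str:
    """Return the prompt with its [Text rules] block replaced for `lang`."""
    rule = TEXT_RULES[lang]
    new_prompt, replaced = TEXT_RULES_BLOCK.subn(
        lambda _match: f"[Text rules] {rule}", prompt, count=1
    )
    if replaced == 0:
        raise ValueError("no [Text rules] block found in prompt")
    return new_prompt
```

If the real prompt uses a different layout, only the pattern needs to change; the rule strings themselves come straight from the answer above.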
GPT Image 2 at medium quality + 1024×1024 occasionally renders pseudo-characters (shapes that look like text but are not). Fix: (1) bump quality to high in the generator, (2) pick 1024×1536 or 1536×1024 for larger glyphs, (3) regenerate 2–3 times and pick the cleanest variant.
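As a rough sketch of fixes (1) and (2), the helper below picks the larger output size that matches the source photo's orientation and sets quality to high. The function name and the returned keys are hypothetical and are not the generator's actual settings API; sending square photos to the portrait size is also an assumption.

```python
def legible_text_settings(width: int, height: int) -> dict:
    """Suggest quality and size for cleaner handwritten glyphs.

    Assumption: landscape photos map to 1536x1024 and everything else
    (portrait or square) maps to 1024x1536.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    size = "1536x1024" if width > height else "1024x1536"
    return {"quality": "high", "size": size}
```

Fix (3), regenerating 2–3 times, still applies on top of these settings.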
The phrase 'Annotate 8–12 distinct elements' in the prompt is tunable. Change to '15–20 elements' for denser, or '3–5 key elements' for sparser. The model responds reliably to count instructions.
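For anyone editing prompts in bulk, the count change can be scripted. The sketch below assumes the prompt contains the phrase in the form "Annotate N–M distinct elements" (or "key elements"); the helper name `set_annotation_density` is hypothetical and the full prompt text is not reproduced here.

```python
import re

# Matches e.g. "Annotate 8–12 distinct elements"; accepts an en dash or a hyphen.
COUNT_PHRASE = re.compile(r"Annotate \d+\s*[–-]\s*\d+ (?:distinct|key) elements")


def set_annotation_density(prompt: str, low: int, high: int, kind: str = "distinct") -> str:
    """Rewrite the element-count phrase in `prompt` to the range low–high."""
    if low < 1 or high < low:
        raise ValueError("expected 1 <= low <= high")
    replacement = f"Annotate {low}–{high} {kind} elements"
    new_prompt, replaced = COUNT_PHRASE.subn(replacement, prompt, count=1)
    if replaced == 0:
        raise ValueError("count phrase not found in prompt")
    return new_prompt
```

Calling `set_annotation_density(prompt, 15, 20)` gives the denser variant, and `set_annotation_density(prompt, 3, 5, kind="key")` gives the sparser one.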
Scenes 1–3 (food, café, convenience store) show English-handwriting demos. Scenes 4–5 (indoor, pet) show Chinese-handwriting demos — the layout, density, and decoration patterns read the same either way. Click "Use this scene" and the prompt is copied in your own language.
Free credits for new users. Prompt pre-filled. Upload your photo → hit generate → walk away with a Japanese-style annotated diary cover.

Quality Comparison (gpt-image-2)
| Quality | Speed | Image Detail | Credits/Image | Best For |
|---|---|---|---|---|
| Low | Fastest (3–8 s) | Good composition, less detail | 10 | Quick iterations, bulk generation, social media |
| Medium | Moderate (10–20 s) | Rich details, good textures | 40 | Marketing images, presentations |
| High | Slower (20–40 s) | Highest fidelity, finest details | 110 | Print, posters, premium assets |
| Auto | Model decides | Auto-selected by model | 40 | When unsure |
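Because the page suggests running the same photo 4–8 times and picking the best variant, it helps to estimate the total before starting. The sketch below hard-codes the per-image credit prices from the table above; the names `CREDITS_PER_IMAGE` and `batch_cost` are illustrative, and actual prices are whatever the generator shows above the generate button.

```python
# Credits per image, copied from the quality comparison table.
CREDITS_PER_IMAGE = {"low": 10, "medium": 40, "high": 110, "auto": 40}


def batch_cost(quality: str, runs: int) -> int:
    """Total credits for `runs` generations at the given quality tier."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return CREDITS_PER_IMAGE[quality.lower()] * runs


# Four to eight variants at high quality cost 440 to 880 credits;
# eight drafts at low quality cost 80.
print(batch_cost("high", 4), batch_cost("high", 8), batch_cost("low", 8))
```

One way to use this is to draft at low quality to find a good composition, then spend high-quality credits only on the final run.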
Model Comparison
| Model | Highlights | Low/Image | High/Image |
|---|---|---|---|
| gpt-image-2 | Latest, best results | 10 | 110 |
Cost-Saving Tips
Enter a text prompt describing the image you want. Adjust parameters like size (Square, Landscape, Portrait), quality (Low/Medium/High), output format (PNG/JPEG/WebP), and background (Opaque or Transparent). Click "Generate" and GPT Image 2 will create a brand-new image from your description.
Tips for Better GPT Image 2 Results
Upload a source image, write a prompt describing the changes you want, and GPT Image 2 will modify the image accordingly. Without a mask, GPT Image 2 decides which areas to change. With a mask, you can precisely control which regions are modified.
You can use Edit mode for GPT Image 2 image-to-image generation without a mask. Simply upload a reference image and describe the transformation you want in the prompt — for example, "Convert this photo to a watercolor painting" or "Reimagine this scene in a cyberpunk style". GPT Image 2 will use your image as a reference to generate a new version.
Example GPT Image 2 Prompts for Image-to-Image