This is an image comparison test between the Z Image Base and Z Image Turbo models released by the Tongyi-MAI and Alibaba team. We used the basic KSampler settings officially recommended by the developers. The results shown here are not cherry-picked: each image is what we got on our first attempt.
KSampler Settings for Z Image Base [Base Variant]
Resolution: 1024x1024
CFG: 3.0 to 5.0
Steps: 28 to 50
Sampler: Res_multistep
Shift: 3
KSampler Settings for Z Image Turbo [Distilled Version]
CFG: 0 (for turbo mode); 1.0 for normal use
Steps: 9
Resolution: 1024x1024
Sampler: Euler
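For readers who script their runs, the two sets of settings listed above could be kept in a small Python structure. This is only a sketch: the names `BASE_SETTINGS`, `TURBO_SETTINGS`, and `within` are hypothetical and are not part of ComfyUI or of the models' tooling. The values are copied from the lists above, with a tuple standing for a "low to high" range, and the Turbo CFG of 1.0 for normal use reflects our reading of the list.

```python
# Hypothetical containers for the KSampler values listed in this article.
# A tuple means a (low, high) range; any other value is a fixed setting.
BASE_SETTINGS = {
    "resolution": "1024x1024",
    "cfg": (3.0, 5.0),
    "steps": (28, 50),
    "sampler": "Res_multistep",
    "shift": 3,
}

TURBO_SETTINGS = {
    "resolution": "1024x1024",
    "cfg": 0.0,  # turbo mode; the article also lists 1.0 for normal use
    "steps": 9,
    "sampler": "Euler",
}


def within(value, spec):
    """Return True if value equals a fixed setting or lies inside a (low, high) range."""
    if isinstance(spec, tuple):
        low, high = spec
        return low <= value <= high
    return value == spec


if __name__ == "__main__":
    # Sanity-check a planned run against the recommended Base ranges.
    print(within(4.0, BASE_SETTINGS["cfg"]))   # CFG 4.0 against the 3.0 to 5.0 range
    print(within(20, BASE_SETTINGS["steps"]))  # 20 steps against the 28 to 50 range
```

A check like this is mainly useful for catching a setting that has drifted outside the recommended range before a batch of images is generated.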
1. Ultra-Realistic Human Portrait (Extreme Skin & Micro Detail Test)
Images: Z Image Base | Z Image Turbo
Prompt:
An ultra-photorealistic portrait of a 32-year-old male fisherman standing alone on a wooden pier at sunrise near a calm ocean harbor. His skin should show realistic pores, subtle acne scars, faint sun damage, fine wrinkles around the eyes, slightly chapped lips, and visible beard stubble with uneven density. His expression is calm but thoughtful, eyes slightly squinting from the early morning sunlight. Small droplets of seawater cling to his dark green waterproof jacket. The jacket fabric must show woven texture, stitching seams, slight wear, and salt stains. The lighting should be soft golden-hour sunlight coming from behind him, creating a warm rim light around his hair and shoulders, with gentle shadows on the face. Include subtle subsurface scattering in the skin and realistic catchlights in the eyes. Use shallow depth of field (85mm lens simulation, f/1.8), keeping the face tack sharp while the ocean and boats in the background are softly blurred. The color grading should be cinematic but natural - no plastic skin smoothing, no beauty retouching. Maintain realistic human proportions and accurate anatomy. High dynamic range, ultra high resolution, extremely sharp fine detail.
Stresses: skin realism, micro-textures, lighting nuance.
2. Dense Urban Night Scene (Reflections + Crowd + Lighting Complexity)
Images: Z Image Base | Z Image Turbo
Prompt:
A hyper-detailed cinematic wide-angle photograph of a crowded Tokyo street at night during heavy rainfall. The asphalt road is soaked, creating mirror-like reflections of neon signs written in Japanese kanji and katakana. The neon lights glow in pink, blue, and cyan tones, with accurate light spill and bloom effects reflecting on puddles. Dozens of pedestrians walk through the frame holding transparent umbrellas covered with visible raindrops. Each umbrella should show correct refraction and light diffusion from surrounding signage. People should have varied clothing styles, realistic faces (not duplicated), and natural body proportions. Cars pass through the street creating long exposure-style light trails in red and white. Steam rises from nearby street food stalls, interacting realistically with the neon lighting. Include atmospheric volumetric light beams, subtle fog, and realistic rain streaks. Composition should feel dynamic but physically believable. Ensure perspective lines converge naturally. Ultra-high detail across foreground, midground, and background.
Stresses: reflections, duplication errors, text rendering, complex lighting.
3. Epic Fantasy Floating Castle (Structural Logic Test)
Images: Z Image Base | Z Image Turbo
Prompt:
An epic fantasy ultra-detailed scene of a massive floating castle suspended above a sea of clouds at sunset. The castle is built from white marble and gold-trimmed architecture, featuring gothic arches, flying buttresses, intricate carvings, stained glass windows, and layered towers. The structure must feel physically stable and architecturally coherent. Waterfalls cascade from multiple terraces of the castle, falling downward into the clouds below, creating mist where water meets air. The sunset sky should feature dramatic gradients of orange, magenta, and deep purple with realistic cloud scattering. Several large dragons circle the castle. Each dragon must have anatomically consistent wings, scales with individual texture detail, realistic muscle tension, and accurate shadow casting onto the castle walls. Use ultra-wide cinematic composition, high dynamic range lighting, crisp micro-detail in stone carvings, and correct atmospheric perspective (foreground sharp, distant elements slightly softened).
Stresses: geometry logic, fantasy realism, structural consistency.
4. Technical Scientific Diagram (Text + Precision Stress Test)
Images: Z Image Base | Z Image Turbo
Prompt:
A clean, textbook-quality scientific cross-sectional diagram of a lithium-ion battery. The diagram should be flat vector style on a white background with accurate proportions. Clearly label the following components in readable sans-serif typography: Anode, Cathode, Separator, Electrolyte, Current Collector, Positive Terminal, Negative Terminal. Include directional arrows showing lithium ion movement during charging and discharging phases. Text must be spelled correctly and evenly spaced. Lines should be sharp and precise with consistent stroke width. The layout should be symmetrical and uncluttered, suitable for a university-level engineering textbook. No distorted letters, no gibberish text, and no visual artifacts.
Stresses: spelling accuracy, typography alignment.
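One possible way to score the spelling side of this test is to type out (or otherwise transcribe) the labels visible in each generated diagram and compare them with the seven labels the prompt asks for. The sketch below is not how the images in this article were judged; `REQUIRED_LABELS` and `check_labels` are hypothetical names, and the 0.85 similarity cutoff is an arbitrary assumption chosen to catch one- or two-letter misspellings.

```python
import difflib

# The seven labels requested in the prompt above.
REQUIRED_LABELS = [
    "Anode", "Cathode", "Separator", "Electrolyte",
    "Current Collector", "Positive Terminal", "Negative Terminal",
]


def check_labels(transcribed):
    """Compare transcribed labels with the required ones (case-insensitive).

    Returns (missing, near_misses): labels with no exact match, and a mapping
    from a missing label to the closest transcribed string, if one is similar
    enough to look like a misspelling.
    """
    seen = sorted({t.strip().lower() for t in transcribed})
    missing = [label for label in REQUIRED_LABELS if label.lower() not in seen]
    near_misses = {}
    for label in missing:
        # cutoff=0.85 is an assumed threshold, not a value from the article.
        close = difflib.get_close_matches(label.lower(), seen, n=1, cutoff=0.85)
        if close:
            near_misses[label] = close[0]
    return missing, near_misses


if __name__ == "__main__":
    labels_read_from_image = [
        "Anode", "Cathode", "Seperator", "Electrolyte",
        "Current Collector", "Positive Terminal",
    ]
    print(check_labels(labels_read_from_image))
```

Separating labels that are missing outright from labels that are present but misspelled makes it easier to tell a layout failure from a text rendering failure.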
5. Surreal Concept Scene (Transparency + Lighting Blend)
Images: Z Image Base | Z Image Turbo
Prompt:
A surreal, dreamlike scene of a giant transparent human head floating above a calm ocean at twilight. The head must be anatomically proportionate and made of clear glass-like material with realistic refraction and internal reflections. Inside the head, instead of a brain, there is a miniature spiral galaxy with glowing stars, nebula clouds in purple and blue hues, and faint cosmic dust. The light from the galaxy softly illuminates the interior of the skull. Bioluminescent jellyfish float around the head in the surrounding air, emitting soft blue light that interacts realistically with the transparent surface. Ensure physically accurate transparency, correct light bending, subtle caustics, and cinematic volumetric atmosphere. Ultra-high resolution with sharp detail.
Stresses: transparency physics, internal light behavior.
6. Commercial Product Photography (Material Realism Benchmark)
Images: Z Image Base | Z Image Turbo
Prompt:
A professional studio-quality product photograph of a matte black luxury smartwatch positioned upright on a reflective glass surface. The watch case should show brushed metal texture, subtle micro-scratches, and clean edges. The glass screen must display a minimal digital interface with sharp readable numbers (10:08 time display). No distorted text. Use softbox lighting from both sides with realistic reflections and soft shadows beneath the watch. The reflection on the glass surface must be accurate and slightly faded. The background should be pure white with a soft gradient. The overall image must look like a premium tech advertisement suitable for Apple or Samsung-level marketing.
Stresses: material textures, reflections, text clarity.
7. Multi-Person Realism Test (Hands + Anatomy)
Images: Z Image Base | Z Image Turbo
Prompt:
A candid outdoor photograph of a family of four having a picnic in a green park during late afternoon. Two parents sit on a blanket while their two children run nearby laughing. The parents’ hands must look anatomically correct with five properly formed fingers, natural joints, and realistic proportions. Faces should be unique and expressive. Sunlight filters through tree leaves, casting natural dappled shadows across skin and clothing. Clothing fabric should show folds and realistic texture. Maintain correct perspective, natural posture, and no duplicated faces or distorted limbs.
Stresses: hands, anatomy consistency, duplication artifacts.
8. Typography Poster (Advanced Text Rendering Stress)
Images: Z Image Base | Z Image Turbo
Prompt:
A minimalist motivational poster with bold centered typography that reads exactly: “DISCIPLINE BUILDS FREEDOM”. Use a modern sans-serif font with clean kerning and consistent letter spacing. The text should be perfectly spelled and aligned. The background should be matte black with subtle grain texture. The white text must be sharp, high contrast, and print-ready quality at 4K resolution. No extra words, no spelling errors, no distorted letters, and no warped baseline alignment.
Stresses: text reliability under long instructions.
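Because the prompt demands an exact phrase with no extra words, the result can be scored mechanically once the text on the poster has been transcribed. The following is a rough sketch under that assumption, not the procedure used for this article; `EXPECTED_TEXT` and `score_poster_text` are hypothetical names, and the comparison is deliberately case-sensitive since the prompt gives the phrase in capitals.

```python
# The exact phrase requested in the prompt above.
EXPECTED_TEXT = "DISCIPLINE BUILDS FREEDOM"


def score_poster_text(transcribed):
    """Compare a transcription of the poster with the requested phrase.

    Whitespace and line breaks are normalized, so a phrase stacked over
    several lines still counts as exact. Matching is case-sensitive.
    """
    got = transcribed.split()
    want = EXPECTED_TEXT.split()
    return {
        "exact": " ".join(got) == EXPECTED_TEXT,
        "missing_words": [w for w in want if w not in got],
        "extra_words": [w for w in got if w not in want],
    }


if __name__ == "__main__":
    # A poster with the phrase stacked on three lines.
    print(score_poster_text("DISCIPLINE\nBUILDS\nFREEDOM"))
    # A poster with one misspelled word and one extra word.
    print(score_poster_text("DISCIPLNE BUILDS FREEDOM NOW"))
```

Note that a misspelled word shows up twice in the result, once as a missing word and once as an extra word, which is usually enough to spot it at a glance.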
9. Transparent Glass + Physics Logic Test
Images: Z Image Base | Z Image Turbo
Prompt:
A macro studio photograph of a perfectly clear glass cube filled completely with water. Inside the cube sits a small white wax candle that is lit with a visible flame. The water should show realistic refraction and light distortion. The candle flame should appear physically accurate, and interactions between flame and water must be logically consistent. Studio lighting setup with soft shadows and high detail. Photorealistic physics simulation.
Stresses: logical reasoning, physical contradictions.
10. Extreme Macro Nature Shot
Images: Z Image Base | Z Image Turbo
Prompt:
An extreme macro photograph of a honeybee collecting pollen from a sunflower. The bee’s tiny hairs must be individually visible with pollen grains attached. The sunflower center should show spiral seed patterns with natural imperfection. Use shallow depth of field (macro lens simulation), with only part of the bee in sharp focus. Natural sunlight with subtle shadow transitions. Ultra high resolution, photorealistic detail.
Stresses: micro-detail rendering.