ChatGPT Images 2.0 vs 1.5: Is this update really an upgrade?

Every time an AI company launches a new image model, I compare it with other image models on the market, because benchmark numbers are one thing and real-world usage is a whole other. This time, however, I wanted to mix things up, so I pulled some of the images from when I compared ChatGPT Images 1.5 with Gemini Nano Banana and ran the same prompts on the newer ChatGPT Images 2.0. These prompts test typography accuracy, simplicity, optical physics, interpretation, and hyperrealism. Same prompts, same model family, a different generation, side by side. Here’s what actually happened.

Retro-futuristic typography

The prompt: “A vintage travel poster for ‘Neo-Tokyo 2050’. The main title ‘NEO-TOKYO’ should be in a bold, retro-futuristic chrome font at the top. At the bottom, in a smaller serif font, write ‘Where Tradition Meets Silicon’. In the center, a neon sign on a building must clearly read ‘OPEN 24/7’ in bright red. Ensure all text is spelled correctly and legible.”

Both models spelled everything correctly, so no regression there. But that’s about the only area where 1.5 was on par with 2.0. Its chrome title is flat and warm-toned, the neon sign looks like clip-art, and the overall scene feels like anime fan art. 2.0 came back with an iridescent, purple-chrome title that looks more retro-futuristic, a neon sign mounted on a building like a real sign, wet streets, cherry blossoms, Tokyo Tower, Mt. Fuji in the background, and a flying police car with actual Japanese text on it. Both look like decent posters, and you can see the similarities, but 2.0 just does everything better. 2.0 wins.

Product photography

Prompt: “A red ceramic coffee mug on a wooden table. No logo, no text, no spoon, no steam. Neutral daylight, realistic shadows, product photography style 16:9”

ChatGPT Images 1.5

As we saw in the Gemini comparison, 1.5 ignored the 16:9 aspect ratio entirely and gave a 3:2 image. It also added a blurred plant and window in the background, neither of which I had asked for. 2.0 hit 16:9 correctly, kept the background clean, and had more accurate shadows with better ceramic gloss. 2.0 wins.
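If you want to check aspect ratios on your own generations rather than eyeballing them, a few lines of Python will do. This is only a sketch: the pixel dimensions below are hypothetical examples, not the actual sizes of the images in this comparison, and the 1% tolerance is an arbitrary choice.

```python
from fractions import Fraction


def aspect_ratio(width: int, height: int) -> Fraction:
    """Reduce pixel dimensions to their simplest width:height ratio."""
    return Fraction(width, height)


def matches_ratio(width: int, height: int,
                  target_w: int, target_h: int,
                  tolerance: float = 0.01) -> bool:
    """Return True if width/height is within `tolerance` (relative) of the target ratio."""
    actual = width / height
    target = target_w / target_h
    return abs(actual - target) / target <= tolerance


# Hypothetical dimensions, chosen only to illustrate a 3:2 and a 16:9 image.
print(aspect_ratio(1536, 1024))          # 3/2
print(matches_ratio(1536, 1024, 16, 9))  # False
print(matches_ratio(1920, 1080, 16, 9))  # True
```

The tolerance is there because generators often round to sizes that are close to, but not exactly, the requested ratio.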

ChatGPT Images 2.0

Optical physics

Prompt: “A hyper-realistic close-up of a clear glass sphere sitting on a wooden table. Inside the glass sphere is a precise analog clock face showing exactly 10:10. Through the glass, the wooden table grain should be visible but optically distorted and inverted by the refraction. Soft morning sunlight hitting from the right.”

ChatGPT Images 1.5

1.5 rendered what looks like a glass-cased desk clock. There is no clear refraction or distortion, and the wood grain around the sphere shows no bending whatsoever. 2.0 actually attempted the physics. The wood grain visibly warps and bends as it passes through the sphere’s curve, and there are caustic light patches cast on the table beneath it. It doesn’t fully nail the inversion, but it is getting closer. Maybe ChatGPT Images 2.5 gets this entirely right. 2.0 wins.
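For anyone wondering why the prompt expects an inverted view at all: a solid glass sphere acts as a "ball lens" with a very short focal length, so anything beyond its focal point appears flipped when seen through it. Here is a rough sketch using the standard paraxial ball-lens formulas. The 80 mm diameter and the refractive index of 1.5 (typical for ordinary glass) are my assumptions, not values taken from the prompt or the images.

```python
def ball_lens_focal_lengths(diameter_mm: float, refractive_index: float) -> tuple[float, float]:
    """Paraxial focal lengths of a solid sphere in air.

    Returns (effective focal length measured from the sphere's center,
             back focal length measured from the sphere's surface).
    """
    efl = refractive_index * diameter_mm / (4 * (refractive_index - 1))
    bfl = efl - diameter_mm / 2
    return efl, bfl


# Assumed values: an 80 mm sphere made of glass with n = 1.5.
efl, bfl = ball_lens_focal_lengths(80.0, 1.5)
print(f"Effective focal length: {efl:.1f} mm from the center")
print(f"Back focal length: {bfl:.1f} mm from the surface")
```

Under these assumptions the focal point sits only a couple of centimeters outside the glass, so the stretch of table farther behind the sphere should look upside down through it, while the grain directly under the contact point would be magnified rather than flipped.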

ChatGPT Images 2.0

Infographic design

Prompt: “A clean, modern infographic illustration of a coffee bean’s lifecycle. Three distinct stages labeled clearly: 1. ‘Harvest’, 2. ‘Roast’, and 3. ‘Brew’. Connected by simple directional arrows. White background, vector art style.”

1.5 followed the brief. Three stages, correct labels, clean arrows, white background. But it doesn’t look like a proper infographic. It showed a berry branch, two raw beans, and a finished coffee cup. That’s not a lifecycle. That’s three product icons. 2.0 showed coffee cherries at varying ripeness being harvested, beans tumbling in a roasting pan with heat and steam, and a pour-over setup brewing coffee. 2.0 illustrated an actual sequence.

2.0 wins.

Extreme macro portrait

Prompt: “Macro photography of an elderly fisherman’s eye and upper cheek. Extreme detail on the weathered skin texture, sunspots, and deep crow’s feet wrinkles. The eye is bright blue, reflecting a calm ocean horizon. Individual eyelashes and pores must be visible. 85mm lens, f/1.8 aperture.”

1.5 looks dramatic. Heavily warm, high-contrast, almost cinematic, the kind of image that gets likes on Instagram. But the eyelashes are soft and blended, and the skin has an over-processed HDR quality that looks AI-generated. It is AI-generated, of course, but that is still worth noting.

2.0 is quieter but more accurate. Individual eyelashes are resolved cleanly. The bokeh falloff is gradual in a way that matches real 85mm glass. Capillaries are visible under the lower lid. It looks like a photograph. It is still AI-generated, but it closes that gap a lot better.

2.0 wins.

Verdict

Five prompts. Five wins for 2.0. 2.0 didn’t just produce prettier images. It understood what the prompts were actually asking for. It followed constraints. It reasoned about optical physics. It knew how to show a lifecycle. 1.5 made attractive images. But attractive isn’t the same as correct. So ChatGPT Images 2.0 is an upgrade over 1.5, just not enough for it to be better than Gemini, for me at least.

Vyom Ramani

A journalist with a soft spot for tech, games, and things that go beep. While waiting for a delayed metro or rebooting his brain, you’ll find him solving Rubik’s Cubes, bingeing F1, or hunting for the next great snack.
