Gemini Image Editing: A Guide With 10 Practical Examples

Learn how to edit images using Gemini 2.0 Flash with practical examples and understand its strengths and limitations.
Apr 2, 2025  · 10 min read

Gemini 2.0 Flash now has the ability to generate and edit images directly in its chat interface. The interface is fully text-based and doesn’t require masking the image to specify which parts we want to modify.

In this article, I’ll explain step by step how to edit images using Gemini 2.0 Flash and walk through 10 practical examples that show how to achieve professional-quality results without extensive technical know-how.

How to Access Gemini Image Editing

To access Google’s image editing tools, we need to go to Google AI Studio. There, we can select the “Gemini 2.0 Flash (Image Generation) Experimental” model, which is available for free. This might become available in the main Gemini app in the future, but for now, Google AI Studio is the only option.

To get started, we select this model from the model dropdown located on the right-hand side of the UI:

how to select gemini 2.0 flash image generation in google ai studio

After selecting the model, we need to click on the "Create Prompt" button located on the left-hand side to begin a new conversation:

how to use gemini 2.0 flash image generation in google ai studio

We can use it for various image-generation tasks. For example, we can create vibrant landscapes, modify existing images to add a creative twist, or generate entirely new visual elements for our projects.  Here’s an example:

Example image generation with Gemini 2.0 Flash image generation

While Gemini 2.0 Flash is equipped with robust capabilities for generating new images, in this article, we will primarily focus on its image editing features. The goal is to explore how we can use the tool to refine and enhance existing visuals.
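Everything in this article is done through the Google AI Studio UI, but the same kind of text-only edit request (one image plus one instruction, no mask) can be sketched as a JSON body. The snippet below is a minimal sketch, not a confirmed recipe: the model ID, the `generateContent`-style field names, and the helper `build_edit_request` are assumptions on my part, and the code only builds the payload without sending it anywhere.

```python
import base64
import json

# Assumed model ID for the experimental image-generation model; check the
# current model list before relying on it.
MODEL_ID = "gemini-2.0-flash-exp-image-generation"


def build_edit_request(image_bytes, instruction, mime_type="image/png", temperature=1.0):
    """Build a generateContent-style JSON body: one image plus a text instruction.

    Hypothetical helper for illustration; field names are assumed to follow
    the public Gemini REST API and should be checked against its reference.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    # The edit is described in plain text; no mask is attached.
                    {"text": instruction},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ],
        "generationConfig": {
            # Ask for both text and image parts in the reply.
            "responseModalities": ["TEXT", "IMAGE"],
            "temperature": temperature,
        },
    }


if __name__ == "__main__":
    fake_image = b"placeholder bytes"  # stand-in, not a real image file
    body = build_edit_request(
        fake_image, "Move the person on the left closer to the center."
    )
    print(MODEL_ID)
    print(json.dumps(body, indent=2)[:300])
```

Keeping the payload construction separate from the network call makes it easy to inspect what would be sent before wiring in an API key and an HTTP client.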

In my exploration of Gemini 2.0 Flash, I tested the tool on 10 practical examples to assess its capabilities. From transforming an ordinary photo into a professional CV headshot to creating eye-catching YouTube thumbnails, the goal was to understand the strengths and limitations of the model to better identify what it can be most useful for.

1. Moving Elements in a Photo

In this example, I explored how the model handles moving elements within a photo. This feature can be especially handy when we want to reposition parts of an image to achieve better composition or adjust the focus.

For instance, if we have a group photo where one person is slightly out of place, we can use this tool to shift them closer to the center.

Moving elements in a photo using gemini image editing

Image source: Pexels

2. Product Photography With High-Quality Images

In this scenario, let's imagine we are a shoe company aiming to generate a photo of a model wearing our latest footwear. Traditionally, shooting new promotional content requires coordinating a photo shoot, which can be costly and time-consuming. 

This example shows how we can combine an existing photo of a model with a photo of our product:

Combining two high-quality photos to showcase what the product looks like

Image sources: first image (Pexels), second image (Pexels).

I also tried to combine a person and an object in a more complex scenario by asking Gemini to make a person from one photo hold a toy car from another. This worked nicely:

Combining people with object in a complex way

Image sources: first image (Pexels), second image (Pexels).

We don’t necessarily need two images for this use case. If we only have the product image, we can also let the model generate the rest:

Placing a product in an AI generated background

Image source: Pexels

3. Enhancing Product Photography

The previous example shows how the model handles combining existing quality photos. But what if we have a bad photo of our product and want to elevate it?

In this example, I took a photo of a cake and asked the model to turn it into an image that could be used in a food-ordering app. Enhancing food photography can be quite a challenge, especially when aiming to create mouth-watering visuals that captivate viewers on online platforms.

This time, the model couldn’t produce anything useful. I tried numerous times and kept getting poor outcomes. Here’s one example:

Enhancing an existing product photo

4. Modifying Poses

Next, I tried to see how effectively Gemini 2.0 Flash can alter a person's pose within an image. Modifying poses is a useful feature, especially in fashion photography, where the impact of a scene can dramatically change with subtle shifts in body language. 

First, I asked it to make the subject face the camera:

Making someone face the camera

Image source: Pexels

The result is quite good, considering that part of the person’s face wasn’t visible in the original image, so the model had to generate it in a way that stays consistent with the rest of the photo.

Then, I was curious to see if it could make the man turn his back to the camera, and it also worked very well:

Making someone turn their back to the camera

This example demonstrates the model’s proficiency in handling pose modifications. It blends the adjustments with the existing elements of the image, ensuring that the final result looks coherent and professional.

5. Modifying Facial Expressions

We know the model is good at changing body poses. How about facial expressions?

Changing facial expressions

Image source: Pexels

Imagine having a photo of a team meeting where everyone looks a bit serious, but we want to convey a more relaxed or joyful atmosphere. With Gemini 2.0 Flash, we can subtly adjust facial expressions to add smiles or soften frowns, transforming the overall feel of the image.

This example highlights the model’s ability to manage expression changes efficiently, allowing us to tailor images to fit the narrative we wish to present. 

6. YouTube Thumbnails

In the following example, we explore how Gemini can be used to create an engaging YouTube thumbnail by merging two images and incorporating text. Thumbnails are crucial for capturing viewers' attention, and a well-designed thumbnail can significantly impact the success of a video.

Making youtube thumbnails

Image sources: first image (Pexels), second image (Pexels).

I don’t have experience with what makes a good thumbnail, so I can’t judge that aspect of the result, but the model did exactly what it was asked to do and even handled the text very well, which is something image models usually struggle with.

I’m convinced that someone with the know-how on what makes a captivating thumbnail can use this to very quickly generate thumbnails for their videos.

7. Generating Diagrams From Sketches

In my attempts to create diagrams from paper sketches, I encountered some challenges. While the model is equipped with advanced features for image editing, converting sketches into polished diagrams is a specific task that may require additional finesse.

Enhancing a complex diagram

In this example, the AI struggled to accurately interpret the hand-drawn elements, leading to a less-than-ideal result.

I gave it a second chance with a simpler diagram. This time, the diagram it generated followed the instructions accurately, but the design was poor:

Enhancing a simple diagram

8. Transforming Photos for CV Use

Imagine having a photo where the attire is too casual or doesn't quite fit the professional image we're aiming for. With Gemini 2.0 Flash, we can alter clothing in the image to better match the formal look required for professional settings.

Making a casual photo into a professional one

Image source: Pexels

This approach makes it significantly easier to update our professional image, helping us make a great first impression.

9. Modifying Elements in a Photo

Let's say we have a product photo with a background that doesn’t quite match the rest of our marketing materials. With Gemini 2.0 Flash, we can effectively replace or alter the background to better suit our requirements without losing the essence of the product itself.

Modifying elements in a photo

Image source: Pexels

This feature offers us a practical solution for refining visual content, adapting it to suit diverse scenarios, and maximizing the impact of our images.

10. Adding Objects to an Existing Photo

In this final example, I wanted to see how the model can handle adding new elements to a photo. 

Adding elements to a photo

Image source: Pexels

This capability allows us to flexibly enhance our photos, adapt images for varied uses, and align visuals with our creative ideas.

Limitations of Gemini Image Editing

Refusal to generate images

While experimenting with Gemini Flash, the model often refused to generate the image:

Model refusal to generate images

I assume this happens because Google has a filter that blocks certain requests to prevent unethical usage. However, I don’t know exactly why or when it triggers, because it seemed to happen quite randomly and for content that, in my view, was harmless, like the example above.

Regeneration and temperature

I didn’t cherry-pick the examples by modifying my prompts until they worked, but I did have to ask Gemini to try again very often.

Here’s how to re-run the image generation:

How to rerun the image generation in Gemini Google AI Studio
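If we ever script this kind of workflow instead of clicking the rerun button, the "try again" habit can be captured in a small retry loop. The sketch below is only an illustration: `generate` is a hypothetical callable standing in for whatever actually produces the image, and it is assumed to return `None` when the model refuses or returns no image.

```python
def generate_with_retries(generate, prompt, max_attempts=5):
    """Call `generate(prompt)` until it returns a result or attempts run out.

    `generate` is a hypothetical callable that returns image bytes, or None
    when the model refuses or produces no image.
    Returns a tuple of (result, number_of_attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        result = generate(prompt)
        if result is not None:
            return result, attempt
    return None, max_attempts


def make_flaky_generator(failures):
    """Return a stand-in generator that refuses `failures` times, then succeeds."""
    state = {"calls": 0}

    def generate(prompt):
        state["calls"] += 1
        if state["calls"] <= failures:
            return None  # simulated refusal
        return b"image for: " + prompt.encode("utf-8")

    return generate


if __name__ == "__main__":
    fake = make_flaky_generator(failures=2)
    image, attempts = generate_with_retries(fake, "Make the subject face the camera")
    print(f"Got {len(image)} bytes after {attempts} attempts")
```

Capping the number of attempts matters here because, as noted above, refusals can look random, and an uncapped loop could retry indefinitely on a prompt the model will never accept.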

A critical factor that can significantly influence the quality of image editing results with Gemini 2.0 Flash is the 'temperature' setting. This setting, common to generative models, controls how deterministic or random the model's output will be.

In many cases, the default temperature is set to 1, which can sometimes be too random and lead to outcomes that may not align with our expectations. A higher temperature value allows for more variation and creativity, producing diverse results that might stray too far from the specific look we’re aiming for. Conversely, a lower temperature setting reduces randomness, making the output more predictable and consistent.

Setting the output temperature

Through my experience, I found that adjusting the temperature setting can better tailor the results to my needs. If the results seem too unpredictable, lowering the temperature can help bring the output closer to my intended vision. On the other hand, if I seek more creative or varied alternatives, a slightly higher temperature might be beneficial.
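To build some intuition for what temperature does, here is a generic sketch of temperature-scaled sampling probabilities. It illustrates the general idea used by generative models and is not a description of Gemini's internals; the scores and the function name `softmax_with_temperature` are made up for the example.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate outputs
    for temperature in (0.5, 1.0, 1.5):
        probs = softmax_with_temperature(scores, temperature)
        rounded = [round(p, 3) for p in probs]
        print(f"temperature={temperature}: {rounded}")
```

At a lower temperature, most of the probability goes to the top-scoring option, which matches the more predictable behavior described above. At a higher temperature, the probabilities even out, so less likely options get picked more often.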

Image quality

When using the model, a common issue I encountered was the reduction in image resolution, particularly as I generated multiple iterations. Each time we regenerate or apply edits, the resolution can degrade, leading to a loss of sharpness and detail.

This happens because each edit or regeneration process can introduce small artifacts or distortions. Over multiple iterations, these can accumulate, resulting in an overall decline in image quality. The phenomenon is akin to making a copy of a copy, where each subsequent version tends to lose clarity compared to the original.

Even after a single iteration, the quality is reduced drastically:

Image quality degradation
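The copy-of-a-copy effect can be illustrated with a toy simulation. This is only an analogy under strong assumptions: a row of pixel values stands in for an image, and each "edit" is modeled as a light blur followed by rounding. It says nothing about how Gemini actually processes images, but it shows how small losses compound when each pass starts from the previous output.

```python
def lossy_pass(pixels):
    """One simulated edit: a light 3-tap blur, then rounding back to integers."""
    n = len(pixels)
    out = []
    for i in range(n):
        left = pixels[max(i - 1, 0)]
        right = pixels[min(i + 1, n - 1)]
        out.append(round((left + 2 * pixels[i] + right) / 4))
    return out


def mean_abs_error(a, b):
    """Average absolute difference between two equally long pixel rows."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def simulate(pixels, passes):
    """Apply `passes` lossy edits in a chain; record (contrast, error) after each."""
    history = []
    current = list(pixels)
    for _ in range(passes):
        current = lossy_pass(current)
        contrast = max(current) - min(current)
        history.append((contrast, mean_abs_error(pixels, current)))
    return history


if __name__ == "__main__":
    original = [0, 255] * 8  # sharp black-and-white stripes
    for step, (contrast, error) in enumerate(simulate(original, 5), start=1):
        print(f"pass {step}: contrast={contrast}, mean abs error={error:.1f}")
```

In this toy model, fine detail is already heavily smoothed after the first pass, and the contrast never recovers in later passes. A practical takeaway is to go back to the original image whenever possible rather than editing an already edited output.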

Conclusion

Overall, Gemini 2.0 Flash proves to be a functional tool with a variety of useful features, though my experience with it has not always been without challenges. While it generally performs well, there were instances where it turned out to be quite frustrating, especially when the desired results required multiple attempts or when image quality decreased over continuous editing.

That said, I found the tool particularly adept when dealing with images of people. Tasks such as modifying poses, combining individuals with objects, and altering facial expressions stood out as areas where the model excelled. When it came to adjusting someone's pose or introducing a new object into a scene with a person, Gemini 2.0 Flash handled these tasks with precision, maintaining a natural look and feel.

Similarly, tweaking facial expressions to change the mood of a photo was effectively managed, allowing us to create the atmosphere we envisioned effortlessly.

While some tasks may require patience and additional refinement, the model's strengths, particularly in editing images with people, make it a valuable resource for enhancing visual projects.


Author
François Aubry
Full-stack engineer & founder at CheapGPT. Teaching has always been my passion. From my early days as a student, I eagerly sought out opportunities to tutor and assist other students. This passion led me to pursue a PhD, where I also served as a teaching assistant to support my academic endeavors. During those years, I found immense fulfillment in the traditional classroom setting, fostering connections and facilitating learning. However, with the advent of online learning platforms, I recognized the transformative potential of digital education. In fact, I was actively involved in the development of one such platform at our university. I am deeply committed to integrating traditional teaching principles with innovative digital methodologies. My passion is to create courses that are not only engaging and informative but also accessible to learners in this digital age.