Imagen 3: A Guide With Examples in the Gemini API

Learn how to generate images using Google’s Imagen 3 API with Python, including setting up your environment and adjusting options like aspect ratio and safety filters.

Feb 26, 2025 · 12 min read

Imagen 3 is a text-to-image generation model which can accurately render detailed scenes in various styles and even incorporate text within images, making it suitable for applications like advertisement and media content creation.

We can use Imagen 3 through the Vertex AI Studio interface or integrate it directly into applications via an API.

In this tutorial, I’ll explain how to generate images with Imagen 3 using the Google Generative AI API with Python. I'll walk you through practical examples that show how to set up the environment, write the necessary code, and integrate this technology into your projects. Whether you are looking to enhance your applications with dynamic imagery or simply satisfy your curiosity, this guide will provide a straightforward path to understanding and using Imagen 3 programmatically.

Setting Up the Google Generative AI API

To get started with Imagen 3 in Python, we need to perform a couple of essential steps. First, we'll create a Google Cloud project. Once our project is ready, we need to generate an API key. This key will allow our Python code to interact with the Imagen 3 service.

Creating a Google Cloud project

Creating a Google Cloud project is the first step for using Google's AI API. Let's walk through the process together:

Access the Google Cloud console: We begin by navigating to the Google Cloud Console. You’ll need to sign in with your Google account.
Select or create a new project: In the console, look for the project dropdown menu on the top navigation bar. Clicking on it will reveal options to select an existing project or create a new one. We choose "New Project" to continue.
Provide project details: A simple form will appear asking for basic information. We can choose any name for the project, but it's helpful to pick something descriptive. This can be changed later if needed. We'll use Imagen-tutorial. The organization can be left blank.

Creating the API key

Now that the Google Cloud project is created, we can create an API key by navigating to the API key page in Google AI Studio.

To create the key, click the "Create API key" button:

On the popup, type the name of the project we created above, select it, and click "Create API key in existing project":

Copy the key and create a file named .env located in the same folder where we’ll write the Python script. The content of the file .env should be:

GEMINI_API_KEY=<paste_your_key_here>

Billing account

If we tried using the API key now, we would get an error saying "Imagen API is only accessible to billed users at this time." This happens because the API isn't free, and we need to associate a billing account with the project before being able to use it.

At the time of writing, the price of generating an image with Imagen 3 is $0.03. Check their pricing page for more information.
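As a rough budgeting aid, the per-image price can be turned into a quick estimate before running a batch of prompts. This is a minimal sketch: `PRICE_PER_IMAGE_USD` hard-codes the $0.03 figure quoted above (an assumption that may be outdated), and `estimate_cost` is a hypothetical helper, not part of any Google library.

```python
from decimal import Decimal

# Assumption: the $0.03 per-image price quoted in this tutorial.
# Check the official pricing page for the current value.
PRICE_PER_IMAGE_USD = Decimal("0.03")


def estimate_cost(num_requests: int, images_per_request: int = 1) -> Decimal:
    """Return the estimated cost in USD for a batch of image requests."""
    if num_requests < 0 or images_per_request < 0:
        raise ValueError("counts must be non-negative")
    return PRICE_PER_IMAGE_USD * num_requests * images_per_request


if __name__ == "__main__":
    # 50 prompts with 4 images each is 200 images in total.
    print(f"Estimated cost: ${estimate_cost(50, 4):.2f}")
```

Using `Decimal` rather than `float` keeps the cents exact when the amounts are summed.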

To add a billing account, click the "Set up Billing" button next to the API key in Google AI Studio.

This will redirect us to the Google Cloud website, where we can either select an existing billing account by clicking "Link a billing account" or create a new one by clicking "Manage billing accounts."

Let's assume we don't have one, so we click "Manage billing accounts." On the top left of the page, there's a "Create account" button. To create the account, we need to fill in our personal information and provide a credit card to process the payments.

Generating an Image with Imagen 3 Using Python

Environment setup with Anaconda

In this tutorial, we use Anaconda, a popular platform that simplifies package management and project setup, making it easier to run Python scripts.

We can install Anaconda from their official website.

To set up the Anaconda environment, open a terminal and:

Create an environment: conda create -n imagen python=3.9. This command creates an environment named imagen that uses version 3.9 of Python.
Activate the environment: conda activate imagen.
Install the Google generative AI package: pip install -q -U google-genai.
Install an image processing package to process the generated images: pip install pillow.
Install a package to load the API key: pip install python-dotenv

We can also install the packages directly without using Anaconda, but there’s a risk that some of the currently installed packages might conflict with some of the new packages or that our Python installation uses a different version of Python. Using Anaconda avoids these potential problems.
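Whichever installation route we choose, it can help to confirm that the three packages are visible from the active environment before writing any code. The sketch below uses only the standard library. `missing_packages` is a hypothetical helper, and the distribution names are the ones passed to pip in the steps above.

```python
from importlib import metadata

# Distribution names as installed with pip in the steps above.
REQUIRED = ("google-genai", "pillow", "python-dotenv")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing packages:", missing_packages() or "none")
```

Running this inside the activated imagen environment should report no missing packages if the installation steps succeeded.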

Generating our first image

We're now ready to start using Imagen 3. Create a new Python script, for example, gen_image.py, in the same folder as the .env file.

First, we import the necessary packages:

# Google generative AI:
from google import genai
from google.genai import types

# Packages to process the generated image:
from PIL import Image
from io import BytesIO

# Packages to load the .env file:
import os
from dotenv import load_dotenv

Next, we load the API key from the .env file:

load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")
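Note that os.getenv() returns None when the variable is not set, which only surfaces later as a less obvious authentication error. A small guard can fail early with a clearer message. This is a sketch under the assumption that the key lives in GEMINI_API_KEY as in the .env file above; `require_api_key` is a hypothetical helper name.

```python
import os


def require_api_key(name: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`, or raise a clear error."""
    value = (os.getenv(name) or "").strip()
    # Treat an unfilled "<paste_your_key_here>" placeholder as "not set".
    if not value or value.startswith("<"):
        raise RuntimeError(
            f"{name} is missing or empty; check the .env file next to the script."
        )
    return value
```

It would be called right after load_dotenv(), for example `api_key = require_api_key()`.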

Then, we initialize the Google generative AI client. This is the object that allows us to communicate with the Google API:

client = genai.Client(api_key=api_key)

To generate an image, we use the client.models.generate_images() function:

prompt="""
A dog surfing at the beach
"""
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
    config=types.GenerateImagesConfig(
        number_of_images=1,
    )
)

Finally, we display the generated image using the Image object:

for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()

Here's the result:
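image.show() only opens a preview, so the image is lost once the viewer is closed. To keep the file, the raw bytes can be written to disk. The sketch below is one possible approach and makes no assumption about which format the API returns: it inspects the leading bytes (the standard PNG and JPEG signatures) to pick an extension. `guess_extension` and `save_image_bytes` are hypothetical helper names.

```python
from pathlib import Path

# Standard file signatures for the two most common encodings.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def guess_extension(data: bytes) -> str:
    """Guess a file extension from the leading bytes of an encoded image."""
    if data.startswith(PNG_MAGIC):
        return ".png"
    if data.startswith(JPEG_MAGIC):
        return ".jpg"
    return ".bin"  # unknown format; the raw bytes are still saved


def save_image_bytes(data: bytes, stem: str, folder: str = ".") -> Path:
    """Write `data` to `folder/<stem><ext>` and return the resulting path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{stem}{guess_extension(data)}"
    path.write_bytes(data)
    return path
```

Inside the loop above, it could be used as `save_image_bytes(generated_image.image.image_bytes, "dog_surfing")`.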

Text generation

One of the interesting features of Imagen 3, when compared to other text-to-image models, is the ability to generate text. Let’s try it out by having it generate the word “Tea” using fresh tea leaves:

prompt="""
Word "tea" made from fresh tea leaves, white background
"""
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
)

Here’s the result:

Image Generation Options With Imagen 3

The image generation options are passed through a types.GenerateImagesConfig object. In the above example, we only specified the number of images to generate:

config=types.GenerateImagesConfig(
    number_of_images=1,
)

In this section, we explore the other options the Imagen 3 API provides. Check the official documentation for more information.

Generating multiple images

We can use the number_of_images parameter to generate multiple images with a single prompt. By default, four images are generated.

Let's try generating two images for a comic book.

prompt="""
Single comic book panel of two people overlooking a destroyed city. 
A speech bubble points from one of them and says: I guess this is the end.
"""
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
    config=types.GenerateImagesConfig(
        number_of_images=2,
    )
)

Here's the result:

This is another example of generating text in images. The image on the left contains some extra unwanted text, but generating several images at once increases our chances of getting the desired output. The image on the right almost completely complies with our prompt.
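The benefit of requesting several images can be quantified with a back-of-the-envelope calculation. The sketch below assumes each image is acceptable independently with the same probability, which is a simplification, and the 50% value used in the demo is purely illustrative rather than a measured success rate.

```python
def chance_of_at_least_one_good(p_single: float, n_images: int) -> float:
    """Probability that at least one of `n_images` is acceptable.

    Assumes each image is acceptable independently with probability `p_single`.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    # Complement of "every image is unacceptable".
    return 1.0 - (1.0 - p_single) ** n_images


if __name__ == "__main__":
    # Illustrative value only: suppose half of the images are acceptable.
    for n in range(1, 5):
        print(n, round(chance_of_at_least_one_good(0.5, n), 4))
```

Under these assumptions, going from one image to two raises the chance of a usable result from 50% to 75%, at twice the cost.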

Controlling the aspect ratio

By default, the generated images are squares with a 1:1 aspect ratio. The model supports the following aspect ratios: 1:1, 3:4, 4:3, 9:16, and 16:9.
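When the target is a specific screen or canvas size, we need to map its dimensions to one of these five ratios. The sketch below picks the supported ratio whose value is numerically closest to width/height. `closest_aspect_ratio` is a hypothetical helper, and the list of ratios is copied from the paragraph above.

```python
# Aspect ratios listed in this tutorial as supported by the model.
SUPPORTED_ASPECT_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio string closest to width / height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = width / height

    def distance(ratio: str) -> float:
        w, h = (int(part) for part in ratio.split(":"))
        return abs(w / h - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)
```

For instance, a 1080 by 1920 phone screen maps to "9:16", and the returned string can be passed as the aspect_ratio option.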

Let's generate a 9:16 image to use as a phone background:

prompt="""
A drone shot of a river flowing between mountains with a stormy sky.
"""
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
    config=types.GenerateImagesConfig(
        aspect_ratio="9:16",
    )
)

Here's the result:

Safety level filter

The documentation mentions that we can use the safety_filter_level to specify the image filtering level. Each generated image gets a probability score that measures the probability that the image is unsafe (for example, inappropriate content).

Setting the safety level filter is important because it helps ensure the generated content is appropriate and aligns with user preferences, thereby maintaining a safe and respectful environment for various applications.

The documentation says that it supports three levels:

BLOCK_LOW_AND_ABOVE: Block images even with a low probability score.
BLOCK_MEDIUM_AND_ABOVE: Block only images with medium and high probability scores.
BLOCK_ONLY_HIGH: Block only images with a high probability score.

However, in my experiments, the API only accepted the BLOCK_LOW_AND_ABOVE option at the time of writing. Providing any other value resulted in an error.
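Given that mismatch, one option is to validate the value locally before sending a request, so a rejected level fails fast with a readable message. The sketch below is an illustration only: `check_safety_level` is a hypothetical helper, the level names are those listed above, and `ACCEPTED_SAFETY_LEVELS` encodes the observation reported in this tutorial, which may no longer hold.

```python
# Level names as listed in the documentation quoted above.
DOCUMENTED_SAFETY_LEVELS = (
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
)

# Assumption: per the experiment described in this tutorial, only this
# level was accepted by the API. Update this tuple if that changes.
ACCEPTED_SAFETY_LEVELS = ("BLOCK_LOW_AND_ABOVE",)


def check_safety_level(level: str) -> str:
    """Return `level` if it is expected to be accepted, else raise ValueError."""
    if level not in DOCUMENTED_SAFETY_LEVELS:
        raise ValueError(f"Unknown safety filter level: {level!r}")
    if level not in ACCEPTED_SAFETY_LEVELS:
        raise ValueError(
            f"{level} is documented but was rejected by the API in testing"
        )
    return level
```

The validated value could then be passed as safety_filter_level in the generation config.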

Person generation

We can control whether the model is allowed to generate people using the person_generation option. It provides two options:

DONT_ALLOW: Images with people in them will be blocked.
ALLOW_ADULT: This will allow us to generate images with people (adults only) in them.

The default option is to allow people. For example, if we set the option to not allow people and try to generate an image of someone cooking, we won't get any images.

prompt="""
A person cooking.
"""
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
    config=types.GenerateImagesConfig(
        person_generation="DONT_ALLOW",
    )
)
print(response.generated_images)

Here’s the output:

None
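Because generated_images can be None when everything is filtered, looping over it directly, as in the earlier display code, would raise a TypeError. A small defensive wrapper avoids that. In this sketch, `images_or_empty` is a hypothetical helper and the SimpleNamespace object is only a stand-in for a real response.

```python
from types import SimpleNamespace


def images_or_empty(response) -> list:
    """Return the generated images as a list, treating None as 'no images'."""
    return list(getattr(response, "generated_images", None) or [])


if __name__ == "__main__":
    # Stand-in mimicking a fully filtered response; not a real API object.
    blocked = SimpleNamespace(generated_images=None)
    images = images_or_empty(blocked)
    if not images:
        print("No images returned; the request may have been filtered.")
```

With this in place, `for generated_image in images_or_empty(response):` runs zero times instead of failing.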

Writing Good Prompts for Imagen 3

The official documentation provides a comprehensive prompt guide for Imagen 3, so I won't repeat it here. Here are the key ideas on how to make a good prompt:

Prompt writing basics:

Subject: Focus on the main object, person, or scene you want to depict.
Context and background: Describe the setting or environment where the subject is placed.
Style: Specify the artistic or photographic style you want (e.g., sketch, painting, photograph).

Use descriptive language: Employ detailed adjectives and context to clarify the desired output.
Reference specific styles: Use well-known artists or art movements to guide the aesthetic.
Text in images: Limit text to 25 characters or less and use distinct phrases to provide additional information.
Prompt parameterization: If we need to generate multiple images in the same style, it's a good idea to create a reusable prompt template and only provide as input the part that changes.
Photography: Specify camera settings, lens types, and lighting to influence the result.
Art: Use descriptors like "a painting of..." or specific techniques like "watercolor painting of...".
Image Quality Modifiers: Use keywords like "high-quality" or "4K" to improve the output's quality.
Photorealistic Images: Include technical details such as lens type and focal lengths to enhance realism.
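The prompt parameterization idea from the list above can be sketched with the standard library's string.Template. The template text and the `build_prompts` helper are made up for illustration: the style portion stays fixed and only the subject changes.

```python
from string import Template

# Hypothetical template: the style is fixed, only the business type varies.
LOGO_PROMPT = Template(
    "A minimalist logo for a $business, flat vector style, white background"
)


def build_prompts(businesses):
    """Return one prompt per business, all sharing the same style."""
    return [LOGO_PROMPT.substitute(business=b) for b in businesses]


if __name__ == "__main__":
    for prompt in build_prompts(["bakery", "bike repair shop"]):
        print(prompt)
```

Each resulting string can be passed as the prompt argument of generate_images(), which keeps a series of images visually consistent.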

Image Editing and Customization

Imagen 3 also has the capability of editing images and performing image customization. Unfortunately, these features are still locked and only accessible to specific users.

For example, the customization feature allows us to customize a reference image from a prompt. In the example, a photo of a woman is given, and the prompt modifies that image to one of the same person holding oranges.

More information on these and the access request form can be found on their website.

Conclusion

In this tutorial, we learned how to use Imagen 3 to create images using Python and Google's Generative AI API. I’m pleased with the output I got while experimenting with the API. I feel the results are high quality and contain fewer AI artifacts, making it harder to distinguish them from real images.

The ability to work with text in images is useful for branding and marketing. Overall, I feel this model performs well. I just wish all of its features were open because I think that image editing is even more useful than image generation.

Author

François Aubry

Full-stack engineer & founder at CheapGPT. Teaching has always been my passion. From my early days as a student, I eagerly sought out opportunities to tutor and assist other students. This passion led me to pursue a PhD, where I also served as a teaching assistant to support my academic endeavors. During those years, I found immense fulfillment in the traditional classroom setting, fostering connections and facilitating learning. However, with the advent of online learning platforms, I recognized the transformative potential of digital education. In fact, I was actively involved in the development of one such platform at our university. I am deeply committed to integrating traditional teaching principles with innovative digital methodologies. My passion is to create courses that are not only engaging and informative but also accessible to learners in this digital age.

