Imagen 3 arrives in the Gemini API

narcolepticnerd March 16, 2025

Developers can now access Imagen 3, Google’s state-of-the-art image generation model, through the Gemini API. The model is initially available to paid users, with a rollout to the free tier coming soon.

Imagen 3 excels at producing visually appealing, artifact-free images in a wide variety of styles, from hyperrealistic images and impressionistic landscapes to abstract compositions and anime characters. Improved prompt following makes it easy to convert great ideas into high-quality images. Overall, Imagen 3 achieves state-of-the-art performance on a variety of benchmarks. It does so while priced at $0.03 per image on the Gemini API, with control over aspect ratios, the number of options to generate, and more.
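As a rough illustration of the pricing, the sketch below estimates spend from the $0.03 per-image price quoted above. The function name `estimate_cost` is hypothetical, and the sketch assumes that a request asking for several options is billed once per generated image; check the current pricing page before relying on it.

```python
from decimal import Decimal

# Per-image price quoted in this post; Decimal avoids float rounding on currency.
PRICE_PER_IMAGE_USD = Decimal("0.03")


def estimate_cost(requests: int, images_per_request: int = 1) -> Decimal:
    """Estimate spend in USD, assuming every generated image is billed once."""
    if requests < 0 or images_per_request < 1:
        raise ValueError("requests must be >= 0 and images_per_request >= 1")
    return PRICE_PER_IMAGE_USD * requests * images_per_request


# 250 requests with 4 options each is 1,000 images.
print(estimate_cost(250, 4))  # prints 30.00
```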

To help combat misinformation and misattribution, all images generated by Imagen 3 include a non-visible digital SynthID watermark, identifying them as AI-generated.

See Imagen 3 in Action

The gallery below highlights Imagen 3’s capabilities across a range of styles.

Prompt: Group of people looking happy, natural light, 8k

Prompt: Hyperrealistic portrait of a person dressed in 1920s flapper fashion, vintage style, black and white photograph, elegant pose, 8k

Prompt: Imagine a close-up of a vintage watch. Generate a realistic depiction with a detailed mechanism

Prompt: Impressionistic landscape painting of a sunset over a field of sunflowers, vibrant colors, thick brushstrokes, inspired by Monet

Prompt: A surreal dreamscape featuring a giant tortoise with a lush forest growing on its back, floating through a starry sky, glowing mushrooms, bioluminescent plants, ethereal atmosphere

Prompt: Lifestyle image of freshly roasted coffee beans spilling out of a burlap sack onto a rustic wooden table, steam rising from a nearby cup of coffee, ‘Awaken Your Senses’ is written on the cup in cursive, warm and inviting atmosphere, morning sunlight, product photography

Prompt: Hyperrealistic portrait of a woman with piercing blue eyes, laughing, freckles, dramatic lighting, detailed skin texture, 8k

Prompt: A panoramic view of a majestic mountain range at dawn.

Prompt: Show a scene from a game where the player needs to find a specific object by looking into drawers in a messy desk.

Prompt: A cityscape painted in the style of Van Gogh, with swirling brushstrokes and vibrant colors.

Get Started with Imagen 3 in the Gemini API

This Python code snippet demonstrates how to generate an image with Imagen 3 using the Gemini API.

from io import BytesIO

from google import genai
from google.genai import types
from PIL import Image

# Replace the placeholder string with your own Gemini API key.
client = genai.Client(api_key='GEMINI_API_KEY')

# Request a single image from the Imagen 3 model.
response = client.models.generate_images(
    model='imagen-3.0-generate-002',
    prompt='a portrait of a sheepadoodle wearing cape',
    config=types.GenerateImagesConfig(
        number_of_images=1,
    )
)

# Each result carries raw image bytes; decode them with Pillow and display.
for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()
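Building on the snippet above, the following sketch requests several options at a widescreen aspect ratio and writes them to disk instead of displaying them. The helper names (`guess_extension`, `save_images`, `generate_and_save`) and the output directory are hypothetical, and the `"16:9"` value for `aspect_ratio` is assumed to be accepted by the model; consult the developer docs for the supported values.

```python
from pathlib import Path


def guess_extension(data: bytes) -> str:
    """Pick a file extension from the image's leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    return ".bin"  # unknown format: keep the bytes for later inspection


def save_images(images: list[bytes], out_dir: str, stem: str = "imagen") -> list[Path]:
    """Write each image's raw bytes into out_dir and return the file paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, data in enumerate(images):
        path = target / f"{stem}_{index}{guess_extension(data)}"
        path.write_bytes(data)
        paths.append(path)
    return paths


def generate_and_save(prompt: str, api_key: str, out_dir: str = "out") -> list[Path]:
    """Request several candidate images in one call and save all of them."""
    # Imported lazily so the helpers above work without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    response = client.models.generate_images(
        model="imagen-3.0-generate-002",
        prompt=prompt,
        config=types.GenerateImagesConfig(
            number_of_images=4,   # several options to choose from
            aspect_ratio="16:9",  # assumed to be a supported value
        ),
    )
    raw = [generated.image.image_bytes for generated in response.generated_images]
    return save_images(raw, out_dir)
```

Calling `generate_and_save("a panoramic view of a mountain range at dawn", api_key="...")` with a real key would then return the paths of the saved files. Keeping the file-writing helpers separate from the API call means they can be exercised without network access.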

You can explore more prompting advice and image styles in the Gemini API developer docs, with further details available on scores, methodology, and performance improvement in Appendix D of our updated technical report.

We are excited to take this first step in expanding the availability of our generative media models in the Gemini API, and we plan to make more models available in the near future so that developers can combine generative media and language models.
