{"id":195793,"date":"2025-03-15T11:33:03","date_gmt":"2025-03-15T16:33:03","guid":{"rendered":"https:\/\/narcolepticnerd.com\/2025\/03\/15\/introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks\/"},"modified":"2025-03-15T11:33:03","modified_gmt":"2025-03-15T16:33:03","slug":"introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks","status":"publish","type":"post","link":"https:\/\/narcolepticnerd.com\/2025\/03\/15\/introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks\/","title":{"rendered":"Introducing PaliGemma 2 mix: A vision-language model for multiple tasks"},"content":{"rendered":"
This past December, we launched PaliGemma 2<\/a>, an upgraded vision-language model in the Gemma<\/a> family. The release included pretrained checkpoints in three sizes (3B, 10B, and 28B parameters) that can be easily fine-tuned for high performance on a wide range of vision-language tasks and domains, such as image segmentation, short video captioning, scientific question answering, and text-related tasks.<\/p>\n

Now, we\u2019re thrilled to announce the launch of PaliGemma 2 mix checkpoints. PaliGemma 2 mix models are tuned to a mixture of tasks, so you can directly explore their capabilities and use them out of the box for common use cases.<\/p>\n

What\u2019s new in PaliGemma 2 mix?<\/b><\/h2>\n