{"id":195793,"date":"2025-03-15T11:33:03","date_gmt":"2025-03-15T16:33:03","guid":{"rendered":"https:\/\/narcolepticnerd.com\/2025\/03\/15\/introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks\/"},"modified":"2025-03-15T11:33:03","modified_gmt":"2025-03-15T16:33:03","slug":"introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks","status":"publish","type":"post","link":"https:\/\/narcolepticnerd.com\/2025\/03\/15\/introducing-paligemma-2-mix-a-vision-language-model-for-multiple-tasks\/","title":{"rendered":"Introducing PaliGemma 2 mix: A vision-language model for multiple tasks"},"content":{"rendered":"
<\/p>\n
This past December, we launched PaliGemma 2<\/a>, an upgraded vision-language model in the Gemma<\/a> family. The release included pretrained checkpoints of different sizes (3B, 10B, and 28B parameters) that can be easily fine-tuned to high performance on a wide range of vision-language tasks and domains, such as image segmentation, short video captioning, scientific question answering, and text-related tasks.<\/p>\n Now, we\u2019re thrilled to announce the launch of the PaliGemma 2 mix checkpoints. These are models tuned on a mixture of tasks, which lets you explore the model\u2019s capabilities directly and use it out of the box for common use cases.<\/p>\n If you were already using the original PaliGemma mix checkpoints, you can upgrade directly to PaliGemma 2 without making any changes. The model performs different tasks depending on how it\u2019s prompted. You can review the prompt syntax for each task in the official documentation<\/a> and learn more about how PaliGemma 2 was developed in our technical report<\/a>.<\/p>\n Result:<\/b> Optical Character Recognition (OCR)<\/b><\/p>\n<\/div>\n Result:<\/b> Result:<\/b><\/p>\n Ready to discover the potential of PaliGemma 2? Here is how you can explore the mix model\u2019s capabilities:<\/p>\n While PaliGemma 2 mix performs strongly across multiple tasks, you will get the best results by fine-tuning PaliGemma 2 on your own task or domain. To learn how, dive into our comprehensive documentation<\/a>, check our official example notebooks for Keras and JAX<\/a>, or use the Hugging Face transformers example<\/a>. We\u2019re looking forward to seeing what you build with it!<\/p>\n<\/div>\nWhat\u2019s new in PaliGemma 2 mix?<\/b><\/h2>\n
\n
\n
Detection<\/b><\/h3>\n\n
a cow standing on a beach next to a sign that says warning dangerous rip current.<\/code><\/p>\n
A cow standing on a beach next to a warning sign.<\/code><\/p>\n<\/div>\n
WARNING DANGEROUS<\/code><\/p>\n
RIP CURRENT<\/code><\/p>\n
Get Started Today<\/b><\/h2>\n\n
\n