fix(models): preserve media blocks in _flatten_ollama_content #5296
Open
saivedant169 wants to merge 1 commit into google:main from
Conversation
_flatten_ollama_content was stripping image_url, video_url, and audio_url blocks when flattening multipart content for ollama_chat. This meant LiteLLM never received the image data, so Ollama's native images field was always empty. The fix checks for media block types before flattening. When any media block is present, the full multipart list is returned so LiteLLM can convert it to Ollama's format. Text-only content is still flattened to a plain string as before. Fixes google#4975
Fixes #4975
Problem
When sending multimodal messages (text + image) through the /run endpoint with an ollama_chat model, the image data is silently dropped. The model responds as if no image was attached, and LiteLLM's debug output shows images: [].

The root cause is in _flatten_ollama_content() (src/google/adk/models/lite_llm.py). It iterates multipart content blocks and keeps only type == "text" entries, joining them into a plain string. Any image_url, video_url, or audio_url blocks are discarded before LiteLLM ever sees them.

LiteLLM's Ollama handler already knows how to convert multipart arrays containing image_url blocks into Ollama's native images field. But it needs the list to do that: once ADK flattens everything to a string, the conversion path is unreachable.

Fix
Before flattening to a string, check whether any block has a media type (image_url, video_url, audio_url). If so, return the original list instead of flattening. Text-only content is still flattened to a plain string for compatibility.

The return type of _flatten_ollama_content changes from str | None to OpenAIMessageContent | str | None to reflect that it may now return the list unchanged.

Testing plan
Unit tests (all pass):
- test_flatten_ollama_content_preserves_image_url_blocks: image-only content is returned as a list
- test_flatten_ollama_content_preserves_mixed_text_and_image: text + image returns the full list
- test_flatten_ollama_content_preserves_video_url_blocks: video_url blocks are also preserved
- test_flatten_ollama_content_serializes_non_media_non_text_blocks_to_json: unknown block types still serialize to JSON
- test_generate_content_async_ollama_chat_preserves_multimodal_content: integration test confirms multimodal content reaches LiteLLM as a list
- test_generate_content_async_custom_provider_preserves_multimodal: same for the custom_llm_provider path