Request for new functionality
Description
Problem you have encountered: When using the Vertex AI Python SDK for Gemini, it is possible to specify response_logprobs (a boolean indicating whether to return logprobs) and logprobs (the number of top logprobs to return per generated token, in [0, 5]). With gemini-1.5-flash this works as expected, but gemini-2.0-flash returns logprobs for only the first few tokens and then stops.
What you expected to happen: I expect gemini-2.0-flash to return one chosen_logprobs and one top_logprobs entry per generated token, matching the length of the generated response.
Steps to reproduce:
If I replace gemini-2.0-flash with gemini-1.5-flash, I instead get logprobs for the full response.
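A minimal reproduction along the lines described above might look like the following sketch. The model names and parameter values come from this report; the SDK surface (GenerativeModel, GenerationConfig with response_logprobs/logprobs, and the candidate's logprobs_result fields) is assumed from the google-cloud-aiplatform vertexai.generative_models module, and the prompt, helper function, and RUN_VERTEX_REPRO guard are hypothetical additions, not part of the original report:

```python
import os

# The SDK import is optional here so the counting helper below stays usable
# (and testable) without an authenticated Vertex AI environment.
try:
    from vertexai.generative_models import GenerationConfig, GenerativeModel
    HAVE_SDK = True
except ImportError:
    HAVE_SDK = False


def logprob_entry_counts(candidate):
    """Return (chosen, top) logprob entry counts for a response candidate.

    Only reads candidate.logprobs_result.chosen_candidates and
    .top_candidates, so any object with that shape works.
    """
    lr = candidate.logprobs_result
    return len(lr.chosen_candidates), len(lr.top_candidates)


def run_repro(model_name="gemini-2.0-flash"):
    # Requires an authenticated Vertex AI project; not runnable offline.
    model = GenerativeModel(model_name)
    config = GenerationConfig(
        response_logprobs=True,  # ask the API to return logprobs
        logprobs=5,              # top alternatives per token, valid range [0, 5]
    )
    response = model.generate_content(
        "Tell me a short story.",  # hypothetical prompt
        generation_config=config,
    )
    chosen, top = logprob_entry_counts(response.candidates[0])
    # Expected: one entry per generated token. Per the report, with
    # gemini-2.0-flash these counts cover only the first few tokens.
    print(f"chosen logprob entries: {chosen}, top logprob entries: {top}")


if __name__ == "__main__" and HAVE_SDK and os.environ.get("RUN_VERTEX_REPRO"):
    run_repro()
```

Comparing the printed counts against the token count of the returned text (for example via model.count_tokens on the response text) should make the truncation visible when switching between the two model names.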
Other information (workarounds you have tried, documentation consulted, etc.): I'm using the gRPC transport on Python 3.10. Here are some library versions: