WAI
Status Update
Comments
na...@google.com <na...@google.com> #2
Bernd, the prompt above doesn't make the test name clear, but the 1.5 model produces a function call (http://sherlog/_4a5ABHWPYs), whereas the 2.0 model asks for clarification (http://sherlog/_3oICmIdJoZ). Is this WAI?
ba...@google.com <ba...@google.com> #3
The prompt "Run a test on the cobalt project" doesn't include a test name, since cobalt is clearly labeled as the project name. The clarifying question is a good answer.
Tested the following alternative prompts:
- Run test cobalt
- Run the cobalt test
- Run the test cobalt
- Run the test named cobalt
- Run the test "cobalt"
- Run the test named "cobalt"
For all of them, the model makes a function call as expected.
Working as intended.
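For anyone who wants to repeat this check, a minimal sketch of a prompt-variant harness follows. The `Reply` shape, the `Invoke` callback, and both function names are hypothetical stand-ins, not LangChain.js or Gemini API types; the caller would have to supply an `invoke` function that wraps the real model call.

```typescript
// Hypothetical, simplified reply shape (an assumption, not a LangChain.js type).
interface Reply {
  content: string; // free-text answer, e.g. a clarifying question
  toolCalls: { name: string; args: Record<string, unknown> }[];
}

// Hypothetical callback that sends one prompt to a model and returns its reply.
type Invoke = (prompt: string) => Promise<Reply>;

// The alternative prompts listed in the comment above.
const prompts: string[] = [
  "Run test cobalt",
  "Run the cobalt test",
  "Run the test cobalt",
  "Run the test named cobalt",
  'Run the test "cobalt"',
  'Run the test named "cobalt"',
];

// True when the reply contains at least one function call.
function hasFunctionCall(reply: Reply): boolean {
  return reply.toolCalls.length > 0;
}

// Returns the prompts whose reply did NOT contain a function call.
async function promptsWithoutFunctionCall(
  invoke: Invoke,
  candidates: string[],
): Promise<string[]> {
  const missing: string[] = [];
  for (const prompt of candidates) {
    const reply = await invoke(prompt);
    if (!hasFunctionCall(reply)) {
      missing.push(prompt);
    }
  }
  return missing;
}
```

An empty result from `promptsWithoutFunctionCall(invoke, prompts)` would correspond to the outcome reported above, where every variant produced a function call.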
Description
Problem you have encountered:
When going through the regression tests for LangChain.js, running the function calling test for Gemini 2.0 would consistently fail while the identical test passes with the Gemini 1.5 models.
Here is the code. If necessary, I can check exactly what REST request is being sent:
What you expected to happen:
The Gemini 1.5 models properly respond with function calling information with the parameter "testName" set to "cobalt". (As verified by the last line in the test.)
In general, I expected relatively vague phrases or human-like references to be handled well. What is most surprising is that these work well in the 1.5 models, but less well in 2.0.
What happened:
The test fails because the response contains text content saying "I need the name of the test that you want to run." instead of a function call. Other variants of the prompt (such as "Run the cobalt test") gave similar results.
It only works once I give it a very specific prompt ("Run a test named cobalt").
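To make the two outcomes concrete, here is a rough sketch of the check the test performs. It is not the actual test code: the `ToolCall` and `ModelReply` shapes, the tool name "test", and the `extractTestName` helper are all assumptions made for illustration, and the real LangChain.js message types are richer.

```typescript
// Hypothetical, simplified shapes (assumptions, not LangChain.js types).
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ModelReply {
  content: string; // free-text content returned by the model
  toolCalls: ToolCall[]; // function calls requested by the model
}

// Returns the "testName" argument of the first function call,
// or undefined when the model replied with text only.
function extractTestName(reply: ModelReply): string | undefined {
  const call = reply.toolCalls[0];
  const value = call?.args["testName"];
  return typeof value === "string" ? value : undefined;
}

// Outcome described for the 1.5 models: a function call with testName "cobalt".
// The tool name "test" is a placeholder.
const reply15: ModelReply = {
  content: "",
  toolCalls: [{ name: "test", args: { testName: "cobalt" } }],
};

// Outcome described for 2.0: a clarifying question and no function call.
const reply20: ModelReply = {
  content: "I need the name of the test that you want to run.",
  toolCalls: [],
};

console.log(extractTestName(reply15)); // "cobalt"
console.log(extractTestName(reply20)); // undefined
```

A test that asserts `extractTestName(reply) === "cobalt"` passes for the first shape and fails for the second, which matches the failure described here.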
Other information (workarounds you have tried, documentation consulted, etc):