Assigned
Comments
xq...@google.com <xq...@google.com> #2
This feature request has been forwarded to the Data Fusion engineering team so that they may evaluate it. Note that there are no ETAs or guarantees of implementation for feature requests. All communication regarding this feature request is to be done here.
Description
Please describe your requested enhancement. Good feature requests will solve common problems or enable new use cases.
What you would like to accomplish:
The customer's Dataflow job uses the BigQuery Storage Write API (STORAGE_WRITE_API) to write from Pub/Sub to BigQuery. Schema updates are not being registered, and the job has to be restarted to fetch the most recent BigQuery schema.
How this might work:
When using the BigQuery Storage Write API (STORAGE_WRITE_API) to write from
Pub/Sub to BigQuery with .withAutoSchemaUpdate(true), schema updates are not
registered, and the job has to be restarted to fetch the most recent BigQuery
schema. The job uses dynamic destinations to write to BigQuery. I fetch the
correct schemas as a side input every minute and pass them to the Storage
Write API dynamic destinations. Even after more than an hour, writes still
reject messages and mark them as schema mismatches.
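For readers unfamiliar with this setup, the configuration described above might look roughly like the following sketch in the Beam Java SDK. It is not the customer's code. The class names, the "_table" routing field, and the choice to carry schemas in the side input as JSON strings keyed by table spec are all assumptions made for illustration. Streaming-specific settings such as triggering frequency are left out.

```java
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.values.PCollectionView;
import org.apache.beam.sdk.values.ValueInSingleWindow;

public class SchemaSideInputWrite {

  /**
   * Hypothetical destinations class. Assumes each row names its target table
   * in a "_table" field and that the side input maps table spec to a JSON schema.
   */
  static class SchemaFromSideInput extends DynamicDestinations<TableRow, String> {
    private final PCollectionView<Map<String, String>> schemasView;

    SchemaFromSideInput(PCollectionView<Map<String, String>> schemasView) {
      this.schemasView = schemasView;
    }

    @Override
    public List<PCollectionView<?>> getSideInputs() {
      // Declares the side input so that sideInput(...) can be used in getSchema.
      return Collections.singletonList(schemasView);
    }

    @Override
    public String getDestination(ValueInSingleWindow<TableRow> element) {
      // Assumption: the table spec travels with the row in a "_table" field.
      return (String) element.getValue().get("_table");
    }

    @Override
    public TableDestination getTable(String tableSpec) {
      return new TableDestination(tableSpec, null);
    }

    @Override
    public TableSchema getSchema(String tableSpec) {
      // Look up the periodically refreshed schema for this destination.
      String json = sideInput(schemasView).get(tableSpec);
      return BigQueryHelpers.fromJsonString(json, TableSchema.class);
    }
  }

  /** Builds the sink with the options named in the report. */
  static BigQueryIO.Write<TableRow> configure(
      PCollectionView<Map<String, String>> schemasView) {
    return BigQueryIO.writeTableRows()
        .to(new SchemaFromSideInput(schemasView))
        .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
        .withAutoSchemaUpdate(true)
        // As far as I know, BigQueryIO expects this alongside auto schema update.
        .ignoreUnknownValues()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND);
  }
}
```

The returned transform would be applied to the PCollection of TableRow values read from Pub/Sub. With this shape, the report's expectation is that a schema change picked up by the side input (or by auto schema update) is honored without restarting the job.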
If applicable, reasons why alternative solutions are not sufficient:
The product team found the root cause and worked out a solution; you can also track it here:
Please note that if you need to use Beam 2.62.0, the ETA is January 2025.
Other information (workarounds you have tried, documentation consulted, etc):