Status Update
Comments
me...@appinventiv.com <me...@appinventiv.com> #2
Hello,
I understand your issue is that the upload using the Python libraries to GS buckets and loading a table to BigQuery is comparatively slower than the upload using the commands gsutil and bq. Please let me know if I have misunderstood.
Let me clarify that the comparison should be done this way: Python's blob.upload_from_file() vs. the gsutil command, and Python's load_table_from_file() vs. the bq command. Once that is clear, I would like to ask you for the code that you use so I can reproduce the situation myself and get further insights. Please remove all personal information from your code before sharing it.
I will wait for your response,
Manuel Alaman
Google Cloud Big Data Support Barcelona
ni...@google.com <ni...@google.com> #3
You are correct.
The attached Python script will generate a test CSV file and run the Python client test. Please find and replace all occurrences of the `UPDATE_THIS` text.
It also has the DDL query you'll need to create the BQ table before you run the script.
Additionally, it has the exact bq command you'll need to test the bq CLI utility against the same file.
I just tested again after creating this, using Python 3.6.9, google-cloud-bigquery 2.20.0, and BigQuery CLI 2.0.69 (most recent versions). I still see the same performance difference (~4 MBps upload from the Python client vs. ~70 MBps upload for the same file to the same table using the BigQuery CLI).
Let me know if you need anything else.
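For anyone who wants to reproduce the throughput numbers without the attachment, a minimal timing sketch follows. It is not the attached script: the helper names `measure_throughput`, `fake_upload`, and `demo` are hypothetical, and the stand-in uploader only reads the file locally. The commented lines indicate where the real client calls would go, assuming a configured client and an existing table.

```python
import os
import tempfile
import time


def measure_throughput(upload_fn, path):
    """Time upload_fn(file_obj) on the file at `path` and return MB/s."""
    size_mb = os.path.getsize(path) / 1e6
    with open(path, "rb") as fh:
        start = time.perf_counter()
        upload_fn(fh)
        elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return size_mb / max(elapsed, 1e-9)


def fake_upload(file_obj):
    """Stand-in uploader (assumption): drains the file in 1 MiB chunks."""
    while file_obj.read(1 << 20):
        pass


def demo():
    """Run the measurement against a small temporary CSV file."""
    with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
        for i in range(100_000):
            tmp.write(f"{i},value_{i}\n")
        path = tmp.name
    try:
        return measure_throughput(fake_upload, path)
    finally:
        os.remove(path)


# With the real libraries, the uploader would be swapped in roughly like this
# (table ID and bucket objects are placeholders, not values from this thread):
#
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   measure_throughput(
#       lambda fh: client.load_table_from_file(fh, "project.dataset.table"),
#       "test.csv",
#   )
#
#   measure_throughput(lambda fh: blob.upload_from_file(fh), "test.csv")

if __name__ == "__main__":
    print(f"{demo():.1f} MB/s")
```

Timing only the `load_table_from_file()` call, without waiting on the returned job, keeps the figure comparable to the point at which bq prints "Upload complete".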
er...@google.com <er...@google.com> #4
Hey there, any update on this?
er...@google.com <er...@google.com> #5
Hi Kevin,
We are still investigating the issue. So far, we obtained the output in [1] for the script and the output in [2] for the bq command; with bq, “Upload complete” was reached in about 11 seconds.
Further updates will be published here.
[1]
2021-06-30 06:55:01,496 root test_uploads INFO: Beginning load job...
2021-06-30 06:57:08,662 root test_uploads INFO: Job ID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
2021-06-30 06:57:08,662 root test_uploads INFO: BQ load job complete without error!
[2]
Upload complete.
Waiting on bqjob_XXXXXXXXXXXXXXXXX_XXXXXXXXXXXXXXXX_X ... (48s) Current status: DONE
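For comparison with the 11-second figure, the elapsed time of the script run in [1] can be read off its first two log timestamps. A small sketch of that arithmetic (the helper name `elapsed_seconds` is illustrative only):

```python
from datetime import datetime

# Timestamp layout used by the log lines in [1], e.g. "2021-06-30 06:55:01,496".
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return delta.total_seconds()


# "Beginning load job..." to "Job ID: ..." in [1]
print(elapsed_seconds("2021-06-30 06:55:01,496", "2021-06-30 06:57:08,662"))
# prints 127.166
```

That is roughly 127 seconds for the Python client, which includes both the upload and the wait for the load job to finish.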
me...@appinventiv.com <me...@appinventiv.com> #6
Hi there, has there been any progress on this? Should I move this over to an issue at https://github.com/googleapis/google-cloud-python?