Status Update
Comments
hj...@google.com <hj...@google.com> #2
Hi Josh, friendly ping for an update on progress
ad...@extendcomm.com <ad...@extendcomm.com> #3
I think the TL;DR is that this is stuck behind the investigation of the current lab outages, which are more urgent.
pending CL is
hj...@google.com <hj...@google.com> #4
ad...@extendcomm.com <ad...@extendcomm.com> #5
hj...@google.com <hj...@google.com> #6
Hmm, gerrit-watcher didn't post the CLs on this bug, so I guess I'll do so manually:
Throughput numbers from a device:
(MB/s)   none  brotli  lz4
USB 3.0   120     110  190
USB 2.0    38      75   63
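As a rough reading of the table above, the relative speedup of each codec over the uncompressed transfer can be worked out as follows. This is only a sketch: it assumes the figures are in MB/s (consistent with the 110 MB/s brotli number quoted in this comment), and the dictionary layout and the `speedups` helper are illustrative names, not part of adb.

```python
# Throughput figures copied from the table above (assumed to be MB/s).
THROUGHPUT = {
    "USB 3.0": {"none": 120, "brotli": 110, "lz4": 190},
    "USB 2.0": {"none": 38, "brotli": 75, "lz4": 63},
}


def speedups(row):
    """Return each codec's throughput relative to the uncompressed baseline."""
    baseline = row["none"]
    return {
        codec: round(mbps / baseline, 2)
        for codec, mbps in row.items()
        if codec != "none"
    }


for link, row in THROUGHPUT.items():
    print(link, speedups(row))
```

On these numbers lz4 is faster than no compression on both links, while brotli only pays off on the slower USB 2.0 link.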
I'm seeing identical throughput with brotli quality 0 and 1 (110 MB/s end to end), although there's some giant low-hanging fruit in compressing multiple files at once.
zstd isn't in the tree yet (I started the ball rolling on that, but no idea when it'll end up getting merged), but I'll take a look at it eventually, probably when I get around to implementing compression for generic streams in adb (
This is probably good enough to call this fixed for now.
se...@google.com <se...@google.com> #8
I'm closing the case for now; feel free to create another one if you want to move forward.
ad...@extendcomm.com <ad...@extendcomm.com> #9
It seems gsutil uses the transfer.py under the platform folder, not the one under the lib folder: the errors still occur otherwise, but they stop once I apply the patch to the transfer.py under the platform folder.
Description
Problem you have encountered:
When uploading an email (.eml) file containing non-ASCII characters in its content, gsutil auto-detects the Content-Type as message/rfc822 but then fails to encode the file contents:
Copying file://test.eml [Content-Type=message/rfc822]...
'ascii' codec can't encode character '\udca0' in position 2281: ordinal not in range(128)
CommandException: 1 file/object could not be transferred.
What you expected to happen:
The file should upload regardless of the encoding of its contents.
Steps to reproduce:
gsutil cp test.eml gs://<bucket>
Copying file://test.eml [Content-Type=message/rfc822]...
Traceback (most recent call last):
  File "/usr/src/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/src/google-cloud-sdk/platform/gsutil/gsutil.py", line 122, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 444, in main
    user_project=user_project)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 780, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 639, in _RunNamedCommandAndHandleExceptions
    user_project=user_project)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1121, in RunCommand
    seek_ahead_iterator=seek_ahead_iterator)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1525, in Apply
    arg_checker, should_return_results, fail_on_error)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1596, in _SequentialApply
    worker_thread.PerformTask(task, self)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2316, in PerformTask
    results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 706, in _CopyFuncWrapper
    preserve_posix=cls.preserve_posix_attrs)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 921, in CopyFunc
    preserve_posix=preserve_posix)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3948, in PerformCopy
    gzip_encoded=gzip_encoded)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2239, in _UploadFileToObject
    parallel_composite_upload, logger)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2055, in _DelegateUploadFileToObject
    elapsed_time, uploaded_object = upload_delegate()
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2216, in CallNonResumableUpload
    gzip_encoded=gzip_encoded_file)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 1751, in _UploadFileToObjectNonResumable
    gzip_encoded=gzip_encoded)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 388, in UploadObject
    gzip_encoded=gzip_encoded)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1712, in UploadObject
    gzip_encoded=gzip_encoded)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1534, in _UploadObject
    global_params=global_params)
  File "/usr/src/google-cloud-sdk/platform/gsutil/gslib/third_party/storage_apitools/storage_v1_client.py", line 1182, in Insert
    upload=upload, upload_config=upload_config)
  File "/usr/src/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 703, in _RunMethod
    download)
  File "/usr/src/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 679, in PrepareHttpRequest
    upload.ConfigureRequest(upload_config, http_request, url_builder)
  File "/usr/src/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 763, in ConfigureRequest
    self.__ConfigureMultipartRequest(http_request)
  File "/usr/src/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 823, in __ConfigureMultipartRequest
    g.flatten(msg_root, unixfrom=False)
  File "/usr/lib64/python3.6/email/generator.py", line 116, in flatten
    self._write(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 272, in _handle_multipart
    g.flatten(part, unixfrom=False, linesep=self._NL)
  File "/usr/lib64/python3.6/email/generator.py", line 116, in flatten
    self._write(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/lib64/python3.6/email/generator.py", line 361, in _handle_message
    payload = self._encode(payload)
  File "/usr/lib64/python3.6/email/generator.py", line 412, in _encode
    return s.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\udca0' in position 2281: ordinal not in range(128)
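The last frame of the traceback can be reproduced in isolation. The sketch below assumes that the 0xA0 byte in the .eml file was decoded with Python's `surrogateescape` error handler somewhere upstream (the report does not confirm where), which is the usual way a lone surrogate such as '\udca0' ends up in a string. Plain ASCII encoding then fails just as `s.encode('ascii')` does in `email/generator.py`. The sample bytes are made up for illustration.

```python
# A non-ASCII byte (0xA0) embedded in otherwise ASCII message data.
raw = b"Subject: test\n\nhello\xa0world\n"

# Assumption: the bytes were decoded with surrogateescape, which maps
# each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("ascii", errors="surrogateescape")
assert "\udca0" in text

# Plain ASCII encoding fails, as in the last frame of the traceback.
error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = str(exc)
    print(error)

# Encoding with the same error handler restores the original bytes.
assert text.encode("ascii", errors="surrogateescape") == raw
```

This only illustrates the failure mode; it is not the patch to transfer.py mentioned in the comments.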
Other information (workarounds you have tried, documentation consulted, etc):
It works if you add -h Content-Type:application/octet-stream to the gsutil command. I don't want to do that, as I'm uploading multiple folders with thousands of files of different types and this would apply the metadata to all of them.
This was not a problem in gcloud version 293.0.0; it broke sometime between 293 and 325.0.0 and is still broken in 326.0.0.
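One way to limit the Content-Type workaround to the affected files is to build the gsutil command per file and add the -h override only for .eml files. This is a minimal sketch, not a tested recipe: `build_commands`, the `mail_export` directory and the bucket placeholder are hypothetical names, and the only gsutil syntax used is the `-h Content-Type:application/octet-stream` option and `cp` command already quoted in this report.

```python
import os


def build_commands(local_root, bucket_url):
    """Yield one gsutil argv list per file, overriding Content-Type only for .eml."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in sorted(filenames):
            src = os.path.join(dirpath, name)
            # Keep the directory layout under the bucket, using "/" separators.
            rel = os.path.relpath(src, local_root).replace(os.sep, "/")
            cmd = ["gsutil"]
            if name.lower().endswith(".eml"):
                cmd += ["-h", "Content-Type:application/octet-stream"]
            cmd += ["cp", src, f"{bucket_url.rstrip('/')}/{rel}"]
            yield cmd


# Print the commands instead of running them (hypothetical paths).
for cmd in build_commands("mail_export", "gs://<bucket>"):
    print(" ".join(cmd))
```

Running one gsutil process per file would be slow for thousands of files, so in practice the non-.eml files would more likely go through a single bulk copy; the sketch is only meant to show how the override can be scoped.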