WAI
Status Update
Comments
gs...@google.com <gs...@google.com> #2
fi...@fexco.com <fi...@fexco.com> #3
Hi, this error only affects Cloud SDK version 186. It was previously reported in Issue 72407295 and a fix for it should be released in Cloud SDK version 187.
In the meantime, you can downgrade to Cloud SDK version 185 as a workaround by running the following command:
gcloud components update --version 185.0.0
gs...@google.com <gs...@google.com> #4
What is the real-life - naturally arising out of necessity - use-case requiring two identical exports at the same time to the same bucket? You mention you wanted to see what needs to be done to protect against two gcloud.beta.firestore.export operations overwriting each other's output, in case the export job got run twice at the same time. How do you imagine these two operations being generated? Exports are initiated by a user, who may choose to export simultaneously to the same bucket, or not.
Have you checked whether either of the two identical exports, run at the same time to the same bucket, succeeded? So far, our information is limited to "probably producing a corrupt export".
fi...@fexco.com <fi...@fexco.com> #5
> What is the real-life - naturally arising out of necessity - use-case requiring two identical exports at the same time to the same bucket?
There is none. We currently trigger exports from a gke cron job and I wanted to make sure that nothing weird happens if somebody triggered an export manually at the same time.
Because of the behavior shown in #1 I went with generating unique export paths to avoid this risk.
> Have you checked if any of the two identical exports at same time to the same bucket succeeded?
They did (as pasted in #1), I have not examined the resulting export for correctness because of the workaround with unique paths.
I have filed this bug to let you know that there is a potential race condition around checking the existence of the output path and perhaps you care - this is currently not an issue for us, I have worked around it.
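For anyone wanting the same workaround, a unique export path per run can be sketched roughly as below. The helper name unique_export_path and the gs://bucket/exports layout are assumptions made for illustration, not something gcloud provides; the random suffix makes a collision between two runs in the same second very unlikely rather than impossible.

```shell
#!/usr/bin/env bash
# Sketch only: unique_export_path and the gs://bucket/exports layout are
# hypothetical, not part of gcloud.
unique_export_path() {
  local base="${1%/}"   # strip one trailing slash, if any
  local stamp suffix
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  # 4 random bytes as 8 hex characters, so two runs in the same second
  # are very unlikely to pick the same prefix.
  suffix="$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s/%s-%s\n' "$base" "$stamp" "$suffix"
}

# Example invocation (commented out so the sketch has no side effects):
# gcloud beta firestore export "$(unique_export_path gs://bucket/exports)" --project project
```

With a distinct prefix per run, the cron job and a manual export never target the same output path, so the "path already exists" check is no longer relied upon.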
gs...@google.com <gs...@google.com> #6
Glad to read you have found a workaround, for now. Thank you for pointing out this possible source of errors, we are most grateful. I'll keep on investigating the issue, to determine if a race condition exists, and keep you informed.
fi...@fexco.com <fi...@fexco.com> #7
Differences of the output file listings and sizes for single and concurrent operation:
============================================
$ gcloud beta firestore export gs://bucket/single --project project
Waiting for [projects/project/databases/(default)/operations/ASA3MzAwMzI0MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type': type.googleapis.com/google.firestore.admin.v1beta1.ExportDocumentsMetadata
operationState: PROCESSING
outputUriPrefix: gs://bucket/single
startTime: '2019-09-19T13:15:15.497590Z'
name: projects/project/databases/(default)/operations/ASA3MzAwMzI0MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
$ gcloud beta firestore export gs://bucket/concurrent --project project & gcloud beta firestore export gs://bucket/concurrent --project project
[2] 5472
Waiting for [projects/project/databases/(default)/operations/ASA3MzAwMDMyMDMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type': type.googleapis.com/google.firestore.admin.v1beta1.ExportDocumentsMetadata
operationState: PROCESSING
outputUriPrefix: gs://bucket/concurrent
startTime: '2019-09-19T13:17:02.150034Z'
name: projects/project/databases/(default)/operations/ASA3MzAwMDMyMDMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
Waiting for [projects/project/databases/(default)/operations/ASA1MzAwOTIyMDMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type': type.googleapis.com/google.firestore.admin.v1beta1.ExportDocumentsMetadata
operationState: PROCESSING
outputUriPrefix: gs://bucket/concurrent
startTime: '2019-09-19T13:17:02.078917Z'
name: projects/project/databases/(default)/operations/ASA1MzAwOTIyMDMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
[2]- Done gcloud beta firestore export gs://bucket/concurrent --project project
$ gsutil du gs://bucket/single
98 gs://bucket/single/single.overall_export_metadata
139 gs://bucket/single/all_namespaces/all_kinds/all_namespaces_all_kinds.export_metadata
327680 gs://bucket/single/all_namespaces/all_kinds/output-0
360448 gs://bucket/single/all_namespaces/all_kinds/output-1
458752 gs://bucket/single/all_namespaces/all_kinds/output-2
98304 gs://bucket/single/all_namespaces/all_kinds/output-3
589824 gs://bucket/single/all_namespaces/all_kinds/output-4
1048576 gs://bucket/single/all_namespaces/all_kinds/output-5
163840 gs://bucket/single/all_namespaces/all_kinds/output-6
163840 gs://bucket/single/all_namespaces/all_kinds/output-7
3211403 gs://bucket/single/all_namespaces/all_kinds/
3211403 gs://bucket/single/all_namespaces/
3211501 gs://bucket/single/
$ gsutil du gs://bucket/concurrent
98 gs://bucket/concurrent/concurrent.overall_export_metadata
98 gs://bucket/concurrent/concurrent_1.overall_export_metadata
155 gs://bucket/concurrent/all_namespaces/all_kinds/all_namespaces_all_kinds.export_metadata
98304 gs://bucket/concurrent/all_namespaces/all_kinds/output-0
1048576 gs://bucket/concurrent/all_namespaces/all_kinds/output-1
327680 gs://bucket/concurrent/all_namespaces/all_kinds/output-2
458752 gs://bucket/concurrent/all_namespaces/all_kinds/output-3
589824 gs://bucket/concurrent/all_namespaces/all_kinds/output-4
360448 gs://bucket/concurrent/all_namespaces/all_kinds/output-5
163840 gs://bucket/concurrent/all_namespaces/all_kinds/output-6
163840 gs://bucket/concurrent/all_namespaces/all_kinds/output-7
1048576 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-0
589824 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-1
163840 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-2
458752 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-3
360448 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-4
98304 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-5
163840 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-6
327680 gs://bucket/concurrent/all_namespaces/all_kinds/output_1-7
6422683 gs://bucket/concurrent/all_namespaces/all_kinds/
6422683 gs://bucket/concurrent/all_namespaces/
6422879 gs://bucket/concurrent/
$
============================================
There seems to be some awareness of concurrent operations, as the file set for the concurrent operations has some files with an additional _1 in their names. Not all, though: there is only one all_namespaces_all_kinds.export_metadata file, and it refers only to the files with _1:
============================================
$ gsutil cat gs://bucket/concurrent/all_namespaces/all_kinds/all_namespaces_all_kinds.export_metadata | strings
export_entities
__all__
output_1-0
output_1-1
output_1-2
output_1-3
output_1-4
output_1-5
output_1-6
output_1-7
__all__
$
============================================
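The size comparison between the plain and the _1 output files in the listings above can be automated roughly as follows. This is a sketch under assumptions: sizes_for and same_shard_sizes are hypothetical helper names, the input is expected in the "<bytes> <path>" form that gsutil du prints, and sizes are compared as sorted sets because the listing shows the two exports numbering their shards in a different order. Matching sizes are only a hint, not proof, that the contents are identical.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical helpers.

# Print the sorted sizes of listing lines whose path matches an ERE.
# Reads `gsutil du`-style lines ("<bytes> <path>") on stdin.
sizes_for() {
  awk -v pat="$1" '$2 ~ pat { print $1 }' | sort -n
}

# Succeed if the plain and the "_1" shard sets have the same sizes.
same_shard_sizes() {
  local listing="$1" plain suffixed
  plain="$(printf '%s\n' "$listing" | sizes_for '/output-[0-9]+$')"
  suffixed="$(printf '%s\n' "$listing" | sizes_for '/output_1-[0-9]+$')"
  [ -n "$plain" ] && [ "$plain" = "$suffixed" ]
}

# Example (commented out; needs access to the bucket):
# same_shard_sizes "$(gsutil du gs://bucket/concurrent)" && echo "sizes match"
```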
gs...@google.com <gs...@google.com> #8
The issue is reproducible to some extent: repeating exactly the same command at time intervals results in the "path already exists" error, as you noticed as well.
Running two exports in one operation, the way you did with "gcloud beta firestore export gs://bucket/concurrent --project project & gcloud beta firestore export gs://bucket/concurrent --project project", results in two valid, non-corrupted exports. The system marks the files produced by the second part of the command ("& gcloud beta firestore export") with an underscore and a numeric suffix, "_1". The two exports, let's say output-0 and output_1-0, are identical and of the same size. Your concern about corrupted exports is understandable and well-justified, but in practice no corruption occurs.
Description
===================================
$ gcloud beta firestore export gs://bucket/iamtest3 --project project
Waiting for [projects/project/databases/(default)/operations/ASA1MTAwNTI2MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type':
operationState: PROCESSING
outputUriPrefix: gs://bucket/iamtest3
startTime: '2019-09-18T12:33:28.786968Z'
name: projects/project/databases/(default)/operations/ASA1MTAwNTI2MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
$ gcloud beta firestore export gs://bucket/iamtest3 --project project
ERROR: (gcloud.beta.firestore.export) INVALID_ARGUMENT: Path already exists: /bucket/iamtest3/iamtest3.overall_export_metadata
$
===================================
But when I execute the same command at the ~same time, the check does not seem effective, probably producing a corrupt export:
===================================
$ gcloud beta firestore export gs://bucket/contest --project project & gcloud beta firestore export gs://bucket/contest --project project
[1] 66747
Waiting for [projects/project/databases/(default)/operations/ASA3MTAwNTI2MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type':
operationState: PROCESSING
outputUriPrefix: gs://bucket/contest
startTime: '2019-09-18T13:00:56.373063Z'
name: projects/project/databases/(default)/operations/ASA3MTAwNTI2MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
Waiting for [projects/project/databases/(default)/operations/ASA2MTAwNDI0MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS] to finish...done.
metadata:
'@type':
operationState: PROCESSING
outputUriPrefix: gs://bucket/contest
startTime: '2019-09-18T13:00:56.264588Z'
name: projects/project/databases/(default)/operations/ASA2MTAwNDI0MTMJGnRsdWFmZWQHEjJ3LXVlLXNib2otbmltZGEQCigS
[1]+ Done gcloud beta firestore export gs://bucket/contest --project project
$
===================================
I would expect one of the ~concurrent export operations to fail - currently it seems that the "path already exists" check is not effective in such circumstances.
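To illustrate the behavior I would expect, here is a local sketch of an atomic "create only if absent" step, where the second of two callers is always rejected. It is an analogy on a local filesystem under bash, not a description of how the export service is implemented; claim_marker is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: claim_marker is a hypothetical helper on a local filesystem;
# it does not describe the Firestore export service.

# Create the marker file only if it does not exist yet. With noclobber set,
# the redirection fails instead of overwriting, so the existence check and
# the creation are a single step and only one caller can succeed.
claim_marker() {
  ( set -o noclobber; : > "$1" ) 2>/dev/null
}

marker="$(mktemp -d)/iamtest3.overall_export_metadata"
claim_marker "$marker" && echo "first claim succeeded"
claim_marker "$marker" || echo "second claim rejected: path already exists"
```

A separate "does the path exist?" check followed by a later write leaves a window in which two callers can both pass the check, which is what the transcript above appears to show.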