Assigned
Status Update
Comments
va...@google.com <va...@google.com>
ku...@google.com <ku...@google.com> #2
I have filed your feature request with our product team. However, there is no guarantee that this feature will be implemented, and no ETA can be provided. Rest assured that your feedback is always taken into account, as it allows us to improve the platform.
Any future updates to this feature will be posted here.
al...@transitionzero.org <al...@transitionzero.org> #3
Hi,
I understand that the issue is potentially low priority and that numerous workarounds exist, but I would like to share some background info about this.
We have a Cloud SQL Second Gen instance that went nuts one day, consuming all the disk space in the universe until it reached the maximum disk capacity of ~9TB. That issue "self-resolved" (some Cloud magic is not bad...) and we did not see any impact on our queries, nor did any of our database users notice.
This is to some extent good: something that would otherwise cause service disruption did not, the issue self-resolved, and the service continued to work. However, our data usage for the instance is around 20GB, yet our database now has 9TB of disk space allocated to it. Huge amounts of *wasted* disk space in the Cloud era.
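(For context, the ~20GB figure is the logical data size; on a MySQL instance it can be estimated with a query roughly like the one below. This is only an illustrative sketch, not the exact query we used.)
-- Approximate logical data size per schema (data + indexes), in GB
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
FROM information_schema.tables
GROUP BY table_schema;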
I have lots of services that use this database, and rebuilding it from scratch (*shut down services*, *cause service disruption for users*, mysqldump, backup, stop old instance, start new one, use mysql to restore data, reconfigure all services to the new database, new IP, etc.) is laborious and, well, can happen again in the future.
It would be a nice feature to be able to reduce the size, but I understand as a developer that "it is not that simple". Lots of infrastructure and key design decisions prevent me as a user from "just changing the size down", but I also have no way/tools to assist me in moving the relevant data (20GB) out of all that wasted, reserved "empty" disk space I'm paying for, without causing service disruption. Instead of spending time testing and simulating the DB work needed to rebuild our DB, I would like to use it to improve our system for our customers. But I'll have to go to the office on a Sunday to make this slightly laborious change because I have no easy way around this one.
gi...@revalue.earth <gi...@revalue.earth> #4
+1 to this.
Description
Problem you have encountered
I have a PostgreSQL instance in Google Cloud SQL in which the PostGIS and PostGIS Raster extensions are installed. I am trying to query rasters that are stored in a private GCS bucket and registered as out-of-database. This requires that the GDAL Virtual File System driver be configured with access to GCS.
The configuration options are set using the postgis.gdal_vsi_options GUC variable. However, when attempting to set the necessary configuration I receive an error. As a result, I cannot access my raster data using the out-of-database access pattern.
What you expected to happen
I expect to be able to:
SET postgis.enable_outdb_rasters='true';
SET postgis.gdal_enabled_drivers='ENABLE_ALL';
SET postgis.gdal_vsi_options='GS_ACCESS_KEY_ID=XXXX GS_SECRET_ACCESS_KEY=YYYY';
SELECT ST_Value(rast, 1, 1) from <table> limit 1;
Steps to reproduce
My Cloud SQL PostgreSQL instance has the following PostGIS and PostGIS Raster version information:
SELECT postgis_full_version();
Note the GDAL version:
GDAL="GDAL 3.0.0dev
I can register Cloud Optimised GeoTIFFs that are stored in a private GCS bucket as out-of-database rasters using the raster2pgsql utility, which I run from a VM on the same VPC as my database. Following this step I can confirm that corresponding records exist in my database.
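(For reference, one way to confirm the registration is a query along the lines of the one below; the table name my_rasters is hypothetical.)
-- Check that bands are registered as out-of-db and inspect the stored path
-- (for GCS-backed rasters this is typically a /vsigs/... path)
SELECT rid,
       (ST_BandMetaData(rast)).isoutdb,
       (ST_BandMetaData(rast)).path
FROM my_rasters
LIMIT 5;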
The next step is to query the raster data. This requires PostGIS to make a call to the out-of-database COGs using the VSI driver configuration.
I first set the required GUCs:
SET postgis.enable_outdb_rasters='true';
SET postgis.gdal_enabled_drivers='ENABLE_ALL';
SET postgis.gdal_vsi_options='GS_ACCESS_KEY_ID=XXXX GS_SECRET_ACCESS_KEY=YYYY';
At this point step 3 fails with an error. As a result, the query
SELECT ST_Value(rast, 1, 1) from <table> limit 1;
fails with an rt_band_load_offline_data error.
Note: sometimes subsequent attempts to set postgis.gdal_vsi_options will fail silently. A further request to get data will then also result in the same rt_band_load_offline_data error.
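(As an illustrative aside, a silent failure of the SET can be spotted by reading the GUC back afterwards, for example:)
-- Read the current value of the GUC back; an empty or unchanged value
-- indicates that the SET was ignored
SHOW postgis.gdal_vsi_options;
-- or, equivalently:
SELECT current_setting('postgis.gdal_vsi_options', true);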
Other information
Public COGs
I carried out the above steps to ingest and access a COG file from a public AWS bucket. However, this failed in the same way;
SET postgis.gdal_vsi_options='AWS_NO_SIGN_REQUEST=YES';
raised an invalid value error for AWS_NO_SIGN_REQUEST.
Alternate Approach
For comparison, I set up a compute instance running a Docker image of PostGIS (postgis/postgis:14-3.3). I repeated all of the steps outlined above and was successful in querying my raster data and performing crop and union operations.
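(The crop and union operations were along the lines of the sketch below; the table and geometry names are hypothetical.)
-- Clip raster tiles to an area of interest and merge the pieces
SELECT ST_Union(ST_Clip(r.rast, a.geom)) AS clipped
FROM my_rasters r, my_area_of_interest a
WHERE ST_Intersects(r.rast, a.geom);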
The version information is as follows:
GDAL on Cloud SQL
When setting postgis.gdal_vsi_options, PostGIS will validate the options by reading the available options from GDAL's VSIGetFileSystemOptions() function. Therefore, the ability to set these options in PostGIS is directly dependent on the GDAL version in use. What relevant options are available, and when did they become available in GDAL?
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY --> GDAL=2.3
- AWS_NO_SIGN_REQUEST --> GDAL=2.3
- GS_ACCESS_KEY_ID and GS_SECRET_ACCESS_KEY --> GDAL=2.3
- GS_NO_SIGN_REQUEST --> GDAL=3.4
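(As an aside, the GDAL version that the PostGIS installation was built against can be checked directly from SQL, for example:)
-- Report the GDAL library version used by PostGIS Raster
SELECT postgis_gdal_version();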
In summary:
- PostGIS >= 3.2 is required in order to set postgis.gdal_vsi_options.
- With GDAL >= 2.3, <3.4, users can set the necessary options to access private AWS and GCS buckets as well as public AWS buckets.
- With GDAL >= 3.4, public Google buckets can also be accessed.
Revisiting the PostGIS version installed on Cloud SQL:
Checklist:
- POSTGIS=3.2.5 >= 3.2 ✅
- GDAL=3.0.0dev >= 2.3 ✅
- GDAL=3.0.0dev >= 3.4 ❌
Given the above, the only issue should be that we cannot provide the GS_NO_SIGN_REQUEST=YES option to access public GCS buckets. My personal requirement of accessing data in a private GCS bucket should be achievable. This leads me to believe the problem lies with the GDAL=3.0.0dev package used in Cloud SQL.