Status Update
Comments
da...@google.com <da...@google.com>
ja...@jamie-thomson.net <ja...@jamie-thomson.net> #2
According to some documentation that I have just read this is now supported:
Does BigQuery column-level security work for views?
Yes. Views are derived from an underlying table. The same column-level security on the table applies when the protected columns are accessed through a view.Note that there are two kinds of views in BigQuery, logical views and authorized views. Both types of views are derived from a source table, and both are consistent with the table's column-level security.
pd...@bendingspoons.com <pd...@bendingspoons.com> #3
That help page seems to suggest that if there a table A is referenced by view B, the policy tags set on A are enforced also when users query B. This is expected.
However, it is not possible to set additional policy tags on B fields independently, AFAIK.
mi...@google.com <mi...@google.com>
ja...@jamie-thomson.net <ja...@jamie-thomson.net> #4
Ignore my previous comment above. On reflection the documentation I referred to does not solve the problem that the original poster outlined. Moreover, we've recently discovered limitations of policy tags that makes me realise how vital this request is.
Take this scenario:
We have a table t containing emailAddress which is considered PII and hence we do not want to expose that to everyone, so we put a policy tag upon it to limit who can access it. However, people still want to be able to answer question like "How many distinct users visited our site?" which could be answered by issuing
select count(distinct emailAddress) from t
however, the policy tag prevents that.
We'd like to get around this problem by presenting a view which does this:
create view v
as
select SHA256("some-pepper-value" || emailAddress) as emailAddressHash
so that users can query it like so:
select count(distinct emailAddressHash) from v
however, the policy tag still prevents end users from accessing emailAddressHash. This is our problem, we know the hashing mechanism protects the PII data but still people cannot access the data.
Perhaps the original poster's suggestion solves this problem, perhaps it doesn't. But nevertheless, this is a problem that we need a solution for.
ar...@google.com <ar...@google.com> #5
Hello,
We understand the importance of this feature and the product team is already aware as well. I invite you to subscribe to this thread, if you haven't already done so, by clicking on the star at the top left, this way, you also help to bring more visibility to this feature.
Thanks.
pd...@bendingspoons.com <pd...@bendingspoons.com> #6
My use case is similar, but not exactly identical to the scenario described in #4.
My use case is the following:
- I have a source table
T1
containing sensitive personal data. It is in a dataset with restricted access. Analysts can't query that table directly. - I define an authorized view
V1
overT1
. This view does some data transformation and drops some columns. - I still want to expose some sensitive columns (like a user ID) through
V1
, but only to a selected subset of users and service accounts.
My goal is to be able to do so without resorting to SESSION_USER()
, which feels like a hack and requires to hardcode usernames in the view query of V1
.
Description
BigQuery Policy Tags are an important data governance feature to enforce access control policies at column-level granularity.
Currently, it is not possible to set Policy Tags in BigQuery views, as only tables are supported. As a consequence, the access policy to an authorized view can only be binary: either the user is authorized to read all the data from the view, or they are not.
By extending the Policy Tag support to (authorized) views, they would become an incredibly powerful feature for access control: authorized views would then allow both to manipulate/obfuscate the source data with an arbitrary SQL query, and to restrict the access to each column to different user groups.
I'm currently using Policy Tags to protect certain critical columns in a table derived from a transformation of production data, but I would like to switch to views to have access to real-time data. The lack of policy tags is preventing me to do so. (I can't touch the source tables.)