Status Update
Comments
ap...@google.com <ap...@google.com> #2
Thank you for writing in. You have filed this as a bug, but from what I can see this is more of a Feature Request, would you like me to change the category of the issue to a Feature Request?
Could you add in a few examples of the requests you are making and the types of responses you are dealing with?
Can you describe a real-life example in which you use these requests and the quantified drawbacks of the current system - i.e. You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
Can you also clarify if your sheet goes over the 1 million cell limit of Google Sheets? You mention it has 200K+ rows, so if you have more than 5 columns, you are at the limit of what Sheets is intended for and you may want to consider moving to a database.
Finally, can you elaborate in your words the business impact that this new feature (if it is a feature request), would have on your business.
Thank you.
ap...@google.com <ap...@google.com> #3
Hi,
You asked a lot of questions, I tried to answer everything one by one. Let me know if I missed something.
You have filed this as a bug, but from what I can see this is more of a Feature Request, would you like me to change the category of the issue to a Feature Request?
I am currently migrating from V3, if I am not able to keep doing what I am doing in some way in V4, I consider this a bug, I think it would be a stretch to call this a feature.
Could you add in a few examples of the requests you are making and the types of responses you are dealing with?
Here are some basic minimalistic examples. The theme is the same, I want to append a new row to the end of the spreadsheet. Simple with V3, impossible with V4 to do it correctly.
Example 1: Values.Append
Add a new row to the sheet. Given data in row A1 "<empty>, something", and using Values.Append and inputRange="A1", the new row will be inserted to A1 instead of A2.
To fix this let's change inputRange to 1:1. Given data in row A1 "<empty>, something, <empty>, something", the new will be inserted into D2 instead of A2.
Or I could use range A:Z, but that also has the same issue, any empty row and column would make the insert misaligned and not start from column A.
I couldn't find any solution in V4 that would work in all cases that might occur in our spreadsheets. We haven't had such problems with V3.
Example 2: AppendCellsRange
Let's assume you fixed
I want to add a new row and return it's position and the inserted data. AppendCellsRange will add the row to the end of the sheet, but does not return the data.
I would need to make Values.Get request to read the full sheet. But I have a sheet with 160K rows and 16 columns. Instead I can do mutliple Values.Get, let's say 16x10K rows and try to search for the data, but this takes 16x the roundtime + it will hit the quota limit very fast and lock out my user for a while.
Can you describe a real-life example in which you use these requests and the quantified drawbacks of the current system - i.e. You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
The real-life example is Google Sheets integration with our own system. This is a feature we provide to some of our customers who need this. Our customers manage records of data in their spreadsheet, one record per row. While they use our system, they add and update rows in their sheet in real time.
The workflow looks like this:
- User navigates to a website and uses our service to store data from this website to a row in their spreadsheet, a new row to the end of the sheet, the problem I described above applies here.
- User navigates to another site, and does the same.
- User comes back to the first site and expects to see their data loaded from their spreadsheet in our service. So we need to search the sheet for the identifier and read the row, with V4 this is almost impossible, so we had to add complicated logic to index the spreadsheet and trying to refresh this inde as needed to speed up searches. Otherwise, or during indexing, we read the full sheet in batches of 10k rows, but without the index (cache) this dies on quota limits pretty quickly even after they got bumped by quite a lot, plus it takes a lot of time.
- User edits the loaded data and saves it back to the sheet. With V4 this is also become worse, We have to calculate checksums and try to to do a Get just before the Update and check for any concurrent changes, while with V3 the concurrent changes where handled automatically using eTags.
So in a nutshell this is our business logic for these customers.
You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
I think I explained this, reading the full sheet takes too much time and Google APIs are also timing out quite a lot, so we try to make more reliable calls by splitting the sheet into more predictable batches of fixed sizes, but this increases the number of API calls, again quota and latency.
It takes too long? How long does it take?
I hope you are aware of your own APIs performance, but in case not here are some numbers.
This is only for retrieving 4 columns using BatchGetValuesByDataFilterRequest.
Batch Size | Memory | Latency | Total latency (100k rows) |
---|---|---|---|
1k | 4-5 MB | 0.4s | 40s |
5k | 6 MB | 0.6s | 12s |
10k | 11 MB | 0.7s | 7s |
20k | 20 MB | 1s | 5s |
30k | 28 MB | 1.2s | 4s |
50k | 44 MB | 1.3s | 2.6s |
100k | 84 MB | 2s | 2s |
We handle a lot of users, so latency is only part of the issue, the other is keeping memory usage reasonable. So we cannot read the full spreadsheet of all users into memory every time. Consider sheets with 16 columns and 100K+ rows.
Here is an alternative using Spreadsheets.Get
Batch Size | Memory | Latency | Total latency (100k rows) |
---|---|---|---|
1k | 40 MB | 1.5s | 150s |
10k | 400 MB | 8s | 80s |
This is unusable, the Google Java SDK creates loads of objects and it is very slow here too.
Can you also clarify if your sheet goes over the 1 million cell limit of Google Sheets?
Are you sure about the 1 million cell limit? We have spreadsheets with over 1M cells that work without problems (especially with V3).
Please see here
Up to 5 million cells or 18,278 columns (column ZZZ) for spreadsheets that are created in or converted to Google Sheets.
So your service seems to have a limit for 5 million cells, so my example sheet of 200K rows and 16 columns is way below the limits.
You mention it has 200K+ rows, so if you have more than 5 columns, you are at the limit of what Sheets is intended for and you may want to consider moving to a database.
Again, your comment about the limit is false. The limit is 5 million and we are well aware of it, but most of our users are well below this limit, but a 200K row, 16 column sheet, which is well within limits of Google Sheets, is still to large to be read in full on every operation.
I cannot ask my users who are individuals without technical background to swap their spreadsheets with a database. They want to keep records in their spreadsheets, Google Sheets V3 works perfectly fine, V4 breaks a lot of this integration.
Finally, can you elaborate in your words the business impact that this new feature (if it is a feature request), would have on your business.
From my POV this is a bug in V4 as it breaks functionality that used to work in V3 and as I mentioned above, to consider appending a single row to the end of the spreadsheet as a new feature sounds like a stretch to me.
The business impact of not having this and losing V3 is that we might need to abandon our Google Sheets integration and risk losing around 2K customers.
ap...@google.com <ap...@google.com> #4
Thank you very much for the extensive details on the context around this request, it is very helpful.
Apologies about the 1 million cells comment, I am not sure where I got that from but you are absolutely right about the 5 million cell limit.
That said, from what I can make out, there are three main things your issue is about.
-
spreadsheets.values.append
does not return the row in which it added the values just appended, and it does not include the values. -
spreadsheets.values.append
sometimes does not detect or input values in the right place. -
spreadsheets.values.append
does not support more than 4 concurrent requests.
For 2, this has already been raised here
For 3, as you have already found, has another issue that is being dealt with separately.
Ensure to "star" these issues so that we know more people are affected and this will also subscribe you to updates.
For number 1, could you please review this example:
- Create a new sheet.
- In range B6:E9 manually insert [[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
- Go to
https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/append and use the "Try this API" and input:
- the Spreadsheet ID
- the range string
Sheet1!B6
includeValuesInResponse
astrue
valueInputOption
asRAW
- and the request body as below:
{
"values": [
[
1,
2,
3
]
]
}
This for me returns a response:
{
"spreadsheetId": "[SS_ID]",
"tableRange": "Sheet1!B6:C9",
"updates": {
"spreadsheetId": "[SS_ID]",
"updatedRange": "Sheet1!B10:D10",
"updatedRows": 1,
"updatedColumns": 3,
"updatedCells": 3,
"updatedData": {
"range": "Sheet1!B10:D10",
"majorDimension": "ROWS",
"values": [
[
"1",
"2",
"3"
]
]
}
}
}
Which as you can see in updatedRange
contains the range of the new cells, and this information is repeated in updatedData
which also contains the values that were just appended.
Or I may be missing something?
ap...@google.com <ap...@google.com> #5
Hi,
Thank you very much for the quick response, you are close, but some of the points are not related to the issue I was describing.
For number 1. spreadsheets.values.append
does return the inserted values and its position, I already use that and works perfectly well. But I was also testing spreadsheets.batchUpdate
with AppendCellsRequest
which works well for inserting rows to the expected last row, but it does not return the inserted data.
Your point in number 2 is not exactly what I was reporting either. I was not even aware of those edge cases reported in spreadsheets.values.append
(talking about the expected behavior, not bugs) does not provide a way to specifically insert a row starting from the first column after the last data row regardless of what rows and columns are missing above it.
To summarize, Google Sheets V4 provides 2 ways to append data:
spreadsheets.values.append
that returns inserted values, but it is not possible to use it to insert data after the last data row starting from the first column without a suitable arrangement of the preceding rowsspreadsheets.batchUpdate
'sAppendCellsRequest
does insert rows after the last data row starting from the first column as we need, but it does not return the inserted values and range.
In the end this means that none of the options provided by V4 are correct alternatives to V3's insert
operation and for our use case.
From the above two requests I prefer to use spreadsheets.values.append
that is faster and returns the inserted data and range. If we could solve the problem of inserting rows into the first column after the last data row, it would be awesome. Alternatively, if AppendCellsRequest
would return the inserted data or range, that would work for us too.
To summarize my suggestions:
- A. Add an option to
spreadsheets.values.append
to allow inserting data from the first column after the last data row such asAppendCellsRequest
regardless of table arrangements. - B. Or add a new request e.g.
spreadsheets.values.appendCells
that behaves likeAppendCellsRequest
but returns the inserted data and range just likespreadsheets.values.append
- C. Or make
AppendCellsRequest
return anAppenCellsResponse
that returns the inserted data and range on request.
Basically, you have 2 types of append requests that behave differently, what we need is an append request that is the combination of the two which would behave similarly to V3's insert
operation.
This is not specifically a bug in the implementation. What I am reporting is a bug in the design of V4 that either does not allow deterministic append behavior if the arrangement of data is not know to the caller or does not return the inserted data and range, and thus in the end does not provide a good alternative for migration from V3's insert
operation.
ap...@google.com <ap...@google.com> #6
Another datapoint on why this is considered a bug is based on the Google announcement here
We want to make sure that projects originally built on the v3 API continue working long after it is gone.
This is the only thing we want too, but sometimes V4 makes it hard to do so and already forced us to make several workarounds, so that I cannot fully agree with the below statement from the same announcement.
As part of the migration to the Sheets v4 API, which provides a better developer and user experience, we will be retiring the Sheets v3 API
ap...@google.com <ap...@google.com> #7
Thank you for the further information.
Let me try and summarize this in my words to see if we are in agreement:
You are reliant on the behavior of appending cells in v3 (specifically the insert
behavior).
The task is, in a very large spreadsheet, to insert a row at the end of the sheet and have the response include the data inserted and its location in the sheet.
v4 offers two methods to do this, spreadsheet.values.append
and spreadsheets.batchUpdate
using the appendCells
request and both of these methods have slightly different behavior.
spreadsheet.values.append
will try to automatically detect the table in the range that is given as a non-optional parameter and will append the data after the last line of the "table". This does return the the data inserted and its location/
spreadsheets.batchUpdate
using the appendCells
request, does not seem to detect tables in this way, and will just detect the last row that has data and append it to that starting in row A
. This can return the position and values, but only in the form of the whole spreadsheet resource.
For your use case, neither is suitable. spreadsheet.values.append
does not work because you may have some empty rows and so it might not insert values at the end of the table. spreadsheets.batchUpdate
is not suitable because the spreadsheet resource is too large.
Is that accurate?
ap...@google.com <ap...@google.com> #8
Yes, your summary is spot on.
spreadsheets.batchUpdate
is not suitable because the spreadsheet resource is too large.
I would just add that it also does not return the appended row position and data and hence the only other option is to retrieve the sheet which is too large to do so on a regular basis.
ap...@google.com <ap...@google.com> #9
Thank you. I have raised this internally. Once we have some updates, we will post them here.
el...@google.com <el...@google.com>
na...@google.com <na...@google.com> #10
This bug was linked in a change in the following release(s):
androidx.room:room-ktx:2.5.0-beta01
androidx.room:room-runtime:2.5.0-beta01
androidx.sqlite:sqlite:2.3.0-beta01
androidx.sqlite:sqlite-framework:2.3.0-beta01
androidx.sqlite:sqlite-ktx:2.3.0-beta01
androidx.work:work-gcm:2.8.0-beta01
androidx.work:work-multiprocess:2.8.0-beta01
androidx.work:work-runtime:2.8.0-beta01
androidx.work:work-runtime-ktx:2.8.0-beta01
androidx.work:work-testing:2.8.0-beta01
Description
This involves converting the following sub-directories in the following order: sqlite sqlite-framework sqlite-ktx integration-tests