Status Update
Comments
jp...@google.com <jp...@google.com> #2
Thank you for writing in. You have filed this as a bug, but from what I can see this is more of a Feature Request, would you like me to change the category of the issue to a Feature Request?
Could you add in a few examples of the requests you are making and the types of responses you are dealing with?
Can you describe a real-life example in which you use these requests and the quantified drawbacks of the current system - i.e. You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
Can you also clarify if your sheet goes over the 1 million cell limit of Google Sheets? You mention it has 200K+ rows, so if you have more than 5 columns, you are at the limit of what Sheets is intended for and you may want to consider moving to a database.
Finally, can you elaborate in your words the business impact that this new feature (if it is a feature request), would have on your business.
Thank you.
bl...@google.com <bl...@google.com> #3
Hi,
You asked a lot of questions, I tried to answer everything one by one. Let me know if I missed something.
You have filed this as a bug, but from what I can see this is more of a Feature Request, would you like me to change the category of the issue to a Feature Request?
I am currently migrating from V3, if I am not able to keep doing what I am doing in some way in V4, I consider this a bug, I think it would be a stretch to call this a feature.
Could you add in a few examples of the requests you are making and the types of responses you are dealing with?
Here are some basic minimalistic examples. The theme is the same, I want to append a new row to the end of the spreadsheet. Simple with V3, impossible with V4 to do it correctly.
Example 1: Values.Append
Add a new row to the sheet. Given data in row A1 "<empty>, something", and using Values.Append and inputRange="A1", the new row will be inserted to A1 instead of A2.
To fix this let's change inputRange to 1:1. Given data in row A1 "<empty>, something, <empty>, something", the new will be inserted into D2 instead of A2.
Or I could use range A:Z, but that also has the same issue, any empty row and column would make the insert misaligned and not start from column A.
I couldn't find any solution in V4 that would work in all cases that might occur in our spreadsheets. We haven't had such problems with V3.
Example 2: AppendCellsRange
Let's assume you fixed
I want to add a new row and return it's position and the inserted data. AppendCellsRange will add the row to the end of the sheet, but does not return the data.
I would need to make Values.Get request to read the full sheet. But I have a sheet with 160K rows and 16 columns. Instead I can do mutliple Values.Get, let's say 16x10K rows and try to search for the data, but this takes 16x the roundtime + it will hit the quota limit very fast and lock out my user for a while.
Can you describe a real-life example in which you use these requests and the quantified drawbacks of the current system - i.e. You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
The real-life example is Google Sheets integration with our own system. This is a feature we provide to some of our customers who need this. Our customers manage records of data in their spreadsheet, one record per row. While they use our system, they add and update rows in their sheet in real time.
The workflow looks like this:
- User navigates to a website and uses our service to store data from this website to a row in their spreadsheet, a new row to the end of the sheet, the problem I described above applies here.
- User navigates to another site, and does the same.
- User comes back to the first site and expects to see their data loaded from their spreadsheet in our service. So we need to search the sheet for the identifier and read the row, with V4 this is almost impossible, so we had to add complicated logic to index the spreadsheet and trying to refresh this inde as needed to speed up searches. Otherwise, or during indexing, we read the full sheet in batches of 10k rows, but without the index (cache) this dies on quota limits pretty quickly even after they got bumped by quite a lot, plus it takes a lot of time.
- User edits the loaded data and saves it back to the sheet. With V4 this is also become worse, We have to calculate checksums and try to to do a Get just before the Update and check for any concurrent changes, while with V3 the concurrent changes where handled automatically using eTags.
So in a nutshell this is our business logic for these customers.
You mention that it is prohibitively expensive, how exactly do you mean? It takes too long? How long does it take?
I think I explained this, reading the full sheet takes too much time and Google APIs are also timing out quite a lot, so we try to make more reliable calls by splitting the sheet into more predictable batches of fixed sizes, but this increases the number of API calls, again quota and latency.
It takes too long? How long does it take?
I hope you are aware of your own APIs performance, but in case not here are some numbers.
This is only for retrieving 4 columns using BatchGetValuesByDataFilterRequest.
Batch Size | Memory | Latency | Total latency (100k rows) |
---|---|---|---|
1k | 4-5 MB | 0.4s | 40s |
5k | 6 MB | 0.6s | 12s |
10k | 11 MB | 0.7s | 7s |
20k | 20 MB | 1s | 5s |
30k | 28 MB | 1.2s | 4s |
50k | 44 MB | 1.3s | 2.6s |
100k | 84 MB | 2s | 2s |
We handle a lot of users, so latency is only part of the issue, the other is keeping memory usage reasonable. So we cannot read the full spreadsheet of all users into memory every time. Consider sheets with 16 columns and 100K+ rows.
Here is an alternative using Spreadsheets.Get
Batch Size | Memory | Latency | Total latency (100k rows) |
---|---|---|---|
1k | 40 MB | 1.5s | 150s |
10k | 400 MB | 8s | 80s |
This is unusable, the Google Java SDK creates loads of objects and it is very slow here too.
Can you also clarify if your sheet goes over the 1 million cell limit of Google Sheets?
Are you sure about the 1 million cell limit? We have spreadsheets with over 1M cells that work without problems (especially with V3).
Please see here
Up to 5 million cells or 18,278 columns (column ZZZ) for spreadsheets that are created in or converted to Google Sheets.
So your service seems to have a limit for 5 million cells, so my example sheet of 200K rows and 16 columns is way below the limits.
You mention it has 200K+ rows, so if you have more than 5 columns, you are at the limit of what Sheets is intended for and you may want to consider moving to a database.
Again, your comment about the limit is false. The limit is 5 million and we are well aware of it, but most of our users are well below this limit, but a 200K row, 16 column sheet, which is well within limits of Google Sheets, is still to large to be read in full on every operation.
I cannot ask my users who are individuals without technical background to swap their spreadsheets with a database. They want to keep records in their spreadsheets, Google Sheets V3 works perfectly fine, V4 breaks a lot of this integration.
Finally, can you elaborate in your words the business impact that this new feature (if it is a feature request), would have on your business.
From my POV this is a bug in V4 as it breaks functionality that used to work in V3 and as I mentioned above, to consider appending a single row to the end of the spreadsheet as a new feature sounds like a stretch to me.
The business impact of not having this and losing V3 is that we might need to abandon our Google Sheets integration and risk losing around 2K customers.
sh...@gmail.com <sh...@gmail.com> #4
Thank you very much for the extensive details on the context around this request, it is very helpful.
Apologies about the 1 million cells comment, I am not sure where I got that from but you are absolutely right about the 5 million cell limit.
That said, from what I can make out, there are three main things your issue is about.
-
spreadsheets.values.append
does not return the row in which it added the values just appended, and it does not include the values. -
spreadsheets.values.append
sometimes does not detect or input values in the right place. -
spreadsheets.values.append
does not support more than 4 concurrent requests.
For 2, this has already been raised here
For 3, as you have already found, has another issue that is being dealt with separately.
Ensure to "star" these issues so that we know more people are affected and this will also subscribe you to updates.
For number 1, could you please review this example:
- Create a new sheet.
- In range B6:E9 manually insert [[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
- Go to
https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/append and use the "Try this API" and input:
- the Spreadsheet ID
- the range string
Sheet1!B6
includeValuesInResponse
astrue
valueInputOption
asRAW
- and the request body as below:
{
"values": [
[
1,
2,
3
]
]
}
This for me returns a response:
{
"spreadsheetId": "[SS_ID]",
"tableRange": "Sheet1!B6:C9",
"updates": {
"spreadsheetId": "[SS_ID]",
"updatedRange": "Sheet1!B10:D10",
"updatedRows": 1,
"updatedColumns": 3,
"updatedCells": 3,
"updatedData": {
"range": "Sheet1!B10:D10",
"majorDimension": "ROWS",
"values": [
[
"1",
"2",
"3"
]
]
}
}
}
Which as you can see in updatedRange
contains the range of the new cells, and this information is repeated in updatedData
which also contains the values that were just appended.
Or I may be missing something?
Description
Before opening a new issue, please search for other related issues , click the ★ to subscribe to updates, and click
+1
to vote.Description
We need to create docs in pageless format. There is a similar issue in Apps Scripts: https://issuetracker.google.com/issues/227875469
Impact
At the moment, all docs created must be manually switched to pageless format.