Fix: Large table causes sql error#2242
Conversation
410c7ff to
e497d0b
Compare
4e43530 to
f792683
Compare
3eb630d to
491ee41
Compare
f599f40 to
8ca7e61
Compare
8ca7e61 to
5ce5c0f
Compare
b26b1e0 to
696cb1a
Compare
006839e to
75d3196
Compare
691a826 to
45ae426
Compare
45ae426 to
2dfe1f2
Compare
2a8feb5 to
deb1979
Compare
deb1979 to
c6912b7
Compare
c86e7ec to
cfc722b
Compare
There was a problem hiding this comment.
This import path writes cells directly via the cell mapper, bypassing Row2Mapper, so tables_row_sleeves.cached_cells is never populated for imported rows?
There was a problem hiding this comment.
I didn't get the context. Is it the same as #2242 (comment) ? Otherwise can you provide affected method names?
There was a problem hiding this comment.
we should update the migration name. It's been so long since this was started, new migrations have been added now
There was a problem hiding this comment.
updated. Also bumped version from 2.2 to 2.3
| @@ -768,9 +670,12 @@ public function update(Row2 $row, ?string $userId = null): Row2 { | |||
| $this->columnMapper->preloadColumns(array_column($changedCells, 'columnId')); | |||
|
|
|||
| // write all changed cells to its db-table | |||
| $cachedCells = $sleeve->getCachedCellsArray(); | |||
| foreach ($changedCells as $cell) { | |||
| $this->insertOrUpdateCell($sleeve->getId(), $cell['columnId'], $cell['value']); | |||
| $cachedCells[$cell['columnId']] = $this->insertOrUpdateCell($sleeve->getId(), $cell['columnId'], $cell['value']); | |||
| } | |||
| $sleeve->setCachedCellsArray($cachedCells); | |||
| $this->rowSleeveMapper->update($sleeve); | |||
There was a problem hiding this comment.
update() writes the sleeve twice per row edit. Can we fold the metadata change and the cached_cells update into a single update() call.
There was a problem hiding this comment.
I did some changes here, testing now.
TBH would be nice to wrap all this db changes into a transaction. But this is completely separate topic
|
|
||
| /** | ||
| * @param int[] $ids | ||
| * @return RowSleeve[] |
There was a problem hiding this comment.
confused why raw data is better performance
Signed-off-by: Kostiantyn Miakshyn <molodchick@gmail.com>
Signed-off-by: Kostiantyn Miakshyn <molodchick@gmail.com>
cfc722b to
0d16156
Compare
Signed-off-by: Kostiantyn Miakshyn <molodchick@gmail.com>
e974ce9 to
7f43314
Compare
Signed-off-by: Kostiantyn Miakshyn <molodchick@gmail.com>
7f43314 to
a150684
Compare
blizzz
left a comment
There was a problem hiding this comment.
I have a few questions and scaling concerns (also nitpicks added 😇 )
|
|
||
| ]]></description> | ||
| <version>2.2.0</version> | ||
| <version>2.3.0</version> |
There was a problem hiding this comment.
| <version>2.3.0</version> | |
| <version>2.3.0-dev.0</version> |
Would go for a pre-release number, unless we'd immediately release afterwards.
| @@ -0,0 +1,92 @@ | |||
| <?php | |||
|
|
|||
There was a problem hiding this comment.
| declare(strict_types=1); | |
|
|
||
| $table = $schema->getTable('tables_row_sleeves'); | ||
| if (!$table->hasColumn('cached_cells')) { | ||
| $table->addColumn('cached_cells', Types::JSON, [ |
There was a problem hiding this comment.
We should avoid JSON types, as they work poorly across different DB vendors and caused trouble in the past. Use Text instead. Types::JSON is also officially deprecated for these reasons since NC 33.
| } | ||
|
|
||
| $sleeve = $this->rowSleeveMapper->find($rowId); | ||
| $sleeve->setCachedCellsArray($cachedCells); |
There was a problem hiding this comment.
Correct me if I am wrong. This stores all cell values of a how table in a single row? On MySQL this would be limited to whopping 4GB, but generally, this sounds risky and could lead to transferal of a huge amount of data on really big tables. What do you think?
There was a problem hiding this comment.
Also as a post-migration step, this might take a real long time to accomplish on instanced with a big amount of data.
| foreach ($cells as $cell) { | ||
| $cachedCells[$cell->getColumnId()] = $mapper->toArray($cell); | ||
| } | ||
| } catch (\Exception $e) { |
There was a problem hiding this comment.
It's a too broad catch, isn't it?
| $cachedCells = []; | ||
|
|
||
| /** @var RowCellMapperSuper[] $cellMappers */ | ||
| $cellMappers = [ |
There was a problem hiding this comment.
doesn't need to be within the loop?
| } | ||
|
|
||
| private function rebuildCachedCells(array $rowIdMap): void { | ||
| foreach ($rowIdMap as $newRowId) { |
There was a problem hiding this comment.
Especially if this is running in user facing scripts, this may run into a timeout on big tables and fail.
| $sleeveAlias . '.id', | ||
| $sleeveAlias . '.table_id', | ||
| $sleeveAlias . '.cached_cells', | ||
| $sleeveAlias . '.created_by', |


This PR built on top of #2238.
Closes #1490.
It fixes loading of the large tables. Right now it is not possible to load a table with 30k rows and 8 columns due to an error:
PR contains 2 commits: fix itself and performance improvement.
I've measured latency for a various scenarios for
/row/table/{tableId}endpoint:📹 Video for performance comparison
Next step: move filtering/sorting/pagination to BE side and speedup FE. But this is completely separate topic which will be implemented in upcoming PRs.