Add BLB_write function to allow modification of data in a BLOB #9066
Noremos wants to merge 7 commits into
What's wrong or insufficient with seek + putSegment?
So why not make it work with |
Initially, I wanted to implement the same read/write interface as for TempSpace. But I can do this within the |
Are you still talking about blobs here?
I've lost your point here, sorry. How does it answer my question?
BTW, code in blb.cpp (at least) requires comments. It have |
Why is |
I meant that the TempSpace class has convenient read/write functions with the ability to specify a position, so it would be great to have the same function for blobs at some point. As for So now |
Sorry, I forgot to delete this file after the branch checkout.
If you say |
Do new methods of |
Added description of modification algorithm: |
Added as suggested |
// Write data at any position in a temporary (new) blob
// The position of the new buffer must start inside the blob range, but its length may extend beyond it
// Existing data will be overwritten
void BLB_write(thread_db* tdbb, const offset_t position, const void* buffer, ULONG length);
Are these two functions supposed to be a public API? I see no changes in the interfaces.
If not, they should not start with BLB_
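To make the contract in the comment above concrete, here is a minimal in-memory sketch of a positional write. `ToyBlob` is a hypothetical stand-in backed by a `std::vector`, not the engine's `blb` class, and it assumes that a position equal to the current length (a pure append) is accepted, which the PR does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a temporary blob; NOT the engine's blb class.
struct ToyBlob
{
    std::vector<std::uint8_t> data;

    // Overwrite bytes starting at `position`.
    // Assumption: the start must lie in [0, size]; the tail may grow the blob.
    void write(std::size_t position, const void* buffer, std::size_t length)
    {
        assert(buffer != nullptr || length == 0);

        if (position > data.size())
            throw std::out_of_range("write position is past the end of the blob");

        if (length == 0)
            return;

        // Grow only when the write extends beyond the existing data.
        if (position + length > data.size())
            data.resize(position + length);

        // Existing bytes in the range are overwritten.
        std::memcpy(data.data() + position, buffer, length);
    }
};
```

A write that starts inside the existing data and runs past its end both overwrites the tail and extends the blob, while a write that starts beyond the end is rejected.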
void destroy(const bool purge_flag);

// Modify only existing data. Throw error on side violation
void modifyExistingData(thread_db* tdbb, offset_t position, const void* buffer, const ULONG length);
Sorry, did I miss a way to modify non-existing data? :)
// false: the input range extends beyond existing data. Modify `buffer` and `length` to return only non-written data
template<class BufferType, class SizeType>
requires((std::is_same_v<BufferType, void> || std::is_same_v<BufferType, UCHAR>) && std::is_integral_v<SizeType>)
bool modifyDataMoveBuffer(thread_db* tdbb, const offset_t position, const BufferType*& buffer, SizeType& length)
DataMoveBuffer? What is it? We already have MoveBuffer used in MOV\CVT, and it is a completely different thing.
Why is this a template? Why is it an inline routine?
m_offset = position % m_pageDataLength; // Position in the page
}

// Get data from blob data page and replace data on it
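The offset arithmetic in this fragment (an in-page offset taken modulo the page data length, which applies only to the first page touched) can be sketched as a function that splits a write into per-page chunks. `PageChunk` and `splitWrite` are hypothetical names for illustration and assume every data page holds the same number of payload bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of the part of a write that lands on one data page.
struct PageChunk
{
    std::size_t page;    // zero-based index of the data page
    std::size_t offset;  // byte offset inside that page
    std::size_t length;  // number of bytes written to that page
};

// Split a write of `length` bytes at `position` into per-page chunks.
// Assumption: every data page carries exactly `pageDataLength` payload bytes.
std::vector<PageChunk> splitWrite(std::uint64_t position, std::size_t length,
                                  std::size_t pageDataLength)
{
    assert(pageDataLength > 0);

    std::vector<PageChunk> chunks;
    std::size_t page = static_cast<std::size_t>(position / pageDataLength);
    std::size_t offset = static_cast<std::size_t>(position % pageDataLength);

    while (length > 0)
    {
        const std::size_t n = std::min(length, pageDataLength - offset);
        chunks.push_back({page, offset, n});
        length -= n;
        ++page;
        offset = 0; // a non-zero offset applies only to the first page
    }

    return chunks;
}
```

For example, with 100 payload bytes per page, a 300-byte write at position 250 touches pages 2 through 5: a 50-byte chunk at offset 50, two full pages, and a final 50-byte chunk at offset 0.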
m_offset = 0; // Offset only in the first page
};

// Move child page Id from level 1 to level 2
This is completely unclear.
There is no way to "move" "page Id" (which "Id" ???) between "levels".
}

// Get level 1 or level 2 page
inline ULONG getNextLevel1PageId() noexcept
// 5. If no more level 2 pages are available, advance to the next level 1 page,
// read its first level 2 page, and continue modifying subsequent level 2 pages.
// 6. If all pages have been processed but there is still unmodified data, update the <buffer>.
Thanks, but where did you get the terminology you used? I'm sorry, but it is very hard to read: I constantly need to translate it into familiar terms.
There is no such thing as a level of blob pages. There is the level of a blob.
There are blob pointer pages and blob data pages, and the blob record contains blob data (for level-0 blobs) or an array of pointers (for non-zero-level blobs).
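Using that terminology, the lookup of a data page can be sketched as follows. This is only an illustration: `BlobSlot`, `locateDataPage` and `pointersPerPage` are hypothetical names, and the sketch assumes that a level-1 blob record points directly at data pages while a level-2 blob record points at pointer pages, each holding a fixed number of data-page pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical result: where the pointer to a given data page lives.
struct BlobSlot
{
    std::size_t recordSlot;       // index in the pointer array of the blob record
    std::size_t pointerPageSlot;  // index inside the pointer page (level 2 only)
};

// Locate the pointer to data page number `dataPageIndex` (zero-based).
// Assumptions: level 1 -> record array points at data pages;
//              level 2 -> record array points at pointer pages.
BlobSlot locateDataPage(int blobLevel, std::size_t dataPageIndex,
                        std::size_t pointersPerPage)
{
    if (blobLevel == 1)
        return {dataPageIndex, 0};

    if (blobLevel == 2)
    {
        assert(pointersPerPage > 0);
        return {dataPageIndex / pointersPerPage,
                dataPageIndex % pointersPerPage};
    }

    // Level-0 blobs keep their data in the blob record itself.
    throw std::invalid_argument("no data pages to locate for this blob level");
}
```

Phrased this way, the "level" is a property of the blob, and the pages involved are either pointer pages or data pages.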
auto releasePage = [&tdbb, &window](const bool mark)
{
if (mark)
CCH_MARK(tdbb, &window); // Mark as dirty
The page must be marked before any attempt to change its contents!
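The ordering asked for here can be illustrated with a toy page that refuses to hand out a writable pointer until it has been marked. `ToyPage` and `overwrite` are hypothetical and do not model the real page cache or `CCH_MARK`; the sketch shows only the mark-then-modify order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical page that enforces "mark before modify"; NOT the engine's cache.
class ToyPage
{
public:
    explicit ToyPage(std::size_t size) : m_bytes(size) {}

    void mark() { m_dirty = true; }
    bool dirty() const { return m_dirty; }
    std::size_t size() const { return m_bytes.size(); }
    const std::vector<std::uint8_t>& bytes() const { return m_bytes; }

    // Write access is granted only after the page has been marked.
    std::uint8_t* writable()
    {
        if (!m_dirty)
            throw std::logic_error("page modified before being marked");
        return m_bytes.data();
    }

private:
    std::vector<std::uint8_t> m_bytes;
    bool m_dirty = false;
};

void overwrite(ToyPage& page, std::size_t offset,
               const std::uint8_t* src, std::size_t length)
{
    assert(src != nullptr || length == 0);

    // Validate first, so a rejected write does not dirty the page.
    if (offset > page.size() || length > page.size() - offset)
        throw std::out_of_range("write does not fit in the page");

    page.mark();                                         // 1. mark the page
    std::copy_n(src, length, page.writable() + offset);  // 2. then change it
}
```

Marking at release time, after the bytes have already changed, is the order this toy would reject.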
void blb::BLB_write(thread_db* tdbb, const offset_t position, const void* buffer, ULONG length)
{
if (!(blb_flags & BLB_temporary) || (blb_flags & BLB_closed))
ERR_post(Arg::Gds(isc_cannot_update_old_blob)); // Cannot update existing blob
This check is duplicated many times; it is worth moving it into a separate routine.
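One possible shape for such a routine is sketched below. The flag constants and function names are hypothetical placeholders rather than the engine's `BLB_temporary` / `BLB_closed` values or its `ERR_post` mechanism; a plain exception stands in for the posted error.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical flag values for illustration only.
constexpr unsigned TOY_BLB_temporary = 1u << 0;
constexpr unsigned TOY_BLB_closed    = 1u << 1;

// A blob is writable only while it is temporary and not yet closed.
inline bool isWritable(unsigned flags)
{
    // This sketch models just the two flags above.
    assert((flags & ~(TOY_BLB_temporary | TOY_BLB_closed)) == 0);
    return (flags & TOY_BLB_temporary) && !(flags & TOY_BLB_closed);
}

// Single place for the check that every write path would otherwise repeat.
inline void ensureWritable(unsigned flags)
{
    if (!isWritable(flags))
        throw std::runtime_error("cannot update old blob");
}
```

Each write entry point would then begin with one call to the helper instead of repeating the flag test.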
FB_IMPL_MSG(JRD, 1018, dsql_agg_param_not_accum, -204, "42", "000", "Aggregate function input parameters may be referenced only in ON ACCUMULATE DO")
FB_IMPL_MSG(JRD, 1019, dsql_agg_exit_group, -204, "42", "000", "EXIT is not allowed in ON GROUP DO section of aggregate function")
FB_IMPL_MSG(JRD, 1020, dsql_agg_return, -204, "42", "000", "RETURN is not allowed in ON START DO, ON ACCUMULATE DO or ON FINISH DO sections of aggregate function; use EXIT instead")
FB_IMPL_MSG(JRD, 1021, blob_out_of_length_write, -204, "42", "000", "Cannot write to blob. Position @1 is out of blob length @2")
blob_write_out_of_bounds or blob_write_after_the_end?
The same goes for the message text.
Don't get me wrong: the code itself is not bad, but it is very hard to read, understand and verify.
This PR is part of the JSON implementation. Writing binary JSON efficiently requires modifying a blob in place. Otherwise, the data has to be built in a TempSpace and then converted to a blob, which significantly increases overhead.
Additions: