Skip to content

Add BLB_write function to allow modification of data in a BLOB#9066

Open
Noremos wants to merge 7 commits into
FirebirdSQL:masterfrom
Noremos:blob_data_modification_function
Open

Add BLB_write function to allow modification of data in a BLOB#9066
Noremos wants to merge 7 commits into
FirebirdSQL:masterfrom
Noremos:blob_data_modification_function

Conversation

@Noremos

@Noremos Noremos commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

This PR is part of the JSON implementation. Efficiently writing binary JSON requires modifying a blob. Otherwise, TempSpace is required to use it and convert the data to a blob, significantly increasing overhead.

Additions:

  1. New blob functions: BLB_write. It works the same as TempSpace::write
  2. New blob functions: BLB_read. It works the same as BLB_seek + BLB_get_data
  3. BLB_lseek now works with BLB_put_data (to modify and append data)
  4. BLB_lseek now works with BLB_put_segment (to modify and append data)

@hvlad

hvlad commented Jun 17, 2026

Copy link
Copy Markdown
Member

Whats wrong or not enough with seek + putSegment ?

@Noremos

Noremos commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Whats wrong or not enough with seek + putSegment ?

BLB_lseek works only for BLB_get_segment

@hvlad

hvlad commented Jun 17, 2026

Copy link
Copy Markdown
Member

Whats wrong or not enough with seek + putSegment ?

BLB_lseek works only for BLB_get_segment

So why not make it work with put too? I suppose it's been done in your PR anyway.

@Noremos

Noremos commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Whats wrong or not enough with seek + putSegment ?

BLB_lseek works only for BLB_get_segment

So why not make it work with put too? I suppose it's been done in your PR anyway.

Initially, I wanted to implement the same read/write interface as for TempSpace. But I can do this within the BLB_get_segment function.

@hvlad

hvlad commented Jun 17, 2026

Copy link
Copy Markdown
Member

Whats wrong or not enough with seek + putSegment ?

BLB_lseek works only for BLB_get_segment

So why not make it work with put too? I suppose it's been done in your PR anyway.

Initially, I wanted to implement the same read/write interface as for TempSpace.

You still speak about blobs here ?

But I can do this within the BLB_get_segment function.

I'm lost your point here, sorry. How it answers on my question ?

@hvlad

hvlad commented Jun 17, 2026

Copy link
Copy Markdown
Member

BTW, code in blb.cpp (at least) requires comments. It have none too small useful comments, imo.

@sim1984

sim1984 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Why is JsonTempParsingTests.cpp needed in this PR?

@Noremos

Noremos commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

Whats wrong or not enough with seek + putSegment ?

BLB_lseek works only for BLB_get_segment

So why not make it work with put too? I suppose it's been done in your PR anyway.

Initially, I wanted to implement the same read/write interface as for TempSpace.

You still speak about blobs here ?

But I can do this within the BLB_get_segment function.

I'm lost your point here, sorry. How it answers on my question ?

I meant that the TempSpace class has convenient read/write functions with the ability to specify a position, so it would be great to have the same function for blobs at some point.

As for BLB_put_data and BLB_put_segment, I initially didn't want to change them, as the code in them is quite complex and modifying it would be quite difficult. But now I see that the integration is quite simple, so I did it.

So now BLB_write works the same as BLB_lseek + BLB_put_data (or BLB_lseek + BLB_put_segment).

@Noremos

Noremos commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

Why is JsonTempParsingTests.cpp needed in this PR?

Sorry, forgot to delete this file after the branch checkout.

@dyemanov

Copy link
Copy Markdown
Member

If you say BLB_write() is convenient, I'd also expect BLB_read() to be added for symmetry.

@aafemt

aafemt commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Do new methods of blb really need this ancient BLB_ prefix?..

@Noremos

Noremos commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

BTW, code in blb.cpp (at least) requires comments. It have none too small useful comments, imo.

Added description of modification algorithm:
https://github.com/FirebirdSQL/firebird/pull/9066/changes#diff-b38a2d18c3f2097dc3697386a2d4d4cb6c708408ac2cdad4169aca72bdf2479dR2051-R2079

@Noremos

Noremos commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

If you say BLB_write() is convenient, I'd also expect BLB_read() to be added for symmetry.

Added as suggested

Comment thread src/jrd/blb.h
// Write data at any position in a temporally (new) blob
// The position of the new buffer must start inside the blob range, but its length may extend beyond it
// Existing data will be overwritten
void BLB_write(thread_db* tdbb, const offset_t position, const void* buffer, ULONG length);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these two functions supposed to be a public API ? I see no changes in interfaces.
If not - it should not start with BLB_

Comment thread src/jrd/blb.h
void destroy(const bool purge_flag);

// Modify only existing data. Throw error on side violation
void modifyExistingData(thread_db* tdbb, offset_t position, const void* buffer, const ULONG length);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, am I missed a way to modify non-existing data ? :)

Comment thread src/jrd/blb.h
// false: the input range is extends beyond existing data. Modify `buffer` and `length` to return only non-written data
template<class BufferType, class SizeType>
requires((std::is_same_v<BufferType, void> || std::is_same_v<BufferType, UCHAR>) && std::is_integral_v<SizeType>)
bool modifyDataMoveBuffer(thread_db* tdbb, const offset_t position, const BufferType*& buffer, SizeType& length)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataMoveBuffer ? What is it ? We already have MoveBuffer used in MOV\CVT, and it is completely different thing.

Why template here ? Why it is inline routine ?

Comment thread src/jrd/blb.cpp
m_offset = position % m_pageDataLength; // Position in the page
}

// Get data from blob data page and replace data on it

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get data ?

Comment thread src/jrd/blb.cpp
m_offset = 0; // Offset only in the first page
};

// Move child page Id from level 1 to level 2

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is completely unclear.
There is no way to "move" "page Id" (which "Id" ???) between "levels".

Comment thread src/jrd/blb.cpp
}

// Get level 1 or level 2 page
inline ULONG getNextLevel1PageId() noexcept

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear as above.

Comment thread src/jrd/blb.cpp
// 5. If no more level 2 pages are available, advance to the next level 1 page,
// read its first level 2 page, and continue modifying subsequent level 2 pages.
// 6. If all pages have been processed but there is still unmodified data, update the <buffer>.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but.. where you get the terminology you used ? I'm sorry, but it is very, very hard to read - I need constantly translate into familiar terms.

There is no level of blob pages. There is level of blob.

There are blob pointer pages and blob data pages, and blob record contains blob data (for level-0 blobs) or array of pointers (for non-0 level blobs).

Comment thread src/jrd/blb.cpp
auto releasePage = [&tdbb, &window](const bool mark)
{
if (mark)
CCH_MARK(tdbb, &window); // Mark as dirty

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Page must be marked before any attempt to change the contents!

Comment thread src/jrd/blb.cpp
void blb::BLB_write(thread_db* tdbb, const offset_t position, const void* buffer, ULONG length)
{
if (!(blb_flags & BLB_temporary) || (blb_flags & BLB_closed))
ERR_post(Arg::Gds(isc_cannot_update_old_blob)); // Cannot update existing blob

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is duplicated many times, worth to move it in to separate routine.

FB_IMPL_MSG(JRD, 1018, dsql_agg_param_not_accum, -204, "42", "000", "Aggregate function input parameters may be referenced only in ON ACCUMULATE DO")
FB_IMPL_MSG(JRD, 1019, dsql_agg_exit_group, -204, "42", "000", "EXIT is not allowed in ON GROUP DO section of aggregate function")
FB_IMPL_MSG(JRD, 1020, dsql_agg_return, -204, "42", "000", "RETURN is not allowed in ON START DO, ON ACCUMULATE DO or ON FINISH DO sections of aggregate function; use EXIT instead")
FB_IMPL_MSG(JRD, 1021, blob_out_of_length_write, -204, "42", "000", "Cannot write to blob. Position @1 is out of blob length @2")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blob_write_out_of_bounds or blob_write_after_the_end ?

Same for the message text.

@hvlad

hvlad commented Jun 18, 2026

Copy link
Copy Markdown
Member

Don't get me wrong - the code itself is not bad, but it is very hard to read, understand and verify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants