Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VARINT and numerics type conversion #626

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions src/pgduckdb_types.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -337,6 +337,16 @@ ConvertNumeric(const duckdb::Value &ddb_value, idx_t scale, NumericVar &result)
static Datum
ConvertNumericDatum(const duckdb::Value &value) {
auto value_type_id = value.type().id();

// Special handle duckdb VARINT type.
if (value.type().id() == duckdb::LogicalTypeId::VARINT) {
std::string value_str = value.ToString();
Datum pg_numeric = DirectFunctionCall3(numeric_in, CStringGetDatum(value_str.c_str()),
ObjectIdGetDatum(InvalidOid), Int32GetDatum(-1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this conversion I appreciate the simplicity that the string conversion gives us. Especially since VARINTs are fairly rare types and already a pretty expensive anyway. So converting making the conversion to Postgres a bit more expensive than strictly necessary seems acceptable. This does need a PostgresFunctionGuard though, to keep Postgres C code a bit more physically separated from the C++ code let's move this to logic to the src/pg/types.cpp file. (we're steadily moving more code to the src/pg directory, but we're not done with that migration yet)

Copy link
Contributor Author

@dentiny dentiny Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does need a PostgresFunctionGuard though, to keep Postgres C code a bit more physically separated from the C++ code

Make sense! Function guard added.

Just to confirm, I don't think we need type conversion for numerics <-> VARINT, right?
Because the entrypoint is in pg and we have other mapping (double) in duckdb.

let's move this to logic to the src/pg/types.cpp file.

I could understand the motivation (i.e. place pg related code under pg directory for better maintainability, namespace, etc), but I'm not sure which part to place, should I move all functions there?

As you might be aware of, I'm also working on other type conversions; do you think it ok after I finish them? (I promise I will prioritize them). I'm willing to help migration later on :)

Copy link
Contributor Author

@dentiny dentiny Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this conversion I appreciate the simplicity that the string conversion gives us. Especially since VARINTs are fairly rare types and already a pretty expensive anyway

Yeah I implement via pg native function mostly for code simplicity (but seems other types like TIME could be more easily implement, another story). I leave a comment for the motivation, and what we could improve.

return pg_numeric;
}

// Special handle duckdb DOUBLE TYPE.
if (value_type_id == duckdb::LogicalTypeId::DOUBLE) {
return ConvertDoubleDatum(value);
}
Expand Down Expand Up @@ -1112,6 +1122,8 @@ GetPostgresDuckDBType(const duckdb::LogicalType &type) {
return NUMERICOID;
case duckdb::LogicalTypeId::UUID:
return UUIDOID;
case duckdb::LogicalTypeId::VARINT:
return NUMERICOID;
case duckdb::LogicalTypeId::LIST: {
const duckdb::LogicalType *duck_type = &type;
while (duck_type->id() == duckdb::LogicalTypeId::LIST) {
Expand Down
10 changes: 6 additions & 4 deletions test/regression/expected/test_all_types.out
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ SELECT * FROM duckdb.query($$
FROM test_all_types()
SELECT * exclude(
tinyint, -- PG14 outputs this differently currently
varint,
TIME,
time_tz,
bit,
Expand All @@ -28,7 +27,7 @@ SELECT * exclude(
nested_int_array, -- The nested array has different lengths, which is not possible in PG
)
$$)
-[ RECORD 1 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 1 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool | f
smallint | -32768
int | -2147483648
Expand All @@ -39,6 +38,7 @@ utinyint | 0
usmallint | 0
uint | 0
ubigint | 0
varint | -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
date | 07-14-5881580
timestamp | Sun Jan 10 08:01:49.551616 294247
timestamp_s | Sun Jan 10 08:01:49.551616 294247
Expand All @@ -60,7 +60,7 @@ double_array | {}
date_array | {}
timestamp_array | {}
varchar_array | {}
-[ RECORD 2 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 2 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool | t
smallint | 32767
int | 2147483647
Expand All @@ -71,6 +71,7 @@ utinyint | 255
usmallint | 65535
uint | 4294967295
ubigint | 18446744073709551615
varint | 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
date | 07-10-5881580
timestamp | Sun Jan 10 04:00:54.775806 294247
timestamp_s | Sun Jan 10 04:00:54 294247
Expand All @@ -92,7 +93,7 @@ double_array | {42,NaN,Infinity,-Infinity,NULL,-42}
date_array | {01-01-1970,07-11-5881580,07-13-5881580,NULL,05-12-2022}
timestamp_array | {"Thu Jan 01 00:00:00 1970","Sun Jan 10 04:00:54.775807 294247","Sun Jan 10 04:00:54.775809 294247",NULL,"Thu May 12 16:23:45 2022"}
varchar_array | {🦆🦆🦆🦆🦆🦆,goose,NULL,""}
-[ RECORD 3 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 3 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool |
smallint |
int |
Expand All @@ -103,6 +104,7 @@ utinyint |
usmallint |
uint |
ubigint |
varint |
date |
timestamp |
timestamp_s |
Expand Down
1 change: 0 additions & 1 deletion test/regression/sql/test_all_types.sql
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ SELECT * FROM duckdb.query($$
FROM test_all_types()
SELECT * exclude(
tinyint, -- PG14 outputs this differently currently
varint,
TIME,
time_tz,
bit,
Expand Down
Loading