Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VARINT and numerics type conversion #626

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions src/pgduckdb_types.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -334,9 +334,29 @@ ConvertNumeric(const duckdb::Value &ddb_value, idx_t scale, NumericVar &result)
}
}

// A wrapper function to convert a string to a numeric value, used for `PostgresFunctionGuard` macro.
static Datum
ConvertStringToNumerics(Datum str, Datum typelem, Datum typmod) {
Datum pg_numeric = DirectFunctionCall3(numeric_in, str, typelem, typmod);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise compilation failure

src/pgduckdb_types.cpp: In function 'Datum pgduckdb::ConvertNumericDatum(const duckdb::Value&)':
include/pgduckdb/pgduckdb_utils.hpp:106:68: error: no matching function for call to '__PostgresFunctionGuard__<Datum (*)(FunctionCallInfo), numeric_in>(const char [11], Datum, Datum, Datum)'
  106 |         pgduckdb::__PostgresFunctionGuard__<decltype(&FUNC), &FUNC>(#FUNC, ##__VA_ARGS__)
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
src/pgduckdb_types.cpp:346:42: note: in expansion of macro 'PostgresFunctionGuard'
  346 |                 const Datum pg_numeric = PostgresFunctionGuard(numeric_in, CStringGetDatum(value_str.c_str()),
      |                                          ^~~~~~~~~~~~~~~~~~~~~
include/pgduckdb/pgduckdb_utils.hpp:83:1: note: candidate: 'template<class Func, Func func, class ... FuncArgs> typename std::invoke_result<Func, FuncArgs ...>::type pgduckdb::__PostgresFunctionGuard__(const char*, FuncArgs ...)'
   83 | __PostgresFunctionGuard__(const char *func_name, FuncArgs... args) {
      | ^~~~~~~~~~~~~~~~~~~~~~~~~

return pg_numeric;
}

static Datum
ConvertNumericDatum(const duckdb::Value &value) {
auto value_type_id = value.type().id();

// Special handle duckdb VARINT type.
if (value.type().id() == duckdb::LogicalTypeId::VARINT) {
// The performant way to handle the translation is to parse VARINT out, here we leverage string conversion and
// parsing mainly for code simplicity.
const std::string value_str = value.ToString();
const Datum pg_numeric =
PostgresFunctionGuard(ConvertStringToNumerics, /*str=*/CStringGetDatum(value_str.c_str()),
/*typelen=*/ObjectIdGetDatum(InvalidOid), /*typmod=*/Int32GetDatum(-1));
Copy link
Collaborator

@JelteF JelteF Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes most sense to create a function in pg/types.cpp

Datum
StringToNumeric(char* str) {
	// ...
}

And then have that function call the actual function with a PostgresFunctionGuard and name that actual function StringToNumeric_C. Similarly to how GetBaseDuckColumnType is implemented.

static Oid
GetBaseDuckColumnType_C(Oid attribute_type_oid) {
Oid typoid = attribute_type_oid;
if (get_typtype(attribute_type_oid) == TYPTYPE_DOMAIN) {
/* It is a domain type that needs to be reduced to its base type */
typoid = getBaseType(attribute_type_oid);
} else if (type_is_array(attribute_type_oid)) {
Oid eltoid = get_base_element_type(attribute_type_oid);
if (OidIsValid(eltoid) && get_typtype(eltoid) == TYPTYPE_DOMAIN) {
/* When the member type of an array is domain, you need to build a base array type */
typoid = get_array_type(getBaseType(eltoid));
}
}
return typoid;
}
Oid
GetBaseDuckColumnType(Oid attribute_type_oid) {
return PostgresFunctionGuard(GetBaseDuckColumnType_C, attribute_type_oid);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Updated.

return pg_numeric;
}

// Special handle duckdb DOUBLE TYPE.
if (value_type_id == duckdb::LogicalTypeId::DOUBLE) {
return ConvertDoubleDatum(value);
}
Expand Down Expand Up @@ -1112,6 +1132,8 @@ GetPostgresDuckDBType(const duckdb::LogicalType &type) {
return NUMERICOID;
case duckdb::LogicalTypeId::UUID:
return UUIDOID;
case duckdb::LogicalTypeId::VARINT:
return NUMERICOID;
case duckdb::LogicalTypeId::LIST: {
const duckdb::LogicalType *duck_type = &type;
while (duck_type->id() == duckdb::LogicalTypeId::LIST) {
Expand Down
10 changes: 6 additions & 4 deletions test/regression/expected/test_all_types.out
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ SELECT * FROM duckdb.query($$
FROM test_all_types()
SELECT * exclude(
tinyint, -- PG14 outputs this differently currently
varint,
TIME,
time_tz,
bit,
Expand All @@ -28,7 +27,7 @@ SELECT * exclude(
nested_int_array, -- The nested array has different lengths, which is not possible in PG
)
$$)
-[ RECORD 1 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 1 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool | f
smallint | -32768
int | -2147483648
Expand All @@ -39,6 +38,7 @@ utinyint | 0
usmallint | 0
uint | 0
ubigint | 0
varint | -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
date | 07-14-5881580
timestamp | Sun Jan 10 08:01:49.551616 294247
timestamp_s | Sun Jan 10 08:01:49.551616 294247
Expand All @@ -60,7 +60,7 @@ double_array | {}
date_array | {}
timestamp_array | {}
varchar_array | {}
-[ RECORD 2 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 2 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool | t
smallint | 32767
int | 2147483647
Expand All @@ -71,6 +71,7 @@ utinyint | 255
usmallint | 65535
uint | 4294967295
ubigint | 18446744073709551615
varint | 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
date | 07-10-5881580
timestamp | Sun Jan 10 04:00:54.775806 294247
timestamp_s | Sun Jan 10 04:00:54 294247
Expand All @@ -92,7 +93,7 @@ double_array | {42,NaN,Infinity,-Infinity,NULL,-42}
date_array | {01-01-1970,07-11-5881580,07-13-5881580,NULL,05-12-2022}
timestamp_array | {"Thu Jan 01 00:00:00 1970","Sun Jan 10 04:00:54.775807 294247","Sun Jan 10 04:00:54.775809 294247",NULL,"Thu May 12 16:23:45 2022"}
varchar_array | {🦆🦆🦆🦆🦆🦆,goose,NULL,""}
-[ RECORD 3 ]---+-------------------------------------------------------------------------------------------------------------------------------------
-[ RECORD 3 ]---+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bool |
smallint |
int |
Expand All @@ -103,6 +104,7 @@ utinyint |
usmallint |
uint |
ubigint |
varint |
date |
timestamp |
timestamp_s |
Expand Down
1 change: 0 additions & 1 deletion test/regression/sql/test_all_types.sql
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ SELECT * FROM duckdb.query($$
FROM test_all_types()
SELECT * exclude(
tinyint, -- PG14 outputs this differently currently
varint,
TIME,
time_tz,
bit,
Expand Down