Fix for Postgres regex and like binary operators #1928

Merged: 5 commits, Jul 17, 2025
10 changes: 9 additions & 1 deletion src/parser/mod.rs
@@ -3478,10 +3478,18 @@ impl<'a> Parser<'a> {
         | BinaryOperator::LtEq
         | BinaryOperator::Eq
         | BinaryOperator::NotEq
+        | BinaryOperator::PGRegexMatch
+        | BinaryOperator::PGRegexIMatch
+        | BinaryOperator::PGRegexNotMatch
+        | BinaryOperator::PGRegexNotIMatch
+        | BinaryOperator::PGLikeMatch
+        | BinaryOperator::PGILikeMatch
+        | BinaryOperator::PGNotLikeMatch
+        | BinaryOperator::PGNotILikeMatch
     ) {
         return parser_err!(
             format!(
-                "Expected one of [=, >, <, =>, =<, !=] as comparison operator, found: {op}"
+                "Expected one of [=, >, <, =>, =<, !=, ~, ~*, !~, !~*, ~~, ~~*, !~~, !~~*] as comparison operator, found: {op}"
             ),
             span.start
         );
60 changes: 46 additions & 14 deletions tests/sqlparser_postgres.rs
@@ -2180,21 +2180,39 @@ fn parse_pg_regex_match_ops() {
         ("!~*", BinaryOperator::PGRegexNotIMatch),
     ];

+    // Match against a single value
     for (str_op, op) in pg_regex_match_ops {
-        let select = pg().verified_only_select(&format!("SELECT 'abc' {} '^a'", &str_op));
+        let select = pg().verified_only_select(&format!("SELECT 'abc' {str_op} '^a'"));
         assert_eq!(
             SelectItem::UnnamedExpr(Expr::BinaryOp {
-                left: Box::new(Expr::Value(
-                    (Value::SingleQuotedString("abc".into())).with_empty_span()
-                )),
+                left: Box::new(Expr::Value(single_quoted_string("abc").with_empty_span(),)),
                 op: op.clone(),
-                right: Box::new(Expr::Value(
-                    (Value::SingleQuotedString("^a".into())).with_empty_span()
-                )),
+                right: Box::new(Expr::Value(single_quoted_string("^a").with_empty_span(),)),
             }),
             select.projection[0]
         );
     }
+
+    // Match against any value from an array
+    for (str_op, op) in pg_regex_match_ops {
+        let select =
+            pg().verified_only_select(&format!("SELECT 'abc' {str_op} ANY(ARRAY['^a', 'x'])"));
+        assert_eq!(
+            SelectItem::UnnamedExpr(Expr::AnyOp {
+                left: Box::new(Expr::Value(single_quoted_string("abc").with_empty_span(),)),
+                compare_op: op.clone(),
+                right: Box::new(Expr::Array(Array {
+                    elem: vec![
+                        Expr::Value(single_quoted_string("^a").with_empty_span()),
+                        Expr::Value(single_quoted_string("x").with_empty_span()),
+                    ],
+                    named: true,
+                })),
+                is_some: false,
+            }),
+            select.projection[0]
+        )
+    }
 }

 #[test]
@@ -2206,21 +2224,35 @@ fn parse_pg_like_match_ops() {
         ("!~~*", BinaryOperator::PGNotILikeMatch),
     ];

+    // Match against a single value
     for (str_op, op) in pg_like_match_ops {
-        let select = pg().verified_only_select(&format!("SELECT 'abc' {} 'a_c%'", &str_op));
+        let select = pg().verified_only_select(&format!("SELECT 'abc' {str_op} 'a_c%'"));
         assert_eq!(
             SelectItem::UnnamedExpr(Expr::BinaryOp {
-                left: Box::new(Expr::Value(
-                    (Value::SingleQuotedString("abc".into())).with_empty_span()
-                )),
+                left: Box::new(Expr::Value(single_quoted_string("abc").with_empty_span(),)),
                 op: op.clone(),
-                right: Box::new(Expr::Value(
-                    (Value::SingleQuotedString("a_c%".into())).with_empty_span()
-                )),
+                right: Box::new(Expr::Value(single_quoted_string("a_c%").with_empty_span(),)),
             }),
             select.projection[0]
         );
     }
+
+    // Match against all values from an array
+    for (str_op, op) in pg_like_match_ops {
+        let select =
Contributor

I'm not sure I fully understand the fix. Is the issue specific to the ALL and ANY functions? The current test doesn't seem to change the list of operators being tested, so it's not clear to me why this test fails without the fix in this PR.

Contributor Author (@solontsev), Jul 8, 2025

Currently, when parsing expressions with the regex and like operators, only literals are supported, although syntax like where a ~ any(array['x']) is also valid (it matches against an array of values using ALL or ANY). I've added more complex expressions to the tests (they fail with the current version) and allowed 8 more operators before the ALL and ANY functions in the parse_infix function:

| BinaryOperator::PGRegexMatch
| BinaryOperator::PGRegexIMatch
| BinaryOperator::PGRegexNotMatch
| BinaryOperator::PGRegexNotIMatch
| BinaryOperator::PGLikeMatch
| BinaryOperator::PGILikeMatch
| BinaryOperator::PGNotLikeMatch
| BinaryOperator::PGNotILikeMatch
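
The allowlist described in this comment can be illustrated with a minimal, self-contained sketch. It is not the sqlparser code itself: the enum below is a trimmed stand-in for the crate's `BinaryOperator`, and the `Plus` variant, the `Display` symbols, and the function name `check_quantified_op` are assumptions made for illustration. Only the error message is copied from the diff above.

```rust
use std::fmt;

/// Trimmed stand-in for sqlparser's `BinaryOperator` (assumption: only the
/// variants discussed in this PR, plus `Plus` as an example of a rejected one).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum BinaryOperator {
    Gt,
    Lt,
    GtEq,
    LtEq,
    Eq,
    NotEq,
    PGRegexMatch,
    PGRegexIMatch,
    PGRegexNotMatch,
    PGRegexNotIMatch,
    PGLikeMatch,
    PGILikeMatch,
    PGNotLikeMatch,
    PGNotILikeMatch,
    Plus,
}

impl fmt::Display for BinaryOperator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Symbols chosen for this sketch; the real crate may print some differently.
        let symbol = match self {
            BinaryOperator::Gt => ">",
            BinaryOperator::Lt => "<",
            BinaryOperator::GtEq => ">=",
            BinaryOperator::LtEq => "<=",
            BinaryOperator::Eq => "=",
            BinaryOperator::NotEq => "<>",
            BinaryOperator::PGRegexMatch => "~",
            BinaryOperator::PGRegexIMatch => "~*",
            BinaryOperator::PGRegexNotMatch => "!~",
            BinaryOperator::PGRegexNotIMatch => "!~*",
            BinaryOperator::PGLikeMatch => "~~",
            BinaryOperator::PGILikeMatch => "~~*",
            BinaryOperator::PGNotLikeMatch => "!~~",
            BinaryOperator::PGNotILikeMatch => "!~~*",
            BinaryOperator::Plus => "+",
        };
        f.write_str(symbol)
    }
}

/// Hypothetical helper: accepts an operator only if it may precede
/// ANY / ALL / SOME, mirroring the `matches!`-style allowlist in the diff.
fn check_quantified_op(op: &BinaryOperator) -> Result<(), String> {
    let allowed = matches!(
        op,
        BinaryOperator::Gt
            | BinaryOperator::Lt
            | BinaryOperator::GtEq
            | BinaryOperator::LtEq
            | BinaryOperator::Eq
            | BinaryOperator::NotEq
            | BinaryOperator::PGRegexMatch
            | BinaryOperator::PGRegexIMatch
            | BinaryOperator::PGRegexNotMatch
            | BinaryOperator::PGRegexNotIMatch
            | BinaryOperator::PGLikeMatch
            | BinaryOperator::PGILikeMatch
            | BinaryOperator::PGNotLikeMatch
            | BinaryOperator::PGNotILikeMatch
    );
    if allowed {
        Ok(())
    } else {
        Err(format!(
            "Expected one of [=, >, <, =>, =<, !=, ~, ~*, !~, !~*, ~~, ~~*, !~~, !~~*] as comparison operator, found: {op}"
        ))
    }
}
```

With this sketch, `check_quantified_op(&BinaryOperator::PGRegexMatch)` returns `Ok(())`, while an operator outside the list, such as the hypothetical `Plus`, yields the error string.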

Contributor

  • allowed 8 more operators before the ALL and ANY functions in the parse_infix function.

Are the added operators covered by the tests? If not, could we add tests covering them?

Contributor Author

Are the added operators covered by the tests? If not, could we add tests covering them?

Yeah, I've added the tests as well. Maybe it wasn't ideal to test 2 cases in each loop iteration, so I've modified it a bit: the previous loop matches against a single value, and I've added one more loop that matches against an array of values. Does it look better now?

Also, to wrap up: the problem is that before the ANY, ALL, or SOME functions, the parser currently allows only 6 basic binary operators (gt, lt, gteq, lteq, eq, noteq), but those 8 additional operators are also valid. So the fix basically just adds them to the allowlist, and the tests for the match and like operators are modified to make sure the parser doesn't return an error in valid cases. I haven't added exhaustive test cases; I just chose the ANY function for the match tests and the ALL function for the like tests.
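
For readers who want the exhaustive coverage that this comment leaves out, one possible approach is to generate a query for every operator and quantifier pair. The sketch below is an assumption, not part of the PR: `quantified_queries`, `PG_REGEX_OPS`, and `PG_LIKE_OPS` are hypothetical names, and it only builds the SQL strings without calling the parser.

```rust
/// Operator spellings taken from the tests in this PR.
const PG_REGEX_OPS: [&str; 4] = ["~", "~*", "!~", "!~*"];
const PG_LIKE_OPS: [&str; 4] = ["~~", "~~*", "!~~", "!~~*"];

/// Hypothetical helper: builds one query per (operator, quantifier) pair,
/// with operators in the outer loop and ANY, ALL, SOME in the inner loop.
fn quantified_queries(ops: &[&str], pattern: &str) -> Vec<String> {
    let quantifiers = ["ANY", "ALL", "SOME"];
    let mut queries = Vec::new();
    for op in ops {
        for quantifier in quantifiers {
            queries.push(format!(
                "SELECT 'abc' {op} {quantifier}(ARRAY['{pattern}'])"
            ));
        }
    }
    queries
}
```

Each generated string could then be passed to the parser inside the existing test functions, for example through `pg().verified_only_select`, assuming the SOME form round-trips the same way as ANY and ALL.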

+            pg().verified_only_select(&format!("SELECT 'abc' {str_op} ALL(ARRAY['a_c%'])"));
+        assert_eq!(
+            SelectItem::UnnamedExpr(Expr::AllOp {
+                left: Box::new(Expr::Value(single_quoted_string("abc").with_empty_span(),)),
+                compare_op: op.clone(),
+                right: Box::new(Expr::Array(Array {
+                    elem: vec![Expr::Value(single_quoted_string("a_c%").with_empty_span())],
+                    named: true,
+                })),
+            }),
+            select.projection[0]
+        )
+    }
 }

 #[test]