Description
Let's go through the MySQL documentation pages and make sure even the complex SELECT queries are supported by the SQLite integration plugin:
- https://dev.mysql.com/doc/refman/8.0/en/select.html
- https://dev.mysql.com/doc/refman/8.0/en/join.html
- https://dev.mysql.com/doc/refman/8.0/en/union.html
- https://dev.mysql.com/doc/refman/8.0/en/intersect.html
- https://dev.mysql.com/doc/refman/8.0/en/except.html
- https://dev.mysql.com/doc/refman/8.0/en/with.html
This likely means rewriting execute_select as more of a grammar parser or a state machine that reasons about each encountered token. In contrast, the current approach consumes all tokens unless a tactical adjustment applies. With a token-by-token parser, we could reuse the SELECT logic for WITH, UNIONs, subqueries, etc. Currently we cannot, because the execute_select method assumes it acts on an entire query, not on a part of it.
The implementation could look like this:
// Parse WITH
if ( $next_token->is_keyword( 'WITH' ) ) {
    $this->consume_with_clause();
}
/**
 * Processes the WITH clause (https://dev.mysql.com/doc/refman/8.0/en/with.html):
 * WITH [RECURSIVE]
 *     cte_name [(col_name [, col_name] ...)] AS (subquery)
 *     [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ...
 */
protected function consume_with_clause() {
    // Consume the WITH token itself.
    $this->rewriter->consume();
    // Peek first, so a CTE name is not mistaken for RECURSIVE.
    if ( $this->rewriter->peek()->is_keyword( 'RECURSIVE' ) ) {
        $this->rewriter->consume();
    }
    while ( true ) {
        $table_alias    = $this->rewriter->consume();
        $column_aliases = null;
        // Peek so that AS stays in the stream when there is no column list.
        if ( $this->rewriter->peek()->is_operator( '(' ) ) {
            $column_aliases = [];
            // ...parse column aliases...
        }
        $this->rewriter->consume_assert_is_keyword( 'AS' );
        $this->consume_sub_query();
        if ( ! $this->rewriter->peek()->is_operator( ',' ) ) {
            break;
        }
        // Consume the comma before parsing the next CTE.
        $this->rewriter->consume();
    }
}
/**
 * Processes the SELECT statement (https://dev.mysql.com/doc/refman/8.0/en/select.html)
 * SELECT
 *     [ALL | DISTINCT | DISTINCTROW ]
 *     [HIGH_PRIORITY]
 *     [STRAIGHT_JOIN]
 *     [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
 *     [SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
 *     select_expr [, select_expr] ...
 */
protected function consume_select_query() {
    $this->rewriter->consume_assert_is_keyword( 'SELECT' );
    $token = $this->rewriter->peek();
    if ( $token->is_keyword( [ 'ALL', 'DISTINCT', 'DISTINCTROW' ] ) ) {
        $this->rewriter->consume();
        $token = $this->rewriter->peek();
    }
    if ( $token->is_keyword( 'HIGH_PRIORITY' ) ) {
        // MySQL-only modifier: drop it from the rewritten query.
        $this->rewriter->skip();
        $token = $this->rewriter->peek();
    }
    // ... keep going token by token, don't just skip over things like we do now
    // with a while loop ...
    // $is_subquery is a placeholder for "the next token opens a nested query".
    if ( $is_subquery ) {
        $this->consume_sub_query();
    }
    // inevitably at some point:
    if ( $token->is_keyword( 'UNION' ) ) {
        // Consume UNION first; the next call asserts that SELECT comes next.
        // An optional ALL | DISTINCT would also need handling here.
        $this->rewriter->consume();
        $this->consume_select_query();
    }
}
protected function consume_sub_query() {
    // ... consume a nested query ...
    // ... can it be just a SELECT query? Or can it also be something else? ...
    // ... can it have a WITH clause? ...
    // ...
    // inevitably at some point:
    $this->consume_select_query();
}
For starters, just migrating to a state machine approach would be more than enough, as it would unlock support for UNIONs and make it easy to ignore tokens like HIGH_PRIORITY or SQL_SMALL_RESULT.
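To make that first step concrete, here is a minimal, self-contained sketch of a token-by-token SELECT consumer. It is only an illustration under simplifying assumptions: Sketch_Rewriter, sketch_consume_select and SKETCH_IGNORED_MODIFIERS are hypothetical names, tokens are plain strings rather than the plugin's token objects, and parentheses (and therefore subqueries) are not tracked. The rewriter here only mimics the peek/consume/skip idea used in the pseudocode above.

```php
<?php
// Hypothetical stand-in for the plugin's rewriter: tokens are plain strings here.
final class Sketch_Rewriter {
    private array $input;
    private int $pos = 0;
    public array $output = [];

    public function __construct( array $tokens ) {
        $this->input = $tokens;
    }

    // Look at the next token without moving past it.
    public function peek(): ?string {
        return $this->input[ $this->pos ] ?? null;
    }

    // Copy the next token to the output and move past it.
    public function consume(): ?string {
        $token = $this->peek();
        if ( null !== $token ) {
            $this->output[] = $token;
            $this->pos++;
        }
        return $token;
    }

    // Move past the next token without copying it to the output.
    public function skip(): void {
        $this->pos++;
    }
}

// MySQL-only SELECT modifiers that this sketch drops (an assumed, partial list).
const SKETCH_IGNORED_MODIFIERS = [
    'HIGH_PRIORITY', 'SQL_SMALL_RESULT', 'SQL_BIG_RESULT',
    'SQL_BUFFER_RESULT', 'SQL_NO_CACHE',
];

function sketch_consume_select( Sketch_Rewriter $rewriter ): void {
    if ( 'SELECT' !== strtoupper( (string) $rewriter->peek() ) ) {
        throw new RuntimeException( 'Expected SELECT' );
    }
    $rewriter->consume();

    // State 1: modifiers directly after SELECT are skipped, not copied.
    while ( in_array( strtoupper( (string) $rewriter->peek() ), SKETCH_IGNORED_MODIFIERS, true ) ) {
        $rewriter->skip();
    }

    // State 2: copy the rest of this SELECT, up to UNION or the end of input.
    while ( null !== $rewriter->peek() && 'UNION' !== strtoupper( $rewriter->peek() ) ) {
        $rewriter->consume();
    }

    // State 3: UNION [ALL | DISTINCT] must be followed by another SELECT.
    if ( null !== $rewriter->peek() ) {
        $rewriter->consume(); // UNION
        if ( in_array( strtoupper( (string) $rewriter->peek() ), [ 'ALL', 'DISTINCT' ], true ) ) {
            $rewriter->consume();
        }
        sketch_consume_select( $rewriter );
    }
}

$rewriter = new Sketch_Rewriter(
    [ 'SELECT', 'HIGH_PRIORITY', 'a', 'FROM', 't1', 'UNION', 'ALL', 'SELECT', 'SQL_SMALL_RESULT', 'b', 'FROM', 't2' ]
);
sketch_consume_select( $rewriter );
echo implode( ' ', $rewriter->output ), "\n";
// Prints: SELECT a FROM t1 UNION ALL SELECT b FROM t2
```

Because the function handles one SELECT at a time and recurses after UNION, the same entry point could later be called from a WITH or subquery handler, which is the kind of reuse the current whole-query approach prevents.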