Skip to content
This repository has been archived by the owner on Jan 10, 2023. It is now read-only.

Distinct and OrderBy Unindexed field #269

Open
falahati opened this issue Oct 12, 2020 · 3 comments
Open

Distinct and OrderBy Unindexed field #269

falahati opened this issue Oct 12, 2020 · 3 comments

Comments

@falahati
Copy link

falahati commented Oct 12, 2020

A simple query as follow works great:

await db
    .select(lf.fn.distinct(table.id))
    .from(table)
    .orderBy(table.id, lf.Order.ASC)
    .limit(10)
    .exec();

or even by an string field:

await db
    .select(lf.fn.distinct(table.id))
    .from(table)
    .orderBy(table.title, lf.Order.ASC)
    .limit(10)
    .exec();

But ordering by date doesn't work in my case:

await db
    .select(lf.fn.distinct(table.id))
    .from(table)
    .orderBy(table.created, lf.Order.ASC)
    .limit(10)
    .exec();

By not working I mean it gives me a result that doesn't change if I modify the order to DESC or ASC. It just returns a static result as if the date created is zero or the same number for all rows. Checked manually and each row does in fact have a different value.
Removing the distinct function fixes the problem.

If you need any additional information, please ask.

@falahati
Copy link
Author

falahati commented Oct 12, 2020

Additionally, when in a join, distinct also breaks the orderBy when ordered by a field in a table that is not used in distinct argument:

await db
   .select(lf.fn.distinct(table.id))
   .from(table, secondTable)
   .where(secondTable.parentId.eq(table.id))
   .orderBy(secondTable.id, lf.Order.ASC)
   .limit(10)
   .exec();

This will yield the same result as when ordering by created in the main table.

@freshp86
Copy link
Contributor

By not working I mean it gives me a result that doesn't change if I modify the order to DESC or ASC. It just returns a static result as if the date created is zero or the same number for all rows.

Could you post the code that registers the db schema? Can you verify that

  • created is registered as lf.Type.DATE_TIME?
  • when you populate these fields you are actually adding JS Date objects and not something else?

Removing the distinct function fixes the problem.

That sounds pretty interesting. Do you need to use distinct() though, if table.id is already unique (I assume?). Either way, does an equivalent query in SQLlite returns the results you expect?

@falahati
Copy link
Author

falahati commented Oct 12, 2020

Well, I just dug a little deeper into this, and here is what I have found:

  • If there is a distinct function involved, sorting can only be done by an indexed field. Fields without index just won't work for sorting. This needs to be documented if it is actually intentional.

  • If there is a distinct function involved in a multi-table query, sorting only works on the primary key of the table involved in the distinct function and won't work for any other field, indexed or not.

https://codepen.io/s-flhti/pen/mdEebgP

I am wondering what would happen when it is sorted by an aggregated function on a groupBy table1.parentId result set in a multi-table query since there is no index for the aggregated function. Like this:

await db
   .select(table.id)
   .from(table, table2)
   .where(table2.parentId.eq(table.id))
   .groupBy(table2.parentId)
   .orderBy(lf.fn.min(table2.integerField), lf.Order.ASC)
   .limit(10)
   .exec();

But that's another discussion.

EDIT

created is registered as lf.Type.DATE_TIME?

yes

when you populate these fields you are actually adding JS Date objects and not something else?

yes, I have a Typescript wrapper on top of LoveFields and it is strongly typed to the interfaces I have, so I can't really make a mistake like this. Unless I explicitly cast the value to any. I wrote this wrapper to save me especially from these sorts of mistakes.

Do you need to use distinct() though, if table.id is already unique (I assume?).

Well, it is complicated. I don't really need to use distinct on table1.id. I just did that as part of the example posting here. In reality, I have a dynamic list of filters that might happen on table1, or table2, or both. And since each row in table1 has multiple rows in table2, I need to use distinct on table2.parentId to get a clean and unique list of items. I do also have sorting options that might happen on table1 or table2 so I need to keep that in mind too. I was hoping to be able to do all of these filters and sorting in one query along with pagination (skip, limit) but apparently, it is impossible for now. So I have decided to do it in 3 different queries, and do the sorting and pagination later on the JS side. Lovefield is fast enough that I can afford to do so with around 12k rows in both tables in less than 100ms. So no worries.

Either way, does an equivalent query in SQLlite returns the results you expect?

SELECT DISTINCT(t1.id) FROM table1 as t1 ORDER BY t1.title DESC works as expected and needs no index to return the valid resultset.

SELECT DISTINCT(t1.id) FROM table1 as t1 INNER JOIN table2 as t2 ON t2.parentid = t1.id ORDER BY t2.created DESC also works as expected.

So both these two limitations are apparently only applicable to Lovefield and SQLite has no such rules and limitations regarding distinct and indexes, or distinct and inner join.

Tested with https://sqliteonline.com/ with this data:
sqlite.zip

@falahati falahati changed the title Distinct and OrderBy Date Distinct and OrderBy Unindexed field Oct 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants