"sort" and "distinct" misbehaving if used in wrong order

Temm
Posts: 2
Joined: Fri Jan 29, 2021 7:55 pm

"sort" and "distinct" misbehaving if used in wrong order

Post by Temm »

Take the following query:

run-count:!=0 add-columns:run-count:0 sort:run-count;path distinct:path
According to the documentation of distinct:
distinct:[property-list]

Find unique results (show only one result when there are duplicates) based on the current sort order or the specified semicolon delimited (;) list of properties.

If the properties are not specified, use sort: to specify the properties.

distinct: will first sort the results by the specified properties.
Then walk over these sorted results looking for distinct properties between the current item and the previous item.
Use sort: before this call to override this sorting.
So using sort: before distinct: should mean that we
- first sort by run-count
- then for elements with the same run-count sort by path
- then distinct loops through the sorted list to eliminate consecutive elements with the same path

Since elements are sorted by run-count first and path second, and consecutive elements are then deduplicated by path, my understanding is that this should act as an ad-hoc group-by: distinct filtering within each run-count value. (It would actually hide too many elements if the last element of one run-count had the same path as the first element of the next lower run-count, but oh well.)
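
To make the expectation concrete, here is a minimal Python sketch of the behavior the documentation excerpt describes: sort by the given properties, then walk the sorted list and drop any item whose distinct property equals that of the previous item. This is only a model of the documented description, not Everything's actual implementation. The function name, the dictionary fields, the sample files, and the ascending sort direction are all assumptions made for illustration.

```python
def sort_then_distinct(items, sort_keys, distinct_key):
    """Model of 'sort:<keys> distinct:<key>' as the documentation describes it.

    Hypothetical helper: sorts ascending by sort_keys, then keeps an item
    only if its distinct_key value differs from the previous item's.
    """
    ordered = sorted(items, key=lambda it: tuple(it[k] for k in sort_keys))
    kept = []
    previous = object()  # sentinel that never equals a real property value
    for item in ordered:
        if item[distinct_key] != previous:
            kept.append(item)
        previous = item[distinct_key]
    return kept


# Made-up sample data; none of these files come from the original report.
files = [
    {"name": "a.mp4", "path": "Y:/videos", "run_count": 3},
    {"name": "b.mp4", "path": "Y:/videos", "run_count": 3},
    {"name": "c.txt", "path": "C:/docs", "run_count": 3},
    {"name": "d.txt", "path": "C:/docs", "run_count": 1},
    {"name": "e.txt", "path": "D:/misc", "run_count": 1},
]

result = sort_then_distinct(files, ["run_count", "path"], "path")
names = [item["name"] for item in result]
print(names)  # ['d.txt', 'e.txt', 'c.txt', 'a.mp4']
```

In this model "C:/docs" survives twice, once per run-count value, which is the group-by effect described above. The over-hiding edge case also shows up: if the last path under one run-count equals the first path under the next, the second item is dropped.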

Instead, the query displays very few results, chosen by a logic that I cannot comprehend.

Here is a comparison of the query mentioned above, and the query mentioned above but without distinct:path
[Screenshots: 0026601.png, 0026602.png]
The screenshot of the version without distinct is cut off, but even among the first elements we can spot multiple paths that do not appear even once when distinct:path is added (such as "Y:/ReLive Videos/unknown").

If we instead swap the order of sort and distinct, we see the expected result for that query (which still isn't exactly what we want): elements sorted by run-count, with each path represented only once (rather than once per run-count value).

I can't see how this result makes sense, so I'm filing it as a bug. Please let me know if I just missed something.
NotNull
Posts: 5517
Joined: Wed May 24, 2017 9:22 pm

Re: "sort" and "distinct" misbehaving if used in wrong order

Post by NotNull »

Not entirely sure, but I think distinct: works only on the primary sort, in this case run-count (sort:run-count;path).

So this should give the same results:

run-count:!=0   add-columns:run-count   sort:run-count;path   distinct:

Does the following give you the desired results?

run-count:!=0   add-columns:run-count   sort:path distinct:path  sort:run-count
- or -

run-count:   add-columns:run-count   distinct:path  sort:run-count
void
Developer
Posts: 17153
Joined: Fri Oct 16, 2009 11:31 pm

Re: "sort" and "distinct" misbehaving if used in wrong order

Post by void »

Please try the following search:

run-count:!=0 add-columns:run-count:0 sort:path;run-count distinct:path sort:run-count
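
One way to read this search, assuming the search functions are applied left to right: sort by path (ties broken by run-count), keep the first item of each run of equal paths, then re-sort the survivors by run-count. The Python sketch below models only that reading. It is not Everything's implementation, and the function name, dictionary fields, sample files, and ascending sort direction are assumptions for illustration.

```python
def distinct_path_then_sort(items):
    """Hypothetical model of 'sort:path;run-count distinct:path sort:run-count'."""
    # Step 1: sort by path, then by run-count within each path.
    by_path = sorted(items, key=lambda it: (it["path"], it["run_count"]))

    # Step 2: walk the sorted list and keep one item per run of equal paths.
    unique = []
    last_path = None  # paths are strings, so None never matches a real path
    for item in by_path:
        if item["path"] != last_path:
            unique.append(item)
            last_path = item["path"]

    # Step 3: re-sort the remaining items by run-count (stable sort).
    return sorted(unique, key=lambda it: it["run_count"])


# Made-up sample data; none of these files come from the original report.
sample = [
    {"name": "a.mp4", "path": "Y:/videos", "run_count": 3},
    {"name": "b.mp4", "path": "Y:/videos", "run_count": 3},
    {"name": "c.txt", "path": "C:/docs", "run_count": 3},
    {"name": "d.txt", "path": "C:/docs", "run_count": 1},
    {"name": "e.txt", "path": "D:/misc", "run_count": 1},
]

resorted = [item["name"] for item in distinct_path_then_sort(sample)]
print(resorted)  # ['d.txt', 'e.txt', 'a.mp4']
```

Because all items sharing a path are adjacent after the first sort, each path appears exactly once in the final list. Which item represents a path depends on the secondary sort direction, which is assumed to be ascending here.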
