Duplicate showing when it's not supposed to

Discussion related to "Everything" 1.5 Alpha.
anmac1789
Posts: 668
Joined: Mon Aug 24, 2020 1:16 pm

Duplicate showing when it's not supposed to

Post by anmac1789 »

There seems to be a problem here

1. This is the default view for this picture:



2. This is the picture after applying a dupe:
[Attachment: default view 2.png]

3. This is the default view with multiple pictures search:
[Attachment: default view multiple pics.png]

When I apply the dupe function dupe:dc;dm;da, it still shows "Yw2sX8-vJK86ENJWaYYN1b7Ykg0=PHOTO-2019-11-09-10-40-42" in J:\photos app\, even though this picture shouldn't appear when compared to picture 2.

[Attachment: filtered picture.png]
This is one of the reasons why I had to move to Excel: using =TEXTJOIN, it was able to match duplicates correctly, although with messier colours and a worse GUI experience. Ask me for the Excel workbook through PM and I will happily illustrate the issue.
Last edited by void on Fri Oct 07, 2022 1:40 am, edited 3 times in total.
Reason: moved to Everything 1.5 forum
anmac1789
Posts: 668
Joined: Mon Aug 24, 2020 1:16 pm

Re: Duplicate showing when it's not supposed to

Post by anmac1789 »

Here is picture 1. Sorry about that; I don't know why it won't let me attach more than 3 pictures.

1.
[Attachment: default view.png]
void
Developer
Posts: 16672
Joined: Fri Oct 16, 2009 11:31 pm

Re: Duplicate showing when it's not supposed to

Post by void »

"Yw2sX8-vJK86ENJWaYYN1b7Ykg0=PHOTO-2019-11-09-10-40-42" in J:\photos app\ even though this picture shouldn't when compared to picture 2
Because the date created, date modified and date accessed are the same as those of another file in your results.
Please try sorting by date created to confirm.

I think you are expecting the name and size to be included when searching for duplicates.
However, you have only specified dc;dm;da in your dupe: search.



Notes:
I am going to assume you have disabled date accessed on your system.
Loading the icon or thumbnail would likely cause the date accessed to change.

dupe:dc;dm;da will find duplicates for date created, date modified and date accessed down to 100-nanosecond precision, even though Everything only shows seconds in the result list.
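
To illustrate the precision point, here is a minimal Python sketch. It assumes the timestamps are Windows FILETIME values (100-nanosecond ticks since 1601-01-01 UTC); the two tick values are made up for illustration and are not taken from the screenshots.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_seconds_text(ticks: int) -> str:
    """Format a tick count the way a seconds-only column would show it."""
    dt = FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
    return dt.strftime("%Y-%m-%d %H:%M:%S")

# Hypothetical values: both fall within the same second, half a second apart.
a = 132_000_000_000_000_000
b = a + 5_000_000  # 5,000,000 ticks of 100 ns = 0.5 s

same_when_displayed = filetime_to_seconds_text(a) == filetime_to_seconds_text(b)
same_at_full_precision = a == b
```

Two files like these would look identical in a seconds-only column, yet a comparison at full precision would not treat them as duplicates.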

If you want to find duplicate content, please try the following search:
ext:jpg dupe:size;first-256-bytes;sha256
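
As a rough illustration of what a size, then first-256-bytes, then SHA-256 comparison means, here is a Python sketch. It is not how Everything implements dupe: internally (that is not described in this thread); the function name find_content_duplicates is hypothetical, and the sketch only shows the general idea of narrowing candidates with cheap keys before hashing whole files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_content_duplicates(paths):
    """Group files by size, then first 256 bytes, then SHA-256 of the content."""
    by_size = defaultdict(list)
    for p in map(Path, paths):
        by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        by_head = defaultdict(list)
        for p in same_size:
            with p.open("rb") as f:
                by_head[f.read(256)].append(p)
        for same_head in by_head.values():
            if len(same_head) < 2:
                continue
            # Only now pay for a full read of each remaining candidate.
            by_hash = defaultdict(list)
            for p in same_head:
                by_hash[hashlib.sha256(p.read_bytes()).hexdigest()].append(p)
            groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups
```

Dates play no part here, so two copies with different timestamps would still be grouped together as long as their bytes match.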

I do not recommend using dates to reliably find duplicated files.
Although, it can be useful to give an instant guide to possibly duplicated files.
In which case, using just one date should be enough, please try the following search:
dupe:size;dm
-or-
dupe:size;name;dm
anmac1789
Posts: 668
Joined: Mon Aug 24, 2020 1:16 pm

Re: Duplicate showing when it's not supposed to

Post by anmac1789 »

Because the date created, date modified and date accessed are the same as those of another file in your results.
Please try sorting by date created to confirm.

I think you are expecting the name and size to be included when searching for duplicates.
However, you have only specified dc;dm;da in your dupe: search.
Well, that's the idea. I don't want to search by name, because the date and time of that one file are off by 1 hour, so I want it to be excluded. But because it shares its date and time with another file, it shows up. The only way around this that I can think of is some kind of conditional formatting, similar to what Excel does.
Notes:
I am going to assume you have disabled date accessed on your system.
Loading the icon or thumbnail would likely cause the date accessed to change.
I'm not sure what you mean by this; the date accessed dupes did show up correctly.
I do not recommend using dates to reliably find duplicated files.
Although, it can be useful to give an instant guide to possibly duplicated files.
In which case, using just one date should be enough, please try the following search:
dupe:size;dm
-or-
dupe:size;name;dm
I've literally tried tens of combinations of date modified, size and filename length, and played around with almost all of the properties. I'm not sure how Excel is finding better duplicates than Everything :(

Here is a screenshot from Excel. Mind you, the Excel list still needs some more formatting to illustrate the better dupe finding properly.
[Attachment: excel dupes.png]
As you can see in the screenshot, the unhighlighted cells in column G are the ones that Everything should have excluded. Column A is the filename, B is the parent path, C is the size, and D, E and F are date created, date modified and date accessed respectively. Column G is the helper column, =TEXTJOIN("||",TRUE,C1:F1), which joins the values from columns C to F, separated by two | characters (Shift+backslash).
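
For readers without the workbook, the helper-column approach can be sketched in Python roughly as follows. The rows are hypothetical placeholders rather than the real file list, and helper_key is an illustrative name; the dates are kept as strings rounded to the second, which is assumed to match what the Excel cells hold.

```python
from collections import Counter

def helper_key(size, dc, dm, da, sep="||"):
    """Mimic =TEXTJOIN("||",TRUE,C1:F1): join the non-empty values with sep."""
    return sep.join(str(v) for v in (size, dc, dm, da) if v not in ("", None))

# Hypothetical rows: (name, size, date created, date modified, date accessed).
rows = [
    ("a.jpg", 8238, "2019-11-08 22:08:10", "2019-11-08 22:08:10", "2019-11-08 22:08:10"),
    ("b.jpg", 8238, "2019-11-08 22:08:10", "2019-11-08 22:08:10", "2019-11-08 22:08:10"),
    # Same three dates as above, but a different size.
    ("c.jpg", 9001, "2019-11-08 22:08:10", "2019-11-08 22:08:10", "2019-11-08 22:08:10"),
]

keys = [helper_key(*row[1:]) for row in rows]
counts = Counter(keys)

# A row is highlighted when its helper key occurs more than once.
flagged = [row[0] for row, key in zip(rows, keys) if counts[key] > 1]
```

Because size is part of the key, c.jpg is left out here, whereas a comparison on the three dates alone would group all three rows.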
void
Developer
Posts: 16672
Joined: Fri Oct 16, 2009 11:31 pm

Re: Duplicate showing when it's not supposed to

Post by void »

In Excel, it looks like you are finding duplicates on size, date created (to the second), date modified (to the second) and date accessed (to the second).

Could you please give an example where dupe:size;dm is not giving the exact same results?
What files is Everything missing, or reporting incorrectly?
anmac1789
Posts: 668
Joined: Mon Aug 24, 2020 1:16 pm

Re: Duplicate showing when it's not supposed to

Post by anmac1789 »

Namely, these two filenames. It worked for these two, but with a greater number of files that have this exact same problem it somehow fails for other pictures.

Yw2sX8-vJK86ENJWaYYN1b7Ykg0=PHOTO-2019-11-09-10-40-42.jpg
yVLCjLcsj2VARrK99fcvFIsQAKc=PHOTO-2019-11-09-10-41-20.jpg

That's what I wanted: to find duplicates based on all 4 properties (to the second for dates and times).

See the example below.
[Attachment: example duplicate.png]
void
Developer
Posts: 16672
Joined: Fri Oct 16, 2009 11:31 pm

Re: Duplicate showing when it's not supposed to

Post by void »

Thanks for the images.

What was the dupe command used in the first image?
-It looks like it is still dupe:dc;dm;da
These results are expected for the dupe:dc;dm;da command.
I think you are expecting size to be included when finding duplicates?
dupe:dc;dm;da will not include the size when finding duplicates.


The second image shows dupe:date-accessed with the expected results.
That is, results that share the same date accessed.
I think you are expecting size to be included when finding duplicates?



Could you please give an example where dupe:size;dm is not giving the exact same results as Excel?
What files is Everything missing, or reporting incorrectly?
anmac1789
Posts: 668
Joined: Mon Aug 24, 2020 1:16 pm

Re: Duplicate showing when it's not supposed to

Post by anmac1789 »

Ahh, I see the problem. Two files (XnKTgnu7Ny1LvkvaulNrqOHVCsk=PHOTO-2019-11-08-23-08-12 and tkTMpj3cPFl0H1ofBoGlRzWUiN4=PHOTO-2019-11-08-23-08-10) have the same size of 8,238 bytes. Both are located in J:\photos app and have Friday, November 08, 2019, 10:08:10 PM as date created, date modified and date accessed. The same two duplicate files are also located in other folders, with date created, date modified and date accessed set to Friday, November 08, 2019, 11:08:10 PM. Is that why Everything wasn't able to exclude these entries from the search results after adding dupe:size;dm? Am I correct in this reasoning? Excel 365 shows the same result, highlighting all of these entries as duplicates with the TEXTJOIN function.
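
If that reasoning is right, the situation can be sketched in Python like this. The folder name J:\other folder is a hypothetical stand-in for the unnamed "other folders", and grouping on a (size, date modified) pair is only an approximation of what dupe:size;dm compares.

```python
from collections import defaultdict

NAME_1 = "XnKTgnu7Ny1LvkvaulNrqOHVCsk=PHOTO-2019-11-08-23-08-12"
NAME_2 = "tkTMpj3cPFl0H1ofBoGlRzWUiN4=PHOTO-2019-11-08-23-08-10"

# (folder, name, size in bytes, date modified to the second)
files = [
    (r"J:\photos app", NAME_1, 8238, "2019-11-08 22:08:10"),
    (r"J:\photos app", NAME_2, 8238, "2019-11-08 22:08:10"),
    # Hypothetical folder: the same two files, one hour later.
    (r"J:\other folder", NAME_1, 8238, "2019-11-08 23:08:10"),
    (r"J:\other folder", NAME_2, 8238, "2019-11-08 23:08:10"),
]

groups = defaultdict(list)
for folder, name, size, dm in files:
    groups[(size, dm)].append((folder, name))

# Keep only the keys that are shared by more than one file.
dupe_groups = {key: members for key, members in groups.items() if len(members) > 1}
flagged = [member for members in dupe_groups.values() for member in members]
```

Under these assumptions there are two groups of two, so all four entries are reported: the pair in J:\photos app match each other even though their names differ, and so does the pair that is one hour later.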

BTW, I have a question: can two dupe: functions be used together?

Everything:
[Attachment: verified dupes.png]
Excel 365:
[Attachment: excel 365 dupes.png]