Image duplicate bug

anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Image duplicate bug

Post by anmac1789 »

I may have found a bug. The first 3 results should not show, because I matched on date created, date modified, date accessed AND name. For the filename ending in "...10-44-46.jpg" it should show only 2 results, because the dates of the 1st file do not match the others. The same applies to the filename ending in "...10-44-46 2.jpg". Is this behavior intended?
[Attachment: bug.png]
NotNull
Posts: 5458
Joined: Wed May 24, 2017 9:22 pm

Re: Image duplicate bug

Post by NotNull »

Seems OK to me.

(first off: you specified da: twice instead of dc: da:, but that doesn't make a difference here)

First you match for files with the same dc: da: and dm:
Resulting in 2 groups: 9:53 and 10:53. 6 files in total.

Then you search for duplicate names among these 6 files.
Resulting in 2 groups: 10-44-46.jpg and 10-44-46 2.jpg. Still 6 files in total.
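
The two-pass behaviour described above can be sketched in Python. This is only an illustration, not how Everything is implemented: the six records are a hypothetical reconstruction of the screenshot (two names, three folders, one timestamp standing in for created, modified and accessed), and `keep_dupes`, `sequential` and `combined` are made-up helper names. It contrasts filtering by dates and then by name with filtering on a single combined key.

```python
from collections import defaultdict

# Hypothetical stand-ins for the six files: (folder, name, timestamp).
# The one timestamp represents date created, modified and accessed together.
FILES = [
    ("loc1", "10-44-46.jpg", "10:53:22"),
    ("loc2", "10-44-46.jpg", "10:53:22"),
    ("loc3", "10-44-46.jpg", "09:53:22"),
    ("loc1", "10-44-46 2.jpg", "10:53:22"),
    ("loc2", "10-44-46 2.jpg", "10:53:22"),
    ("loc3", "10-44-46 2.jpg", "09:53:22"),
]

def keep_dupes(files, key):
    """Keep only the files whose key is shared by at least one other file."""
    groups = defaultdict(list)
    for f in files:
        groups[key(f)].append(f)
    return [f for f in files if len(groups[key(f)]) > 1]

def sequential(files):
    """Pass 1: duplicate dates. Pass 2: duplicate names among the survivors."""
    by_dates = keep_dupes(files, key=lambda f: f[2])
    return keep_dupes(by_dates, key=lambda f: f[1])

def combined(files):
    """Single pass: name AND dates must match together."""
    return keep_dupes(files, key=lambda f: (f[1], f[2]))

print(len(sequential(FILES)))  # all six files survive both passes
print(len(combined(FILES)))    # only the loc1/loc2 copies remain
```

In the sequential version the two 9:53 files match each other on dates in the first pass and then match the 10:53 copies on name in the second, so nothing is dropped. Only the combined key removes them.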
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

Thanks for catching that mistake.

But when I search for these files individually, the proper files show up. For example in the picture below:

[Attachment: bug 2.png]

Maybe there is a way to run a secondary search on the 1st result list to narrow it down?
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

I'm not quite sure what you are trying to find here.

Some notes: dupe:size;dm;dc dupe:da may not work as intended.
This will find files with the same size, dm and dc (where two or more results have ALL the same property values),
and then find results with the same date accessed, which will throw out a lot of results.


In other words, dupe: is limited to 3 properties.

Using dates to find duplicates is not ideal.
These values are likely to be unique to each file.



ext:jpg dupe:size should give the best results.

If you want to compare content, please try:

ext:jpg dupe:size;first-256-bytes;sha256
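
For readers who want to see the idea behind comparing size, then the first 256 bytes, then a SHA-256 hash, here is a minimal Python sketch. It is not Everything's implementation; `find_content_dupes` and its helpers are hypothetical names, and the staging (cheap checks first, full hash last) is an assumption about why the properties are listed in that order.

```python
import hashlib
import os
from collections import defaultdict

def _split(paths, key):
    """Group paths by key(path) and drop groups with a single member."""
    groups = defaultdict(list)
    for p in paths:
        groups[key(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]

def _head(path, n=256):
    """Return the first n bytes of the file."""
    with open(path, "rb") as fh:
        return fh.read(n)

def _sha256(path):
    """Return the SHA-256 hex digest of the whole file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_content_dupes(paths):
    """Return lists of paths whose size, first 256 bytes and SHA-256 all match."""
    groups = [list(paths)]
    # Refine each group with progressively more expensive checks.
    for key in (os.path.getsize, _head, _sha256):
        groups = [sub for g in groups for sub in _split(g, key)]
    return groups
```

Because each stage only refines the groups left by the previous one, the full hash is computed only for files that already agree on size and on their first 256 bytes.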
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

What I'm trying to do is match up duplicate files with the same size, and then narrow those results down to files that also share the same date created, date modified and date accessed, and that's it. The duplicates end in "(1).jpg" or " 2.jpg"; that's how I know they are duplicates. Sometimes, when I was importing things from my Android phone, an imported picture was a different picture with the same name. That's why I'm trying to find duplicates by everything except the name.

I was thinking that by isolating duplicate pictures by dates, times and size, it would match only on the criteria given and narrow down the results after each property value was evaluated: the 1st property, then the 2nd, then the 3rd.

The problem: one picture (of a family member) is 8900 bytes. It is the same picture in every copy, but the copies have different timestamps and different locations:

vMNV2sVD6PT2DhhMDc0AzC5IKbI=PHOTO-2019-11-09-10-44-46 2.jpg located in:

location 1: D:\windows 10 import\Phone\Android\data\com.ghisler.android.TotalCommander\cache
location 2: C:\Users\username\Pictures\s8 backup\Phone\Android\data\com.ghisler.android.TotalCommander\cache

created: Saturday, November 09, 2019, 10:53:22 AM
modified: Saturday, November 09, 2019, 10:53:22 AM
accessed: Saturday, November 09, 2019, 10:53:22 AM

The 3rd duplicate file is located in:

location 3: D:\new photos

created: Saturday, November 09, 2019, 9:53:22 AM
modified: Saturday, November 09, 2019, 9:53:22 AM
accessed: Saturday, November 09, 2019, 9:53:22 AM

So, in other words, how do I isolate exact duplicates (matching only 4 properties: size and the 3 timestamps), so that the result shows locations 1 and 2 ONLY and excludes location 3?

It worked for one file at a time, but it doesn't work with 2 or more files together. The reason I don't want to match by filename is that the same file can have a different filename as a duplicate. For example, photo.jpg is saved from a camera to the PC; importing the same photo again into the same folder creates a duplicate named photo 1.jpg or photo (2).jpg (depending on how the device renames duplicates). There are not that many files like this, maybe 100 or so.

Suggestion: wouldn't it be better to use regular expressions or something similar to assign a unique value to each combination of hour, minute and second? For example:

hour=0 min=0 sec=0 --> assign value 0
hour=0 min=1 sec=0 --> assign value 1
hour=0 min=2 sec=0 --> assign value 2
hour=0 min=3 sec=0 --> assign value 3
....
hour=23 min=59 sec=56 --> assign value 996 (example value)
hour=23 min=59 sec=57 --> assign value 997
hour=23 min=59 sec=58 --> assign value 998
hour=23 min=59 sec=59 --> assign value 999

I've done some calculations. There should be 24 × 60 × 60 = 86,400 combinations of hour:minute:second from 0:0:0 to 23:59:59 (excluding 24:0:0).
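
One concrete way to give every hour:minute:second combination a unique number is to count seconds since midnight. The Python sketch below is only an illustration of that suggestion; the function names are hypothetical, and the numbering differs from the example values listed above (0:1:0 becomes 60 rather than 1, and 23:59:59 becomes 86399).

```python
def seconds_since_midnight(hour, minute, second):
    """Map a time of day to a unique integer in the range 0..86399."""
    if not (0 <= hour < 24 and 0 <= minute < 60 and 0 <= second < 60):
        raise ValueError("time of day out of range")
    return hour * 3600 + minute * 60 + second

def time_of_day(value):
    """Inverse mapping: turn the integer back into (hour, minute, second)."""
    hour, rest = divmod(value, 3600)
    minute, second = divmod(rest, 60)
    return hour, minute, second

print(seconds_since_midnight(0, 0, 0))
print(seconds_since_midnight(23, 59, 59))
print(time_of_day(seconds_since_midnight(10, 53, 22)))
```

The mapping is one-to-one, so there are 24 × 60 × 60 = 86,400 distinct values. It only encodes the time of day, so two files from different days with the same clock time would still get the same number.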
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

Currently, Everything is limited to comparing 3 properties.

I am working on a solution.
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

I just came across the secondary-sort parameter. I was wondering whether something similar is available for properties. Also, if you are going to add more properties to dupe:, I am okay with waiting for that. Thanks again.
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

Everything 1.5.0.1322a adds Column Formulas.

Please try the following search:

column1:=TEXTJOIN(";",TRUE,size:,formatfiletime(dm:,"YYYY-MM-DD\THH:mm:ss"),formatfiletime(dc:,"YYYY-MM-DD\THH:mm:ss"),formatfiletime(da:,"YYYY-MM-DD\THH:mm:ss")) add-column:column1 dupe:column1
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

void wrote: Thu Oct 13, 2022 7:15 am
Everything 1.5.0.1322a adds Column Formulas.

Please try the following search:

column1:=TEXTJOIN(";",TRUE,size:,formatfiletime(dm:,"YYYY-MM-DD\THH:mm:ss"),formatfiletime(dc:,"YYYY-MM-DD\THH:mm:ss"),formatfiletime(da:,"YYYY-MM-DD\THH:mm:ss")) add-column:column1 dupe:column1

Thank you. Is there a way to shorten this?
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

The formulas in Everything are very forgiving.
You don't need to use the correct Excel syntax.

Please try the following:

column1:=size:;dm:;dc:;da: dupe:column1

Note: dm:, dc: and da: use 100-nanosecond precision, so two timestamps that display identically may still compare as different.

To compare the formatted timestamps instead, consider the following:

column1:=size:;formatfiletime(dm:);formatfiletime(dc:);formatfiletime(da:) dupe:column1
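
The precision point can be illustrated with a short Python sketch. Windows file times count 100-nanosecond ticks since 1601-01-01 UTC; everything else here is an assumption for illustration: the helper names are hypothetical, the timestamps are invented, and formatting in UTC down to whole seconds stands in for whatever format formatfiletime() actually applies.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # one tick = 100 nanoseconds

def to_filetime(dt):
    """Convert an aware datetime to FILETIME ticks (microsecond resolution)."""
    delta = dt - FILETIME_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10

def format_to_seconds(ticks):
    """Render ticks as YYYY-MM-DDTHH:MM:SS in UTC, dropping sub-second ticks."""
    whole_seconds = ticks // TICKS_PER_SECOND
    moment = FILETIME_EPOCH + timedelta(seconds=whole_seconds)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")

# Two invented timestamps inside the same second.
base = to_filetime(datetime(2019, 11, 9, 10, 53, 22, tzinfo=timezone.utc))
a = base + 1_000_000  # 0.1 s after the start of the second
b = base + 9_000_000  # 0.9 s after the start of the second

print(a == b)                                        # raw ticks differ
print(format_to_seconds(a) == format_to_seconds(b))  # formatted text matches
```

Comparing the raw values treats these two files as different, while comparing the formatted text treats them as equal, which is usually what is wanted when hunting for copies of the same photo.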
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

void wrote: Thu Oct 13, 2022 7:22 am
The formulas in Everything are very forgiving.

Please try the following:

column1:=size:;dm:;dc:;da: dupe:column1

Note: dm:, dc: and da: use 100-nanosecond precision.

Consider the following:

column1:=size:;formatfiletime(dm:);formatfiletime(dc:);formatfiletime(da:) dupe:column1

Much appreciated. Now I don't need to use both Everything and Excel; I can just use Everything.
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

I just found out that column1:=formatfiletime(dm:),formatfiletime(dc:) doesn't work. The 1st entry is formatted correctly as date modified, but the second entry, for date created, just shows formatfiletime(dc:) as literally written instead of a formatted date. This used to work in earlier versions. What has changed since then? I am using Everything 1.5a 1335a.
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

Thank you for the issue report anmac1789,

The correct syntax is:

concat(formatfiletime($dm:),",",formatfiletime($dc:))
-or-
formatfiletime($dm:)..","..formatfiletime($dc:)



The next update will treat the , as literal and parse the second function call correctly.
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

Thank you void, but may I ask why it changed? The syntax before was much simpler.
Last edited by anmac1789 on Sun Jan 22, 2023 8:52 am, edited 2 times in total.
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

There was a fix to a parsing issue which has caused this unintentional change.
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

void wrote: Sun Jan 22, 2023 8:24 am
There was a fix to a parsing issue which has caused this unintentional change.

No results are showing up. I am running as non-admin.

Update: never mind, I got it to work. I hadn't put the expression after column1:= (my mistake, sorry void).
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

Everything 1.5.0.1336a fixes an issue with parsing column expressions.

The following search should now set your column formula as expected:

column1:=formatfiletime(dm:),formatfiletime(dc:)

For example:
29/01/2023 20:46,29/01/2023 20:46
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

void wrote: Thu Feb 02, 2023 5:25 am
Everything 1.5.0.1336a fixes an issue with parsing column expressions.

The following search should now set your column formula as expected:

column1:=formatfiletime(dm:),formatfiletime(dc:)

For example:
29/01/2023 20:46,29/01/2023 20:46

How can I use other separators, such as -- or | or {}?
void
Developer
Posts: 16680
Joined: Fri Oct 16, 2009 11:31 pm

Re: Image duplicate bug

Post by void »

Please use double quotes to specify text.

For example:
column1:=formatfiletime(dm:).."|"..formatfiletime(dc:)

Shorter example:
column1:=formatfiletime(dm:)"|"formatfiletime(dc:)


Double quotes are only needed for operators.
| is an operator meaning OR.
{} are not used as operators, so these don't need to be quoted.

For example:
column1:={formatfiletime(dm:)}{formatfiletime(dc:)}
anmac1789
Posts: 669
Joined: Mon Aug 24, 2020 1:16 pm

Re: Image duplicate bug

Post by anmac1789 »

void wrote: Thu Feb 02, 2023 6:05 am
Please use double quotes to specify text.

For example:
column1:=formatfiletime(dm:).."|"..formatfiletime(dc:)

Shorter example:
column1:=formatfiletime(dm:)"|"formatfiletime(dc:)

Double quotes are only needed for operators.
| is an operator meaning OR.
{} are not used as operators, so these don't need to be quoted.

For example:
column1:={formatfiletime(dm:)}{formatfiletime(dc:)}

Ohh okay, so not just any character can be used as a separator lol