Duplicates

wrisco · Post by **wrisco** » Wed Jan 19, 2022 10:02 pm

Is it possible to find duplicates where multiple columns combined are duplicated instead of just individual columns?

This is what I want:
dupe: sizedupe: dmdupe:
File001 111kb 2010-10-10
File001 111kb 2010-10-10
File002 222kb 2011-10-10
File002 222kb 2011-10-10
File003 333kb 2012-10-10
File003 333kb 2012-10-10

This is what I get:
dupe: sizedupe: dmdupe:
File001 111kb 2012-10-10
File001 222kb 2011-10-10
File002 111kb 2010-10-10
File002 222kb 2010-10-10
File003 333kb 2012-10-10
File003 333kb 2011-10-10

raccoon · Post by **raccoon** » Wed Jan 19, 2022 10:29 pm

Everything 1.5 Alpha introduces the ability to perform column duplicate finding (unique rejecting) as well as unique finding (duplicate rejecting) from the right-click column header menus. These menu operations are stackable. The same operations are available through search syntax, but they are not stackable there at this time.

Right-click on a column header. Select Find xxx Duplicates. Repeat on the next column header. You can select the type of duplicate/unique finding method by further right-clicking on that menu item itself to reveal a secret hidden menu.

You're asking for stackable duplicate operations. However, what this will NOT do is explicitly reveal items that have both properties duplicated in tandem; it does so only implicitly. Everything searches only the one column for duplicates in each sort operation, so the duplicates it finds may come from unrelated items (i.e., same size but different filename). There is currently no system in place to make sure that two objects have, for example, exactly the same filename AND size, just that more than one object in the Name column has the same name and more than one object in the Size column has the same size. There's a good chance that they'll have the same name AND size, but not necessarily.
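To make the distinction concrete, here is a rough Python sketch (not how Everything is implemented, just an illustration). It uses the rows from the "This is what I get" listing above; the helper names `column_dupes` and `combined_dupes` are made up for this example. Every row survives three stacked per-column checks, yet no row has a twin that matches on all three columns at once.

```python
from collections import Counter

# Rows copied from the "This is what I get" listing: (name, size, date modified).
rows = [
    ("File001", "111kb", "2012-10-10"),
    ("File001", "222kb", "2011-10-10"),
    ("File002", "111kb", "2010-10-10"),
    ("File002", "222kb", "2010-10-10"),
    ("File003", "333kb", "2012-10-10"),
    ("File003", "333kb", "2011-10-10"),
]

def column_dupes(rows, col):
    """Keep rows whose value in a single column occurs more than once."""
    counts = Counter(r[col] for r in rows)
    return [r for r in rows if counts[r[col]] > 1]

def combined_dupes(rows, cols):
    """Keep rows whose combination of values across `cols` occurs more than once."""
    def key(r):
        return tuple(r[c] for c in cols)
    counts = Counter(key(r) for r in rows)
    return [r for r in rows if counts[key(r)] > 1]

# Stacked per-column filtering: name, then size, then date.
stacked = rows
for col in (0, 1, 2):
    stacked = column_dupes(stacked, col)

print(len(stacked))                          # 6: every row passes each check
print(len(combined_dupes(rows, (0, 1, 2))))  # 0: no row matches on all three
```

Grouping on the tuple of columns is what "duplicated in tandem" would require; the stacked per-column checks never compare the columns of one row against the same columns of another.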

The benefit of the method in 1.5 versus 1.4 is that, at least, objects that do not have a duplicate name are discarded, and objects that do not have a duplicate size are also discarded. In 1.4, the duplicate finding looks at the entire Index and not just the objects remaining after each discard. 1.5's method only works on the refined on-screen results for each operation, so you can keep whittling it down.
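As a toy model of that difference (an assumption-laden simplification, not Everything's actual code), the sketch below counts duplicates either against the whole list or against only the rows left after the previous step. The `index` data and the `dupes` helper are invented for illustration.

```python
from collections import Counter

# Hypothetical (name, size) pairs standing in for an index.
index = [("A", 1), ("A", 2), ("B", 2), ("C", 3)]

def dupes(rows, col, universe):
    """Keep rows whose value in `col` occurs more than once within `universe`."""
    counts = Counter(r[col] for r in universe)
    return [r for r in rows if counts[r[col]] > 1]

# Step 1 is the same in both models: keep rows with a duplicated name.
by_name = dupes(index, 0, index)

# Counting sizes against the whole index: ("A", 2) survives because
# the unrelated ("B", 2) shares its size.
whole_index = dupes(by_name, 1, index)

# Counting sizes against only the remaining rows: nothing survives,
# because the two "A" rows have different sizes.
refined = dupes(by_name, 1, by_name)

print(whole_index)
print(refined)
```

Counting only within the remaining rows removes matches caused by items that were already discarded, which is why repeated whittling narrows the results.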

You can complement this by adding the SHA-1 column once you have the results whittled down to duplicate names and duplicate sizes, to locate items with duplicate SHA-1 hash values. Files that share a hash have truly duplicate contents.

Right-click the column headers, select Add Columns, type in SHA-1, select SHA-1, and click OK. Left-click the SHA-1 column header to sort by that column, then sit back for 10~30 minutes while Everything calculates the SHA-1 hashes of the objects and the column fully populates. Finally, right-click the SHA-1 column header and select Find SHA-1 Duplicates. You now see true duplicates. Remember to do this only after you have refined and whittled down the results, so you don't unnecessarily calculate SHA-1 hashes of extraneous files. When you close the Everything window, these SHA-1 hash calculations are discarded.
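If you want to do the same hash comparison outside Everything, a minimal sketch with Python's standard `hashlib` could look like the following. The function names `sha1_of` and `sha1_duplicates` are hypothetical, and the list of candidate paths is assumed to be the already whittled-down set, for the same reason as above: hashing is the slow part.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def sha1_duplicates(paths):
    """Group paths by SHA-1 digest and return only groups with more than one file."""
    groups = defaultdict(list)
    for p in paths:
        groups[sha1_of(p)].append(Path(p))
    return [group for group in groups.values() if len(group) > 1]
```

Calling `sha1_duplicates(candidate_paths)` returns a list of groups, each holding the paths whose contents hash to the same value. This needs the files themselves to be readable, so it does not help with file lists of offline drives.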

wrisco · Post by **wrisco** » Thu Jan 20, 2022 8:03 pm

Thanks for the detailed explanation. So there's currently no way to do what I want. The hash columns would be interesting but I'm using filelists of offline drives.

raccoon · Post by **raccoon** » Fri Jan 21, 2022 3:54 am

See this post which covers your suggestion, and the one just above it for new changes that were made last night.

viewtopic.php?f=12&t=11014#p42745

voidtools forum
