Analyze content in files within content list (e.g. CRC32)

rgbigel
Posts: 41
Joined: Sun Apr 17, 2011 4:00 pm

Post by rgbigel »

I have all kinds of duplicates, but it makes no sense to index file contents everywhere candidates could exist.
E.g., a file named boxcert.cer.pem exists in all sorts of places, with differing dates but all the same size.
What I would love is a command that shows the CRC32 (or a similar checksum) in an extra (temporary) column, only for the files in the result list.
Something like "showcontent:CRC32".

Thank you, Rolf ;)
therube
Posts: 4953
Joined: Thu Sep 03, 2009 6:48 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by therube »

In Everything 1.5 Alpha, you can add a Property [column] such as an MD5 (or CRC) hash. As long as the column is not in view, i.e. it is outside the current viewport, the hashes will not be computed. Bring the column into view, i.e. scroll to the right until it is shown, and hashes will be computed for the files that are visible (in a lazy-load manner).

Alternatively, you can associate an external hash program to compute the hashes of selected files (which could be done via context-menu, if the hash program supplies one, or SendTo, or [an .exe or batch file] launched via hotkey set in Everything).
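
For anyone who wants to try the external-program route, here is a minimal sketch in Python. It assumes the selected files arrive as command-line arguments (which is how SendTo passes them); the script and the function name md5_of_file are made up for illustration and are not part of Everything.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Each selected file is expected as one command-line argument.
    for file_path in sys.argv[1:]:
        print(f"{md5_of_file(file_path)}  {file_path}")
```

Reading in chunks keeps memory use flat even for very large files.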


(1.5 can also Index Properties, such that a hash, or any other Property, will always be available.)
void
Developer
Posts: 16665
Joined: Fri Oct 16, 2009 11:31 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by void »

I recommend .sfv (CRC32) sidecar files.
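
As a rough illustration of what an .sfv sidecar holds (one "filename CRC32" line per file, with lines starting with ";" treated as comments), here is a sketch in Python. The helper names crc32_of_file and write_sfv are hypothetical; this is not how Everything or any particular SFV tool generates the file.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)  # continue from the running CRC
    return f"{crc & 0xFFFFFFFF:08X}"


def write_sfv(files, sfv_path):
    """Write an .sfv file listing the name and CRC32 of each given file."""
    lines = ["; generated by a hypothetical helper script"]
    for file_path in files:
        lines.append(f"{Path(file_path).name} {crc32_of_file(file_path)}")
    Path(sfv_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

The sidecar stores bare file names, so it is meant to sit in the same folder as the files it describes.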
rgbigel
Posts: 41
Joined: Sun Apr 17, 2011 4:00 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by rgbigel »

therube wrote: Mon May 13, 2024 3:05 pm In Everything 1.5 Alpha, you can add a Property [column] such as an MD5 (or CRC) hash. As long as the column is not in view, i.e. it is outside the current viewport, the hashes will not be computed. Bring the column into view, i.e. scroll to the right until it is shown, and hashes will be computed for the files that are visible (in a lazy-load manner).
...
I've been using 1.5a exclusively for a long time. Still, some of its newer capabilities remain a mystery (how about a help guide like "Everything for Dummies"?).

Following your suggestion, I tried a lot.
- I added the CRC32 column via "Organize Columns". No luck, it remains empty.
- I finally found and used add-column:crc-32, but without the "CRC-32" column already present the effect is nil; no column is added at all. (Correction: if there is another tab that does contain the CRC-32 column, it works?!)
- If the CRC-32 column was already there, add-column:crc-32 actually does fill in its values!
- I tried .pem add-column:sha-256, and that does add the SHA-256 column, but it is empty. I tried the same with a previously added SHA-256 column; still always empty.

Like I said, a mystery. Any illuminations very welcome.
void
Developer
Posts: 16665
Joined: Fri Oct 16, 2009 11:31 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by void »

If the column appears empty, Everything might be busy gathering the CRC32 for a large file.

Please try closing all Everything windows to cancel any pending requests.



If you are indexing crc32, it may take a long time for Everything to gather this information.
Progress is shown in the status bar.



Please try the following search:

*.pem addcolumn:sha256 dupe:size;sha256
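
To show the idea behind checking size first and hash second, here is a sketch in Python. It is only an illustration of the general technique, not Everything's implementation of dupe:, and the names sha256_of_file and find_duplicates are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return lists of paths that share both size and SHA-256."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[Path(path).stat().st_size].append(path)

    groups = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # a unique size cannot be a duplicate, so skip hashing
        for path in same_size:
            groups[(size, sha256_of_file(path))].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Grouping by size first means only files that share a size with at least one other file ever get read and hashed.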
GeoNomad
Posts: 2
Joined: Fri Jan 15, 2016 2:27 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by GeoNomad »

Has anyone done speed comparisons on CRC32 vs MD5 and the other methods of checking unique files?

I have 7 million files (5TB) to dedupe (some same file with different names), and am waiting while CRC32 does its thing.

Just wondered if a different method might be faster.
therube
Posts: 4953
Joined: Thu Sep 03, 2009 6:48 pm

Re: Analyze content in files within content list (e.g. CRC32)

Post by therube »

(
GeoNomad wrote: Has anyone done speed comparisons on CRC32 vs MD5 and the other methods of checking unique files?
Do a search here; "voidhash" and/or "MD5" are apt to turn up some hits.

xxHash (non-cryptographic) is the fastest, but is not supported in Everything.
Depending on circumstances, i.e. where the bottleneck lies (disk reads rather than hashing, for example), there might not be much difference in actual hash times, whatever the theoretical hash speeds.
)
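
For a rough feel of raw hash speed on your own machine, here is a sketch in Python that times a few algorithms over an in-memory buffer. It measures CPU time only, so disk reads are left out entirely, and the function name time_hashes and the buffer size are arbitrary choices for illustration.

```python
import hashlib
import os
import time
import zlib


def time_hashes(data, algorithms=("crc32", "md5", "sha1", "sha256")):
    """Return a dict mapping algorithm name to seconds taken to hash data."""
    timings = {}
    for name in algorithms:
        start = time.perf_counter()
        if name == "crc32":
            zlib.crc32(data)  # CRC32 lives in zlib, not hashlib
        else:
            hashlib.new(name, data).digest()
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    buffer = os.urandom(16 * 1024 * 1024)  # 16 MiB of random bytes
    for name, seconds in time_hashes(buffer).items():
        print(f"{name:8s} {seconds:.4f} s")
```

If the numbers are all small compared with how long the files take to read from disk, the choice of hash is unlikely to matter much.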