Search/sort by frequency of file extension
Hello Forum,
I just noticed that there are tens of thousands of files with the *.pyc file extension on my system. Since I have no need to search for these files, I have excluded *.pyc from the index, which should reduce memory usage of Everything accordingly.
Now I'm wondering if there might be other file types that exist by massive numbers, but that I never search for.
I don't want to generally exclude the various installation folders for programs, as this would also exclude all programs and tools (i.e. *.exe files) themselves, including hundreds of portable programs.
Is there any way to sort all files on the system by file extension frequency with Everything, or with some other tool, for that matter?
Last edited by David.P on Sun May 08, 2022 11:30 am, edited 1 time in total.
Re: Search/sort by frequency of file extension
Everything 1.5 will have an extension frequency column.
To show extensions by frequency:
- Right click the result list column header and click Add columns....
- Select for: ext
- Select Extension Frequency.
- Click OK.
- Click the Extension Frequency column header to gather and sort by extension frequency.
To reduce the number of duplicated extension results:
- Right click the Extension Frequency column header.
- Right click Find Extension Frequency Duplicates
- Click Find unique (including first duplicated).
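For the "some other tool" part of the question, the same kind of count can be sketched outside Everything in a few lines of Python. This is only an illustration and not how Everything gathers the column; the helper name `extension_frequencies` and the decision to lower-case extensions are assumptions made for this example.

```python
import os
from collections import Counter


def extension_frequencies(root):
    """Count files below `root` per extension (hypothetical helper, not part of Everything)."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext keeps only the last suffix ("a.tar.gz" -> ".gz");
            # names without a dot yield an empty extension.
            ext = os.path.splitext(name)[1].lower().lstrip(".")
            counts[ext] += 1
    return counts
```

Calling `extension_frequencies("C:\\").most_common(20)` would then list the 20 most frequent extensions with their counts. Walking a whole drive this way is far slower than sorting an existing index.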
Re: Search/sort by frequency of file extension
You and Everything™ are the world's fastest!
Re: Search/sort by frequency of file extension
Please, another comprehension question on this:
What do the four different options mean for the (number of) search results?
And if I search for extension-frequency, e.g. like this:
*.* extension-frequency:>=333
then I don't get any search results.
Re: Search/sort by frequency of file extension
"What do the four different options mean for the (number of) search results?"
Find duplicates (including first one) -removes all items that are not duplicated: A, A, A, B, C => A, A, A
Find duplicates (except first one) -removes all items that are not duplicated and the first duplicated item: A, A, A, B, C => A, A
Find unique (including first duplicated) -removes all items that are duplicated (except the first one): A, A, A, B, C => A, B, C
Find unique (not duplicated) -removes all items that are duplicated: A, A, A, B, C => B, C
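The four options can be modelled as filters over a list of values. The following is a minimal sketch that reproduces the A, A, A, B, C examples; the function and parameter names are made up for illustration and say nothing about how Everything implements the feature.

```python
from collections import Counter


def find_duplicates(items, include_first=True):
    """Keep only items whose value occurs more than once."""
    counts = Counter(items)
    seen = set()
    result = []
    for item in items:
        # With include_first=False the first occurrence of each value is dropped.
        if counts[item] > 1 and (include_first or item in seen):
            result.append(item)
        seen.add(item)
    return result


def find_unique(items, include_first_duplicated=True):
    """Keep non-duplicated items, optionally plus the first of each duplicated value."""
    counts = Counter(items)
    seen = set()
    result = []
    for item in items:
        if counts[item] == 1:
            result.append(item)
        elif include_first_duplicated and item not in seen:
            result.append(item)
        seen.add(item)
    return result


items = ["A", "A", "A", "B", "C"]
print(find_duplicates(items))                          # ['A', 'A', 'A']
print(find_duplicates(items, include_first=False))     # ['A', 'A']
print(find_unique(items))                              # ['A', 'B', 'C']
print(find_unique(items, include_first_duplicated=False))  # ['B', 'C']
```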
"if I search for extension-frequency e.g. like this: *.* extension-frequency:>=333 then I don't get any search result"
The extension-frequency: search function is currently not supported.
I'll add this in the next alpha update.
Re: Search/sort by frequency of file extension
Thanks very much, looking forward to it!
I find that this option in particular provides an extremely powerful, informative result about the most frequent file extensions on one's system:
Re: Search/sort by frequency of file extension
Where does the dialogue with the four find duplicates options come from?
Re: Search/sort by frequency of file extension
Right click on a column header (I believe any one will do), and then right click on "Find Date Created Duplicates"
Re: Search/sort by frequency of file extension
Holding down Shift and right-clicking the result list column header will also give more options.
Re: Search/sort by frequency of file extension
Thanks...the shift did it.
Re: Shift-Rightclick Menus
@void: Would you be inclined to make Shift-revealed menu items displayed in Italics? IDM_ITALIC
Re: Search/sort by frequency of file extension
I see (except for line 2) identical context menus, both with and without Shift
Last edited by David.P on Sun May 08, 2022 5:56 pm, edited 1 time in total.
Re: Search/sort by frequency of file extension
This killer feature is.... oddly satisfying.
I just wonder what the 2972 files of the half-baked type are good for
Re: Search/sort by frequency of file extension
And what would you do with the knowledge about all those file types?
This list is not relevant at all for using or administering a system.
The important extensions are a rather small list which mainly comes from basic Windows functions
and the installed software.
I was administering servers with a lot of users and terabytes of data
but I never needed a list like this.
Re: Search/sort by frequency of file extension
-> Request: about:ext-survey -- File extension forum survey.
-> List of all file extensions used
horst.epp wrote: ↑Fri Jul 23, 2021 3:39 pm Just for interest, for what purpose do you need such a list.
Normally I'm only interested in a few extensions which I have in daily use.
I even have a folder with test files for these extensions to check preview and thumbnail functions.
But for what reason do I need to know all the other extensions?
I never needed a thing and I don't see why anybody else should.
Re: Search/sort by frequency of file extension
So ... what *is* the practical usefulness of this feature? I didn't get it from the linked thread either.
Or just curiosity? That I can relate to.
(Not that everything needs to be practical; I remember enjoying watching my disks getting defragmented. Every single time...)
But it is cool that even this is possible with Everything!
Re: Search/sort by frequency of file extension
void wrote: ↑Sun May 08, 2022 11:30 am Everything 1.5 will have an extension frequency column.
To show extensions by frequency:
- Right click the result list column header and click Add columns....
- Select for: ext
- Select Extension Frequency.
- Click OK.
- Click the Extension Frequency column header to gather and sort by extension frequency.
To reduce the number of duplicated extension results:
- Right click the Extension Frequency column header.
- Right click Find Extension Frequency Duplicates
- Click Find unique (including first duplicated).
@void: Using Find Unique on the Extension Frequency column discards one of any two different extensions that have the same frequency count. By contrast, using Find Unique on the Extension column itself resets all the Extension Frequency counts to 1. Is there any way to lock the Frequency count column so that it does not update?
Re: Search/sort by frequency of file extension
"@void: By using Find Unique on the Extension Frequency column, this will discard two different extensions that have the same frequency count."
Yes, it's not 100% accurate when removing duplicates with this method.
For the next alpha update, Everything will only populate Extension Frequency once after sorting by Extension Frequency.
You'll be able to use F5 to clear this cache.
This way, you can sort by Extension Frequency to build the initial cache, find distinct extensions and then re-sort by Extension Frequency.
Re: Search/sort by frequency of file extension
"what *is* the practical usefulness of this feature?"
Typo'd file extension, perhaps? (Perhaps.)
There have been times when I've mistyped a file extension.
There are some that I have a habit of doing, fairly regularly.
So I might search for said extension (ext:wrongo) so I can fix it.
Other times, I'll search for ext: to possibly find files that should have an extension, but were omitted (so I can add them).
Other times, I'll exclude known extensions to possibly find file names with mistyped extensions (so I can fix them).
Would I use Extension Frequency for that, eh, probably not.
Re: Search/sort by frequency of file extension
Everything 1.5.0.1313a will now only populate Extension Frequency once after sorting by Extension Frequency.
Everything 1.5.0.1313a adds an extension-frequency: search function.
To list extensions by frequency:
- Right click the result list column header and click Add columns....
- Select for: ext
- Select Extension Frequency.
- Click OK.
- Right click the result list column header and click Add columns....
- Select for: ext
- Select Extension.
- Click OK.
- Click the Extension Frequency column header to gather and sort by extension frequency.
- Right click the Extension column header.
- Right click Find Extension Duplicates.
- Click Find unique (including first duplicated).
- Click the Extension Frequency column header to resort by extension frequency.
Re: Search/sort by frequency of file extension
The "lazy way": create bookmark for the following search
Code: Select all
add-columns:Extension;"Extension Frequency" distinct:Extension sort:"Extension Frequency"
Re: Search/sort by frequency of file extension
NotNull wrote: ↑Fri May 13, 2022 1:08 pm The "lazy way": create bookmark for the following search
You rather mean:
Code: Select all
add-columns:Extension;"Extension Frequency" distinct:"Extension Frequency" sort:"Extension Frequency"
Re: Search/sort by frequency of file extension
distinct:"Extension" gives you one entry for each extension.
distinct:"Extension Frequency" gives you one entry for each frequency.
So if there are 5 .txt files and 5 .jpg files, either .jpg or .txt will not be shown.
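The difference can be reproduced with a toy file list. This sketch assumes that distinct: keeps the first row for each distinct value of the given column; the helper `distinct_by` and the sample file names are invented for the example.

```python
from collections import Counter


def distinct_by(rows, key):
    """Keep the first row for each distinct key value (assumed model of distinct:)."""
    seen = set()
    kept = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept


# 5 .txt files, 5 .jpg files and 1 .exe file (made-up names).
files = [f"note{i}.txt" for i in range(5)] + [f"pic{i}.jpg" for i in range(5)] + ["tool.exe"]
exts = [name.rsplit(".", 1)[1] for name in files]
freq = Counter(exts)
rows = [(ext, freq[ext]) for ext in exts]  # (Extension, Extension Frequency)

by_extension = distinct_by(rows, key=lambda r: r[0])
by_frequency = distinct_by(rows, key=lambda r: r[1])
print(by_extension)  # [('txt', 5), ('jpg', 5), ('exe', 1)]
print(by_frequency)  # [('txt', 5), ('exe', 1)]  (jpg is lost: same frequency as txt)
```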
Re: Search/sort by frequency of file extension
I see, but your query just gives me a list of all extensions, with a frequency of 1 next to all of them.
Re: Search/sort by frequency of file extension
With Everything version 1.5.0.1313a ?
Re: Search/sort by frequency of file extension
Tested in a new, fresh instance and there I get the same (frequency = 1) results.
Time for some research ..
Re: Search/sort by frequency of file extension
It looks like a timing/processing issue. This works as expected:
Code: Select all
add-columns:;Extension;"Extension Frequency" sort:"Extension Frequency"
When the frequency is calculated, adding the following to the search gets the correct results:
Code: Select all
distinct:Extension
The following bookmark sort of does what it should:
Code: Select all
Search = distinct:"Extension Frequency"
Columns=Name;Extension;Extension Frequency
Sort = Extension Frequency
.. but returns only ~200 file extensions instead of ~1700.
Conclusion: Don't use these bookmarks!
Re: Search/sort by frequency of file extension
I was able to use this feature just now to discover and delete 100,000 files at 200 MB (400 MB actual cluster space) using the faux extensions created from wget grabs of redundant index.html files. <html@C=S;O=A|html@C=N;O=D|html@C=N;O=A|html@C=M;O=D|html@C=M;O=A|html@C=S;O=D|html@C=D;O=A|html@C=D;O=D>
So there's something useful done. Nice and tidy.
Re: Search/sort by frequency of file extension
Everything currently uses the extension frequency for the current results. (not the entire index)
This is causing too much confusion.
The next alpha update will make the following changes:
Use the extension frequency from the entire index, not the current results.
Gather extension frequency immediately when showing the extension frequency column, searching for extension-frequency: or sorting by extension frequency.
The following will work as expected in the next alpha update:
Code: Select all
add-columns:Extension;"Extension Frequency" distinct:"Extension" sort:"Extension Frequency"
Extension frequency will still only be gathered once.
Use F5 to refresh this cache.
Re: Search/sort by frequency of file extension
Everything 1.5.0.1314a improves frequency properties:
Frequency property values are gathered immediately when showing a frequency column, searching for a frequency range or sorting by a frequency property.
Frequency property values are now calculated from the entire index.
Not the current results.
The following searches will now work as expected:
add-columns:extension;extension-frequency distinct:extension sort:extension-frequency
add-columns:size;size-frequency distinct:size sort:size-frequency
add-columns:name;name-frequency distinct:name sort:name-frequency
Frequency property values are gathered only once.
They are not updated in real-time.
Press F5 to update frequency property values.
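The behaviour described above (counts taken from the entire index, gathered once, refreshed on demand) can be pictured with a small toy model. This is a sketch under assumptions, not Everything's implementation; the class `ExtensionFrequencyCache`, its `refresh` method standing in for F5, and the sample file names are all hypothetical.

```python
import os
from collections import Counter


def ext_of(name):
    """Lower-cased extension without the dot ('' when there is none)."""
    return os.path.splitext(name)[1].lower().lstrip(".")


class ExtensionFrequencyCache:
    """Toy model: counts come from the whole index and are gathered only once."""

    def __init__(self, index):
        self.index = index    # list of file names standing in for the index
        self._counts = None   # nothing gathered yet

    def frequency(self, name):
        if self._counts is None:  # first use: gather from the entire index
            self._counts = Counter(ext_of(n) for n in self.index)
        return self._counts[ext_of(name)]

    def refresh(self):
        """Stand-in for pressing F5: drop the cached counts."""
        self._counts = None


index = ["a.txt", "b.txt", "c.jpg"]
cache = ExtensionFrequencyCache(index)
results = [n for n in index if n.startswith("a")]  # "current results": only a.txt

before = cache.frequency(results[0])  # 2: counted over the index, not the results
index.append("d.txt")
stale = cache.frequency(results[0])   # still 2: not updated in real time
cache.refresh()
after = cache.frequency(results[0])   # 3 once the cache has been rebuilt
```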
I have put on my TODO list to add a regex-match1-frequency property.
extension-frequency:
name-frequency:
size-frequency:
Re: Search/sort by frequency of file extension
Everything 1.5.0.1315a improves add-columns: and columns:
These search functions will now clear any existing temporary columns when changing the search.
For example:
Changing the search from:
add-columns:extension-frequency
to:
add-columns:size-frequency
will no longer keep the extension-frequency column.
You can now also specify the insert position with :<insert-position>
For example, to add the size frequency column at position 1:
add-columns:size-frequency:1
Re: Search/sort by frequency of file extension
Hello,
Sorry for reopening this post, but as I understand it, "Frequency property values are now calculated from the entire index. Not the current results."
If so, why not add another function (with a different syntax) that calculates the frequencies from the current results? This would help in managing the current search, since statistics are generally expected to relate to the current results.
Thank you.
Regards
Re: Search/sort by frequency of file extension
I will consider calculating the frequencies off the current results.
Thank you for the suggestion.
For now, please perform your search, export as an EFU, open the EFU to calculate frequencies:
- Perform your search to limit your results.
- From the File menu, click Export....
- Change save as type to EFU Everything File list.
- Choose a filename and click Save.
- From the File menu, click Open File List....
- Select your EFU file from above and click Open.
- View or search your frequencies.
- When you are done, close the opened file list from File -> Close File List.
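As an alternative to reopening the export in Everything, the frequencies of the exported results could also be counted with a short script. This is a rough sketch that assumes the EFU file is a CSV file with a header row containing a Filename column and, optionally, a numeric Attributes column; check an actual export before relying on it. The function name `efu_extension_frequencies` and the sample rows are made up.

```python
import csv
import io
import ntpath
from collections import Counter

FILE_ATTRIBUTE_DIRECTORY = 0x10  # Windows attribute bit that marks folders


def efu_extension_frequencies(efu_file):
    """Count extensions in an EFU export (assumed: CSV with a 'Filename' column)."""
    counts = Counter()
    for row in csv.DictReader(efu_file):
        attrs = (row.get("Attributes") or "").strip()
        if attrs.isdigit() and int(attrs) & FILE_ATTRIBUTE_DIRECTORY:
            continue  # skip folders when the attribute column is present
        # ntpath handles Windows-style paths on any platform.
        ext = ntpath.splitext(row["Filename"])[1].lower().lstrip(".")
        counts[ext] += 1
    return counts


# Made-up sample in the assumed layout; sizes and dates are placeholders.
sample = io.StringIO(
    'Filename,Size,Date Modified,Date Created,Attributes\n'
    '"C:\\tools\\a.exe",10,0,0,32\n'
    '"C:\\tools\\b.exe",20,0,0,32\n'
    '"C:\\tools\\cache",,0,0,16\n'
    '"C:\\tools\\readme.txt",5,0,0,32\n'
)
counts = efu_extension_frequencies(sample)
print(counts.most_common())  # [('exe', 2), ('txt', 1)]
```

For a real export, pass an open file instead of the sample, for example `open("results.efu", newline="", encoding="utf-8-sig")`, where `results.efu` is a placeholder file name.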