A month ago I studied duplicates. Now I am trying to ignore white space.
I have "Ignore white space" selected in the Search menu but the Result List is empty.
I have tried selecting the Name column and choosing "Find name Duplicates".
I tried putting "Dupe:" at the start of the search string and then at the end of the search string.
I had great success a month ago with "Dupe:", so I suspect that I am somehow mis-using the "Ignore white space", but that seems too simple to get wrong?
Thanks for any clues, guidance etc.
Chris
This is my search screen showing two files that my brain "sees" as duplicates. The size and date match exactly. I suspect that I have issued a File, Save As to introduce spaces into the name, but have forgotten to delete the squeezed name. And here I am trying to isolate culprits.
Duplicate names in a single folder, ignore whitespace
-
- Posts: 684
- Joined: Wed Jan 05, 2022 9:29 pm
Re: Duplicate names in a single folder, ignore whitespace
I suspect that this will be a very tough one to accomplish. Will think about it ...
Re: Duplicate names in a single folder, ignore whitespace
NotNull, I greatly respect your advice and knowledge, but are you pulling my leg here?
I had expected "ignore white space" to be a bit of a no-brainer as far as the executable code went. That's why, in this example more than any other, I thought that I had erred.
I mean, eliminating white space would be about the most common thing to do when comparing string data. No?
(signed) "eagerly awaiting your thoughts" of Bonavista
Re: Duplicate names in a single folder, ignore whitespace
I wouldn't dare! (although .. it is almost April 1st ..)
Seriously: "ignore white space" works on the search itself, so searching for abc will also find "a b c".
If no specific filename(pattern) search text is given, Everything will see "this is a file.txt" and "thisisafile.txt" as different filenames, even when "ignore white space" is enabled.
I guess some ugly regular expression is needed to be able to compare with/ without spaces.
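To illustrate the distinction above, here is a minimal Python sketch of search-side white-space tolerance. It is not Everything's implementation; the helper name `whitespace_tolerant_pattern` is hypothetical. It shows how a search term such as abc can match "a b c", while nothing in it compares one stored filename against another.

```python
import re

def whitespace_tolerant_pattern(term: str) -> re.Pattern:
    """Compile a regex that matches `term` with optional white space between its characters."""
    # Drop white space from the term itself, escape every remaining character,
    # then allow any run of white space between consecutive characters.
    chars = [re.escape(c) for c in term if not c.isspace()]
    return re.compile(r"\s*".join(chars))

pattern = whitespace_tolerant_pattern("abc")
names = ["a b c.txt", "abc.txt", "abd.txt"]

# Only the search term is relaxed; the filenames are each tested against it
# independently and are never compared with one another.
matches = [n for n in names if pattern.search(n)]
```

The same pattern built from "thisisafile.txt" would match "this is a file.txt", but only because a search term was supplied; with no term there is nothing to relax.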
Re: Duplicate names in a single folder, ignore whitespace
NotNull wrote: ↑Fri Mar 24, 2023 11:42 pm
Seriously: "ignore white space" works on the search itself, so searching for abc will also find "a b c". If no specific filename(pattern) search text is given, Everything will see "this is a file.txt" and "thisisafile.txt" as different filenames, even when "ignore white space" is enabled.
Thank you NotNull. This is progress.
It appears to be NOT user error!
I know nothing about the program code of Everything, and have no desire to go there. But from a pure programming perspective there must be a utility procedure somewhere, called something like (VBA)
Function strSqueezeWhiteSpace(strIncoming As String, strcWhiteSpaceDefinitions As String) As String
So I imagine that the routine that assembles that/those (see below) arrays could, at the time it is loading a name into the array, apply strSqueezeWhiteSpace to the string. (OK, two arrays: one for OriginalNames and a matching array for SqueezedNames.)
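The idea of pairing each original name with its squeezed form can be sketched in Python as follows. This is only an illustration of the suggestion, not how Everything works internally; `squeeze_white_space` and `find_squeezed_duplicates` are hypothetical names, and a dictionary keyed by the squeezed name stands in for the two parallel arrays.

```python
from collections import defaultdict

def squeeze_white_space(name: str) -> str:
    """Return `name` with every white-space character removed."""
    return "".join(name.split())

def find_squeezed_duplicates(names):
    """Group names whose squeezed forms are identical; keep groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        # The squeezed name is the comparison key; the original name is kept for display.
        groups[squeeze_white_space(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}

original_names = ["this is a file.txt", "thisisafile.txt", "other.txt"]
duplicates = find_squeezed_duplicates(original_names)
```

The comparison here is case-sensitive; a real tool would also have to decide how to treat case and which characters count as white space.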
Below: In general when a user looks for duplicates the definition is not in the program, but in the mind of the user. To that end I would suggest that Dupe: allows a variety of processes to stipulate, for many columns, what is, to the user's mind, a duplicate.
Size: equal when rounded to the nearest hundred bytes (=ROUND(LONG, -2), I think)
Date: equal when rounded to the nearest hour
Type: equal if maps to the same list of types (Picture, Audio, Document ...)
User: No matter how powerful a Dupe function is, it will be of little use to the user if it does not locate duplicates according to the user's mind. That does NOT mean that Everything should accommodate every single user's day-to-day wishes - there must be a cutoff point.
User: When the average user sees "Ignore White Space" as a setting, the user assumes that the program will ignore white space until the setting is turned off. (I think this describes me). The understanding is that setting "Ignore White Space" ON in the menu is a global edict to Everything to ignore white space in searches, whether the searches be for audio, pictures, documents, duplicates, ...
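The Size, Date and Type rules in the list above could be combined into a single user-defined comparison key. The Python sketch below is one hypothetical way to do that, not an existing Everything feature; the `TYPE_GROUPS` table and all function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from file extension to a broad type.
TYPE_GROUPS = {".jpg": "Picture", ".png": "Picture", ".mp3": "Audio", ".docx": "Document"}

def round_size(size_bytes: int) -> int:
    """Round a size to the nearest hundred bytes (exact halves round up)."""
    return (size_bytes + 50) // 100 * 100

def round_to_hour(moment: datetime) -> datetime:
    """Round a timestamp to the nearest hour (30 minutes rounds up)."""
    shifted = moment + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)

def dupe_key(size_bytes: int, modified: datetime, extension: str):
    """Build the tuple that decides whether two files count as duplicates."""
    file_type = TYPE_GROUPS.get(extension.lower(), "Other")
    return (round_size(size_bytes), round_to_hour(modified), file_type)

# Two made-up files that differ slightly in size, time and extension.
key_a = dupe_key(10_449, datetime(2023, 3, 24, 22, 40), ".JPG")
key_b = dupe_key(10_351, datetime(2023, 3, 24, 23, 20), ".png")
same = key_a == key_b
```

Under these rules the two made-up files share a key, so they would be reported as duplicates even though no single attribute matches exactly.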
I see that Mike_PB has something on the go that might be useful
NotNull wrote: I guess some ugly regular expression is needed to be able to compare with/ without spaces.
Could be true. I have dabbled in regex from time to time, but am ill-equipped to fabricate a complex regex.
I can leave my example for the time being and get on with something else. But I'd be interested to learn that "Ignore White Space" could be applied globally once it was set!
Cheers, Chris
Re: Duplicate names in a single folder, ignore whitespace
You could try adding a column using column functions. Add the following to your search
column1:=substitute($name:," ","") addcolumn:column1
This will add a column of the file name "ignoring" spaces. You could then DUPE for that column?
Re: Duplicate names in a single folder, ignore whitespace
Thank you Phlashman. Brilliant! (Mainly because I had not yet put my toe into the water in that pond.) I did try adding "Dupe:" at first at the left-hand end, then at the right-hand end of my search string, and neither worked.
But selecting your "Column 1" and from the menu choosing (right-click) "Find Column 1 Duplicates" gave me exactly what I wanted.
So thanks for the introduction to Column Functions.
My plan now is to pour another coffee, and use my new trick on my entire data partition, then start studying Column functions.
Thanks again, Chris
Re: Duplicate names in a single folder, ignore whitespace
3,702 objects which does not mean that I have duplicates of 1,851 documents, since some of the entries are for triplicates and quadruplicates(?) and the like.
That is, once I learned that choosing (right-click) "Find Column 1 Duplicates" was a necessary step!
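The point about 3,702 objects not meaning 1,851 duplicated documents comes down to counting groups rather than entries. A small Python sketch with made-up group sizes (not the actual results above) shows the arithmetic; `summarize_duplicate_groups` is a hypothetical helper.

```python
def summarize_duplicate_groups(group_sizes):
    """Summarize a duplicate listing given the number of entries in each group."""
    entries = sum(group_sizes)   # rows shown in the result list
    groups = len(group_sizes)    # distinct documents that have duplicates
    return {"entries": entries, "groups": groups, "extra_copies": entries - groups}

# Made-up example: two pairs, one triplicate and one quadruplicate.
summary = summarize_duplicate_groups([2, 2, 3, 4])
```

Here 11 entries belong to only 4 documents, so halving the entry count would overstate the number of duplicated documents.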
Many many thanks
Chris