I am frustrated with regular expressions, and I would like to explain why difficulties arise and what the limitations and possibilities of their use are.
Why do regexes have limitations?
Regular expressions are a powerful tool, but what they can do is limited by rules that differ from one regex engine to another. Some of these limitations come from how regexes are designed to process text: their job is to find patterns, and some operations, such as comparing one line against another or referring to a captured group inside a ‘lookahead’, can be difficult to implement in some engines.
I want to compare filenames without considering the numbering at the beginning, and detect duplicates that way.
Problem:
As far as I can tell, a lookahead with a group reference (e.g., (?=)) does not work in my case. It seems that the engine cannot refer to a captured group inside a ‘lookahead’ (which only checks the text that follows without consuming it). This makes it difficult to find duplicates in the way I am trying to do.
I tested in 101regex, and it keeps showing ‘no match’.
^(?:\d{2}\.)?(.*)
^(?:\d{2}\.)?(.*)(?=\r?\n\1)
^(?:\d{2}\.)?(.*?)(?=\r?\n\1)
^\d{2}\.(.*)
01.Q爱(DJ谋 Electro Remix)王.mp3
12.Q爱(DJ谋 Electro Remix)王.mp3
0 matches are found in 0 lines. Cannot find ^(?:\d{2}\.)?(.*)(?=\r?\n\1) above the current position.
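For comparison, the third pattern-and-filename test above can be reproduced outside the search tool. This is only a minimal sketch, assuming Python's `re` module behaves similarly to the engine used in the test. In Python a backreference inside a lookahead is accepted, so the failure may come from the numeric prefix on the following line rather than from the lookahead itself: after the line break the lookahead expects the captured name immediately, but finds "12." first. The `adjusted` pattern is a hypothetical variant, not something from the original test, and it only compares a line with the one directly below it.

```python
import re

# The two sample filenames from the post, one per line.
text = "01.Q爱(DJ谋 Electro Remix)王.mp3\n12.Q爱(DJ谋 Electro Remix)王.mp3"

# Pattern from the post: the lookahead expects the captured name to follow
# the line break immediately, with no "NN." prefix in between.
original = re.compile(r"^(?:\d{2}\.)?(.*)(?=\r?\n\1)", re.MULTILINE)

# Hypothetical variant: the lookahead also skips an optional "NN." prefix
# on the next line before comparing against group 1.
adjusted = re.compile(
    r"^(?:\d{2}\.)?(.*)(?=\r?\n(?:\d{2}\.)?\1$)", re.MULTILINE
)

original_match = original.search(text)
adjusted_match = adjusted.search(text)

print("original pattern matched:", original_match is not None)
if adjusted_match:
    print("adjusted pattern captured:", adjusted_match.group(1))
```

Because the comparison only reaches the next line, this approach would still miss duplicates that are not adjacent in the list.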
frustrations with regular expressions
Re: frustrations with regular expressions
If I understand correctly, you want to compare the filenames while excluding any digits at the start of each filename?
If so, then you can use
regex:\d*(.*) add-column:1 dupe:1
That captures all of the filename minus any leading digits, puts the result in a column, and then searches for duplicates based on that column.
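The idea behind that search (capture the name without its leading digits, then look for duplicates of the captured value) can be sketched in a few lines of Python. This is only an illustration of the technique, not how the search tool implements it; the third filename is a hypothetical extra entry added to show a non-duplicate.

```python
import re
from collections import defaultdict

names = [
    "01.Q爱(DJ谋 Electro Remix)王.mp3",
    "12.Q爱(DJ谋 Electro Remix)王.mp3",
    "03.Another Track.mp3",  # hypothetical file with no duplicate
]

# Group the filenames by the text captured after any leading digits.
# This plays the role of the extra column in the search above.
groups = defaultdict(list)
for name in names:
    key = re.match(r"\d*(.*)", name).group(1)
    groups[key].append(name)

# Keep only the keys shared by more than one file.
duplicates = {key: files for key, files in groups.items() if len(files) > 1}

for key, files in duplicates.items():
    print(key, files)
```

Note that `\d*` only removes the digits, so the captured key still starts with the dot. That does not affect the comparison, since every numbered file keeps the same leading dot.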