Hello!
Thanks very much for the super desktop search utility for Windows. "Everything" is a very useful, helpful and easy-to-use search engine. Every day I search for files by filename and by content many times, and I have found some bugs.
My English is poor, but I will try to put my ideas (thoughts) into words clearly.
fact_01: BUG.
Everything_1.4.1.935 DOESN'T find English or Russian words (letters) in .htm and .html files, irrespective of character encoding (UTF-8 with BOM or UTF-8 without BOM), when they are in the bookmark location part of the code: <A HREF=" " >.
fact_02: Normal. Everything_1.4.1.935 FINDS English and Russian words (letters) in .htm and .html files, irrespective of character encoding (UTF-8 with BOM or UTF-8 without BOM), when they are in the bookmark description (<DD>) or the bookmark name (> </A>) parts of the code.
For example:
There are two .html files (Everything_bookmarks_UTF8_BOM.html and Everything_bookmarks_UTF8_no_BOM.html) in the attachment. These files are bookmarks exported from a browser. They contain 2 hyperlinks:
Вопросы и ответы - voidtools
https://www.voidtools.com/ru-ru/faq/
Очень быстрый поиск с программой Everything / Хабр
https://habr.com/ru/post/42354/
<DL><p>
<DT><A HREF="https://www.voidtools.com/ru-ru/faq/" ADD_DATE="1573496435" LAST_MODIFIED="1573496435" ICON_URI="https://www.voidtools.com/favicon.ico" >Вопросы и ответы - voidtools</A>
<DT><A HREF="https://habr.com/ru/post/42354/" ADD_DATE="1573496463" LAST_MODIFIED="1573496463" ICON_URI="https://habr.com/images/favicon-16x16.png" >Очень быстрый поиск с программой Everything / Хабр</A>
<DD>Начну немного «издалека». Дело в том, что я (и думаю не я один) — очень люблю маленькие но функциональные программы. Я встречал несколько таких приложений, которые иначе чем шедеврами софтостроения...
</DL>
Using Everything_1.4.1.935
01. You CAN find these words: "Вопросы", "voidtools", "Everything", "Хабр", "встречал", "люблю", because they are in the <DD> or > </A> parts of the code.
02. You CANNOT find these words: "voidtools.com", "ru-ru", "habr", "post/42354/", "habr.com", "ww.void", because they are in the <A HREF=" " > part of the code.
Please fix this BUG in the next versions of Everything (if this is possible).
Thanks very much.
Search in .htm (.html) files
Attachments: Everything_.htm_Search.zip (Demonstrative Example, 5.52 KiB)
Last edited by AE_AE on Sun Mar 22, 2020 4:11 am, edited 1 time in total.
Re: Search in .htm (.html) files
If I understand the content: function correctly (I don't use it very often), it searches the resulting text of a document (minus formatting and layout), which is the same text that Windows Search uses for indexing (see iFilter).
If you want to search in the raw text, you can use some other Everything functions:
ansicontent:
utf8content:
utf16content:
utf16becontent:
In your case, replace content: with utf8content: to also search in - for example - HREF attributes.
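To illustrate the difference, here is a minimal Python sketch. It is not how Everything or an iFilter is implemented; it only assumes that a filter-style search sees the text between the tags, while a raw search sees every byte of the file. The names SAMPLE, VisibleText, found_in_extracted_text and found_in_raw_text are made up for this example.

```python
from html.parser import HTMLParser

# One bookmark entry, shortened from the sample file in the first post.
SAMPLE = (
    '<DL><p>\n'
    '<DT><A HREF="https://habr.com/ru/post/42354/" ADD_DATE="1573496463">'
    'Очень быстрый поиск с программой Everything / Хабр</A>\n'
    '</DL>\n'
)


class VisibleText(HTMLParser):
    """Collects only the text between tags; attribute values are dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def found_in_extracted_text(html, needle):
    """Rough stand-in for a filter-style search (what content: appears to do)."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return needle.lower() in "".join(parser.parts).lower()


def found_in_raw_text(raw, needle):
    """Rough stand-in for a raw search (what utf8content: appears to do)."""
    return needle.lower() in raw.decode("utf-8", errors="replace").lower()


if __name__ == "__main__":
    raw_bytes = SAMPLE.encode("utf-8")
    for word in ("Хабр", "habr.com"):
        print(word,
              found_in_extracted_text(SAMPLE, word),
              found_in_raw_text(raw_bytes, word))
```

In this sketch "Хабр" is found both ways, but "habr.com" is found only by the raw search, because it exists only inside the HREF attribute.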
Re: Search in .htm (.html) files
Hello, NotNull!
Thanks very much for your quick answer. You helped me.
Now I can find any text in .htm (.html) files by using the query: .HTM utf8content:
I have also begun to use your advice in other cases.
For example, Everything_1.4.1.935 DOESN'T find Russian words (letters) in .txt files when the character encoding is UTF-8 without BOM.
Now I can find any text in .txt files with UTF-8 without BOM encoding by using the query: .txt utf8content:
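A possible explanation for the .txt case, as a small Python sketch. This is only an assumption, not something confirmed for Everything or for the Windows text filter: the sketch supposes a reader that trusts a UTF-8 BOM and otherwise falls back to an ANSI code page (cp1251 is picked here as an example). The function names are made up for this example.

```python
import codecs


def decode_like_bom_only_reader(raw, fallback="cp1251"):
    """Hypothetical reader: UTF-8 only if a BOM is present, else an ANSI code page."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(fallback, errors="replace")


def decode_as_utf8(raw):
    """Always treat the bytes as UTF-8; 'utf-8-sig' skips a BOM if there is one."""
    return raw.decode("utf-8-sig", errors="replace")


text = "Очень быстрый поиск"
no_bom = text.encode("utf-8")
with_bom = codecs.BOM_UTF8 + no_bom

if __name__ == "__main__":
    for label, raw in (("with BOM", with_bom), ("without BOM", no_bom)):
        print(label,
              "поиск" in decode_like_bom_only_reader(raw),
              "поиск" in decode_as_utf8(raw))
```

Under this assumption the Russian word is lost only in the file without BOM, because its UTF-8 bytes are read as cp1251 characters. Reading the bytes as UTF-8 directly finds the word in both files, which matches what I see with utf8content:.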
Re: Search in .htm (.html) files
Glad that I could help.
FYI: it is under consideration for a next major version of Everything to bypass this iFilter behaviour for certain text-based files like xml, html and json.
That way you don't have to use the utf8content: function to find your text, but you can use the "normal" content: function instead.