viewtopic.php?f=12&t=11747
Can you expand the explanation on that?
Why use binarycontent: and not binary:, and why do we need to add hex:?
Don't these functions search what we see in a hex editor?
hex: binary: binarycontent:
Re: hex: binary: binarycontent:
binarycontent:
binarycontent: is a search function.
The file content is treated as binary.
Everything will not try to load the content with an iFilter.
Everything will not try to load the content as text/plain.
The search is treated as text.
Everything will try matching the binary content as UTF-8, ANSI, UTF-16, UTF-16 with a byte offset of 1, UTF-16BE and UTF-16BE with a byte offset of 1.
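To make the list of encodings above more concrete, here is a rough Python sketch (not Everything's actual implementation). It encodes the search text several ways and looks for each byte pattern in the raw content. The helper names are made up for illustration, and using cp1252 for "ANSI" is an assumption, since the real ANSI code page depends on the system. A plain byte-pattern search like this finds a match at any byte position, so it does not need a separate offset pass.

```python
def candidate_patterns(text):
    """Encode the search text in several encodings (hypothetical helper)."""
    encodings = {
        "utf-8": "utf-8",
        "ansi": "cp1252",          # assumption: ANSI means Windows-1252 here
        "utf-16le": "utf-16-le",
        "utf-16be": "utf-16-be",
    }
    patterns = {}
    for name, codec in encodings.items():
        try:
            patterns[name] = text.encode(codec)
        except UnicodeEncodeError:
            pass  # the text cannot be represented in this encoding
    return patterns


def matching_encodings(content, text):
    """Return the names of the encodings whose byte pattern occurs in content."""
    return [name for name, pattern in candidate_patterns(text).items()
            if pattern in content]


# "hello" stored as UTF-16LE, surrounded by unrelated bytes
data = b"\x00\x01h\x00e\x00l\x00l\x00o\x00\xff"
print(matching_encodings(data, "hello"))
```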
binary:
binary: is a search modifier.
The file content is treated as binary.
Everything will not try to load the content with an iFilter.
Everything will not try to load the content as text/plain.
The search is also treated as binary.
Both the content and the search are treated as binary byte streams.
No special search is performed other than a simple byte comparison test.
Using the regex: modifier will also treat the regex pattern as binary: \xff == 255
binary: and hex: are the same, except how they handle characters 0-9, a-f and A-F.
With binary:, 0-9, a-f and A-F are treated as ASCII character codes.
With hex:, two consecutive 0-9, a-f or A-F characters are converted to a byte value.
Using binary: might be preferable to hex: when you are trying to find the binary ASCII characters: "abc" in a file:
binary:content:abc
With hex: you would have to search for:
hex:content:616263
hex:binarycontent: is the same as hex:content:
These will both match the text you would see in a hex editor.
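The difference between binary: and hex: described above can be sketched in a few lines of Python. This only illustrates the idea and is not how Everything parses searches: term_to_bytes is a hypothetical helper, the hex branch only handles terms made up entirely of hex digit pairs, and mapping each character to one byte with latin-1 is an assumption.

```python
import re


def term_to_bytes(term, modifier):
    """Turn a search term into the bytes to look for (hypothetical helper)."""
    if modifier == "hex":
        # two consecutive hex digits become one byte value
        return bytes.fromhex(term)
    if modifier == "binary":
        # every character is taken as its own character code
        # (assumption: one byte per character, latin-1)
        return term.encode("latin-1")
    raise ValueError("unknown modifier: " + modifier)


content = b"\x00abc\xff"

print(term_to_bytes("616263", "hex") in content)     # hex:content:616263
print(term_to_bytes("abc", "binary") in content)     # binary:content:abc
print(term_to_bytes("616263", "binary") in content)  # the six ASCII digits

# A bytes regex treats \xff as the byte value 255
print(re.search(rb"\xff", content) is not None)
```

Both the hex: term 616263 and the binary: term abc end up as the same three bytes, which is why either form finds the same files.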
Re: hex: binary: binarycontent:
Thanks for the explanation!
void wrote (Sat Jul 09, 2022 8:29 am):
binary:
binary: is a search modifier.
The file content is treated as binary.
Everything will not try to load the content with an iFilter.
The search is also treated as binary.
Both the content and the search are treated as binary byte streams.
No special search is performed other than a simple byte comparison test.
[...]
binary: and hex: are the same, except how they handle characters 0-9, a-f and A-F.
There is one thing I don't understand though.
This works as expected, as each PDF file starts with %PDF (hex: 25504446):
Code: Select all
test.pdf startwith:hex:binarycontent:25504446
But the hex: doesn't seem to modify the content: function to a bytestream. I needed this to make it work as expected:
Code: Select all
test.pdf startwith:hex:fromdisk:content:25504446
(PDF files are not content-indexed by Everything on this system, although that should not matter. I think ...)
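For reference, the same "starts with %PDF" check can be done outside Everything. This is a minimal Python sketch; the function names are made up for illustration, and it only compares the first four bytes of the file.

```python
PDF_MAGIC = bytes.fromhex("25504446")  # the four bytes of "%PDF"


def has_pdf_magic(data):
    """True if the byte string starts with the PDF signature."""
    return data.startswith(PDF_MAGIC)


def file_has_pdf_magic(path):
    """Read only the first bytes of a file and check them (hypothetical helper)."""
    with open(path, "rb") as f:
        return has_pdf_magic(f.read(len(PDF_MAGIC)))


print(PDF_MAGIC)
print(has_pdf_magic(b"%PDF-1.7\n"))
```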
Not sure, but this seems more likely to me:
Everything will try matching the binary content as UTF-8, ANSI, UTF-16, UTF-16 with a byte offset of 0, UTF-16BE and UTF-16LE with an offset of 2.
Differences: offset (FEFF/FFFE) and UTF-16LE instead of BE
Re: hex: binary: binarycontent:
test.pdf startwith:hex:fromdisk:content:25504446
If you search for:
test.pdf startwith:hex:content:25504446
what are the search ops from the Everything debug console?
For example:
Code: Select all
FILE TERM START 0000000032a2a6f8 M 000000000018dc70 N 000000000018dd90
0000000032a2a6f8 20e01104 M 0000000032a2a838 N 000000000018dd90 OP 163 c:\PDFs\
0000000032a2a838 20e01100 M 0000000032a2a978 N 000000000018dd90 OP 205 pdf
0000000032a2a978 20e01140 M 000000000018dc70 N 000000000018dd90 OP 558 %PDF
Not sure, but this seems more likely to me:
Everything will try matching the binary content as UTF-8, ANSI, UTF-16, UTF-16 with a byte offset of 0, UTF-16BE and UTF-16LE with an offset of 2.
Differences: offset (FEFF/FFFE) and UTF-16LE instead of BE
Everything will try both UTF-16LE and UTF-16BE.
If there's no match, Everything will try both UTF-16LE and UTF-16BE again with a 1 byte offset.
This is because UTF-16 text in the content might not be aligned to two bytes.
Consider the following files (shown in hex):
Code: Select all
00680065006C006C006F // UTF16BE text: hello
680065006C006C006F00 // UTF16LE text: hello
FF00680065006C006C006F // UTF16BE text: hello with junk first byte 0xff
FF680065006C006C006F00 // UTF16LE text: hello with junk first byte 0xff
If your search text is all ASCII characters, Everything will do the search in one pass.
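The four example files above can be used to see why the 1 byte offset matters. The following Python sketch decodes the content as UTF-16LE and UTF-16BE, starting at byte offset 0 and again at offset 1, and reports which combination contains the text. It is only an illustration of the alignment problem under those assumptions, not Everything's code, and find_utf16 is a hypothetical name.

```python
def find_utf16(content, text):
    """Return (codec, byte_offset) pairs where text appears after decoding."""
    hits = []
    for codec in ("utf-16-le", "utf-16-be"):
        for offset in (0, 1):
            # errors="replace" keeps odd-length or invalid input from raising
            decoded = content[offset:].decode(codec, errors="replace")
            if text in decoded:
                hits.append((codec, offset))
    return hits


files = {
    "BE": bytes.fromhex("00680065006C006C006F"),
    "LE": bytes.fromhex("680065006C006C006F00"),
    "BE with junk byte": bytes.fromhex("FF00680065006C006C006F"),
    "LE with junk byte": bytes.fromhex("FF680065006C006C006F00"),
}

for name, content in files.items():
    print(name, find_utf16(content, "hello"))
```

The two files with the junk 0xff byte only match when decoding starts at offset 1, because the 16-bit units are no longer aligned to even byte positions.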
Re: hex: binary: binarycontent:
Code: Select all
search 'ext:pdf menu startwith:hex:content:25504446' filter '' sort 10 ascending 0
parse flags 00000000 type 20c00100
TERM pdf
parse flags 00000000 type 20c00100
TERM menu
parse flags 00080008 type 20c00100
TERM %PDF
FOLDER TERM START 0000000000afe280 M 0000000000afe160 N 0000000000afe280
000000000c2f9578 20e00100 M 000000000c2f97f8 N 0000000000afe280 OP 5 menu
000000000c2f97f8 20e00140 M 0000000000afe160 N 0000000000afe280 OP 378 %PDF
FILE TERM START 000000000c2fc638 M 0000000000afe160 N 0000000000afe280
000000000c2fc638 20e00100 M 000000000c2f9578 N 0000000000afe280 OP 205 pdf
000000000c2f9578 20e00100 M 000000000c2f97f8 N 0000000000afe280 OP 5 menu
000000000c2f97f8 20e00140 M 0000000000afe160 N 0000000000afe280 OP 378 %PDF
found 0 files with 2 threads in 0.013394 seconds
found 0 folders with 0 threads in 0.000002 seconds
If there's no match, Everything will try both UTF-16LE and UTF-16BE again with a 1 byte offset.
Got it. Thanks!
Re: hex: binary: binarycontent:
OP code 378 (Binary content search) is unexpected.
What version of Everything are you using?
Re: hex: binary: binarycontent:
1315a x64.
Will take a closer look tomorrow (I saw more unexpected results).
Re: hex: binary: binarycontent:
Everything 1.5.0.1316a fixes an issue with binarycontent: and hex: not using the correct search op code.