problem with content indexing
I added a 20 GB folder for content indexing. The indexing process has been stuck for the last 3 hours at: indexing properties 77%
Re: problem with content indexing
All found text will be stored in the database. If Everything is running, that means that it will all be loaded in memory.
You might be running out of RAM on your machine?
Re: problem with content indexing
Task Manager didn't show that.
I have now restarted and deleted the old db. As of now it is at 75% and moving on. Let's see; I will update.
BTW, Options/Content shows exactly which file it is working on right now, so you can check whether it is stuck or working.
Re: problem with content indexing
It finished indexing, and it worked well.
But now Everything is using 1350 MB of RAM while it is running, as the whole db is held in memory.
Re: problem with content indexing
Thank you for your feedback.
You are reaching the limits of Everything content indexing.
Content indexing in Everything is intended for a couple of hundred MB of raw text.
If you want to index 1GB+ of raw text and have 1GB of free memory, I won't stop you.
The initial content index will be slow; progress is shown in the status bar, and detailed progress is now shown in Tools -> Options -> Content.
Please consider adding more filters for which files are content indexed.
For example, limit the content to a specific folder, set file types and size limit:
- In Everything, from the Tools menu, click Options.
- Click the Content tab on the left.
- Set include only folders to a semicolon delimited list of folders, for example: c:\users\<my user name>\Documents;D:\documents
- Set the Include only filters, for example: *.docx;*.pdf
- Set a Maximum size, for example: 10 MB
- Click OK.
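Before applying filters like the ones above, it can help to estimate how much data they would select. The following is only a rough sketch in Python, not part of Everything: the function name `estimate_indexed_bytes` and the example paths are hypothetical, and it sums on-disk file sizes, which can differ a lot from the amount of raw text Everything extracts from formats such as PDF or DOCX.

```python
import fnmatch
import os


def estimate_indexed_bytes(folders, patterns, max_size_bytes):
    """Return (file_count, total_bytes) for files that pass all three filters.

    folders        -- folders to walk recursively (like "include only folders")
    patterns       -- filename wildcards such as "*.pdf" (like "include only filters")
    max_size_bytes -- files larger than this are skipped (like "maximum size")
    """
    count = 0
    total = 0
    for folder in folders:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                # Compare case-insensitively, as Windows filenames usually are.
                if not any(fnmatch.fnmatch(name.lower(), p.lower()) for p in patterns):
                    continue
                try:
                    size = os.path.getsize(os.path.join(root, name))
                except OSError:
                    continue  # file vanished or is not accessible
                if size <= max_size_bytes:
                    count += 1
                    total += size
    return count, total


if __name__ == "__main__":
    # Hypothetical folder; replace with your own.
    n, size = estimate_indexed_bytes([r"D:\documents"], ["*.docx", "*.pdf"], 10 * 1024 * 1024)
    print(f"{n} files, {size / 1024 / 1024:.1f} MB on disk")
```

If the total comes out in the gigabytes, that is a hint that the selection is probably larger than what content indexing is intended for.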
Re: problem with content indexing
I used it for a folder with a lot of PDF files that I want to be searchable. A filter will not help me in this situation, since I already chose the specific folder. From your information I now know that it's not the right tool for this.
I will back out of that, and use content indexing only for DOC files, and I'm sure even that will be very useful for me.
Re: problem with content indexing
Please be aware that if you enable content indexing, content: will only search your indexed content.
Use the notindexed: modifier to search content in your PDF folder, for example:
"d:\pdf folder\" ext:pdf notindexed:content:"text to search"
Re: problem with content indexing
Yes, I see that in your original post.
Is it possible to use a separate extension filter depending on the folder?
For example: from folder c:/user/files, index pdf and doc;
and from folder c:/user/otherfiles, index only doc.
Re: problem with content indexing
void wrote: ↑Mon Mar 15, 2021 12:25 am
"If you have a fast NVMe drive, instead of using content indexing, consider using faster content searching."
Can you please give some more detail on that? Can this affect the speed of the PC in general?
Re: problem with content indexing
"Is it possible to use a filter for separate EXT depend on folder?"
Yes, please try the following:
Leave include only folders blank.
Set include only files to:
c:\user\files\**.pdf;c:\user\files\**.doc;c:\user\otherfiles\**.doc
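The include only files value above follows a regular shape: one `folder\**.ext` entry per folder and extension, joined with semicolons. For longer lists, a small helper can generate it. This is a minimal sketch in Python; `build_include_only_files` is a hypothetical name, and it simply reproduces the pattern from the example above without checking it against Everything.

```python
def build_include_only_files(folder_exts):
    """Build a semicolon-delimited filter list from {folder: [extensions]}."""
    parts = []
    for folder, exts in folder_exts.items():
        base = folder.rstrip("\\")  # avoid a doubled backslash before **
        for ext in exts:
            parts.append(f"{base}\\**.{ext}")
    return ";".join(parts)


rules = {
    r"c:\user\files": ["pdf", "doc"],
    r"c:\user\otherfiles": ["doc"],
}
print(build_include_only_files(rules))
# prints: c:\user\files\**.pdf;c:\user\files\**.doc;c:\user\otherfiles\**.doc
```

The printed string can be pasted into the include only files box.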
"If you have a fast NVMe drive, instead of using content indexing, consider using faster content searching."
"Can you please give some more detail on that, can this effect the speed of the pc on general?"
Enabling /no_incur_seek_penalty_multithreaded=1 will only affect search performance in Everything.
Most NVMe SSDs can read at 3000+ MB/s, with /no_incur_seek_penalty_multithreaded=1 Everything can read content at this speed, which makes content indexing moot.
Enabling /no_incur_seek_penalty_multithreaded=1 for normal SSDs will also increase search performance (just not to the extent with NVMe).
Enabling /no_incur_seek_penalty_multithreaded=1 will make no difference for HDDs.
/no_incur_seek_penalty_multithreaded=1 is not enabled by default because it can be very demanding on the system.
I will consider enabling it by default with more testing.
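To put the read speeds in perspective, here is a back-of-the-envelope calculation in Python. It is only a sketch: the 3000 MB/s figure is the one quoted above, the 500 MB/s figure for a SATA SSD is an assumption for illustration, and the result is a best-case lower bound that ignores the time spent parsing file formats.

```python
def full_scan_seconds(total_gb, read_mb_per_s):
    """Best-case time to read total_gb of files once at a sustained read speed."""
    return total_gb * 1000 / read_mb_per_s  # 1 GB taken as 1000 MB


for label, speed in [("NVMe SSD (figure from this thread)", 3000),
                     ("SATA SSD (assumed 500 MB/s)", 500)]:
    print(f"20 GB on {label}: about {full_scan_seconds(20, speed):.1f} s")
```

Under these assumptions, a full read of the 20 GB folder from the first post takes a few seconds on NVMe, which is the reason an in-memory content index may not be needed there.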
Re: problem with content indexing
Everything will load all indexed content in memory, so you will need a system with *a lot* of RAM for 50GB of documents.
But technically it is possible ..
Re: problem with content indexing
I'm double checking here to make sure we're on the same page. Do you want to index just the metadata of these documents? Or do you want to search the entire contents of 50 gigabytes of documents, over a network, without keeping a local copy of 50 gigabytes of documents?
You can certainly index the metadata (date modified, author, etc) of the documents, but if you want to search the documents themselves you will need to keep a local copy of those documents in your own possession.
Re: problem with content indexing
I highly recommend storing your text content on a local NVMe SSD drive.
Everything will max out your NVMe SSD read speeds (3000+ MB/s)
Consider using Everything content indexing if you want instant content searching and have the free ram available.
Last edited by void on Thu Mar 17, 2022 9:10 am, edited 1 time in total.
Reason: *free ram available
Re: problem with content indexing
Currently I use Archivarius 3000 for content indexing. The index there is about 15 GB in size, but is not loaded into RAM. Still, the search results are instantaneous in most cases, although not in Everything's FAYT style.
Archivarius 3000 also doesn't need access to the network drive for searching, since the entire raw text is stored in the index on a local SSD.
However, I don't fully understand your comments regarding my network drive -- especially not Void's, who seems to point to a way to have instant content search and still have low RAM usage?
Re: problem with content indexing
To clarify,
Consider using Everything content indexing if you want instant content searching at the cost of high RAM usage.
Re: problem with content indexing
OK thanks. Since I would probably need dozens of gigabytes of RAM for my amount of data, content indexing is probably out of the question for the time being.
Re: problem with content indexing
You can also query the Windows index with Everything using
si:your_search
Re: problem with content indexing
OK thanks.
I just tried and started indexing only the MSG files for content. Instantly, CPU load jumped to 100%, and after a few minutes, Everything already used 16GB RAM.
I managed to stop indexing and disable content indexing in the settings, but my PC still crashed afterwards. I.e. it still runs and stuff can be accessed via network, but all screens have gone black.
Anyway, I guess I'll keep Archivarius 3000 around for content indexing for the time being (unfortunately, it seems to be abandoned), and use Everything for what it does best: blazing fast filename search.
Re: problem with content indexing
One more question regarding Content Indexing: Is the content index also saved to disk when shutting down Everything, and later loaded from disk to RAM again?
I would say that this makes even more sense than with the file name index, however could not observe such behavior so far?
And when content is indexed, you'd still have to use
content:[search term]
in order to search the content, correct?
Re: problem with content indexing
David.P wrote: ↑Sun Mar 20, 2022 4:21 pm
"Is the content index also saved to disk when shutting down Everything, and later loaded from disk to RAM again?"
1. Of course the content index is saved with the database.
"And when content is indexed, you'd still have to use content:[search term] in order to search the content, correct?"
2. You can define any shortname or macro to search for content.
I use a filter which defines a macro to search for name and content together.
Example:
fc:mysearch
Re: problem with content indexing
OK thanks. It seems however that ET 1.5a always scans everything again first when started, which seems to take hours in my case.
However, I am running in non-admin, non-service mode because I have no admin rights on the machine. Additionally, I'm scanning a network drive with ~50GB of data over a VPN connection.
Re: problem with content indexing
David.P wrote: ↑Sun Mar 20, 2022 5:49 pm
"It seems however that ET 1.5a always scans everything again first when started, which seems to take hours in my case. However, I am running in non-admin, non-service mode because I have no admin rights on the machine. Additionally, I'm scanning a network drive with ~50GB of data over a VPN connection."
Running without the service and without admin rights prevents Everything from scanning NTFS drives before you allow it to start.
Remove the local drives from the Indexes NTFS tab
and use Folder indexing in this case.
Set the schedule to manual.
For the VPN connected systems it would be much better to run Everything (Server or ETP)
on this machine and connect to it with your client.
Re: problem with content indexing
Thanks again.
I only use "Folders" and "Network Drives" under the "Indexes" settings.
Later, I might be able to run ET on the server, at the moment this is only for trying out.
Re: problem with content indexing
I have now tried and created a content index of the following file types: *.msg; *.doc; *.docx; *.txt -- limiting the maximum file size to 2 MB.
Everything then found about 200,000 files and indexed their content. While the index was being created, memory usage went up to about 14 GB. However, after closing and restarting Everything, its RAM usage remains at about 1 GB, which also corresponds approximately to the size of the Everything-1.5a.db file.
Is this expected behavior?
Is there a way to create a content index with reduced RAM usage (during the creation)?
If so, I could possibly have (almost) all files of the type mentioned initially content-indexed, without that size limitation.
Re: problem with content indexing
Oh wow, I just realized that Everything can also do a combined search like so:
si:[content search term] [file name portion]
That is AWESOME! So I don't need Everything to index any content at all, and still have full access to the content in Everything searches, by "side-loading" the Windows Index.
Incredible.
Re: problem with content indexing
I type only
fc:searchtext
and it finds content or file names with the searchtext.
To do this I have a filter which defines the fc macro. You can make a similar one including the system index.
Re: problem with content indexing
Thanks!
I can't believe how awesome this tool is.
Re: problem with content indexing
Hi David,
Did you solve your problems with content search? I have the same problem: I have >100 GB of PDFs that I want to search, but of course I cannot do that with a regular amount of RAM, so I need a database solution. Do you still use Archivarius?
Re: problem with content indexing
Yes I still use Archivarius.
FileLocator Pro aka Agent Ransack could be a possible successor, but unfortunately still doesn't support multicolor highlighting of search term occurrences.