Index html and htm files in subfolders?

Discussion related to "Everything" 1.5 Alpha.
Biff
Posts: 1154
Joined: Mon May 25, 2015 7:09 am

Post by Biff »

How would I have to adapt the filter so that Everything also indexes the content of html and htm files in this folder, I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\, in its subfolders, and perhaps in other folders as well?

*.doc;*.docx;*.pdf;*.txt;*.xls;*.xlsx;*.ods;*.odt;*.ott;*.scrivx;*.csv;*.ics;*.rtf;*.eml;regex:^I:\\Eigene Dateien\\Notepad - Ansammlungen txt-Dateien\\[^.]*$


Can Everything index only the text of an html or htm page that a visitor sees, not the code?
void
Developer
Posts: 16665
Joined: Fri Oct 16, 2009 11:31 pm

Re: Index html and htm files in subfolders?

Post by void »

"How would I have to adapt the filter so that Everything also indexes the content of html and htm files in this folder, I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\, in its subfolders, and perhaps in other folders as well?"
Add the following to your Include only files setting:

I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.html;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.htm




To include multiple folders, please try:

I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.html;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.htm;C:\Another folder\**.html;C:\Another folder\**.htm
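If the list of folders grows, typing the filter by hand gets error-prone. Here is a minimal Python sketch that assembles the same kind of string from a list of folders and extensions. The helper name build_include_filter is made up for illustration and is not part of Everything. It only joins one folder\**.ext pattern per folder and extension with semicolons, as in the filters above.

```python
def build_include_filter(folders, extensions):
    """Join one '<folder>\\**.<ext>' pattern per folder/extension pair with ';'."""
    patterns = []
    for folder in folders:
        base = folder.rstrip("\\")  # avoid a doubled backslash before '**'
        for ext in extensions:
            patterns.append(f"{base}\\**.{ext}")
    return ";".join(patterns)


folders = [
    r"I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien",
    r"C:\Another folder",
]
include_filter = build_include_filter(folders, ["html", "htm"])
print(include_filter)
```

The printed string can then be pasted into the Include only files setting, after any patterns that are already there.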



"Can Everything index only the text of an html or htm page that a visitor sees, not the code?"
There are a couple of ways to do this:

1). Disable content_builtin_text_plain_handler under Tools -> Options -> Advanced.
Then select your html/htm files and press Ctrl + F5 to reindex their content.

-or-

2). Remove html/htm from the Everything built-in list of extensions:
  • Type in the following search and press ENTER:
    about:config
  • Change the following line:

    text_plain_extensions=a;ans;asc;ascx;asm;asp;aspx;asx;bas;bat;bcp;btm;c;cc;cls;cmd;contact;cpp;cs;csa;csproj;css;csv;cxx;dbs;def;dic;dos;dsp;dsw;efu;ext;faq;fky;h;hhc;hpp;hta;htm;html;htt;htw;htx;hxx;i;ibq;ics;idl;idq;inc;inf;ini;inl;inx;jav;java;js;json;kci;lgn;lst;lua;m3u;mak;mk;odc;odh;odl;php;pl;prc;ps1xml;py;rc;rc2;rct;reg;rgs;rul;s;scc;shtm;shtml;sol;sql;srf;stm;tab;tdl;tlh;tli;trg;txt;udf;udt;user;usr;vbproj;vbs;vcproj;viw;vspscc;vsscc;vssscc;wri;wtx;xml;xsd;xsl;xslt
    to:

    text_plain_extensions=a;ans;asc;ascx;asm;asp;aspx;asx;bas;bat;bcp;btm;c;cc;cls;cmd;contact;cpp;cs;csa;csproj;css;csv;cxx;dbs;def;dic;dos;dsp;dsw;efu;ext;faq;fky;h;hhc;hpp;hta;htt;htw;htx;hxx;i;ibq;ics;idl;idq;inc;inf;ini;inl;inx;jav;java;js;json;kci;lgn;lst;lua;m3u;mak;mk;odc;odh;odl;php;pl;prc;ps1xml;py;rc;rc2;rct;reg;rgs;rul;s;scc;shtm;shtml;sol;sql;srf;stm;tab;tdl;tlh;tli;trg;txt;udf;udt;user;usr;vbproj;vbs;vcproj;viw;vspscc;vsscc;vssscc;wri;wtx;xml;xsd;xsl;xslt
    (that is, remove htm;html from the list)
  • Save changes and exit Notepad
  • Accept the prompt in Everything to reload your config.
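
Editing such a long line by hand makes it easy to delete a neighbouring entry by accident. As a rough sketch only (the function name remove_extensions is made up, and the sample value is shortened for illustration), the edited value could be produced in Python like this and then pasted back into the config:

```python
def remove_extensions(value, to_remove):
    """Drop the given extensions from a ';'-separated list, keeping the order."""
    drop = {ext.lower() for ext in to_remove}
    return ";".join(ext for ext in value.split(";") if ext.lower() not in drop)


# Shortened sample of the setting's value, for illustration only.
current = "hpp;hta;htm;html;htt;htw;shtm;shtml"
new_value = remove_extensions(current, ["htm", "html"])
print("text_plain_extensions=" + new_value)
```

Only exact entries are removed, so shtm and shtml stay in the list, just as in the edited line above.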
Biff

Re: Index html and htm files in subfolders?

Post by Biff »

Thank you very much!

It seems this filter

*.doc;*.docx;*.pdf;*.txt;*.xls;*.xlsx;*.ods;*.odt;*.ott;*.scrivx;*.csv;*.ics;*.rtf;*.eml;regex:^I:\\Eigene Dateien\\Notepad - Ansammlungen txt-Dateien\\[^.]*$;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.html;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.htm
lets Everything index the content of these file types:
*.doc;*.docx;*.pdf;*.txt;*.xls;*.xlsx;*.ods;*.odt;*.ott;*.scrivx;*.csv;*.ics;*.rtf;*.eml
as well as the content of html and htm files, and of files without an extension, in the folder
"Notepad - Ansammlungen txt-Dateien" and all of its subfolders.

The results also show an html file in the recycle bin.

Is this how it should be? Why is the html file in the recycle bin shown/indexed, or rather kept in the index (which isn't bad)?



So I would not need this?

I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.html;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.htm;C:\Another folder\**.html;C:\Another folder\**.htm
Or what is this part good for?
void

Re: Index html and htm files in subfolders?

Post by void »

"Is this how it should be?"
Yes.
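
For the regex part of the filter, a quick way to see why it picks up files without an extension in that folder and its subfolders is to try it against a few paths. The sketch below uses Python's re module as a stand-in, which is an assumption: Everything's own regex engine and matching options may differ in details. The file names are hypothetical.

```python
import re

# The regex part of the filter, copied verbatim into a raw string.
pattern = re.compile(
    r"^I:\\Eigene Dateien\\Notepad - Ansammlungen txt-Dateien\\[^.]*$"
)

base = "I:\\Eigene Dateien\\Notepad - Ansammlungen txt-Dateien\\"


def matches(name):
    """Return True if the full path base + name matches the pattern."""
    return pattern.match(base + name) is not None


# Hypothetical file names, chosen only to exercise the pattern.
for name in ["notes", "sub\\todo", "page.html", "v1.2\\readme"]:
    print(f"{name!r}: {matches(name)}")
```

Because [^.]* also accepts backslashes, the first two samples match, including the one in a subfolder. The last two do not match, since a dot anywhere after the base folder (in the file name or in a subfolder name) stops the match.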


"Why is the html file in the recycle bin shown/indexed, or rather kept in the index (which isn't bad)?"
This might be from an old content index.
Please wait until Everything finishes indexing content.
Progress is shown in the status bar on the right.
The content for this file will eventually be removed.


"So I would not need this?

I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.html;I:\Eigene Dateien\Notepad - Ansammlungen txt-Dateien\**.htm;C:\Another folder\**.html;C:\Another folder\**.htm
Or what is this part good for?"
It's not needed unless you want to index html/htm content in other folders.
Biff

Re: Index html and htm files in subfolders?

Post by Biff »

"This might be from an old content index."
So Everything keeps files in the index until the new indexing run is finished, even though they are in a folder (or in the recycle bin) where they should not be indexed?
"It's not needed unless you wanted to index html/htm content in other folders."
Ah, so that is just additional code I could adapt for any other folder. It was not meant for that particular folder.