This is probably best explained with output from a command window. The first three lines show what happens when ES is run normally, and this is exactly what I expect to see. However, note what happens to "prüfung" (with diacritics) when the output is written to a text file with -export-txt.
c:\test>es.exe prüfung -full-path-and-name
C:\test\prufung.txt
C:\test\prüfung.txt
c:\test>es.exe prüfung -full-path-and-name -export-txt exported.txt
c:\test>type exported.txt
C:\test\prufung.txt
C:\test\pr├╝fung.txt
c:\test>
My use case is to import the output from ES in order to create a Directory Opus collection. I can make this work by redirecting ES output to the clipboard and processing from there, but I would prefer to be able to rely on output written to an intermediate file.
es.exe is dated 27/06/2018
ES -export-txt with non-ASCII
Re: ES -export-txt with non-ASCII
ES exports text as UTF-8.
If you are ok with the active console code page, please try redirecting output to a file instead of using the -export command line options.
For example:
es.exe prüfung -full-path-and-name > exported.txt
Characters not supported by the active code page will be displayed as ?
To change the console's active code page, please see:
https://ss64.com/nt/chcp.html
Added to the ES help:
UTF-8 encoding is used for exporting as txt.
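The garbled line in the first post is consistent with UTF-8 bytes being displayed through an OEM console code page. Here is a minimal Python sketch of that effect. It assumes the console was on code page 437 (the thread does not state the active code page), and as_seen_in_console is a hypothetical helper name used only for illustration:

```python
# Sketch: reproduce the garbling seen when `type` shows a UTF-8 file.
# Assumption: the console code page is 437; the thread does not say which one was active.

def as_seen_in_console(text, console_codepage="cp437"):
    """Encode text as UTF-8, then decode those bytes the way the console would."""
    utf8_bytes = text.encode("utf-8")            # the bytes an UTF-8 export contains
    return utf8_bytes.decode(console_codepage)   # how a cp437 console renders them


if __name__ == "__main__":
    # "ü" is the two bytes C3 BC in UTF-8; cp437 shows them as two box-drawing characters.
    print(as_seen_in_console("C:\\test\\prüfung.txt"))
```

The file itself is intact in this scenario; only the display is misleading, which is why importing it as UTF-8 recovers the original name.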
Re: ES -export-txt with non-ASCII
Thanks for clarifying. There is no BOM on the exported text file so by default (for my use case) it is not recognised as such. However, I can force the import to assume UTF-8 and that works. Would you consider adding a BOM, or an option to do so?
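For anyone importing the export from a script rather than from Directory Opus, forcing UTF-8 on read can be sketched in Python as below. The function name read_es_export is hypothetical, and the sketch assumes one path per line as in the output shown earlier. The "utf-8-sig" codec is a standard Python codec that decodes UTF-8 and skips a leading BOM if one happens to be present, so the same reader works either way:

```python
# Sketch: read an ES text export as UTF-8 whether or not it starts with a BOM.
# Assumption: the export holds one path per line.

def read_es_export(path):
    """Return the non-empty lines of the export as a list of strings."""
    with open(path, encoding="utf-8-sig") as f:   # "utf-8-sig" drops a leading BOM if present
        return [line for line in f.read().splitlines() if line]
```

A call such as read_es_export("exported.txt") would then give a list of paths with "prüfung" intact.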
Re: ES -export-txt with non-ASCII
I'll add an option to do so.
However, it will be off by default as the UTF-8 spec doesn't recommend using the BOM.
Thanks for the suggestion.
Re: ES -export-txt with non-ASCII
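You could also use PowerShell to convert a UTF-8 file to a UTF-8 file with a BOM. In Windows PowerShell 5.1, pass -Encoding UTF8 to Get-Content as well, otherwise the BOM-less file is read using the system ANSI code page and non-ASCII characters are corrupted. In PowerShell 6 and later, -Encoding UTF8 writes no BOM, so use -Encoding utf8BOM with Set-Content there.
(Get-Content -Encoding UTF8 .\exported.txt) | Set-Content -Encoding UTF8 .\exported.txt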
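If PowerShell is not convenient, the same conversion can be sketched in Python. This is only an illustration: ensure_utf8_bom is a hypothetical helper name, and it assumes the file is already valid UTF-8 (it works on raw bytes and does not re-encode anything). It leaves the file alone if a BOM is already there, so it is safe to run more than once:

```python
import codecs


def ensure_utf8_bom(path):
    """Prepend a UTF-8 BOM to the file if it does not already start with one.

    Returns True if the file was modified, False if it already had a BOM.
    Assumption: the file content is already UTF-8; bytes are copied unchanged.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):   # BOM_UTF8 is the bytes EF BB BF
        return False
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True
```

Working on bytes avoids the decoding pitfalls of a read-then-write round trip through text.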
Re: ES -export-txt with non-ASCII
ES-1.1.0.25 adds a -utf8-bom command line option to write a UTF-8 byte order mark at the start of the export file.