Search string character encoding

Scanner
15 discussion posts
Just found FileSeek (FS) and it looks like a promising solution for my need.
I would like to use FS to scan image files (mostly JPGs) for text strings in metadata.
Some of these strings are plain ASCII, others are Unicode, UTF-16, UTF-8, DBCS or possibly other character set encodings. (Not all metadata editors follow the specs.)
I have read through (most of) the other discussions on this topic, and I understand that I can try to specify an encoding in the advanced settings. That still involves guessing at the exact spelling, and even a correctly spelled encoding may not be available.
Could FS supply a list of the supported encodings, for example by making the "Custom file Encoding" setting a drop-down of built-in encodings?
If not, could FS report a misspelled or unsupported encoding as soon as I close 'Settings', rather than waiting until I try to use it?
When I try a bad encoding, the error message points me to the 'documentation' and provides a link, but following the link lands me at the top of a web page, where I am lost because I am expected to run a new search for the topic. What is worse, the error message text cannot be highlighted and copied to the clipboard for easier searching.
TIA for any help.
Sep 29, 2020  • #1
Keith Lammers (BFS)'s profile on WallpaperFusion.com
I'm not sure that FileSeek can search the metadata. How do you normally view the metadata for one of these images? Are you using another utility?
Sep 30, 2020  • #2
Scanner
15 discussion posts
I use a number of utilities to view the data.
Unfortunately, I have found that not all programs, free & otherwise, which claim to be usable to view/edit/display JPG metadata stick to the specifications closely enough. Some are very selective about what they show or handle. (My main interest in all of this is genealogical metadata.)
Hence most of my work looking for data has been with hex editors, but they are also usually limited in their search options and typically allow searching only within a single file.
That is where utilities such as FileSeek look promising. Searching for JPG metadata normally involves only a (relatively small) number of special character encodings, if one sticks to the original spec. But I have found many metadata editors which seem oblivious to the specs. This comes partly from a fault with the developers, but also from misunderstood specs/expectations and usage by users.
Being able to select the character encoding with reasonable ease would make the job a lot easier. Support for at least the more common formats besides plain ASCII, such as UTF-8, UTF-16 and M$'s wide character sets, would be a big help.
I quite understand that supporting arbitrary encodings might be a challenge and I don't really expect that.
Still, from various forums, I have found that others are struggling with similar issues, and a lot of their problems revolve around the numerous encodings used to handle the many European languages.
HTH
Sep 30, 2020  • #3
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Ok, thanks for clarifying! I will check in with our developers to see if there's a list somewhere of what encoding values can be used in that advanced setting.
Oct 2, 2020  • #4
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Ok, you can actually set multiple encoding values in that setting, separated by a comma. The full list is here; make sure to use the value from the "Name" field in the advanced setting: https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding?view=netframework-4.6.2#list-of-encodings

When you specify multiple encodings, FileSeek will search the file once for each encoding, so if you specify 3 encodings, it will search the file 3 separate times.
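
To make the "one pass per encoding" idea concrete, here is a minimal Python sketch. It is not FileSeek's code (FileSeek is a .NET application and its internals are not shown in this thread); the function name `find_in_encodings` and the sample bytes are made up for illustration, and the encoding names are Python codec names, which do not always match the .NET "Name" values.

```python
def find_in_encodings(data: bytes, query: str, encodings: list[str]) -> dict[str, int]:
    """Return the first byte offset of `query` for each encoding that yields a hit."""
    hits = {}
    for name in encodings:
        try:
            # Encode the query the same way the file's text may have been stored.
            needle = query.encode(name)
        except (LookupError, UnicodeEncodeError):
            # Unknown codec name, or the query cannot be represented in it.
            continue
        offset = data.find(needle)  # one full scan of the data per encoding
        if offset != -1:
            hits[name] = offset
    return hits


# Hypothetical stand-in for a file's raw bytes: the same word stored twice,
# once as ASCII and once as UTF-16 little-endian (no byte order mark).
sample = b"\xff\xd8" + "Ducky".encode("ascii") + b"\x00" + "Ducky".encode("utf-16-le")

print(find_in_encodings(sample, "Ducky", ["ascii", "utf-8", "utf-16-le"]))
```

For pure ASCII text the ASCII and UTF-8 byte sequences are identical, so both report the same offset; only the UTF-16 form needs its own pass. For a real file, the bytes could be read with `pathlib.Path(...).read_bytes()`.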

Hope that helps!
Oct 7, 2020  • #5
Scanner
15 discussion posts
Hi Keith,
thank you for the information and my apologies for being slow to reply.
The encoding names look useful, but I still find FS rather confusing: when will it search for the query string in the file name, and when will it look for text inside the files, assuming I specify a file pattern in the 'Include files' box?
Somehow, I have lost the option to search inside files, which seemed to work the first time I tried it.

Also, FWIW, if FS finds a JPG file which contains the query string, it will show the corresponding image in the box at the lower right, but it will not clear that field the next time I start a new query.
Oct 9, 2020  • #6
Keith Lammers (BFS)'s profile on WallpaperFusion.com
The file contents will always be searched with the text in the Query box. On the Advanced tab, there are options to enable/disable also searching the file/folder name with the text from the Query box.

Thanks for the heads up on the image preview not clearing when a new search is started. I've added that to our list to fix up.

Thanks!
Oct 15, 2020  • #7
Scanner
15 discussion posts
Good to know that the contents are always searched.
Still, FS does not seem to find things that the old AgentRansack can find for me. While I know nothing of the search algorithm for either app, nor what the built-in character set option is set to for AR, for my needs AR seems more likely to give me the results expected.
Oct 15, 2020  • #8
Keith Lammers (BFS)'s profile on WallpaperFusion.com
FileSeek just does a straight plaintext search on the file contents unless there's a file handler for the type (e.g. .docx or .pdf). What kind of info are you finding with AgentRansack that you're not able to locate with FileSeek?
Oct 16, 2020  • #9
Scanner
15 discussion posts
Well, I have no idea how it searches, but I am attaching some screenshots and a test file for you.
As you can see, FS does not find anything in the one test file.
If I have missed an option, which would fix this, please let me know.
FWIW, these strings are part of the JPG 'guts' and are in ASCII, for the most part, though some other strings of interest are in either UTF-8 or multi-byte MBCS, possibly even UTF-16, though I have not been able to fully investigate that part.
• Attachment [protected]: FS_AR_Ducky-Screenshot - 2020-10-16 , 2_37_59 PM.png [51,263 bytes]
• Attachment [protected]: FS_FS_Ducky-1-Screenshot - 2020-10-16 , 2_37_59 PM.png [40,879 bytes]
• Attachment [protected]: FS_FS_Ducky-2-Screenshot - 2020-10-16 , 2_37_59 PM.png [36,493 bytes]
• Attachment [protected]: smiley1.jpg [2,909 bytes]
Oct 16, 2020  • #10
Keith Lammers (BFS)'s profile on WallpaperFusion.com
I'm able to find that string in the smiley1.jpg that you attached if I disable the "Process File Contents Using File Handlers" option on the Advanced tab. Does that work for you?
Oct 20, 2020  • #11
Scanner
15 discussion posts
Afraid not.
My 'real' test case has a number of similar files in a directory tree, including that smiley file in several branches, all containing that string, with one of the sub-directories also named Ducky.
FS only finds the sub-directory on my system; if I rename the sub-directory, it does not even find that.
As a test, I also cleared out the field in Custom encoding, but it made no difference. It had been "UTF-8;us-ascii", I believe.
Oct 21, 2020  • #12
Keith Lammers (BFS)'s profile on WallpaperFusion.com
I noticed in your screenshots that the Query box has its mode set to "Exact Match". Could you try changing that to "Query is Text Query"?
Oct 22, 2020  • #13
Scanner
15 discussion posts
Nothing found for any value in that field.
According to AR, there are 9 files in that tree - and I know the results from AR are correct, because these really are only the same files in different places on the branches of the tree root
Oct 22, 2020 (modified Oct 22, 2020)  • #14
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Can you send me a screenshot of the FileSeek window again after you've tried the search?
Oct 23, 2020  • #15
Scanner
15 discussion posts
Any specific 'page(s)'?
Oct 24, 2020  • #16
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Just the Search tab is fine
Oct 26, 2020  • #17
Scanner
15 discussion posts
see the attached
• Attachment [protected]: FS_AR_Ducky-Screenshot.png [33,288 bytes]
Oct 26, 2020  • #18
Keith Lammers (BFS)'s profile on WallpaperFusion.com
That all looks fine, but the tab for the search looks like it hasn't run yet. Did it run at all?
Oct 26, 2020  • #19
Scanner
15 discussion posts
Not in that particular instance, but it looks the same before the run as it does after
Oct 26, 2020  • #20
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Oh! Your Include Files is set to "Text Query", which doesn't support wildcards. You can either use just .jpg in Text Query mode, or you can change it to Pattern and use *.jpg
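
The difference between the two modes can be sketched in Python as follows. This is an illustration only, under the assumption that a text query behaves like a plain substring test and that Pattern mode behaves like a glob; FileSeek's actual matching rules are not documented in this thread, and the helper names are hypothetical.

```python
from fnmatch import fnmatch


def matches_text(filename: str, query: str) -> bool:
    # Plain substring test: "*" is just an ordinary character here.
    return query.lower() in filename.lower()


def matches_pattern(filename: str, pattern: str) -> bool:
    # Glob-style test: "*" matches any run of characters.
    return fnmatch(filename.lower(), pattern.lower())


name = "smiley1.jpg"
print(matches_text(name, "*.jpg"))     # wildcard typed into a substring search
print(matches_text(name, ".jpg"))      # plain substring
print(matches_pattern(name, "*.jpg"))  # wildcard in pattern mode
```

Under these assumptions the first call fails because no file name contains a literal asterisk, which matches the symptom of an include filter that silently excludes every file.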
Oct 28, 2020  • #21
Scanner
15 discussion posts
Thank you, Keith,
Either one of those does make a big difference, and all instances are found.
With this 'adjustment', even my search for a UTF-8 string was successful, though I had to try several times to get the proper string in the Custom file Encoding Value.
This is a place where more information in the FAQ, or even better a drop-down of valid strings in the dialog that edits this value, would be very helpful. It would save time & frustration by making guessing (& cussing) unnecessary.
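
As a rough illustration of the kind of up-front check being asked for, the Python sketch below splits a comma-separated setting value and reports which names resolve to a known codec. It uses Python's codec registry, whose names and aliases differ in places from the .NET encoding names FileSeek expects, and the function name `check_encoding_names` is hypothetical.

```python
import codecs


def check_encoding_names(value: str, sep: str = ",") -> tuple[list[str], list[str]]:
    """Split a setting value and sort the names into (valid, invalid)."""
    valid, invalid = [], []
    for raw in value.split(sep):
        name = raw.strip()
        if not name:
            continue  # ignore empty entries such as a trailing separator
        try:
            codecs.lookup(name)  # raises LookupError for unknown names
            valid.append(name)
        except LookupError:
            invalid.append(name)
    return valid, invalid


valid, invalid = check_encoding_names("utf-8, us-ascii, utf-16, utf8x")
print("valid:", valid)
print("invalid:", invalid)
```

A check like this, run when the settings dialog closes, would surface a misspelled name immediately instead of at search time.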
Oct 28, 2020  • #22
Keith Lammers (BFS)'s profile on WallpaperFusion.com
Glad to hear it! It's not a commonly used advanced setting, but we'll look into adding a better list of the available values to the online help section for it.

Thanks!
Oct 28, 2020  • #23
Scanner
15 discussion posts
The main reason for my need is to be able to find strings in languages other than plain ASCII English.
Even listing the options in the FAQ would be a significant help.
If not a list, a reference to an acceptable format for this option.
Is it possible to enter a string as a sequence of either hex 0x?? bytes or other representations, say u+0123?
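
For reference, the byte and code point forms mentioned here are easy to produce outside FileSeek, for example to paste into a hex editor's search box. The Python sketch below is only an illustration; the helper names `hex_forms` and `code_points` are made up, and the encoding names are Python's.

```python
def hex_forms(text: str, encodings: list[str]) -> dict[str, str]:
    """Show the bytes of `text` as space-separated hex for each encoding."""
    return {name: text.encode(name).hex(" ") for name in encodings}


def code_points(text: str) -> list[str]:
    """Show each character of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


word = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
print(code_points(word))
print(hex_forms(word, ["utf-8", "utf-16-le", "latin-1"]))
```

The same character comes out as two bytes in UTF-8, two different bytes in UTF-16, and a single byte in Latin-1, which is why a search has to know the encoding before it can find the string.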

FWIW, I have just purchased a personal pro license.
Thank you for sticking with me on resolving this problem.
Oct 28, 2020  • #24
Keith Lammers (BFS)'s profile on WallpaperFusion.com
No worries, glad I could help, and thank you for your purchase! As far as I know, it's not possible to enter searches in hex or other representations.
Oct 28, 2020  • #25
Scanner
15 discussion posts
One can always hope
Oct 28, 2020  • #26
Owen Muhlethaler (BFS)'s profile on WallpaperFusion.com
Hello,

We've released a new version of FileSeek that should have this issue fixed up. Please let us know if you run into any more troubles after updating.

Thanks!
Apr 22, 2021  • #27