Matt Cameron
11 discussion posts
FileSeek is a wonderful program, enough so that I convinced the powers that be to run a site license here. I am using it primarily to locate credit card information within our network.

Using the iFilters and Filter Packs is, in theory, the way I need to go to scrub out erroneous data that I am not looking for, such as metadata for Office files and PDFs. I have installed the Adobe and Office filters from the links provided; however, it does not look like they are working.

I read that for PDF filters on a 32-bit OS you might need to install Adobe Reader, which I have done.
http://www.documentlocator.com/support/get-ifilters.htm

Either way, between the Adobe filter listed on the FAQ and Adobe Reader, I would have expected the files to be parsed properly. If I look up File Handlers in FileSeek, I see that the handler for .pdf is query.dll, when it should be AcroRdIF.dll. (Even though the Microsoft Filter Pack is installed, .xlsx is also pointing to query.dll.)

It's not the program's fault, but do you know a way that I can manually change the handler? I think this information is derived from HKLM\Software\Classes\{extension}\PersistentHandler, but I'm not sure.

Also, how often does the handler list attempt to refresh itself? I have noticed that some extensions seem to disappear and reappear.
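For anyone who wants to check the registration by hand, here is a minimal Python sketch of the lookup described above. It assumes the usual Windows IFilter layout (extension key, then PersistentHandler GUID, then the IFilter entry under PersistentAddinsRegistered, then InprocServer32); that chain and the helper names `persistent_handler_key`, `read_default`, and `ifilter_dll` are illustrative assumptions, not something FileSeek documents. It only reads values and changes nothing.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Interface ID of IFilter, used as a subkey name under PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def persistent_handler_key(extension):
    """Build the HKLM subkey that holds the PersistentHandler for an extension."""
    ext = extension if extension.startswith(".") else "." + extension
    return "\\".join(["Software", "Classes", ext, "PersistentHandler"])


def read_default(subkey):
    """Return the default value of an HKLM subkey, or None if it is missing."""
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
            value, _ = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        return None


def ifilter_dll(extension):
    """Follow extension -> handler GUID -> filter CLSID -> DLL path (assumed layout)."""
    if winreg is None:
        return None  # not running on Windows
    handler = read_default(persistent_handler_key(extension))
    if not handler:
        return None
    base = "Software\\Classes\\CLSID\\"
    clsid = read_default(base + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER)
    if not clsid:
        return None
    return read_default(base + clsid + "\\InprocServer32")


if __name__ == "__main__":
    for ext in (".pdf", ".xlsx"):
        print(ext, "->", ifilter_dll(ext))
```

If the printed path for .pdf is query.dll rather than AcroRdIF.dll, that would match what the handler list shows. Note that a 32-bit Python on 64-bit Windows sees the redirected 32-bit registry view, so the result may differ from what a 64-bit program sees.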

My list of handlers is attached for your viewing pleasure.
• Attachment [protected]: export.txt [37,872 bytes]
Jun 5, 2013 (modified Jun 5, 2013)  • #1
Keith Lammers (BFS)
Awesome, glad to hear you like it!

There's actually a bug in the current version that messes up the File Handler list in the Settings, but that will be fixed in the next beta.

However, the file handlers should still be working if you have them installed. Could you check the Advanced tab for your search to make sure the "Process file contents using File Handlers" option is enabled?
Jun 6, 2013  • #2
Matt Cameron
11 discussion posts
Yes, it is enabled. For the time being I will try to manually verify the filters using the registry.

I think something is working, since I don't get many "No File Handler" errors. However, I get matches along the lines of:

Code

(more)186801120.819800.8604352.4652806078731815.541088010131372.5410880101312192464.383104587610433-8739-8(more)

Code

556 556 556 556 556 556 556 556 556 556 278 0 0 0 0 0

Code

(more)axWidth 1690/FontWeight 400/XHeight 250/StemV 50/FontBBox[ -476 -250 1214 750] /FontFile2 2941 0 R>>


These look like metadata and should not be part of the match.
Jun 6, 2013  • #3
Keith Lammers (BFS)
Ok, strange! We've just posted FileSeek 3.1. Could you give it a try?
Jun 6, 2013  • #4
Matt Cameron
11 discussion posts
Forgive the hiatus; I was away from work for a while. I am still having issues getting FileSeek to read PDFs properly and ignore metadata such as font configuration.

I have done this so far:

Updated FileSeek to 3.1.1.
Uninstalled Adobe Reader X and installed XI. I had read that X had issues with programs using the filters, so I updated.
Did a search with Reader present and again with it uninstalled, and I got the same result.

Now that the handler bug is gone, I can confirm that .pdf is pointing to the correct DLL inside the Adobe Reader XI installation.

In the end, while doing searches I get results like the one below, which I am trying to get rid of. I can

Code

/Widths [ 250 250 371 250 500 840 778 208 333 333 389 250 250 333 250 606 500


The above hit came from running this against the PDF file I have attached as an example. It is just an income tax guide, nothing secret. This should have been skipped over. The regex query I ran was:

Code

\b(4(?:\d[ -]*?){12}(?:(?:\d[ -]*?){3})?|5[1-5](?:\d[ -]*?){14}|3[47](?:\d[ -]*?){13})\b


Not sure where the fault lies. Hoping to see if you can recreate the issue.
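
To show why the /Widths line above counts as a hit, here is a small Python sketch that runs the same regex over that line and then applies a Luhn checksum to each candidate. The Luhn step is an assumption on my part (a common way to weed out digit runs that are not card numbers), not a FileSeek feature, and the helper names `luhn_ok` and `card_candidates` are hypothetical.

```python
import re

# The regex from the post above, unchanged.
CARD_RE = re.compile(
    r"\b(4(?:\d[ -]*?){12}(?:(?:\d[ -]*?){3})?"
    r"|5[1-5](?:\d[ -]*?){14}"
    r"|3[47](?:\d[ -]*?){13})\b"
)


def luhn_ok(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def card_candidates(text):
    """Yield (matched_text, passes_luhn) for every regex hit in text."""
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group(1))
        yield m.group(1), luhn_ok(digits)


if __name__ == "__main__":
    widths = "/Widths [ 250 250 371 250 500 840 778 208 333 333 389 250 250 333 250 606 500"
    for hit in card_candidates(widths):
        print(hit)
    # A well-known Visa test number, for comparison.
    for hit in card_candidates("card: 4111 1111 1111 1111"):
        print(hit)
```

Because the pattern allows spaces between digits, the run starting at "371" reads as a 15-digit number beginning with 37, so the regex accepts it. It fails the checksum, while the test number passes, so a post-filter like this would drop the font-table hit even if the raw PDF stream is still being searched.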
• Attachment [protected]: t2 guide.pdf [595,100 bytes]
Sep 9, 2013  • #5
Keith Lammers (BFS)
No worries, thanks for the update! I'll test this out next week and keep you posted on what I find out.

Thanks!
Sep 13, 2013  • #6
Keith Lammers (BFS)
Finally had a chance to test this out here, but didn't run into any issues. However, I'm using the standalone PDF iFilter instead of the one that comes with Adobe Reader. Could you try uninstalling Adobe Reader and installing this PDF iFilter instead?

http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611
Sep 18, 2013  • #7