What about the possibility of searching for a parameter, without the
specific characters within that parameter?
What I am trying to do is:
Search a global network drive which is a shared drive for patient
names in this format (<first name>,<last name>) or social security
numbers in this format (000.00.000). We need to be able to find
instances of these two entries and delete them if found...but we don't
know the specific names or the specific social security numbers. Is
there anything your program could do for that?
Search parameters when searching file content
-
- Posts: 593
- Joined: 09 Dec 2005, 17:30
- Location: a step further
- Contact:
Re: Search parameters when searching file content
You can use regular expressions to search file content. With regular expressions this is easy to do.
Jiri {x2} Cincura
-
- ALTAP Staff
- Posts: 5231
- Joined: 08 Dec 2005, 06:34
- Location: Novy Bor, Czech Republic
- Contact:
(We would probably need a sample of your file.)
Example: suppose we have a file new.txt with the following content:
Code:
xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxx(Jan,Rysavy)xxxxxxxxx
xxxxxxxxx(123.45.678)xxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx
Regular expression searching for names in the format (string,string):
Code:
\(([a-zA-Z]+),([a-zA-Z]+)\)
Regular expression searching for social security numbers in the format (nnn.nn.nnn):
Code:
\([0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9]\)
Note: this expression will not match (12.45.678), because the first group has two digits ("12") instead of three ("123"). Is that what you are looking for?
To find files, use the Commands > Find Files and Directories command, see http://www.altap.cz/salam_en/help/salam ... k_find.htm
Set the Regular expression option.
For the syntax of regular expressions, see http://www.altap.cz/salam_en/help/salam ... regexp.htm
Let us know if you have any questions...
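For anyone who wants to try the two expressions outside Salamander first, here is a minimal Python sketch. It assumes Python's `re` module treats these particular patterns the same way Salamander's engine does (they only use character classes, `+` and escaped literals, but that equivalence is an assumption, not something checked against Salamander). The helper name `find_hits` is made up for this example.

```python
import re

# Sample content of new.txt, copied from the post above.
SAMPLE = """\
xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxx(Jan,Rysavy)xxxxxxxxx
xxxxxxxxx(123.45.678)xxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx
"""

# The two expressions from the post, unchanged.
NAME_RE = re.compile(r"\(([a-zA-Z]+),([a-zA-Z]+)\)")
SSN_RE = re.compile(r"\([0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9]\)")


def find_hits(text):
    """Return (names, numbers) found in text by the two expressions."""
    names = NAME_RE.findall(text)  # list of (first, last) tuples
    numbers = [m.group(0) for m in SSN_RE.finditer(text)]
    return names, numbers


names, numbers = find_hits(SAMPLE)
print(names)
print(numbers)
```

On the sample this finds one name pair and one number. A value such as (12.45.678) is not reported, because the first group must have exactly three digits.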
Attachment: findregexp.png (93.56 KiB)
Parameters
Actually, the file would be more like
xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxJan,Rysavyxxxxxxxxx
xxxxxxxxx123.45.678xxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx
or it could also look like this in a form format
xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxName: Jan,Rysavyxxxxxxxxx
xxxxxxxxxSS#: 123.45.678xxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx
Thanks so much for your reply. This may be exactly what we need!!
-
- ALTAP Staff
- Posts: 5231
- Joined: 08 Dec 2005, 06:34
- Location: Novy Bor, Czech Republic
- Contact:
Jonathan, there should be some separators. How do you distinguish the name or numbers from the surrounding characters?
Last edited by Jan Rysavy on 01 Apr 2008, 08:21, edited 1 time in total.
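To illustrate why the separators matter, here is a small Python sketch. It takes the two expressions from earlier in the thread, drops the surrounding parentheses, and runs them on the lines Jonathan posted. The names `BARE_NAME_RE` and `BARE_SSN_RE` are made up for this example, and Python's `re` module is assumed to behave like Salamander's engine for these simple patterns.

```python
import re

# The thread's patterns with the enclosing parentheses removed.
BARE_NAME_RE = re.compile(r"([a-zA-Z]+),([a-zA-Z]+)")
BARE_SSN_RE = re.compile(r"[0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9]")

# Lines from the sample without separators.
line_name = "xxxxxxJan,Rysavyxxxxxxxxx"
line_ssn = "xxxxxxxxx123.45.678xxxxxx"

name_match = BARE_NAME_RE.search(line_name)
ssn_match = BARE_SSN_RE.search(line_ssn)

# The letters around the name are letters too, so they end up in the match.
print(name_match.groups())
# The digits are still isolated here, because the filler contains no digits.
print(ssn_match.group(0))
```

The number is still found cleanly in this sample, but the name match swallows the surrounding filler, since nothing tells the expression where the name starts or ends. The same pattern would also match any two words joined by a comma, so it would flag a lot of files that contain no names at all.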
They may be in a form or just in the body of a document. Some could be listed like this.
Name: Joe Smith or Resident: Joe Smith .
The problem is that we are searching a large global network drive with random file types and content on it. We need to be able to delete anything with personal information in it. There is no way to be sure that the content would have any kind of separators.
-
- ALTAP Staff
- Posts: 5231
- Joined: 08 Dec 2005, 06:34
- Location: Novy Bor, Czech Republic
- Contact:
Do you know all the possible names and numbers before the search?
For example you could know there could be only 3 names:
<First1, Last1>
<First2, Last2>
<First3, Last3>
(could be presented in different forms such as: "First Last" or "Last, First")
and only 3 numbers:
<number1>
<number2>
<number3>
Do you know exactly these names and numbers before you start the search?
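If the names are known in advance, one expression can be generated from the list. The Python sketch below is only an illustration under that assumption: `KNOWN_NAMES` is a placeholder list and `build_name_pattern` is a made-up helper. It accepts "First Last", "First,Last" and "Last, First", ignoring case.

```python
import re

# Hypothetical list of known people; replace it with the real list.
KNOWN_NAMES = [("Jan", "Rysavy"), ("Joe", "Smith")]


def build_name_pattern(names):
    """Build one regex that matches every known name in either order."""
    alternatives = []
    for first, last in names:
        first_re, last_re = re.escape(first), re.escape(last)
        # Optional comma with optional whitespace between the two parts.
        alternatives.append(rf"{first_re}\s*,?\s*{last_re}")
        alternatives.append(rf"{last_re}\s*,?\s*{first_re}")
    return re.compile("|".join(alternatives), re.IGNORECASE)


pattern = build_name_pattern(KNOWN_NAMES)
print(pattern.findall("xxxxxxName: Jan,Rysavyxxxxxxxxx"))
print(pattern.findall("Resident: Joe Smith ."))
print(pattern.findall("Smith, Joe"))
```

Because the literal names anchor the match, this works even when the name is embedded in other text without separators. It cannot find a name that is not on the list.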
JOnathan wrote: They may be in a form or just in the body of a document.
Since you don't know where you need to search, you need to search everything.
Unless you can invent an algorithm for what you want to do, you cannot get a computer to do it for you.
JOnathan wrote: The problem is that we are searching a large global network drive with random file types
In this case you cannot use a simple file search such as the one Altap Salamander offers. The name can be encoded in any way, and the program the document was made for will transform it into something readable. Imagine, for example, a docx file (created by a recent version of Microsoft Word). Any text that document contains is stored within a compressed zip archive, and unless your search tool knows how to make sense of the bits it finds on the disk, it will not find anything sensible. Furthermore, if the names and numbers need to be removed, the tool needs to know how to edit that file, or else you'd be better off zapping your disk right away (writing zeros or random data all over it).
JOnathan wrote: We need to be able to delete anything with personal information in it. There is no way to be sure that the content would have any kind of separators.
My advice: zap the drive, or get some people to do it intelligently, using the applications made for editing the file types whose content you need to make unrecognisable (or else get a _large_ IT budget and a lot of time).
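The point about compressed containers can be tried with a short Python sketch. It builds a docx-like zip in memory (a simplified stand-in with a single `word/document.xml` member, not a valid Word file) and compares a search over the raw bytes with a search over the unpacked content. The function names and the filler text are made up for this example.

```python
import io
import zipfile

NAME = b"Jan,Rysavy"

# Simplified stand-in for word/document.xml; not a complete Word document.
xml = (
    b"<w:document><w:body>"
    + b"<w:p><w:t>filler text</w:t></w:p>" * 50
    + b"<w:p><w:t>Name: " + NAME + b"</w:t></w:p>"
    + b"</w:body></w:document>"
)


def make_container(payload):
    """Store payload as a deflate-compressed member of an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", payload)
    return buffer.getvalue()


def search_raw(container, needle):
    """Search the bytes as they would sit on disk."""
    return needle in container


def search_unpacked(container, needle):
    """Decompress every member first, then search its content."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return any(needle in archive.read(member) for member in archive.namelist())


container = make_container(xml)
print(search_raw(container, NAME))
print(search_unpacked(container, NAME))
```

The raw search should come up empty while the unpacked search finds the name, which is why a plain content search over arbitrary file types misses data. Removing the name safely would additionally require rewriting the container in a way the owning application still accepts, which this sketch does not attempt.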