I recently had a requirement to build a script that listed all PST files on some of our DFS folders. I know there are a bunch of ways to get this done, but I wanted to build my own. Since we needed to search roughly 20 directories, I decided to play with the Start-Job cmdlet so one script could spawn multiple worker processes. It was a little tricky at first (I had never even messed with it before), but anyone can pick it up easily.


First, I had to set up some parameters. Initially that was only the paths, and it searched for PST files only, but it grew so you can search for, and delete, any file type, recursively. Now, in the usage statement, you’ll see no mention of the -del switch. That’s intentional: I don’t want people accidentally deleting all their DLL files because they get ‘click happy’ and don’t read. I also added an ‘are you sure’ prompt to the -del switch.
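To give you an idea of the overall shape, here’s a minimal sketch of that structure (this isn’t the script itself; the prompt wording and the default extension are my own assumptions, but the parameter names and the per-path job match what’s described here):

param(
    [string[]]$paths,      # one or more root folders to search
    [string]$ext = "pst",  # file extension to look for
    [switch]$del           # delete what it finds (deliberately left out of the usage text)
)

if ($del) {
    # the 'are you sure' guard on -del
    $answer = Read-Host "Really delete every .$ext file under $($paths -join ', ')? Type YES to continue"
    if ($answer -ne "YES") { return }
}

foreach ($path in $paths) {
    # one background job per path, named after the path it works on
    Start-Job -Name $path -ArgumentList $path, $ext, $del -ScriptBlock {
        param($path, $ext, $del)
        # ...the search (and optional delete) logic goes here...
    }
}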

It’s pretty self-explanatory; you basically provide the command like this:
.\Get-files.ps1 -paths P:\ -ext txt

For multiple paths, you can do this:
.\Get-files.ps1 -paths ("P:\","D:\") -ext txt

For every path you specify, a new job/process is spun up to work that for you. Your only output is this:
Id Name State   HasMoreData Location  Command
-- ---- -----   ----------- --------  -------
3  P:\  Running True        localhost param($path,$ext,$del)...

Notice the state is Running? If you do a Get-Process, you’ll see that second powershell.exe chomping away. Eventually, depending on the size of the directory structure, the job will finish (I have one that runs for about 30 hours). When it finishes, the state will be Completed (or Failed), but in either case you’ll want to check whether HasMoreData is True or False. If it’s False, everything went fine with zero output to the console. If it’s True, you’ll want to do a Receive-Job to see the output and any error messages.
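If you haven’t used the job cmdlets before, checking on things looks like this (standard cmdlets, nothing specific to this script):

Get-Job                                 # list all jobs with their State and HasMoreData flags
Receive-Job -Name "P:\" -Keep           # show a job's output (and errors) without discarding it
Get-Job -State Completed | Remove-Job   # clean up once you're done with them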

One very important thing to note: do not close the originating PowerShell window, or you won’t be able to check the status of your jobs; all you can do then is watch for their processes to die off. The script will still create its log files, you just can’t see which jobs are still running, whether they failed, or whether they have more output for you.

Also, if you want to exclude specific file or folder names, you can edit line 7:
$files = gci $path -r -i *.$ext
to
$files = gci $path -r -i *.$ext | ? {$_.fullname -notmatch 'filefoldername'}
replacing ‘filefoldername’ with whatever you need.
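Since -notmatch takes a regular expression, you can exclude several names at once by separating them with a pipe (the names here are just placeholders):

$files = gci $path -r -i *.$ext | ? {$_.fullname -notmatch 'archive|old_backups|do_not_scan'}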

Now, I have the log file path hard coded to D:\, but you can change that to whatever you want; it’s at line 6:
$logfile = "D:\" + (($path.replace(":\","_").tostring() + "-$ext.txt"))
Just change the D:\ to whatever you want. That line builds the log file name from the path being scanned and the extension provided: it takes the path given, replaces the :\ with an underscore, sticks the log folder (D:\) on the front, and adds -extension.txt to the end.
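For example, running it with -paths P:\ -ext txt gives you a log file named:

D:\P_-txt.txt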

The log files it generates will have a breakdown of the files, grouped by folder, with a total count at the end of the log file. Now, if you add the -del switch, there’s no going back unless you can restore from backup or shadow copy: it will delete everything you tell it to.
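If you’re curious how that kind of breakdown can be put together, here’s one way to do it with Group-Object (a sketch only, not necessarily line for line what the script does; it assumes the $files and $logfile variables from the lines quoted above):

$files | Group-Object DirectoryName | ForEach-Object {
    "$($_.Name) ($($_.Count) files)" | Out-File $logfile -Append
    $_.Group | ForEach-Object { "    $($_.Name)" | Out-File $logfile -Append }
}
"Total: $($files.Count)" | Out-File $logfile -Append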

Last thing: if it doesn’t find anything in the path specified, it deletes the log file.

With that, here you go, happy scripting: