Monday 13 August 2007

Camera Automation with PowerShell

Following a conversation with some friends last Friday, I had an idea: it would be quite cool to link a camera up to a PC, figure out how to get the PC to drive the camera, and then add some image analysis on top. I could then leave it to watch the sky for Perseids and possibly take some interesting pictures without losing out on any beauty sleep. This was just the sort of mini-project I've been looking for as a way to really get a handle on what PowerShell is good at and to see if I can actually build something useful with it. For those of you not familiar with it, PowerShell is a new command shell that Microsoft have developed for Windows platforms. You can find some pretty decent introductory documentation from Microsoft Switzerland here and a quick overview with some handy links from MS Channel 9 here.

What I want to do here.

This should be fairly straightforward but the only way to be sure is to go ahead and build the thing.

  1. Find some way to control a camera and get pictures taken and delivered to the PC
  2. Figure out some way to examine and compare those pictures so we can decide if we want to do something with a specific picture
  3. Write some scaffolding code around both of these to make it all happen in the correct sequence.


Part 1. Getting Started - Connecting to APIs and Service Interfaces using PowerShell

We'll go off on a small tangent first to see how PowerShell can be used to plug into and talk to the various APIs and services that the Windows platform provides. Generally you have to build an application in order to do this sort of stuff - PowerShell makes it pretty simple and quick, and more importantly makes the whole exercise interactive, so investigating the capabilities of a service or API becomes much more intuitive. Anyway, enough waffle - time for some code.

For example if the machine we're using is part of an Active Directory Domain then we can plug into the Active Directory and start poking about really easily:

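$Username = "USERID"
# Search the directory for the user object with this logon name
$DS_Search = new-object System.DirectoryServices.DirectorySearcher
$DS_Search.Filter="(sAMAccountName=$Username)"
$User=$DS_Search.FindOne()
# Each "memberof" entry is the distinguished name of a group.
# The built-in group is called "Domain Admins" on English-language domains.
foreach ($item in $User.Properties["memberof"]) {
    if ($item -match "Domain Admins") {
        "$Username is a Domain Administrator"
    }
}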

Here we create a new DirectorySearcher object from the System.DirectoryServices namespace (which provides an interface for searching Active Directory), set a filter that finds a specific user object and then enumerate the groups the user belongs to in order to see if they are a member of a specific group. The point of this is to demonstrate the way that PowerShell can directly instantiate .NET Framework objects. Once we have created an object we can then explore its properties interactively using PowerShell's tab completion or by piping the object into the ultra-useful Get-Member cmdlet (aliased by default as gm to speed this sort of thing up). To play with this a bit, edit the above with any valid UserID from your domain and then paste it into a PowerShell command line. Hit enter at the end to make sure the loop completes. Unless you are a Domain Admin it will appear to do nothing but if you then type


$User | gm
$User.Properties

you can start to explore the user object interactively. You can use the enumeration concept demonstrated for the "memberof" collection above to dig into the more complex structured properties.
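If you don't have a domain to hand, the same instantiate-then-explore routine can be tried on any .NET Framework class. The following is only a minimal sketch: System.Text.StringBuilder is picked purely as a convenient stand-in, and the variable names are made up for illustration.

```powershell
# Instantiate a plain .NET Framework class; no domain or extra assembly needed.
$Builder = New-Object System.Text.StringBuilder

# List just the method names the object exposes (gm is the default alias for Get-Member).
$MethodNames = $Builder | Get-Member -MemberType Method | ForEach-Object { $_.Name }

# Call one of the discovered methods, then read the result back.
[void]$Builder.Append("Perseids")
$Text = $Builder.ToString()
$Text
```

Typing $MethodNames at the prompt afterwards shows the list that Get-Member found, which is a handy way to decide what to poke at next.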

You can also attach to specific .NET assemblies directly rather than instantiating via a namespace. The syntax here is slightly less obvious but the result is identical - you have an object and an interactive shell that allows you to inspect and interact with its properties and methods.



[System.Reflection.Assembly]::LoadFrom("c:\temp\OpenNETCF.Desktop.Communication.dll") | Out-Null
$rapi = New-Object OpenNETCF.Desktop.Communication.RAPI
$ActiveSyncVer=$rapi.ActiveSync.Version.ToString()


This (obviously) won't do anything unless you have ActiveSync installed. Anyway, here we directly load an assembly via its DLL and then make use of the namespace that it presents, in this case the very useful OpenNETCF desktop interface that provides a .NET wrapper for the RAPI (ActiveSync) functions for controlling Windows Mobile PDAs and SmartPhones. Once again you can explore the capabilities here by simply piping the newly instantiated object into Get-Member/gm and then working with the displayed methods and properties interactively. You will need to use the Connect() method to bind the initial object to any PDA/Phone you physically connect before being able to interact fully with it.

Finally we can also instantiate legacy objects from the OLE/COM+ name-space that native Windows applications have used for years. This is what we need for our Camera Automation effort since we have to use the Windows Image Acquisition (WIA) service for this and it is exposed via a COM+ interface. The following code snippet creates a WIA device manager object, enumerates through all connected devices and returns the last object found.

$WIADeviceManager = new-object -comobject WIA.DeviceManager
foreach ($Device in $WIADeviceManager.DeviceInfos) {
$Camera=$Device
}
$CameraControl=$Camera.connect()


The instantiation is very similar to that for native .NET objects but we have to give the "-comobject" hint as a parameter in order to find names (ProgIDs) from within the COM+ name-space. Once again we can enumerate the properties and methods of the object interactively. It is very instructive to drill into this interactively yourself with a camera attached so you can really explore the object structures but here's a basic sequence of commands that steps through creating the object, setting some important things up, taking the picture and then getting it back onto the PC as a file.
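Taking the last device found works when the camera is the only imaging device attached, but it will pick the wrong thing if a scanner happens to be plugged in as well. A minimal sketch of a slightly pickier selection follows. It assumes that the Type property on each DeviceInfo entry uses the WIA device type value 2 for cameras; Select-WiaCamera is a hypothetical helper name, not part of WIA.

```powershell
# WIA device type value for cameras (assumed: WiaDeviceType.CameraDeviceType = 2).
$WiaCameraDeviceType = 2

# Hypothetical helper: return the first entry whose Type marks it as a camera.
function Select-WiaCamera {
    param($DeviceInfos)
    foreach ($Info in $DeviceInfos) {
        if ($Info.Type -eq $WiaCameraDeviceType) { return $Info }
    }
    return $null
}

# Usage with real hardware (commented out because it needs WIA and an attached camera):
# $WIADeviceManager = New-Object -ComObject WIA.DeviceManager
# $Camera = Select-WiaCamera $WIADeviceManager.DeviceInfos
# if ($Camera) { $CameraControl = $Camera.Connect() }
```

The helper only looks at the Type property, so it can be tried out against any list of objects that have one before a camera is connected.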

$WIAManager = new-object -comobject WIA.DeviceManager
$DeviceList = $WIAManager.DeviceInfos
foreach ($item in $DeviceList) {
$Device=$item
}
$ConnectedDevice = $Device.connect()
$Commands = $ConnectedDevice.Commands
foreach ($item in $Commands) {
if ($item.name -match "take") { $TakeShot=$item.CommandID }
}
$ConnectedDevice.ExecuteCommand($TakeShot)
$Pictures = $ConnectedDevice.Items
foreach ($item in $Pictures) {
$Picture = $item
}
$PictureFile = $Picture.Transfer()
$PictureFile.SaveFile("filename.jpg")


If you play around with the object yourself you will find a significant amount of other data depending on the make of camera attached - including lots of properties of the camera, its state (focus/exposure/shutter speed/white balance etc) and similar data about any images that are on the device (format, dimensions, compression ratio, colour depth etc). In the real script we will want to add a lot of error trapping as we're driving a real world object here and they tend to fail to do what you want a lot. For the moment though what we have here is the basic set of capabilities that we need in order to be able to tell the camera to take a picture, retrieve it from the camera and then save it on the PC.
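As a rough idea of what that error trapping might look like, here is a minimal sketch of a retry wrapper. It assumes PowerShell 2.0 or later (try/catch is not available in version 1, where a trap statement would be needed instead), and Invoke-WithRetry and its parameters are hypothetical names invented for this example.

```powershell
# Hypothetical helper: run a script block, retrying a few times if it throws.
function Invoke-WithRetry {
    param(
        [scriptblock]$Action,
        [int]$MaxAttempts = 3,
        [int]$DelaySeconds = 2
    )
    for ($Attempt = 1; $Attempt -le $MaxAttempts; $Attempt++) {
        try {
            # Success: hand back whatever the action produced.
            return (& $Action)
        }
        catch {
            # Give up and rethrow once the last attempt has failed.
            if ($Attempt -eq $MaxAttempts) { throw }
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}

# Usage with a connected camera (commented out because it needs real hardware):
# Invoke-WithRetry { $ConnectedDevice.ExecuteCommand($TakeShot) }
```

A fixed delay between attempts is the simplest choice; a camera that needs time to write to its card might do better with a longer or growing delay.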

Part 2. Manipulating Images and PowerShell Console Scripting

Now that we have a way to take and retrieve pictures from the camera, the next step is much more specialized, but the principle is frequently required for many scripting tasks. Ideally I'd like to have a native API/Service to call on here but there isn't one that does what we need (or at least I'm not aware of one) so we're going to have to fall back to the general purpose strategy of driving an external application and then pulling out some data from that task when it's completed. This is a useful enough exercise in itself in any case and if my own mistakes in figuring it out for this task are anything to go by it's something that everyone should be forced to do very early on in their PowerShell learning curve.

What we want to be able to do here is compare sequential pictures with each other and save ones that seem to have something significantly different about them. Eventually we'll want to get smarter and figure out a way to establish deviations from a moving average but this simple comparison will do to get us started. There is a very useful cross platform utility/API called ImageMagick that does this sort of thing. It provides a set of tools for manipulating image files and was expressly developed to make it possible to write scripts and applications that automate image manipulation tasks like format conversion, re-sizing, changing colour palettes etc. The capability that we need is made available to command line scripts via the ImageMagick Compare command. This function is pretty powerful and has quite a few options but for now we're going to limit ourselves to doing a fairly basic comparison between images based on a metric called Mean Error Per Pixel (MEPP) which averages out changes over the whole image. The command syntax we'll be using is:

[ImageMagick-Dir]\compare -metric MEPP filename.jpg previous.jpg null:

This prints out result values to the console screen that look like the following:

0 (0,0) (for identical pictures)
591.746 (0.000264707, 0.282353) (for very similar pictures)
34978.3 (0.361637, 1) (for completely different pictures)


I will be returning to this later when I've had time to establish what metric and value gives the best result but for now we'll assume that an MEPP value of > 1000 indicates that something has noticeably changed between our two test images.
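For the moving average idea mentioned earlier, a minimal sketch might look like the following. Everything in it is an assumption for illustration: Test-MetricDeviation is a hypothetical name, the factor of 2 is an arbitrary starting point, and the fixed threshold of 1000 is only used as a fallback until some history has built up.

```powershell
# Hypothetical helper: flag a metric that stands well clear of the recent average.
function Test-MetricDeviation {
    param(
        [double[]]$History = @(),     # metric values from recent comparisons
        [double]$Current,             # metric value for the newest comparison
        [double]$Factor = 2.0,        # how far above the average counts as a change
        [double]$FallbackThreshold = 1000
    )
    if ($History.Count -eq 0) {
        # No history yet, so fall back to the fixed threshold.
        return ($Current -gt $FallbackThreshold)
    }
    $Average = ($History | Measure-Object -Average).Average
    return ($Current -gt ($Average * $Factor))
}

# Example: recent values hover around 600, so 591.746 is nothing special.
Test-MetricDeviation -History 500, 600, 700 -Current 591.746
```

Keeping only the last few values in the history would make the average follow slow changes such as the sky getting lighter towards dawn.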

Calling external executables from within PowerShell is quite easy to do using the "&" or ". " operators. I say this despite the fact that I spent a soul-destroying couple of hours over the past weekend trying to find them. There is a much more comprehensively documented way to spawn off any application in a separate process (using [System.Diagnostics.Process]::Start() ) but then you have to build some fairly awkward scaffolding around that in order to watch for its termination and then jump through a few more hoops to capture the console output stream that we are looking for. Rather than go to that sort of trouble, the "&" / ". " operator syntax works in a fashion that will be much more familiar to traditional command line junkies. There is one quirk, as the following will show, but it works pretty much as any Perl / Windows CMD.EXE / Shell script writer should expect. I'm disgusted with myself that it took me so long to figure this out BTW which is why I'm harping on so much - the lesson here is that if you are the type who never bothers to RTFM you are going to be totally FUBAR when the stuff you are looking for is punctuation - the chances of finding information on the ". " command via a search engine or any application help interface are not good and "&" isn't much better.
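For comparison, here is a minimal sketch of the kind of scaffolding the Process.Start() route needs. It assumes a .NET Framework version that has ReadToEndAsync (4.5 or later), and Invoke-CapturedProcess is a hypothetical helper name made up for this example.

```powershell
# Hypothetical helper: run an executable and capture its exit code, STDOUT and STDERR.
function Invoke-CapturedProcess {
    param([string]$FilePath, [string]$Arguments)

    $Info = New-Object System.Diagnostics.ProcessStartInfo
    $Info.FileName = $FilePath
    $Info.Arguments = $Arguments
    $Info.UseShellExecute = $false          # must be off before streams can be redirected
    $Info.RedirectStandardOutput = $true
    $Info.RedirectStandardError = $true
    $Info.CreateNoWindow = $true

    $Process = [System.Diagnostics.Process]::Start($Info)
    # Read STDOUT asynchronously so that a full pipe on one stream cannot block the other.
    $StdOutTask = $Process.StandardOutput.ReadToEndAsync()
    $StdErr = $Process.StandardError.ReadToEnd()
    $Process.WaitForExit()

    New-Object PSObject -Property @{
        ExitCode = $Process.ExitCode
        StdOut   = $StdOutTask.Result
        StdErr   = $StdErr
    }
}
```

It does the job, but it is a lot of ceremony next to a single "&", which is why the operator is used for the rest of this post.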

So now we have a kind of bulky compound command line that we want to execute. From a CMD.EXE shell on my machine it looks like:

"c:\Program Files\ImageMagick-6.3.5-Q16\compare.exe" -metric MEPP filename.jpg previous.jpg null:

The two operators we're dealing with expect their first parameter to be an executable, cmdlet, function or script so we should pass it exactly as it is seen above including the quotes around the executable. We also want to capture the output text into a variable for later so we attempt something like this:

$Compare=& "c:\Program Files\ImageMagick-6.3.5-Q16\compare.exe" -metric MEPP filename.jpg previous.jpg null:

This (unexpectedly) does not capture the output into the variable. Instead we still see the output (something like 591.746 (0.000264707, 0.282353) ) on the console when we test this interactively. Suspecting that COMPARE.EXE might be sending this output to STDERR we try the classic shell STDERR to STDOUT redirection syntax of 2>&1 at the end of the command line.

$Compare=& "c:\Program Files\ImageMagick-6.3.5-Q16\compare.exe" -metric MEPP filename.jpg previous.jpg null: 2>&1

This works but when we echo out the contents of the $Compare variable to the screen we see a bit more than the simple text value that we would see under similar circumstances in a Perl Script for example.

PS > $Compare
compare.exe : 591.746 (0.000264707, 0.282353)
At line:1 char:13
+ $Compare = & <<<< "c:\Program Files\ImageMagick-6.3.5-Q16\compare.exe" -metric MEPP picturethumb.gif lastthumb.gif null: 2>&1

The extra data appears because $Compare is actually an object not a simple string and the PowerShell console is dumping a lot of the properties out at once. Once we get over that minor confusion we can use Get-Member to see what sort of object it is and this indicates that what we actually want at this stage is the return value from $Compare.ToString(). To be even more specific we want to extract out the first few digits that were being sent to the console (591 in the above example). I'm a fiend for misusing Regular Expressions but this is definitely a good place to use a simple one so we're going to fetch the value of our compare operation using the following code.

if ($Compare.ToString() -match "(\d+)(\.|\s)") { $CompareMetric=$matches[1] }

PowerShell encapsulates its regexes in an object syntax inherited from .NET but the core regex strings (the important part) are very close to Perl5 syntax which is a Very Good Thing. If you don't think Regular Expressions are worth the effort then I strongly recommend that you drop everything you're doing and head off to buy Jeffrey E.F. Friedl's "Mastering Regular Expressions".

Anyway this all now means that we have a straightforward means of getting an indication of the degree of change between two images.
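To tie the pieces together, here is a minimal sketch that wraps the regex extraction in a function and applies the provisional threshold of 1000. Get-CompareMetric is a hypothetical helper name, and the pattern is a slightly extended variant of the one above that also keeps the decimal part of the number.

```powershell
# Hypothetical helper: pull the leading MEPP number out of the captured compare output.
function Get-CompareMetric {
    param($Output)
    # The capture may be a string, an error record or an array of them; take the
    # first item and let string interpolation call its ToString() method.
    $Text = "$(@($Output)[0])"
    if ($Text -match '^\s*(\d+(?:\.\d+)?)') {
        # Parse with the invariant culture so "591.746" reads the same on any locale.
        return [double]::Parse($Matches[1], [System.Globalization.CultureInfo]::InvariantCulture)
    }
    return $null
}

# Apply the provisional threshold to one of the sample outputs shown earlier.
$Sample = '34978.3 (0.361637, 1)'
$Changed = (Get-CompareMetric $Sample) -gt 1000
$Changed
```

Returning $null when nothing matches lets the calling script tell "no change" apart from "compare produced something unexpected".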

Part 3. (To Follow) Putting it all together and taking some Pictures.
Coming tomorrow: Gluing it all together and testing various image comparison techniques for sensitivity and performance.

* Edited to fix the code snippet formatting.
