Win32 API with PowerShell

It's been a while... I have been buried in tools for automation testing. It's an interesting topic on the client side; however, I have ended up writing system-level PowerShell code again :). So, as a quick reference, here is a note on one small part of my task: using PowerShell to call the Win32 API.


Use Case:


I have to use an automation tool to capture a screenshot of a tooltip. Yes, a tooltip. This is a real problem because most automation tools that offer screen capture render the DOM/HTML content from inside the browser rather than capturing what is actually shown on screen, so the tooltip does not appear in the result.


Solution:


So I have to use a separate process to take the screenshot, so that the tooltip is captured while the mouse hovers over the DOM element. A quick way is to write the automation in PowerShell, which can call the system's screen-capture facilities. Java, JavaScript, or C# would also work, but they are more involved than a scripting language, and JavaScript might need a third-party library for screen capture, so PowerShell is my choice.

To capture the screenshot, I have to find the right window and its rectangle, bring the window to the front, and copy the on-screen bitmap into a file. Some old-school Win32 APIs handle this easily; the question is how to call them from PowerShell.

First, find the appropriate window to bring to the front.

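# Keep only the chrome processes that own a top-level window
$chrome = Get-Process -Name "chrome" | Where-Object { $_.MainWindowHandle -ne 0 }

# Handle of the first (assumed only) Chrome window
$chromeHwd = $chrome[0].MainWindowHandle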

Voila, you have the Chrome browser. For some reason several chrome processes run in the background, so I keep only the ones with a valid window handle, meaning a visible window, and assume only one Chrome instance is running. Now that I have the Chrome window handle, I bring it to the front with the user32 Win API SetForegroundWindow. How? With Add-Type, which defines a .NET type that is then available in the PowerShell session (read this).

Add-Type @"
  using System;
  using System.Runtime.InteropServices;
  public class Win32Lib {
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool SetForegroundWindow(IntPtr hWnd);
  }
"@
[Win32Lib]::SetForegroundWindow($chromeHwd)

So... done! Simple enough, right? We can go here to find the Win API signatures for the Add-Type definition, and read more in depth about the Win API with PowerShell here and here. The simple usage examples I first found are here and here.
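
The call above prints the result of SetForegroundWindow but does not act on it, and it assumes a Chrome window exists. Here is a minimal sketch of a more defensive version. The class name Win32Demo is hypothetical, chosen so that it does not clash with Win32Lib, and the warning messages are my own. SetForegroundWindow returns false when Windows refuses the request, which can happen when the calling process is not allowed to take the foreground.

```powershell
# Hypothetical class name (Win32Demo) so this sketch does not collide with Win32Lib.
# The -as [type] check skips Add-Type when the type is already loaded in this session.
if (-not ('Win32Demo' -as [type])) {
    Add-Type @"
using System;
using System.Runtime.InteropServices;
public class Win32Demo {
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool SetForegroundWindow(IntPtr hWnd);
}
"@
}

# Take the first chrome process that owns a top-level window, if there is one
$proc = Get-Process -Name 'chrome' -ErrorAction SilentlyContinue |
    Where-Object { $_.MainWindowHandle -ne 0 } |
    Select-Object -First 1

if (-not $proc) {
    Write-Warning 'No chrome process with a main window was found.'
}
elseif (-not [Win32Demo]::SetForegroundWindow($proc.MainWindowHandle)) {
    # false means the system did not bring the window to the foreground
    Write-Warning 'SetForegroundWindow was refused by the system.'
}
```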

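One thing to consider is pointer parameters in the unmanaged Win API, as in GetClientRect or ClientToScreen: declare them as out or ref in the C# signature and pass them with [ref] from PowerShell. Note that Add-Type cannot replace a type that is already loaded, so if the shorter Win32Lib above was already added, run this in a new session. Here is the full sample: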

Add-Type @"
  using System;
  using System.Runtime.InteropServices;
  public class Win32Lib {
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetWindowRect(IntPtr hWnd, out RECT lpRect);
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetClientRect(IntPtr hWnd, out RECT lpRect);
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool SetForegroundWindow(IntPtr hWnd);
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool ClientToScreen(IntPtr hWnd, ref POINT lpPoint);
  }
  
  public struct RECT
  {
    public int Left;
    public int Top;
    public int Right;
    public int Bottom;
  }

  public struct POINT
  {
    public int x;
    public int y;
  }
"@
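$Error.Clear()
# Structs that receive the window and client rectangles
$rcWindow = New-Object RECT
$rcClient = New-Object RECT
$chrome = Get-Process -Name "chrome" | ? { ($_.MainWindowHandle -ne 0) }

$chromeHwd = $chrome[0].MainWindowHandle
# Window rectangle in screen coordinates (includes title bar and borders)
[Win32Lib]::GetWindowRect($chromeHwd, [ref]$rcWindow)
[Win32Lib]::SetForegroundWindow($chromeHwd)
# Client rectangle in client coordinates (Left and Top are always 0)
[Win32Lib]::GetClientRect($chromeHwd, [ref]$rcClient)
# Convert the client area's upper-left corner to screen coordinates
$upleft = New-Object POINT
$upleft.x = $rcClient.Left
$upleft.y = $rcClient.Top
[Win32Lib]::ClientToScreen($chromeHwd, [ref]$upleft)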
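
The sample stops once it has the rectangle, so here is a rough sketch of the remaining step: copying that region of the screen into a file. It assumes Windows and uses Graphics.CopyFromScreen from System.Drawing. The function name Save-ScreenRegion, its parameters, and the output file name are hypothetical and not part of the original script.

```powershell
Add-Type -AssemblyName System.Drawing

# Hypothetical helper: copies a rectangle of the screen into a PNG file.
# X and Y are screen coordinates of the upper-left corner of the region.
function Save-ScreenRegion {
    param(
        [int]$X,
        [int]$Y,
        [int]$Width,
        [int]$Height,
        [string]$Path
    )
    $bitmap = New-Object System.Drawing.Bitmap $Width, $Height
    $graphics = [System.Drawing.Graphics]::FromImage($bitmap)
    try {
        # Copy the screen pixels starting at (X, Y) into the bitmap at (0, 0)
        $graphics.CopyFromScreen($X, $Y, 0, 0, $bitmap.Size)
        $bitmap.Save($Path, [System.Drawing.Imaging.ImageFormat]::Png)
    }
    finally {
        $graphics.Dispose()
        $bitmap.Dispose()
    }
}

# Example call with made-up numbers: a 200x100 region at the top-left of the screen
$out = Join-Path $env:TEMP 'region.png'
Save-ScreenRegion -X 0 -Y 0 -Width 200 -Height 100 -Path $out
```

In the script above, the call would be something like Save-ScreenRegion -X $upleft.x -Y $upleft.y -Width ($rcClient.Right - $rcClient.Left) -Height ($rcClient.Bottom - $rcClient.Top). Pass a full path, since .NET resolves relative paths against the process working directory rather than the current PowerShell location.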

Another tip, for moving the mouse from PowerShell, where $elementPoint holds the target position in screen coordinates:

Add-Type -AssemblyName System.Windows.Forms
[Windows.Forms.Cursor]::Position = "$($elementPoint.x), $($elementPoint.y)"
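
The assignment above relies on PowerShell converting a string into a point. As an alternative, here is a small sketch that builds the System.Drawing.Point explicitly. The function name Move-Cursor and the coordinates are made up for illustration.

```powershell
Add-Type -AssemblyName System.Windows.Forms
Add-Type -AssemblyName System.Drawing

# Hypothetical helper: moves the mouse cursor to the given screen coordinates
function Move-Cursor {
    param(
        [int]$X,
        [int]$Y
    )
    [System.Windows.Forms.Cursor]::Position = New-Object System.Drawing.Point $X, $Y
}

# Example call with made-up coordinates
Move-Cursor -X 100 -Y 100
```

For the tooltip use case, a short Start-Sleep between moving the cursor and taking the screenshot is probably needed, because tooltips usually appear only after a hover delay. The right delay is something to find by experiment.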

If you want to find more about using the Win API from PowerShell, try searching for "PInvoke" or "Invoke-Win32"; there should be some useful information out there.

That's all for today. I will write more about the automation tools very soon.
