In the previous article, we simulated keystrokes using Robot, a class provided by AWT. That approach has limitations, though. Robot can trigger global keystrokes, but it cannot send input or shortcuts to a specific application other than the one in focus. You can pair Robot with AutoHotkey (a popular Windows automation tool), or we can just do it ourselves with the Win32 API. To access the Win32 API, we'll use FFI (the Foreign Function Interface).
The Win32 API
The Win32 API is a rather low-level and powerful API you can use to interface with Windows. We're going to access it through FFI, loading the user32 DLL and calling some of its functions. Note that neither Ruby nor JRuby necessarily knows anything about Windows, and the good thing here is that it doesn't need to. FFI doesn't know what the parameters it's pushing onto the stack mean, just what type they are and which calling convention is used.
However, native DLL files don't hold any kind of function signature (the number and types of arguments, as well as the return type). A DLL file is often not much more than a list of function names and the address of each function. Because DLLs are intended to be used by C or C++ programs, the call signatures are stored in the header files instead. So, in order to call a Win32 function using FFI, we need to tell FFI what types of arguments it takes and what type it returns.
We'll start by declaring a module and extending that module with the FFI::Library mixin. This will get us started and give us a number of methods we can use to tell FFI about the functions we want to use.
require 'ffi'

module Win
  extend FFI::Library
  # …
end
We now need to tell FFI which library we want to load. All of the functions we want are in the user32 DLL. We also want to tell it which calling convention to use. Low-level programs don't have one set way to push arguments onto the stack and retrieve return values; caller and callee only work together by agreeing on a calling convention. The calling convention we want to use here is stdcall, the standard calling convention for Win32 API functions.
require 'java'
require 'ffi'

module Win
  extend FFI::Library
  ffi_lib 'user32'
  ffi_convention :stdcall
end
Great, we're all set to start telling FFI about the functions we want to use. We'll do something easy at first, the very first thing most programmers do when learning how to use the Win32 API: a message box. The function we'll use to do this is called MessageBox. The MSDN documentation is very useful here, as we need to know the exact number and types of the arguments. The FFI documentation also lists the FFI types that correspond to common C types.
First, note that most Win32 API functions that take strings come in two variants, a W and an A. The W functions are used with wide (UTF-16) strings, AKA Unicode strings, while the A functions take ANSI strings. Wide strings are an unnecessary complication in this case (even though Ruby, Java, etc. are all Unicode-capable), so we'll be using the A variant. This is why the function is documented as MessageBox, but we call it MessageBoxA.
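To make the difference between the two variants concrete, here is a minimal sketch of what a wide string looks like from Ruby. The helper name to_wide is hypothetical and not part of FFI or this article's code; it only illustrates why the W functions need extra care.

```ruby
# Hypothetical helper (not part of FFI): build roughly the bytes a W
# function would expect. Wide strings are UTF-16LE and end with a
# two-byte null terminator.
def to_wide(str)
  (str + "\0").encode('UTF-16LE')
end

ansi = 'Hi'           # the kind of string we can hand to an A function
wide = to_wide('Hi')  # the kind of string a W function would need

puts ansi.bytesize    # 2
puts wide.bytesize    # 6: two bytes per character, terminator included
```

Since the A variant accepts ordinary single-byte strings, we can skip this conversion entirely.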
We can see that the MessageBox function takes four arguments. The first is an HWND argument. You'll see these quite often in Win32 applications: it's a handle to a window, and in Windows-speak a "handle" is an opaque pointer-sized value, so it is of type :pointer in FFI-speak. The next two arguments are of type LPCSTR. Again, more cryptic Win32 stuff, but this is just a pointer to a constant string. In FFI-speak, this type is :string. Finally, the last argument is an unsigned int (type :uint), and the return value is an int (type :int). Once you get that sorted, it's time to tell FFI about it.
The method that tells FFI about a function is attach_function. This method attaches a C function to a Ruby method. Its first argument is the name of the Ruby method to create, and its second is the name of the function in the DLL we're interfacing with. We'll build this incrementally, so we start with:
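The type translation described above can be written down as a small lookup table. This is only an illustrative sketch: the constant WIN32_TO_FFI and the helper ffi_types are hypothetical names made up for this example, not something FFI provides.

```ruby
# Hypothetical lookup table: Win32 type names as they appear in the
# MSDN documentation, mapped to the FFI type symbols used in this article.
WIN32_TO_FFI = {
  'HWND'   => :pointer, # window handle, treated as an opaque pointer
  'LPCSTR' => :string,  # pointer to a constant, null-terminated string
  'UINT'   => :uint,    # unsigned int
  'int'    => :int      # signed int
}.freeze

# Hypothetical helper: translate Win32 type names into the array of
# FFI types that attach_function expects.
def ffi_types(*win32_types)
  win32_types.map do |name|
    WIN32_TO_FFI.fetch(name) { raise ArgumentError, "no FFI mapping for #{name}" }
  end
end

# MessageBox(HWND, LPCSTR, LPCSTR, UINT) returns int
MESSAGE_BOX_ARGS   = ffi_types('HWND', 'LPCSTR', 'LPCSTR', 'UINT')
MESSAGE_BOX_RETURN = ffi_types('int').first

p MESSAGE_BOX_ARGS   # prints [:pointer, :string, :string, :uint]
p MESSAGE_BOX_RETURN # prints :int
```

In practice you would usually write the FFI types out by hand, but a table like this can be handy when attaching many functions.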
attach_function :message_box, :MessageBoxA, #...
Next, we need to give it an array of argument types corresponding to the arguments the native function from the DLL takes. We worked that out in the paragraph above.
[ :pointer, :string, :string, :uint ], #...
And finally, the return type of the function, which we worked out to be :int.
Put that all together (along with the boilerplate code from the beginning of the article) and we get this.
require 'java'
require 'ffi'

module Win
  extend FFI::Library
  ffi_lib 'user32'
  ffi_convention :stdcall

  # Ruby method name, DLL function name, argument types, return type
  attach_function :message_box, :MessageBoxA,
                  [:pointer, :string, :string, :uint], :int
end
And that's it. All told, that's not very difficult if you're familiar with C functions. Even if you're not, you can usually guess; the worst thing that can happen if you get the function signature wrong is that the program will crash. Let's test it out by showing a message box.
Win.message_box nil, "Hello, Win32 world!", "Win32 via FFI", 1
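The last argument in the call above, 1, is a flags value (it corresponds to MB_OKCANCEL), and the integer MessageBox returns identifies the button that was pressed. A minimal sketch of naming those numbers follows. The numeric values are the ones from the Win32 headers, while the helper button_pressed is a hypothetical name introduced only for this example.

```ruby
# Flag and return values as defined in the Win32 headers (winuser.h).
MB_OK              = 0x00
MB_OKCANCEL        = 0x01
MB_ICONINFORMATION = 0x40

IDOK     = 1
IDCANCEL = 2

# Hypothetical helper: turn MessageBox's return value into a symbol.
def button_pressed(result)
  case result
  when IDOK     then :ok
  when IDCANCEL then :cancel
  else :unknown
  end
end

# Flags are combined with a bitwise OR.
flags = MB_OKCANCEL | MB_ICONINFORMATION
puts flags # 65

# With the Win module from above loaded on Windows, this could be used as:
#   result = Win.message_box(nil, "Hello, Win32 world!", "Win32 via FFI", flags)
#   puts button_pressed(result)
puts button_pressed(1) # ok
```

Named constants make the call easier to read than a bare 1, and checking the return value lets the script react to the user's choice.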