Date: 19 June 1997
Author: Don Griffin
Both Windows 95 and Windows NT support mechanisms that allow 16-bit applications to load and call 32-bit DLL functions. There are, in fact, multiple techniques for doing this, and each is called a "thunk". There are Universal Thunks, Flat Thunks and Generic Thunks (and possibly more that I am unaware of). I found Generic Thunks simple to program, and they solved all of my 16-to-32 communication problems.
Using Generic Thunks does not require you to write thunk scripts, which is a big plus in my book. Generic Thunks are also available on both Windows 95 and Windows NT: a requirement that other thunks do not fulfill. The only difficulty I found was that the API calls needed to make Generic Thunks work aren't provided as imported functions. This simply means that one must use GetProcAddress and call the APIs via function pointer.
There are five functions needed to perform the Generic Thunk magic. These are:
    HINSTANCE32 LoadLibraryEx32W  (LPCSTR, DWORD, DWORD);
    BOOL        FreeLibrary32W    (HINSTANCE32);
    HPROC32     GetProcAddress32W (HINSTANCE32, LPCSTR);
    DWORD       GetVDMPointer32W  (LPVOID, UINT);
    DWORD       CallProc32W       (HPROC32, DWORD, DWORD, ...);
I concocted the HINSTANCE32 and HPROC32 data types; the documentation calls them DWORD and FARPROC respectively (if memory serves). The value returned by GetProcAddress32W is most definitely not a FARPROC, and you will not like the results if you try to call it directly. To prevent such problems at compile time, I label the return value as HPROC32 rather than FARPROC.
The simplest functions to understand are LoadLibraryEx32W and FreeLibrary32W. These functions behave just like the API calls LoadLibrary and FreeLibrary, except that they expect slightly different arguments. To call LoadLibraryEx32W, you simply need the filename. The other two parameters are ignored and should be 0.
Of similar complexity is GetProcAddress32W. This is the equivalent of GetProcAddress, and takes an HINSTANCE32 returned by LoadLibraryEx32W and a function name as arguments. The return value should be treated as a 32-bit "cookie". It is passed only to CallProc32W. For those adventurous souls, this value is the linear address of the function, but it cannot be called directly from 16-bit code.
GetVDMPointer32W is more of a utility function and is not usually needed when communicating with the 32-bit module. It takes a 16-bit far pointer and a size and returns the linear address of that buffer. This is needed if such a pointer is to be passed to 32-bit code. The function is seldom needed because CallProc32W makes this conversion for any pointer parameters you pass. It is useful when you need to pass a pointer to a structure that itself contains a pointer. CallProc32W can pass the structure pointer as one of its arguments, but the pointer inside the structure has to be converted into a 32-bit linear address before the 32-bit side can dereference it.
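To make the nested-pointer case concrete, here is a small portable model rather than real Win16 code. It assumes protected-mode addressing, where the linear address of a 16:16 pointer is the base of its selector plus the offset. The `SelectorTable` map, `ToLinear`, `Request` and `PrepareRequest` are all hypothetical names: they stand in for the descriptor table, for what GetVDMPointer32W computes, and for your own data structure.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the descriptor table: selector -> linear base.
typedef std::map<std::uint16_t, std::uint32_t> SelectorTable;

// Model of the 16:16 -> linear conversion. The selector is the high word
// of the far pointer and the offset is the low word.
std::uint32_t ToLinear(const SelectorTable& table, std::uint32_t farPtr)
{
    std::uint16_t selector = static_cast<std::uint16_t>(farPtr >> 16);
    std::uint16_t offset   = static_cast<std::uint16_t>(farPtr & 0xFFFFu);
    SelectorTable::const_iterator it = table.find(selector);
    return it == table.end() ? 0 : it->second + offset;
}

// Hypothetical structure handed to the 32-bit side. The thunk converts the
// pointer to the structure itself, but not the pointer stored inside it.
struct Request {
    std::uint32_t cbName;
    std::uint32_t name;   // 16:16 on entry, must be linear before the call
};

// Convert the embedded pointer in place before passing the structure.
void PrepareRequest(const SelectorTable& table, Request& req)
{
    req.name = ToLinear(table, req.name);
}
```

In the real 16-bit code, the body of `PrepareRequest` would call GetVDMPointer32W (or the GetVdmPtr16 wrapper shown later) on the embedded pointer instead of consulting a table.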
I saved the most complex function for last: CallProc32W. This is the function that actually transfers control from your 16-bit code to the 32-bit function. The first argument is simply the value returned by GetProcAddress32W. The third argument is the number of DWORD-sized parameters that follow. This must be in the range of 0 to 32. Since all parameters must be 32 bits wide, it is risky to pass an int as a parameter, because in 16-bit code that pushes only 16 bits on the stack. To pass an int, be sure to cast it to a long (or to a DWORD if it is a WORD, etc.). Fortunately, the return value from CallProc32W is straightforward: it's whatever the 32-bit function returns.
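The widening rule can be illustrated in portable C++. This is only a sketch: `std::int16_t` and `std::uint16_t` stand in for the 16-bit int and WORD, and the helper names `WidenSigned` and `WidenUnsigned` are made up for this example.

```cpp
#include <cstdint>

// Widen a signed 16-bit value the way a cast to long does: sign-extend.
inline std::uint32_t WidenSigned(std::int16_t value)
{
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(value));
}

// Widen an unsigned 16-bit value the way a cast to DWORD does: zero-extend.
inline std::uint32_t WidenUnsigned(std::uint16_t value)
{
    return static_cast<std::uint32_t>(value);
}
```

The two casts give different upper words for the same 16 bits, so pick the one that matches how the 32-bit function declares the parameter.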
The second parameter to CallProc32W is the hardest to deal with. It is an address conversion bit-mask. Each bit corresponds to one of the DWORD arguments passed. If the bit is set (i.e., is 1), the argument is assumed to be a far pointer and is converted to a linear address before being passed to the 32-bit side. If the bit is 0, the parameter is passed unmodified to the 32-bit code. The difficulty comes when you need to pass several arguments, some of which are pointers. You have to figure out the proper bit-mask value to pass to CallProc32W. The bits are allocated with the least significant bit (LSB, or bit 0) corresponding to the last argument. Bit 1 corresponds to the second-to-last argument, bit 2 to the third-to-last, and so on. For example, assume we want to call foo and it is declared like this:
foo(void *p1, DWORD dw2, void *p3, DWORD dw4, DWORD dw5);
The bits are:
    p1:  1
    dw2: 0
    p3:  1
    dw4: 0
    dw5: 0
Read from p1 down to dw5, that is binary 10100, which is 0x14 in hex. To call foo from 16-bit code, the call to CallProc32W would look like this:
CallProc32W(hFoo, 0x14, 5, p1, dw2, p3, dw4, dw5);
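Working the mask out by hand is error-prone, so a small helper can build it from per-argument flags. The following is a portable sketch, not part of the thunk API; the name `MakeAddrCvtMask` and its parameters are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Build the address-conversion mask from per-argument flags given in call
// order (first argument first). Each new flag is shifted in at bit 0, so
// the last argument ends up in bit 0 and the first in the highest bit used.
// At most 32 flags are consumed, matching the 0-to-32 argument limit.
std::uint32_t MakeAddrCvtMask(const bool* isPointer, std::size_t nArgs)
{
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < nArgs && i < 32; ++i)
        mask = (mask << 1) | (isPointer[i] ? 1u : 0u);
    return mask;
}
```

For foo, the flags in call order are true, false, true, false, false, and the helper yields 0x14, the same value derived by hand.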
I decided that I would encapsulate all this complexity in a C++ class designed for use as a base class. The class is called TGenericThunk. The constructor takes a filename, and it calls LoadLibraryEx32W to load the module. The function pointers we need to make the API calls are defined as static class members. The first object constructed will take care of initializing these values. The destructor will call FreeLibrary32W.
The derived class should add HPROC32 members for each function that it will be thunking to, which would be initialized in the derived class constructor. Further, the derived class provides methods that encapsulate the call to CallProc32W for each of those functions. After all this work, the user of the derived thunk class should have no problem using the thunk.
    #if !defined(__GENTHUNK_H) && !defined(__WIN32__)
    #define __GENTHUNK_H

    DECLARE_HANDLE32 (HPROC32);
    DECLARE_HANDLE32 (HINSTANCE32);

    class TGenericThunk
    {
      public:
        ~TGenericThunk ();

        bool IsOK () const { return mInst32 != 0; }

      protected:
        TGenericThunk (const char * fileName);

        DWORD __cdecl CallProc32 (HPROC32, DWORD fAddrCvt, DWORD nArgs, ...);
        void          FreeVdmPtr32 (void far *);
        HPROC32       GetProcAddr32 (const char * procName);
        DWORD         GetVdmPtr16 (void far *buffer, unsigned cbSize);
        void far *    GetVdmPtr32 (DWORD dwAddr32, DWORD cbSize);

        HINSTANCE32 GetInstance () const { return mInst32; }

      private:
        HINSTANCE32 mInst32;

        static HINSTANCE32 (WINAPI *LoadLibraryEx32W) (LPCSTR, DWORD, DWORD);
        static BOOL        (WINAPI *FreeLibrary32W)   (HINSTANCE32);
        static HPROC32     (WINAPI *GetProcAddress32W)(HINSTANCE32, LPCSTR);
        static DWORD       (WINAPI *GetVDMPointer32W) (LPVOID, UINT);
        static DWORD       (WINAPI *CallProc32W)      (HPROC32, DWORD, DWORD);
    };

    #endif // __GENTHUNK_H || __WIN32__
    #define STRICT
    #include <windows.h>
    #pragma hdrstop

    #ifndef __WIN32__  // only for 16-bit platform

    #include <stdarg.h>
    #include "GenThunk.h"

    HINSTANCE32 (WINAPI *TGenericThunk::LoadLibraryEx32W) (LPCSTR, DWORD, DWORD);
    BOOL        (WINAPI *TGenericThunk::FreeLibrary32W)   (HINSTANCE32);
    HPROC32     (WINAPI *TGenericThunk::GetProcAddress32W)(HINSTANCE32, LPCSTR);
    DWORD       (WINAPI *TGenericThunk::GetVDMPointer32W) (LPVOID, UINT);
    DWORD       (WINAPI *TGenericThunk::CallProc32W)      (HPROC32, DWORD, DWORD);

    TGenericThunk::TGenericThunk (const char * fileName)
    {
        //  Initialize our static members if they're not already:
        //
        if (! LoadLibraryEx32W)
        {
            HMODULE hKernel = GetModuleHandle ("KERNEL");

            (FARPROC) LoadLibraryEx32W  = GetProcAddress (hKernel, "LoadLibraryEx32W");
            (FARPROC) FreeLibrary32W    = GetProcAddress (hKernel, "FreeLibrary32W");
            (FARPROC) GetProcAddress32W = GetProcAddress (hKernel, "GetProcAddress32W");
            (FARPROC) GetVDMPointer32W  = GetProcAddress (hKernel, "GetVDMPointer32W");
            (FARPROC) CallProc32W       = GetProcAddress (hKernel, "CallProc32W");

            //  All or nothing:
            if (! LoadLibraryEx32W || ! FreeLibrary32W || ! GetProcAddress32W ||
                ! GetVDMPointer32W || ! CallProc32W)
            {
                LoadLibraryEx32W  = 0;
                FreeLibrary32W    = 0;
                GetProcAddress32W = 0;
                GetVDMPointer32W  = 0;
                CallProc32W       = 0;
            }
        }

        mInst32 = 0;
        if (LoadLibraryEx32W)
            mInst32 = LoadLibraryEx32W (fileName, 0, 0);
    }

    TGenericThunk::~TGenericThunk ()
    {
        if (mInst32)
        {
            FreeLibrary32W (mInst32);
            mInst32 = 0;
        }
    }

    HPROC32 TGenericThunk::GetProcAddr32 (const char * procName)
    {
        return mInst32 ? GetProcAddress32W (mInst32, procName) : NULL;
    }

    ////////////////////////////////////////////////////////////////////////////
    //  This method makes the call to the 32-bit function referenced by hProc.
    //  The fAddrCvt flags are the trickiest part, while nArgs is simply the
    //  number of DWORD arguments to pass. Each bit of fAddrCvt maps to one of
    //  the arguments. The lowest bit represents the last parameter, the 2nd
    //  lowest bit represents the 2nd-to-last parameter, etc. If the bit is
    //  set (i.e., a 1), that parameter is treated as a 16:16 far pointer. The
    //  32-bit side will receive a 32-bit pointer to that buffer. If the bit
    //  is not set, the 32-bit side will receive the 32-bit value unchanged.
    //
    DWORD __cdecl TGenericThunk::CallProc32 (HPROC32 hProc, DWORD fAddrCvt,
                                             DWORD nArgs, ...)
    {
        va_list args;
        DWORD   dwTemp;

        if (! CallProc32W || ! hProc)
            return 0;

        va_start (args, nArgs);

        //  Copy the arguments from our stack frame to the stack frame that
        //  CallProc32W expects:
        //
        for (DWORD n = nArgs; n; )
        {
            dwTemp = va_arg (args, DWORD);
            -- n;

            __asm push word ptr [dwTemp+2];
            __asm push word ptr [dwTemp];
        }

        va_end (args);

        //  Call CallProc32W. The pushed variable list precedes the parameters,
        //  as assumed by CallProc32W. Appropriate parameters will be popped by
        //  CallProc32W based on the value of nArgs:
        //
        return CallProc32W (hProc, fAddrCvt, nArgs);
    }

    ////////////////////////////////////////////////////////////////////////////
    //  This method returns the 32-bit address for the specified 16:16 address
    //  and size.
    //
    DWORD TGenericThunk::GetVdmPtr16 (void far *buffer, unsigned cbSize)
    {
        return GetVDMPointer32W ? GetVDMPointer32W (buffer, cbSize) : 0;
    }

    ////////////////////////////////////////////////////////////////////////////
    //  This method will allocate a 16:16 pointer that points at the specified
    //  32-bit memory address range. This pointer must be released by calling
    //  FreeVdmPtr32.
    //
    void far * TGenericThunk::GetVdmPtr32 (DWORD dwAddr32, DWORD cbSize)
    {
        UINT uSelector = ::AllocSelector (_DS); // our DS (for access rights)

        if (uSelector)
        {
            // Assign its linear base address and limit
            ::SetSelectorBase (uSelector, dwAddr32);
            ::SetSelectorLimit (uSelector, cbSize);
        }

        return MAKELP (uSelector, 0);
    }

    ////////////////////////////////////////////////////////////////////////////
    //  This method will free the selector allocated by GetVdmPtr32.
    //
    void TGenericThunk::FreeVdmPtr32 (void far *ptr)
    {
        if (SELECTOROF (ptr))
            ::FreeSelector (SELECTOROF (ptr));
    }

    #endif // __WIN32__
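The argument-copying loop in CallProc32 is the least obvious part of the listing, so here is a portable model of what it does. This is a sketch under stated assumptions: a `std::vector` stands in for the 16-bit stack, and the names `WordStack` and `PushDwords` are invented for the illustration. Callers must pass unsigned 32-bit values, just as the real method expects DWORDs.

```cpp
#include <cstdarg>
#include <cstdint>
#include <vector>

// Model of a 16-bit stack: each push appends one word. Index 0 is the first
// word pushed (the highest address on a real, downward-growing stack).
typedef std::vector<std::uint16_t> WordStack;

// Mirror of the loop in CallProc32: read nArgs 32-bit values from the
// variable argument list and push each as two words, high word first.
WordStack PushDwords(unsigned nArgs, ...)
{
    WordStack stack;
    va_list args;
    va_start(args, nArgs);
    for (unsigned n = nArgs; n; --n) {
        std::uint32_t value = va_arg(args, std::uint32_t);
        stack.push_back(static_cast<std::uint16_t>(value >> 16));
        stack.push_back(static_cast<std::uint16_t>(value & 0xFFFFu));
    }
    va_end(args);
    return stack;
}
```

Because a real stack grows downward, pushing the high word first leaves the low word at the lower address, which is the normal little-endian layout of a DWORD in memory. The arguments are also pushed in call order, first argument first.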
One of the most frustrating things about being a 16-bit app is that the 32-bit Operating Systems take delight in returning "incorrect" version numbers. Well, with a Generic Thunk, we can talk directly to the 32-bit OS and get the true version number.
    #include <owl/pch.h>
    #include "GenThunk.h"   // TGenericThunk (16-bit builds only)

    class TGetVersionThunk : public TGenericThunk
    {
      public:
        TGetVersionThunk () : TGenericThunk ("KERNEL32")
        {
            //  Get proc address for each function we need:
            m_hGetVer = GetProcAddr32 ("GetVersion");
        }

        //  Provide wrapper methods to call CallProc32W:
        uint32 GetVersion () { return CallProc32 (m_hGetVer, 0, 0); }

      private:
        HPROC32 m_hGetVer;
    };

    inline uint16 SwapBytes (uint16 w) { return uint16((w << 8) | (w >> 8)); }

    extern uint16 GetTrueOsVer ()
    {
        static uint16 trueOsVer;

    #ifndef __WIN32__
        if (! trueOsVer)
            if (TSystem::IsWoW() || TSystem::IsWin95())
                trueOsVer = SwapBytes (uint16 (TGetVersionThunk().GetVersion()));
    #endif

        //  Not WIN32 or else we failed to thunk to 32-bit:
        if (! trueOsVer)
            trueOsVer = SwapBytes (uint16 (::GetVersion()));

        return trueOsVer;
    }
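The byte swap deserves a word of explanation. The low word returned by GetVersion holds the major version in its low byte and the minor version in its high byte, so swapping the bytes gives a value that compares naturally (major first). The sketch below repeats the swap on fixed-width types so it can run anywhere; `MajorOf` and `MinorOf` are hypothetical helpers, and the sample values in the usage note are illustrative only.

```cpp
#include <cstdint>

// Same byte swap as SwapBytes in the listing, on fixed-width types.
inline std::uint16_t SwapBytes(std::uint16_t w)
{
    return static_cast<std::uint16_t>((w << 8) | (w >> 8));
}

// After the swap, the major version is the high byte and the minor the low.
inline unsigned MajorOf(std::uint16_t swapped) { return swapped >> 8; }
inline unsigned MinorOf(std::uint16_t swapped) { return swapped & 0xFFu; }
```

For example, a low word of 0x5F03 swaps to 0x035F, which decodes as version 3.95, while 0x0004 swaps to 0x0400, or version 4.0. A caller can then test the result of GetTrueOsVer with a plain comparison such as `>= 0x0400`.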
Well, that wraps it up for Generic Thunks. There is additional functionality that the 32-bit DLL can exploit if necessary. If your needs grow, the Microsoft Developer Network (MSDN) CD-ROM has good explanations and documentation on the rest of the API for Generic Thunks. There are also limitations in Win95 that prevent the 32-bit code loaded in this context from creating threads. This is also discussed in the MSDN documentation. If anyone has suggestions or questions, feel free to contact me at dongriffin at juno dot com or check out the OWL Listserv.