Chapter 4. Foreign Interface

Chez Scheme provides two ways to interact with "foreign" code, i.e., code written in other languages. The first is via subprocess creation and communication, which is discussed in Section 4.1. The second is via static or dynamic loading and invocation from Scheme of procedures written in C and invocation from C of procedures written in Scheme. These mechanisms are discussed in Sections 4.2 through 4.4.

The method for static loading of C object code is dependent upon which machine you are running; see the installation instructions distributed with Chez Scheme.

Section 4.1. Subprocess Communication

Two procedures, system and process, are used to create subprocesses. Both procedures accept a single string argument and create a subprocess to execute the shell command contained in the string. The system procedure waits for the subprocess to exit before returning, whereas the process procedure returns immediately without waiting for the subprocess to exit. The standard input and output files of a subprocess created by system may be used to communicate with the user's console. The standard input and output files of a subprocess created by process may be used to communicate with the Scheme process.

procedure: (system command)
returns: see below
libraries: (chezscheme)

command must be a string.

The system procedure creates a subprocess to perform the operation specified by command. The subprocess may communicate with the user through the same console input and console output files used by the Scheme process. After creating the subprocess, system waits for the process to exit before returning.

When the subprocess exits, system returns the exit code for the subprocess, unless (on Unix-based systems) a signal caused the subprocess to terminate, in which case system returns the negation of the signal that caused the termination, e.g., -1 for SIGHUP.
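As a small illustration of the return value, the following sketch assumes a Unix-based system whose shell provides the true and exit commands. The commands and the variable names status-ok and status-3 are chosen only for demonstration.

```scheme
;; Assumes a Unix-based system with a POSIX shell.
(define status-ok (system "true"))    ; the command exits with code 0
(define status-3 (system "exit 3"))   ; the shell itself exits with code 3

(printf "true   => ~s\n" status-ok)
(printf "exit 3 => ~s\n" status-3)
```

Because system waits for the subprocess, each definition completes only after the corresponding command has finished.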

procedure: (open-process-ports command)
procedure: (open-process-ports command b-mode)
procedure: (open-process-ports command b-mode ?transcoder)
returns: see below
libraries: (chezscheme)

command must be a string. If ?transcoder is present and not #f, it must be a transcoder, and this procedure creates textual ports, each of whose transcoder is ?transcoder. Otherwise, this procedure returns binary ports. b-mode specifies the buffer mode used by each of the ports returned by this procedure and defaults to block. Buffer modes are described in Section 7.2 of The Scheme Programming Language, 4th Edition.

open-process-ports creates a subprocess to perform the operation specified by command. Unlike system, open-process-ports returns immediately after creating the subprocess, i.e., without waiting for the subprocess to terminate. It returns four values:

to-stdin is an output port to which Scheme can send output to the subprocess through the subprocess's standard input file.
from-stdout is an input port from which Scheme can read input from the subprocess through the subprocess's standard output file.
from-stderr is an input port from which Scheme can read input from the subprocess through the subprocess's standard error file.
process-id is an integer identifying the created subprocess provided by the host operating system.

If the process exits or closes its standard output file descriptor, any procedure that reads input from from-stdout will return an end-of-file object. Similarly, if the process exits or closes its standard error file descriptor, any procedure that reads input from from-stderr will return an end-of-file object.

The predicate input-port-ready? may be used to detect whether input has been sent by the subprocess to Scheme.

It is sometimes necessary to force output to be sent immediately to the subprocess by invoking flush-output-port on to-stdin, since Chez Scheme buffers the output for efficiency.

On UNIX systems, the process-id is the process identifier for the shell created to execute command. If command is used to invoke an executable file rather than a shell command, it may be useful to prepend command with the string "exec ", which causes the shell to load and execute the named executable directly, without forking a new process (the shell equivalent of a tail call). This will reduce by one the number of subprocesses created and cause process-id to reflect the process identifier for the executable once the shell has transferred control.
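The pieces described above (the four return values, flushing to-stdin, and the "exec " prefix) might be combined as in the following sketch. It assumes a Unix-based system with cat on the shell's search path. The helper echo-through-cat is hypothetical and not part of Chez Scheme.

```scheme
;; Assumes a Unix-based system with cat on the shell's search path.
;; echo-through-cat is a hypothetical helper, not part of Chez Scheme.
(define (echo-through-cat line)
  (let-values ([(to-stdin from-stdout from-stderr pid)
                (open-process-ports "exec cat"
                                    (buffer-mode block)
                                    (native-transcoder))])
    (put-string to-stdin line)
    (newline to-stdin)
    ;; Output is buffered, so flush before waiting for a reply.
    (flush-output-port to-stdin)
    (let ([reply (get-line from-stdout)])
      ;; Closing to-stdin gives cat an end-of-file, so it exits.
      (close-port to-stdin)
      (close-port from-stdout)
      (close-port from-stderr)
      reply)))

(echo-through-cat "hello")
```

Passing a transcoder makes the ports textual, which is why put-string and get-line can be used here. Without the call to flush-output-port, the line could remain in Scheme's output buffer and get-line would wait indefinitely.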

procedure: (process command)
returns: see explanation
libraries: (chezscheme)

command must be a string.

process is similar to open-process-ports, but less general. It does not return a port from which the subprocess's standard error output can be read, and it always creates textual ports. It returns a list of three values rather than the four separate values of open-process-ports. The returned list contains, in order: from-stdout, to-stdin, and process-id, which correspond to the second, first, and fourth return values of open-process-ports.
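The following sketch shows one way to take apart the list returned by process. It assumes a Unix-based system with cat on the shell's search path. The helper round-trip-datum is hypothetical and not part of Chez Scheme.

```scheme
;; Assumes a Unix-based system with cat on the shell's search path.
;; round-trip-datum is a hypothetical helper, not part of Chez Scheme.
(define (round-trip-datum x)
  (let* ([p (process "exec cat")]
         [from-stdout (car p)]     ; first element: input port
         [to-stdin (cadr p)]       ; second element: output port
         [pid (caddr p)])          ; third element: process id (unused here)
    (write x to-stdin)
    (newline to-stdin)
    (flush-output-port to-stdin)
    (let ([reply (read from-stdout)])
      (close-port to-stdin)
      (close-port from-stdout)
      reply)))

(round-trip-datum '(1 2 3))
```

Since the ports are textual, the ordinary read and write procedures can be used to exchange Scheme data with a subprocess that echoes or produces printed representations.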

Section 4.2. Calling out of Scheme

Chez Scheme's foreign-procedure interface allows a Scheme program to invoke procedures written in C or in languages that obey the same calling conventions as C. Two steps are necessary before foreign procedures can be invoked from Scheme. First, the foreign procedure must be compiled and loaded, either statically or dynamically, as described in Section 4.6. Then, access to the foreign procedure must be established in Scheme, as described in this section. Once access to a foreign procedure has been established it may be called as an ordinary Scheme procedure.

Since foreign procedures operate independently of the Scheme memory management and exception handling system, great care must be taken when using them. Although the foreign-procedure interface provides type checking (at optimize levels less than 3) and type conversion, the programmer must ensure that the sharing of data between Scheme and foreign procedures is done safely by specifying proper argument and result types.

Scheme-callable wrappers for foreign procedures can also be created via ftype-ref and function ftypes (Section 4.5).

syntax: (foreign-procedure entry-exp (param-type ...) res-type)
syntax: (foreign-procedure conv entry-exp (param-type ...) res-type)
returns: a procedure
libraries: (chezscheme)

entry-exp must evaluate to a string representing a valid foreign procedure entry point or an integer representing the address of the foreign procedure. The param-types and res-type must be symbols or structured forms as described below. When a foreign-procedure expression is evaluated, a Scheme procedure is created that will invoke the foreign procedure specified by entry-exp. When the procedure is called each argument is checked and converted according to the specified param-type before it is passed to the foreign procedure. The result of the foreign procedure call is converted as specified by the res-type. Multiple procedures may be created for the same foreign entry.
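As a minimal sketch of the form, the following creates Scheme procedures for two C library functions. It assumes a Linux system whose C library can be loaded under the name "libc.so.6"; the library name differs on other systems. It relies on load-shared-object and foreign-entry?, which belong to the loading mechanism rather than to this section. The names c-strlen and c-abs are chosen only for illustration.

```scheme
;; Assumption: a Linux system whose C library is available as
;; "libc.so.6"; on other systems the library name differs.
(unless (foreign-entry? "strlen")
  (load-shared-object "libc.so.6"))

;; C prototype: size_t strlen(const char *s);
(define c-strlen
  (foreign-procedure "strlen" (string) size_t))

;; C prototype: int abs(int n);
(define c-abs
  (foreign-procedure "abs" (int) int))

(c-strlen "hello")
(c-abs -7)
```

Each call checks and converts its argument according to the declared parameter type, so (c-abs 1.5) would be rejected at optimize levels less than 3 rather than passed to C.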

If conv is present, it specifies the calling convention to be used. The default is #f, which specifies the default calling convention on the target machine. Three other conventions are currently supported, all only under Windows: __stdcall, __cdecl, and __com. Since __cdecl is the default, specifying __cdecl is equivalent to specifying #f or no convention.

Use __stdcall to access most Windows API procedures. Use __cdecl for Windows API varargs procedures, for C library procedures, and for most other procedures. Use __com to invoke COM interface methods; COM uses the __stdcall convention but additionally performs the indirections necessary to obtain the correct method from a COM instance. The address of the COM instance must be passed as the first argument, which should normally be declared as iptr. For the __com interface only, entry-exp must evaluate to the byte offset of the method in the COM vtable. For example,

(foreign-procedure __com 12 (iptr double-float) integer-32)

creates an interface to a COM method at offset 12 in the vtable encapsulated within the COM instance passed as the first argument, with the second argument being a double float and the return value being an integer.

Complete type checking and conversion is performed on the parameters. The types scheme-object, string, wstring, u8*, u16*, u32*, utf-8, utf-16le, utf-16be, utf-32le, and utf-32be must be used with caution, however, since they allow allocated Scheme objects to be used in places the Scheme memory management system cannot control. No problems will arise as long as such objects are not retained in foreign variables or data structures while Scheme code is running, since garbage collection can occur only while Scheme code is running. All other parameter types are converted to equivalent foreign representations and consequently can be retained indefinitely in foreign variables and data structures. Following are the valid parameter types:

integer-8: Exact integers from -2⁷ through 2⁸ - 1 are valid. Integers in the range 2⁷ through 2⁸ - 1 are treated as two's complement representations of negative numbers, e.g., #xff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually signed char).

unsigned-8: Exact integers from -2⁷ to 2⁸ - 1 are valid. Integers in the range -2⁷ through -1 are treated as the positive equivalents of their two's complement representation, e.g., -1 is treated as #xff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned char).

integer-16: Exact integers from -2¹⁵ through 2¹⁶ - 1 are valid. Integers in the range 2¹⁵ through 2¹⁶ - 1 are treated as two's complement representations of negative numbers, e.g., #xffff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually short).

unsigned-16: Exact integers from -2¹⁵ to 2¹⁶ - 1 are valid. Integers in the range -2¹⁵ through -1 are treated as the positive equivalents of their two's complement representation, e.g., -1 is treated as #xffff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned short).

integer-32: Exact integers from -2³¹ through 2³² - 1 are valid. Integers in the range 2³¹ through 2³² - 1 are treated as two's complement representations of negative numbers, e.g., #xffffffff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually int).

unsigned-32: Exact integers from -2³¹ to 2³² - 1 are valid. Integers in the range -2³¹ through -1 are treated as the positive equivalents of their two's complement representation, e.g., -1 is treated as #xffffffff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned int).

integer-64: Exact integers from -2⁶³ through 2⁶⁴ - 1 are valid. Integers in the range 2⁶³ through 2⁶⁴ - 1 are treated as two's complement representations of negative numbers. The argument is passed to C as an integer of the appropriate size (usually long long or, on many 64-bit platforms, long).

unsigned-64: Exact integers from -2⁶³ through 2⁶⁴ - 1 are valid. Integers in the range -2⁶³ through -1 are treated as the positive equivalents of their two's complement representation. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned long long or, on many 64-bit platforms, unsigned long).

double-float: Only Scheme flonums are valid; other Scheme numeric types are not automatically converted. The argument is passed to C as a double float.

single-float: Only Scheme flonums are valid; other Scheme numeric types are not automatically converted. The argument is passed to C as a single float. Since Chez Scheme represents flonums in double-float format, the parameter is first converted into single-float format.

short: This type is an alias for the appropriate fixed-size type above, depending on the size of a C short.

unsigned-short: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned short.

int: This type is an alias for the appropriate fixed-size type above, depending on the size of a C int.

unsigned: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned.

unsigned-int: This type is an alias for unsigned.

long: This type is an alias for the appropriate fixed-size type above, depending on the size of a C long.

unsigned-long: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned long.

long-long: This type is an alias for the appropriate fixed-size type above, depending on the size of the C type long long (standard since C99).

unsigned-long-long: This type is an alias for the appropriate fixed-size type above, depending on the size of the C type unsigned long long (standard since C99).

ptrdiff_t: This type is an alias for the appropriate fixed-size type above, depending on its definition in the host machine's stddef.h include file.

size_t: This type is an alias for the appropriate unsigned fixed-size type above, depending on its definition in the host machine's stddef.h include file.

ssize_t: This type is an alias for the appropriate signed fixed-size type above, depending on its definition on the host machine; on POSIX systems it is declared in the sys/types.h include file rather than stddef.h.

iptr: This type is an alias for the appropriate fixed-size type above, depending on the size of a C pointer.

uptr: This type is an alias for the appropriate (unsigned) fixed-size type above, depending on the size of a C pointer.

void*: This type is an alias for uptr.

fixnum: This type is equivalent to iptr, except only values in the fixnum range are valid. Transmission of fixnums is slightly faster than transmission of iptr values, but the fixnum range is smaller, so some iptr values do not have a fixnum representation.

boolean: Any Scheme object may be passed as a boolean. #f is converted to 0; all other objects are converted to 1. The argument is passed to C as an int.

char: Only Scheme characters with Unicode scalar values in the range 0 through 255 are valid char parameters. The character is converted to its Unicode scalar value, as with char->integer, and passed to C as an unsigned char.

wchar_t: Only Scheme characters are valid wchar_t parameters. Under Windows and any other system where wchar_t holds only 16-bit values rather than full Unicode scalar values, only characters with 16-bit Unicode scalar values are valid. On systems where wchar_t is a full 32-bit value, any Scheme character is valid. The character is converted to its Unicode scalar value, as with char->integer, and passed to C as a wchar_t.

wchar: This type is an alias for wchar_t.

double: This type is an alias for double-float.

float: This type is an alias for single-float.

scheme-object: The argument is passed directly to the foreign procedure; no conversion or type checking is performed. This form of parameter passing should be used with discretion. Scheme objects should not be preserved in foreign variables or data structures since the memory management system may relocate them between foreign procedure calls.

ptr: This type is an alias for scheme-object.

u8*: The argument must be a Scheme bytevector or #f. For #f, the null pointer (0) is passed to the foreign procedure. For a bytevector, a pointer to the first byte of the bytevector's data is passed. If the C routine to which the data is passed requires the input to be null-terminated, a null (0) byte must be included explicitly in the bytevector. The bytevector should not be retained in foreign variables or data structures, since the memory management system may relocate or discard it between foreign procedure calls and use its storage for some other purpose.

u16*: Arguments of this type are treated just like arguments of type u8*. If the C routine to which the data is passed requires the input to be null-terminated, two null (0) bytes must be included explicitly in the bytevector, aligned on a 16-bit boundary.

u32*: Arguments of this type are treated just like arguments of type u8*. If the C routine to which the data is passed requires the input to be null-terminated, four null (0) bytes must be included explicitly in the bytevector, aligned on a 32-bit boundary.

utf-8: The argument must be a Scheme string or #f. For #f, the null pointer (0) is passed to the foreign procedure. A string is converted into a bytevector, as if via string->utf8, with an added null byte, and the address of the first byte of the bytevector is passed to C. The bytevector should not be retained in foreign variables or data structures, since the memory management system may relocate or discard it between foreign procedure calls and use its storage for some other purpose.

utf-16le: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf16 with endianness little, and they are extended by two null bytes rather than one.

utf-16be: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf16 with endianness big, and they are extended by two null bytes rather than one.

utf-32le: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf32 with endianness little, and they are extended by four null bytes rather than one.

utf-32be: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf32 with endianness big, and they are extended by four null bytes rather than one.

string: This type is an alias for utf-8.

wstring: This type is an alias for utf-16le, utf-16be, utf-32le, or utf-32be as appropriate depending on the size of a C wchar_t and the endianness of the target machine. For example, wstring is equivalent to utf-16le under Windows running on Intel hardware.

(* ftype): This type allows a pointer to a foreign type (ftype) to be passed. The argument must be an ftype pointer of type ftype, and the actual argument is the address encapsulated in the ftype pointer. See Section 4.5 for a description of foreign types.
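Two of the parameter conversions described above can be modeled in plain Scheme: the two's complement treatment of the fixed-size integer types, and the null-terminated bytevector produced for a utf-8 argument. The helpers as-signed, as-unsigned, and string->c-utf8 are hypothetical. They only imitate the documented behavior and are not how the foreign-procedure interface is implemented.

```scheme
;; Hypothetical helpers; not part of Chez Scheme.

;; Model of integer-N parameters: values in the range
;; 2^(N-1) through 2^N - 1 are treated as negative numbers.
(define (as-signed n bits)
  (if (>= n (expt 2 (- bits 1)))
      (- n (expt 2 bits))
      n))

;; Model of unsigned-N parameters: values in the range
;; -2^(N-1) through -1 are treated as their positive equivalents.
(define (as-unsigned n bits)
  (if (< n 0)
      (+ n (expt 2 bits))
      n))

;; Model of the utf-8 parameter type: the bytes produced by
;; string->utf8, followed by one added null byte.
(define (string->c-utf8 s)
  (let* ([bv (string->utf8 s)]
         [n (bytevector-length bv)]
         [out (make-bytevector (+ n 1) 0)])
    (bytevector-copy! bv 0 out 0 n)
    out))

(as-signed #xff 8)       ; => -1
(as-unsigned -1 16)      ; => 65535, i.e., #xffff
(string->c-utf8 "hi")    ; => #vu8(104 105 0)
```

A bytevector built along the lines of string->c-utf8 is also what a u8* argument would need to look like when the C routine expects null-terminated input, since no null byte is added for u8*.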

The result types are similar to the parameter types with the addition of a void type. In general, the type conversions are the inverse of the parameter type conversions. No error checking is performed on return, since the system cannot determine whether a foreign result is actually of the indicated type. Particular caution should be exercised with the result types scheme-object, double-float, double, single-float, float, and the types that result in the construction of bytevectors or strings, since invalid return values may lead to invalid memory references as well as incorrect computations. Following are the valid result types:

void: The result of the foreign procedure call is ignored and an unspecified Scheme object is returned. void should be used when foreign procedures are called for effect only.

integer-8: The result is interpreted as a signed 8-bit integer and is converted to a Scheme exact integer.

unsigned-8: The result is interpreted as an unsigned 8-bit integer and is converted to a Scheme nonnegative exact integer.

integer-16: The result is interpreted as a signed 16-bit integer and is converted to a Scheme exact integer.

unsigned-16: The result is interpreted as an unsigned 16-bit integer and is converted to a Scheme nonnegative exact integer.

integer-32: The result is interpreted as a signed 32-bit integer and is converted to a Scheme exact integer.

unsigned-32: The result is interpreted as an unsigned 32-bit integer and is converted to a Scheme nonnegative exact integer.

integer-64: The result is interpreted as a signed 64-bit integer and is converted to a Scheme exact integer.

unsigned-64: The result is interpreted as an unsigned 64-bit integer and is converted to a Scheme nonnegative exact integer.

double-float: The result is interpreted as a double float and is translated into a Chez Scheme flonum.

single-float: The result is interpreted as a single float and is translated into a Chez Scheme flonum. Since Chez Scheme represents flonums in double-float format, the result is first converted into double-float format.