REF SYSPOP11                                        John Gibson Aug 1987

        COPYRIGHT University of Sussex 1987. All Rights Reserved.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<                             >>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<    POP-11 SYSTEM DIALECT    >>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<         (sysPOP-11)         >>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<                             >>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

WARNING: THIS IS INCOMPLETE AND OUT OF DATE
===========================================


Overview
--------
This file describes sysPOP-11, the extended dialect of POP-11 used in
POPLOG system source files. Its purpose is to enable system source code
to perform operations that are outside the scope of normal POP-11, but
which are nevertheless essential for the underlying implementation of
the system. (Note that sysPOP-11 is recognised only by the system source
compiler POPC, and is not available inside the normal POPLOG run-time
system.)

In general, sysPOP-11 is POP-11 plus a collection of extra syntactic
constructs, and the ability to manipulate values which are not proper
POPLOG objects. We deal with the latter first.


Pop and Non-Pop Values
----------------------
An essential requirement for some parts of the source code that
implements POPLOG is the ability to manipulate data that does not
conform to the normal representation of POPLOG items; this includes, for
example, integers in normal 'machine' rather than POPLOG format, as well
as pointers/addresses that do not necessarily address sensible POPLOG
objects. We call these 'non-pop' values, as distinguished from
properly-represented 'pop' data.

The basic problem with handling non-pop data concerns garbage collection.
Normally the garbage collector will test all values in the system for
being structure-pointers, tracing and copying structures so found and
relocating their pointers, etc; on the assumption that non-pop values
are not actually distinguishable as such, confusion would result if the
garbage collector were allowed to process them. It is thus necessary to
confine the occurrence of these values to locations that are specially
defined to hold them, and that are ignored by the garbage collector.

To this end, sysPOP segregates all identifiers, both constants and
variables, into pop type and non-pop type. Generally, non-pop values
must only ever be assigned to non-pop identifiers, and never to pop ones
(or indeed to any locations, like structure fields etc, which should
contain proper pop items). For the other way round, pop values that are
simple (i.e. not pointers) may safely be assigned to non-pop
identifiers, but assigning a compound item to one may invalidate it if a
garbage collection occurs.

[There is, however, one set of locations that unavoidably must contain
both types of value: the user stack. For this, the garbage collector
will attempt to determine whether values are of pop or non-pop type, but
this is not guaranteed to succeed. In practice the problem almost never
arises, because most system procedures that pass non-pop arguments and
results do so in a context where garbage collections cannot happen.]

In parallel with the manipulation of non-pop data, sysPOP also defines a
non-pop 'procedure' call, for executing subroutines. Nominally,
subroutines are units of code defined by the hand-coded assembler files
that form part of the system source, and which obey a much simpler
calling protocol than ordinary POPLOG procedures; in practice however,
many of them are explicitly recognised by POPC and compiled directly
into in-line code. A subroutine call is defined simply as a call of a
non-pop identifier, either constant or variable as for ordinary
procedures (i.e.
in both cases, the value of the identifier is the address of the
subroutine). Subroutine arguments and results are passed on the user
stack in the normal way.


The Underscore Convention
-------------------------
Non-pop identifiers in sysPOP have no special declaration construct, but
are simply recognised on a naming basis: a non-pop identifier is one
whose name begins with the character `_` (underscore). This applies to
all variables and constants, both permanent and lexical. All other
identifiers, whose names do not begin with `_`, are pop-type and have
their standard POP-11 meaning. Thus

        constant _c;
        vars _v;
        lconstant _lc;
        lvars _lv;

all declare non-pop identifiers. Remember that any values assigned to
such identifiers will be TOTALLY IGNORED by the garbage collector.

A call of a non-pop identifier is interpreted as a subroutine call, e.g.

        constant _a_subr;

        define foo(_x);
            lvars _x;
            _a_subr(_x)
        enddefine;

is a procedure that calls the subroutine -_a_subr- on its (non-pop
valued) argument. Alternatively,

        define baz(_x, _subroutine);
            lvars _x, _subroutine;
            _subroutine(_x)
        enddefine;

is a procedure that calls a variable subroutine supplied as its second
argument, applying it to its first argument. The call

        baz(_value, _a_subr);

would then have the same effect as foo(_value). Note that, like ordinary
POP-11 procedures, constant or variable subroutines can be declared and
used as operators.

The convention of flagging non-pop values with an `_` prefix also
applies to numeric literals. That is, an integer or floating-point
constant prefixed with an underscore will generate the constant in
normal machine format rather than ordinary POP-11 format. E.g.

        _1,  _16:FFFF,  _-3

all produce (signed) machine integer values (cf -_int- described below),
while

        _2.5s0

results in a machine format single-length float. For double-length
floats, the result is a pointer to a machine format double, e.g.

        _-29.3,  _0.5689e2

generate addresses of double floats in memory.

For literals where the underscore does not lexically join with the
character immediately following it, the macro _: may be used, e.g.

        _:`A`

generates a machine format `A` character constant. This can in fact be
used with any numeric literal, e.g.

        _:1,  _:16:FFFF,  _:-3

including a numeric literal defined as a macro, as in

        lconstant macro FOO = 14;
        _:FOO -> _x;

etc.

The final use of _: is to generate references to symbolic expressions,
the values of which are only known to the assembler processing the code
output by POPC. In this case, the _: is followed by a string containing
the expression. E.g. in VAX VMS, the symbol RMS$_FNF is an assembler
name for a particular return status code, the value of which could be
referenced as

        _:'RMS$_FNF'


Standard Subroutines
--------------------
The following sections in this file describe the standard subroutines
made available by POPC (many of which actually compile to inline code).
It is important to note that, unlike most normal POP-11 procedures,
these operations do not check the values of their arguments; indeed,
where the arguments are non-pop values, there is no way of doing so.
Hence, garbage out will result from garbage in.


Arithmetic Operations and Comparisons on Machine Integers
---------------------------------------------------------

_I _add _J -> _K                                 [subroutine operator 5]
_I _sub _J -> _K                                 [subroutine operator 5]
_I _mult _J -> _K                                [subroutine operator 4]
        These operators respectively add, subtract and multiply their
        arguments as signed machine integers. If overflow occurs, the
        high order bits of the result will be truncated. (Of these
        operators, at least -_add- and -_sub- should always compile to
        inline code.)

_I _div _J -> _REM -> _QUOT                      [subroutine operator 4]
        Divides the machine integer _I by the machine integer _J,
        producing a quotient _QUOT and a remainder _REM (Cf // for pop
        integers).

_negate(_I) -> _J                                           [subroutine]
        Returns the negation of its signed machine integer argument.
        If overflow occurs, the sign bit of the result will be
        truncated.

_I _gr _J -> BOOL                                [subroutine operator 6]
_I _greq _J -> BOOL                              [subroutine operator 6]
_I _lt _J -> BOOL                                [subroutine operator 6]
_I _lteq _J -> BOOL                              [subroutine operator 6]
        These operators compare their arguments as UNSIGNED machine
        integers for greater than, greater than or equal, less than and
        less than or equal respectively.

_I _sgr _J -> BOOL                               [subroutine operator 6]
_I _sgreq _J -> BOOL                             [subroutine operator 6]
_I _slt _J -> BOOL                               [subroutine operator 6]
_I _slteq _J -> BOOL                             [subroutine operator 6]
        These operators compare their arguments as SIGNED machine
        integers for greater than, greater than or equal, less than and
        less than or equal respectively.

_zero(_I) -> BOOL                                           [subroutine]
_nonzero(_I) -> BOOL                                        [subroutine]
_neg(_I) -> BOOL                                            [subroutine]
_nonneg(_I) -> BOOL                                         [subroutine]
        Tests on machine integers for being zero, non-zero, negative and
        non-negative (>= 0) respectively.


Bitwise Logical Operations on Machine Integers
----------------------------------------------

_I _biset _J -> _K                               [subroutine operator 4]
_I _biclear _J -> _K                             [subroutine operator 4]
_I _bimask _J -> _K                              [subroutine operator 4]
_I _bixor _J -> _K                               [subroutine operator 4]
        Logical bit operations on machine integers:

        -_biset- ORs its arguments, so that _K has ones in each position
        that either _I or _J have. (Cf || for pop integers.)

        -_biclear- ANDs _I with the bit complement of _J, so that _K has
        ones in each position that _I has ones but _J has zeros.
        (Cf &&~~ for pop integers.)

        -_bimask- ANDs its arguments, so that _K has ones in each
        position that both _I and _J have. (Cf && for pop integers.)

        -_bixor- EXCLUSIVE ORs its arguments, so that _K has ones in
        each position that either _I or _J have but not both.
        (Cf ||/& for pop integers.)

_logcom(_I) -> _J                                           [subroutine]
        Returns the bitwise complement of its argument, i.e. _J has ones
        in each position that _I has zeros. (Cf ~~ for pop integers.)
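
As an illustration of the comparison and bitwise operators described so
far, the following is a small Python model (it is not sysPOP-11, and says
nothing about how POPC compiles these operators). It assumes a 32-bit
machine word; the word size and every function name are assumptions made
for the sketch. It shows how the unsigned comparisons (-_gr- etc) and the
signed ones (-_sgr- etc) can disagree about the same bit pattern.

```python
# Python model of some machine-integer operators on a hypothetical
# 32-bit word. The word size and all names are assumptions.
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1


def to_signed(bits):
    """Read a WORD_BITS-wide bit pattern as a two's-complement integer."""
    bits &= MASK
    if bits >> (WORD_BITS - 1):
        return bits - (1 << WORD_BITS)
    return bits


def add(i, j):
    """Like _add: on overflow the high order bits are truncated."""
    return (i + j) & MASK


def gr(i, j):
    """Like _gr: compare the bit patterns as UNSIGNED integers."""
    return (i & MASK) > (j & MASK)


def sgr(i, j):
    """Like _sgr: compare the bit patterns as SIGNED integers."""
    return to_signed(i) > to_signed(j)


def biset(i, j):
    """Like _biset: bitwise OR."""
    return (i | j) & MASK


def biclear(i, j):
    """Like _biclear: AND i with the bit complement of j."""
    return i & ~j & MASK


def bimask(i, j):
    """Like _bimask: bitwise AND."""
    return i & j & MASK


def bixor(i, j):
    """Like _bixor: bitwise exclusive OR."""
    return (i ^ j) & MASK


def logcom(i):
    """Like _logcom: bitwise complement within one word."""
    return ~i & MASK


# The all-ones pattern is the largest unsigned value but is -1 when signed.
ALL_ONES = logcom(0)
```

In this model gr(ALL_ONES, 1) is true while sgr(ALL_ONES, 1) is false,
because the all-ones pattern reads as -1 when treated as signed.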

_shift(_I, _J) -> _K                                        [subroutine]
        Shifts _I left or right by _J bit positions, depending on the
        sign of _J (i.e. positive = left shift, negative = right shift).
        The shift is ARITHMETIC, meaning that on a right shift, the sign
        bit of _I is propagated from the left; if overflow occurs on a
        left shift, the high order bits are truncated. (Cf << for pop
        integers, although this of course deals with overflow by
        producing a big integer.)

_I _bitst _J -> BOOL                             [subroutine operator 4]
        This operator returns true if _I and _J have any common bits
        set, i.e. same as

                _nonzero(_I _bimask _J)

        but faster.


Conversion Between Pop and Machine Integers
-------------------------------------------

_int(I) -> _I                                               [subroutine]
        Converts a simple pop integer I to the equivalent machine
        integer _I. Since the range of a machine integer is always
        greater than or equal to that of a pop integer, overflow cannot
        occur.

_pint(_I) -> I                                              [subroutine]
        Converts a signed machine integer _I to the equivalent simple
        pop integer I. This operation can overflow; if it does, the
        value of the result is undefined.

_pint_testovf(_I) -> <true> -> I                            [subroutine]
_pint_testovf(_I) -> <false>
        Same as -_pint-, but tests for overflow. If overflow occurs, the
        single result <false> is returned; otherwise, there are two
        results, <true> and the equivalent simple pop integer.


Arithmetic Operations on Pop Integers
-------------------------------------
Note that of the standard POP-11 fast integer operators at least fi_+
and fi_- are compiled by POPC into inline code.

_padd_testovf(I, J) -> BOOL -> K                            [subroutine]
_psub_testovf(I, J) -> BOOL -> K                            [subroutine]
_pmult_testovf(I, J) -> BOOL -> K                           [subroutine]
        Add, subtract and multiply two pop simple integers with a test
        for overflow. BOOL is <true> if overflow doesn't occur, <false>
        if it does. Note that whether overflow occurs or not, the result
        K is always returned.
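
The result conventions of -_pint_testovf- and -_padd_testovf- can be
sketched with a Python model (again not sysPOP-11). Several things here
are assumptions: the 30-bit signed range for pop simple integers (the
real range is implementation dependent), the wrapped value used for K
after an overflow (the text says only that K is always returned), and
the use of tuples to stand for stacked results, listed from the bottom
of the stack upwards so that the flag comes last.

```python
# Python model of overflow-tested pop integer operations.
# POPINT_BITS is a hypothetical width, chosen only for illustration.
POPINT_BITS = 30
POPINT_MIN = -(1 << (POPINT_BITS - 1))
POPINT_MAX = (1 << (POPINT_BITS - 1)) - 1


def in_popint_range(n):
    """True if n fits in the assumed pop simple integer range."""
    return POPINT_MIN <= n <= POPINT_MAX


def wrap_popint(n):
    """Truncate n to POPINT_BITS bits (an assumed overflow behaviour)."""
    n &= (1 << POPINT_BITS) - 1
    if n > POPINT_MAX:
        n -= 1 << POPINT_BITS
    return n


def pint_testovf(machine_int):
    """Like _pint_testovf: (I, True) on success, (False,) on overflow."""
    if in_popint_range(machine_int):
        return (machine_int, True)
    return (False,)


def padd_testovf(i, j):
    """Like _padd_testovf: always returns K, then the overflow flag."""
    total = i + j
    return (wrap_popint(total), in_popint_range(total))
```

For instance, padd_testovf(POPINT_MAX, 1) gives a false flag together
with a wrapped K, whereas pint_testovf(POPINT_MAX + 1) gives only the
false flag, so a caller must test the flag before removing anything
else from the stack.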

_pshift_testovf(I, _N) -> <true> -> J                       [subroutine]
_pshift_testovf(I, _N) -> <false>
        Arithmetically shifts left the simple pop integer I by machine
        integer _N places (_N must not be negative). If overflow occurs,
        the single result <false> is returned; otherwise, there are two
        results, <true> and the shifted pop integer.


Control Stack Operations
------------------------
These subroutines provide access to the POPLOG control stack.

_sp() -> _SFPTR                                             [subroutine]
        Returns a pointer to the call stack frame for the current
        procedure.

_caller_sp() -> _SFPTR                                      [subroutine]
        Returns a pointer to the call stack frame for the caller of the
        current procedure. (Same as _nextframe(_sp()) but faster.)

_nextframe(_SFPTR) -> _NEXT_SFPTR                           [subroutine]
        Given a pointer to a call stack frame, returns a pointer to the
        next caller's frame.


User Stack Operations
---------------------
These subroutines provide access to the POPLOG user stack.

_userhi                                                       [variable]
        This variable always holds the limit address for the user stack,
        i.e. the address of the word following its end. Changed by
        -_move_userstack- when the stack is shifted up or down in
        memory.

_user_sp() -> _USPTR                                        [subroutine]
_USPTR -> _user_sp()
        Returns or updates the current user stack pointer, i.e. a word
        pointer to its top item. If the stack is empty then

                _user_sp() == _userhi

        will be true. The updater should be used very carefully indeed.

_stklength() -> _WOFFS                                      [subroutine]
        Returns the current size of the user stack as a word offset
        _WOFFS. It will always be true that

                _user_sp()@(w){_WOFFS} == _userhi

_useras(_WOFFS)                                             [subroutine]
        Erases a block of words of word offset size _WOFFS from the top
        of the user stack.

_move_userstack(_WOFFS)                                     [subroutine]
        Shifts the whole user stack up or down in memory by an amount
        given by the signed word offset _WOFFS, changing both the user
        stack pointer and -_userhi- appropriately (i.e. the new _userhi
        will be _userhi@(w){_WOFFS} ).
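
The relationships between -_user_sp-, -_userhi- and -_stklength- stated
above can be checked against a toy Python model (not sysPOP-11). It
follows from the two identities given that the stack pointer lies below
-_userhi- by the stack length, so the model grows the stack towards
lower addresses. The byte-addressed machine with 4-byte words, the
push/pop helpers and all names are assumptions made for the sketch.

```python
# Toy model of the user stack bookkeeping. Assumed: byte addresses,
# 4-byte words, and every name used here.
WORD = 4


class UserStack:
    def __init__(self, userhi):
        self.userhi = userhi   # like _userhi: address just past the stack
        self.sp = userhi       # like _user_sp(): equals userhi when empty
        self.mem = {}          # address -> stacked item

    def push(self, item):
        self.sp -= WORD        # the stack grows towards lower addresses
        self.mem[self.sp] = item

    def pop(self):
        item = self.mem.pop(self.sp)
        self.sp += WORD
        return item

    def stklength(self):
        """Like _stklength(): the stack size as a word offset."""
        return self.userhi - self.sp

    def useras(self, woffs):
        """Like _useras: erase a block of size woffs from the top."""
        for addr in range(self.sp, self.sp + woffs, WORD):
            del self.mem[addr]
        self.sp += woffs

    def move_userstack(self, woffs):
        """Like _move_userstack: shift the stack by a signed offset."""
        self.mem = {addr + woffs: item for addr, item in self.mem.items()}
        self.sp += woffs
        self.userhi += woffs
```

In this model, sp + stklength() == userhi holds after every operation,
including a move, which mirrors the identity given for -_stklength-.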


Procedure Calling/Chaining
--------------------------
These subroutines are optimised to inline code.

_chainfrom_caller(P)                                        [subroutine]
_fast_chainfrom_caller(P)                                   [subroutine]
        Chains the procedure P from the caller of the current procedure,
        with or without a procedure check on P. Effectively the same as

                chain(P, chain)
                chain(P, fast_chain)

        in normal POP-11.

_srchain(_SUBROUTINE)                                       [subroutine]
        Chains the subroutine _SUBROUTINE from the current POP
        procedure, i.e. unwinds the current procedure and then calls
        _SUBROUTINE.


Special Operations
------------------

_mksimple(COMPOUND) -> SIMPLE                               [subroutine]
        Disguises a compound object (i.e. a pop pointer) as a simple
        one, in such a way that the original value can later be
        recovered with -_mkcompound-. How this is done is unspecified;
        it is merely guaranteed that if iscompound(COMPOUND) is true
        then issimple(SIMPLE) will be true also, and moreover,

                _mkcompound(SIMPLE) == COMPOUND.

        Use this with care; a pointer disguised as a simple object will
        not be processed properly by the garbage collector.

_mkcompound(SIMPLE) -> COMPOUND                             [subroutine]
        Recovers the value of a compound object disguised as a simple
        one with -_mksimple-. It should never be used for any other
        purpose.

_subsv0

plog_trail_push


sysPOP-11 Syntax Constructs
---------------------------

_extern                                                         [syntax]
_extern_indir                                                   [syntax]
        Used for calling external routines.

exported                                                        [syntax]
nonexported                                                     [syntax]
protected                                                       [syntax]
        These three syntax words may be used to prefix a -constant- or
        -vars- statement (at execute level). Both -exported- and
        -protected- cause permanent identifiers so declared to be
        exported to the normal run-time system dictionary, the latter
        with the identifier marked as protected. -nonexported- prevents
        exportation of identifiers.

normal_compile                                                  [syntax]
end_normal_compile                                              [syntax]

WRITEABLE


Structure and Pointer Operations
--------------------------------
sysPOP-11 provides a set of syntax constructs for performing operations
with pointers to data and data structures, which are in many ways
similar to the facilities available in the "C" language (although by no
means identical). The remainder of this file is devoted to an
explanation of these constructs.


Types
-----
All pointer and related operations require a type specification for the
unit of data to which they refer. Pointers are actually assumed to be of
three basic types, viz "byte", "short" and "word", other types being
derived from these. Currently, the only such derived types available are
"double" (= 2 words) and "vpage" (= N words, depending on the virtual
page size), but at some time in the future the "structure" syntax
construct will be redesigned along the lines of "struct" in C to allow
"structure X" as a type, etc.

As in the "structure" construct, a minus sign prefixing a type indicates
a signed value as opposed to an unsigned value (e.g. "-byte"). The types
"byte", "short", "word" and "double" may be abbreviated to "b", "s", "w"
and "d" respectively.

Bitfield types are specified by positive and negative integers
(respectively unsigned and signed) as for "conskey". Because it is
assumed that there are no proper bit pointers, and that bit fields will
always be accessed via byte, short or word pointers (depending on the
implementation), those operations which take the address of a field are
not allowed for bitfields.

There is also a type "code" for pointers to executable machine code.
This will be synonymous with "byte", "short" or "word" according to the
particular implementation.

struct


! and @
-------
The syntax operators "!" and "@" respectively access a value through a
pointer, or construct a new pointer to that value.
Syntactically, the two are identical, the operator being preceded by a
<pointer expression> and followed by either a <field name> (for a field
declared with -struct-), or a <type spec> in parentheses. E.g.

        pair!P_BACK

accesses the P_BACK word of a pair, while

        _bptr!(b)

gets the byte pointed to by -_bptr-. In the case of a <field name>
declared in a structure X, the input pointer is taken to be a pointer to
"struct X", conversion of the pointer to the field type being effected
when necessary.


Index Expression
----------------
The <field name> or (<type spec>) following ! or @ may optionally be
followed by a machine integer <index expression> in square brackets
[...], where an index value of N references the (N+1)-th component of
the given type. So

        string!V_BYTES[_3]

gets the fourth byte of the string "string", and

        _wptr!(w)[_n]

accesses the _n-th plus 1 word at "_wptr". These two examples are
respectively equivalent to

        string@V_BYTES[_0] -> _bptr;
        _bptr!(b)[_3]

and

        _wptr@(w)[_n] -> _wptr;
        _wptr!(w)

etc. (Note that a zero index is the same as none, i.e. there is no
difference between 'string@V_BYTES' and 'string@V_BYTES[_0]' .)

The <index expression> may of course have a negative value, but a
positive value may be taken in the negative sense by preceding the "["
with a minus sign, e.g.

        _wptr!(w)-[_n],     _wptr!(w)[_negate(_n)]

both access the _n-th plus 1 word backwards from "_wptr" (although the
first form produces better code where the index is non-constant).


Offset Expression
-----------------
Instead of an index, the <field name> or (<type spec>) may be followed
by an <offset expression> in curly brackets {...}. An offset is a value
whose units are already those of the address units of the pointer, and
which is added to/subtracted from the pointer directly (as opposed to an
index, which is first scaled by multiplying by the size in address units
of the data type, to turn it into an offset). Offsets are produced by
the "@@" syntax word, described below.
For example,

        _wptr!(w)[_n] -> _word;

is equivalent to

        @@(w)[_n] -> _woffs;        ;;; word index _n to word offset
        _wptr!(w){_woffs} -> _word;

In general, using offsets rather than indices produces more efficient
code, since it can avoid the continual scaling necessary with the
latter.

An offset can also be specified as the difference between two pointers,
by replacing the single <offset expression> with two
<pointer expression>s separated by a comma. E.g.

        _ptr1@(w){_ptr2, _ptr3}

gives "_ptr1" offset by the offset from "_ptr3" to "_ptr2" (that is,

        _ptr3@(w){_ptr2, _ptr3}

is equal to "_ptr2").

As with indices, the "{" may be preceded by a minus sign to indicate
subtraction of the offset rather than addition.


Postincrement/Predecrement
--------------------------
Any of the variations on ! or @ described thus far can also take an
optional autoincrement or autodecrement modifier, viz "++" or "--",
causing the total value of the pointer construct to be incremented or
decremented by the size of one unit of the data type. In the case of !,
the new pointer is produced as a second result, the accessed value being
got before incrementing for "++" (postincrement) and after decrementing
for "--" (predecrement). Reflecting this difference, "++" appears at the
end of the construct, while "--" appears before it in front of the ! or
@. E.g.

        _ptr!(w)++ -> _ptr -> _word;

is the same as

        _ptr!(w) -> _word;
        _ptr@(w)[_1] -> _ptr;

while

        _ptr--!(w) -> _ptr -> _word

is equivalent to

        _ptr@(w)-[_1] -> _ptr;
        _ptr!(w) -> _word;

For "@",

        _ptr@(w)[_1],       _ptr@(w)++

are the same, as are

        _ptr@(w)-[_1],      _ptr--@(w)

Note that "++" and "--" can be used with indices or offsets, e.g.

        string!V_BYTES[_3]++ -> _bptr -> _byte;

returns the fourth byte of the string and a pointer to its fifth byte.


@@ and ##
---------
The operation of "@" can be thought of as evaluating a total offset
which is then added to the <pointer expression> preceding it. This total
offset can be obtained directly by substituting @ with the syntax word
"@@", and omitting the preceding pointer.
For example,

        @@V_WORDS[_3] -> _woffs;

returns the word offset to the fourth word of a word vector. To
increment this to the fifth word, we could use

        @@(w){_woffs}++ -> _woffs;

etc. Or, @@ can be used to get the offset difference between two
pointers, as in

        @@(b){_bptr1, _bptr2}

and so on.

"##" is the same as "@@" except that it returns the total offset as an
index, dividing it first by the size in address units of the data type.
Thus for example,

        ##(w){ @@(w)[_n] }

is identically equal to _n.


Type Conversion
---------------
Pointers to different data types must not be assumed to be
interchangeable (even though they MAY be in a particular
implementation); when a pointer to one type is to be used to access a
datum of another type, the necessary conversion must be specified.

When ! or @ is followed by a <field name>, conversion between 'pointer
to structure X' and 'pointer to x' (where x is the type of the field) is
implicit, and no other conversion can be specified.

When not used with a field, the type of the incoming pointer can be
specified as different from the main type of the construct by preceding
the main type with '<type> ->', as in

        _wptr!(w->b)

which converts the word pointer "_wptr" to a byte pointer, and returns
the byte pointed to by the latter. Similarly,

        _wptr@(w->b)[_3]

returns the converted pointer incremented by 3 bytes.

Type conversion may also be specified for indices and offsets. In this
case, the <index expression>, <offset expression> or pair of
<pointer expression>s inside [...] or {...} may be followed by
' | <type>' to specify the type of the index or offset as being
different from the main type. For example,

        _bptr!(b)[_2|w]

gives the byte at offset 2 words from "_bptr", while

        ##(b){_woffs|w}

returns a word offset _woffs as a number of bytes.

Unless otherwise specified, all the above conversions on pointers,
indices and offsets are (when possible) effected without any truncation
or rounding (put another way, the conversion is done as
'straightforwardly' as possible).
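
The arithmetic behind indices, offsets and the '|' conversion can be
modelled in Python with addresses as plain integers (this is a model,
not sysPOP-11). It assumes a byte-addressed machine with the unit sizes
shown, so that an offset is simply a number of bytes; on such a machine
a pointer conversion like (w->b) leaves the address unchanged, which the
text warns is not guaranteed in general. The sizes and all names are
assumptions made for the sketch.

```python
# Model of index and offset arithmetic. Assumed: byte addressing and
# the unit sizes below (in address units per item).
SIZE = {"b": 1, "s": 2, "w": 4, "d": 8}


def offset_from_index(typ, index, index_type=None):
    """Like @@(typ)[index], or @@(typ)[index|index_type] when the
    index is counted in units of a different type."""
    return index * SIZE[index_type or typ]


def index_from_offset(typ, offset):
    """Like ##(typ){offset}: divide an offset by the unit size."""
    return offset // SIZE[typ]


def pointer_difference(ptr_a, ptr_b):
    """Like @@(typ){ptr_a, ptr_b}: the offset from ptr_b to ptr_a."""
    return ptr_a - ptr_b


def fetch_byte(memory, ptr, offset=0):
    """Like ptr!(b){offset}: read the byte at ptr plus an offset."""
    return memory[ptr + offset]
```

With these definitions, index_from_offset("w", offset_from_index("w", n))
gives back n for any n, matching the ## identity above, and
fetch_byte(memory, ptr, offset_from_index("b", 2, "w")) reads the byte
two words past ptr, as _bptr!(b)[_2|w] is described as doing.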

Where explicit truncation or rounding is required, the alternate <type>
for the pointer, index or offset may be followed by '.t' (for
truncation) or '.r' (for rounding). Thus

        @@(w)[_bindex|b.r]

converts a number of bytes "_bindex" to a word offset, rounded to a
whole number of words, while

        _wptr@(w.t->vpage)

truncates a word pointer to the preceding virtual page boundary.


Pointer Comparison
------------------
While equality of two pointers can be tested for (as previously) by
using "==", ordinal comparisons between pointers must now employ the
syntax operators

        <@      (less than)
        <=@     (less than or equal)
        >@      (greater than)
        >=@     (greater than or equal)

where in each case, the operator is followed by a <type spec> in
parentheses. For example, to fill a string with characters from the
stack,

    lvars string, _bptr, _blim;
    string@V_BYTES[_0] -> _blim;          ;;; low limit for _bptr
    _blim@(b)[string!V_LENGTH] -> _bptr;  ;;; start _bptr after last byte
    while _bptr >@(b) _blim do
        _int() -> _bptr--!(b) -> _bptr    ;;; insert next popint byte
    endwhile;

Note that these operators must be used only for pointers, and not for
offset or index values. The latter are signed system integers, and must
therefore be compared with "_slt", "_slteq", "_sgr" and "_sgreq". The
above example implemented using offsets rather than pointers would be

    lvars string, _boffs, _blim;
    @@V_BYTES[_0] -> _blim;               ;;; low limit for _boffs
    @@(b)[string!V_LENGTH] -> _boffs;     ;;; start _boffs after last byte
    while _boffs _sgr _blim do
        _int() -> string!(w->b){_boffs};  ;;; insert next popint byte
        --@@(b){_boffs} -> _boffs         ;;; decrement _boffs
    endwhile;


Pointers to POPLOG Data Structures
----------------------------------
Pointers to proper POPLOG data structures are always of type "word" --
this is why the '(w->b)' conversion was necessary in the previous
example.
In fact, it would be permissible to rewrite this example as

    lvars string, _bptr, _boffs, _blim;
    @@V_BYTES[_0] -> _blim;               ;;; low limit for _boffs
    @@(b)[string!V_LENGTH] -> _boffs;     ;;; start _boffs after last byte
    string@(w->b) -> _bptr;               ;;; convert to byte pointer
    while _boffs _sgr _blim do
        _int() -> _bptr!(b){_boffs};      ;;; insert next popint byte
        --@@(b){_boffs} -> _boffs         ;;; decrement _boffs
    endwhile;

that is, converting the word pointer "string" to a byte pointer before
commencing the loop.

However, because of relocation happening during a garbage collection,
care must be exercised when dealing with non-word pointers to POPLOG
structures, or pointers of any type that go into the middle of such a
structure. If a garbage collection can occur in a particular context,
then a structure pointer must always be retained in its 'proper' form in
a pop-type (i.e. non-underscore) variable. For example, since _CHECKUSER
is a macro that generates a check for user stack overflow (which can
lead to a garbage collection), the following two variants for exploding
a string are both incorrect:

    lvars string, _bptr, _blim;
    string@V_BYTES[_0] -> _bptr;
    string@V_BYTES[string!V_LENGTH] -> _blim;
    while _bptr <@(b) _blim do
        _CHECKUSER;         ;;; can cause a gc, invalidating _bptr
        _pint( _bptr!(b)++ -> _bptr );
    endwhile;

    lvars string, _bptr, _boffs, _blim;
    @@V_BYTES[_0] -> _boffs;
    @@V_BYTES[string!V_LENGTH] -> _blim;
    string@(w->b) -> _bptr;
    while _boffs _slt _blim do
        _CHECKUSER;         ;;; can cause a gc, invalidating _bptr
        _pint( _bptr!(b){_boffs} );
        @@(b){_boffs}++ -> _boffs;
    endwhile;

The correct version is

    lvars string, _boffs, _blim;
    @@V_BYTES[_0] -> _boffs;
    @@V_BYTES[string!V_LENGTH] -> _blim;
    while _boffs _slt _blim do
        _CHECKUSER;         ;;; if a gc occurs, string is properly relocated
        _pint( string!(w->b){_boffs} );
        @@(b){_boffs}++ -> _boffs;
    endwhile;


Representation of POPLOG Items
------------------------------
For a given implementation of POPLOG, there will be an encoding scheme
which allows a
word datum to contain either a simple object (a pop integer or decimal),
or a compound one (a word pointer to a POPLOG structure). The procedure
"issimple" is then guaranteed to return true for the former and false
for the latter, and vice-versa for "iscompound". The results of these
procedures are undefined when applied to any other values, e.g. to a
non-word pointer, system integer, etc.

There is also a requirement in various parts of POPLOG to be able to
'mark' a compound item to distinguish it from other such items (e.g.
tracing a structure by marking its key field in the garbage collector),
and the subroutines "_mksimple" and "_mkcompound" are defined for this
purpose. "_mksimple" takes a compound item and encodes it as a simple
object, while "_mkcompound" retrieves the original compound item from
its encoded form. The exact manner of the encoding is implementation
dependent; all that matters is that if "struct" is a compound item, then

        _mksimple(struct) -> _struct;

is simple, and that

        _mkcompound(_struct)

returns the original value of "struct".
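
Since the encoding is left unspecified, the following Python model is
only one hypothetical scheme that satisfies the guarantees just stated;
it is not a description of any actual POPLOG implementation. It assumes
that compound items are word-aligned addresses (low bit clear) and that
any word with its low bit set counts as simple. All names are
assumptions made for the sketch.

```python
# One hypothetical tagging scheme meeting the stated guarantees.
# Assumed: compound items are aligned addresses with a zero low bit.
TAG = 1


def iscompound(item):
    """True for a word whose low bit is clear (an aligned pointer)."""
    return item & TAG == 0


def issimple(item):
    """True for a word whose low bit is set."""
    return item & TAG == TAG


def mksimple(compound):
    """Like _mksimple: disguise a pointer by setting the tag bit."""
    return compound | TAG


def mkcompound(simple):
    """Like _mkcompound: recover the pointer by clearing the tag bit."""
    return simple & ~TAG
```

Under this scheme mkcompound(mksimple(p)) == p for any aligned address
p, which is the one property the text requires of the real subroutines.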