The M-code Instruction Set
==========================

Robert Smith
June 1988


1.0 Introduction
----------------

POPC is a user program that extends the normal POP-11 compiler to form
SYSPOP-11, the systems programming language of POPLOG. These extensions
provide facilities such as the definition of structures and the
manipulation of pointers and machine integers, so that the language
starts to bear a strong similarity to 'C'.

When compiling a system source file, calls are made to the VM interface
in the same manner as for user programs, but rather than optimisation
occurring as the file is processed, an unadulterated (but slightly
modified) code-list of VM instructions for each procedure is passed to
POPC. POPC optimises this code stream and translates it to an
intermediate representation called M-code (this corresponds to a
multiple-operand machine instruction set with generalised addressing
modes). For each M-code there is a procedure responsible for translating
that M-code into the equivalent target machine assembler.

This note describes the instructions and operands of M-code. A later
document will describe other aspects of the system code generation
process such as register declaration and use, target code emission,
inline code expansions, etc. The M-code to target assembler translation
routines are located in $popsrc/syscomp/genproc.p for a given machine.

The following descriptions are not completely machine independent. There
is an implicit assumption that a 32-bit machine is used.
The notes which follow some instructions assume byte-addressability (the
case for all current POPLOG hosts), which allows the following tagging
scheme:

A pointer to an object (all objects word aligned, thus same as object
address)

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     30-bit word index                     |0|0|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP integer

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   30-bit signed integer                   |1|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP decimal

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      30-bit decimal                       |0|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


2.0 The M-code Instruction Set
------------------------------

The M-code instruction set has a total of 47 instructions. I have
divided them into 7 major classes, based in some cases on physical
rather than logical connections. The instruction groups are:

Data Movement Instructions:
    M_MOVE  M_MOVEs  M_MOVEb  M_MOVEbit  M_MOVEss  M_MOVEsb  M_MOVEsbit
    M_UPDs  M_UPDb  M_UPDbit

Arithmetic Instructions:
    M_ADD  M_SUB  M_MULT  M_NEG  M_PADD  M_PSUB  M_PADD_TEST
    M_PSUB_TEST  M_PTR_ADD_OFFS  M_PTR_SUB_OFFS  M_PTR_SUB

Logical and Shift Instructions:
    M_BIS  M_BIC  M_BIM  M_LOGCOM  M_ASH

Compare and Test Instructions:
    M_BIT  M_CMP  M_TEST  M_PCMP  M_PTR_CMP  M_CMPKEY

Branch Instructions:
    M_BRANCH  M_BRANCH_std  M_BRANCH_ON  M_BRANCH_ON_INT

Procedure Call and Stack Frame Instructions:
    M_CALL  M_CALLSUB  M_CALL_WITH_RETURN  M_RETURN  M_CHAIN
    M_CHAINSUB  M_CREATE_SF  M_UNWIND_SF

Miscellaneous:
    M_LABEL  M_ERASE  M_END

The remainder of this section describes the abstract syntax and
operation of each of the instructions. The syntax is abstract because
each instruction really appears as a vector whose first element is (a
pointer to) an M-code translation routine. The descriptions are not
complete because some (e.g. stack frame instructions) use data from
variables rather than arguments.
Addressing modes and test conditions are dealt with in the next section.


2.1 Data Movement Instructions
------------------------------

M_MOVE          Move Word
    Syntax:         M_MOVE src dest
    Description:    Move the contents of -src- to -dest-.
    Operation:      dest:int = src:int

M_MOVEs         Move Unsigned Short
    Syntax:         M_MOVEs src dest
    Description:    Move the unsigned short at -src- to word at -dest-.
    Operation:      dest:<15:0>  = src:short
                    dest:<31:16> = 0

M_MOVEb         Move Unsigned Byte
    Syntax:         M_MOVEb src dest
    Description:    Move the unsigned byte at -src- to word at -dest-.
    Operation:      dest:<7:0>  = src:byte
                    dest:<31:8> = 0

M_MOVEbit       Move Unsigned Bit Field
    Syntax:         M_MOVEbit size pos base dest
    Description:    Move the unsigned bit field in word -base- starting
                    at bit -pos- and extending up for -size- bits to the
                    word at -dest-.
    Operation:      dest:<size-1:0> = base:<pos+size-1:pos>
                    dest:<31:size>  = 0

M_MOVEss        Move Signed Short
    Syntax:         M_MOVEss src dest
    Description:    Move the signed short at -src- to word at -dest-.
    Operation:      dest:<15:0>  = src:short
                    dest:<31:16> = src:<15>

M_MOVEsb        Move Signed Byte
    Syntax:         M_MOVEsb src dest
    Description:    Move the signed byte at -src- to word at -dest-.
    Operation:      dest:<7:0>  = src:byte
                    dest:<31:8> = src:<7>

M_MOVEsbit      Move Signed Bit Field
    Syntax:         M_MOVEsbit size pos base dest
    Description:    Move the signed bit field in word -base- starting at
                    bit -pos- and extending up for -size- bits to the
                    word at -dest-.
    Operation:      dest:<size-1:0> = base:<pos+size-1:pos>
                    dest:<31:size>  = base:<pos+size-1>

M_UPDs          Update Short
    Syntax:         M_UPDs src dest
    Description:    Move the least significant short at -src- to word at
                    -dest-.
    Operation:      dest:<15:0>  = src:<15:0>
                    dest:<31:16> = unaffected

M_UPDb          Update Byte
    Syntax:         M_UPDb src dest
    Description:    Move the least significant byte at -src- to word at
                    -dest-.
    Operation:      dest:<7:0>  = src:<7:0>
                    dest:<31:8> = unaffected

M_UPDbit        Update Bit Field
    Syntax:         M_UPDbit size pos base src
    Description:    Move the -size- least significant bits from the word
                    at -src- to the bit field in -base- starting at bit
                    position -pos- and extending up for -size- bits.
    Operation:      base:<pos+size-1:pos> = src:<size-1:0>
                    base:<pos-1:0>        = unaffected
                    base:<31:size+pos>    = unaffected


2.2 Arithmetic Instructions
---------------------------

M_ADD           Add Machine Integers
    Syntax:         M_ADD src1 src2 dest
    Description:    Add machine integer contents of -src1- to machine
                    integer contents of -src2- and put machine integer
                    result in -dest-.
    Operation:      dest:int = src2:int + src1:int

M_SUB           Subtract Machine Integers
    Syntax:         M_SUB src1 src2 dest
    Description:    Subtract machine integer contents of -src1- from
                    machine integer contents of -src2- and put machine
                    integer result in -dest-.
    Operation:      dest:int = src2:int - src1:int

M_MULT          Multiply Machine Integers
    Syntax:         M_MULT src1 src2 dest
    Description:    Multiply machine integer contents of -src2- by
                    machine integer contents of -src1- and put machine
                    integer result in -dest-.
    Operation:      dest:int = src2:int * src1:int

M_NEG           Negate Machine Integer
    Syntax:         M_NEG src dest
    Description:    Negate machine integer contents of -src- and put
                    machine integer result in -dest-.
    Operation:      dest:int = 0:int - src:int

M_PADD          Add POP Integers
    Syntax:         M_PADD src1 src2 dest
    Description:    Add POP integer contents of -src1- to POP integer
                    contents of -src2- and put POP integer result in
                    -dest-.
    Operation:      dest:pint = src2:pint + src1:pint
    Notes:          With normal POP integer representation and machine
                    arithmetic: dest = src2 + (src1 - 0x3)

M_PSUB          Subtract POP Integers
    Syntax:         M_PSUB src1 src2 dest
    Description:    Subtract POP integer contents of -src1- from POP
                    integer contents of -src2- and put POP integer
                    result in -dest-.
    Operation:      dest:pint = src2:pint - src1:pint
    Notes:          With normal POP integer representation and machine
                    arithmetic: dest = src2 - (src1 - 0x3)

M_PADD_TEST     Add POP Integers With Test
    Syntax:         M_PADD_TEST src1 src2 cond label
    Description:    Add POP integer contents of -src1- to POP integer
                    contents of -src2- and push the POP integer result
                    on the stack. If the -cond- is true then branch to
                    the -label-, else continue.
    Operation:      push (src2:pint + src1:pint) on user stack
                    if cond then PC = label
    Notes:          Calculation as for M_PADD. In practice -cond- is
                    always an overflow test.

M_PSUB_TEST     Subtract POP Integers With Test
    Syntax:         M_PSUB_TEST src1 src2 cond label
    Description:    Subtract POP integer contents of -src1- from POP
                    integer contents of -src2- and push the POP integer
                    result on the stack. If the -cond- is true then
                    branch to the -label-, else continue.
    Operation:      push (src2:pint - src1:pint) on user stack
                    if cond then PC = label
    Notes:          Calculation as for M_PSUB. In practice -cond- is
                    always an overflow test.

M_PTR_ADD_OFFS  Add Pointer Offset
    Syntax:         M_PTR_ADD_OFFS type off base dest
    Description:    Add offset -off- to pointer in -base- to form
                    pointer at -dest-. Pointers and offsets are of type
                    -type-.
    Operation:      dest:ptr = base:ptr + off:offs
    Notes:          As machine arithmetic for byte-addressable machines.
                    -type- is irrelevant.

M_PTR_SUB_OFFS  Subtract Pointer Offset
    Syntax:         M_PTR_SUB_OFFS type off base dest
    Description:    Subtract offset -off- from pointer in -base- to form
                    pointer at -dest-. Pointers and offsets are of type
                    -type-.
    Operation:      dest:ptr = base:ptr - off:offs
    Notes:          As machine arithmetic for byte-addressable machines.
                    -type- is irrelevant.

M_PTR_SUB       Subtract Pointers
    Syntax:         M_PTR_SUB type ptr1 ptr2 dest
    Description:    Subtract pointer -ptr1- from pointer -ptr2- to form
                    offset in -dest-. Pointers and offsets are of type
                    -type-.
    Operation:      dest:offs = ptr2:ptr - ptr1:ptr
    Notes:          As machine arithmetic for byte-addressable machines.
                    -type- is irrelevant.
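
The tagged arithmetic given in the notes on M_PADD and M_PSUB can be
checked with a small model. The Python sketch below is not part of POPC.
It assumes the 32-bit representation described in the introduction (a
POP integer is a 30-bit signed value with tag bits 11), models words as
unsigned 32-bit values, and uses helper names (make_pint, pint_value,
m_padd, m_psub, m_padd_test) that are invented here for illustration.

```python
MASK32 = 0xFFFFFFFF      # words are modelled as unsigned 32-bit values
PINT_TAG = 0x3           # low two bits of a POP integer are 1,1


def to_signed32(word):
    """Interpret a 32-bit word as a signed machine integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word


def make_pint(n):
    """Tag the Python integer n as a POP integer word (assumed layout)."""
    return ((n << 2) | PINT_TAG) & MASK32


def pint_value(word):
    """Recover the signed value held in a POP integer word."""
    return to_signed32(word) >> 2


def m_padd(src1, src2):
    """dest = src2 + (src1 - 0x3): strip one tag, then add."""
    return (src2 + (src1 - PINT_TAG)) & MASK32


def m_psub(src1, src2):
    """dest = src2 - (src1 - 0x3): strip one tag, then subtract."""
    return (src2 - (src1 - PINT_TAG)) & MASK32


def m_padd_test(src1, src2):
    """Model of M_PADD_TEST: return (result, overflow).

    The real instruction pushes the result on the user stack and
    branches if the condition holds; here the overflow condition is
    simply returned as a boolean.
    """
    exact = to_signed32(src2) + (to_signed32(src1) - PINT_TAG)
    overflow = not (-(1 << 31) <= exact < (1 << 31))
    return exact & MASK32, overflow
```

Because only one operand has its tag removed, the result carries exactly
one tag and needs no further adjustment. For example,
pint_value(m_psub(make_pint(2), make_pint(40))) gives 38, following the
convention that -src1- is subtracted from -src2-.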


2.3 Logical and Shift Instructions
----------------------------------

M_BIS           Bit Set
    Syntax:         M_BIS src1 src2 dest
    Description:    Set bits in -src2- that are set in -src1- and put
                    the result in -dest-.
    Operation:      dest:int = src2:int || src1:int

M_BIC           Bit Clear
    Syntax:         M_BIC src1 src2 dest
    Description:    Clear bits in -src2- that are set in -src1- and put
                    the result in -dest-.
    Operation:      dest:int = src2:int && ~~ src1:int

M_BIM           Bit Mask
    Syntax:         M_BIM src1 src2 dest
    Description:    Clear bits in -src2- that are clear in -src1- and
                    put the result in -dest-.
    Operation:      dest:int = src2:int && src1:int

M_LOGCOM        Complement
    Syntax:         M_LOGCOM src dest
    Description:    Put complement of -src- in -dest-.
    Operation:      dest:int = ~~ src:int

M_ASH           Shift Arithmetic
    Syntax:         M_ASH count src dest
    Description:    Perform arithmetic shift of -count- bits on -src-
                    and put result in -dest-. A positive -count- gives a
                    shift to the left. Zeroes are shifted in from the
                    right, and the sign bit from the left.
    Operation:      dest:int = src:int << count:int (arithmetic shift)


2.4 Compare and Test Instructions
---------------------------------

M_BIT           Bit Test
    Syntax:         M_BIT mask src cond label
    Description:    If the logical AND of -mask- and -src- is such that
                    -cond- is true then jump to the -label-, else
                    continue.
    Operation:      src:int && mask:int (sets condition codes)
                    if cond then PC = label

M_TEST          Test Machine Integer
    Syntax:         M_TEST src cond label
    Description:    If -src- compared with zero gives -cond- true then
                    jump to the -label-, else continue.
    Operation:      src:int - 0:int (sets condition codes)
                    if cond then PC = label

M_CMP           Compare Machine Integers
    Syntax:         M_CMP src1 src2 cond label
    Description:    Compare machine integers -src1- and -src2-. If
                    -cond- is true then jump to -label-, else continue.
    Operation:      src2:int - src1:int (sets condition codes)
                    if cond then PC = label

M_PCMP          Compare POP Integers
    Syntax:         M_PCMP src1 src2 cond label
    Description:    Compare POP integers -src1- and -src2-.
                    If -cond- is true then jump to -label-, else
                    continue.
    Operation:      src2:pint - src1:pint (sets condition codes)
                    if cond then PC = label
    Notes:          As machine integer compare for current
                    implementations.

M_PTR_CMP       Compare Pointers
    Syntax:         M_PTR_CMP type src1 src2 cond label
    Description:    Compare pointers -src1- and -src2-. If -cond- is
                    true then jump to -label-, else continue. The
                    pointers are of type -type-.
    Operation:      src2:ptr - src1:ptr (sets condition codes)
                    if cond then PC = label
    Notes:          As machine integer compare for current
                    implementations.

M_CMPKEY        Compare Key
    Syntax:         M_CMPKEY key src cond label
    Description:    Compare the key -key- with the key field of the
                    object -src-. If -cond- is true then jump to
                    -label-, else continue. If the object is simple then
                    the key will not match.
    Operation:      if issimple(src) then
                        set condition codes 'not equal'
                    else
                        key(src):key - key:key (set condition codes)
                    endif
                    if cond then PC = label
    Notes:          Only EQ and NEQ conditions are sensible.


2.5 Branch Instructions
-----------------------

M_BRANCH        Branch
    Syntax:         M_BRANCH label
    Description:    Transfer control to code at -label-.
    Operation:      PC = label

M_BRANCH_std    Standard Branch
    Syntax:         M_BRANCH_std label
    Description:    Transfer control to code at -label-. Equivalent to
                    M_BRANCH at the M-code level, but guaranteed to
                    generate target branch code of fixed size. Used in
                    procedure code to standardize the separation between
                    two code entry points.
    Operation:      PC = label

M_BRANCH_ON     Branch On POP Integer
    Syntax:         M_BRANCH_ON switch label_list else_label
    Description:    Transfer control to one of the labels in
                    -label_list- given by the value of the POP integer
                    -switch-, where a value of 1 implies the first
                    label. If the -switch- is out of range then jump to
                    -else_label- (if false then continue).
    Operation:      if switch >= 1 and switch <= length(label_list) then
                        goto label_list(switch)
                    elseif else_label then
                        goto else_label
                    endif

M_BRANCH_ON_INT  Branch On Machine Integer
    Syntax:         M_BRANCH_ON_INT switch label_list else_label
    Description:    Transfer control to one of the labels in
                    -label_list- given by the value of the machine
                    integer -switch-, where a value of 1 implies the
                    first label. If the -switch- is out of range then
                    jump to -else_label- (if false then continue).
    Operation:      if switch >= 1 and switch <= length(label_list) then
                        goto label_list(switch)
                    elseif else_label then
                        goto else_label
                    endif


2.6 Procedure Call and Stack Frame Instructions
-----------------------------------------------

M_CALL          Call POP Procedure
    Syntax:         M_CALL call
    Description:    Call execute address of POP procedure -call-.
    Operation:      push return PC on call stack
                    PC = call

M_CALLSUB       Call Assembler Routine
    Syntax:         M_CALLSUB call
    Description:    Call assembler routine with entry address -call-.
    Operation:      push return PC on call stack
                    PC = call

M_CALL_WITH_RETURN  Call POP Procedure With Given Return Address
    Syntax:         M_CALL_WITH_RETURN call return
    Description:    Push supplied return address -return- and chain
                    execute address of POP procedure -call-.
    Operation:      push return address on call stack
                    PC = call

M_CALLER_RETURN  Access/Update Caller's Return Address
    Syntax:         M_CALLER_RETURN update operand
    Description:    If -update- is false, move the return address into
                    the caller of the current procedure to destination
                    -operand-. If -update- is true, set the caller's
                    return address to the value from source -operand-.
    Operation:      operand = caller's return address  (update false)
                    caller's return address = operand  (update true)
    Notes:          Optional: if not defined, caller's return address is
                    assumed to be in an ordinary memory location in the
                    current stack frame. (Currently used only in the
                    SPARC implementation, where caller's return address
                    is held in a register, and is offset by -8 from the
                    actual return address.)

M_RETURN        Return
    Syntax:         M_RETURN
    Description:    Return from POP procedure.
    Operation:      Pop PC from call stack

M_CHAIN         Chain POP Procedure
    Syntax:         M_CHAIN chain
    Description:    Chain execute address of POP procedure -chain-.
    Operation:      PC = chain

M_CHAINSUB      Chain Assembler Routine
    Syntax:         M_CHAINSUB chain
    Description:    Chain assembler routine with entry address -chain-.
    Operation:      PC = chain

M_CREATE_SF     Create Stack Frame
    Syntax:         M_CREATE_SF
    Description:    Create stack frame for POP procedure on call stack.
    Operation:      save machine registers
                    save dynamic locals
                    allocate and zero locals on stack
                    allocate space for non pop variables
                    save owner pointer
    Notes:          Instruction uses information from variables rather
                    than arguments.

M_UNWIND_SF     Unwind Stack Frame
    Syntax:         M_UNWIND_SF
    Description:    Unwind stack frame
    Operation:      remove owner pointer and stack variables
                    restore dynamic locals
                    restore machine registers
    Notes:          Instruction uses information from variables rather
                    than arguments.


2.7 Miscellaneous Instructions
------------------------------

M_LABEL         Label
    Syntax:         M_LABEL label
    Description:    Define label -label- of next M-code instruction.
    Operation:      Define assembler label

M_ERASE         Erase
    Syntax:         M_ERASE dest
    Description:    Erase one element from stack specified by
                    auto-indexed operand -dest-.
    Operation:      If -dest- is auto-indexed operand then modify index
                    by normal offset.

M_END           End
    Syntax:         M_END
    Description:    End of M-code instruction stream.
    Operation:      None


3.0 M-code Operand Addressing Modes
-----------------------------------

The representation of addressing modes in M-code instructions has
already been described by Simon Nichols. The following table is adapted
from his document.

M-code operand type    VAX Addressing mode       Example translation
-------------------    -------------------       -------------------
Word                   Register                  "R1"  -->  R1
Integer                Immediate                 4  -->  $4
Ref                    Immediate                 $foo
String                 Absolute                  'label'  -->  label
Vector
  {^reg 0}             Register Deferred         {r1 0}  -->  (r1)
  {^reg ^disp}         Displacement              {r1 4}  -->  4(r1)
  {^reg ^false}        Autodecrement             {r1 false}  -->  -(r1)
  {^reg ^true}         Autoincrement             {r1 true}  -->  (r1)+
  {^reg ^disp ^reg'}   Based Indexed with        {a1 4 d2}  -->  a1@(4,d2:L)
                       Displacement (68K)
Pair
  [^operand'|^disp]    Autoincrement deferred    [{r1 0}|0]  -->  @0(r1)
                       or displacement deferred
                       (effective address is
                       value of operand plus
                       displacement)


3.1 Conditions
--------------

The conditions referred to in the compare instructions will be one of
the following:

    EQ      equal to
    NEQ     not equal to
    LT      signed less than
    LEQ     signed less than or equal to
    GT      signed greater than
    GEQ     signed greater than or equal to
    ULT     unsigned less than
    ULEQ    unsigned less than or equal to
    UGT     unsigned greater than
    UGEQ    unsigned greater than or equal to
    NEG     negative
    POS     positive
    OVF     overflow
    NOVF    not overflow


3.2 Pointer Types
-----------------

The pointer types that are generated for some M-codes are:

    1       byte
    2       short
    3       word

As we have seen, for byte-addressable machines these can be ignored.


Appendix 1.0 - Types
--------------------

The operation descriptions of the M-codes make rather cavalier use of
types. They are listed here with the intention that one day they may
become more formal:

    int     32-bit integer
    short   16-bit integer
    byte    8-bit integer
    <n:m>   Bit field from bit n down to bit m in 32-bit integer
    pint    POP integer
    ptr     Pointer
    offs    Offset
    key     Key
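
As an illustration of the bit-field type, the three bit-field
instructions of section 2.1 (M_MOVEbit, M_MOVEsbit and M_UPDbit) can be
modelled on 32-bit words as follows. This Python sketch is based only on
the prose descriptions of those instructions; the function names and the
field_mask helper are invented here, -pos- is taken to count from the
least significant bit, and the destination is returned as a value rather
than written to an operand.

```python
MASK32 = 0xFFFFFFFF  # words are modelled as unsigned 32-bit values


def field_mask(size):
    """Mask with the low -size- bits set (hypothetical helper)."""
    return (1 << size) - 1


def m_movebit(size, pos, base):
    """Unsigned field: extract -size- bits at -pos-, zero-extend."""
    return (base >> pos) & field_mask(size)


def m_movesbit(size, pos, base):
    """Signed field: extract -size- bits at -pos-, sign-extend to 32."""
    field = (base >> pos) & field_mask(size)
    sign = 1 << (size - 1)                 # top bit of the field
    return ((field ^ sign) - sign) & MASK32


def m_updbit(size, pos, base, src):
    """Store the low -size- bits of src into the field of base.

    Bits of base outside the field are left unaffected.
    """
    mask = field_mask(size) << pos
    return ((base & ~mask) | ((src << pos) & mask)) & MASK32
```

With this model a 4-bit field at bit 8 of the word 0x00000F00 reads as
0xF through m_movebit but as 0xFFFFFFFF (that is, -1) through
m_movesbit, which is the only difference between the unsigned and signed
moves.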