In computer programming, a P-code machine (portable code machine) is a virtual machine designed to execute ''P-code'', the assembly language or machine code of a hypothetical central processing unit (CPU). The term ''P-code machine'' is applied generically to all such machines (such as the Java virtual machine (JVM) and MATLAB pre-compiled code), as well as to specific implementations using those machines. One of the most notable uses of P-code machines is the P-Machine of the Pascal-P system. The developers of the UCSD Pascal implementation within this system construed the ''P'' in ''P-code'' to mean ''pseudo'' more often than ''portable''; they adopted a unique label for ''pseudo-code'', meaning instructions for a pseudo-machine.
Although the concept was first implemented circa 1966 as O-code for the Basic Combined Programming Language (BCPL) and P code for the language Euler, the ''term'' P-code first appeared in the early 1970s. Two early compilers generating P-code were the Pascal-P compiler in 1973, by Kesav V. Nori, Urs Ammann, Kathleen Jensen, Hans-Heinrich Nägeli, and Christian Jacobi, and the Pascal-S compiler in 1975, by Niklaus Wirth.
Programs that have been translated to P-code can either be interpreted by a software program that emulates the behaviour of the hypothetical CPU, or translated into the machine code of the CPU on which the program is to run and then executed. If there is sufficient commercial interest, a hardware implementation of the CPU specification may be built (e.g., the Pascal MicroEngine or a version of a Java processor).
P-code versus machine code
While a typical compiler model is aimed at translating program code into machine code, the idea of a P-code machine follows a two-stage approach involving translation into P-code and execution by interpreting or just-in-time compilation (JIT) through the P-code machine.
This separation makes it possible to detach the development of a P-code interpreter from the underlying machine code compiler, which has to consider machine-dependent behaviour in generating its bytecode. This way a P-code interpreter can also be implemented more quickly, and the ability to interpret the code at runtime allows for additional run-time checks which might not be similarly available in native code. Further, as P-code is based on an ideal virtual machine, a P-code program can often be smaller than the same program translated to machine code. Conversely, the two-step interpretation of a P-code-based program leads to a slower execution speed, though this can sometimes be addressed with just-in-time compilation, and the simpler structure of P-code makes it easier to reverse-engineer than native code.
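The two-stage approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mnemonics `ldc`, `add`, `sub` and `mul` and the function names `compile_expr` and `interpret` are hypothetical and do not belong to any real P-code dialect. The first stage translates an arithmetic expression into stack code; the second stage executes that code on a software stack machine.

```python
import ast

# Hypothetical mnemonics for an illustrative stack-based P-code.
BINARY_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def compile_expr(source):
    """Stage 1: translate an arithmetic expression into P-code tuples."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("ldc", node.value))  # push a constant
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)   # operands are emitted first (postfix order)
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)],))  # then the operator
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


def interpret(code):
    """Stage 2: execute the P-code on a software stack machine."""
    stack = []
    for insn in code:
        if insn[0] == "ldc":
            stack.append(insn[1])
        else:
            right, left = stack.pop(), stack.pop()
            if insn[0] == "add":
                stack.append(left + right)
            elif insn[0] == "sub":
                stack.append(left - right)
            else:  # "mul"
                stack.append(left * right)
    return stack.pop()


pcode = compile_expr("2 + 3 * 4")
result = interpret(pcode)
```

Because `compile_expr` knows nothing about the host CPU, only `interpret` would need to be rewritten to move such a system to another machine, which is the portability argument made above.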
Implementations of P-code
In the early 1980s, at least two operating systems achieved machine independence through extensive use of P-code. The Business Operating System (BOS) was a cross-platform operating system designed to run P-code programs exclusively. The UCSD p-System, developed at the University of California, San Diego, was a self-compiling and self-hosting operating system based on P-code optimized for generation by the Pascal language.
In the 1990s, translation into p-code became a popular strategy for language implementations, such as Python, Visual Basic (Microsoft P-code) and Java (Java bytecode).
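As a small illustration of the Python case, the standard-library `dis` module can list the bytecode that the CPython implementation generates for a function. The function `add` below is a made-up example, and the exact opcode names differ between CPython versions, so no particular listing should be expected.

```python
import dis


def add(a, b):
    return a + b


# Collect the opcode names CPython generated for the function body.
opnames = [insn.opname for insn in dis.get_instructions(add)]
print(opnames)
```

The printed names typically include instructions that load the two arguments, one that performs the binary operation, and one that returns the value on top of the stack.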
The language Go uses a generic, portable assembly as a form of p-code, implemented by Ken Thompson as an extension of the work on Plan 9 from Bell Labs. Unlike Common Language Runtime (CLR) bytecode or JVM bytecode, there is no stable specification and the Go build tools do not emit a bytecode format to be used at a later time. The Go assembler uses the generic assembly language as an intermediate representation and the Go executables are machine-specific statically linked binaries.
UCSD P-Machine
Architecture
Like many other P-code machines, the UCSD P-Machine is a stack machine, which means that most instructions take their operands from a stack, and place results back on the stack. Thus, the add instruction replaces the two topmost elements of the stack with their sum. A few instructions take an immediate argument. Like Pascal, the P-code is strongly typed, supporting Boolean (b), character (c), integer (i), real (r), set (s), and pointer (a) data types natively.
Some simple instructions:
Insn.   Stack    Stack    Description
        before   after
adi     i1 i2    i1+i2    add two integers
adr     r1 r2    r1+r2    add two reals
inn     i1 s1    b1       set membership; b1 = whether i1 is a member of s1
ldi     i1 i1    i1       load integer constant
mov     a1 a2    a2       move
not     b1 b1    -b1      Boolean negation
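The stack discipline and the typed mnemonics can be sketched as follows. This is a minimal model, not the UCSD implementation: the function name `execute` is hypothetical, the stack is a Python list with its top at the end, Python's `int`, `float`, `bool` and `frozenset` stand in for the i, r, b and s types, and it is assumed that the rightmost operand in the "Stack before" column is the one on top.

```python
def execute(stack, mnemonic):
    """Apply one operand-free instruction to `stack` (top of stack is the last item)."""
    if mnemonic in ("adi", "adr"):
        right = stack.pop()
        left = stack.pop()
        expected = int if mnemonic == "adi" else float
        # Strong typing: adi accepts only integers, adr only reals.
        if type(left) is not expected or type(right) is not expected:
            raise TypeError(f"{mnemonic} expects two {expected.__name__} operands")
        stack.append(left + right)
    elif mnemonic == "inn":
        members = stack.pop()   # s1, the set (assumed to be on top)
        element = stack.pop()   # i1, the integer being tested
        stack.append(element in members)
    elif mnemonic == "not":
        stack.append(not stack.pop())
    else:
        raise ValueError(f"unknown instruction {mnemonic!r}")


stack = [2, 3]
execute(stack, "adi")           # the two topmost integers are replaced by their sum
```

Each instruction pops its operands and pushes exactly one result, so after the call the list holds a single element.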
Environment
Similar to a real target CPU, the P-System has only one stack shared by procedure stack frames (providing return address, etc.) and the arguments to local instructions. Three of the machine's registers point into the stack (which grows upwards):
* SP points to the top of the stack (the stack pointer).
* MP marks the beginning of the active stack frame (the mark pointer).
* EP points to the highest stack location used in the current procedure (the extreme pointer).
Also present is a constant area, and, below that, the heap growing down towards the stack. The NP (the new pointer) register points to the top (lowest used address) of the heap. When EP gets greater than NP, the machine's memory is exhausted.
The fifth register, PC, points at the current instruction in the code area.
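The relationship between EP and NP can be made concrete with a short sketch. The class `Registers` and the function `allocate_heap` are hypothetical names introduced for illustration, and the cell numbers are arbitrary; only the exhaustion condition (EP greater than NP) is taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class Registers:
    """The five registers described above, as plain cell indices."""
    pc: int = 0   # current instruction in the code area
    sp: int = 0   # top of the stack
    mp: int = 0   # start of the active stack frame
    ep: int = 0   # highest stack cell the current procedure may use
    np: int = 0   # lowest used heap cell (the heap grows downward)


def allocate_heap(regs, cells):
    """Hypothetical heap allocation: move NP down unless EP would exceed it."""
    new_np = regs.np - cells
    if regs.ep > new_np:
        raise MemoryError("stack and heap would collide")
    regs.np = new_np
    return new_np


regs = Registers(ep=10, np=50)
first = allocate_heap(regs, 30)      # NP moves from 50 down to 20
try:
    allocate_heap(regs, 15)          # NP would become 5, below EP = 10
    collided = False
except MemoryError:
    collided = True
```

Since the stack grows upward and the heap grows downward in the same memory, a single comparison of the two registers is enough to detect exhaustion.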
Calling conventions
Stack frames look like this:
EP ->
      local stack
SP -> ...
      locals
      ...
      parameters
      ...
      return address (previous PC)
      previous EP
      dynamic link (previous MP)
      static link (MP of surrounding procedure)
MP -> function return value
The procedure calling sequence works as follows: The call is introduced with
    mst n
where n specifies the difference in nesting levels (remember that Pascal supports nested procedures). This instruction will ''mark'' the stack, i.e. reserve the first five cells of the above stack frame, and initialize previous EP, dynamic, and static link. The caller then computes and pushes any parameters for the procedure, and then issues
    cup n, p
to call a user procedure (n being the number of parameters, p the procedure's address). This will save the PC in the return address cell, and set the procedure's address as the new PC.
User procedures begin with the two instructions
    ent 1, i
    ent 2, j
The first sets SP to MP + i, the second sets EP to SP + j. So i essentially specifies the space reserved for locals (plus the number of parameters plus 5), and j gives the number of entries needed locally for the stack. Memory exhaustion is checked at this point.
Returning to the caller is accomplished via
    retC
with C giving the return type (i, r, c, b, a as above, and p for no return value). The return value has to be stored in the appropriate cell previously. On all types except p, returning will leave this value on the stack.
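The mark, call, enter and return steps can be traced with a toy model. This is a sketch under several assumptions, not the original P-System code: the class `PMachine`, its method names and the helper `base` are hypothetical, SP is taken to index the topmost occupied cell, each parameter occupies one cell, and the frame offsets follow the stack frame picture above (return value at MP, static link at MP + 1, dynamic link at MP + 2, previous EP at MP + 3, return address at MP + 4).

```python
class PMachine:
    """Toy model of the frame layout described above (illustrative only)."""
    FRAME_HEADER = 5  # return value, static link, dynamic link, previous EP, return address

    def __init__(self, size=64):
        self.store = [0] * size
        self.pc = 0
        self.sp = -1      # index of the topmost occupied cell (assumption)
        self.mp = 0
        self.ep = 0
        self.np = size    # empty heap: NP sits just past the end of the store

    def base(self, levels):
        """Follow `levels` static links, starting from the active frame."""
        frame = self.mp
        for _ in range(levels):
            frame = self.store[frame + 1]
        return frame

    def push(self, value):
        self.sp += 1
        self.store[self.sp] = value

    def mst(self, n):
        """Mark the stack: reserve the five header cells of a new frame."""
        frame = self.sp + 1
        self.store[frame + 1] = self.base(n)   # static link
        self.store[frame + 2] = self.mp        # dynamic link
        self.store[frame + 3] = self.ep        # previous EP
        self.sp += self.FRAME_HEADER

    def cup(self, n, p):
        """Call the user procedure at address p, with n parameters already pushed."""
        self.mp = self.sp - (n + 4)
        self.store[self.mp + 4] = self.pc      # return address
        self.pc = p

    def ent(self, which, amount):
        if which == 1:
            self.sp = self.mp + amount
        else:
            self.ep = self.sp + amount
            if self.ep > self.np:
                raise MemoryError("stack would collide with the heap")

    def ret(self, kind):
        """Return to the caller; keep the result cell unless kind is 'p'."""
        self.sp = self.mp - 1 if kind == "p" else self.mp
        self.pc = self.store[self.mp + 4]
        self.ep = self.store[self.mp + 3]
        self.mp = self.store[self.mp + 2]


m = PMachine()
m.sp, m.mp, m.ep, m.pc = 9, 0, 12, 10   # pretend a caller frame occupies cells 0..9
m.mst(0)
m.push(7)
m.push(8)
m.cup(2, 40)
callee_frame = m.mp
m.ent(1, 8)                  # 1 local + 2 parameters + 5 header cells
m.ent(2, 3)
m.store[m.mp] = 7 + 8        # the callee stores its result in the return value cell
m.ret("i")
result = m.store[m.sp]
```

After the return, PC, MP and EP hold the caller's values again and the result sits on top of the caller's stack, which is the behaviour described for all return types except p.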
Instead of calling a user procedure (cup), standard procedure q can be called with
    csp q
These standard procedures are Pascal procedures like readln() (csp rln), sin() (csp sin), etc. Peculiarly, eof() is a p-Code instruction instead.
Example machine
Niklaus Wirth specified a simple p-code machine in the 1976 book ''Algorithms + Data Structures = Programs''. The machine had 3 registers: a program counter ''p'', a base register ''b'' and a top-of-stack register ''t''. There were 8 instructions:
# lit 0, ''a'': load constant ''a''
# opr 0, ''a'': execute operation ''a'' (13 operations: RETURN, 5 mathematical functions, and 7 comparison functions)
# lod ''l'', ''a'': load variable ''l'', ''a''
# sto ''l'', ''a'': store variable ''l'', ''a''
# cal ''l'', ''a'': call procedure ''a'' at level ''l''
# int 0, ''a'': increment t-register by ''a''
# jmp 0, ''a'': jump to ''a''
# jpc 0, ''a'': jump conditional to ''a''
This is the code for the machine, written in Pascal:
    const
      amax = 2047;      {maximum address}
      levmax = 3;       {maximum depth of block nesting}
      cxmax = 200;      {size of code array}

    type
      fct = (lit, opr, lod, sto, cal, int, jmp, jpc);
      instruction = packed record
        f: fct;
        l: 0..levmax;
        a: 0..amax;
      end;

    var
      code: array [0..cxmax] of instruction;

    procedure interpret;
      const stacksize = 500;
      var
        p, b, t: integer;   {program, base and top-of-stack registers}
        i: instruction;     {instruction register}
        s: array [1..stacksize] of integer;   {data store}

      function base(l: integer): integer;
        var b1: integer;
      begin
        b1 := b;   {find base l levels down}
        while l > 0 do begin
          b1 := s[b1];
          l := l - 1
        end;
        base := b1
      end {base};

    begin
      writeln(' start pl/0');
      t := 0; b := 1; p := 0;
      s[1] := 0; s[2] := 0; s[3] := 0;
      repeat
        i := code[p]; p := p + 1;
        with i do
          case f of
            lit: begin t := t + 1; s[t] := a end;
            opr:
              case a of   {operator}
                0:
                  begin   {return}
                    t := b - 1; p := s[t + 3]; b := s[t + 2]
                  end;
                1: s[t] := -s[t];
                2: begin t := t - 1; s[t] := s[t] + s[t + 1] end;
                3: begin t := t - 1; s[t] := s[t] - s[t + 1] end;
                4: begin t := t - 1; s[t] := s[t] * s[t + 1] end;
                5: begin t := t - 1; s[t] := s[t] div s[t + 1] end;
                6: s[t] := ord(odd(s[t]));
                8: begin t := t - 1; s[t] := ord(s[t] = s[t + 1]) end;
                9: begin t := t - 1; s[t] := ord(s[t] <> s[t + 1]) end;
                10: begin t := t - 1; s[t] := ord(s[t] < s[t + 1]) end;
                11: begin t := t - 1; s[t] := ord(s[t] >= s[t + 1]) end;
                12: begin t := t - 1; s[t] := ord(s[t] > s[t + 1]) end;
                13: begin t := t - 1; s[t] := ord(s[t] <= s[t + 1]) end;
              end;
            lod: begin t := t + 1; s[t] := s[base(l) + a] end;
            sto: begin s[base(l) + a] := s[t]; writeln(s[t]); t := t - 1 end;
            cal:
              begin   {generate new block mark}
                s[t + 1] := base(l); s[t + 2] := b; s[t + 3] := p;
                b := t + 1; p := a
              end;
            int: t := t + a;
            jmp: p := a;
            jpc: begin if s[t] = 0 then p := a; t := t - 1 end
          end   {with, case}
      until p = 0;
      writeln(' end pl/0');
    end {interpret};
This machine was used to run Wirth's PL/0, a Pascal subset compiler used to teach compiler development.
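To see the machine execute something, the interpreter above can be ported to Python and fed a hand-assembled program. The port below is a sketch rather than Wirth's code: instructions are plain `(f, l, a)` tuples, the values that `sto` prints in the Pascal version are collected in a list instead, and the helper `trunc_div` is introduced because Pascal's `div` truncates toward zero while Python's `//` rounds down. The sample program is invented for illustration.

```python
import operator


def trunc_div(x, y):
    """Pascal's `div` truncates toward zero, unlike Python's floor division."""
    q = abs(x) // abs(y)
    return -q if (x < 0) != (y < 0) else q


ARITHMETIC = {2: operator.add, 3: operator.sub, 4: operator.mul, 5: trunc_div}
COMPARISON = {8: operator.eq, 9: operator.ne, 10: operator.lt,
              11: operator.ge, 12: operator.gt, 13: operator.le}


def interpret(code, stacksize=500):
    """Run a list of (f, l, a) tuples; return the values written by `sto`."""
    s = [0] * (stacksize + 1)   # index 0 is unused, mirroring array [1..stacksize]
    t, b, p = 0, 1, 0
    output = []

    def base(level):
        b1 = b                  # find the frame base `level` levels down
        while level > 0:
            b1 = s[b1]
            level -= 1
        return b1

    while True:
        f, level, a = code[p]
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr" and a == 0:      # return
            t = b - 1
            p = s[t + 3]
            b = s[t + 2]
        elif f == "opr" and a == 1:      # negate
            s[t] = -s[t]
        elif f == "opr" and a == 6:      # odd
            s[t] = s[t] % 2
        elif f == "opr" and a in ARITHMETIC:
            t -= 1
            s[t] = ARITHMETIC[a](s[t], s[t + 1])
        elif f == "opr":
            t -= 1
            s[t] = int(COMPARISON[a](s[t], s[t + 1]))
        elif f == "lod":
            t += 1
            s[t] = s[base(level) + a]
        elif f == "sto":
            s[base(level) + a] = s[t]
            output.append(s[t])          # the Pascal version prints this value
            t -= 1
        elif f == "cal":                 # generate a new block mark
            s[t + 1] = base(level)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":
            if s[t] == 0:
                p = a
            t -= 1
        else:
            raise ValueError(f"unknown instruction {f!r}")
        if p == 0:
            break
    return output


program = [
    ("jmp", 0, 1),   # skip to the main block
    ("int", 0, 4),   # reserve 3 link cells plus 1 variable
    ("lit", 0, 2),
    ("lit", 0, 3),
    ("opr", 0, 2),   # add the two topmost values
    ("sto", 0, 3),   # store the sum in the variable at offset 3
    ("opr", 0, 0),   # return; the saved address 0 ends the run
]
written = interpret(program)
```

The run ends because the return instruction restores the program counter from the initial block mark, which holds 0, and the loop stops when the counter is 0.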
Microsoft P-code
P-code is a name for several of Microsoft's proprietary intermediate languages. They provided an alternate binary format to machine code. At various times, Microsoft has said P-code is an abbreviation for either ''packed code'' or ''pseudo code''.
Microsoft P-code was used in Visual C++ and Visual Basic. Like other P-code implementations, Microsoft P-code enabled a more compact executable at the expense of slower execution.
Other implementations
See also
* Bytecode
* Intermediate representation
* Joel McCormack, designer of the NCR Corporation version of the p-code machine
* Runtime system
* Token threading
* City & Guilds Mnemonic Code