Information on the UCSD Pascal system

Compiled by Charles T. 'Dr. Tom' Turley
9/11/97 
(from reference and HTML sources; see the end of the article for listings)

Subject: Pascal II: Operand Formats

 DISCUSSION 

 When you need to send data (especially complex data formats, such as strings)
 to an assembly routine from a Pascal host program, it can be very useful to be
 familiar with the internal structure of Pascal variables. This article
 describes a few of the more commonly used variable types; for a complete
 description of the more complex variables, including records and arrays, see
 pp. 227-228 of the Apple Pascal Operating System Reference Manual.

 Machine language (assembly) routines are commonly used when (a) speed is
 critical, or (b) the code must access other assembly routines (such as
 PROMs or I/O drivers) that can't be reassembled as part of the program. Also,
 bit manipulations such as right-shift are much easier to do in assembly than in
 Pascal.

 In the UCSD Pascal system, it's fairly easy to create short assembly programs
 which can be linked into a Pascal host program. In some cases, it may be
 sufficient to merely call the assembly routine; most routines require that data
 be passed to them, though. Data is passed to or from routines by means of a
 "parameter", a temporary variable created by Pascal specifically for that
 purpose. The term "Var parameter" implies that the address of the actual
 variable is passed to the routine as a parameter instead of its value.

 Certain types of variables may be passed by value, but any variable may be
 passed by address by simply declaring it to be a Var parameter. Pascal does not
 allow parameters of variable length (with the exception of certain sets and
 long integers) to be passed on the CPU stack, since doing so could end up
 filling the stack to capacity and thereby crashing the operating system. These
 parameters, therefore, are automatically treated as if they were defined as Var
 parameters.
 A good explanation of the various methods of passing parameters may be found in
 Peter Grogono's book, "Programming in Pascal".

 Before delving into the details, let's define some terms and conventions
 which we'll use later on:

 Bit    = a binary digit (0 or 1). A bit is the smallest unit of
          information which can be stored in a computer.
 Nybble = 4 bits (half a byte). A hexadecimal digit is one nybble
          (pronounced "nibble").
 Byte   = 8 bits (2 nybbles). This is the unit of storage which the
          6502 processor uses.
 Word   = 2 bytes (16 bits). A word is the unit of information which
          Pascal uses.
 LSB    = least significant bit
 MSB    = most significant bit

 decimal      65535                               0
 hexadecimal  $FFFF <--------memory---------> $0000   addresses
              MSB                               LSB

 This diagram of memory structure is useful for understanding the format of
 variables: although we're used to writing numbers from left to right, Pascal
 reads data from memory FROM RIGHT TO LEFT, starting at the least significant
 byte.
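
 As a minimal sketch of this byte order (written in Python rather than Pascal,
 purely for illustration; the function names word_to_bytes and bytes_to_word
 are hypothetical), the following splits a 16-bit word into its two bytes in
 the order the 6502 stores them, low byte at the lower address, and then
 rebuilds the word starting from the least significant byte:

```python
def word_to_bytes(word):
    """Split a 16-bit word into (low_byte, high_byte), its order in memory."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    low_byte = word & 0xFF           # stored at the lower address
    high_byte = (word >> 8) & 0xFF   # stored at the next address
    return low_byte, high_byte


def bytes_to_word(low_byte, high_byte):
    """Rebuild the word, starting from the least significant byte."""
    return low_byte | (high_byte << 8)


low, high = word_to_bytes(0x1234)
print(f"${low:02X} ${high:02X}")            # low byte first: $34 $12
print(f"${bytes_to_word(low, high):04X}")   # $1234
```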


 Integers:

 Integers in UCSD Pascal are whole numbers between -32768 and +32767, inclusive.
 They are stored in one word (2 bytes). Negative integers are represented in
 "two's complement," which means that, read as unsigned numbers, they appear to
 have positive values greater than 32767; the negative integer is arrived at by
 subtracting 2 ^ 16 (65536) from this positive value. Similarly, 16-bit values
 greater than 32767 appear as the complementary negative numbers (cf. Integer
 BASIC). The sign bit (MSB) is 0 if positive, 1 if negative.

 <---------byte---------> <---------byte--------->
  15  14 . . . . . . . 8   7 . . . . . . . . . 0    <== 16 bits
 Sign       Integer Value

 Example: the number 3 is represented in binary as:

 MSB 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 LSB

 However, -3 is represented as:

 MSB 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 LSB

 which also reads as 65533 (or 65536-3)!

 Integers may be passed by value or as Var parameters.
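
 The two's complement rule above can be sketched in a few lines. This is an
 illustration in Python, not Apple Pascal code, and the names to_signed16 and
 to_raw16 are hypothetical helpers introduced only for this example:

```python
def to_signed16(raw):
    """Interpret a 16-bit pattern (0..65535) as a two's-complement integer."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("raw value must fit in 16 bits")
    # If the sign bit (bit 15) is set, subtract 2^16 to get the negative value.
    return raw - 0x10000 if raw & 0x8000 else raw


def to_raw16(value):
    """Return the 16-bit pattern that stores a Pascal integer."""
    if not -32768 <= value <= 32767:
        raise ValueError("value is outside the Pascal integer range")
    return value & 0xFFFF


print(format(to_raw16(3), "016b"))    # 0000000000000011
print(format(to_raw16(-3), "016b"))   # 1111111111111101
print(to_raw16(-3))                   # 65533
print(to_signed16(65533))             # -3
```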

 Reals:

 Real numbers, in UCSD Pascal, are floating point numbers between +/-1.17550E-38
 and +/-3.40282E+38, inclusive. Real numbers take up four bytes (2 words) of
 storage. Their binary representations are similar to the proposed IEEE standard
 for floating point numbers:

  31   30 . . . . . . 23   22 . . . . . . . . . 0    <== 32 bits
 Sign      Exponent             Mantissa

 "Mantissa" is the name given to the significant digits of a number written in
 scientific (exponential) notation. The "exponent" indicates the power to which
 the base is raised before it is multiplied by the mantissa. The number
 3 x 10^2, for instance, has a mantissa of 3 and an exponent of 2 in base 10
 (decimal). In a Pascal real, the exponent is a power of 2 (2^n).

 The sign bit refers to the sign of the mantissa; it's 0 if positive, 1 if
 negative. The exponent is "offset" by 127; that is, a value of 127 in the
 exponent field corresponds to an exponent of 0. Similarly, if the value is 1,
 the exponent is -126, and if the field is 254, the exponent is +127. A value
 of 0 indicates that the real number is 0.

 The mantissa of the real number is stored in normalized format in bits 0-22.
 "Normalizing" a number means adjusting it so that the highest bit is
 significant (that is, set to 1). The exponent indicates how many times (and in
 which direction) the value was shifted during normalization.

 Notice that the MSB of the mantissa of any non-zero number that has been
 normalized is always a one. Zero can be treated as a special case: the
 exponent is simply set to zero. So, to gain additional precision, the mantissa
 has an implied "1" that is not stored, resulting in a functional 24-bit
 mantissa, even though only 23 bits are actually stored. This gives slightly more
 than 6 decimal digits of (single precision) accuracy.

 To make this clearer, let's look at some examples:

 Real number = 1
 MSB 0 01111111 00000000000000000000000 LSB
 Exponent = 127 (2^0) Mantissa = 1 (the implied 1 isn't stored)

 Real number = -9.9
 MSB 1 10000010 00111100110011001100110 LSB
 Exponent = 130 (2^3) Mantissa = 1.2375, approximately (the implied 1 isn't stored)

 In the second example, the magnitude of the real number appears in binary as
 1001.1110011... During normalization, the binary point is moved to the left 3
 times (incrementing the exponent each time), and the most significant bit
 becomes implied. The sign bit is 1, indicating that the number is negative.

 Real numbers may be passed by value, or else they may be defined as Var
 parameters and then passed by address.
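
 To check the two examples by hand, the fields can be pulled apart and the
 value rebuilt. The following Python sketch is only an illustration: it assumes
 the bit layout is exactly the one drawn above (the article says only that the
 format is "similar" to the IEEE proposal), the names split_real and real_value
 are hypothetical, and an exponent field of 255 is not handled because the
 article does not describe it.

```python
import struct


def split_real(bits):
    """Split a 32-bit pattern into (sign, exponent field, stored mantissa)."""
    sign = (bits >> 31) & 1
    exponent_field = (bits >> 23) & 0xFF
    stored_mantissa = bits & 0x7FFFFF     # 23 stored bits, implied 1 omitted
    return sign, exponent_field, stored_mantissa


def real_value(bits):
    """Rebuild the number from its fields, following the rules in the text."""
    sign, exponent_field, stored_mantissa = split_real(bits)
    if exponent_field == 0:
        return 0.0                        # exponent field 0 means the real is 0
    mantissa = 1 + stored_mantissa / 2 ** 23      # restore the implied 1
    magnitude = mantissa * 2.0 ** (exponent_field - 127)
    return -magnitude if sign else magnitude


ONE = 0x3F800000         # 0 01111111 00000000000000000000000
MINUS_9_9 = 0xC11E6666   # 1 10000010 00111100110011001100110

print(split_real(ONE), real_value(ONE))
print(split_real(MINUS_9_9), real_value(MINUS_9_9))

# Cross-check against the host's IEEE single-precision decoding (assumption:
# the Apple Pascal layout matches IEEE 754 single precision for these values).
print(struct.unpack(">f", MINUS_9_9.to_bytes(4, "big"))[0])
```

 The second value prints as roughly -9.8999996 rather than -9.9, which shows
 the limit of the 24-bit mantissa.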

 Booleans:

 The Boolean (binary) variable can have two values: TRUE and FALSE. Booleans
 are most commonly used in determining yes/no conditions, such as equality or
 set inclusion. Boolean variables are stored in one word, though only the LSB
 (least significant bit) is used. TRUE is represented by a 1; FALSE is
 represented by a 0.

 MSB  15 . . . . . . 8   7 . . . . . . 0  LSB
                                       Boolean

 UCSD Pascal does not allow direct printing of Boolean variables. For example:

 Program PrintBoolean;
 Var A: boolean;
 Begin
   A := FALSE;
   Writeln (A);   (* this is illegal *)
   If A = FALSE Then Writeln ('FALSE') Else Writeln ('TRUE');
                  (* this is correct *)
 End.

 Booleans are most efficient in packed arrays, where each bit of the word is
 utilized. DrawBlock is probably the best-known example of this use. An
 excellent example of the use of packed Boolean arrays is in the GrafDemo
 program on the Apple Pascal diskette APPLE3.

 Boolean variables may be passed by value or by address.
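
 The idea behind a packed Boolean array, one bit per element instead of one
 word per element, can be sketched as follows. This Python illustration assumes
 that element 0 occupies bit 0 of the word; the article does not state the bit
 order Apple Pascal uses, and the names pack_booleans and unpack_boolean are
 hypothetical.

```python
def pack_booleans(flags):
    """Pack up to 16 Booleans into one word (assumed: element 0 -> bit 0)."""
    if len(flags) > 16:
        raise ValueError("one word holds at most 16 Booleans")
    word = 0
    for index, flag in enumerate(flags):
        if flag:
            word |= 1 << index    # TRUE is stored as a 1 bit
    return word


def unpack_boolean(word, index):
    """Read element `index` back out of the packed word."""
    return bool((word >> index) & 1)


word = pack_booleans([True, False, True])
print(format(word, "016b"))       # 0000000000000101
print(unpack_boolean(word, 1))    # False
print(unpack_boolean(word, 2))    # True
```

 Sixteen unpacked Booleans would occupy sixteen words; packed, they fit in one.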

 Other Types:

 In addition to all the above standard types, Pascal allows the programmer to
 define a wide variety of non-standard variable types. Probably the most
 popular example of this is the SET.

 A set is an arbitrary collection of elements with each element assigned an
 ordinal position (that is, represented by a number). Each element of the set
 is represented by a name; you may choose any word for this name, except for (a)
 words reserved by Pascal, and (b) other variable definitions already in use.
 Each name is then associated with one bit in the data definition, beginning
 with bit 0. The set is stored in memory as a series of bits identified by the
 ordinal position of the element in the type definition. A set must end on a
 word boundary: for example, 17 elements would take up 2 words, even though only
 one bit of the second word is actually used.

 Example:

 Type Colors   = (Red,Green,Blue,Yellow,Black,White);
      ColorSet = Set of Colors;

 is a set of colors. Red occupies position 0, and White occupies position 5.

      <-------------------one word------------------>
 MSB  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  LSB
                                    W  B  Y  B  G  R
                                    h  l  e  l  r  e
                                    i  a  l  u  e  d
                                    t  c  l  e  e
                                    e  k  o     n
                                          w

 Sets may be passed either by address or by value, with certain restrictions.
 See p. 203 of the Pascal reference manual for details.
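
 The mapping from set elements to bits can be mimicked with a bit mask. The
 sketch below is in Python for illustration only; COLORS mirrors the type
 definition above, while set_to_word and words_needed are hypothetical helper
 names, not part of Apple Pascal.

```python
COLORS = ["Red", "Green", "Blue", "Yellow", "Black", "White"]


def set_to_word(members):
    """Return the bit pattern for a set: element at ordinal n sets bit n."""
    word = 0
    for name in members:
        word |= 1 << COLORS.index(name)
    return word


def words_needed(element_count):
    """A set must end on a word boundary, so round up to whole 16-bit words."""
    return (element_count + 15) // 16


print(format(set_to_word(COLORS), "016b"))            # 0000000000111111
print(format(set_to_word(["Red", "White"]), "016b"))  # 0000000000100001
print(words_needed(17))                               # 2
```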

 In general, complex record types consist of one or more standard types, each
 stored as described above. For the last word on Pascal data types, read
 Niklaus Wirth's Report in "User Manual and Report" by Jensen and Wirth.
 
 
 Pascal: Exponents


 The calculation 10 ^ X is not included in the UCSD Pascal definition. The
 system intrinsic PWROFTEN ("power of ten") returns 10 ^ X, provided X is an
 integer in the range 0..37. (Please refer to page 45 in the Apple Pascal
 Language Reference Manual.)

 The function EXP (in the library unit TRANSCEND) computes e ^ X, where X
 is a real number. The relationship between 10 ^ X and e ^ X is:

 10 ^ X = e ^ (X * LN(10)) (LN = natural log)


 Here is a simple program which illustrates the use of Pascal exponents:

 PROGRAM EXPONENT;

 USES TRANSCEND;

 BEGIN
   WRITELN ('10 ^ 3 = ',PWROFTEN(3));
   WRITELN ('e ^ 3 = ',EXP(3));
   WRITELN ('10 ^ 3 by the conversion = ',EXP(3*LN(10)));
 END.
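
 The same conversion can be tried on any modern machine. This Python sketch
 stands in for the Pascal program only to show the arithmetic; math.exp and
 math.log play the roles of EXP and LN, and power_of_ten is a hypothetical name.
 Unlike PWROFTEN, the conversion also accepts non-integer exponents, at the
 cost of a small rounding error.

```python
import math


def power_of_ten(x):
    """Compute 10 ^ x through the identity 10 ^ x = e ^ (x * ln 10)."""
    return math.exp(x * math.log(10))


print(power_of_ten(3))      # very close to 1000, subject to rounding
print(power_of_ten(0.5))    # very close to the square root of 10
```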


 Pascal: Real number format 



 Real numbers may be formatted for output by means of field-width designations.
 As described on pp. 36-37 of the Apple Pascal Language Reference Manual, the
 output specification has the following form:

 Real : FieldWidth : FractionLength

 where FieldWidth is the minimum number of characters written, including the
 decimal point (default=1), and FractionLength is the number of digits to be
 written after the decimal point (default=5). Thus, a field specification of
 R:8:3 indicates the real variable R, printed within a field size of 8, with 3
 of those characters being digits to the right of the decimal point. A
 FractionLength of zero is illegal.

 If the field size necessary for displaying the variable accurately is greater
 than the formatting specification, the formatting is ignored. If the size is
 smaller than FieldWidth, the field is padded with blanks to the left of the
 variable; the variable is thereby right-justified. 
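
 The padding rules can be imitated with Python's own format specification,
 which also treats the width as a minimum and right-justifies numbers. This is
 an analogy rather than a model of the Apple Pascal output routine (rounding in
 the last digit may differ), and write_real is a hypothetical name:

```python
def write_real(value, field_width, fraction_length):
    """Format `value` the way the text describes Real:FieldWidth:FractionLength."""
    if fraction_length <= 0:
        raise ValueError("a FractionLength of zero is illegal")
    # The width is a minimum: wider numbers are printed in full,
    # narrower ones are padded with blanks on the left.
    return f"{value:{field_width}.{fraction_length}f}"


print("[" + write_real(3.14159, 8, 3) + "]")     # [   3.142]
print("[" + write_real(12345.678, 4, 2) + "]")   # [12345.68]
```

 The first call pads the five-character result to eight characters; the second
 needs eight characters, so the requested width of 4 is ignored.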


 References:

Apple Computer, Inc.: http://til.info.apple.com/techinfo.nsf/

 --Apple Pascal Reference Manual, by Apple Computer, Inc., 1979.
 --Apple Pascal Language Reference Manual, by Apple Computer, Inc., 1980.
 --Apple Pascal Operating System Reference Manual, by Apple Computer, Inc., 1980.
 --Programming in Pascal, by Peter Grogono, Addison-Wesley, 1978.
 --User Manual and Report, by Kathleen Jensen and Niklaus Wirth,
   Springer-Verlag, 1974.