Information on the UCSD Pascal system
Compiled by Charles T. 'Dr. Tom' Turley, 9/11/97
(from reference and HTML sources; see end of article for listings)

Subject: Pascal II: Operand Formats

DISCUSSION

When you need to send data (especially complex data formats, such as strings) to an assembly routine from a Pascal host program, it is very useful to be familiar with the internal structure of Pascal variables. This article describes a few of the more commonly used variable types; for a complete description of the more complex variables, including records and arrays, see pp. 227-228 of the Apple Pascal Operating System Reference Manual.

Machine language (assembly) routines are commonly used when (a) speed is critical, or (b) the code must access other assembly routines (such as PROMs or I/O drivers) that can't be reassembled as part of the program. Also, bit manipulations such as right-shift are much easier to do in assembly than in Pascal.

In the UCSD Pascal system, it's fairly easy to create short assembly programs which can be linked into a Pascal host program. In some cases it may be sufficient merely to call the assembly routine; most routines, though, require that data be passed to them.

Data is passed to or from routines by means of a "parameter", a temporary variable created by Pascal specifically for that purpose. The term "Var parameter" means that the address of the actual variable, rather than its value, is passed to the routine. Certain types of variables may be passed by value, but any variable may be passed by address simply by declaring it to be a Var parameter.

Pascal does not allow parameters of variable length (with the exception of certain sets and long integers) to be passed on the CPU stack, since doing so could fill the stack to capacity and thereby crash the operating system. These parameters are therefore automatically treated as if they had been declared as Var parameters.
A good explanation of the various methods of passing parameters may be found in Peter Grogono's book, "Programming in Pascal".

Before delving into the details, let's define some terms and conventions which we'll use later on:

    Bit    = a binary digit (0 or 1). A bit is the smallest unit of information which can be stored in a computer.
    Nybble = 4 bits (half a byte). A hexadecimal digit is one nybble (pronounced "nibble").
    Byte   = 8 bits (2 nybbles). This is the unit of storage which the 6502 processor uses.
    Word   = 2 bytes (16 bits). A word is the unit of information which Pascal uses.
    LSB    = least significant bit
    MSB    = most significant bit

    decimal       65535                                 0
    hexadecimal   $FFFF  <--------memory--------->  $0000    addresses
                  MSB                                 LSB

This diagram of memory structure is useful for understanding the format of variables: although we're used to writing numbers from left to right, Pascal reads data from memory FROM RIGHT TO LEFT, starting at the least significant byte.

Integers:

Integers in UCSD Pascal are whole numbers between -32768 and +32767, inclusive. They are stored in one word (2 bytes). Negative integers are represented in "two's complement," which means that, read as unsigned values, they appear to be positive numbers greater than 32767; the negative integer is arrived at by subtracting 2 ^ 16 (65536) from this positive value. Conversely, an unsigned value greater than 32767 is read back as the complementary negative number (cf. Integer BASIC). The sign bit (MSB) is 0 if positive, 1 if negative.

    <--------byte---------> <--------byte--------->
    15   14 . . . . . .  8   7 . . . . . . . .  0     <== 16 bits
    Sign           Integer Value

Example: the number 3 is represented in binary as:

    MSB  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1  LSB

However, -3 is represented as:

    MSB  1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1  LSB

which also reads as 65533 (or 65536-3)!

Integers may be passed by value or as Var parameters.

Reals:

Real numbers in UCSD Pascal are floating point numbers between +/-1.17550E-38 and +/-3.40282E+38, inclusive.
Real numbers take up four bytes (2 words) of storage. Their binary representation is similar to the proposed IEEE standard for floating point numbers:

    31    30 . . . . . . 23    22 . . . . . . . . . 0     <== 32 bits
    Sign      Exponent               Mantissa

"Mantissa" is the name given to the significant digits of a number written in scientific (exponential) notation. The "exponent" indicates the power of the base by which the mantissa is multiplied. The number 3 x 10^2, for instance, has a mantissa of 3 and an exponent of 2, in base 10 (decimal). In the stored format, the exponent is a power of 2 (2^n).

The sign bit refers to the sign of the mantissa; it's 0 if positive, 1 if negative.

The exponent is "offset" by 127; that is, a value of 127 in the exponent field corresponds to an exponent of 0. Similarly, if the field is 1, the exponent is -126, and if the field is 254, the exponent is +127. A field value of 0 indicates that the real number is 0.

The mantissa of the real number is stored in normalized format in bits 0-22. "Normalizing" a number means shifting it so that its highest bit is significant (that is, set to 1). The exponent indicates how many times, and in which direction, the value was shifted during normalization.

Notice that the MSB of the mantissa of any normalized non-zero number is always a one, so it need not be stored. Zero is treated as a special case: the exponent field is simply set to zero. To gain additional precision, then, the mantissa has an implied "1" that is not stored, resulting in an effective 24-bit mantissa even though only 23 bits are actually stored. This gives slightly more than 6 significant decimal digits of (single precision) accuracy.
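The field layout described above (sign in bit 31, exponent field in bits 23-30 offset by 127, 23 stored mantissa bits with an implied leading 1) can be checked with a short sketch. It is written in Python rather than Pascal, purely as an illustration; decode_real is a hypothetical helper name, not part of any Apple Pascal library. The sketch assumes the 32 bits have already been assembled into a single integer, so it says nothing about the order of the bytes in Apple memory, and it does not handle an exponent field of 255, which lies outside the range this article describes.

```python
def decode_real(bits):
    """Split a 32-bit pattern into its three fields and compute the value.

    Returns (sign, exponent_field, stored_mantissa, value).
    """
    sign = (bits >> 31) & 0x1              # bit 31
    exponent_field = (bits >> 23) & 0xFF   # bits 23-30, offset by 127
    stored_mantissa = bits & 0x7FFFFF      # bits 0-22, leading 1 not stored

    if exponent_field == 0:
        # Per the article: an exponent field of 0 means the real is 0.
        value = 0.0
    else:
        # Restore the implied leading 1, then scale by 2^(field - 127).
        significand = 1.0 + stored_mantissa / float(1 << 23)
        value = significand * 2.0 ** (exponent_field - 127)

    if sign == 1:
        value = -value
    return sign, exponent_field, stored_mantissa, value


if __name__ == "__main__":
    # 0 01111111 00000000000000000000000
    print(decode_real(0x3F800000))
    # 1 10000010 00111100110011001100110
    print(decode_real(0xC11E6666))
```

For the pattern 0 01111111 00000000000000000000000 (hexadecimal 3F800000), the sketch returns a sign of 0, an exponent field of 127, a stored mantissa of 0, and a value of 1.0.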
To make this clearer, let's look at some examples:

    Real number = 1

        MSB  0 01111111 00000000000000000000000  LSB

        Exponent = 127 (2^0)
        Mantissa = 1 (the implied 1 isn't stored)

    Real number = -9.9

        MSB  1 10000010 00111100110011001100110  LSB

        Exponent = 130 (2^3)
        Mantissa = 1.00111100110011001100110 binary (the leading 1 is implied, not stored)

In the second example, the magnitude of the real number appears in binary as 1001.1110011... During normalization, the binary point is moved to the left 3 times (incrementing the exponent each time), and the most significant bit becomes implied. The sign bit is 1, indicating that the number is negative.

Real numbers may be passed by value, or they may be defined as Var parameters and passed by address.

Booleans:

A Boolean (binary) variable can have two values: TRUE and FALSE. Booleans are most commonly used in determining yes/no conditions, such as equality or set inclusion.

Boolean variables are stored in one word, though only the LSB (least significant bit) is used. TRUE is represented by a 1; FALSE is represented by a 0.

    MSB  15 . . . . . . 8   7 . . . . . . 0  LSB
                                          Boolean

UCSD Pascal does not allow direct printing of Boolean variables. For example:

    Program PrintBoolean;
    Var
      A: boolean;
    Begin
      A := FALSE;
      (* Writeln (A); *)        (* illegal: a Boolean cannot be written directly *)
      If A = FALSE Then
        Writeln ('FALSE')
      Else
        Writeln ('TRUE');       (* this is correct *)
    End.

Booleans are most efficient in packed arrays, where each bit of the word is utilized. DrawBlock is probably the best-known example of this use. An excellent example of the use of packed arrays of Booleans is in the GrafDemo program on the Apple Pascal diskette APPLE3.

Boolean variables may be passed by value or by address.

Other Types:

In addition to the standard types above, Pascal allows the programmer to define a wide variety of non-standard variable types. Probably the most popular example of this is the SET. A set is an arbitrary collection of elements, with each element assigned an ordinal position (that is, represented by a number).
Each element of the set is represented by a name; you may choose any word for this name, except for (a) words reserved by Pascal, and (b) other identifiers already in use. Each name is then associated with one bit in the data definition, beginning with bit 0.

The set is stored in memory as a series of bits, each identified by the ordinal position of its element in the type definition. A set must end on a word boundary: for example, 17 elements would take up 2 words, even though only one bit of the second word is actually used.

Example:

    Type
      Colors = (Red,Green,Blue,Yellow,Black,White);
      ColorSet = Set of Colors;

is a set of colors. Red occupies position 0, and White occupies position 5. A set containing all six colors is stored as:

    <-------------------one word------------------>
    MSB  0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1  LSB
                             W B Y B G R
                             h l e l r e
                             i a l u e d
                             t c l e e
                             e k o   n
                                 w

Sets may be passed either by address or by value, with certain restrictions. See p. 203 of the Pascal reference manual for details.

In general, complex record types consist of one or more standard types, each stored as described above. For the last word on Pascal data types, read Niklaus Wirth's Report in "User Manual and Report" by Jensen and Wirth.

Pascal: Exponents

The calculation 10 ^ X is not included in the UCSD Pascal definition. The system intrinsic PWROFTEN ("power of ten") returns 10 ^ X, provided X is an integer in the range 0..37. (Please refer to page 45 in the Pascal Language Reference manual.)

The function EXP (in the library unit TRANSCEND) computes e ^ X, where X is a real number. The relationship between 10 ^ X and e ^ X is:

    10 ^ X = e ^ (X * LN(10))        (LN = natural log)

Here is a simple program which illustrates the use of Pascal exponents:

    PROGRAM EXPONENT;
    USES TRANSCEND;                  (* library unit that supplies EXP and LN *)
    BEGIN
      (* intrinsic: integer powers of ten only *)
      WRITELN ('10 ^ 3 = ', PWROFTEN(3));
      (* e raised to a power *)
      WRITELN ('e ^ 3 = ', EXP(3));
      (* 10 ^ 3 computed through the identity above *)
      WRITELN ('10 ^ 3 by the conversion = ', EXP(3*LN(10)));
    END.
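The identity 10 ^ X = e ^ (X * LN(10)) is what lets EXP stand in for a power of ten when X is not an integer. The following minimal sketch checks the arithmetic in Python rather than Pascal; math.exp and math.log play the roles of EXP and LN, and power_of_ten is a hypothetical name chosen for this illustration.

```python
import math


def power_of_ten(x):
    """Compute 10 ^ x through the identity 10 ^ x = e ^ (x * ln 10)."""
    return math.exp(x * math.log(10.0))


if __name__ == "__main__":
    # Integer exponent: should agree with 10 ^ 3 up to rounding error.
    print(power_of_ten(3))
    # Fractional exponent, which an integer-only intrinsic cannot handle.
    print(power_of_ten(0.5))
```

Because the result passes through two floating point functions, it may differ from the exact power of ten in the last few digits; compare with a tolerance rather than testing for equality.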
Pascal: Real number format

Real numbers in UCSD Pascal are floating point numbers between +/-1.17550E-38 and +/-3.40282E+38, inclusive. Real numbers take up four bytes (2 words) of storage. Their binary representation is similar to the proposed IEEE standard for floating point numbers:

    31    30 . . . . . . 23    22 . . . . . . . . . 0     <== 32 bits
    Sign      Exponent               Mantissa

"Mantissa" is the name given to the significant digits of a number written in scientific (exponential) notation. The "exponent" indicates the power of the base by which the mantissa is multiplied; in the stored format it is a power of 2 (2^N). The number 3 x 10^2, for example, has a mantissa of 3 and an exponent of 2, in base 10 (decimal).

The sign bit refers to the sign of the mantissa; it's 0 if positive, 1 if negative.

The exponent is "offset" by 127; that is, a value of 127 in the exponent field corresponds to an exponent of 0. Similarly, if the field is 1, the exponent is -126, and if the field is 254, the exponent is +127. A field value of 0 indicates that the real number is 0.

The mantissa of the real number is stored in normalized format in bits 0-22. "Normalizing" a number means shifting it so that its highest bit is significant (that is, set to 1). The exponent indicates how many times, and in which direction, the value was shifted during normalization.

Notice that the MSB of the mantissa of any normalized non-zero number is always a one, so it need not be stored. Zero is treated as a special case: the exponent field is simply set to zero. For the sake of additional precision, then, the mantissa has an implied "1" that is not stored, resulting in an effective 24-bit mantissa even though only 23 bits are actually stored. This structure yields slightly more than 6 significant decimal digits of (single precision) accuracy.

Real numbers may be formatted for output by means of field-width designations. As described on pp. 36-37 of the Apple Pascal Language Reference Manual, the output specification has the following form:

    Real : FieldWidth : FractionLength

where FieldWidth is the minimum number of characters written, including the decimal point (default=1), and FractionLength is the number of digits to be written after the decimal point (default=5). Thus, a field specification of R:8:3 prints the real variable R within a field of 8 characters, with 3 digits appearing to the right of the decimal point. A FractionLength of zero is illegal.

If the field size necessary for displaying the variable accurately is greater than FieldWidth, the formatting is ignored. If it is smaller than FieldWidth, the field is padded with blanks to the left of the variable; the variable is thereby right-justified.

References:

Apple Computer, Inc.: http://til.info.apple.com/techinfo.nsf/
--Apple Pascal Reference Manual, by Apple Computer, Inc., 1979.
--Apple Pascal Language Reference Manual, by Apple Computer, Inc., 1980.
--Apple Pascal Operating System Reference Manual, by Apple Computer, Inc., 1980.
--Programming in Pascal, by Peter Grogono, Addison-Wesley, 1978.
--User Manual and Report, by Kathleen Jensen and Niklaus Wirth, Springer-Verlag, 1974.