DEC RADIX 50
RADIX 50[1][2][3] or RAD50[3] (also referred to as RADIX50,[4] RADIX-50[5] or RAD-50) is an uppercase-only character encoding created by Digital Equipment Corporation (DEC) for use on its DECsystem, PDP, and VAX computers.
RADIX 50's 40-character repertoire (050 in octal) can encode six characters plus four additional bits into one 36-bit machine word (PDP-6, PDP-10/DECsystem-10, DECSYSTEM-20), three characters plus two additional bits into one 18-bit word (PDP-9,[2] PDP-15),[6] or three characters into one 16-bit word (PDP-11, VAX).[3]
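The word capacities quoted above follow from simple arithmetic: n characters from a 40-symbol alphabet have 40^n combinations, and whatever bits the word has beyond those needed for the largest value are left over. A minimal sketch of that check follows; the function and variable names are illustrative and do not come from any DEC documentation.

```python
ALPHABET_SIZE = 0o50  # octal 50 is decimal 40


def spare_bits(chars_per_word, word_bits):
    """Return how many bits of a word remain after packing the characters."""
    combinations = ALPHABET_SIZE ** chars_per_word
    # Bits needed to hold the largest packed value (combinations - 1).
    bits_needed = (combinations - 1).bit_length()
    return word_bits - bits_needed


# (label, characters per word, word size in bits), as stated in the text above
for label, chars, bits in [("36-bit", 6, 36), ("18-bit", 3, 18), ("16-bit", 3, 16)]:
    print(label, "word:", chars, "characters,", spare_bits(chars, bits), "spare bits")
```

Three characters need 16 bits because 40^3 = 64000 is below 2^16 = 65536, and six characters need 32 bits because 40^6 = 4096000000 is below 2^32 = 4294967296.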
The actual encoding differs between the 36-bit and 16-bit systems.
36-bit systems
In 36-bit DEC systems RADIX 50 was commonly used in symbol tables for assemblers or compilers which supported six-character symbol names from a 40-character alphabet. This left four bits to encode properties of the symbol.
Because of its similarity to the SQUOZE character encoding scheme used in IBM's SHARE Operating System for representing object code symbols, DEC's variant was also sometimes called DEC Squoze.[7] IBM SQUOZE, however, packed six characters of a 50-character alphabet plus two additional flag bits into one 36-bit word.[6]
RADIX 50 was not normally used in 36-bit systems for encoding ordinary character strings; file names were normally encoded as six six-bit characters, and full ASCII strings as five seven-bit characters and one unused bit per 36-bit word.
Most significant bits (rows) / least significant bits (columns) | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
000 | space | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
001 | 7 | 8 | 9 | A | B | C | D | E |
010 | F | G | H | I | J | K | L | M |
011 | N | O | P | Q | R | S | T | U |
100 | V | W | X | Y | Z | . | $ | % |
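The 36-bit table can be turned into a small packing routine. The sketch below is hedged: the character order is read off the table, but placing the four extra bits above the 32-bit symbol value is an assumption made for illustration, since the text only says that four bits remain. The names `CHARS_36`, `pack36`, and `unpack36` are hypothetical, and the sketch accepts only full six-character symbols because the padding convention for shorter ones is not stated here.

```python
# Character order read from the 36-bit table: code = (row bits << 3) | column bits.
CHARS_36 = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%"


def pack36(symbol, flags):
    """Pack a six-character symbol and a 4-bit flag field into a 36-bit integer."""
    if len(symbol) != 6:
        raise ValueError("this sketch expects exactly six characters")
    if not 0 <= flags < 16:
        raise ValueError("flags must fit in four bits")
    value = 0
    for ch in symbol:
        value = value * 40 + CHARS_36.index(ch)  # ValueError on unknown characters
    # Assumption: the flag bits sit above the 32-bit symbol value.
    return (flags << 32) | value


def unpack36(word):
    """Reverse pack36, returning (symbol, flags)."""
    flags, value = word >> 32, word & 0xFFFFFFFF
    if value >= 40 ** 6:
        raise ValueError("symbol field is not a valid six-character value")
    chars = []
    for _ in range(6):
        value, code = divmod(value, 40)
        chars.append(CHARS_36[code])
    return "".join(reversed(chars)), flags
```

For instance, `unpack36(pack36("START.", 0b0101))` returns the original symbol together with the flag value 5.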
18-bit systems
RADIX 50 (also called Radix 50₈ format[2]) was used in Digital's 18-bit PDP-9 and PDP-15 computers to store symbols in symbol tables, leaving two extra bits per 18-bit word ("symbol classification bits").[2]
16-bit systems
Some strings in DEC's 16-bit systems were encoded as 8-bit bytes, while others used RADIX 50 (then also called MOD40).[3][8]
In RADIX 50, strings were encoded in successive words as needed, with the first character within each word located in the most significant position.
For example, using the PDP-11 encoding, the string "ABCDEF", with character values 1, 2, 3, 4, 5, and 6, would be encoded as a word containing the value 1×40² + 2×40¹ + 3×40⁰ = 1683, followed by a second word containing the value 4×40² + 5×40¹ + 6×40⁰ = 6606. Thus, 16-bit words encoded values ranging from 0 (three spaces) to 63999 ("999"). When there were fewer than three characters in a word, the last word for the string was padded with trailing spaces.[3]
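The packing rule in this example can be written as a short routine. This is a minimal sketch rather than DEC's own implementation: it uses the formula ((C1×40)+C2)×40+C3 quoted in reference [3], takes the character order from the 16-bit table with the assembler interpretation of code points 27 to 29, and uses illustrative names (`RAD50_CHARS`, `rad50_encode`, `rad50_decode`).

```python
# Code points 0-39: space, A-Z, then '$', '.', '%' (assembler variant), then 0-9.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_encode(text):
    """Pack text into a list of 16-bit word values, three characters per word."""
    text = text.upper()
    # Pad the final group with trailing spaces (code 0).
    text += " " * (-len(text) % 3)
    words = []
    for i in range(0, len(text), 3):
        # str.index raises ValueError for characters outside the repertoire.
        c1, c2, c3 = (RAD50_CHARS.index(ch) for ch in text[i:i + 3])
        words.append((c1 * 40 + c2) * 40 + c3)
    return words


def rad50_decode(words):
    """Unpack a list of word values back into text, keeping trailing spaces."""
    chars = []
    for word in words:
        if not 0 <= word < 40 ** 3:
            raise ValueError("not a valid RADIX 50 word: %d" % word)
        c1, rest = divmod(word, 40 * 40)
        c2, c3 = divmod(rest, 40)
        chars.extend(RAD50_CHARS[c] for c in (c1, c2, c3))
    return "".join(chars)


print(rad50_encode("ABCDEF"))      # [1683, 6606]
print(rad50_decode([1683, 6606]))  # ABCDEF
```

Values from 64000 to 65535 fit in a 16-bit word but correspond to no three-character string, which is why the decoder rejects them.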
There were several minor variations of this encoding with differing interpretations of code points 27, 28, and 29. Where RADIX 50 was used for filenames stored on media, these code points represent the $, %, and * characters, and will be shown as such when listing the directory with utilities such as DIR.[9] When encoding strings in the PDP-11 assembler and other PDP-11 programming languages, the code points represent the $, ., and % characters, and are encoded as such with the default RAD50 macro in the global macros file; this encoding was also used in the symbol tables. Some early documentation for the RT-11 operating system considered code point 29 to be undefined.[3]
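The practical effect of these variants is that the same stored word decodes to different text depending on which interpretation is applied. The following sketch illustrates this with two 40-character strings built from the descriptions above; the names `FILE_NAME_CHARS`, `ASSEMBLER_CHARS`, and `decode_word` are hypothetical.

```python
# The two strings differ only at code points 27, 28 and 29.
FILE_NAME_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$%*0123456789"
ASSEMBLER_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def decode_word(word, chars):
    """Decode one 16-bit RADIX 50 word value using the given character string."""
    c1, rest = divmod(word, 40 * 40)
    c2, c3 = divmod(rest, 40)
    return chars[c1] + chars[c2] + chars[c3]


# A word holding code points 27, 28 and 29 in that order.
word = (27 * 40 + 28) * 40 + 29

print(decode_word(word, FILE_NAME_CHARS))  # $%*
print(decode_word(word, ASSEMBLER_CHARS))  # $.%
```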
The use of RADIX 50 was the source of the filename size conventions used by Digital Equipment Corporation PDP-11 operating systems. Using RADIX 50 encoding, six characters of a filename could be stored in two 16-bit words, while three more extension (file type) characters could be stored in a third 16-bit word. Similarly, a three-character device name such as "DL1" could also be stored in a 16-bit word. The period that separated the filename and its extension, and the colon separating a device name from a filename, were implied (i.e., they were not stored and were always assumed to be present).
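The three-word layout described above can be sketched as follows. The helper names `pack3` and `pack_file_spec` and the sample name "SWAP.SYS" are illustrative assumptions, not taken from any DEC source. The example names use only letters and digits, so the differing interpretations of code points 27 to 29 do not affect the result.

```python
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def pack3(text):
    """Pack up to three characters into one 16-bit word value, space-padded."""
    if len(text) > 3:
        raise ValueError("at most three characters fit in one word")
    value = 0
    for ch in text.upper().ljust(3):
        value = value * 40 + RAD50_CHARS.index(ch)
    return value


def pack_file_spec(name, ext):
    """Return [name word 1, name word 2, extension word]; the '.' is not stored."""
    if len(name) > 6 or len(ext) > 3:
        raise ValueError("name is limited to 6 characters, extension to 3")
    name = name.ljust(6)
    return [pack3(name[:3]), pack3(name[3:]), pack3(ext)]


print(pack_file_spec("SWAP", "SYS"))  # [31321, 25600, 31419]
print(pack3("DL1"))                   # 6911
```

The separator characters never appear in the packed data, so a program displaying the name reinserts the period and colon itself.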
Most significant bits (rows) / least significant bits (columns) | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
000 | space | A | B | C | D | E | F | G |
001 | H | I | J | K | L | M | N | O |
010 | P | Q | R | S | T | U | V | W |
011 | X | Y | Z | $ | % or . | * or % | 0 | 1 |
100 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
See also
- Base 40
- Base conversion
- Chen–Ho encoding
- Densely packed decimal (DPD)
- Hertz encoding
- Packed BCD
- Six-bit character code
- Split octal
References
- "Chapter VI: The Loader - The Radix 50 Representation of Symbols". PDP-6 Multiprogramming System Manual (PDF). Maynard, Massachusetts, USA: Digital Equipment Corporation (DEC). 1965. p. 57. DEC-6-0-EX-SYS-UM-IP-PRE00. Archived (PDF) from the original on 2014-07-14. Retrieved 2014-07-10. (1+84+10 pages)
- "Appendix 1". PDP-9 Utility Programs--Advanced Software System--Programmer's Reference Manual (PDF). Maynard, Massachusetts, USA: Digital Equipment Corporation. 1968. Order No. DEC-9A-GUAB-D. Archived (PDF) from the original on 2020-06-04. Retrieved 2020-06-04.
- "8.10 .RAD50". PAL-11R Assembler - Programmer's Manual - Program Assembly Language and Relocatable Assembler for the Disk Operating System (2nd revised printing ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. May 1971 [February 1971]. p. 8-8. DEC-11-ASDB-D. Retrieved 2020-06-18. p. 8-8:
  […] PDP-11 systems programs often handle symbols in a specially coded form called RADIX 50 (this form is sometimes referred to as MOD40). This form allows 3 characters to be packed into 16 bits; therefore, any 6-character symbol can be held in two words. The single operand is of the form /CCC/ where the slash (the delimiter) can be any printable character except for = and : . The delimiters enclose the characters to be converted which may be A through Z, 0 through 9, dollar ($), dot (.) and space ( ). If there are fewer than 3 characters they are considered to be left justified and trailing spaces are assumed. […] The packing algorithm is as follows: […] A. Each character is translated into its RADIX 50 equivalent as indicated in the following table: Character - RADIX 50 Equivalent (octal): (space) - 0, A–Z - 1–32, $ - 33, . - 34, 0–9 - 36–47. Note that another character could be defined for code 35. […] B. The RADIX 50 equivalents for characters 1 through 3 (C1,C2,C3) are combined as follows: RESULT=((C1*50)+C2)*50+C3 […]
- Durda IV., Frank (2004). "RADIX50 Character Code Reference". Archived from the original on 2005-03-31. Retrieved 2005-03-31.
- "Appendix B.3: Radix-50 Constants and Character Set". Compaq Fortran 77 Language Reference Manual. Compaq Computer Corporation. 1999. Archived from the original on 2012-10-14. Retrieved 2012-10-14.
- Jones, Douglas W. (2018). "Lecture 7, Object Codes, Loaders and Linkers - Final steps on the road to machine code". Operating Systems, Spring 2018. Part of the CS:3620 Operating Systems Collection. Department of Computer Science, The University of Iowa. Archived from the original on 2020-06-06. Retrieved 2020-06-06.
- Murrell, Stephen J. (2005). "DEC/PDP Character Codes". rabbit.eng.miami.edu. University of Miami. DEC Squoze Character Table. Archived from the original on 2020-06-19. Retrieved 2020-06-19.
- PDP-11 Getting DOS on the Air (1 ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. August 1971. DEC-11-SYDC-D. Retrieved 2020-06-18.
- "RT11 Radix50 Demo".
Further reading
- Williams, Al (2016-11-22). "Squoze your data". Hackaday. Archived from the original on 2020-06-06. Retrieved 2020-06-06.