Discussion:
Today I realised where 6-character Fortran linker symbols came from ...
John Dallman
2024-08-15 21:36:00 UTC
A moderately experienced developer asked me why we have a formal
31-character limit on the length of identifiers in our published API, and
why there are some C macros with names longer than that which don't cause
problems.

I gave him a short version of the history of linkers and the gradually
increasing limits of symbol name length over the decades. I explained how
C macros get replaced: he's a mathematician by education, and took a
while to appreciate that practical computing involves compromises rather
than truly implementing mathematical abstractions.

In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.

John
Lawrence D'Oliveiro
2024-08-16 03:55:01 UTC
Post by John Dallman
In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.
6 whole bits for a character of a simple symbol?? Luxury.

On the PDP-11, we had to pack 6 characters into just 4 bytes, using a
special limited encoding (just for symbols, filenames and the like) called
“Radix-50”.
Lars Poulsen
2024-08-16 04:09:39 UTC
Post by Lawrence D'Oliveiro
Post by John Dallman
In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.
6 whole bits for a character of a simple symbol?? Luxury.
On the PDP-11, we had to pack 6 characters into just 4 bytes, using a
special limited encoding (just for symbols, filenames and the like) called
“Radix-50”.
The -50 of course was Octal 050 (decimal 40). Which means that it was a
VERY limited character set.
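
For anyone who never had to do it by hand, here is a minimal sketch of the packing in Python. The character ordering is the PDP-11 one as I remember it, and the character at code 29 varies between DEC documents, so the "%" there is only a placeholder. The function names are made up for illustration.

```python
# Assumed PDP-11 ordering: blank, A-Z, "$", ".", one variable code, 0-9.
# Code 29 differs between documents; "%" is a placeholder, not a claim.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_encode_word(text):
    """Pack up to three characters into one 16-bit value, base 40."""
    if len(text) > 3:
        raise ValueError("at most three characters per 16-bit word")
    value = 0
    for ch in text.upper().ljust(3):  # pad short input with blanks
        value = value * 40 + RAD50_CHARS.index(ch)  # ValueError if not in set
    return value


def rad50_decode_word(value):
    """Unpack a 16-bit base-40 value into its three characters."""
    chars = []
    for _ in range(3):
        value, code = divmod(value, 40)
        chars.append(RAD50_CHARS[code])
    return "".join(reversed(chars))
```

The largest possible value is 40**3 - 1 = 63999, which is why three characters fit in a 16-bit word and six fit in 4 bytes.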
John Levine
2024-08-16 14:20:36 UTC
Post by Lawrence D'Oliveiro
Post by John Dallman
In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.
6 whole bits for a character of a simple symbol?? Luxury.
On the PDP-11, we had to pack 6 characters into just 4 bytes, using a
special limited encoding (just for symbols, filenames and the like) called
“Radix-50”.
Yeah, they had a version of SQUOZE in compilers on IBM mainframes,
too. The PDP-11 version was based on the PDP-6/10 version which
encoded the six characters into 32 bits leaving the other four bits
for flags.
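
A rough sketch of that 36-bit layout in Python, under some assumptions: the character ordering is the PDP-6/10 one as I recall it, the flags are placed in the high four bits, and padding short names with trailing blanks is just a choice made for this example. The function names are hypothetical.

```python
# Assumed PDP-6/10-style ordering: blank, digits, letters, then ". $ %".
RADIX50_CHARS = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%"


def pack_symbol(name, flags):
    """Return a 36-bit integer: 4 flag bits above a 32-bit base-40 name."""
    if len(name) > 6:
        raise ValueError("symbol names are limited to six characters")
    if not 0 <= flags <= 0xF:
        raise ValueError("only four flag bits are available")
    value = 0
    for ch in name.upper().ljust(6):  # trailing-blank padding: sketch's choice
        value = value * 40 + RADIX50_CHARS.index(ch)
    # 40**6 = 4,096,000,000 is below 2**32 = 4,294,967,296,
    # so the encoded name always fits in 32 bits.
    return (flags << 32) | value


def unpack_symbol(word):
    """Split a 36-bit word back into (name, flags)."""
    flags, value = word >> 32, word & 0xFFFFFFFF
    chars = []
    for _ in range(6):
        value, code = divmod(value, 40)
        chars.append(RADIX50_CHARS[code])
    return "".join(reversed(chars)).rstrip(), flags
```

The point is only the arithmetic: six base-40 digits need just under 32 bits, so a 36-bit word has four bits to spare.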
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Rich Alderson
2024-08-16 22:07:50 UTC
Post by John Dallman
A moderately experienced developer asked me why we have a formal
31-character limit on the length of identifiers in our published API, and
why there are some C macros with names longer than that which don't cause
problems.
I gave him a short version of the history of linkers and the gradually
increasing limits of symbol name length over the decades. I explained how
C macros get replaced: he's a mathematician by education, and took a
while to appreciate that practical computing involves compromises rather
than truly implementing mathematical abstractions.
In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.
In point of fact, the use of Radix-50 (base-40 arithmetic; the 50 is octal) to encode
symbol names arose in the IBM 36 bit systems, so that linker flags could be
kept in the leftover 4 bits. (See also John Levine's followup regarding PDP-6/10
vs. the PDP-11.)

Friends of mine who were taking a compiler writing class at Ohio State in the
1974 time frame were complaining about having to implement Radix-50 in their
parsers (though they were using a 370/165 at the time).
--
Rich Alderson ***@alderson.users.panix.com
Audendum est, et veritas investiganda; quam etiamsi non assequamur,
omnino tamen proprius, quam nunc sumus, ad eam perveniemus.
--Galen
Kerr-Mudd, John
2024-08-17 08:28:26 UTC
On Thu, 15 Aug 2024 22:36 +0100 (BST)
Post by John Dallman
A moderately experienced developer asked me why we have a formal
31-character limit on the length of identifiers in our published API, and
why there are some C macros with names longer than that which don't cause
problems.
I gave him a short version of the history of linkers and the gradually
increasing limits of symbol name length over the decades. I explained how
C macros get replaced: he's a mathematician by education, and took a
while to appreciate that practical computing involves compromises rather
than truly implementing mathematical abstractions.
In the process I realised that 6-character names would fit into the
36-bit words (using 6-bit bytes) of IBM 700/7000 series machines, where
Fortran was originally developed, and this is probably the origin of the
6-character limit.
It was Dec wot started it:

from
https://en.wikipedia.org/wiki/RADIX-50


The use of RADIX 50 was the source of the filename size conventions used
by Digital Equipment Corporation PDP-11 operating systems. Using RADIX 50
encoding, six characters of a filename could be stored in two 16-bit
words, while three more extension (file type) characters could be stored
in a third 16-bit word. Similarly, a three-character device name such as
"DL1" could also be stored in a 16-bit word. The period that separated the
filename and its extension, and the colon separating a device name from a
filename, was implied (i.e., was not stored and always assumed to be
present).


TL;DnR: 3 x 16-bit words == filename limit of 6.3
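
As a sketch of how a 6.3 name lands in three words (Python; the character ordering is assumed, code 29 is a placeholder, and the function names are invented for the example):

```python
# Assumed PDP-11 ordering; the "%" at code 29 is only a placeholder.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_word(text):
    """Encode up to three characters as one 16-bit base-40 value."""
    value = 0
    for ch in text.upper().ljust(3):
        value = value * 40 + RAD50_CHARS.index(ch)
    return value


def pack_filename(filename):
    """Return three 16-bit words: two for the name, one for the extension."""
    name, _, ext = filename.partition(".")  # the period itself is never stored
    if len(name) > 6 or len(ext) > 3:
        raise ValueError("file names are limited to 6.3")
    name = name.ljust(6)
    return [rad50_word(name[:3]), rad50_word(name[3:]), rad50_word(ext)]
```

With this ordering, pack_filename("FOO.BAR") gives [10215, 0, 3258]: the middle word is zero because the name is blank-padded to six characters.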
--
Bah, and indeed Humbug.
Lawrence D'Oliveiro
2024-08-17 08:47:04 UTC
Post by Kerr-Mudd, John
Using RADIX
50 encoding, six characters of a filename could be stored in two 16-bit
words, while three more extension (file type) characters could be stored
in a third 16-bit word. Similarly, a three-character device name such as
"DL1" could also be stored in a 16-bit word.
Ah, memories of the “.fss” (filespec string scan) service from RSTS/E ...

But no, on that OS at least, the device name (if it was valid) was not
Radix-50 encoded. The 2 characters of the physical device name were passed
back as is, while the unit number was converted to an integer and passed
back in another byte. There was also a flag byte indicating whether the
unit number had been explicitly specified or not. This allowed, e.g. “SY:”
to be distinguished from “SY0:”.

This also happened if a logical name was specified that could be
translated to a physical device name. If it could not be translated (but
was syntactically valid), then it would be returned Radix-50-encoded, with
a suitable flag bit to indicate this had happened.

One odd thing was, the docs I read (up to RSTS/E v7), always described the
Radix-50 code that was used for the “?” character as “undefined”.
John Levine
2024-08-17 17:13:15 UTC
from https://en.wikipedia.org/wiki/RADIX-50
Um, that article says it was preceded by SQUOZE on the IBM 709 in 1958.

DEC's first 36 bit machine, the PDP-6, was shipped in 1964.

The early Fortran compilers stored variable names as six BCD characters
in a word, according to this document:

https://bitsavers.org/pdf/ibm/fortran/FORTRAN_704_709_Systems_Manual-1960.pdf

I don't think we need to look any farther back.
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Kerr-Mudd, John
2024-08-19 08:04:36 UTC
On Sat, 17 Aug 2024 17:13:15 -0000 (UTC)
Post by John Levine
from https://en.wikipedia.org/wiki/RADIX-50
Um, that article says it was preceded by SQUOZE on the IBM 709 in 1958.
Thanks; I sit corrected; that's before my time (literally!).
Post by John Levine
DEC's first 36 bit machine, the PDP-6, was shipped in 1964.
The early Fortran compilers stored variable names as six BCD characters
in a word, according to this document:
https://bitsavers.org/pdf/ibm/fortran/FORTRAN_704_709_Systems_Manual-1960.pdf
I don't think we need to look any farther back.
--
Bah, and indeed Humbug.