Origin Of Filename Extension Dot Separator

Lawrence D'Oliveiro

2024-08-05 01:05:10 UTC

I think several OSes had the idea, from the early days, of having some
part of a file name identify the kind of file it is. But it was DEC, I
believe, that originated the convention of having a dot separator enforced
by the filespec syntax: “«name».«ext»”. (Some other OSes, possibly IBM if
I recall rightly, used a space to separate the parts.)

Unix also copied the dot idea, but purely as a convention: the dot was
just another character that was valid in filenames, and the recognition of
some part of a filename as an “extension” was done only in particular
userland tools that needed to do filename transformations (e.g. make), not
by the OS itself.
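
As a rough illustration of "extension handling lives in userland", here is a minimal Python sketch. The file names and the `make_style_target` helper are made up for this example; `os.path.splitext` is an ordinary library routine that splits on the last dot, and nothing in the kernel does any such split.

```python
import os.path

def make_style_target(source: str, new_ext: str = ".o") -> str:
    """Swap the extension the way a build tool might (pure string handling)."""
    stem, _ext = os.path.splitext(source)
    return stem + new_ext

# Hypothetical file names, chosen only for illustration.
print(os.path.splitext("main.c"))          # ('main', '.c')
print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
print(os.path.splitext("README"))          # ('README', '')
print(make_style_target("main.c"))         # main.o
```

Note that only the last dot counts here; that is a choice made by the library, not a rule of the filesystem.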

From there, Gary Kildall was inspired somewhat by the DEC systems he was
using in the early development of CP/M, simplifying the filespec format
down a bit: only single-character device names and no directories (to
begin with) or file versioning, but keeping the core “«dev»:«name».«ext»”
idea. And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

John Levine

2024-08-05 02:41:47 UTC

Post by Lawrence D'Oliveiro
I think several OSes had the idea, from the early days, of having some
part of a file name identify the kind of file it is. But it was DEC, I
believe, that originated the convention of having a dot separator enforced
by the filespec syntax: “«name».«ext»”. (Some other OSes, possibly IBM if
I recall rightly, used a space to separate the parts.)

Multics also used a dot to separate parts of filenames as did OS/360.
They were written about the same time as the PDP-6 monitor so it's
anyone's guess which was first or which influenced what. Multics and
the PDP-6 were both created by people from MIT so there's likely some
crossover there.

CMS, the single-user OS that was part of CP/67 and its successors, did
(still does I assume) use a space separator.

Post by Lawrence D'Oliveiro
Unix also copied the dot idea, but purely as a convention: the dot was
just another character that was valid in filenames, and the recognition of
some part of a filename as an “extension” was done only in particular
userland tools that needed to do filename transformations (e.g. make), not
by the OS itself.

Right. The only characters that are special are slash to separate
components of pathnames and nul for the end of the name. There is
a firmly enforced system convention that the first entry in a directory
is . pointing to the directory itself and the second is .. pointing
to the parent.
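
A small sketch of how that looks from user space, assuming a POSIX-style system. The directory name `child` and the helper function are invented for the example; it only checks where "." and ".." resolve, not where they sit in the on-disk directory.

```python
import os
import tempfile

def dot_entries_resolve_as_expected() -> bool:
    """Check that "." and ".." name the directory itself and its parent."""
    with tempfile.TemporaryDirectory() as parent:
        child = os.path.join(parent, "child")  # hypothetical directory name
        os.mkdir(child)
        same_dir = os.path.samefile(os.path.join(child, "."), child)
        same_parent = os.path.samefile(os.path.join(child, ".."), parent)
        # os.listdir() deliberately leaves "." and ".." out of its result.
        listing = os.listdir(child)
        hidden = "." not in listing and ".." not in listing
        return same_dir and same_parent and hidden

print(dot_entries_resolve_as_expected())
```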

--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Charlie Gibbs

2024-08-05 02:46:19 UTC

Post by John Levine

Post by Lawrence D'Oliveiro
I think several OSes had the idea, from the early days, of having some
part of a file name identify the kind of file it is. But it was DEC, I
believe, that originated the convention of having a dot separator enforced
by the filespec syntax: “«name».«ext»”. (Some other OSes, possibly IBM if
I recall rightly, used a space to separate the parts.)

Multics also used a dot to separate parts of filenames as did OS/360.
They were written about the same time as the PDP-6 monitor so it's
anyone's guess which was first or which influenced what. Multics and
the PDP-6 were both created by people from MIT so there's likely some
crossover there.
CMS, the single-user OS that was part of CP/67 and its successors, did
(still does I assume) use a space separator.

Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

--
/~\ Charlie Gibbs | We'll go down in history as the
\ / <***@kltpzyxm.invalid> | first society that wouldn't save
X I'm really at ac.dekanfrus | itself because it wasn't cost-
/ \ if you read it the right way. | effective. -- Kurt Vonnegut

Lawrence D'Oliveiro

2024-08-05 03:04:50 UTC

Apple file systems, from the Classic Mac onwards, store a file type in a
separate field in the metadata.

Not sure if that’s still true. I think OS X (or “macOS” as it’s called
now) is much more dependent on filename extensions than old MacOS.

One of the things that put me off OS X when it first came in was the
multiple translation layers, at least for backward-compatible APIs: from
the lower-level HFS/HFS+ structures, into the hacks in the file handling
in the BSD-derived kernel, and then back to the old-style APIs above that.

Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there are
useful conventions which can be bent if necessary).

Current Linux filesystems do have the concept of additional user-definable
(and some predefined) “attributes”. For example, there was a bit of a
security flap over GNU wget some years ago, when it was discovered that it
was saving the source URL as a custom attribute on a downloaded file; if
this had any password information in it, that could of course be retrieved
(accidentally or otherwise) later ...
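
For anyone who wants to poke at this, a minimal sketch of setting and reading back a user attribute from Python. The attribute name `user.xdg.origin.url` and the URL are assumptions picked for illustration, `os.setxattr`/`os.getxattr` exist only on Linux, and the filesystem holding the temporary file may refuse user attributes, in which case the helper just returns None.

```python
import os
import tempfile

def xattr_roundtrip(name: str, value: bytes):
    """Set and read back one extended attribute; return None if unsupported."""
    if not hasattr(os, "setxattr"):  # os.setxattr is only available on Linux
        return None
    with tempfile.NamedTemporaryFile() as f:
        try:
            os.setxattr(f.name, name, value)
            return os.getxattr(f.name, name)
        except OSError:
            # The filesystem holding the temp file may not allow user attributes.
            return None

# Made-up URL with embedded credentials, to show what could leak.
leaked = xattr_roundtrip("user.xdg.origin.url", b"https://user:secret@example.com/f")
print(leaked)
```

Anything stored this way travels with the file for as long as it stays on a filesystem (and through copy tools) that preserve extended attributes.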

R Daneel Olivaw

2024-08-08 19:32:32 UTC

Post by Charlie Gibbs

Post by John Levine

Post by Lawrence D'Oliveiro
I think several OSes had the idea, from the early days, of having some
part of a file name identify the kind of file it is. But it was DEC, I
believe, that originated the convention of having a dot separator enforced
by the filespec syntax: “«name».«ext»”. (Some other OSes, possibly IBM if
I recall rightly, used a space to separate the parts.)

Multics also used a dot to separate parts of filenames as did OS/360.
They were written about the same time as the PDP-6 monitor so it's
anyone's guess which was first or which influenced what. Multics and
the PDP-6 were both created by people from MIT so there's likely some
crossover there.
CMS, the single-user OS that was part of CP/67 and its successors, did
(still does I assume) use a space separator.

Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

Unisys (Sperry, Univac) did that as well.
The "type" field is 3 bits, the values (in no particular order) are:
Symbolic, Relocatable, Absolute (an executable), Assembler Procedure,
Cobol Procedure, Fortran Procedure, "Omnibus" (= "something else") and I
can't remember the last one.
They also have Subtypes - a 6-bit field - which specify what kind of
Symbolic or Omnibus it is, only the first 50 or so are defined and some
of those subtype values have been obsolete for decades.

Scott Lurndal

2024-08-08 22:42:06 UTC

Post by R Daneel Olivaw

Post by Charlie Gibbs

Post by John Levine

Post by Lawrence D'Oliveiro
I think several OSes had the idea, from the early days, of having some
part of a file name identify the kind of file it is. But it was DEC, I
believe, that originated the convention of having a dot separator enforced
by the filespec syntax: “«name».«ext»”. (Some other OSes, possibly IBM if
I recall rightly, used a space to separate the parts.)

Multics also used a dot to separate parts of filenames as did OS/360.
They were written about the same time as the PDP-6 monitor so it's
anyone's guess which was first or which influenced what. Multics and
the PDP-6 were both created by people from MIT so there's likely some
crossover there.
CMS, the single-user OS that was part of CP/67 and its successors, did
(still does I assume) use a space separator.

Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

Unisys (Sperry, Univac) did that as well.
Symbolic, Relocatable, Absolute (an executable), Assembler Procedure,
Cobol Procedure, Fortran Procedure, "Omnibus" (= "something else") and I
can't remember the last one.

Burroughs medium systems had EDITOR, COBOL, SPRITE, SPRASM, BASIC,
FORTRAN, WFL, ICM (Independently Compiled Module, i.e. object file),
and a few others.

Lawrence D'Oliveiro

2024-08-09 00:05:24 UTC

Post by R Daneel Olivaw

Apple file systems, from the Classic Mac onwards, store a file type in
a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there are
useful conventions which can be bent if necessary).

Unisys (Sperry, Univac) did that as well.
Symbolic, Relocatable, Absolute (an executable), Assembler Procedure,
Cobol Procedure, Fortran Procedure, "Omnibus" (= "something else") and I
can't remember the last one.
They also have Subtypes - a 6-bit field - which specify what kind of
Symbolic or Omnibus it is, only the first 50 or so are defined and some
of those subtype values have been obsolete for decades.

Old Mac OS had separate “type” and “creator” fields. The latter was an
indication as to which app to launch when you tried to open the file in
the Finder, while the former gave format information.

I think BeOS had a similar idea, as well.

Dave Yeo

2024-08-11 02:06:15 UTC

Post by Charlie Gibbs
Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

Further back, the Apple ][ DOS had 4 (actually 8 but only 4 were
supported) file types in the metadata and the Apple /// SOS had 256
filetypes each with 64k subtypes which had various uses depending on
type. That file system was also used by Apple // Prodos, both 8 bit and
16 bit versions.
Dave

Peter Flass

2024-08-12 19:18:01 UTC

Post by Dave Yeo

Post by Charlie Gibbs
Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

Further back, the Apple ][ DOS had 4 (actually 8 but only 4 were
supported) file types in the metadata and the Apple /// SOS had 256
filetypes each with 64k subtypes which had various uses depending on
type. That file system was also used by Apple // Prodos, both 8 bit and
16 bit versions.
Dave

OS/2 has always used “extended attributes” (aka metadata) for this stuff.
Very handy.

--
Pete

Dave Yeo

2024-08-13 22:46:46 UTC

Post by Peter Flass

Post by Dave Yeo

Post by Charlie Gibbs
Apple file systems, from the Classic Mac onwards, store a file type
in a separate field in the metadata. Thus, as with *n*x file systems,
there are no magical parts to the file name itself (although there
are useful conventions which can be bent if necessary).

Further back, the Apple ][ DOS had 4 (actually 8 but only 4 were
supported) file types in the metadata and the Apple /// SOS had 256
filetypes each with 64k subtypes which had various uses depending on
type. That file system was also used by Apple // Prodos, both 8 bit and
16 bit versions.
Dave

OS/2 has always used “extended attributes” (aka metadata) for this stuff.
Very handy.

Yes, but only in the WPS; cmd.exe uses the filename extension to decide
if a file is an executable. And even the WPS often uses the extension to
decide what to do with a file.
Things could also get weird with the .LONGNAME attribute on file systems
that supported long names when they got out of sync.
I always also wondered about security, foo.txt could have a .TYPE of
executable, double click it and it would run.
Dave

Charlie Gibbs

2024-08-14 02:48:36 UTC

Post by Dave Yeo

Post by Peter Flass
OS/2 has always used “extended attributes” (aka metadata) for this stuff.
Very handy.

Yes, but only in the WPS; cmd.exe uses the filename extension to decide
if a file is an executable. And even the WPS often uses the extension to
decide what to do with a file.
Things could also get weird with the .LONGNAME attribute on file systems
that supported long names when they got out of sync.
I always also wondered about security, foo.txt could have a .TYPE of
executable, double click it and it would run.

It's bad enough that Windows defaults to hiding file extensions,
so some poor punter can double-click on a file named mytune.mp3.exe
and run some malware rather than listening to music.
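
A quick sketch of the kind of check a mail filter or file manager could make for that trick. The extension set is an illustrative, incomplete assumption, and `looks_disguised` is a made-up helper, not anything Windows itself provides.

```python
import os.path

# Illustrative, incomplete set of extensions that Windows treats as runnable.
EXECUTABLE_EXTS = {".exe", ".com", ".bat", ".cmd", ".scr"}

def looks_disguised(filename: str) -> bool:
    """True if an executable extension hides behind another extension."""
    stem, last = os.path.splitext(filename)
    _, inner = os.path.splitext(stem)
    return last.lower() in EXECUTABLE_EXTS and inner != ""

print(looks_disguised("mytune.mp3.exe"))  # True
print(looks_disguised("mytune.mp3"))      # False
print(looks_disguised("setup.exe"))       # False
```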

Lawrence D'Oliveiro

2024-08-05 02:58:55 UTC

There is a firmly enforced [*nix] system convention that the first entry
in a directory is . pointing to the directory itself and the second
is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their special-case
handling built into the kernel filename-parsing code (which has to be done
to some extent anyway), there is no need to waste all that on-disk space
for those entries.

Joe Pfeiffer

2024-08-05 03:21:09 UTC

Post by Lawrence D'Oliveiro

There is a firmly enforced [*nix] system convention that the first entry
in a directory is . pointing to the directory itself and the second
is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their special-case
handling built into the kernel filename-parsing code (which has to be done
to some extent anyway), there is no need to waste all that on-disk space
for those entries.

While you're right that it's wasteful and unnecessary, you're talking
about two entries per directory, which is *tiny*. Yeah, I can see
someone with nothing better to do optimizing them away (assuming it
hasn't already happened -- I didn't check) but that would be so far down
on my list of things to improve the daisies over my grave will be rising
long before this reaches the top of my priority queue.

Lawrence D'Oliveiro

2024-08-05 04:15:07 UTC

Post by Joe Pfeiffer
While you're right that it's wasteful and unnecessary, you're talking
about two entries per directory, which is *tiny*.

It’s also something else that has to be checked as a requirement for
filesystem consistency, so it’s something else to go wrong.
Unnecessarily.

John Levine

2024-08-06 02:18:16 UTC

Post by Lawrence D'Oliveiro

Post by Joe Pfeiffer
While you're right that it's wasteful and unnecessary, you're talking
about two entries per directory, which is *tiny*.

It’s also something else that has to be checked as a requirement for
filesystem consistency, so it’s something else to go wrong.
Unnecessarily.

I dunno, adding special case code that has to run every time you look
up a filename to avoid code in fsck that you run maybe twice a year
doesn't sound like a great tradeoff.

Also, I see how you could fake the "." entry since you already know
what inode you're working on, but how do you fake ".."? Keep a chain
of links in RAM from the root to every inode that is a directory?
What about those system calls that let you give the inode to start
the name search?

We understand that by definition you never make a mistake but you're
really going to have to help us out here.

Lawrence D'Oliveiro

2024-08-06 02:45:45 UTC

Also, I see how you could fake the "." entry since you already know what
inode you're working on, but how do you fake ".."? Keep a chain of
links in RAM from the root to every inode that is a directory?

That is how the Linux kernel seems to do it. It creates a “dentry” struct
for every directory on the path to the item being accessed.

What about those system calls that let you give the inode to start the
name search?

There aren’t any.

I dunno, adding special case code that has to run every time you look up
a filename to avoid code in fsck that you run maybe twice a year doesn't
sound like a great tradeoff.

You need to special-case the handling of “..” anyway. Remember, in the
directory at the root of a filesystem, the “..” entry points to itself.
But the kernel has to ignore this when that filesystem is mounted in a
non-root directory, and interpret “..” as going up into the parent of that
directory.

Conversely, Linux, for example, supports filesystem namespaces, where a
particular directory can look like the root of the filesystem when it is
not. To enforce this, the kernel has to ignore attempts to escape out of
the namespace via “..”, and interpret that as pointing back to the
namespace root.
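
The root case is easy to observe from user space. A minimal sketch (the helper name is invented); it only shows the result of the lookup, not the mount-point or namespace handling inside the kernel.

```python
import os

def root_parent_is_root() -> bool:
    """At the root, ".." resolves back to the root itself."""
    root = os.path.abspath(os.sep)
    return os.path.samefile(root, os.path.join(root, ".."))

print(root_parent_is_root())
```

Run inside a chroot or a mount namespace, the same check should still hold for whatever that process sees as its root.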

John Levine

2024-08-05 03:23:46 UTC

Post by Lawrence D'Oliveiro

There is a firmly enforced [*nix] system convention that the first entry
in a directory is . pointing to the directory itself and the second
is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their special-case
handling built into the kernel filename-parsing code (which has to be done
to some extent anyway), there is no need to waste all that on-disk space
for those entries.

The only time it would make a difference is if those two entries made
a directory that didn't quite fill up N disk blocks expand to N+1. My
laptop has a terabyte of disk with a 4K block size. I'm guessing those
two entries take perhaps 12 bytes. How often do you think a directory
is between 4084 and 4095 bytes so they'd add an extra disk block? My
estimate rounds to zero.
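
The arithmetic can be sketched like this, taking the 4K block size and the 12-byte guess from the post as given (the 12 bytes is an estimate, not a measured on-disk size, and the helper names are invented).

```python
import math

BLOCK = 4096       # block size quoted in the post
DOT_ENTRIES = 12   # the post's rough guess for "." plus ".." together

def blocks_needed(nbytes: int, block: int = BLOCK) -> int:
    """Whole blocks needed to hold nbytes of directory data (at least one)."""
    return max(1, math.ceil(nbytes / block))

def costs_extra_block(size_without: int) -> bool:
    """Would adding the two entries push the directory into another block?"""
    return blocks_needed(size_without + DOT_ENTRIES) > blocks_needed(size_without)

# Count the sizes within the first block where the two entries would matter.
unlucky = sum(costs_extra_block(n) for n in range(1, BLOCK + 1))
print(unlucky, "of", BLOCK)  # 12 of 4096
```

So under these assumptions only a dozen of the 4096 possible sizes in the first block would spill over, which is the "rounds to zero" point.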

R's,
John

Sebastian

2024-08-27 05:18:33 UTC

Post by Lawrence D'Oliveiro

There is a firmly enforced [*nix] system convention that the first entry
in a directory is . pointing to the directory itself and the second
is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their special-case
handling built into the kernel filename-parsing code (which has to be done
to some extent anyway), there is no need to waste all that on-disk space
for those entries.

Given the way the original UNIX worked, if there wasn't an actual entry
in the directory pointing to the parent, there'd be no way for a process
to reach the parent unless it had the full path to the parent in memory,
and then it would have to start from root to get there, visiting the inode
of each ancestor directory, and then linearly scanning that directory
to find the inode number of the next level in the tree, until it would
finally arrive at the parent directory's inode.

The "." entry would've saved a linear scan of the parent directory if
you needed the inode number of the current directory, since it's the
first entry.

Lawrence D'Oliveiro

2024-08-27 06:57:05 UTC

Post by Sebastian

Post by Lawrence D'Oliveiro

There is a firmly enforced [*nix] system convention that the first
entry in a directory is . pointing to the directory itself and the
second is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their
special-case handling built into the kernel filename-parsing code
(which has to be done to some extent anyway), there is no need to waste
all that on-disk space for those entries.

Given the way the original UNIX worked, if there wasn't an actual entry
in the directory pointing to the parent, there'd be no way for a process
to reach the parent unless it had the full path to the parent in memory,
and then it would have to start from root to get there, visiting the
inode of each ancestor directory, and then linearly scanning that
directory to find the inode number of the next level in the tree, until
it would finally arrive at the parent directory's inode.
The "." entry would've saved a linear scan of the parent directory if
you needed the inode number of the current directory, since it's the
first entry.

None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

Sebastian

2024-08-27 07:33:40 UTC

Post by Sebastian

Post by Lawrence D'Oliveiro

There is a firmly enforced [*nix] system convention that the first
entry in a directory is . pointing to the directory itself and the
second is .. pointing to the parent.

This is an entirely separate issue, but I find that just redundant and
unnecessary, taking up space in every directory with these two
backpointers. By all means reserve the names, but have their
special-case handling built into the kernel filename-parsing code
(which has to be done to some extent anyway), there is no need to waste
all that on-disk space for those entries.

Given the way the original UNIX worked, if there wasn't an actual entry
in the directory pointing to the parent, there'd be no way for a process
to reach the parent unless it had the full path to the parent in memory,
and then it would have to start from root to get there, visiting the
inode of each ancestor directory, and then linearly scanning that
directory to find the inode number of the next level in the tree, until
it would finally arrive at the parent directory's inode.
The "." entry would've saved a linear scan of the parent directory if
you needed the inode number of the current directory, since it's the
first entry.

None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

There was no reason to hide them, and on the PDP-11 there was every
reason to not implement extra behavior just to hide them. And then
user programs came to expect them to be there, so now you'd have
to simulate them even if they weren't really there.

Lawrence D'Oliveiro

2024-08-27 07:36:23 UTC

Post by Lawrence D'Oliveiro
None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

There was no reason to hide them ...

This was a system which already went to a lot of trouble to hide the fact
that file space is allocated in units of whole blocks, they couldn’t put
in code to skip over some internal details in directory structures??

Sebastian

2024-08-27 08:19:28 UTC

Post by Lawrence D'Oliveiro

None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

There was no reason to hide them ...

This was a system which already went to a lot of trouble to hide the fact
that file space is allocated in units of whole blocks, they couldn’t put
in code to skip over some internal details in directory structures??

That code would've had to have gone in the read(2) system call, because
one difference between V7 UNIX and modern systems was that to list
a directory, you just opened it like a file and read the bytes representing
the file data directly from the disk. That structure only contained
two fields, name and inode number, which had to be carried over into
opendir(3) and friends because that was originally just a library to
abstract over using open(2) and read(2).
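
To make the "name and inode number" structure concrete, here is a sketch that parses raw directory bytes in what I understand to be the V7 layout: 16-byte entries, a 16-bit little-endian inode number followed by a 14-byte NUL-padded name, with inode 0 marking an unused slot. The sample data is hand-built and the inode numbers and names are invented.

```python
import struct

ENTRY = struct.Struct("<H14s")  # 16-bit inode number, 14-byte NUL-padded name

def parse_v7_directory(raw: bytes):
    """Yield (inode, name) pairs from raw V7-style directory bytes."""
    for offset in range(0, len(raw) - ENTRY.size + 1, ENTRY.size):
        inode, name = ENTRY.unpack_from(raw, offset)
        if inode == 0:          # inode 0 marks an unused slot
            continue
        yield inode, name.rstrip(b"\0").decode("ascii")

# Hand-built sample directory; the inode numbers and names are invented.
sample = b"".join(
    ENTRY.pack(ino, name)
    for ino, name in [(41, b"."), (2, b".."), (0, b"deleted"), (57, b"notes.txt")]
)
entries = list(parse_v7_directory(sample))
print(entries)      # [(41, '.'), (2, '..'), (57, 'notes.txt')]
print(entries[2:])  # what a program that skips the first two entries would see
```

A loop like this is roughly all a directory-reading library had to do on top of open(2) and read(2).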

The simplest way to implement it would've been for open(2) to detect
it was opening a directory, and then respond by seeking forward. And
then lseek(2) would have to seek to that same position if it was
asked to seek to position 0.

And doing all of this would require there to be extra system calls
to do things that could otherwise be done by accessing the "." and
".." entries. For example, "cd .." would NOT have worked, nor
chdir("..") in C. So now there has to be a "cdup(2)" system call.

Lawrence D'Oliveiro

2024-08-27 23:57:20 UTC

Post by Sebastian

Post by Lawrence D'Oliveiro

None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

There was no reason to hide them ...

This was a system which already went to a lot of trouble to hide the
fact that file space is allocated in units of whole blocks, they
couldn’t put in code to skip over some internal details in directory
structures??

That code would've had to have gone in the read(2) system call, because
one difference between V7 UNIX and modern systems was that to list a
directory, you just opened it like a file and read the bytes
representing the file data directly from the disk.

But that all went away as soon as Unix acquired a VFS abstraction layer,
which happened back in the 1980s. I think BSD had already invented
readdir(3) and friends by then.

Sebastian

2024-08-28 03:42:52 UTC

Post by Lawrence D'Oliveiro

Post by Sebastian

Post by Lawrence D'Oliveiro

None of which required the “.” or “..” entries to be visible to user
processes as part of a directory scan.

There was no reason to hide them ...

This was a system which already went to a lot of trouble to hide the
fact that file space is allocated in units of whole blocks, they
couldn’t put in code to skip over some internal details in directory
structures??

That code would've had to have gone in the read(2) system call, because
one difference between V7 UNIX and modern systems was that to list a
directory, you just opened it like a file and read the bytes
representing the file data directly from the disk.

But that all went away as soon as Unix acquired a VFS abstraction layer,
which happened back in the 1980s. I think BSD had already invented
readdir(3) and friends by then.

readdir(3) and friends were libc functions that were implemented
using open(2) and read(2). They were simple functions that could be
implemented on V7 in an hour or less (and it wouldn't surprise me if the
original BSD implementation could be compiled on V7 without modification). The
idea that directories were special things that should never be accessed
except by that suite of functions came later, and by then, there were
already programs that skipped the first two entries in every directory
they iterated over, so there was little incentive to get rid of them
and at least a little reason not to.

Of course, the modern Linux community does not care about backward
compatibility, so the only thing preventing "." and ".." from being
removed from Linux is Linus's rule against breaking userspace, which will
disappear as soon as he retires, and then we'll probably see Linux
releases that break everything from libc on up, requiring everything to
be rewritten from scratch, just like we see every 5 years or so in the
GNOME and KDE worlds.

Lawrence D'Oliveiro

2024-08-07 00:09:30 UTC

At one time (1982 or so) I had the chance to evaluate an HP 9836 for a
proposed software development project (which fell through).

This was running a variant of the UCSD P-system. Only, instead of the
Pascal compiler producing interpreted P-code, it generated actual Motorola
68000 machine language.

File names could be 15 characters long, which was more than most other
micro-based systems of the time. But then the system conventions used
ridiculously long extensions, like .PASCAL for Pascal source files,
and .SYSTEM for parts of the low-level OS (that I can remember).

I remember thinking at the time, what a waste of those 15 characters ...

But there were other cool features of the system.

Dave Yeo

2024-08-11 02:09:37 UTC

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

Does Windows no longer have NUL: CON: etc device names?
Dave

Lawrence D'Oliveiro

2024-08-11 03:04:22 UTC

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.

Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

Dave Yeo

2024-08-11 23:16:29 UTC

Post by Lawrence D'Oliveiro

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

In DOS, they're actual device names. console, com ports, the null device
etc. Could initialize a modem by "echo ATZ > com1:" for example, or use
the mode command to set up the port, "mode COM1:24" for 2400 baud, or
redirect stdout to nul: "foo.exe > NUL: 2>error.log".
I guess with Windows NT, those commands wouldn't work.
Dave

Lawrence D'Oliveiro

2024-08-12 00:28:44 UTC

Post by Dave Yeo
In DOS, they're actual device names.

In Windows, on the other hand (quoting from the Google blog post):

Now if just specifying these paths explicitly was all that this
process handled it would be annoying but not the end of the world.
However it’s much worse. The conversion process actively tries to
convert any path with the device name last, even if the path is a
Drive Absolute path. To make matters even worse the device name can
have arbitrary trailing characters as long the trailing characters
are separated from the device by a dot or a colon. The name can
then also have trailing spaces.

...

Why it does the check is beyond me as it seems to serve no actual
purpose. Also note the removal of trailing suffixes, which can come
in handy if something is actively trying to guard against this
behavior. For example, if an application was mindful and was
checking for a filename that matched one of the reserved names you
can just bypass that check by appending an arbitrary suffix.
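
A rough approximation of the behavior described in that quote, as a sketch only: the real Win32-to-NT path conversion has more rules than this. The reserved-name set is the commonly documented one (CON, PRN, AUX, NUL, COM1-9, LPT1-9), and `hits_reserved_name` and the sample paths are made up for the example.

```python
import re

RESERVED = {"CON", "PRN", "AUX", "NUL"} \
    | {f"COM{i}" for i in range(1, 10)} \
    | {f"LPT{i}" for i in range(1, 10)}

def hits_reserved_name(path: str) -> bool:
    """Rough approximation: does the last path component name a DOS device?"""
    last = re.split(r"[\\/]", path)[-1]
    # Anything after the first dot or colon is ignored, as are trailing spaces.
    base = re.split(r"[.:]", last, maxsplit=1)[0].rstrip(" ")
    return base.upper() in RESERVED

print(hits_reserved_name(r"C:\temp\nul.txt"))   # True
print(hits_reserved_name(r"C:\temp\COM1:foo"))  # True
print(hits_reserved_name(r"C:\temp\console"))   # False
```

A guard that compares the whole file name against the reserved list would miss the first two cases, which is the bypass the quote describes.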

Bob Eager

2024-08-12 12:57:56 UTC

Post by Dave Yeo

Post by Lawrence D'Oliveiro

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations
of which (like single-letter device names) continue to afflict
Windows to this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,

Post by Dave Yeo

Post by Lawrence D'Oliveiro
but I have it on good authority that they missed a few.

In DOS, they're actual device names. console, com ports, the null device
etc. Could initialize a modem by "echo ATZ > com1:" for example, or use
the mode command to set up the port, "mode COM1:24" for 2400baud, or
redirect stdout to nul: "foo.exe > NUL: 2>error.log".
I guess with Windows NT, those commands wouldn't work.

Just tried them on Windows 10. NUL works as expected. Copying to COM1
hangs, so probably works too.

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Charlie Gibbs

2024-08-12 22:48:29 UTC

Post by Bob Eager

Post by Dave Yeo

Post by Lawrence D'Oliveiro

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations
of which (like single-letter device names) continue to afflict
Windows to this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

In DOS, they're actual device names. console, com ports, the null device
etc. Could initialize a modem by "echo ATZ > com1:" for example, or use
the mode command to set up the port, "mode COM1:24" for 2400baud, or
redirect stdout to nul: "foo.exe > NUL: 2>error.log".
I guess with Windows NT, those commands wouldn't work.

Just tried them on Windows 10. NUL works as expected. Copying to COM1
hangs, so probably works too.

In true Microsoft style, the trailing colon (e.g. CON:) was optional,
resurrecting the whole reserved-word nightmare that people like to
laugh at COBOL for having. Had the colon been made mandatory, it
would put device names into a totally different name space, and
we wouldn't still be futzing with reserved words.

--
/~\ Charlie Gibbs | We'll go down in history as the
\ / <***@kltpzyxm.invalid> | first society that wouldn't save
X I'm really at ac.dekanfrus | itself because it wasn't cost-
/ \ if you read it the right way. | effective. -- Kurt Vonnegut

Lawrence D'Oliveiro

2024-08-13 01:00:49 UTC

Post by Charlie Gibbs
In true Microsoft style, the trailing colon (e.g. CON:) was optional,
resurrecting the whole reserved-word nightmare that people like to laugh
at COBOL for having. Had the colon been made mandatory, it would put
device names into a totally different name space, and we wouldn't still
be futzing with reserved words.

The mistake was in trying to handle these devices as “reserved” file
names, instead of as device names (with the colon delimiter). The original
DEC systems that inspired CP/M handled this properly with their
multicharacter device names, but somewhere along the way to MS-DOS, of
simplifying device names down to one letter, then realizing that this
wasn’t sufficient for certain other devices, the original idea got lost,
and had to be reinvented ... badly.

Scott Lurndal

2024-08-13 13:17:14 UTC

Post by Charlie Gibbs

Post by Bob Eager

Post by Dave Yeo

Post by Dave Yeo

And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations
of which (like single-letter device names) continue to afflict
Windows to this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

In DOS, they're actual device names. console, com ports, the null device
etc. Could initialize a modem by "echo ATZ > com1:" for example, or use
the mode command to set up the port, "mode COM1:24" for 2400baud, or
redirect stdout to nul: "foo.exe > NUL: 2>error.log".
I guess with Windows NT, those commands wouldn't work.

Just tried them on Windows 10. NUL works as expected. Copying to COM1
hangs, so probably works too.

In true Microsoft style, the trailing colon (e.g. CON:) was optional,
resurrecting the whole reserved-word nightmare that people like to
laugh at COBOL for having. Had the colon been made mandatory, it
would put device names into a totally different name space, and
we wouldn't still be futzing with reserved words.

Ah, the colon. One of the few characters disallowed in filenames
on windows filesystems (e.g. FATx). Very annoying.

Lawrence D'Oliveiro

2024-08-13 22:03:52 UTC

Post by Scott Lurndal
Ah, the colon. One of the few characters disallowed in filenames
on windows filesystems (e.g. FATx). Very annoying.

On *nix systems, a colon is commonly used by network-oriented commands
like scp and rsync to indicate a filespec on a remote machine, e.g.

scp other-machine:remote-file local-file

So I get into the habit of using “∶” instead of “:” if I want a colon in
my filenames. Similarly I use “∕” instead of “/” so it doesn’t get taken
for a path separator.
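
As a rough sketch of that habit in Python: the two replacement characters are U+2236 (RATIO) and U+2215 (DIVISION SLASH). The helper name safe_name is made up for this example.

```python
# Map the ASCII colon and slash to their Unicode look-alikes:
# U+2236 RATIO and U+2215 DIVISION SLASH.
LOOKALIKES = str.maketrans({":": "\u2236", "/": "\u2215"})


def safe_name(title: str) -> str:
    """Return the title with ':' and '/' swapped for look-alike characters."""
    return title.translate(LOOKALIKES)
```

The result contains neither an ASCII colon nor an ASCII slash, so tools such as scp and the kernel's path parsing treat it as an ordinary filename.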

Pip R.

2024-08-13 23:39:41 UTC

Post by Lawrence D'Oliveiro
So I get into the habit of using “∶” instead of “:” if I want a colon in
my filenames. Similarly I use “∕” instead of “/” so it doesn’t get taken
for a path separator.

*squints*

*rubs eyes*

*squints*

https://www.compart.com/en/unicode/U+2236
https://www.compart.com/en/unicode/U+2215

Unicode is wild, man.

--
Pip R. <***@plixels.net>

"Life is like a good pour-over--find the perfect balance."

Scott Lurndal

2024-08-14 02:12:55 UTC

Post by Pip R.

So I get into the habit of using ââ¶â instead of â:â if I want a colon in
my filenames. Similarly I use âââ instead of â/â so it doesnât get taken
for a path separator.

*squints*
*rubs eyes*
*squints*
https://www.compart.com/en/unicode/U+2236
https://www.compart.com/en/unicode/U+2215
Unicode is wild, man.

No, it's unreadable, as you point out.

Don Poitras

2024-08-14 12:42:00 UTC

Post by Scott Lurndal

Post by Pip R.

So I get into the habit of using “∶” instead of “:” if I want a colon in
my filenames. Similarly I use “∕” instead of “/” so it doesn’t get taken
for a path separator.

*squints*
*rubs eyes*
*squints*
https://www.compart.com/en/unicode/U+2236
https://www.compart.com/en/unicode/U+2215
Unicode is wild, man.

No, it's unreadable, as you point out.

Quite readable. As his post had set 'Content-Type: text/plain; charset=UTF-8; format=flowed'
Yours however, does not.
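
For anyone wondering what happens when the charset is not honoured, here is a small Python sketch. It assumes the reading side treats the bytes as ISO 8859-1; that is an assumption made for the demonstration, not a statement about any particular newsreader.

```python
def misdecode(text: str, wrong: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled: str, wrong: str = "latin-1") -> str:
    """Undo misdecode(), provided no bytes were lost along the way."""
    return garbled.encode(wrong).decode("utf-8")


original = "\u201c\u2236\u201d"   # typographic quotes around U+2236
garbled = misdecode(original)     # three characters become nine
restored = repair(garbled)        # the round trip gets the text back
```

Each of the three characters is three bytes in UTF-8, and every one of those bytes is shown as a separate Latin-1 character, several of them invisible control codes.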

--
Don Poitras

Anssi Saari

2024-08-14 14:05:01 UTC

Post by Don Poitras
Quite readable. As his post had set 'Content-Type: text/plain; charset=UTF-8; format=flowed'
Yours however, does not.

There are a couple of ancient xrn diehards around, Scott is one of them
so Usenet is ASCII only for him.

Nuno Silva

2024-08-14 17:03:04 UTC

Post by Anssi Saari

Post by Don Poitras
Quite readable. As his post had set 'Content-Type: text/plain;
charset=UTF-8; format=flowed'
Yours however, does not.

There are a couple of ancient xrn diehards around, Scott is one of them
so Usenet is ASCII only for him.

Even with UCS support, that doesn't make it more readable. While I do
see a square box because this font doesn't support the other "colon"
char, there will be plenty of cases where, even with the glyph being
rendered, it's outright unreadable, and/or hard to tell apart (kind of a
feature in this colon case, I suppose :-) ).

Recently, I came across the story of how at least one low-resolution
Japanese emoji glyph was probably copied to UCS [1], leading to an
incorrect rendering: U+1F51E "No one under 18 sign" is frequently drawn
as "no 18". That's not just some font doing it in a less good way,
that's, AFAIK, how it's included in UCS [2].

In general, when stuff like smileys, say ":-)" is condensed in a single
glyph, that tends to make things progressively more difficult to
read.

Proportional fonts and colorized glyphs may make this less of a problem,
though, and interfacing the Internet with monospaced fonts (well,
sometimes), and from monospaced monochrome terminal emulators and a
terminal without unicode support are all personal choices of mine...

[1] https://mastodon.social/@CharlotteBuff/112806149720738139 (requires
javascript; alternative (and properly threaded) view at
https://threadtree.xyz/112803632014992740#tweet-112806149765530842 )

[2] http://unicode.org/L2/L2010/10046-02n4123_fpdam8-all.pdf (p83)
(If someone knows more about this and there is e.g. some other
UCS source for the recommended glyph design, I'd be interested
in learning about that!)

--
Nuno Silva (njsg)

Scott Lurndal

2024-08-14 17:51:30 UTC

Post by Nuno Silva

Post by Anssi Saari

Post by Don Poitras
Quite readable. As his post had set 'Content-Type: text/plain;
charset=UTF-8; format=flowed'
Yours however, does not.

There are a couple of ancient xrn diehards around, Scott is one of them
so Usenet is ASCII only for him.

Even with UCS support, that doesn't make it more readable. While I do
see a square box because this font doesn't support the other "colon"
char, there will be plenty of cases where, even with the glyph being
rendered, it's outright unreadable, and/or hard to tell apart (kind of a
feature in this colon case, I suppose :-) ).

Indeed. Xrn supports all the standard X11
font files, such as the ISO 8859-1 et alia fonts so it is
more than just ASCII.

I find the unnecessary use of UCS soi disant smart
quotes egregiously annoying, even when a client does
display them correctly.

Mike Spencer

2024-08-15 21:04:22 UTC

Post by Scott Lurndal
I find the unnecessary use of UCS soi disant smart
quotes egregiously annoying, even when a client does
display them correctly.

Yes, total PITA. Rendered in many mail and news readers as octal bytes
or irrelevant glyphs from whatever charset but not as quotes.
Ellipses and other punctuation similar.

The whole charset/UTS/whatever thing itself a total PITA for anyone
who works exclusively in English and doesn't have to prepare
publication- or lawcourt-ready documents.

The most egregious example I hit recently is that newest instantiation
of xpdf (popular Linux PDF reader). If the author of a PDF document
has used ff ligatures, the xpdf reader is unable to search and find
(say) "Kauffman" because no one types "ff-ligature" into a search
pane and the reader fails to cope intelligently with the situation.

--
Mike Spencer Nova Scotia, Canada

Scott Lurndal

2024-08-15 22:59:46 UTC

Post by Mike Spencer

Post by Scott Lurndal
I find the unnecessary use of UCS soi disant smart
quotes egregiously annoying, even when a client does
display them correctly.

Yes, total PITA. Rendered in many mail and news readers as octal bytes
or irrelevant glyphs from whatever charset but not as quotes.
Ellipses and other punctuation similar.
The whole charset/UTS/whatever thing itself a total PITA for anyone
who works exclusively in English and doesn't have to prepare
publication- or lawcourt-ready documents.
The most egregious example I hit recently is that newest instantiation
of xpdf (popular Linux PDF reader). If the author of a PDF document
has used ff ligatures, the xpdf reader is unable to search and find
(say) "Kauffman" because no one types "ff-ligature" into a search
pane and the reader fails to cope intelligently with the situation.

yes, that's egregious. It's also, hopefully, rare.

I'm also a fan of xpdf - far superior UI (and more secure) than acrobat reader.

Lawrence D'Oliveiro

2024-08-16 00:21:42 UTC

Post by Scott Lurndal

Post by Mike Spencer
The most egregious example I hit recently is that newest instantiation
of xpdf (popular Linux PDF reader). If the author of a PDF document has
used ff ligatures, the xpdf reader is unable to search and find (say)
"Kauffman" because no one types "ff-ligature" into a search pane and
the reader fails to cope intelligently with the situation.

yes, that's egregious. It's also, hopefully, rare.

That’s an example of why such ligatures are a bad idea. Which is why we
have modern OpenType-based text layout engines that perform such glyph
substitution at rendering time, not by changes to the original text.
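
When the ligature is already baked into the stored text, a search can still cope by normalizing both sides first. A minimal Python sketch, assuming the extracted text contains the precomposed ligature U+FB00; it says nothing about how xpdf itself searches, and the helper names are made up.

```python
import unicodedata


def fold(text: str) -> str:
    """Normalize for searching: expand compatibility characters, ignore case."""
    # NFKC replaces compatibility characters such as U+FB00 (the ff ligature)
    # with their plain equivalents, here the two letters "ff".
    return unicodedata.normalize("NFKC", text).casefold()


def contains(haystack: str, needle: str) -> bool:
    """Substring search that is not thrown off by ligatures or letter case."""
    return fold(needle) in fold(haystack)
```

A plain substring test for "Kauffman" fails on text that stores the name with the single ligature character, while contains() finds it.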

Chris Ahlstrom

2024-08-16 11:45:46 UTC

Post by Scott Lurndal

Post by Mike Spencer

Post by Scott Lurndal
I find the unnecessary use of UCS soi disant smart
quotes egregiously annoying, even when a client does
display them correctly.

Yes, total PITA. Rendered in many mail and news readers as octal bytes
or irrelevant glyphs from whatever charset but not as quotes.
Ellipses and other punctuation similar.
The whole charset/UTS/whatever thing itself a total PITA for anyone
who works exclusively in English and doesn't have to prepare
publication- or lawcourt-ready documents.
The most egregious example I hit recently is that newest instantiation
of xpdf (popular Linux PDF reader). If the author of a PDF document
has used ff ligatures, the xpdf reader is unable to search and find
(say) "Kauffman" because no one types "ff-ligature" into a search
pane and the reader fails to cope intelligently with the situation.

yes, that's egregious. It's also, hopefully, rare.
I'm also a fan of xpdf - far superior UI (and more secure) than acrobat reader.

I use zathura. Haven't used xpdf in a decade or so, but it does look cool
these days.

--
Truth is the most valuable thing we have -- so let us economize it.
-- Mark Twain

Lawrence D'Oliveiro

2024-08-16 00:20:01 UTC

The whole charset/UTS/whatever thing itself a total PITA for anyone who
works exclusively in English and doesn't have to prepare publication- or
lawcourt-ready documents.

That’s a very naïve attitude to take vis-à-vis the fact that the
pervasiveness of the Internet has made international coöperation pretty
much routine, indeed obligatory, nowadays.

j***@carcosa.net

2024-08-16 17:26:41 UTC

Post by Lawrence D'Oliveiro

The whole charset/UTS/whatever thing itself a total PITA for anyone who
works exclusively in English and doesn't have to prepare publication- or
lawcourt-ready documents.

That’s a very naïve attitude to take vis-à-vis the fact that the
pervasiveness of the Internet has made international coöperation pretty
much routine, indeed obligatory, nowadays.

Honestly, the situation now with Unicode is a lot better than it was in
the early-mid 90s, when you would run into different national codepages
without any kind of tagging on the regular. At minimum CP437, CP850,
Mac-Roman, ISO 8859-1, ISO 8859-2, Windows-1252, Shift JIS, and
KOI8-R. If you were reasonably lucky, your Usenet client could handle
any of them, but you often had to guess what was in use in any
particular message based on contextual clues and switch manually.

A lot of the hate for "smart quotes" comes from Windows-1252 text being
sent claiming to be ISO 8859-1.

Today, almost everything is UTF-8, or less commonly UTF-16. I might
sometimes use a client that doesn't support UTF-8 or supports it very
minimally, but if so, I'm intentionally doing retrocomputing or
minimalist computing, and I know what I'm getting into.

--
Jason McBrayer | “Strange is the night where black stars rise,
***@carcosa.net | and strange moons circle through the skies,
| but stranger still is lost Carcosa.”
| ― Robert W. Chambers,The King in Yellow

Ahem A Rivet's Shot

2024-08-16 17:58:00 UTC

On Fri, 16 Aug 2024 13:26:41 -0400

Post by j***@carcosa.net
A lot of the hate for "smart quotes" comes from Windows-1252 text being
sent claiming to be ISO 8859-1.

At a PPOE I created code to spot this happening and switch the
decoding.
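
The code itself isn't shown, so the following Python is only a guess at the kind of heuristic that works, not a reconstruction of it. It relies on bytes 0x80-0x9F being C1 control codes in ISO 8859-1, which real text almost never contains, while Windows-1252 puts the typographic quotes and dashes there. The function name is made up.

```python
def decode_latin1_or_cp1252(raw: bytes) -> str:
    """Decode text labelled ISO 8859-1, switching to Windows-1252 if it fits."""
    # Any byte in 0x80-0x9F suggests the label is wrong and the sender
    # really used Windows-1252.
    if any(0x80 <= b <= 0x9F for b in raw):
        try:
            return raw.decode("cp1252")
        except UnicodeDecodeError:
            # A handful of bytes in that range are undefined in Python's
            # cp1252 codec; fall back to the declared charset.
            pass
    return raw.decode("latin-1")
```

Text that really is ISO 8859-1 decodes the same either way, because the two charsets agree everywhere outside that 32-byte range.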

--
Steve O'Hara-Smith
Odds and Ends at http://www.sohara.org/
For forms of government let fools contest
Whate're is best administered is best - Alexander Pope

Lawrence D'Oliveiro

2024-08-16 23:45:13 UTC

A lot of the hate for "smart quotes" ...

I don’t call them “smart quotes”. I call them “paired quotes” or
“typographic quotes”. Professional typographers wince at the unpaired,
typewriter-style quotes that computers have been foisting on us all for
those first few decades. There’s no reason to put up with them any more.

Today, almost everything is UTF-8, or less commonly UTF-16.

UTF-16 is a legacy of certain platforms (*cough* Windows NT, Java *cough*)
embracing Unicode at just the wrong time.

Peter Flass

2024-08-17 23:04:31 UTC

Post by Lawrence D'Oliveiro

A lot of the hate for "smart quotes" ...

I don’t call them “smart quotes”. I call them “paired quotes” or
“typographic quotes”. Professional typographers wince at the unpaired,
typewriter-style quotes that computers have been foisting on us all for
those first few decades. There’s no reason to put up with them any more.

Unless you’re writing code. What else is there?

Post by Lawrence D'Oliveiro

Today, almost everything is UTF-8, or less commonly UTF-16.

UTF-16 is a legacy of certain platforms (*cough* Windows NT, Java *cough*)
embracing Unicode at just the wrong time.

--
Pete

Lawrence D'Oliveiro

2024-08-18 00:24:42 UTC

Post by Peter Flass

Post by Lawrence D'Oliveiro

A lot of the hate for "smart quotes" ...

I don’t call them “smart quotes”. I call them “paired quotes” or
“typographic quotes”. Professional typographers wince at the unpaired,
typewriter-style quotes that computers have been foisting on us all for
those first few decades. There’s no reason to put up with them any more.

Unless you’re writing code.

I use them in code, too.

Scott Lurndal

2024-08-14 02:12:27 UTC

Post by Lawrence D'Oliveiro

Post by Scott Lurndal
Ah, the colon. One of the few characters disallowed in filenames
on windows filesystems (e.g. FATx). Very annoying.

On *nix systems, a colon is commonly used by network-oriented commands
like scp and rsync to indicate a filespec on a remote machine, e.g.

The colon has no defined meaning in command arguments, other than any
semantics applied by an individual command or a shell. ssh(1) and scp(1) have
a specific interpretation that applies to the arguments provided them.

For other applications, git(1) for instance, the colon has completely
different meaning.

Scott Lurndal

2024-08-14 14:09:53 UTC

Post by Scott Lurndal

Post by Lawrence D'Oliveiro

Post by Scott Lurndal
Ah, the colon. One of the few characters disallowed in filenames
on windows filesystems (e.g. FATx). Very annoying.

On *nix systems, a colon is commonly used by network-oriented commands
like scp and rsync to indicate a filespec on a remote machine, e.g.

The colon has no defined meaning in command arguments, other than any
semantics applied by an individual command or a shell. ssh(1) and scp(1) have
a specific interpretation that applies to the arguments provided them.
For other applications, git(1) for instance, the colon has completely
different meaning.

And, of course, the colon is also used to demarc shell labels, and as
a null command in POSIX shells.

Peter Flass

2024-08-12 19:18:02 UTC

Post by Lawrence D'Oliveiro

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

What a complete stew of nonsense!

--
Pete

Charlie Gibbs

2024-08-13 00:22:19 UTC

Post by Peter Flass

Post by Lawrence D'Oliveiro

Post by Dave Yeo

Post by Lawrence D'Oliveiro
And then Tim Paterson copied CP/M in his “QDOS” for x86, which
Microsoft bought and made the basis of their MS-DOS, the limitations of
which (like single-letter device names) continue to afflict Windows to
this day.

Does Windows no longer have NUL: CON: etc device names?

Those are “reserved” file names, distinct from drive letters in the
DOS/Windows filename syntax.
Those “reserved” names are their own complete can of worms. Google
tried to enumerate all the possible ways they are interpreted
<http://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html>,
but I have it on good authority that they missed a few.

What a complete stew of nonsense!

It was certainly enough to make my eyes glaze over, and it
re-inforces my "Bill Gates is a Martian" theory, to wit:

The explosive technological progress on Earth in the 1960s
and '70s was observed by the Martians with increasing alarm.
We made it to the moon; before long we'd be invading their
planet. So they sent one of their own to Earth, who founded
a company called Microsoft. And then nobody went to the Moon
anymore; we were all too busy re-booting, re-formatting, and
re-installing Windows. The threat was averted.

But speaking of paths, my pet peeve is the way that when
you execute another program in the current directory (i.e.
CreateProcess() with lpCurrentDirectory set to NULL), there is a
small but finite chance that the child program will not inherit
the calling program's current directory, but will have its
current directory set to a random location. The odds are small,
but even a 0.01% chance means that for a program that's run daily
in 1000 locations, you're going to get anguished calls every
week or two. I observed it personally several times on a test
machine at the office; it was obvious because my programs look
for files in the current directory and if they fail will write
an error message to a log file in the current directory.
I was finding bits and pieces of log files all over the disk.
I first noticed this problem in Windows 95, and I'm pretty sure
I saw it in the NT family as well. I don't know whether it still
happens, but I've been afraid to remove the hack I put into all
my programs that jams the current directory to the one from which
the program loaded. (We normally keep our executables in the
directory containing the data they process; it would be nice if
we could put them into a "bin" directory somewhere on the path,
but c'est la guerre.)
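
The hack described here boils down to "find out where the program was loaded from and change to that directory before doing anything else". A sketch of the idea in Python, purely for illustration: the original programs are presumably native Win32 code, and the function names below are made up.

```python
import os
import sys


def program_dir() -> str:
    """Directory containing the script or executable that was launched."""
    return os.path.dirname(os.path.abspath(sys.argv[0]))


def pin_current_directory() -> str:
    """Force the working directory to the program's own directory."""
    target = program_dir()
    os.chdir(target)
    return target
```

Calling pin_current_directory() first thing at startup means relative paths for data and log files no longer depend on whatever current directory the parent process handed over.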

--
/~\ Charlie Gibbs | We'll go down in history as the
\ / <***@kltpzyxm.invalid> | first society that wouldn't save
X I'm really at ac.dekanfrus | itself because it wasn't cost-
/ \ if you read it the right way. | effective. -- Kurt Vonnegut

John Ames

2024-08-13 15:23:26 UTC

On Tue, 13 Aug 2024 00:22:19 GMT

Post by Charlie Gibbs
But speaking of paths, my pet peeve is the way that when
you execute another program in the current directory (i.e.
CreateProcess() with lpCurrentDirectory set to NULL), there is a
small but finite chance that the child program will not inherit
the calling program's current directory, but will have its
current directory set to a random location. [...] I don't know
whether it still happens [...]

It definitely still does; I've run into this on the job.

Trying to access files on an SMB share programmatically is another
amazingly non-deterministic problem in this category. Even apart from
"security" & permissions issues in newer versions of Windows (which
make up ~20% of our weekly call volume at my workplace), it can
make a difference whether you're running a GUI or command-line program
(not even a DOS program, but Win32 command-line), or whether it's a mapped
drive letter or a straight UNC path...ye *gods.*

(Win10 also does a thing where it'll randomly convert desktop shortcuts
on mapped drives to UNC path, for extra surprise fun, but that has more
to do with "helpful" Explorer behavior than NT kernel stuff, I'm sure.)

Charlie Gibbs

2024-08-13 17:17:32 UTC

Post by John Ames
On Tue, 13 Aug 2024 00:22:19 GMT

Post by Charlie Gibbs
But speaking of paths, my pet peeve is the way that when
you execute another program in the current directory (i.e.
CreateProcess() with lpCurrentDirectory set to NULL), there is a
small but finite chance that the child program will not inherit
the calling program's current directory, but will have its
current directory set to a random location. [...] I don't know
whether it still happens [...]

It definitely still does; I've run into this on the job.

Thanks for that. That's the first time I've heard of it
happening to anyone else. I've searched the web extensively,
and never came up with anything (although I might have just
not found the magical incantation).

Post by John Ames
Trying to access files on an SMB share programmatically is another
amazingly non-deterministic problem in this category. Even apart from
"security" & permissions issues in newer versions of Windows (which
make up ~20% of our weekly call volume at my workplace), it can
make a difference whether you're running a GUI or command-line program
(not even a DOS program, but Win32 command-line), or whether it's a mapped
drive letter or a straight UNC path...ye *gods.*

I think "non-deterministic" is a pretty good description of Windows
in general. I've always thought of Windows quality criteria as
"Sort of works, most of the time."

Post by John Ames
(Win10 also does a thing where it'll randomly convert desktop shortcuts
on mapped drives to UNC path, for extra surprise fun, but that has more
to do with "helpful" Explorer behavior than NT kernel stuff, I'm sure.)

Ah yes, their famous "helpful" behaviour - like when I discovered that
if you create a listbox but leave it empty, Windows will "helpfully"
disable subclassed keyboard input so your shortcuts don't work. Grrr...

--
/~\ Charlie Gibbs | We'll go down in history as the
\ / <***@kltpzyxm.invalid> | first society that wouldn't save
X I'm really at ac.dekanfrus | itself because it wasn't cost-
/ \ if you read it the right way. | effective. -- Kurt Vonnegut

Lawrence D'Oliveiro

2024-08-13 22:06:49 UTC

... it can make a
difference whether you're running a GUI or command-line program (not
even a DOS program, but Win32 command-line,) whether it's a mapped drive
letter or a straight UNC path...ye *gods.*

Did you try a mount point as well?

Just that, every time I point out Microsoft’s “26 drive letters ought to
be enough for anybody” philosophy, somebody tries to claim that Windows
does *nix-style mount points as well.

(Win10 also does a thing where it'll randomly convert desktop shortcuts
on mapped drives to UNC path, for extra surprise fun, but that has more
to do with "helpful" Explorer behavior than NT kernel stuff, I'm sure.)

I don’t think there is a clear distinction between the “NT kernel” and the
rest of the parts of Windows any more. That’s why they had to implement
WSL the way they did.
