Discussion:
Mem86+ and floppy drives
Kerr-Mudd, John
2024-05-22 15:46:29 UTC
Note: xposted to afc

On Wed, 22 May 2024 09:23:02 -0400
Windows 7 supports GPT partitioning, which removes the 32-bit MBR limitation
when defining storage. If you were using MSDOS partitioning on Windows 7, that
has a 2.2TB limit. When you buy a big drive, you use GPT so that all of
the disk can be used without a problem.
Is it possible to have both types of hard drive
partitioning on your system, GPT drives and MBR drives and your modern
BIOS would accommodate both and Windows Explorer would just 'see' both
drives without complaint?
Yes, absolutely.
It's nice when stuff works :-) That's for sure.
*******
The next area of interest, coming up, is that NTFS only has four billion clusters.
2^64 clusters − 1 cluster ( theoretical on-disk format limit; the implementation caps at 2^32 − 1 clusters, hence "four billion" )
256 TB size − 64 KB clusters ( early Windows 10 or less )
8 PB − 2 MB clusters ( late Windows 10 had more cluster sizes added, NotBackwardCompatible )
( if you show this partition to Win7, it "offers to format it", out of spite )
From this, we might conclude
16 TB size - C: drive Windows 10 install wants only 4 KB clusters (enables compression)
- It's possible Win7 might still accept 64 KB clusters for a Windows Install.
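A back-of-envelope sketch of where those ceilings come from (a minimal Python check, not anything from the original post): the volume ceiling is just the maximum cluster count times the cluster size, and the old MSDOS/MBR ceiling is 32-bit sector addressing times 512-byte sectors.

MBR_LIMIT = 2**32 * 512           # 32-bit LBA x 512-byte sectors ~= 2.2 TB
MAX_CLUSTERS = 2**32 - 1          # NTFS implementation limit (the on-disk format allows 2**64 - 1)

def max_ntfs_volume(cluster_bytes):
    """Largest NTFS volume a given cluster size allows, in bytes."""
    return MAX_CLUSTERS * cluster_bytes

for kib in (4, 64, 2048):         # 4 KB, 64 KB and 2 MB clusters
    print(f"{kib:5d} KB clusters -> {max_ntfs_volume(kib * 1024) / 2**40:8,.0f} TiB max volume")

print(f"MSDOS/MBR partitioning tops out at {MBR_LIMIT / 1e12:.1f} TB")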
As a test of VirtualBox, I tried to create a large disk using a .vdi container.
This is the best I could do. In the Properties pie chart area for F: ,
File Explorer was unable to work out how much free space there was,
whereas good old Command Prompt worked it out.
F:\>dir
Volume in drive F is BIG
Volume Serial Number is EE25-2764
Directory of F:\
05/22/2024 07:21 AM 82,110 test.txt
1 File(s) 82,110 bytes
0 Dir(s) 562,949,287,706,624 bytes free
F:\>
We may be approaching a point, where some things "act up again",
just after the "trauma of 2.2TB" had passed :-)
Anyway, this little investigation makes me wonder what the OS
did with my backup drive. I had a few problems with partitioning it,
and may have stepped into this issue, without recognizing what
was going on. The OS is pretty crafty, and if you offer a large
volume (in modern Win10), it just restricts your cluster size
choices and does not offer any info you might benefit from,
for planning purposes. The above 512TB volume is using 128KB clusters,
as a guess as to how that worked out.
Size: 82,110 bytes
Size on disk: 131,072 bytes <== 128KB cluster, can't work with Win7
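The "Size on disk" figure is just the file size rounded up to whole clusters, which is how the 128KB cluster size can be read straight off the Properties numbers; a minimal sketch (the cluster size is assumed from the figures above):

import math

file_size = 82_110
cluster = 128 * 1024                              # 128 KB clusters, per the guess above

size_on_disk = math.ceil(file_size / cluster) * cluster
print(size_on_disk)                               # 131072 -- exactly one cluster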
Paul
WIWAL, as a "HelpDesk Consultant", some dept wanted to store a huge number
of, I dunno, some things. The trouble was that they ran out of space much faster
than they imagined, as every file on the Big Disk was, say, c. 1kB of
text, but the cluster size was (again, making it up) c. 16kB.
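A quick sketch of why the space vanished so fast (the file count is invented; only the c. 1kB / c. 16kB ratio comes from the anecdote): every file burns at least one whole cluster, so most of the disk goes to slack.

import math

n_files, file_size, cluster = 1_000_000, 1024, 16 * 1024

used = n_files * math.ceil(file_size / cluster) * cluster   # space the files actually consume
data = n_files * file_size                                  # payload actually stored
print(f"{used / 2**30:.1f} GiB on disk for {data / 2**30:.1f} GiB of text "
      f"({100 * (1 - data / used):.0f}% slack)")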
--
Bah, and indeed Humbug.
Paul
2024-05-22 17:27:14 UTC
Post by Kerr-Mudd, John
Note: xposted to afc
On Wed, 22 May 2024 09:23:02 -0400
Windows 7 supports GPT partitioning, which removes the 32-bit MBR limitation
when defining storage. If you were using MSDOS partitioning on Windows 7, that
has a 2.2TB limit. When you buy a big drive, you use GPT so that all of
the disk can be used without a problem.
Is it possible to have both types of hard drive
partitioning on your system, GPT drives and MBR drives and your modern
BIOS would accommodate both and Windows Explorer would just 'see' both
drives without complaint?
Yes, absolutely.
It's nice when stuff works :-) That's for sure.
*******
The next area of interest, coming up, is that NTFS only has four billion clusters.
2^64 clusters − 1 cluster ( theoretical on-disk format limit; the implementation caps at 2^32 − 1 clusters, hence "four billion" )
256 TB size − 64 KB clusters ( early Windows 10 or less )
8 PB − 2 MB clusters ( late Windows 10 had more cluster sizes added, NotBackwardCompatible )
( if you show this partition to Win7, it "offers to format it", out of spite )
From this, we might conclude
16 TB size - C: drive Windows 10 install wants only 4 KB clusters (enables compression)
- It's possible Win7 might still accept 64 KB clusters for a Windows Install.
As a test of VirtualBox, I tried to create a large disk using a .vdi container.
This is the best I could do. In the Properties pie chart area for F: ,
File Explorer was unable to work out how much free space there was,
whereas good old Command Prompt worked it out.
F:\>dir
Volume in drive F is BIG
Volume Serial Number is EE25-2764
Directory of F:\
05/22/2024 07:21 AM 82,110 test.txt
1 File(s) 82,110 bytes
0 Dir(s) 562,949,287,706,624 bytes free
F:\>
We may be approaching a point, where some things "act up again",
just after the "trauma of 2.2TB" had passed :-)
Anyway, this little investigation makes me wonder what the OS
did with my backup drive. I had a few problems with partitioning it,
and may have stepped into this issue, without recognizing what
was going on. The OS is pretty crafty, and if you offer a large
volume (in modern Win10), it just restricts your cluster size
choices and does not offer any info you might benefit from,
for planning purposes. The above 512TB volume is using 128KB clusters,
as a guess as to how that worked out.
Size: 82,110 bytes
Size on disk: 131,072 bytes <== 128KB cluster, can't work with Win7
Paul
WIWAL, as a "HelpDesk Consultant", some dept wanted to store a huge number
of, I dunno, some things. The trouble was that they ran out of space much faster
than they imagined, as every file on the Big Disk was, say, c. 1kB of
text, but the cluster size was (again, making it up) c. 16kB.
Sounds like "someone didn't budget for enough disk drives" :-)

Disk drives are cheaper than consultants.

For me, this would be a research project. XFS-on-Dokan ? NTFS new compression ?
Storing <700 byte files in the $MFT 1KB slot ? I've had four million files
in a partition, where they were all stored in the $MFT. No clusters needed.

This sounds like a "somebody elses problem" type problem :-)

Stick a bunch of them in a .cab file, and use the .cab integration ?
(Haven't tested that. Don't particularly like .cab) Maybe the Federated
Search can look inside the cab, and then the Windows.edb file blows out.

I think Windows also has .tar integration now, if you want to test that.
Stick your small files in a tar file. Have the Federated Search index it.

So much untested stuff.
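One way to try the pack-them-into-one-container idea without any Windows integration at all is plain tarfile from Python; a minimal sketch (the directory name is made up):

import tarfile
from pathlib import Path

# Pack every small .txt under small_files/ into a single archive, so the
# filesystem sees one large file instead of millions of tiny ones.
with tarfile.open("small_files.tar", "w") as tar:
    for path in Path("small_files").rglob("*.txt"):
        tar.add(path, arcname=path.relative_to("small_files"))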

But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.

Paul
Ahem A Rivet's Shot
2024-05-22 18:52:22 UTC
On Wed, 22 May 2024 13:27:14 -0400
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
A battery backed RAM based FS could easily be optimised for small
files.
--
Steve O'Hara-Smith
Odds and Ends at http://www.sohara.org/
For forms of government let fools contest
Whate're is best administered is best - Alexander Pope
Paul
2024-05-22 21:52:48 UTC
Post by Ahem A Rivet's Shot
On Wed, 22 May 2024 13:27:14 -0400
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
A battery backed RAM based FS could easily be optimised for small
files.
It's the file system stack that's the issue.

This is what I use for storage testing. HDTune really isn't
meant for testing a thing like this, because the program
came from the HDD era. This is drive D: . Naturally, there
are PCIe Rev5 NVMe faster than this.

[Picture]


If I pick a bench from the collection -

4:53:30 4:58:09 Windows Defender running, writeidx3.exe 1048576 1048576 / 279 = 3758 files/sec

D:\TEMP

4.00 GB (4,294,967,296 bytes) 1048576 files, 4096 zeros each
1,048,576 Files, 69,906 Folders

delete rate via shift-delete, 6700 items per second

5:04:30 5:06:05 WD off, writeidx3.exe 1048576 1048576 / 95 = 11038 files/sec

Last files written: D:\TEMP\0\0\0\F\F\F\F\000FFFF0.txt .. 000FFFFF.txt

File creation would go slower, if they were all in the same directory (flat test).
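For flavour, a hedged Python reconstruction of that kind of test -- writeidx3.exe itself isn't shown in the thread, so the fan-out layout here is only a guess based on the paths above:

import os, time

def write_fanout(root, n_files, payload=b"\0" * 4096):
    """Spread n_files small files across nested hex directories,
    roughly 256 files per leaf, and report the creation rate."""
    t0 = time.time()
    for i in range(n_files):
        name = f"{i:08X}"                        # e.g. 000FFFF0
        leaf = os.path.join(root, *name[:-2])    # one directory level per leading hex digit
        os.makedirs(leaf, exist_ok=True)
        with open(os.path.join(leaf, name + ".txt"), "wb") as f:
            f.write(payload)
    elapsed = time.time() - t0
    print(f"{n_files} files in {elapsed:.0f}s = {n_files / elapsed:.0f} files/sec")

write_fanout(r"D:\TEMP", 1_048_576)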

On TMPFS on Linux, I've hit 186,000 created files a second,
for comparison. The only problem with TMPFS, is the performance
does not stay that fast forever. It's only good after a fresh boot.
But if you need to bench to make impressive figures, that is the
better way to do it. And that's not a recent TMPFS test, so it's
not really fair to Linux. The other result, is fresh off W11.

Paul
Kerr-Mudd, John
2024-05-22 20:16:39 UTC
On Wed, 22 May 2024 13:27:14 -0400
Post by Paul
Post by Kerr-Mudd, John
Note: xposted to afc
On Wed, 22 May 2024 09:23:02 -0400
Windows 7 supports GPT partitioning, which removes the 32-bit MBR limitation
when defining storage. If you were using MSDOS partitioning on Windows 7, that
has a 2.2TB limit. When you buy a big drive, you use GPT so that all of
[]
Post by Paul
Post by Kerr-Mudd, John
WIWAL, as a "HelpDesk Consultant", some dept wanted to store a huge number
of, I dunno, some things. The trouble was that they ran out of space much faster
than they imagined, as every file on the Big Disk was, say, c. 1kB of
text, but the cluster size was (again, making it up) c. 16kB.
Sounds like "someone didn't budget for enough disk drives" :-)
Disk drives are cheaper than consultants.
This was way back - When I Was A Lad - the Big Disk was a big spend.
Around 1986?
Post by Paul
For me, this would be a research project. XFS-on-Dokan ? NTFS new compression ?
Storing <700 byte files in the $MFT 1KB slot ? I've had four million files
in a partition, where they were all stored in the $MFT. No clusters needed.
This sounds like a "somebody elses problem" type problem :-)
Stick a bunch of them in a .cab file, and use the .cab integration ?
(Haven't tested that. Don't particularly like .cab) Maybe the Federated
Search can look inside the cab, and then the Windows.edb file blows out.
pre-cab
Post by Paul
I think Windows also has .tar integration now, if you want to test that.
pre-Windows also I think.
Post by Paul
Stick your small files in a tar file. Have the Federated Search index it.
They had bought a Program that saved everything into these small individual
files - horrible, I know.
Post by Paul
So much untested stuff.
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
Paul
--
Bah, and indeed Humbug.
Lawrence D'Oliveiro
2024-05-22 22:24:47 UTC
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
Linux copes with them quite nicely, as long as you don’t try to use some
GUI tool to view the resulting directory. In the early days of online
video, I did a system for a client which would split the raw camera
footage into individual JPEG frames, from which bespoke movies could be
automatically generated for customers according to a template. You could
easily end up with 100,000 video-frame files in a single directory.

Another aspect in which NTFS is showing its age?
Paul
2024-05-23 02:18:39 UTC
Post by Lawrence D'Oliveiro
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
Linux copes with them quite nicely, as long as you don’t try to use some
GUI tool to view the resulting directory. In the early days of online
video, I did a system for a client which would split the raw camera
footage into individual JPEG frames, from which bespoke movies could be
automatically generated for customers according to a template. You could
easily end up with 100,000 video-frame files in a single directory.
Another aspect in which NTFS is showing its age?
They did try to spin another file system, and they
weren't happy with the results. So what are you
going to do ?

It has a lot of "features". An IT person once tried to
explain how permissions work, and there must have been
fifty pages of information on the web site. And near the
end, the guy said "there are a couple more features,
but nobody has likely ever heard of them, so there is
no point in writing them up". It's like a bottomless
pit full of handcuffs.

Why should a thing like that be fast ? If you change anything
on it, you'd hear "you broke my whatsit, how dare you". As
a result, you cannot exactly simplify it. There could be
a paying customer handcuffed to it.

And a particular test case to break it, only requires the
transfer of 50-60GB of data, after which the message
"insufficient system resources" appears on the screen.
That is just one reason for inventing a second compression
feature for the file system ("New Compression"). I assume it
passes the test case, or they wouldn't have created it. There
are two types of compression, plus there are reparse points
to add "custom features" on top of the "standard" they've created.

Paul
Lawrence D'Oliveiro
2024-05-23 02:33:33 UTC
Post by Lawrence D'Oliveiro
Another aspect in which NTFS is showing its age?
They did try to spin another file system, and they weren't happy with
the results. So what are you going to do ?
You mean ReFS?

At least Windows NT could have had a virtual filesystem layer, like Unix
had as far back as the 1980s, before NT was even thought of. Linux also
has this, and this allows it to support such a great variety of
filesystems.
And a particular test case to break it, only requires the transfer of
50-60GB of data, after which the message "insufficient system resources"
appears on the screen.
Interesting that the problems with copying large numbers of files seem to
be more due to limitations in Windows itself, rather than NTFS: Linux
itself can do those things on NTFS just fine
<https://www.theregister.com/2010/09/24/sysadmin_file_tools/>.
Andy Burns
2024-05-23 07:25:22 UTC
Post by Lawrence D'Oliveiro
They did try to spin another file system, and they weren't happy with
the results. So what are you going to do ?
You mean ReFS?
ReFS; it dropped a few features (8.3 filenames, compression) and added a
few (disk scrubbing to proactively detect bad sectors, fixing volumes
while online rather than via an offline chkdsk).

I've used it on a few Hyper-V servers, where the SAN has gone for cost
reasons and storage is all local on the server with h/w RAID.
Paul
2024-05-23 07:42:40 UTC
Post by Lawrence D'Oliveiro
Post by Lawrence D'Oliveiro
Another aspect in which NTFS is showing its age?
They did try to spin another file system, and they weren't happy with
the results. So what are you going to do ?
You mean ReFS?
At least Windows NT could have had a virtual filesystem layer, like Unix
had as far back as the 1980s, before NT was even thought of. Linux also
has this, and this allows it to support such a great variety of
filesystems.
And a particular test case to break it, only requires the transfer of
50-60GB of data, after which the message "insufficient system resources"
appears on the screen.
Interesting that the problems with copying large numbers of files seem to
be more due to limitations in Windows itself, rather than NTFS: Linux
itself can do those things on NTFS just fine
<https://www.theregister.com/2010/09/24/sysadmin_file_tools/>.
One way to move files from one place to another like that is: Macrium.

Or for that matter, any of 20-30 backup/clone/restore utilities.

They move files at the cluster level.

When I attempt pathological cases here, for test, I store them using
Macrium so that it is easier (and faster) to reconstitute later.

Robocopy has retry capability, so if the problem was hardware related,
you might get a complete set that way. It also keeps a log, and you can
see which files it had trouble with.
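Something in the spirit of that suggestion, driven from Python (the paths are hypothetical; /E, /R, /W and /LOG are standard robocopy options):

import subprocess

result = subprocess.run([
    "robocopy", r"D:\TEMP", r"E:\TEMP_COPY",
    "/E",                      # include subdirectories
    "/R:2", "/W:5",            # retry each failed file twice, 5 seconds apart
    r"/LOG:D:\robocopy.log",   # per-file log, so you can see what it had trouble with
])
# Robocopy exit codes below 8 mean the copy succeeded (possibly with skips).
print("ok" if result.returncode < 8 else "errors -- check the log")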

It's even possible the source drive needed a CHKDSK. Or, there was
a bad block in there. With that many files in there, you'd want to be
careful about *which* version of CHKDSK you ran. I would want to know
your "backup status", before suggesting such a thing. CHKDSK, like
an fsck, can destroy a partition, and if "someone elses data" is involved,
that's the first thing I want to know, is how prepared you are for a disaster.

Macrium can also be used for defragmentation, if you restore a "larger"
partition to a "smaller" space, on purpose. It, in effect, switches
to file-by-file restoration, and the fragmentation is significantly
reduced. You would backup to a solid file, carry the solid file to
the second machine, then do the restore there.

60 million files is a lot of files. I can well imagine the challenge
of doing that. Not easy. A scaling problem. And the Wikipedia idea that NTFS
can handle 4 billion files (via some sort of "theoretical calculation") is
absurd. The thing would slow down so much, it would be heat death
of the universe before it finished. But as a technical person,
I would want to record the reason for each of these failures during
the 60 million copy, to see if we can really blame some aspect of
the implementation, versus some other reason. If you switch to a
cluster-level copy, it's an entirely different experience.

I hope this wasn't their backup strategy :-) The file-copying idea.
One reason I don't do things that way, is that's hard on the disk drive
if using conventional HDD. That's the impetus to do cluster-level transfer.
A cluster-level transfer, moves sequentially over the surface of the
disk, and the only "thumping" from the drive, is due to the use
of NTFS TXF (transactional NTFS). We don't control that, it's the
developers of the backup program who use that. But it's only two long
seeks per second, versus the "quiet" data transfer process.

Conventional HDDs wear out in about a year, if you thrash them constantly.
I've not seen a post-mortem analysis as to whether the ribbon to the
head snapped, or what happened to the device.

Robocopy started out as a project by one employee on the side, and
today, it is an officially supported program shipped with the OS.
If it is having trouble with such a copy operation, you can use
the Feedback Hub and report a bug against it.

Paul
Lawrence D'Oliveiro
2024-05-23 07:49:48 UTC
60 million files is a lot of files. I can well imagine the challenge of
doing that. Not easy. A scaling problem.
And yet rsync handled it all without skipping a beat.
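For reference, the sort of invocation being credited is nothing exotic (the paths are made up):

import subprocess

# -a recurses and preserves metadata; the trailing slash on the source
# copies its contents rather than the directory itself.
subprocess.run(["rsync", "-a", "/mnt/source/", "/mnt/backup/"], check=True)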
Ahem A Rivet's Shot
2024-05-23 15:47:38 UTC
On Thu, 23 May 2024 07:49:48 -0000 (UTC)
Post by Lawrence D'Oliveiro
60 million files is a lot of files. I can well imagine the challenge of
doing that. Not easy. A scaling problem.
And yet rsync handled it all without skipping a beat.
Yes rsync is a very well written piece of code.
--
Steve O'Hara-Smith
Odds and Ends at http://www.sohara.org/
For forms of government let fools contest
Whate're is best administered is best - Alexander Pope
Scott Lurndal
2024-05-23 16:25:50 UTC
Post by Ahem A Rivet's Shot
On Thu, 23 May 2024 07:49:48 -0000 (UTC)
Post by Lawrence D'Oliveiro
60 million files is a lot of files. I can well imagine the challenge of
doing that. Not easy. A scaling problem.
And yet rsync handled it all without skipping a beat.
Yes rsync is a very well written piece of code.
It is worth pointing out that this thread is cross-posted to
the windows7 newsgroup. Windows filesystems basically suck.
Lawrence D'Oliveiro
2024-05-23 22:24:55 UTC
Post by Scott Lurndal
Windows filesystems basically suck.
They do, but even in spite of that, Linux seems to do a better job of
managing them than Windows does.
Paul
2024-05-23 18:28:24 UTC
Post by Lawrence D'Oliveiro
60 million files is a lot of files. I can well imagine the challenge of
doing that. Not easy. A scaling problem.
And yet rsync handled it all without skipping a beat.
This is my test log.

D:\TEMP>writeidx4 67108864 # These files are very small. My storage device is the limitation. All files in $MFT.
Writing 67108864 files
Start time (epoch seconds) 1716478386
Stop time (epoch seconds) 1716483434
Total time 5048
Done 67108864 files

Used space 86,366,027,776 7.45GB left

writeidx4-64million-Ddrive-736391-00-00.mrimg 2,381,377,130 bytes

It took Macrium 7 minutes to scan the file system and "make a list"
of the 60+ million files (it's future-proofing, for when it is
asked during a restore, to resize the partition). And then it took
only 3 minutes to write the clusters. It can't go faster than around
300MB/sec because it computes hashes as it goes. The file is protected
by hashes, to detect corruption.

I don't think I have anything else even remotely close as a recipe.
That's basically handling 100,000 files a second.
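The arithmetic behind that figure, using the numbers from the log above:

n_files = 67_108_864
create_seconds = 5048                  # writeidx4 run time from the log
macrium_seconds = (7 + 3) * 60         # 7 min to scan plus 3 min to write clusters

print(f"creation: {n_files / create_seconds:>9,.0f} files/sec")   # ~13,300
print(f"macrium : {n_files / macrium_seconds:>9,.0f} files/sec")  # ~112,000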

Yes, NTFS works. A little bit :-) It's like a premium piece of
swampland in Florida, above water at least twice a year.

And we can thank the availability of larger RAM sticks,
for making this project worth trying.

Just about any other interaction with the file collection,
takes too long.

Paul
j***@astraweb.com
2024-05-23 14:52:12 UTC
Post by Lawrence D'Oliveiro
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
Linux copes with them quite nicely, as long as you don’t try to use some
GUI tool to view the resulting directory. In the early days of online
video, I did a system for a client which would split the raw camera
footage into individual JPEG frames, from which bespoke movies could be
automatically generated for customers according to a template. You could
easily end up with 100,000 video-frame files in a single directory.
Another aspect in which NTFS is showing its age?
And I would bet a dollar to a donut that someone, somewhere, has either created or modified a file
manager so that it will collapse files in a directory that match a literal filename portion in a
user-defined template -- maybe in the style: <template>. * (9610)
Or, alternatively, there are at least two that will display only files matching a user-defined template
in either xtree or ztree -- the "Filespec" function (F).

--

jim
Paul
2024-05-23 19:40:45 UTC
Post by j***@astraweb.com
Post by Lawrence D'Oliveiro
Post by Paul
But I agree, that small files are not a lot of fun. They're a curse,
and no matter how fast your computer is, the results will not be pretty.
Linux copes with them quite nicely, as long as you don’t try to use some
GUI tool to view the resulting directory. In the early days of online
video, I did a system for a client which would split the raw camera
footage into individual JPEG frames, from which bespoke movies could be
automatically generated for customers according to a template. You could
easily end up with 100,000 video-frame files in a single directory.
Another aspect in which NTFS is showing its age?
And I would bet a dollar to a donut that someone, somewhere, has either created or modified a file
manager so that it will collapse files in a directory that match a literal filename portion in a
user-defined template -- maybe in the style: <template>. * (9610)
Or, alternatively, there are at least two that will display only files matching a user-defined template
in either xtree or ztree -- the "Filespec" function (F).
--
jim
D:\TEMP>dir /s 0 # 3,885,286,833 bytes worth of file listing...

Directory of D:\TEMP\0\0\3\F\F\F\F

05/23/2024 12:57 PM <DIR> .
05/23/2024 12:57 PM <DIR> ..
05/23/2024 12:57 PM 7 000003FFFF00.txt
...
05/23/2024 12:57 PM 7 000003FFFFFF.txt <=== the 67108864th or so file (256 files per bottom level directory)
256 File(s) 1,792 bytes

Total Files Listed:
67108864 File(s) 469,762,048 bytes
838865 Dir(s) 7,819,739,136 bytes free

Some things which use FindNextFile just smoke File Explorer
and make File Explorer look stoopid.

D:\TEMP>dir /s 0 | D:\pigz.exe -c > d:\dir-out.txt.gz

The faster case, is when you have a large number of
files in a single folder. dir can do things that
just kill File Explorer.

File Explorer, at least on one machine with plenty of RAM,
stops malloc'ing at around 15GB, while trying to "explore"
a large flat directory, and the "wheel of death" just
spins forever. I think it's neat that somebody slapped
a quota on that puppy. I wanted it to consume all the memory
on the machine, because that's what the memory is for.
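For comparison, a streaming walk never needs to hold the whole listing in memory -- which is essentially why dir-style tools stay responsive where Explorer stalls. A minimal sketch (os.scandir wraps FindFirstFile/FindNextFile on Windows; the path is the test directory above):

import os

def count_files(root):
    """Walk a huge tree one entry at a time; memory use stays flat
    no matter how many files there are."""
    total = 0
    stack = [root]
    while stack:
        with os.scandir(stack.pop()) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    total += 1
    return total

print(count_files(r"D:\TEMP"))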

Paul
Lawrence D'Oliveiro
2024-05-23 22:11:52 UTC
Post by j***@astraweb.com
And I would bet a dollar to a donut that someone, somewhere, has either
created or modified a file manager so that it will collapse files in a
directory that meets a literal file portion in a user defined template
Some filesystems have support for “tail packing”. That is, that last
sector that is not completely filled by the last part of one file is
shared among multiple files. Being done in the filesystem, this is
completely transparent to applications.
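A toy illustration of how much that can matter for tiny files (the workload is invented and the packing model is idealised -- real filesystems also pay metadata overhead):

import math

n, size, cluster, sector = 1_000_000, 700, 4096, 512    # a million ~700-byte files

no_packing  = n * math.ceil(size / cluster) * cluster   # each file rounds up to a whole cluster
tail_packed = math.ceil(n * size / sector) * sector     # tails share sectors (idealised)
print(f"{no_packing / 2**30:.1f} GiB without packing vs "
      f"{tail_packed / 2**30:.2f} GiB with tail packing")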
Carlos E.R.
2024-05-23 22:26:25 UTC
Post by Lawrence D'Oliveiro
Post by j***@astraweb.com
And I would bet a dollar to a donut that someone, somewhere, has either
created or modified a file manager so that it will collapse files in a
directory that meets a literal file portion in a user defined template
Some filesystems have support for “tail packing”. That is, that last
sector that is not completely filled by the last part of one file is
shared among multiple files. Being done in the filesystem, this is
completely transparent to applications.
Reiserfs.

Which behaves very well with small files, millions of them in a single
directory.
--
Cheers, Carlos.