Post by Anne & Lynn Wheeler
Original CP67 was FIFO; at the univ. as an undergraduate in the 60s, I
changed it to ordered seek ... and also did chained requests for paging
in one channel program (ordered by rotation, if they didn't require a
seek) ... instead of a separate channel program/SIO for every page
(which was one of the things retained for VM370, but a lot of other
stuff was dropped and/or at least drastically simplified).
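
Not the original code (CP67 was assembler); below is a minimal C sketch
of the ordered-seek idea, with invented names: keep the queue sorted by
target cylinder so the arm sweeps across the disk rather than thrashing
FIFO-style.

#include <stddef.h>

/* hypothetical queued request -- illustration only, not CP67's
   actual control blocks */
struct ioreq {
    struct ioreq *next;
    unsigned int  cyl;              /* target cylinder of the seek */
    /* ... CCW chain, page frame address, etc. ... */
};

/* insert so the queue stays sorted by cylinder; device redrive then
   just takes the head, turning FIFO into ordered-seek scheduling */
void enqueue_ordered(struct ioreq **head, struct ioreq *req)
{
    struct ioreq **pp = head;
    while (*pp != NULL && (*pp)->cyl <= req->cyl)
        pp = &(*pp)->next;
    req->next = *pp;
    *pp = req;
}

A fuller version would pick the next request relative to the arm's
current position and sweep direction; the sorted queue is just the
simplest form of the idea.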
CP67 peaked around 80 page I/O transfers per second with 2301 fixed head
drums; with rotational chaining I got it up to 270 page I/O transfers
per second (nine transfers per two rotations, the drum formatted with
nine 4k pages per pair of tracks, one of the 4k records spanning the end
of one track and the start of the next).
The 2301 & 2303 were similar fixed head drums, except the 2301
read/wrote on four heads in parallel: 1/4 the number of "tracks", each
track four times larger, and four times the transfer rate of the 2303
... 60 revs/second, nine page transfers per pair of revolutions:
(60/2)*9 = 270/sec.
CP67 had a special CHFREE function ... invoked by the interrupt handler
as soon as the device handler got past the initial phase ... which
drastically cut the device redrive latency (for queued requests).
One of the things that got simplified in the morph to VM370 ... queued
request redrive wasn't checked until the previous device interrupt had
been completely handled ... significantly increasing the latency for
starting queued requests.
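
For illustration, a hedged C sketch (invented names, not actual
CP67/VM370 code) of the ordering difference:

struct ioreq;                           /* a queued channel program */
struct device;

extern struct ioreq *dequeue(struct device *);
extern void start_io(struct device *, struct ioreq *);   /* SIO/SIOF */
extern void finish_bookkeeping(struct device *, struct ioreq *);

/* CP67-style CHFREE: redrive first, then do the (slow) completion
   processing for the finished request while the device is already
   busy transferring again */
void interrupt_handler_cp67(struct device *dev, struct ioreq *done)
{
    struct ioreq *next = dequeue(dev);
    if (next)
        start_io(dev, next);
    finish_bookkeeping(dev, done);
}

/* simplified VM370 style: the device sits idle until the whole
   interrupt has been processed, adding latency to every queued
   request */
void interrupt_handler_vm370(struct device *dev, struct ioreq *done)
{
    finish_bookkeeping(dev, done);
    struct ioreq *next = dequeue(dev);
    if (next)
        start_io(dev, next);
}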
After transferring from the Cambridge Science Center to San Jose
Research in the later part of the 70s ... I got to wander around most
IBM and customer locations in Silicon Valley ... including bldg14 (disk
engineering) and bldg15 (disk product test) across the street from SJR.
At the time, 14/15 were running dedicated, prescheduled, stand-alone
mainframe testing, 7x24. They had recently tried MVS (for some
concurrent testing), but MVS had a 15min mean-time-between-failure in
that environment. I offered to rewrite the input/output supervisor to
make it bulletproof and never fail ... so they could do any amount of
on-demand concurrent testing (greatly improving productivity).
The downside was that they started pointing the finger at my software
whenever there was a problem ... and I spent a lot of time playing disk
engineer shooting their hardware problems. I had also effectively
reimplemented CHFREE in VM370 ... significantly cutting redrive latency.
This turned up another problem, in the new 3880 disk controller. While
it supported 3mbyte/sec transfer with a special hardware bypass ...
everything else was handled by a really slow JIB-prime processor
(making everything but actual data transfer much slower than on the
3830 controller, which had a fast horizontal microprogram processor).
Trying to mask how slow the 3880 had become, they presented the end of
channel program interrupt ... before the 3880 was actually done ...
hoping the extra processing would be hidden in the interval between the
3880 queuing the end-of-operation interrupt and the system trying to
redrive with a new I/O.
Bldg15 product test got the #2 or #3 operational engineering processors
for doing disk I/O channel testing, and had the first 3033 outside POK
and the first 4341 outside Endicott. Since channel I/O product testing
used trivial amounts of CPU ... we put up a private online service on
the 3033 with a 3830 and two spare strings of 3330 drives (16 total).
Early one Monday morning, I got an irate call from bldg15 asking what I
had done to the online service software ... online response had
horribly deteriorated. They repeatedly denied making any change ...
until I tracked down that they had swapped the 3830 controller for a
test 3880 controller. The 3880 was presenting the ending interrupt
early ... I was almost immediately responding with SIOF for a queued
request; because the 3880 was still busy, it responded with cc=1,
SM+BUSY (control unit busy), and I had to requeue the request and wait
for the CUE (control unit end) interrupt before retrying the request.
This was six months before any 3880s shipped to customers, and they
came up with some 3880 microcode changes that tried to do a better job
of masking the problem.
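
For flavor, a C sketch of that redrive path (invented names; the CSW
unit-status bit values are the standard S/370 ones): SIOF comes back
cc=1 with SM+BUSY, so the request goes back on the queue until the CUE
interrupt says the control unit is really free.

/* S/370 CSW unit status bits */
#define CSW_SM    0x40                  /* status modifier  */
#define CSW_CUE   0x20                  /* control unit end */
#define CSW_BUSY  0x10                  /* busy             */

struct ioreq;
struct device {
    struct ioreq *queue;
    int           waiting_for_cue;
};

extern struct ioreq *dequeue(struct device *);
extern void requeue_front(struct device *, struct ioreq *);
extern int  siof(struct device *, struct ioreq *);    /* returns cc */
extern unsigned char unit_status(struct device *);    /* from CSW   */

void redrive(struct device *dev)
{
    struct ioreq *next;

    if (dev->waiting_for_cue || (next = dequeue(dev)) == NULL)
        return;
    if (siof(dev, next) == 1 &&
        (unit_status(dev) & (CSW_SM | CSW_BUSY)) == (CSW_SM | CSW_BUSY)) {
        /* control unit still busy: requeue and wait for CUE */
        requeue_front(dev, next);
        dev->waiting_for_cue = 1;
    }
}

/* CUE interrupt: the controller is finally free -- retry */
void cue_interrupt(struct device *dev)
{
    dev->waiting_for_cue = 0;
    redrive(dev);
}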
I wrote an IBM internal-only report about the work for bldg14&15 and
happened to mention the MVS 15min MTBF ... for which the MVS group
attempted to get me separated from the company ... when that failed,
they tried to make my career at IBM unpleasant in other ways.
Note that the 3090 had its design number of channels based on total
channel busy for each channel, assuming 3830 controller performance.
However, when they started real live testing ... they found the 3880
drastically increased channel busy for each operation ... and as a
result the 3090 had to significantly increase the number of channels
(trying to achieve the desired total system throughput). The increase
in the number of channels required an extra TCM ... and the 3090
product group semi-facetiously claimed that the 3880 product group
should have to credit the 3090 group for the manufacturing cost of the
additional TCM in each 3090.
Note that IBM marketing then respun the significant increase in the
number of 3090 channels as its being a marvelous I/O throughput machine
(rather than channels added to compensate for the enormous increase in
channel busy caused by the slow 3880 controller).
Other channel trivia: in 1980, STL was bursting at the seams and
planning on moving 300 people from the IMS group to an offsite bldg,
with dataprocessing service back to the STL datacenter. The people had
tried remote 3270 support and found the human factors totally
unacceptable. I get con'ed into doing channel extender support so they
can have local channel-attached 3270 controllers at the offsite bldg
(users weren't able to see any response difference between offsite and
in STL).
The hardware vendor tries to get IBM to approve letting them ship my
support ... but there is a group in POK playing with some serial stuff
that gets the release of my stuff vetoed (they were afraid that if it
was in the market, it would make it more difficult to ship their
stuff).
In 1988, I'm asked to help LLNL standardize some serial stuff they are
playing with, which quickly becomes the Fibre Channel Standard
(including some of the stuff I had done in 1980). Then in 1990, the POK
people get their stuff released with ES/9000 as ESCON (when it is
already obsolete). Then some of the POK people become involved with the
Fibre Channel Standard and define an extremely heavyweight protocol
that drastically reduces the native throughput ... which is eventually
released as FICON. The most recent mainframe peak I/O benchmark I've
found is z196, which used 104 FICON to get 2M IOPS (about 19K IOPS per
FICON channel). At that time, there was a Fibre Channel announced for
E5-2600 blades claiming over a million IOPS (two such Fibre Channels
have higher aggregate throughput than 104 FICON running over 104 Fibre
Channels).
virtualization experience starting Jan1968, online at home since Mar1970