Post by Peter Flass
IBM never used MIPS either, but rated processors relative to each
other. I always thought they did this to avoid comparisons to other
vendors’ machines, but it was probably as much because it was
meaningless, as you say.
IBM tries to stamp out industry standard benchmark numbers for their
mainframes. I had worked with Jim Gray at IBM Research on System/R
(original SQL/Relational implementation) and he then palmed off some
stuff on me when he left for Tandem ... where he pioneered the DBMS
benchmarks that became TPC
http://www.tpc.org/information/who/gray5.asp
Periodically some mainframe numbers manage to leak. The industry
standard MIPS benchmark wasn't instructions/sec but the number of
program iterations compared to a reference processor ... assumed to be
a one-MIPS machine (and the numbers are easier to find for IBM's
non-mainframe systems).
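A minimal sketch of that relative rating, with made-up iteration
counts (not measured or published figures):

# "Benchmark MIPS": iterations/sec of a fixed program, scaled
# against a reference machine assumed to be 1 MIPS. The numbers
# here are illustrations only.
REFERENCE_ITERATIONS_PER_SEC = 1_000   # reference machine, 1 MIPS

def relative_mips(iterations_per_sec: float) -> float:
    """Rate a machine in benchmark MIPS relative to the reference."""
    return iterations_per_sec / REFERENCE_ITERATIONS_PER_SEC

print(relative_mips(156_000))   # hypothetical machine -> 156.0 MIPS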
IBM mainframe this century
z900, 16 processors, 2.5BIPS (156MIPS/proc), Dec2000
z990, 32 processors, 9BIPS (281MIPS/proc), 2003
z9, 54 processors, 18BIPS (333MIPS/proc), July2005
z10, 64 processors, 30BIPS (469MIPS/proc), Feb2008
z196, 80 processors, 50BIPS (625MIPS/proc), Jul2010
EC12, 101 processors, 75BIPS (743MIPS/proc), Aug2012
z13, 140 processors, 100BIPS (710MIPS/proc), Jan2015
z14, 170 processors, 150BIPS (862MIPS/proc), Aug2017
z15, 190 processors, 190BIPS* (1000MIPS/proc), Sep2019
* pubs say z15 1.25 times z14 (1.25*150BIPS or 190BIPS)
* z16, 200?? processors, ???BIPS (???MIPS/proc),
Max configured z196 was $30M ($600,000/BIPS) and in that timeframe the
standard for cloud megadatacenters (each with half million or more
systems) was E5-2600 blades benchmarked at 500BIPS. This was shortly
before IBM sold off its blade server business ... but IBM had an
E5-2600 base list price of $1815 ($3.60/BIPS). However, cloud
megadatacenters had been saying for at least a decade that they
assembled their own systems for 1/3rd the cost of brand name blade
servers ($1.20/BIPS) ... likely contributing to IBM selling off its
blade server business ... this was also about the time the server chip
makers' press said they were shipping half their product directly to
cloud megadatacenters.
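The price/BIPS arithmetic above as a quick back-of-the-envelope
sketch, using only the figures quoted in this post:

# $/BIPS from the figures quoted above.
def dollars_per_bips(price_usd: float, bips: float) -> float:
    return price_usd / bips

# Max-configured z196: $30M for 50 BIPS.
print(f"z196:          ${dollars_per_bips(30_000_000, 50):,.0f}/BIPS")

# E5-2600 blade at IBM base list price: $1815 for 500 BIPS
# (prints 3.63; the post rounds to $3.60/BIPS).
print(f"E5-2600 list:  ${dollars_per_bips(1815, 500):.2f}/BIPS")

# Cloud self-assembled at 1/3rd the brand-name cost
# (prints 1.21; the post rounds to $1.20/BIPS).
print(f"self-assembled ${dollars_per_bips(1815 / 3, 500):.2f}/BIPS")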
z196 pubs also claimed that over half the per-processor performance
improvement from z10 to z196 came from the introduction of cache-miss
(memory latency) compensation features (that had, in some cases, been
in other platforms for decades) ... out-of-order execution, branch
prediction, speculative execution, etc.
Big cloud operators (with a dozen or more megadatacenters around the
world, each with half million or more systems) had so drastically
reduced their system costs that power & cooling were becoming an
increasingly large part of their costs ... and they were putting
pressure on Intel/AMD to significantly increase computational power
efficiency (and looking at moving to ARM, originally designed for low
power, battery use ... its computational power efficiency offsetting
having to increase the number of systems). They had so decreased the
cost of systems that they could justify a complete upgrade of all
systems whenever there was an improvement in computational power
efficiency. TPC also started including computational efficiency in its
benchmarks, making it possible to calculate electrical power cost per
transaction (a sample calculation follows the link below) ... and IBM
still participates for their non-mainframe systems.
https://www.tpc.org/tpcc/results/tpcc_results5.asp?orderby=hardware
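A minimal sketch of that power-cost-per-transaction calculation, with
made-up wattage, electricity rate, and throughput (none of these are
published TPC figures):

# Electrical power cost per transaction: a system drawing `watts`
# at `usd_per_kwh`, sustaining `tps` transactions per second.
# All inputs below are hypothetical.
def power_cost_per_transaction(watts: float, usd_per_kwh: float,
                               tps: float) -> float:
    kwh_per_sec = watts / 1000 / 3600    # kWh consumed each second
    usd_per_sec = kwh_per_sec * usd_per_kwh
    return usd_per_sec / tps             # dollars per transaction

# Hypothetical: 400W blade, $0.10/kWh, 50,000 tps.
print(f"{power_cost_per_transaction(400, 0.10, 50_000):.2e} $/txn")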
trivia: 50+ yrs ago, the IBM 195 mainframe had out-of-order execution
but conditional branches drained the pipeline ... so most 360&370
codes ran at half of rated 195 throughput. Shortly after joining IBM,
the 195 group tried to suck me into helping with hyperthreading the
195 ... simulating two processors with two threads, each running at
half machine throughput. In this description of the shutdown of
ACS/360 (executives were afraid it would advance the state of the art
too fast and IBM would lose control of the market) ... there is
reference to the multithreading patents
https://people.cs.clemson.edu/~mark/acs_end.html
new work for the 195 (including multithreading) was canceled when it
was decided to add "virtual memory" to all 370s ... and it wasn't
believed that adding virtual memory to the 195 was practical. There
were also claims at the time that MVT/MVS two-processor SMP systems
had only 1.2-1.5 times the throughput of a single processor (because
of their multiprocessor system software overhead) ... which would have
more than offset any benefit of having a multithreaded (simulated
two-processor) 195.
other trivia: in the morph from cp67->vm370, they simplified and/or
dropped features (including dropping CP67 multiprocessor support).
After joining IBM, one of my hobbies was enhanced production operating
systems for internal datacenters ... including the world-wide sales &
marketing support "HONE" systems. The US HONE systems had been
consolidated in Palo Alto in the mid-70s (when Facebook first moved
into Silicon Valley, it was into a new bldg built next door to the
former US HONE datacenter). Their VM370 was enhanced into an
eight-system loosely-coupled (cluster, shared DASD) single-system
image, aka a cluster with load balancing and fall-over across the
complex. I had added lots of CP67 features/function back into VM370
and then (initially for HONE) added tightly-coupled (shared memory)
multiprocessor support into VM370 release 3 ... giving them a
sixteen-processor complex (at the time, what I considered the largest
IBM mainframe single-system-image complex; some ACP/TPF complexes may
have had eight loosely-coupled systems, but ACP/TPF didn't have
tightly-coupled support, so was limited to one processor/system). With
very short multiprocessor pathlengths and some games with "cache
affinity" ... I could get a two-processor machine with twice the
throughput of a single-processor machine (improved cache hit rate
offsetting the multiprocessor software overhead).
--
virtualization experience starting Jan1968, online at home since Mar1970