MPI-SpeedUp for SpinPack (matrix in memory)

SpinPack MPI-SpeedUp measured
    Checked for up to 1000 cores now (Jul08)!

    linear in log-log plot: x=lg2(CPUs) y=lg2(t1/t)
     lg2(t1/t) =     b * lg2(CPUs)   # t1 = 1-CPU time extrapolated from the fit
         t1/t  = (2^b)^lg2(CPUs)     # 2^b = SpeedUp2 (speedup per CPU doubling)
         t1/t2 = (2^b)^lg2(2) = 2^b = SpeedUp2   # t2 = time on 2 CPUs
      BWFactor = t1/t(1CPU)          # bandwidth factor; t(1CPU) = measured 1-CPU time
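
The straight-line fit above can be sketched in a few lines of Python. This is
only an illustration, not part of SpinPack: the function name fit_speedup2 and
the sample timings are hypothetical, and an ordinary least-squares fit of
lg2(t) against lg2(CPUs) is assumed (slope = -b, intercept = lg2(t1)).

```python
import math

def fit_speedup2(cpus, times):
    """Fit lg2(t) = lg2(t1) - b*lg2(CPUs); return (t1, b, SpeedUp2)."""
    xs = [math.log2(c) for c in cpus]
    ys = [math.log2(t) for t in times]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx               # slope of the log-log line, equals -b
    b = -slope
    t1 = 2 ** (my - slope * mx)     # extrapolated 1-CPU time (intercept)
    return t1, b, 2 ** b            # 2^b = speedup per CPU doubling

# Synthetic timings generated from the model itself (NOT measurements).
cpus = [2, 4, 8, 16, 32]
times = [100.0 / 1.66 ** math.log2(c) for c in cpus]
t1, b, speedup2 = fit_speedup2(cpus, times)
print(f"t1={t1:.2f}  b={b:.3f}  SpeedUp2={speedup2:.3f}")
```

With real data, cpus and times would be the measured core counts and run
times; the fit needs at least two distinct core counts.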

    OverallSpeedUp = SMPSpeedUp * MPISpeedUp
        SMPSpeedUp =            SMPSpeedUp2 ^ lg2(SMPCores)
        MPISpeedUp = BWFactor * MPISpeedUp2 ^ lg2(MPINodes)
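
The three formulas above translate directly into a small helper. This is a
sketch only: the function name overall_speedup and its parameter names are
hypothetical, and BWFactor defaults to 1.0 (i.e. no bandwidth penalty).

```python
import math

def overall_speedup(smp_cores, mpi_nodes, smp_speedup2, mpi_speedup2,
                    bw_factor=1.0):
    """OverallSpeedUp = SMPSpeedUp * MPISpeedUp, as defined in the notes."""
    smp = smp_speedup2 ** math.log2(smp_cores)              # SMPSpeedUp
    mpi = bw_factor * mpi_speedup2 ** math.log2(mpi_nodes)  # MPISpeedUp
    return smp * mpi

# One doubling in each dimension multiplies the two SpeedUp2 values.
print(overall_speedup(2, 2, smp_speedup2=1.66, mpi_speedup2=1.46))
```

Passing bw_factor below 1.0 models a slow interconnect, which scales the
MPI part of the speedup by a constant.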

    SpeedUp2
      v2.33: SMP = 1.66 (up to 32 CPUs),  MPI = 1.46 (up to 64 nodes)
      v2.36: SMP = 1.66 (up to 32 CPUs),  MPI = 1.69 (up to 50 nodes)

      BWFactor 100Mbit/s = ca.  40% ( 40% float, 25% double, 2*2GHz)
                 1Gbit/s = ca. 100% (100% float, 70% double, 4*2GHz)
              2*10Gbit/s = ca. 100% (estimated, BW*Cores/Node)

    extrapolation:
      v2.33: SpeedUp = 1.66^lg2(SMPCores) * 1.46^lg2(MPINodes)
      v2.36: SpeedUp = 1.66^lg2(SMPCores) * 1.69^lg2(MPINodes) ??
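
To get a feeling for the numbers, the extrapolation formulas can be tabulated
for a few machine sizes. This is a rough sketch: the names SPEEDUP2 and
extrapolate are hypothetical, BWFactor is taken as 1, and the chosen
core/node counts are arbitrary examples. The v2.36 values are as uncertain
as the "??" above suggests.

```python
import math

# SpeedUp2 values quoted in the notes: (SMP, MPI) per version.
SPEEDUP2 = {"v2.33": (1.66, 1.46), "v2.36": (1.66, 1.69)}

def extrapolate(version, smp_cores, mpi_nodes):
    """Extrapolated overall speedup for one version (BWFactor assumed 1)."""
    smp2, mpi2 = SPEEDUP2[version]
    # S2^lg2(N) equals N^lg2(S2), i.e. a power law in the core/node count.
    return smp_cores ** math.log2(smp2) * mpi_nodes ** math.log2(mpi2)

for version in SPEEDUP2:
    for cores, nodes in [(4, 16), (32, 64)]:
        print(f"{version}: {cores:3d} cores/node x {nodes:3d} nodes -> "
              f"{extrapolate(version, cores, nodes):7.1f}")
```

The power-law form makes the scaling exponents visible: lg2(1.66) is about
0.73, so doubling the cores gains noticeably less than a factor of 2.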