Benchmark
=========

Input Sequence Size = :math:`16 \times 1024`

Single GPU
----------------

+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| Model               | Device               | Runtime (ms)                       | Memory (GB)                      | Energy (J/token)                  | Throughput (token/s)              |
|                     |                      +--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
|                     |                      | Measured     | Estimated | Error   | Measured    | Estimated | Error  | Measured    | Estimated | Error   | Measured    | Estimated | Error   |
+=====================+======================+==============+===========+=========+=============+===========+========+=============+===========+=========+=============+===========+=========+
| facebook/opt-1.3b   | NVIDIA RTX A5000     | 693.77       | 740.79    | 6.35%   | 5.53        | 5.52      | -0.17% | 0.010       | 0.010     | 6.35%   | 42.34       | 45.21     | 6.35%   |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-6.7b   | NVIDIA RTX A5000     | 3275.30      | 3300.85   | 0.77%   | 15.48       | 15.47     | -0.06% | 0.046       | 0.046     | 0.77%   | 199.91      | 201.47    | 0.77%   |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-60m  | NVIDIA RTX A5000     | 121.46       | 112.05    | -8.40%  | 6.16        | 6.15      | -0.16% | 0.002       | 0.002     | -8.40%  | 7.41        | 6.84      | -8.40%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-200m | NVIDIA RTX A5000     | 241.97       | 226.40    | -6.87%  | 6.50        | 6.49      | -0.30% | 0.003       | 0.003     | -6.87%  | 14.77       | 13.82     | -6.87%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-400m | NVIDIA RTX A5000     | 380.49       | 355.55    | -7.01%  | 6.87        | 6.85      | -0.30% | 0.005       | 0.005     | -7.01%  | 23.22       | 21.70     | -7.01%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-1.1b | NVIDIA RTX A5000     | 747.17       | 716.71    | -4.25%  | 8.17        | 8.13      | -0.51% | 0.010       | 0.010     | -4.25%  | 45.60       | 43.74     | -4.25%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-1.3b   | NVIDIA RTX A6000     | 510.96       | 594.34    | 14.03%  | 5.53        | 5.52      | -0.17% | 0.009       | 0.011     | 14.03%  | 31.19       | 36.28     | 14.03%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-6.7b   | NVIDIA RTX A6000     | 2150.14      | 2121.35   | -1.36%  | 15.48       | 15.47     | -0.06% | 0.039       | 0.039     | -1.36%  | 131.23      | 129.48    | -1.36%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-60m  | NVIDIA RTX A6000     | 109.11       | 76.87     | -41.93% | 6.16        | 6.15      | -0.16% | 0.002       | 0.001     | -41.93% | 6.66        | 4.69      | -41.93% |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-200m | NVIDIA RTX A6000     | 204.46       | 178.60    | -14.48% | 6.50        | 6.49      | -0.30% | 0.004       | 0.003     | -14.48% | 12.48       | 10.90     | -14.48% |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-400m | NVIDIA RTX A6000     | 315.30       | 295.36    | -6.75%  | 6.87        | 6.85      | -0.30% | 0.006       | 0.005     | -6.75%  | 19.24       | 18.03     | -6.75%  |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-1.1b | NVIDIA RTX A6000     | 600.65       | 613.92    | 2.16%   | 8.17        | 8.13      | -0.51% | 0.011       | 0.011     | 2.16%   | 36.66       | 37.47     | 2.16%   |
+---------------------+----------------------+--------------+-----------+---------+-------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+

Multi GPU
----------------

+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| Model               | Device               | Runtime (ms)                      | Memory (GB)                     | Energy (J/token)                  | Throughput (token/s)              |
|                     |                      +-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
|                     |                      | Measured    | Estimated | Error   | Measured   | Estimated | Error  | Measured    | Estimated | Error   | Measured    | Estimated | Error   |
+=====================+======================+=============+===========+=========+============+===========+========+=============+===========+=========+=============+===========+=========+
| facebook/opt-1.3b   | NVIDIA RTX A5000 x 8 | 1311.08     | 1415.36   | 7.37%   | 4.99       | 4.98      | -0.16% | 0.018       | 0.020     | 7.37%   | 80.02       | 86.39     | 7.37%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-6.7b   | NVIDIA RTX A5000 x 8 | 4241.64     | 4287.28   | 1.06%   | 16.04      | 16.03     | -0.05% | 0.060       | 0.060     | 1.06%   | 258.89      | 261.67    | 1.06%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-60m  | NVIDIA RTX A5000 x 8 | 124.78      | 131.60    | 5.18%   | 7.62       | 7.68      | -0.12% | 0.002       | 0.002     | 5.18%   | 7.62        | 8.03      | 5.18%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-200m | NVIDIA RTX A5000 x 8 | 245.71      | 239.86    | -2.44%  | 7.87       | 7.85      | -0.26% | 0.004       | 0.004     | -2.44%  | 15.00       | 14.64     | -2.44%  |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-400m | NVIDIA RTX A5000 x 8 | 356.11      | 360.63    | 1.25%   | 8.10       | 8.08      | -0.25% | 0.007       | 0.007     | 1.25%   | 21.74       | 22.01     | 1.25%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-1.1b | NVIDIA RTX A5000 x 8 | 673.65      | 639.71    | -5.30%  | 8.91       | 8.89      | -0.20% | 0.012       | 0.012     | -5.30%  | 41.12       | 39.04     | -5.30%  |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-1.3b   | NVIDIA RTX A6000 x 8 | 1199.32     | 1219.91   | 1.69%   | 4.99       | 4.98      | -0.16% | 0.022       | 0.022     | 1.69%   | 73.20       | 74.46     | 1.69%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| facebook/opt-6.7b   | NVIDIA RTX A6000 x 8 | 3755.24     | 3319.92   | -13.11% | 16.04      | 16.03     | -0.05% | 0.069       | 0.061     | -13.11% | 229.20      | 202.63    | -13.11% |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-60m  | NVIDIA RTX A6000 x 8 | 115.16      | 96.43     | -19.42% | 7.62       | 7.61      | -0.12% | 0.002       | 0.002     | -19.42% | 7.03        | 5.89      | -19.42% |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-200m | NVIDIA RTX A6000 x 8 | 212.33      | 189.00    | -12.35% | 7.87       | 7.85      | -0.26% | 0.004       | 0.003     | -12.35% | 12.96       | 11.54     | -12.35% |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-400m | NVIDIA RTX A6000 x 8 | 309.74      | 294.85    | -5.05%  | 8.09       | 8.07      | -0.25% | 0.006       | 0.005     | -5.05%  | 18.91       | 18.00     | -5.05%  |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
| AlCrossSim/clm-1.1b | NVIDIA RTX A6000 x 8 | 568.35      | 569.74    | 0.24%   | 8.91       | 8.89      | -0.20% | 0.010       | 0.010     | 0.24%   | 34.69       | 34.77     | 0.24%   |
+---------------------+----------------------+-------------+-----------+---------+------------+-----------+--------+-------------+-----------+---------+-------------+-----------+---------+
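
The Error columns in both tables appear to be a signed relative error normalised by the estimate, i.e. (estimated - measured) / estimated. This convention is inferred from the published numbers and is not stated by the source, so treat it as an assumption. A minimal Python sketch, using the hypothetical helper ``relative_error``, recomputes the runtime error for two Single GPU rows:

```python
def relative_error(measured: float, estimated: float) -> float:
    """Signed relative error in percent, normalised by the estimate.

    Assumed convention, inferred from the table values rather than documented.
    """
    return (estimated - measured) / estimated * 100.0


# Runtime (ms) pairs (measured, estimated) copied from the Single GPU table.
rows = {
    "facebook/opt-1.3b on NVIDIA RTX A5000": (693.77, 740.79),
    "AlCrossSim/clm-60m on NVIDIA RTX A5000": (121.46, 112.05),
}

for name, (measured, estimated) in rows.items():
    print(f"{name}: {relative_error(measured, estimated):.2f}%")
```

Under this assumption a positive error means the estimate is higher than the measurement. Recomputing from the rounded table entries can differ from the published error in the last digit, presumably because the published errors were derived from unrounded values.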