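AWS EC2 instances that use AMD CPUs have been around for a while now: M5a/R5a for roughly a year and T3a for roughly half a year. Because I rarely use the US regions, and new instance types still tend to reach Asia later, I only started using these families a few months ago (as of this writing, Tokyo still does not offer T3a in every AZ).

Apart from the officially advertised price difference of about 10% (compared with M5/R5/T3), there is not much official information about performance. My own expectation, and what I have picked up from other sources, is that they perform about the same as the Intel-based instances of the same family, or at most 10% worse, but there does not seem to be much actual discussion or comparison of how they really do.

So here is a very simple comparison, using sysbench, of the CPU performance of several commonly used AWS EC2 instance families. For convenience, everything is run with the default settings on the .large size, and only the single-thread result is considered.

Test environment: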
- Ubuntu: 18.04.3 LTS
- Linux kernel: 4.15.0-1052-aws
- sysbench: 1.0.18
The instances being compared are listed below, eleven instance types in total:
- General purpose
  - T2
  - T3
  - T3a
  - M4
  - M5
  - M5a
- Compute optimized
  - C4
  - C5
- Memory optimized
  - R4
  - R5
  - R5a
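Since the goal is to test CPU performance, the variants derived for disk or network, such as C5d / C5n / M5dn / R5dn, are left out of the test. Older generations such as T1 / C3 / M3 are also excluded: nobody should be launching instances that old today, since there is no reason to hurt your wallet and your performance at the same time, unless some ancient AMI cannot boot on the newer instances and has to run on the old ones …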
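T2 / T3 / T3a depend on CPU Credits to reach their best performance, so they were all tested in Unlimited Mode (in practice sysbench only runs for about ten seconds by default, which does not even use up 1 CPU Credit). Comparing performance that is not being throttled makes for a more meaningful comparison.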
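As a rough check of the "less than 1 CPU Credit" remark: AWS defines one CPU credit as one vCPU running at 100% utilization for one minute. The sketch below assumes the single sysbench thread keeps exactly one vCPU fully busy for the whole 10-second run and ignores any credits earned in the meantime; the function name is made up for illustration.

```python
def credits_used(seconds: float, vcpus_busy: float = 1.0, utilization: float = 1.0) -> float:
    """Estimate CPU credits spent, using 1 credit = 1 vCPU at 100% for 60 seconds."""
    return vcpus_busy * utilization * seconds / 60.0


# Assumption: one sysbench thread pins one vCPU at 100% for the default 10 s run.
used = credits_used(10.0)
print(f"{used:.3f} CPU credits")
```

Under these assumptions a single run spends about a sixth of a credit, so the credit balance should not affect the results either way.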
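For a comparison of the specs and prices of each instance type, I recommend EC2Instances.info; I find the site quite good and very convenient.
Now let's go straight to the test results: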
# t2.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 909.23
General statistics:
total time: 10.0008s
total number of events: 9095
Latency (ms):
min: 1.09
avg: 1.10
max: 1.17
95th percentile: 1.10
sum: 9983.83
Threads fairness:
events (avg/stddev): 9095.0000/0.00
execution time (avg/stddev): 9.9838/0.00
# t3.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1082.23
General statistics:
total time: 10.0008s
total number of events: 10825
Latency (ms):
min: 0.88
avg: 0.92
max: 9.74
95th percentile: 0.95
sum: 9995.95
Threads fairness:
events (avg/stddev): 10825.0000/0.00
execution time (avg/stddev): 9.9960/0.00
# t3a.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1234.78
General statistics:
total time: 10.0001s
total number of events: 12350
Latency (ms):
min: 0.78
avg: 0.81
max: 1.28
95th percentile: 0.83
sum: 9995.83
Threads fairness:
events (avg/stddev): 12350.0000/0.00
execution time (avg/stddev): 9.9958/0.00
# m4.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 912.43
General statistics:
total time: 10.0009s
total number of events: 9127
Latency (ms):
min: 1.09
avg: 1.09
max: 4.21
95th percentile: 1.10
sum: 9986.15
Threads fairness:
events (avg/stddev): 9127.0000/0.00
execution time (avg/stddev): 9.9861/0.00
# m5.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1096.93
General statistics:
total time: 10.0009s
total number of events: 10972
Latency (ms):
min: 0.91
avg: 0.91
max: 1.03
95th percentile: 0.92
sum: 9998.14
Threads fairness:
events (avg/stddev): 10972.0000/0.00
execution time (avg/stddev): 9.9981/0.00
# m5a.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1433.26
General statistics:
total time: 10.0007s
total number of events: 14336
Latency (ms):
min: 0.66
avg: 0.70
max: 1.57
95th percentile: 0.80
sum: 9997.21
Threads fairness:
events (avg/stddev): 14336.0000/0.00
execution time (avg/stddev): 9.9972/0.00
# r4.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 912.11
General statistics:
total time: 10.0012s
total number of events: 9124
Latency (ms):
min: 1.09
avg: 1.09
max: 1.35
95th percentile: 1.10
sum: 9986.20
Threads fairness:
events (avg/stddev): 9124.0000/0.00
execution time (avg/stddev): 9.9862/0.00
# r5.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1095.93
General statistics:
total time: 10.0008s
total number of events: 10962
Latency (ms):
min: 0.91
avg: 0.91
max: 1.27
95th percentile: 0.92
sum: 9997.97
Threads fairness:
events (avg/stddev): 10962.0000/0.00
execution time (avg/stddev): 9.9980/0.00
# r5a.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1329.11
General statistics:
total time: 10.0007s
total number of events: 13294
Latency (ms):
min: 0.67
avg: 0.75
max: 0.88
95th percentile: 0.83
sum: 9997.24
Threads fairness:
events (avg/stddev): 13294.0000/0.00
execution time (avg/stddev): 9.9972/0.00
# c4.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1032.31
General statistics:
total time: 10.0010s
total number of events: 10326
Latency (ms):
min: 0.96
avg: 0.97
max: 1.06
95th percentile: 0.97
sum: 9988.51
Threads fairness:
events (avg/stddev): 10326.0000/0.00
execution time (avg/stddev): 9.9885/0.00
# c5.large
$ sysbench cpu run
sysbench 1.0.18 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 10000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 1189.51
General statistics:
total time: 10.0001s
total number of events: 11897
Latency (ms):
min: 0.83
avg: 0.84
max: 1.03
95th percentile: 0.86
sum: 9997.20
Threads fairness:
events (avg/stddev): 11897.0000/0.00
execution time (avg/stddev): 9.9972/0.00
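Each test was actually run three times to guard against large variance, but in practice the differences were very small, and in some cases two runs even gave identical results, so I simply pasted the run whose result fell in the middle of the three.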
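The "run three times, keep the middle one" procedure could be automated roughly as follows. This is only a sketch, not the script used for this post: it assumes `sysbench` is on the PATH and that the output contains an `events per second:` line as shown above, and the helper names are hypothetical.

```python
import re
import statistics
import subprocess

EVENTS_RE = re.compile(r"events per second:\s*([0-9.]+)")


def parse_events_per_second(output: str) -> float:
    """Extract the 'events per second' figure from `sysbench cpu run` output."""
    match = EVENTS_RE.search(output)
    if match is None:
        raise ValueError("no 'events per second' line found")
    return float(match.group(1))


def median_of_runs(runs: int = 3) -> float:
    """Run `sysbench cpu run` several times and return the median result."""
    results = []
    for _ in range(runs):
        completed = subprocess.run(
            ["sysbench", "cpu", "run"],
            capture_output=True, text=True, check=True,
        )
        results.append(parse_events_per_second(completed.stdout))
    return statistics.median(results)
```

Calling `median_of_runs()` on an instance with sysbench installed returns the middle value of three default runs.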
Finally, the CPU speed: events per second figures are turned into a simple bar chart for an easier visual comparison.
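A plain-text approximation of such a chart can be produced from the events per second figures in the outputs above. The sketch below is not how the original chart was made; the function name `render_bars` and the bar width are arbitrary choices for illustration.

```python
# events per second, copied from the sysbench outputs above
results = {
    "t2.large": 909.23, "t3.large": 1082.23, "t3a.large": 1234.78,
    "m4.large": 912.43, "m5.large": 1096.93, "m5a.large": 1433.26,
    "r4.large": 912.11, "r5.large": 1095.93, "r5a.large": 1329.11,
    "c4.large": 1032.31, "c5.large": 1189.51,
}


def render_bars(data, width=50):
    """Return one text bar per entry, scaled so the largest value fills `width`."""
    top = max(data.values())
    lines = []
    for name, value in data.items():
        bar = "#" * round(value / top * width)
        lines.append(f"{name:<10} {bar} {value:.2f}")
    return lines


print("\n".join(render_bars(results)))
```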
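Interestingly, T3a / R5a / M5a all come out ahead of their same-family counterparts T3 / R5 / M5. For the M and R families the gap is even larger than the gap between the current Intel instances and the previous generation (M4 / R4); for the T family it is somewhat smaller than the gap between T3 and T2. Could this be the result of patching the security vulnerabilities that keep being found in Intel CPUs? XD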
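To put numbers on that observation, the sketch below computes the percentage gain of each step from the events per second values reported above. The family labels and function names are only illustrative.

```python
# (previous generation, current Intel, current AMD) events per second,
# copied from the sysbench outputs above
families = {
    "T": (909.23, 1082.23, 1234.78),   # t2, t3, t3a
    "M": (912.43, 1096.93, 1433.26),   # m4, m5, m5a
    "R": (912.11, 1095.93, 1329.11),   # r4, r5, r5a
}


def pct_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


def compare(data):
    """Map family -> (Intel vs previous generation, AMD vs Intel) gains in percent."""
    return {
        name: (pct_gain(intel, prev), pct_gain(amd, intel))
        for name, (prev, intel, amd) in data.items()
    }


for name, (gen_gain, amd_gain) in compare(families).items():
    print(f"{name}: Intel vs previous gen {gen_gain:+.1f}%, AMD vs Intel {amd_gain:+.1f}%")
```

With these figures the AMD-over-Intel gain exceeds the generational gain for the M and R families, while for the T family it comes out lower. These are single-thread results from the default sysbench CPU test only, so they may not carry over to other workloads.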