MLPerf Results Show Advances in Machine Learning Inference Performance and Efficiency
Today MLCommons™, an open engineering consortium, released new results for three MLPerf™ benchmark suites - Inference v2.0, Mobile v2.0, and Tiny v0.7. These three benchmark suites measure the performance of inference - applying a trained machine learning model to new data. Inference enables adding intelligence to a wide range of applications and systems. Collectively, these benchmark suites scale from ultra-low power devices that draw just a few microwatts all the way up to the most powerful datacenter computing platforms. The latest MLPerf results demonstrate wide industry participation, an emphasis on energy efficiency, and up to 3.3X greater performance, ultimately paving the way for more capable intelligent systems to benefit society at large.
The MLPerf benchmarks are full system tests that stress machine learning models, software, and hardware and optionally measure power usage. The open-source and peer-reviewed benchmark suites provide a level playing field for competition that drives innovation, performance, and energy-efficiency for the entire industry.
"This was an outstanding effort by the ML community with so many new participants and the tremendous increase in the number and diversity of submissions," said David Kanter, Executive Director of MLCommons. "I'm especially excited to see greater adoption of power and energy measurements, highlighting the industry's focus on efficient AI."
The MLPerf Inference benchmarks primarily focus on datacenter and edge systems and submitters include Alibaba, ASUSTeK, Azure, Deci.ai, Dell, Fujitsu, FuriosaAI, Gigabyte, H3C, Inspur, Intel, Krai, Lenovo, Nettrix, Neuchips, NVIDIA, Qualcomm Technologies, Inc., Supermicro, and ZhejiangLab. This round set new records with over 3,900 performance results and 2,200 power measurements, respectively 2X and 6X more than the prior round, demonstrating the momentum of the community.
The MLPerf Mobile benchmark suite targets smartphones, tablets, notebooks, and other client systems, with the latest submissions highlighting an average 2X performance gain over the previous round. MLPerf Mobile v2.0 includes a new image segmentation model, MOSAIC, that was developed by Google Research with feedback from MLCommons. The MLPerf Mobile application and the corresponding source code, which incorporate the latest updates and submitting vendors' backends, are expected to be available in the second quarter of 2022.
The MLPerf Tiny benchmark suite is intended for the lowest-power devices and smallest form factors, such as deeply embedded, intelligent sensing, and internet-of-things applications. The second round of MLPerf Tiny results showed tremendous growth in collaboration, with submissions from Alibaba, Andes, the hls4ml-FINN team, Plumerai, Renesas, Silicon Labs, STMicroelectronics, and Syntiant. Collectively, these organizations submitted 19 different systems, with 3X more results than the first round and over half the results incorporating energy measurements - an impressive achievement for the first benchmarking round to include energy measurement.
MLCommons would like to congratulate first-time MLPerf Inference submitters ASUSTeK, H3C, and ZhejiangLab, as well as Gigabyte and Fujitsu on their first power measurements. MLCommons also congratulates first-time MLPerf Tiny submitters Alibaba, Andes, Plumerai, Renesas, Silicon Labs, and STMicroelectronics, as well as the hls4ml-FINN team and Syntiant on their first energy measurements.
To view the results and find additional information about the benchmarks, please visit:
MLCommons is an open engineering consortium with a mission to benefit society by accelerating innovation in machine learning. The foundation for MLCommons began with the MLPerf benchmark in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. In collaboration with its 50+ founding partners - global technology providers, academics, and researchers - MLCommons is focused on collaborative engineering work that builds tools for the entire machine learning industry through benchmarks and metrics, public datasets, and best practices.