ARM announces Cortex-M55 processor and Ethos-U55 microNPU

AIoT, the combination of the Internet of Things (IoT) and Artificial Intelligence (AI), has recently become a buzzword in the IoT industry. The traditional Internet of Things refers to a large number of sensors deployed in a specific space to collect and transmit environmental data, such as temperature, pressure and sound, on a regular basis. With the integration of artificial intelligence, an IoT system can not only perceive environmental data but also evolve into an intelligent Internet of Things through deep learning.

Applying AIoT technology not only reduces labor costs but also brings out the value of data analysis. Once an intelligent sensing system is deployed, the enterprise's operations accumulate data every day, and high-efficiency computing units turn that data into a basis for decisions, so that the enterprise can keep learning and evolving. If IoT gives computers a sense of touch and direction, AI gives them vision and the ability to learn.

ARM has released the new Cortex-M55 processor to speed up the spread of AI to a wide range of devices. The Cortex-M55 combines DSP and ML processing capability with higher efficiency, and it can be paired with the Ethos-U55, an NPU (neural network processor) designed for Cortex-M.

Following the launch of two Ethos NPU processors last year, ARM is expanding its AI portfolio again with the release of the Cortex-M55 processor and the Ethos-U55, a tiny NPU that will bring AI acceleration to billions of devices.

The Cortex-M55 is the first processor based on the Armv8.1-M architecture and features Arm Helium technology, the M-Profile Vector Extension (MVE), which provides high-performance, power-efficient vector processing. Compared with previous Cortex-M generations, it can improve DSP (digital signal processing) performance by up to 5 times and machine learning performance by up to 15 times.

The Ethos-U55 is ARM's first NPU for Cortex-M. It can be used with the Cortex-M55, Cortex-M33, Cortex-M7, Cortex-M4 and other processors. It reduces power consumption and significantly reduces the size of machine learning models. Developers can also choose a configuration with 32, 64, 128 or 256 multiply-accumulate (MAC) units, depending on the use case. This enhances the machine learning capabilities of size-constrained embedded and IoT devices.
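To give a feel for what the four MAC configurations mean, the sketch below estimates a theoretical peak throughput for each one. It assumes the figures are MACs completed per clock cycle and counts one MAC as two operations. The 500 MHz clock is an illustrative assumption, not a figure from the announcement, and `peak_gops` is a hypothetical helper name. Real throughput depends on memory bandwidth and the network being run, and will be lower than this peak.

```python
# MAC configurations named in the text, assumed to be MACs per clock cycle.
ETHOS_U55_MAC_CONFIGS = (32, 64, 128, 256)

# Assumed clock frequency, for illustration only (not from the announcement).
ASSUMED_CLOCK_HZ = 500_000_000


def peak_gops(macs_per_cycle, clock_hz):
    """Theoretical peak in giga-operations per second (1 MAC = 2 operations)."""
    return macs_per_cycle * 2 * clock_hz / 1e9


for macs in ETHOS_U55_MAC_CONFIGS:
    gops = peak_gops(macs, ASSUMED_CLOCK_HZ)
    print(f"{macs:>3} MACs/cycle -> {gops:6.1f} GOPS peak at the assumed clock")
```

The peak scales linearly with the MAC count, so the choice of configuration is essentially a trade-off between silicon area and power on one side and peak throughput on the other.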

The Cortex-M55, paired with the Ethos-U55, will provide gesture recognition, biometric recognition, speech recognition and other machine learning capabilities, significantly improving the intelligence of endpoint devices. A device that collects data can then also run machine learning inference on that data locally. Besides shortening response time, this reduces dependence on cloud AI and on the network.

To reduce the workload of endpoint device development and accelerate the development of endpoint AI, ARM is also extending the existing Cortex-M development tools to support the Cortex-M55 and Ethos-U55, unifying the traditional CPU, DSP and ML development flows. ARM emphasizes that integration and optimization for machine learning frameworks will start with TensorFlow Lite Micro.

Last year ARM launched two NPU processors for smartphones, digital televisions and smart home door locks. Together with Cortex-A series processors and Mali GPUs, they provide real-time video recognition and object classification on the device. The arrival of the Cortex-M55 processor this year, along with the Cortex-M-dedicated Ethos-U55, reflects ARM's goal of making AI ubiquitous by bringing AI processing power to billions of devices or more.

ARM Cortex-M55 technical features

Architecture -- Armv8.1-M

Bus interface -- AMBA 5 AXI5 64-bit master (compatible with AXI4 IPs)

Pipeline -- 4-stage (for main integer pipeline)

Security -- ARM TrustZone technology (optional)

DSP Extension -- 32-bit DSP/SIMD extension

M-Profile Vector Extension (MVE) -- Helium (optional)

Floating-point Unit (FPU) -- Optional

Coprocessor Interface -- 64-bit (optional)

Instruction Cache -- Up to 64KB with ECC (optional)

Data Cache -- Up to 64KB with ECC (optional)

Instruction TCM (ITCM) -- Up to 16MB with ECC (optional)

Data TCM (DTCM) -- Up to 16MB with ECC (optional)

Interrupts -- Up to 480 interrupts + non-maskable interrupt (NMI)

Wake-up Interrupt Controller (WIC) -- Internal and/or external (optional)

Multiply-accumulates (MACs) per cycle -- Up to: 2 x 32-bit MACs/cycle, 4 x 16-bit MACs/cycle, 8 x 8-bit MACs/cycle

Sleep modes -- Multiple power domains, sleep modes (sleep and deep sleep), sleep-on-exit, optional retention support for memories and logic

Debug -- Hardware and software breakpoints, Performance Monitoring Unit (PMU)

Trace -- Optional instruction trace with Embedded Trace Macrocell (ETM), data trace (DWT) (selective data trace), and instrumentation trace (ITM) (software trace)

ARM Custom Instructions -- Optional (available in 2021)

Robustness -- ECC on instruction cache, data cache, instruction TCM, data TCM (optional); Bus interface protection (optional); PMC-100 (Programmable MBIST Controller, optional); Reliability, availability, and serviceability (RAS) extension
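
The MACs-per-cycle figures in the list above all consume the same 64 bits of operands per cycle (2 x 32, 4 x 16, 8 x 8), which is why narrower data types pay off on this core. The sketch below turns those figures into a lower bound on cycle count for a workload at each operand width. The workload (a 3x3 convolution over a 32x32 single-channel image) and the name `min_cycles` are hypothetical, chosen only for illustration. The bound ignores loads, stores and loop overhead, so real cycle counts will be higher.

```python
import math

# Peak MACs per cycle by operand width in bits, taken from the feature list.
M55_MACS_PER_CYCLE = {32: 2, 16: 4, 8: 8}


def min_cycles(num_macs, operand_bits):
    """Lower bound on cycles needed for num_macs MACs at the peak rate."""
    return math.ceil(num_macs / M55_MACS_PER_CYCLE[operand_bits])


# Hypothetical workload: 3x3 convolution over a 32x32 single-channel image.
EXAMPLE_MACS = 32 * 32 * 3 * 3

for bits in (32, 16, 8):
    cycles = min_cycles(EXAMPLE_MACS, bits)
    print(f"{bits:>2}-bit operands: at least {cycles} cycles for {EXAMPLE_MACS} MACs")
```

Under these assumptions, moving from 32-bit to 8-bit operands cuts the arithmetic lower bound by a factor of four.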