The MPPA-256 is the first member of the MPPA MANYCORE family. It is composed of an array of clusters and I/O subsystems connected by two Networks on Chip (NoCs).
The MPPA® core is a 32-bit Very Long Instruction Word (VLIW) processor made of:
- One Branch/Control Unit
- Two Arithmetic Logic Units
- One Load/Store Unit including a simplified ALU
- One Multiply-Accumulate (MAC) / FPU including a simplified ALU
- Standard IEEE 754-2008 FPU with advanced Fused Multiply-Add (FMA) and dot product operators
- One Memory Management Unit (MMU)
This allows the core to execute up to five 32-bit RISC-like integer operations every clock cycle.
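As a rough illustration of the five-operations-per-cycle figure, the sketch below models one reading of the unit list above: one issue slot for the Branch/Control Unit, two for the ALUs, one for the Load/Store Unit and one for the MAC/FPU. The slot names (`BCU`, `ALU`, `LSU`, `MAU`), the `bundle_is_valid` helper and the one-operation-per-slot rule are assumptions made for this example and do not describe the actual instruction encoding.

```python
from collections import Counter

# Issue slots per core, read off the unit list above. The slot names and the
# one-op-per-slot rule are illustrative assumptions, not the real ISA.
ISSUE_SLOTS = {"BCU": 1, "ALU": 2, "LSU": 1, "MAU": 1}

# Upper bound on operations issued per cycle under this model: 1 + 2 + 1 + 1.
MAX_OPS_PER_CYCLE = sum(ISSUE_SLOTS.values())


def bundle_is_valid(bundle):
    """Return True if no unit is asked to issue more ops than it has slots.

    `bundle` is a list of (unit, operation) pairs, e.g. ("ALU", "add").
    """
    used = Counter(unit for unit, _op in bundle)
    return all(
        unit in ISSUE_SLOTS and count <= ISSUE_SLOTS[unit]
        for unit, count in used.items()
    )


# A bundle that fills every slot once (hypothetical operation names).
full_bundle = [
    ("BCU", "branch"),
    ("ALU", "add"),
    ("ALU", "sub"),
    ("LSU", "load"),
    ("MAU", "mul"),
]

# Three ALU operations exceed the two ALU slots in this model.
too_many_alu = [("ALU", "add"), ("ALU", "sub"), ("ALU", "xor")]

print(MAX_OPS_PER_CYCLE, bundle_is_valid(full_bundle), bundle_is_valid(too_many_alu))
```

Under these assumptions, a bundle is accepted only when each unit receives at most as many operations as it has slots, which caps a cycle at five operations.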
Each compute cluster is composed of:
- 16 identical cores with private FPU and MMU
- Dynamic Voltage and Frequency Scaling (DVFS) and Dynamic Power Switch off (DPS) support
- 1 system core with private FPU and MMU
- An instruction and data L1-cache per core
- 1 smart Direct Memory Access (DMA)
- A shared memory
- 1 Debug Support Unit
The cores are connected to a multibank memory that provides either low-latency access or bank-private access, depending on the configuration.
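The text does not say how addresses map to banks in either configuration. The sketch below shows two common mappings for a multibank memory, purely as an assumption about what the two modes could look like: an interleaved mapping that spreads consecutive chunks across banks, and a blocked mapping in which each bank owns one contiguous region. The bank count, bank size and interleaving granularity are hypothetical values chosen for the example.

```python
# All three constants are assumptions for illustration; the text does not
# give the bank count, bank size, or interleaving granularity.
NUM_BANKS = 16
BANK_SIZE = 128 * 1024  # bytes per bank
CHUNK_SIZE = 64         # bytes per interleaving unit

MEMORY_SIZE = NUM_BANKS * BANK_SIZE


def _check(addr):
    """Reject addresses outside the modeled shared memory."""
    if not 0 <= addr < MEMORY_SIZE:
        raise ValueError(f"address {addr:#x} is outside the shared memory")


def bank_interleaved(addr):
    """Interleaved mapping: consecutive CHUNK_SIZE chunks rotate across banks."""
    _check(addr)
    return (addr // CHUNK_SIZE) % NUM_BANKS


def bank_blocked(addr):
    """Blocked mapping: each bank owns one contiguous BANK_SIZE region."""
    _check(addr)
    return addr // BANK_SIZE


# Two addresses 64 bytes apart land in different banks when interleaved,
# but in the same bank when blocked.
print(bank_interleaved(0), bank_interleaved(64))
print(bank_blocked(0), bank_blocked(64))
```

With an interleaved mapping, cores that stream through neighboring addresses tend to hit different banks; with a blocked mapping, a core that stays inside its own region does not compete with the others for a bank.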
Network on Chip
The NoC is a 2D wrapped-around torus providing a full-duplex bandwidth of up to 3.2 GB/s between adjacent clusters. The NoC implements a Quality of Service mechanism that guarantees predictable latencies for all data transfers.
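To make the wrap-around property concrete, the following minimal sketch computes the neighbors of a node and the minimum hop count between two nodes on a 2D torus. The 4x4 grid size, the coordinate scheme and the function names are assumptions for this example; the text does not specify the grid dimensions or the routing algorithm.

```python
def torus_neighbors(x, y, width, height):
    """Return the four neighbors of (x, y) on a wrapped-around 2D grid."""
    return [
        ((x + 1) % width, y),   # east, wraps to column 0
        ((x - 1) % width, y),   # west, wraps to the last column
        (x, (y + 1) % height),  # south, wraps to row 0
        (x, (y - 1) % height),  # north, wraps to the last row
    ]


def torus_hops(a, b, width, height):
    """Minimum number of hops between nodes a and b, using wrap-around links."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # On each axis, take the shorter of the direct and the wrapped path.
    return min(dx, width - dx) + min(dy, height - dy)


# Hypothetical 4x4 grid: opposite corners are only two hops apart.
WIDTH, HEIGHT = 4, 4
print(torus_neighbors(0, 0, WIDTH, HEIGHT))
print(torus_hops((0, 0), (3, 3), WIDTH, HEIGHT))
```

On such a grid every node has exactly four neighbors, including the nodes on the edges, and the worst-case distance is half the grid width plus half the grid height.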
The MPPA MANYCORE processor communicates with the external devices through I/O subsystems located at the periphery of the NoC. The I/O subsystems implement various
The MPPA-256 interfaces are described below:
- Two DDR3 channels: each channel is 64 bits wide with optional ECC and delivers up to 12.8 GB/s.
- Two PCIe Gen3 x8 interfaces: each interface embeds an advanced DMA with scatter/gather support, providing efficient data transfer as a PCIe bus master.
- Two smart Ethernet Controllers: each controller can be configured to provide 4x1GbE, 4x10GbE or 1x40GbE interface.
- A universal Static Memory Controller: this controller can connect up to five external devices such as NAND/NOR flash, serial flash or asynchronous SRAM memories.
- Two banks of 64 General Purpose I/Os: each bank can be configured as PWM, UART, SPI or I2C. These banks can also work in Direct Network Access mode, providing a very low latency interface to stream data directly to and from the processing array.
- NoC eXpress interfaces (NoCX): these provide an aggregate bandwidth of 40 Gb/s. The NoCX makes it easy to scale the number of cores by connecting multiple MPPA MANYCORE processors on the same board. The NoCX is also an efficient way to couple the MPPA MANYCORE with an external FPGA used as a co-processor or interface bridge.
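The peak figures in the interface list can be cross-checked with simple arithmetic. The sketch below derives the transfer rate implied by a 64-bit DDR3 channel at 12.8 GB/s, the combined peak of both DDR3 channels, and the total Ethernet line rate per configuration across both controllers. It assumes decimal units (1 GB = 1000 MB) and that both controllers run in the same mode; the variable names are chosen for this example. These are theoretical peaks, not measured throughput.

```python
# Figures taken from the interface list above; decimal units are assumed.
DDR_CHANNELS = 2
DDR_BUS_BITS = 64
DDR_PEAK_MB_S = 12_800  # 12.8 GB/s per channel

# Each transfer moves one bus width of data: 64 bits = 8 bytes.
bytes_per_transfer = DDR_BUS_BITS // 8

# Transfer rate implied by the quoted peak, in mega-transfers per second.
implied_mt_s = DDR_PEAK_MB_S // bytes_per_transfer

# Combined peak of both channels, in MB/s.
ddr_total_mb_s = DDR_CHANNELS * DDR_PEAK_MB_S

# Line rate per Ethernet controller for each configuration, in Gb/s.
ETH_CONTROLLERS = 2
eth_mode_gbit = {"4x1GbE": 4 * 1, "4x10GbE": 4 * 10, "1x40GbE": 1 * 40}

# Total across both controllers, assuming both use the same mode.
eth_total_gbit = {mode: ETH_CONTROLLERS * rate for mode, rate in eth_mode_gbit.items()}

print(implied_mt_s, ddr_total_mb_s, eth_total_gbit)
```

The implied 1600 MT/s matches a DDR3-1600 data rate, and the two channels together peak at 25.6 GB/s.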