Most of known sphere decoders have been implemented as application specific integrated circuits (ASICs), but the need for high degree of flexibility in MIMO detection makes interesting also application specific instruction set processor (ASIP) implementations. A single programmable ASIP can hardly reach the same processing speed as a fully dedicated ASIC; thus, parallel architectures with multiple concurrent ASIPs must be conceived to guarantee sufficient data throughput.
The objective of this paper is to present a new ASIP-based implementation for the detection of MIMO signals. The processor supports multiple lattice modulation schemes (up to 64-QAM) and up to four transmitting antennas and it is able to run both ML and close to ML algorithms. A parallel architecture has been also designed with multiple ASIPs, which concurrently execute the detection algorithm on received symbols, a central unit acting as task scheduler, and a buffer for the compensation of non constant throughput. A dedicated bus handles the communication among allocated units. Each ASIP occupies a silicon area of 0.093 mm2 and runs at 400 MHz when implemented on a 90 nm CMOS technology. Achievable throughput depends on the adopted MIMO system and on the number of allocated ASIPs: a detector with 10 ASIPs programmed to run ML detection on a 4 脳 4 MIMO system with 64-QAM modulation offers a throughput of 78 Mbps at signal-to-noise ratio SNR = 18 dB.