摘要
现代商用多核处理器缺少硬件支持的处理核间通信机制,多个处理核间必须通过加锁保护的共享内存传递数据。为此,设计一种基于软件的无锁队列作为核间通信机制,通过无锁数据结构提高软件队列的性能。当数据到达速率较低时,队列自适应地减小队列长度,从而占用较小的内存空间,进而更好地利用处理器高速缓存;当数据到达速率较高时,队列自适应地增加队列长度,以避免数据丢失。实验结果表明,在数据到达速率变化较大的实际应用场景中,该队列较FastForward、MCRingBuffer和B-Queue队列具有更高的数据处理性能。
Modern commercial multicore processors lack a hardware-supported mechanism for processing inter-core communication,and multiple processing cores must pass data through locked shared memory. Therefore,a software-based lock-free queue is designed as an inter-core communication mechanism,to improve the performance of the software queue through the data structure without lock. When the data arrival rate is lower,the queue adaptively reduces the queue length,thus taking up less memory space and making better use of processor cache;When the data arrival rate is higher,the queue adaptively increases the queue length to avoid data loss. Experimental results show that this queue has higher data processing performance than FastForward,MCRingBuffer and B-Queue queues in practical application scenarios w here the data arrival rate changes greatly.
引文
[1]WANG J,ZHANG K,TANG X,et al.B-Queue:efficient and practical queuing for fast core-to-core communication[J].International Journal of Parallel Programming,2013,41(1):137-159.
[2]ARNAUTOV S,FELBER P,FETZER C,et al.FFQ:a fast singleproducer/multiple-consumer concurrent FIFO queue[C]//Proceedings of IEEE International Symposium Parallel and Distributed Processing.Washington D.C.,USA:IEEE Press,2017:907-916.
[3]刘扬,王鹏,杨瑞,等.基于Open MP的遥感影像并行ISODATA聚类研究[J].计算机工程,2016,42(7):238-243.
[4]张步忠,程玉胜,王一宾.求解N皇后问题的片上多核并行混合遗传算法[J].计算机工程,2015,41(7):199-203.
[5]权衡.多核处理器运算及核间通讯模块设计研究[D].上海:复旦大学,2012.
[6]GIACOMONI J,MOSELEY T,VACHHARAJANI M.Fast forward for efficient pipeline parallelism:a cache-optimized concurrent lock-free queue[C]//Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.New York,USA:ACM Press,2008:43-52.
[7]PREUD T,SOPENA J,THOMAS G,et al.Batch Queue:fast and memory-thrifty core to core communication[C]//Proceedings of International Symposium on Computer Architecture and High Performance Computing.Washington D.C.,USA:IEEE Press,2010:215-222.
[8]LADAN-MOZES E,SHAVIT N.An optimistic approach to lock-free FIFO queues[J].Distributed Computing,2008,20(5):323-341.
[9]肖月振,华蓓.基于多核处理器的无锁零拷贝数据包转发框架[J].计算机工程,2013,39(12):35-39.
[10]ALEX K,EREZ P.Wait-free queues with multiple enqueuers and dequeuers[J].ACM SIGPLAN Notices,2012,46(8):223-234.
[11]MORRISON A,AFEK Y.Fast concurrent queues for x86processors[J].ACM SIGPLAN Notices,2014,48(8):103-112.
[12]TORQUATI M.Single-producer/single-consumer queues on shared cache multi-core systems[EB/OL].[2018-01-02].https://arxiv.org/pdf/1012.1824.pdf.
[13]HERLIHY M.The art of multiprocessor programming[C]//Proceedings of ACM Symposium on Principles of Distributed Computing.New York,USA:ACM Press,2006:1-2.
[14]LEE S,TIWARI D,YAN S,et al.HAQu:hardwareaccelerated queueing for fine-grained threading on a chip multiprocessor[J].IEEE International Symposium on High Performance Computer Architecture,2011,8(1):99-110.
[15]吴俊敏,朱峪,朱小东,等.基于硬件队列扩展的网卡虚拟化系统及其方法,CN102609298A[P].2012.