Hardware Acceleration for RLNC: A Case Study Based on the Xtensa Processor with the Tensilica Instruction-Set Extension

Cited by: 7
Authors
Acevedo, Javier [1]
Scheffel, Robert [2]
Wunderlich, Simon [1]
Hasler, Mattis [2]
Pandi, Sreekrishna [1]
Cabrera, Juan [1]
Fitzek, Frank H. P. [1]
Fettweis, Gerhard [2]
Reisslein, Martin [3]
Affiliations
[1] Tech Univ Dresden, Deutsch Telekom Chair Commun Networks, Lab Germany 5G, D-01062 Dresden, Germany
[2] Tech Univ Dresden, Vodafone Chair Mobile Commun Syst, Lab Germany 5G, D-01062 Dresden, Germany
[3] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
Keywords
application-specific instruction-set processor (ASIP); flexible-length instruction extension (FLIX); Galois field; hardware acceleration; multiply-accumulate (MAC) operations; random linear network coding (RLNC); single instruction multiple data (SIMD); WIRELESS NETWORKS; LOW-DELAY; DESIGN
DOI
10.3390/electronics7090180
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Random linear network coding (RLNC) can greatly aid data transmission in lossy wireless networks. However, RLNC requires computationally complex matrix multiplications and inversions in finite fields (Galois fields). These computations are highly demanding for energy-constrained mobile devices. The presented case study evaluates hardware acceleration strategies for RLNC in the context of the Tensilica Xtensa LX5 processor with the Tensilica instruction-set extension (TIE). More specifically, we develop TIEs for multiply-accumulate (MAC) operations for accelerating matrix multiplications in Galois fields, single instruction multiple data (SIMD) instructions operating on consecutive memory locations, as well as the flexible-length instruction extension (FLIX). We evaluate the number of clock cycles required for RLNC encoding and decoding without and with the MAC, SIMD, and FLIX acceleration strategies. We also evaluate the RLNC encoding and decoding throughput and energy consumption for a range of RLNC generation and code word sizes. We find that for Galois-field (GF) RLNC encoding, the SIMD and FLIX acceleration strategies achieve speedups of approximately four hundredfold compared to a benchmark C code implementation without TIE. We also find that the unicore Xtensa LX5 with SIMD has seven to thirty times higher RLNC encoding and decoding throughput than the state-of-the-art ODROID XU3 system-on-a-chip (SoC) operating with a single core; the Xtensa LX5 with FLIX, in turn, increases the throughput by roughly 25% compared to utilizing only SIMD. Furthermore, the Xtensa LX5 with FLIX consumes roughly three orders of magnitude less energy than the ODROID XU3 SoC.
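To make the multiply-accumulate workload named in the abstract concrete, the following is a minimal plain-C sketch of RLNC encoding as a sequence of Galois-field MAC operations. It is not the authors' benchmark code or their TIE instructions. The field size GF(2^8), the reduction polynomial 0x11D, and the names gf256_mul and rlnc_encode are assumptions chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply two GF(2^8) elements by shift-and-add.
 * ASSUMPTION: reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D);
 * the abstract does not state which polynomial is used. */
static uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    uint8_t product = 0;
    while (b != 0) {
        if (b & 1u) {
            product ^= a;              /* addition in GF(2^8) is XOR */
        }
        uint8_t carry = a & 0x80u;     /* will the shift overflow 8 bits? */
        a = (uint8_t)(a << 1);
        if (carry) {
            a ^= 0x1Du;                /* reduce modulo the polynomial */
        }
        b >>= 1;
    }
    return product;
}

/* Produce one coded symbol vector:
 *   coded[j] = sum over i of coeff[i] * src[i * len + j]   (in GF(2^8))
 * gen is the generation size (number of source symbols) and len is the
 * symbol length in bytes. The inner statement is the MAC operation that
 * a hardware extension would target. */
static void rlnc_encode(uint8_t *coded, const uint8_t *coeff,
                        const uint8_t *src, size_t gen, size_t len)
{
    for (size_t j = 0; j < len; j++) {
        coded[j] = 0;
    }
    for (size_t i = 0; i < gen; i++) {
        for (size_t j = 0; j < len; j++) {
            coded[j] ^= gf256_mul(coeff[i], src[i * len + j]);
        }
    }
}
```

In this scalar form every byte costs one bit-serial multiplication plus one XOR, which suggests why a dedicated MAC instruction and SIMD processing of consecutive bytes are natural acceleration targets.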
Pages: 22