Please use this identifier to cite or link to this item: https://idr.l3.nitk.ac.in/jspui/handle/123456789/9943
Full metadata record
dc.contributor.author: Prasad, B.M.P.
dc.contributor.author: Parane, K.
dc.contributor.author: Talawar, B.
dc.date.accessioned: 2020-03-31T06:51:48Z
dc.date.available: 2020-03-31T06:51:48Z
dc.date.issued: 2019
dc.identifier.citation: International Journal of Systems Assurance Engineering and Management, 2019, Vol. 10, No. 4, pp. 696-712 [en_US]
dc.identifier.uri: 10.1007/s13198-019-00799-5
dc.identifier.uri: http://idr.nitk.ac.in/jspui/handle/123456789/9943
dc.description.abstract: Fast simulations are critical in reducing time to market for chip multiprocessors and systems-on-chip. Several simulators have been used to evaluate the performance and power consumed by networks-on-chip (NoCs). To speed up the simulations, it is necessary to investigate and optimize the hotspots in the simulator source code. Among the several simulators available, Booksim2.0 was chosen for the experimentation as it is extensively used in the NoC community. In this paper, the cache and memory system behavior of Booksim2.0 has been analyzed to accurately monitor input-dependent performance bottlenecks. The measurements show that cache and memory usage patterns vary widely based on the input parameters given to Booksim2.0. Based on these measurements, the cache configuration with the fewest misses has been identified. To further reduce cache misses, software optimization techniques such as removal of unused functions, loop interchange, and replacement of the post-increment operator with the pre-increment operator for non-primitive data types have been employed; these techniques reduced cache misses by 18.52%, 5.34% and 3.91%, respectively. Thread parallelization and vectorization have been employed to improve the overall performance of Booksim2.0. The OpenMP programming model and SIMD are used for parallelizing and vectorizing the most time-consuming portions of Booksim2.0. Speedups of 2.93 and 3.97 were observed for the Mesh topology with a 30 × 30 network size by employing thread parallelization and vectorization, respectively. © 2019, The Society for Reliability Engineering, Quality and Operations Management (SREQOM), India and The Division of Operation and Maintenance, Luleå University of Technology, Sweden. [en_US]
dc.title: Analysis of cache behaviour and software optimizations for faster on-chip network simulations [en_US]
dc.type: Article [en_US]
Appears in Collections: 1. Journal Articles

Files in This Item:
There are no files associated with this item.
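Note: the abstract above names concrete source-level optimizations (replacing post-increment with pre-increment for non-primitive types, loop interchange, and OpenMP thread parallelization with SIMD vectorization of hot loops). The following is a minimal, hypothetical C++ sketch of those techniques; the function and variable names are illustrative only and are not taken from the Booksim2.0 source.

// Illustrative C++ sketch (hypothetical names, not actual Booksim2.0 code)
// of the optimization techniques summarized in the abstract.
#include <cstddef>
#include <map>
#include <vector>

// 1. Pre-increment vs. post-increment for non-primitive types: ++it avoids
//    the temporary iterator copy that it++ must construct and return.
double sum_packet_latencies(const std::map<int, double>& latency_by_packet) {
    double total = 0.0;
    for (auto it = latency_by_packet.begin(); it != latency_by_packet.end(); ++it) {
        total += it->second;
    }
    return total;
}

// 2. Loop interchange: keeping the column index in the inner loop walks this
//    row-major buffer contiguously, which reduces cache misses.
void scale_traffic_matrix(std::vector<double>& matrix, std::size_t rows,
                          std::size_t cols, double factor) {
    for (std::size_t r = 0; r < rows; ++r) {       // outer loop over rows
        for (std::size_t c = 0; c < cols; ++c) {   // inner loop over contiguous columns
            matrix[r * cols + c] *= factor;
        }
    }
}

// 3. Thread parallelization plus SIMD vectorization of an independent
//    per-element update, in the spirit of the OpenMP/SIMD approach described.
void add_credits(std::vector<int>& credits, int delta) {
    #pragma omp parallel for simd
    for (std::size_t i = 0; i < credits.size(); ++i) {
        credits[i] += delta;
    }
}

Compiling with OpenMP support (e.g. g++ -O3 -fopenmp) enables the pragma; without it the pragma is ignored and the loop runs serially, so the sketch remains correct either way.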