Introduction (2)

[1]        Austin, T.; Blaauw, D.; Mahlke, S.; Mudge, T.; Chakrabarti, C.; Wolf, W., "Mobile supercomputers," in Computer, vol.37, no.5, pp.81-83, May 2004. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1297253&isnumber=28841

[2]        Kocher, P.; Lee, R.; McGraw, G.; Raghunathan, A.; Ravi, S., "Security as a new dimension in embedded system design," in Design Automation Conference, 2004. Proceedings. 41st, pp.753-760, 7-11 July 2004. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1322583&isnumber=29281

Embedded Processors (6)

[3]          M. J. Schulte, J. Glossner, S. Jinturkar, M. Moudgill, S. Mamidi, and S. Vassiliadis, "A Low-Power Multithreaded Processor for Software Defined Radio," Journal of VLSI Signal Processing Systems, vol. 43, No. 2/3, pp. 143-159, June 2006. URL: http://link.springer.com/article/10.1007%2Fs11265-006-7267-1

[4]        Mark Woh, Sangwon Seo, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztian Flautner. 2009. AnySP: anytime anywhere anyway signal processing. SIGARCH Comput. Archit. News 37, 3 (June 2009), 128-139. URL: http://doi.acm.org/10.1145/1555815.1555773

[5]        Codrescu, L.; Anderson, W.; Venkumanhanti, S.; Mao Zeng; Plondke, E.; Koob, C.; Ingle, A.; Tabony, C.; Maule, R., "Hexagon DSP: An Architecture Optimized for Mobile Multimedia and Communications," in Micro, IEEE, vol.34, no.2, pp.34-43, Mar.-Apr. 2014. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6762801&isnumber=6786915

[6]        Kaisheng Ma; Yang Zheng; Shuangchen Li; Swaminathan, K.; Xueqing Li; Yongpan Liu; Sampson, J.; Yuan Xie; Narayanan, V., "Architecture exploration for ambient energy harvesting nonvolatile processors," in High Performance Computer Architecture (HPCA), 2015 IEEE 21st International Symposium on, pp.526-537, 7-11 Feb. 2015. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7056060&isnumber=7056013

[7] van de Waerdt, J.-W.; Vassiliadis, S.; Sanjeev Das; Mirolo, S.; Yen, C.; Zhong, B.; Basto, C.; van Itegem, J.-P.; Dinesh Amirtharaj; Kulbhushan Kalra; Rodriguez, P.; van Antwerpen, H., "The TM3270 media-processor," in Proceedings. 38th Annual IEEE/ACM International Symposium on Microarchitecture, pp.331-342, 16 Nov. 2005. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1540971&isnumber=32904

[8] Rehan Hameed, Wajahat Qadeer, Megan Wachs, Omid Azizi, Alex Solomatnikov, Benjamin C. Lee, Stephen Richardson, Christos Kozyrakis, and Mark Horowitz. 2010. Understanding sources of inefficiency in general-purpose chips. SIGARCH Comput. Archit. News 38, 3 (June 2010), 37-47. URL: http://doi.acm.org/10.1145/1816038.1815968

ISA Customization (6)

[9]        Gonzalez, R.E., "Xtensa: a configurable and extensible processor," in Micro, IEEE, vol.20, no.2, pp.60-70, Mar/Apr 2000. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=848473&isnumber=18443

[10]      Paolo Faraboschi, Geoffrey Brown, Joseph A. Fisher, Giuseppe Desoli, and Fred Homewood. 2000. Lx: a technology platform for customizable VLIW embedded processing. In Proceedings of the 27th annual international symposium on Computer architecture (ISCA '00). ACM, New York, NY, USA, 203-213. URL: http://doi.acm.org/10.1145/339647.339682

[11]      Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, and Krisztian Flautner. 2005. An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors. In Proceedings of the 32nd annual international symposium on Computer Architecture (ISCA '05). IEEE Computer Society, Washington, DC, USA, 272-283. URL: http://dx.doi.org/10.1109/ISCA.2005.9

[12]      S. Seo, M. Woh, S. Mahlke, T. Mudge, S. Vijay, and C. Chakrabarti. 2009. Customizing wide-SIMD architectures for H.264. In Proceedings of the 9th international conference on Systems, architectures, modeling and simulation (SAMOS'09), Walid Najjar and Michael J. Schulte (Eds.). IEEE Press, Piscataway, NJ, USA, 172-179. URL: http://cccp.eecs.umich.edu/papers/swseo-samos09.pdf

[13]      Chris Jenkins, Michael Schulte, and John Glossner. 2011. Instructions and hardware designs for accelerating SNOW 3G on a software-defined radio platform. Analog Integr. Circuits Signal Process. 69, 2-3 (December 2011), 207-218. URL: http://link.springer.com/article/10.1007%2Fs10470-011-9712-8

[14]      Dongrui She, Yifan He, and Henk Corporaal. 2012. Energy efficient special instruction support in an embedded processor with compact ISA. In Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems (CASES '12). ACM, New York, NY, USA, 131-140. DOI=10.1145/2380403.2380430. URL: http://doi.acm.org/10.1145/2380403.2380430

Network on Chip (3)

[15]      Tobias Bjerregaard and Shankar Mahadevan. 2006. A survey of research and practices of Network-on-chip. ACM Comput. Surv. 38, 1, Article 1 (June 2006). DOI=10.1145/1132952.1132953. URL: http://doi.acm.org/10.1145/1132952.1132953

[16]      Jason Cong and Bingjun Xiao. 2013. Optimization of interconnects between accelerators and shared memories in dark silicon. In Proceedings of the International Conference on Computer-Aided Design (ICCAD '13). IEEE Press, Piscataway, NJ, USA, 630-637. URL: http://dl.acm.org/citation.cfm?id=2561828.2561953

[17]      George L. Yuan, Ali Bakhoda, and Tor M. Aamodt. 2009. Complexity effective memory access scheduling for many-core accelerator architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 42). ACM, New York, NY, USA, 34-44. DOI=10.1145/1669112.1669119. URL: http://doi.acm.org/10.1145/1669112.1669119

Embedded Software (5)

[18]      L. Chakrapani, J. Gyllenhaal, W. Hwu, S. Mahlke, K. Palem, and R. Rabbah, “Trimaran: An Infrastructure for Research in Instruction-Level Parallelism,” Lecture Notes in Computer Science, Springer-Verlag, vol. 3602, pp. 32-41, August 2005. URL: http://link.springer.com/chapter/10.1007%2F11532378_4

[19]      Chang Hong Lin; Yuan Xie; Wolf, W., "Code Compression for VLIW Embedded Systems Using a Self-Generating Table," in Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol.15, no.10, pp.1160-1171, Oct. 2007. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4303127&isnumber=4303116

[20]      Stephen Hines, Joshua Green, Gary Tyson, and David Whalley. 2005. Improving Program Efficiency by Packing Instructions into Registers. In Proceedings of the 32nd annual international symposium on Computer Architecture (ISCA '05). IEEE Computer Society, Washington, DC, USA, 260-271. URL: http://dx.doi.org/10.1109/ISCA.2005.32

[21]      Mojtaba Mehrara, Po-Chun Hsu, Mehrzad Samadi, and Scott Mahlke. 2011. Dynamic parallelization of JavaScript applications using an ultra-lightweight speculation mechanism. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA '11). IEEE Computer Society, Washington, DC, USA, 87-98. URL: http://web.eecs.umich.edu/~mehrara/pubs/mehrara-hpca11.pdf

[22]      Yuhao Zhu and Vijay Janapa Reddi. 2014. WebCore: architectural support for mobile web browsing. SIGARCH Comput. Archit. News 42, 3 (June 2014), 541-552. DOI=10.1145/2678373.2665749. URL: http://doi.acm.org/10.1145/2678373.2665749

Embedded Operating Systems (4)

[23]      Mooney, V.J.; Blough, D.M., "A hardware-software real-time operating system framework for SoCs," in Design & Test of Computers, IEEE, vol.19, no.6, pp.44-51, Nov/Dec 2002. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1047743&isnumber=22460

[24]      Sankaralingam Panneerselvam and Michael M. Swift. 2012. Operating systems should manage accelerators. In Proceedings of the 4th USENIX conference on Hot Topics in Parallelism (HotPar'12). USENIX Association, Berkeley, CA, USA, 4-4. URL: https://www.usenix.org/system/files/hotpar12-final40.pdf

[25]      Alex Shye, Benjamin Scholbrock, and Gokhan Memik. 2009. Into the wild: studying real user activity patterns to guide power optimizations for mobile architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 42). ACM, New York, NY, USA, 168-178. DOI=10.1145/1669112.1669135 URL: http://doi.acm.org/10.1145/1669112.1669135

[26]      Jeremy Andrus, Christoffer Dall, Alexander Van't Hof, Oren Laadan, and Jason Nieh. 2011. Cells: a virtual mobile smartphone architecture. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP '11). ACM, New York, NY, USA, 173-187. URL: http://doi.acm.org/10.1145/2043556.2043574

Embedded Multiprocessors and Heterogeneous Chips (2)

[27]      Wolf, W.; Jerraya, A.A.; Martin, G., "Multiprocessor System-on-Chip (MPSoC) Technology," in Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol.27, no.10, pp.1701-1713, Oct. 2008. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4627532&isnumber=4627531

[28] Eric S. Chung, Peter A. Milder, James C. Hoe, and Ken Mai. 2010. Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs? In Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO '43). IEEE Computer Society, Washington, DC, USA, 225-236. URL: http://dx.doi.org/10.1109/MICRO.2010.36

Embedded Accelerators (10)

[29]      Jason Cong, Mohammad Ali Ghodrat, Michael Gill, Beayna Grigorian, Karthik Gururaj, and Glenn Reinman. 2014. Accelerator-Rich Architectures: Opportunities and Progresses. In Proceedings of the 51st Annual Design Automation Conference (DAC '14). ACM, New York, NY, USA, Article 180, 6 pages. DOI=10.1145/2593069.2596667. URL: http://doi.acm.org/10.1145/2593069.2596667

[30]      Yunsup Lee, Rimas Avizienis, Alex Bishara, Richard Xia, Derek Lockhart, Christopher Batten, and Krste Asanović. 2011. Exploring the tradeoffs between programmability and efficiency in data-parallel accelerators. In Proceedings of the 38th annual international symposium on Computer architecture (ISCA '11). ACM, New York, NY, USA, 129-140. URL: http://doi.acm.org/10.1145/2000064.2000080

[31]      Benjamin C. Brodie, David E. Taylor, and Ron K. Cytron. 2006. A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching. In Proceedings of the 33rd annual international symposium on Computer Architecture (ISCA '06). IEEE Computer Society, Washington, DC, USA, 191-202. DOI=10.1109/ISCA.2006.7. URL: http://dx.doi.org/10.1109/ISCA.2006.7

[32]      Po-Chih Tseng; Chang, Y.; Yu-Wen Huang; Hung-Chi Fang; Chao-Tsung Huang; Liang-Gee Chen, "Advances in Hardware Architectures for Image and Video Coding - A Survey," in Proceedings of the IEEE, vol.93, no.1, pp.184-197, Jan. 2005. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1369708&isnumber=29978

[33]      Brandon Reagen, Yakun Sophia Shao, Gu-Yeon Wei, and David Brooks. Quantifying Acceleration: Power/Performance Trade-Offs of Application Kernels in Hardware. International Symposium on Low Power Electronics and Design (ISLPED), Sept 2013. URL: http://www.eecs.harvard.edu/~shao/papers/reagen2013-islped.pdf

[34]      Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. 2014. Aladdin: a Pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. SIGARCH Comput. Archit. News 42, 3 (June 2014), 97-108. DOI=10.1145/2678373.2665689 URL: http://doi.acm.org/10.1145/2678373.2665689

[35]      Thierry Moreau, Mark Wyse, Jacob Nelson, Adrian Sampson, Hadi Esmaeilzadeh, Luis Ceze, Mark Oskin. SNNAP: Approximate computing on programmable SoCs via neural acceleration. HPCA 2015: 603-614. URL: https://homes.cs.washington.edu/~luisceze/publications/snnap-hpca-2015.pdf

[36]      Kevin Fan, Manjunath Kudlur, Ganesh S. Dasika, Scott A. Mahlke, Bridging the computation gap between programmable processors and hardwired accelerators. HPCA 2009: 313-322. URL: http://cccp.eecs.umich.edu/papers/fank-hpca09.pdf

[37]      Wajahat Qadeer, Rehan Hameed, Ofer Shacham, Preethi Venkatesan, Christos Kozyrakis, and Mark A. Horowitz. 2013. Convolution engine: balancing efficiency & flexibility in specialized computing. SIGARCH Comput. Archit. News 41, 3 (June 2013), 24-35. DOI=10.1145/2508148.2485925. URL: http://doi.acm.org/10.1145/2508148.2485925

[38]      Zidong Du, Robert Fasthuber, Tianshi Chen, Paolo Ienne, Ling Li, Tao Luo, Xiaobing Feng, Yunji Chen, and Olivier Temam. 2015. ShiDianNao: shifting vision processing closer to the sensor. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA '15). ACM, New York, NY, USA, 92-104. URL: http://doi.acm.org/10.1145/2749469.2750389

Heterogeneous Memory Systems (3)

[39]      Jason Power, Arkaprava Basu, Junli Gu, Sooraj Puthoor, Bradford M. Beckmann, Mark D. Hill, Steven K. Reinhardt, and David A. Wood. 2013. Heterogeneous system coherence for integrated CPU-GPU systems. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46). ACM, New York, NY, USA, 457-467. URL: http://dl.acm.org/citation.cfm?id=2540747

[40]      Jing Huang, Yuanjie Huang, Olivier Temam, Paolo Ienne, Yunji Chen, and Chengyong Wu. 2014. A low-cost memory interface for high-throughput accelerators. In Proceedings of the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES '14). ACM, New York, NY, USA, Article 11, 10 pages. DOI=10.1145/2656106.2656109. URL: http://doi.acm.org/10.1145/2656106.2656109

[41]      Praveen Yedlapalli, Nachiappan Chidambaram Nachiappan, Niranjan Soundararajan, Anand Sivasubramaniam, Mahmut T. Kandemir, and Chita R. Das. 2014. Short-Circuiting Memory Traffic in Handheld Platforms. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-47). IEEE Computer Society, Washington, DC, USA, 166-177. URL: http://dx.doi.org/10.1109/MICRO.2014.60

Scratchpads and Caches (4)

[42]      Rakesh Komuravelli, Matthew D. Sinclair, Johnathan Alsop, Muhammad Huzaifa, Maria Kotsifakou, Prakalp Srivastava, Sarita V. Adve, and Vikram S. Adve. 2015. Stash: have your scratchpad and cache it too. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA '15). ACM, New York, NY, USA, 707-719. URL: http://dl.acm.org/citation.cfm?id=2750374

[43]      Lluc Alvarez, Lluís Vilanova, Miquel Moreto, Marc Casas, Marc Gonzàlez, Xavier Martorell, Nacho Navarro, Eduard Ayguadé, and Mateo Valero. 2015. Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA '15). ACM, New York, NY, USA, 720-732. URL: http://doi.acm.org/10.1145/2749469.2750411

[44]      Snehasish Kumar, Arrvindh Shriraman, and Naveen Vedula. 2015. Fusion: design tradeoffs in coherent cache hierarchies for accelerators. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA '15). ACM, New York, NY, USA, 733-745. URL: http://doi.acm.org/10.1145/2749469.2750421

[45]      Farmahini-Farahani, A.; Nam Sung Kim; Morrow, K., "Energy-efficient reconfigurable cache architectures for accelerator-enabled embedded systems," in Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on, pp.211-220, 23-25 March 2014. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6844485&isnumber=6844447

Security (1+TBD)

[46]      Giovanni Russello, Arturo Blas Jimenez, Habib Naderi, and Wannes van der Mark. 2013. FireDroid: hardening security in almost-stock Android. In Proceedings of the 29th Annual Computer Security Applications Conference (ACSAC '13). ACM, New York, NY, USA, 319-328. DOI=10.1145/2523649.2523678. URL: http://doi.acm.org/10.1145/2523649.2523678