ECE 751 Project Topic Suggestions and Ideas

For the 751 course project, you need to form a team of 4 students to conduct a focused research-oriented project during the second half of the semester. Given the focus in the course on hardware and software co-designed solutions for embedded applications, each project should include both hardware and software implementation and evaluation targeted at an embedded application or application domain.

Requirements:

· You need to work in teams of approximately 4 students. Smaller or larger teams must adjust the scope of their work to match the size of the team.

· If your proposed project does not include both a hardware and a software component, you must discuss this with me before the proposals are due, and must justify that your project still maintains sufficient scope and is relevant to this course.

· You must submit a written project proposal of up to two pages through the learn@uw dropbox by October 13, 2015. The proposal must include the names of all team members, a summary of the proposed topic and a research plan that outlines how you will accomplish your goals.

· You must submit a 2-3 page progress report by midnight on November 17, using the learn@uw dropbox. This must report your current progress on achieving the goals you set in your initial proposal, and how you will recover or reset your goals if you are falling behind.

· Project findings will be presented orally during class time on Dec. 1 and Dec. 3.

· Written project reports that fully document your activities and findings are due on Dec. 15, 2015 using the learn@uw dropbox. Report guidelines are provided here.

· The project report must also include a statement of work that identifies the contributions of each individual on the team. This statement of work must reflect a team consensus and must be signed by all team members. I recommend that you structure this statement as a table with a row for each project milestone and a column for each team member, where each entry gives the percentage contribution of that team member to that milestone.

Here are some guidelines for the suggested default project topic and structure. You can deviate from these but make sure you discuss your plans with me well in advance of the Oct. 13 proposal deadline:

· Pick an important, performance- and/or power-constrained embedded application (some ideas are below, but this really is wide open, and it is up to you to find something you are interested in or familiar with), characterize the application to find where it spends most of its execution time, then propose a hardware accelerator for that task. The accelerator could be a new compound instruction with matching execution unit, or it could be a stand-alone accelerator, or it could be reconfigurable logic, or anywhere in between these options. Read the assigned papers carefully for ideas and directions. Next, design the hardware, rewrite the application to use the hardware, and build a performance and/or power model to evaluate your design.

· Note that given the scope required for the project, you will most likely have to cut some corners in your model development. That is to say, I do not expect a conference-paper-level evaluation of the hardware or the software. Of course, the more detailed, the better, but a cycle-accurate timing model coupled with an accurate power model is probably not a realistic goal for a semester project. More likely, you will develop a “spreadsheet” model that captures the time spent in the accelerated and non-accelerated portions of the application and accounts for all significant overheads (e.g. invoking the accelerator, communicating operands to it and results from it via some interconnect or through the memory hierarchy, etc.). The model then estimates performance benefits and power savings by subtracting the accelerated portion from the baseline and adding back in the estimates you have developed through hardware design, modeling, and analysis.

· Important embedded applications include any and all of the ones mentioned in the papers or in lecture, as well as many others. Your approach to accelerating an application need not be novel, and the application need not be novel. However, my standard for evaluating your efforts will be inversely proportional to the novelty of each. That is to say, I will apply the least stringent standard to a novel application (one not studied before) accelerated with a novel approach (one not previously proposed for other applications), since you will have the most difficulty in evaluating it. At the other extreme, if you repeat a prior proposal (pre-existing application with a previously-proposed accelerator), I will hold you to a much higher standard of evaluation. Note that what you choose to do is up to you: in many ways, repeating a prior proposal is much safer, since you can simply mimic the evaluation and ideas from the prior paper, and you should have a very good sense of where you will end up by the end of the semester. However, I encourage you to take a risk in this project, as you will be evaluated based on effort, not on results, and the probability that a novel application and/or novel approach can lead to a publication at a workshop or conference is, of course, much higher.

· Possible applications of interest include:

1. Media related: encoders, decoders, document rendering (PDF or HTML->printer, etc.)

2. Computer vision: object recognition, optical flow, image understanding, etc. Look at the OpenCV library as a starting point, pick a task, and devise an accelerator to streamline this task.

3. Mobile web browsing (JavaScript acceleration, HTML5)

4. Mobile UI acceleration (the user interface thread is a significant power consumer in smartphones).

5. Embedded sensing platforms. Look for scenarios where sensors require significant processing (typically some kind of signal processing) to interpret sensory data, or are extremely power-constrained (so must minimize power for analysis and/or for transmission to communicate sensed results wirelessly).

6. Software radio. For novel directions, look into cognitive radio (a combination of sensing the current RF environment and then adapting channel allocation, etc. to that context). What can be done in hardware (for less power) that is currently done in software in SDR?

7. GPS processing. Enabling GPS typically drains smartphone batteries very quickly. GPS signals are encoded using CDMA and require solving the triangulation problem to derive location. Typical implementations include a fairly high-performance DSP to perform these tasks. Can we accelerate these tasks and reduce power consumption? One interesting place to start looking at this might be the gnss-sdr project, which uses a cheap TV tuner USB card as the receiver for a software radio. By tuning it to the right (GPS) frequency, the raw signal is captured and the GPS signal can be decoded from it using GNU Radio running on a laptop.

8. Sensor fusion and compressed sensing. GPS power can be reduced by only triggering GPS samples when the accelerometer and/or gyroscope indicate movement. Look at an implementation of this kind of sensor fusion and find opportunities to accelerate for better performance or reduced power.

9. Dead reckoning. The combination of accelerometer data, gyroscope data, and compass data can be used to estimate location in the absence of a GPS signal (e.g. inside a building). Are there opportunities for acceleration in this domain?

10. Security. Preventing various side-channel attacks in a power/energy-efficient manner. Implementing security protocols and ciphers efficiently.

· Possible accelerator architectures include:

1. Accelerator architectures based on novel instructions and execution units, in the same vein as the Xtensa reading.

2. Standalone accelerators that are connected at the memory bus or to the cache hierarchy.

3. Reconfigurable accelerators based on FPGA technology.

4. Enhancements of existing approaches like VLIW.

5. Accelerators based on neural concepts (see readings [35] or [38]).

· Related possibilities that are somewhat vague but may give you some ideas:

1. Investigate challenges and opportunities of deploying near-threshold computing (NTC) in embedded scenarios.

2. Look at opportunities created by new memory technologies (e.g. phase-change memory, or even new 3D packaging like Micron’s Hybrid Memory Cube) for embedded systems.

3. Recent work in cache replacement policies may or may not be relevant for embedded applications. Study these with embedded workloads and devise new/better policies. Start by using the infrastructure from the 2009 contest: http://www.jilp.org/jwac-1/

4. Recent work in branch prediction may have relevance to embedded systems. Explore these using the 2010 or 2014 contest infrastructure and winning proposals: http://www.jilp.org/jwac-1/ http://www.jilp.org/cbp2014/

5. Recent work in DRAM memory schedulers may have relevance to embedded systems. Explore these using the 2012 contest infrastructure: http://www.cs.utah.edu/~rajeev/jwac12

6. Compression for caches, scratchpads, buses: are there new and unexplored possibilities in this domain? Arithmetic coding vs. Huffman coding? Etc.

· Note that you are largely on your own for tool support, so you should probably rely on pre-existing familiarity with simulation and synthesis tools from prior research or coursework (e.g. ECE 551, CS537, etc.), and assemble teams that have a diverse set of skills.

· We also have access to Xilinx Zynq FPGA boards, which include ARM cores and FPGA resources; these are an excellent platform for prototyping attached accelerator ideas.

These are just suggestions. I welcome your ideas; please speak with me after class or during office hours to refine them further. I prefer that you come up with your own ideas of what you are interested in.
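
As one illustration of the “spreadsheet” model described in the guidelines above, here is a minimal Python sketch. It is not a required format, and every name in it (AcceleratorEstimate, estimate, and all field names) as well as the example numbers are hypothetical. It assumes a single offloaded kernel, a host processor that draws no power while the accelerator runs, and overheads that are paid at the host's power level; adjust these assumptions to fit your own design.

```python
from dataclasses import dataclass


@dataclass
class AcceleratorEstimate:
    """Hypothetical inputs to a first-order 'spreadsheet' model."""
    baseline_time_s: float       # total runtime of the unmodified application
    accelerated_fraction: float  # share of baseline time spent in the offloaded kernel
    kernel_speedup: float        # estimated kernel speedup on the accelerator
    overhead_time_s: float       # invocation + operand/result transfer time, all calls
    baseline_power_w: float      # average power of the host processor
    accelerator_power_w: float   # estimated average power of the accelerator when active


def estimate(e: AcceleratorEstimate) -> dict:
    """Subtract the accelerated portion from the baseline, then add back
    the accelerator's estimated time, its overheads, and its energy."""
    if not 0.0 <= e.accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if e.kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")

    kernel_time = e.baseline_time_s * e.accelerated_fraction
    remaining_time = e.baseline_time_s - kernel_time   # non-accelerated portion
    accel_time = kernel_time / e.kernel_speedup        # kernel on the accelerator
    new_time = remaining_time + accel_time + e.overhead_time_s

    baseline_energy = e.baseline_time_s * e.baseline_power_w
    # Assumption: the host is idle (zero power) while the accelerator runs,
    # and overheads are charged at the host's power level.
    new_energy = ((remaining_time + e.overhead_time_s) * e.baseline_power_w
                  + accel_time * e.accelerator_power_w)

    return {
        "new_time_s": new_time,
        "speedup": e.baseline_time_s / new_time,
        "new_energy_j": new_energy,
        "energy_saving": 1.0 - new_energy / baseline_energy,
    }


if __name__ == "__main__":
    # Made-up example numbers, for illustration only.
    example = AcceleratorEstimate(
        baseline_time_s=10.0,
        accelerated_fraction=0.6,
        kernel_speedup=10.0,
        overhead_time_s=0.5,
        baseline_power_w=2.0,
        accelerator_power_w=0.5,
    )
    result = estimate(example)
    print(f"speedup: {result['speedup']:.2f}x")
    print(f"energy saving: {result['energy_saving']:.1%}")
```

With these made-up inputs, the sketch gives an overall speedup of about 1.96x and an energy saving of about 53.5%, even though the kernel itself runs 10x faster. This shows why the overheads and the non-accelerated portion need to be accounted for explicitly.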