The microprocessor technology roadmap predicts a future with tens to hundreds
of processors per chip and beyond, but with limited clock frequency
improvements and likely simpler individual processors. Faced with the
corresponding demise of sequential program performance, the software industry
is compelled to parallelize existing software by introducing threads and
synchronization to target these multicore processors. None feels the pinch
more dramatically than the FPGA companies, whose CAD software (i) must
manipulate hardware designs that are themselves growing with Moore's law, but
(ii) is composed of a large number of sequential algorithms. While progress
has been made on parallelizing the most crucial algorithms (at great expense),
future parallelization efforts will require a more cost-effective approach.
Transactional Memory (TM) promises an easier, optimistic alternative to locks
for critical sections---allowing programmers to avoid deadlock and other bugs
when synchronizing code, and also allowing critical sections to execute in
parallel whenever they operate on independent data. In this talk I will
summarize our recent work on using TM to parallelize simulated-annealing-based
placement for FPGAs---in particular, we used a software TM (TinySTM) to
parallelize the placement phase of Versatile Place and Route (VPR) 5.0.2.
Using TM allowed us to very quickly produce a parallel and correct version of
the software, and to then focus on incrementally tuning performance. I will
describe our experiences in tuning the STM and CAD software, and the
interesting algorithmic trade-offs that exist in this application area. In the
end we find that transactionalized placement has the potential for scalable
performance, but that hardware support for TM is likely required to overcome
overheads.
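
As a rough illustration of the optimistic idea described above, the following
Python sketch models a swap move between two placement sites as a transaction.
It is a toy model written for this summary, not TinySTM's API and not VPR code:
the names VersionedCell, Transaction, atomically, and swap_blocks are all
hypothetical, and a single global commit lock stands in for the finer-grained
mechanisms a real STM would use. Each transaction records the versions of the
locations it reads, buffers its writes, and commits only if nothing it read has
changed; otherwise it retries.

```python
import threading


class VersionedCell:
    """One shared location (e.g. a placement site) with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    """Toy optimistic transaction: buffered writes, validation at commit."""

    # Assumption: one global lock serializes commits to keep the sketch short.
    _commit_lock = threading.Lock()

    def __init__(self):
        self.reads = {}   # cell -> version observed on first read
        self.writes = {}  # cell -> value to publish on commit

    def read(self, cell):
        if cell in self.writes:
            return self.writes[cell]
        # Record the version before reading the value, so a concurrent
        # commit to this cell is detected during validation.
        self.reads.setdefault(cell, cell.version)
        return cell.value

    def write(self, cell, value):
        self.writes[cell] = value

    def commit(self):
        with Transaction._commit_lock:
            # Abort if any location read has since been modified.
            if any(cell.version != seen for cell, seen in self.reads.items()):
                return False
            for cell, value in self.writes.items():
                cell.value = value
                cell.version += 1
            return True


def atomically(body):
    """Run body(tx) repeatedly until its transaction commits."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result


def swap_blocks(tx, site_a, site_b):
    """Exchange the blocks held by two sites inside a transaction."""
    block_a = tx.read(site_a)
    block_b = tx.read(site_b)
    tx.write(site_a, block_b)
    tx.write(site_b, block_a)


def demo():
    """Two threads swap disjoint pairs of sites, so neither needs to abort."""
    sites = [VersionedCell(f"block{i}") for i in range(4)]
    threads = [
        threading.Thread(
            target=atomically,
            args=(lambda tx: swap_blocks(tx, sites[0], sites[1]),),
        ),
        threading.Thread(
            target=atomically,
            args=(lambda tx: swap_blocks(tx, sites[2], sites[3]),),
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [s.value for s in sites]
```

Swaps that touch disjoint sites validate independently, while two swaps that
share a site conflict and one of them retries. The bookkeeping on every read
and write in this sketch also hints at where software TM overheads come from.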
Biography:
Greg Steffan is an Associate Professor in the Department of Electrical and Computer Engineering at the University of Toronto. His expertise is in computer architecture and compilers, and his research currently targets methods of exploiting parallelism in multicore processors and FPGAs. Greg completed his doctorate at Carnegie Mellon University in 2003, after undergraduate and Master's degrees at the University of Toronto in 1995 and 1997, respectively. He received the Ontario Ministry of Research and Innovation Early Researcher Award (2007), was a Siebel Scholar (2002), and is an IBM CAS Faculty Fellow and a senior member of IEEE and ACM.