Cathal McCabe (Xilinx Ireland)

Title: PYNQ: Python productivity for Zynq

Abstract

PYNQ is an open-source framework that enables programmers who want to use embedded systems to exploit the capabilities of Xilinx Zynq All Programmable SoCs and MPSoCs. It allows users to exploit custom hardware in the programmable logic without having to use ASIC-style CAD tools. Instead, the target device is programmed in Python, and the code is developed and tested directly on the embedded system. The programmable logic circuits are imported as hardware libraries and programmed through their APIs, in essentially the same way that software libraries are imported and programmed.

The framework combines four main elements: (1) the use of a high-level productivity language, Python in this case; (2) Python-callable hardware libraries based on FPGA overlays; (3) a web-based architecture incorporating the open-source Jupyter Notebook infrastructure served from Zynq's embedded processors; and (4) Jupyter Notebook's client-side web apps. The result is a web-centric programming environment that enables software programmers to work at higher levels of design abstraction and to re-use both software and hardware libraries.

This talk will introduce the PYNQ project and explore the latest developments, including support for the next-generation Zynq UltraScale+ heterogeneous MPSoC and new additions to the PYNQ framework.


Cathal has an MEng in Electrical and Electronic Engineering from Queen's University Belfast.

Before joining Xilinx, he was a senior engineer in the Science and Technology Facilities Council (STFC) in the UK where he was the Europractice manager for FPGA, Embedded, and ESL design tools and flows. He was responsible for supporting universities across Europe in the use of advanced microelectronic design flows.

Cathal is currently a senior applications engineer in the Xilinx CTO department, where he manages the Xilinx University Program in EMEA.

He is responsible for supporting adoption of the latest Xilinx technologies for teaching and research in thousands of universities across the region, and also manages the XUP donation program, providing academics with access to Xilinx hardware platforms and software.

Cathal has also been a key developer on the PYNQ project, which makes it easier for programmers to use Xilinx All Programmable Zynq SoCs and MPSoCs.

Prof Philippe Coussy (Univ de Bretagne Sud)

Title: Energy-Efficient Reconfigurable Accelerators for Ultra-low Power Processing

Abstract

In this talk, a fresh look at Coarse Grained Reconfigurable Arrays (CGRAs) as ultra-low power accelerators for near-sensor processing will be given. A general-purpose Integrated Programmable-Array accelerator (IPA) will be presented. It exploits a novel architecture, execution model, and compilation flow for application mapping that can handle kernels containing complex control flow without the significant energy overhead incurred by state-of-the-art predication approaches. To optimize performance and energy efficiency, the IPA architecture will be explored with the help of the flexible compilation flow, with a special focus on shared memory access. The proposed accelerator achieves an average energy efficiency of 1617 MOPS/mW operating at 100 MHz and 0.6 V in 28 nm UTBB FD-SOI technology over a wide range of near-sensor processing kernels. This is an energy-efficiency improvement of up to 18× (9.23× on average) and a speed-up of up to 20.3× (9.7× on average) compared to a core specialized for ultra-low power near-sensor processing.
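As a rough aid to reading the headline figure, MOPS/mW can be converted to energy per operation. The sketch below is only unit arithmetic on the number quoted in the abstract; the function name is illustrative and is not part of the IPA work.

```python
def mops_per_mw_to_pj_per_op(mops_per_mw):
    """Convert an efficiency in MOPS/mW to energy per operation in picojoules.

    1 MOPS/mW = 1e6 ops/s per 1e-3 W = 1e9 operations per joule.
    """
    ops_per_joule = mops_per_mw * 1e6 / 1e-3
    return 1e12 / ops_per_joule  # 1 J = 1e12 pJ


# Figure quoted in the abstract: 1617 MOPS/mW on average.
energy_pj = mops_per_mw_to_pj_per_op(1617)
print(f"{energy_pj:.3f} pJ/op")  # prints 0.618 pJ/op
```

In other words, the quoted average efficiency corresponds to roughly 0.6 pJ per operation.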


Philippe Coussy is a full professor in the Lab-STICC (UMR CNRS) at the Université de Bretagne-Sud, France, where he leads the “Design of Advanced Architectures” group. He graduated from Université Pierre et Marie Curie (MSc, 1999) and Université de Bretagne-Sud (Ph.D., 2003; Habilitation, 2011). He has been an elected member of the Design and Implementation of Signal Processing Systems (DISPS) technical committee of the IEEE Signal Processing Society since 2011. His research interests include system-level and computer-aided design, high-level synthesis, and Coarse Grained Reconfigurable Architectures. He has organized several workshops and tutorials at many international conferences, including DAC, DATE, CODES+ISSS and ASP-DAC. He was guest editor for several special issues of scientific journals and co-editor of two books (Springer). He regularly serves as a national and international scientific expert, participates as a PC member in many international ACM/IEEE conferences, and reviews for major IEEE/ACM journals. He was Associate Editor of the IEEE Signal Processing Letters for the design and implementation of signal processing systems.

Dr. Zied El Marrakchi (Mentor Graphics, Tunisia)

Title: Multi-FPGA based Prototyping for SoC validation


Abstract

Software has come to dominate system-on-chip (SoC) development. It is increasingly common for the software effort to be on the critical path of the project schedule. Only FPGA-based prototyping provides both the speed and accuracy necessary to develop and validate complex software integration prior to silicon. The exciting benefits of an FPGA-based prototype are:

  • Quick fine tuning of hardware/software integration and software validation pre-silicon
  • In-system device validation with real-time interfaces and in end application
  • Extended register transfer level (RTL) testing and debugging

Prototyping next-generation SoCs, which contain more functionality than the capacity offered by a single FPGA, means spreading that functionality across multiple FPGAs, leading to challenging partitioning and timing closure issues. Traditional prototyping solutions manage device under test (DUT) partitioning either at the RTL level or at the gate level, but they fail to offer a predictable and efficient flow that would allow the FPGA-based SoC prototype to be brought up quickly.

VPS (Veloce Prototyping System) tackles FPGA-based prototyping challenges and offers SoC designers an innovative methodology unifying the benefits of gate-level partitioning and RTL partitioning, providing a short, automated, and predictable path to prototype.

Multi-FPGA partitioning is a complex optimization problem that must handle multiple constraints and concurrent objectives. The partitioning challenges that have to be overcome to make FPGA-based prototyping effective are:

  • Heterogeneous FPGA logic resources management
  • Unbalanced interconnect management and pin multiplexing
  • Timing closure issues and timing constraints generation
  • Incremental flow for fast turnaround
  • Full system verification and simulation
  • Bug hunting methodologies
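To make the partitioning problem concrete, here is a minimal toy sketch. It is not the VPS flow or any algorithm from the talk: it is a simple greedy heuristic, and the block names, sizes, capacity, and function names are all hypothetical. Each block goes to the FPGA that already holds most of its connected neighbours, subject to a logic-capacity limit, and the nets that end up crossing FPGAs are the ones that would consume inter-FPGA pins.

```python
from collections import defaultdict


def greedy_partition(block_sizes, nets, capacity, num_fpgas):
    """Toy greedy partitioner (illustrative only).

    block_sizes: dict mapping block name -> logic resources it needs
    nets: list of tuples of block names that are connected together
    capacity: logic resources available in each FPGA
    """
    neighbours = defaultdict(set)
    for net in nets:
        for block in net:
            neighbours[block].update(b for b in net if b != block)

    used = [0] * num_fpgas
    assignment = {}
    # Place the largest blocks first; ties are broken by name for determinism.
    for block in sorted(block_sizes, key=lambda b: (-block_sizes[b], b)):
        size = block_sizes[block]
        best, best_key = None, None
        for fpga in range(num_fpgas):
            if used[fpga] + size > capacity:
                continue  # block does not fit in this FPGA
            # Prefer the FPGA holding more neighbours, then the emptier one.
            placed_here = sum(
                1 for n in neighbours[block] if assignment.get(n) == fpga
            )
            key = (placed_here, -used[fpga])
            if best_key is None or key > best_key:
                best, best_key = fpga, key
        if best is None:
            raise ValueError(f"block {block!r} does not fit in any FPGA")
        assignment[block] = best
        used[best] += size
    return assignment


def cut_nets(assignment, nets):
    """Return the nets whose blocks are spread over more than one FPGA."""
    return [net for net in nets if len({assignment[b] for b in net}) > 1]


# Hypothetical design: five blocks, two FPGAs with 100 resource units each.
blocks = {"cpu": 60, "gpu": 50, "dsp": 40, "ddr": 30, "usb": 10}
nets = [("cpu", "ddr"), ("cpu", "usb"), ("gpu", "dsp"), ("cpu", "gpu")]

placement = greedy_partition(blocks, nets, capacity=100, num_fpgas=2)
crossing = cut_nets(placement, nets)
print(placement)
print(crossing)
```

In this example only the net between the hypothetical `cpu` and `gpu` blocks crosses the FPGA boundary. A real flow must additionally handle heterogeneous resources, timing constraints, and pin multiplexing when the crossing nets outnumber the available pins, none of which this sketch attempts.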


  • 2002: Electrical Engineering degree from National Engineering School of Tunis (ENIT)
  • 2003: Microelectronics Master's degree from SupElec Rennes
  • 2008: PhD from Paris 6 (UPMC): “FPGA Architecture Exploration and EDA Tools Development”
  • 2009: Creation of “FlexRAS Technologies” Start-up: Partitioning Tools development for Multi-FPGA based Prototyping Systems
  • 2015: “FlexRAS Technologies” is acquired by Mentor Graphics
  • 2017: Creation and Management of Mentor Graphics Subsidiary in Tunis

Prof Suhaib A Fahmy (Univ of Warwick, UK)

Title: FPGA Overlays: Enhancing Abstraction and Productivity

Abstract

FPGAs are finding more widespread use, including a recent explosion in datacenter contexts. Custom processing architectures are enabling a significant improvement in performance and computational efficiency, but come at the cost of high design complexity. Despite efforts to raise the abstraction level of front-end design tools, the back-end tools continue to dominate design times, make testing more complex, and limit portability. Overlays address these challenges by building a coarse-grained programmable architecture on top of the FPGA and using higher-level tools to compile to it. We will review work which demonstrates that an architecture- and application-centric design approach can overcome the significant overheads that have traditionally accompanied overlays. We will then review recent work on overlay architectures and design tools, including an approach built around OpenCL. Finally, we will explore some of the areas where overlays may offer an ideal programming and architecture abstraction.


Suhaib Fahmy leads the Connected Systems Research Group in the School of Engineering at Warwick and is a Turing Fellow with The Alan Turing Institute. He graduated with an MEng and PhD from Imperial College London in 2003 and 2008, respectively, followed by time with Trinity College Dublin and Xilinx Research Labs, and 6 years with Nanyang Technological University in Singapore. His research explores the use of reconfigurable systems in domains including communications, cyber-physical systems, and automotive networks. Dr Fahmy received the Best Paper Award at FPT 2012, the IBM Faculty Award in 2013 and 2017, and the Community Award at FPL 2016. He serves on the technical program committees for a number of prestigious conferences in the area of reconfigurable computing, actively reviews for many journals in related areas, and sits on the ACM Technical Committee on FPGAs.