
A Dynamic Reconfigurable Processor and a Design Tool for the Next Generation ECUs

Norifumi Yoshimatsu, Trouve Antoine, Takayuki Kando


Institute of Systems, Information Technologies and Nanotechnologies (ISIT) Fukuoka, Japan nyoshimatsu@isit.or.jp

Kazuaki Murakami
Kyushu University, Institute of Systems, Information Technologies and Nanotechnologies (ISIT) Fukuoka, Japan murakami@ait.kyushu-u.ac.jp

Abstract: Over the last two decades, electronic control units (ECUs) have become key components in automobiles. The market has consequently become very competitive and demanding in terms of both required functionality and time to market (TTM). ECU design has therefore become very difficult, and manufacturers must find new technical solutions. In this context, we believe that dynamic reconfigurable processors are a pertinent solution for forthcoming ECU designs. They can provide an efficient all-software design flow from high-level languages, as well as higher performance and lower power consumption than general purpose processors. As an illustration, this paper introduces a dynamic reconfigurable processor along with a high-level synthesis design tool.

Keywords: dynamic reconfigurable processor, retargetable compiler

I. INTRODUCTION

Modern vehicles are expected to offer consumers new functionality that is beyond the capabilities of traditional mechanical approaches. As a consequence, the automotive industry relies more and more on electronic control units (ECUs), which have become central components in cars to enhance passenger safety, provide greater comfort and improve environmental friendliness. However, under pressure from ever-growing needs, ECUs must support increasingly complex functionality. Furthermore, the increase in their number and the growing complexity of their interconnection network raise real estate issues. Hence, while on the one hand the required design effort increases dramatically, on the other hand market constraints are getting tighter, reducing the acceptable time to market (TTM) and raising the pressure on ECU designers. Moreover, manufacturers must leave headroom in performance, as ECUs may be updated after shipment to fix newly found bugs and add features.

ECU solutions based on general purpose processors (GPPs) provide a development flow from high-level programming languages, which allows painless design of complex applications. As a consequence, they also offer a short TTM as well as easy maintenance, even after shipment. Such solutions are also inexpensive, as GPPs are usually off-the-shelf components. They are, however, not competitive at all in terms of power efficiency. In particular, even if they can meet very high performance constraints, some processing-intensive applications present in ECUs would cause a typical GPP to dissipate dramatically more heat. Since the automotive environment already imposes high working temperatures, the use of GPPs is restricted to non-critical applications located far from the vehicle's hot spots.

In this context, application specific integrated circuits (ASICs) are usually used for applications that require a processing speed which general purpose processors cannot deliver within a reasonable power consumption and thermal budget. Although ASICs are the best solution in terms of computational speed versus power consumption, their development implies significant non-recurring engineering (NRE) costs, which can only be amortized over very large production volumes. In addition, applications in an ASIC are almost completely implemented in hardware, which prevents any update once the chips have been produced.

Reconfigurable devices address this last issue while still delivering good power efficiency. They are consequently gaining the attention of car manufacturers. Reconfigurable devices are able to change their own functionality by updating embedded configuration data. Furthermore, like ASICs, they can provide higher performance than GPPs while consuming less power, thanks to the per-application optimization that they enable. One should distinguish between static and dynamic reconfigurable devices. The former set their configuration only when the system boots, whereas the latter can set their configuration while the system is running. Hence, dynamic reconfigurable devices make it possible to fragment the configuration data and load each fragment only when it is needed. This approach increases the functionality that can be provided per unit of die area.

However, dynamic configuration can be a time-consuming process and may become a performance bottleneck, which must be addressed during the design of both the dynamically reconfigurable chip and the ECU program. Such design issues can nevertheless be overcome, as illustrated in [1] and [2], which succeed in integrating several ECU functionalities into a single dynamically reconfigurable device.

978-1-4244-5035-0/09/$26.00 2009 IEEE


ISOCC 2009

Like GPPs, application specific instruction set processors (ASIPs) run programs as sequences of instructions. However, while GPPs are designed to run all applications with average efficiency through a basic instruction set architecture (ISA), ASIPs improve performance for a given target by adding application-specific instructions. Such instructions are determined according to the requirements of a designated application (or group of applications), and usually take advantage of specialized processing units, data-processing parallelism or a particular memory architecture. Common ASIPs such as [10] are not reconfigurable, and their specific instructions are set at design time: the target application is fixed once the chip is fabricated (the term configurable processor is sometimes used). Reconfigurable ASIPs such as [3,4,5] gain flexibility by using reconfigurable devices to implement application-specific instructions: the ISA becomes dynamically defined. In such a processor, the basic ISA is fixed in hardware (as in a GPP) and application-specific instructions are executed on a reconfigurable block.

We developed a design tool for dynamically reconfigurable ASIPs which automatically decides which specific instructions to generate from an implementation of the target application in a high-level programming language such as C. It removes the hardware development step from ECU design that is usually required when ASICs and FPGAs are used. This paper also introduces a reconfigurable processor as an application example for this tool. By presenting a development flow for reconfigurable ASIPs, it shows that such chips can be an important means of facilitating ECU development in the automotive field.

The remainder of this paper is organized as follows. Section II briefly describes the ECU development process, and section III introduces a dynamic reconfigurable processor architecture that we developed. Our design tool for dynamic reconfigurable processors is then presented in section IV. Section V illustrates the efficiency of our tools and hardware using test programs intended for automotive applications.

II. MODEL-BASED DESIGN

In ECU development, model-based design (MBD) is widely adopted to address the issues described above [6,7,8,9]. In MBD, applications are described at different levels of abstraction, each focusing on different testing goals. These layers are then ordered and studied iteratively in the design refinement process, which makes it possible to detect bugs and issues as early as possible and to avoid unnecessary development. Fig. 1 shows the refinement process typically observed in ECU development in the context of MBD. In the specification design step, the performance objectives and the functionalities to be implemented are defined. In the algorithm design step, the algorithms themselves are designed to meet the specifications defined in the previous step. In the software design step, the algorithms designed previously are implemented in a programming language, without considering hardware timing issues or constraints related to the target hardware. Those constraints are taken into account in the object code design step, which aims at compiling the software for the actual target hardware. While the previous steps can be carried out using simulators, the last design step, implementation, requires testing the software object code against the real hardware.

Specification -> Algorithm design -> Software design -> Object code design -> Implementation

Fig. 1 Design refinement step

The use of software solutions in MBD removes any hardware-related design and notably simplifies the whole ECU design process. In particular, the hardware/software partitioning step is reduced to a minimum. Hardware/software partitioning consists in identifying which parts of the application require a hardware implementation to reach high performance, and which parts can simply be executed on less specific hardware (usually a GPP). This step is carried out in the early stages of the design process, and any mistake can lead to significant delay, since performance issues are usually found only during the implementation step.

III. VULCAN

We developed a dynamic reconfigurable processor called Vulcan and a design tool for it [3]. Vulcan integrates a reconfigurable device into a processor core as a reconfigurable datapath (RDP). Fig. 2 shows a block diagram of Vulcan's datapath and register file. The processor core adopts a standard 32-bit RISC (Reduced Instruction Set Computer) ISA as its base instruction set, together with a five-stage pipeline. The reconfigurable datapath consists of a network of processing elements (PEs). Each configuration of the RDP corresponds to an application-specific instruction which extends the basic RISC ISA. As described in section IV, such configurations can be generated automatically by our design tool. During program execution, the RDP is then used as one of the execution units of the processor core. By taking advantage of parallelism or of the specialization of its PEs, the RDP can execute some operations more efficiently than the basic instructions can.

Dynamic reconfigurable processors come with an overhead in execution time and chip area compared to the corresponding GPP. The time overhead mainly consists of the configuration time of the reconfigurable block itself. In Vulcan, custom instructions are implemented using a pair of instructions: prcf and ci. The prcf instruction loads a configuration into the RDP, while the subsequent ci instruction executes the configured operation. Separating the configuration and execution steps allows the overhead due to reconfiguring the RDP to be finely managed and, to some extent, reduced. These instructions are defined in addition to the basic RISC instructions and are part of the base ISA: they are generated by our design tool when compiling the C sources of an application. Both instructions are encoded in 32 bits, following the RISC ISA format, to limit the impact on the instruction decoder of the RISC processor compared with the corresponding GPP.
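The benefit of separating configuration (prcf) from execution (ci) can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions: the class `ToyRdp`, the two helper functions and the cycle costs (20 cycles per prcf, 1 cycle per ci) are hypothetical and are not taken from Vulcan or its simulator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyRdp:
    """Hypothetical model of a reconfigurable datapath (not Vulcan's real timing)."""
    config_cycles: int = 20        # assumed cost of one prcf
    exec_cycles: int = 1           # assumed cost of one ci
    loaded: Optional[str] = None   # name of the configuration currently loaded
    cycles: int = 0                # running cycle count

    def prcf(self, name: str) -> None:
        """Load configuration `name` into the datapath."""
        self.loaded = name
        self.cycles += self.config_cycles

    def ci(self) -> None:
        """Execute the custom instruction that is currently configured."""
        if self.loaded is None:
            raise RuntimeError("ci issued before any prcf")
        self.cycles += self.exec_cycles


def reconfigure_every_time(n: int) -> int:
    """Issue a prcf before every ci, as if the two steps were fused."""
    rdp = ToyRdp()
    for _ in range(n):
        rdp.prcf("ISA14")
        rdp.ci()
    return rdp.cycles


def configure_once(n: int) -> int:
    """Hoist the prcf out of the loop and reuse the loaded configuration."""
    rdp = ToyRdp()
    rdp.prcf("ISA14")
    for _ in range(n):
        rdp.ci()
    return rdp.cycles
```

With these assumed costs, 100 executions take 2100 cycles when the datapath is reconfigured each time and 120 cycles when it is configured once, which is the kind of saving that a separate prcf instruction makes available to the compiler.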
[Figure: block diagram of the processor core (register file, datapath) and the reconfigurable datapath (rows of PEs interconnected by networks)]

Fig. 2 Vulcan Datapath Block Diagram

Fig. 3 (Code Example) shows an example of a Vulcan instruction sequence. The prcf instructions at lines 1 and 7 configure the RDP. The corresponding custom instructions are then executed at lines 4 and 9, respectively.

    1: prcf  ISA14
    2: lb    $8, 0x0($4)
    3: move  $2, $8
    4: cig1  ($3, $2), $2, $6, $4, $10
    5: move  $5, $2
    6: move  $4, $3
    7: prcf  ISA15
    8: lw    $2, 0x0($2)
    9: cig1  ($3, $2), $0, $6, $5, $2

Fig. 3 Code Example

IV. DESIGN TOOL

In order to reduce the TTM, it is desirable to design applications in a high-level language rather than to carry out time-consuming hardware development at the RTL level. In the context of ECU development, where market constraints are very tight, this is a necessity. We developed ISAcc [3], a design tool which automatically generates application-specific instructions for reconfigurable ASIPs from a high-level programming language such as C. Thanks to ISAcc, the hardware development that is normally required when using either ASICs or FPGAs can be removed completely from the ECU design process.

Fig. 3 (Overview of Compilation Flow) shows the code generation flow using ISAcc. The application's C source is parsed and converted into internal data structures by the compiler frontend. The ISA generator searches these internal data structures for customizable parts, using an instruction library which consists of the primitive elements of custom instructions. As a last step, the compiler backend generates the object code (including prcf and ci instructions), while place & route generates the configuration data for the RDP corresponding to the application-specific instructions. The resulting performance depends on the nature of the PEs, which is why we made ISAcc retargetable: a new target RDP is defined through dedicated modular hardware libraries.

[Figure: ISAcc takes the C source through the Compiler Frontend, ISA Generator, Place & Route and Compiler Backend, with the Instruction Library and the Hardware Definition Library as inputs, and outputs Configuration Data and Object Code]

Fig. 3 Overview of Compilation Flow

V. PERFORMANCE EVALUATION

To illustrate the potential of reconfigurable hardware for ECU design, this section presents results obtained with ISAcc and Vulcan on two mainstream applications. Two different PE types are used for the evaluation, as well as different RDP sizes, as shown in Table 1. A cycle-based simulator is used to measure the execution time.

Table 1 Size of RDP
Type of PE    Number of PE
LUT           132
ALU           16

Fig. 4 shows the performance evaluation results. Speedup is calculated relative to the Vulcan processor core without the RDP. The test applications are two mainstream algorithms. CRC performs a cyclic redundancy check on random input data. Its main loop makes heavy use of bitwise operations, which explains why the LUT array performs better than the ALU array. FFT performs a fast Fourier transform. It relies mostly on arithmetic operations, and the ALU array yields a larger performance improvement than the LUT array. Note that both applications use integer or fixed-point data.
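To make the remark about bitwise operations concrete, the following sketch shows a bit-serial CRC-32 whose inner loop consists only of shifts, masks and XORs. It assumes the widely used reflected CRC-32 polynomial 0xEDB88320 (the variant implemented by `zlib.crc32`); the paper does not state which CRC variant or which source code was used for the evaluation, so this is an illustration rather than the benchmarked program.

```python
import zlib

# Assumption: reflected CRC-32 polynomial, as used by zlib and Ethernet.
POLY = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """Bit-serial CRC-32: one shift, one test and at most one XOR per input bit."""
    crc = 0xFFFFFFFF                      # initial value
    for byte in data:
        crc ^= byte                       # fold the next byte into the low bits
        for _ in range(8):                # process the byte bit by bit
            if crc & 1:
                crc = (crc >> 1) ^ POLY   # shift and apply the polynomial
            else:
                crc >>= 1                 # shift only
    return crc ^ 0xFFFFFFFF               # final inversion


def matches_zlib(data: bytes) -> bool:
    """Cross-check the sketch against the standard library implementation."""
    return crc32_bitwise(data) == zlib.crc32(data)
```

Every operation in the inner loop is a bit-level shift, mask or XOR, which is the kind of logic that maps naturally onto look-up tables, whereas an FFT butterfly is dominated by multiplications and additions.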

[Figure: bar chart of speedup (vertical axis from 0 to 3) for CRC and FFT; the surviving data labels are 1.7 and 2.7]

Fig. 4 Performance


For reference, Table 2 shows the number of custom instructions (CIs) that ISAcc generated automatically from the C source code.

Table 2 Number of custom instructions (CI)
Application    Type of PE    Number of CI
CRC32          LUT           12
FFT            ALU           27

VI. CONCLUSION

Reconfigurable ASIPs are able to provide a high level of flexibility and functionality, as well as higher performance and lower power consumption than general purpose processors. Furthermore, with design tools like ISAcc, the resulting TTM is comparable to that of general purpose processors. ISAcc is a tool for dynamic reconfigurable processors which takes as input an application written in a high-level programming language such as C, and automatically generates application-specific instructions as well as object code. With such a flow, hardware design can be removed entirely from the ECU design process, dramatically reducing costs. In this context, we believe that reconfigurable processors are a realistic and attractive approach to addressing many issues in ECU development.

REFERENCES
[1] J. Becker, M. Hübner, G. Hettich, R. Constapel, J. Eisenmann, and J. Luka, "Dynamic and Partial FPGA Exploitation," Proc. IEEE, special issue: advanced automobile technologies, vol. 95, no. 2, pp. 438-452, Feb. 2007.
[2] Katarina Paulsson, Michael Hubner, Markus Jung, and Jurgen Becker, "Methods for Run-time Failure Recognition and Recovery in dynamic and partial Reconfigurable Systems Based on Xilinx Virtex-II Pro FPGAs," Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures, p. 159, March 02-03, 2006.
[3] Victor M. Goulart Ferreira, Lovic Gauthier, Takayuki Kando, Takuma Matsuo, Toshihiko Hashinaga, and Kazuaki Murakami, "REDEFIS: a system with a redefinable instruction set processor," Proceedings of the 19th annual symposium on Integrated circuits and systems design, pp. 14-19, Aug. 2006.
[4] Vassiliadis Stamatis, Soudris Dimi, "Fine- and Coarse-Grain Reconfigurable Computing."
[5] T. Kodama, T. Tsunoda, M. Takada, H. Tanaka, Y. Akita, M. Sato, and M. Ito, "Flexible engine: A dynamic reconfigurable accelerator with high performance and low power consumption," Proc. COOL Chips IX, pp. 393-408, April 2006.
[6] Sudhir Sharma, Wang Chen, "Using Model-Based Design to Accelerate FPGA Development for Automotive Applications," SAE World Congress & Exhibition, April 2009.
[7] Edwards S., Lavagno L., Lee E.A., Sangiovanni-Vincentelli A., "Design of embedded systems: formal models, validation, and synthesis," Proceedings of the IEEE, vol. 85, no. 3, pp. 366-390, Mar. 1997.
[8] H. Shokry and M. Hinchey, "Model-Based Verification of Embedded Software," IEEE Computer, April 2009.
[9] Kazuaki Murakami, Norifumi Yoshimatsu, Pradeep Rao, Shigeru Oho, Satoshi Shimada, "Towards the Creation of an ECU Model Exchange Market," Proceedings of ICROS-SICE International Joint Conference 2009, Aug. 2009.
[10] http://www.tensilica.com
