Design of an FPGA-based radiation tolerant agent for WorldFIP fieldbus

CERN makes extensive use of the WorldFIP fieldbus interface in the LHC and in other accelerators of the pre-injector chain. Following the decision of the component provider to stop development in this field, and foreseeing potential problems with subsequent support, CERN decided to purchase the design information of these components and in-source future developments using this technology. The first in-house design is a replacement for the microFIP chip, whose last version was manufactured in an IC feature size found to be more vulnerable to radiation from high-energy particles than the previous versions. nanoFIP is a CERN design based on a Flash FPGA. It implements a subset of the functionality allowed by the communication standard, fits the requirements of the different users and includes robustness against radiation as a design constraint. The development presented involved several groups at CERN working together in the framework of the Open Hardware Repository collaboration, with the aim of maximizing the interoperability and reliability of the final product.


Introduction
WorldFIP is an industrial fieldbus used at CERN for a large number of control systems. Currently there are more than 10,000 WorldFIP agents installed along the LHC accelerator and its pre-injectors. A significant factor in the selection of this technology in 2001 was its tolerance to radiation. The company Alstom was the main provider of the components and in particular of the integrated circuit performing the agent function on the bus, the microFIP chip. Unfortunately the latest version of this component was observed to be more sensitive to radiation than the earlier types. At the same time Alstom announced its decision to discontinue development of this technology and to phase out support. Consequently CERN decided to acquire the knowledge from Alstom and develop a replacement for microFIP in-house. The outcome is nanoFIP, an FPGA-based WorldFIP agent that includes tolerance to radiation as a design constraint.

A suitable communication protocol

General features
WorldFIP fits in the profile family 5 of the IEC 61784 standard [1] and complies with IEC 61158 [2]. It provides deterministic network communication since the time needed for any process value to be updated in the control system is pre-determined. The access to the bus is controlled by a central Bus Arbitrator (BA) that grants bus access to the different agents following the sequence of a pre-configured table. Hence each station has guaranteed access to the bus at pre-defined time intervals. The protocol offers support for cyclic communication (such as process data), events (such as alarms) and messages (for example for configuration parameters). This is achieved through the usage of periodic and a-periodic variables, and a-periodic messages. The fieldbus can be deployed for up to 1000 m and 30 stations without repeaters. The inclusion of repeaters allows extensions beyond these values.
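The table-driven arbitration described above can be illustrated with a minimal Python sketch. The scan table contents, the variable identifiers and the function names are hypothetical and chosen only for illustration; a real BA also manages timing windows and a-periodic traffic, which are left out here.

```python
from itertools import cycle, islice

# Hypothetical scan table: the order in which the Bus Arbitrator (BA)
# requests variables during one cycle. Identifiers are illustrative only.
SCAN_TABLE = [0x0501, 0x0601, 0x0502, 0x0601]


def poll_sequence(table, n_slots):
    """Return the identifiers the BA would request over n_slots slots.

    The table is replayed in order, so the position of every
    identifier in the schedule is known in advance.
    """
    return list(islice(cycle(table), n_slots))


def worst_case_gap(table, identifier):
    """Largest number of slots between two consecutive requests for
    the same identifier when the table repeats forever."""
    positions = [i for i, ident in enumerate(table) if ident == identifier]
    if not positions:
        raise ValueError("identifier not in scan table")
    n = len(table)
    gaps = [
        # Distance to the next occurrence, wrapping around the table;
        # a single occurrence is separated from itself by a full table.
        (positions[(k + 1) % len(positions)] - p) % n or n
        for k, p in enumerate(positions)
    ]
    return max(gaps)
```

Because the table is fixed, the worst-case refresh interval of each variable can be computed offline, which is the sense in which the bus is deterministic.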
To enhance reliability WorldFIP implements several protection mechanisms. The Medium Attachment Unit (MAU) is responsible for several functions including galvanic isolation to cope with failures on the physical layer. The data link layer includes time-outs to avoid misassignment of frames. Every frame travelling on the bus contains a 16-bit Frame Check Sequence (FCS) so that data corruption can be detected. Finally, the possibility of implementing redundancy on the physical medium, and the capacity of the network to accept any of the stations as a possible BA, provide means for increased reliability in case of hardware failure [3].
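How a 16-bit frame check sequence detects corruption can be sketched as follows. This is an illustration only: the generator polynomial and initial value used here are the common CCITT parameters, assumed for the example, and are not claimed to be the ones the WorldFIP standard specifies. The function names are hypothetical.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
    """Bitwise, MSB-first CRC-16 over data.

    poly and init are illustrative defaults (common CCITT values),
    NOT necessarily the parameters used by WorldFIP.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_fcs(payload: bytes) -> bytes:
    """Sender side: append the 16-bit check sequence, high byte first."""
    return payload + crc16(payload).to_bytes(2, "big")


def frame_is_intact(frame: bytes) -> bool:
    """Receiver side: with no final XOR, the CRC computed over
    payload plus check sequence is zero for an uncorrupted frame."""
    return len(frame) >= 2 and crc16(frame) == 0
```

A receiver built this way never needs to separate payload and check sequence before verifying: it runs the same shift register over the whole frame and compares the result with zero.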

Usage at CERN
The characteristics of WorldFIP matched the needs of many applications at CERN. However, not all these features are used and not all the applications use the same set of features. Bus speeds, cycle times and variable configurations differ for most users [4,5]. In some cases, certain choices were forced by the limitations of the commercial solutions available. This was the case, for example, for the variable size and the usage of a-periodic communication. In 2006, an interdepartmental task force was set up at CERN to follow the evolution of the market and look for alternatives to cope with the expected difficulties in the long-term supply of the components associated with WorldFIP technology. The conclusions of this task force stressed the strategic importance of the technology for CERN and the lack of suitable alternatives. The decision was made to acquire from Alstom the design information for all boards and components. The aim is to ensure the operation and maintenance of the existing infrastructure as well as to allow new developments based on the legacy WorldFIP applications [7].

A subset of WorldFIP features
The first step of the in-sourcing project has been the development of nanoFIP, a replacement for the microFIP circuit. This circuit acts as the communication agent for the distributed Inputs and Outputs (I/O), providing several key functionalities for the transmission of data over the WorldFIP fieldbus (figure 1). The project was set up to be developed within the Open Hardware Repository collaboration. All the design information would be public and could be modified for new developments by any potential user. The open hardware approach encourages regular iterations amongst developers, resulting in optimized designs and increased reliability. At the same time this exchange and the review process that ensues are instrumental for harmonization between systems. A thorough survey of the user community was launched to gather the set of microFIP functionalities used by the different systems at CERN. An agreement was reached on a subset of features to which all new developments could adapt. Although not backwards compatible with the existing implementations, this set represents an improvement over some of the limitations of microFIP, such as the variable size or the configuration options. Moreover, the consensus on the relative simplicity of the functionalities to be supported resulted in a faster development:
• All traffic is performed through periodic variables. No support is provided for a-periodic variables or messages.
• The number of available variables is fixed to three: one consumed variable, one produced variable and one broadcast consumed variable.
• The size of the variables can be set up to a maximum of 124 bytes, with independent on-chip memory spaces for each variable.
• nanoFIP supports the three standard operating speeds for the bus bit rate: 31.25 kbit/s, 1 Mbit/s and 2.5 Mbit/s.
• Two modes of operation are possible: memory mode with full size variables and stand-alone mode with 2-byte consumed and produced variables directly transmitted without accessing the memory.
• A status byte can be appended to the produced variable for remote diagnostics.
• FCS checksum and network communication turn-around time-outs are implemented.
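The constraints in the list above can be collected in a small validation sketch. The class and field names are hypothetical, and the lower bound of one byte for a variable is an assumption, since the text only gives the maximum; the numeric limits themselves are taken from the list.

```python
from dataclasses import dataclass

# Values taken from the feature list above.
BIT_RATES_BPS = (31_250, 1_000_000, 2_500_000)
MAX_VARIABLE_BYTES = 124
STANDALONE_VARIABLE_BYTES = 2


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container for the settings of one agent."""

    bit_rate_bps: int
    standalone: bool
    produced_bytes: int
    status_byte: bool = False

    def __post_init__(self):
        if self.bit_rate_bps not in BIT_RATES_BPS:
            raise ValueError(f"unsupported bit rate: {self.bit_rate_bps}")
        if self.standalone:
            # Stand-alone mode exchanges fixed 2-byte variables.
            if self.produced_bytes != STANDALONE_VARIABLE_BYTES:
                raise ValueError("stand-alone mode uses 2-byte variables")
        elif not 1 <= self.produced_bytes <= MAX_VARIABLE_BYTES:
            # Lower bound of 1 byte is an assumption of this sketch.
            raise ValueError("variable size must be 1..124 bytes")
```

For example, `AgentConfig(bit_rate_bps=1_000_000, standalone=False, produced_bytes=124)` is accepted, while a bit rate of 500 kbit/s or a 125-byte variable raises `ValueError`.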
The attachment of nanoFIP to the physical medium is achieved by the FIELDRIVE and FIELDTR components from Alstom, as was the case for microFIP. These two components have proven to be radiation tolerant and are expected to remain supported in the medium to long term.

The interface with the user logic
As seen above, the current usage of the functionalities of the fieldbus standard at CERN is not homogeneous. Likewise, the existing implementations for the interface with the user I/O in the different CERN applications vary widely. They range from simple discrete logic and stand-alone operation to modes controlled through FPGAs or even microcontrollers. For nanoFIP it was decided to use a parallel memory-like open standard, the Wishbone system-on-chip interconnect. In memory mode the data bus is 8 bits wide with a 9-bit address bus, while in stand-alone mode the consumed and produced variables are exchanged over a 16-bit I/O port. A few control signals provide handshaking for data validity and support conflict detection when accessing the memory.
A strong specification requirement for nanoFIP was that no configuration sequence would be necessary so that nanoFIP would be immediately operational at power-up. As a consequence all configuration settings for the WorldFIP communication are applied through pins instead of writing to internal registers [8].

A radiation tolerant design
In many of the CERN applications, microFIP is operating in locations subjected to environmental radiation. This constraint was taken into account from the first stages in the nanoFIP conception. The use of aerospace radiation-hard silicon technology (ASIC or RT FPGA) was not compatible with the budgetary constraints of the project. On the other hand, certain commercial FPGA technologies were known to be able to operate reliably under levels of radiation compatible with our constraints. The target technology selected for the implementation of the nanoFIP design was the ProASIC-3 Flash FPGA from ACTEL. These components are immune to configuration corruption from Single Event Upsets (SEU) and are reported [9] to present a tolerance against Total Ionizing Dose (TID) above 200 Gy, which exceeds the requirement for the environment of nanoFIP.
Apart from selecting an appropriate hardware platform for nanoFIP, the circuit is intentionally kept simple by reducing the set of supported functionalities. In addition, several mitigation techniques are introduced in the digital design to enhance the reliability of operation. The application of automatic Triple Modular Redundancy (TMR) at the level of the Precision RT FPGA synthesis tool ensures that every registered bit in the design is implemented as a set of three flip-flops with a voting system on its output. The same method is applied to the memory blocks used for variable buffering. Furthermore, optimizations by the synthesis tools on Finite State Machines (FSM) are controlled, forcing a safe encoding of seemingly unattainable states that could be reached as a result of an SEU. Even so, several internal time-outs act as watchdogs in case an FSM remains stalled.
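The voting principle behind TMR can be shown with a short behavioural sketch. It is a software toy model with hypothetical names, not the logic generated by the synthesis tool: each register is held in three copies and the output is the bitwise two-out-of-three majority.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 vote across three copies of a register."""
    return (a & b) | (a & c) | (b & c)


class TmrRegister:
    """Toy model of one triplicated register with a voted output."""

    def __init__(self, value: int = 0):
        self.copies = [value, value, value]

    def write(self, value: int) -> None:
        # A normal write refreshes all three copies.
        self.copies = [value, value, value]

    def upset(self, copy: int, bit: int) -> None:
        """Flip one bit in one copy, mimicking a single event upset."""
        self.copies[copy] ^= 1 << bit

    def read(self) -> int:
        return majority(*self.copies)
```

A single flipped bit in any one copy is outvoted by the other two, so the value read back is unchanged. The scheme fails only if the same bit is upset in two copies before the register is rewritten.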
In any event, the design of nanoFIP foresees several hardware and software reset possibilities. As the registers are in an undefined state when the ACTEL chip is powered up, a power-on reset is supported through the addition of external passive components. From the network side, the processing of a reset variable allows the logic reset of either nanoFIP or the user logic. Finally, the user logic can also force the reset of nanoFIP through a dedicated pin.

Design validation
Because of the strategic importance of this development for several CERN groups, particular care has been taken in the management of the project to guarantee a thorough validation of the nanoFIP design and prototypes. In parallel with the development of nanoFIP, a company was commissioned to develop a custom nanoFIP test card including a dedicated test firmware and software.
The VHDL source was exhaustively reviewed by panels of expert designers. Extensive simulations were carried out exploring the corner cases of the specified functionalities, specified error conditions and unspecified failure scenarios. Thanks to these simulations, the debugging of the first prototypes went considerably more smoothly, giving way to intensive testing of the hardware. From the outset the test card could be used for the complete functional validation. Several million communication cycles were performed, running uninterrupted for weeks. The test networks were equipped with up to 19 agents, while bus timings were tightened with bus cycle times down to 5 ms.
Subsequently, radiation tests were organized in the 230 MeV proton beam facility of the Paul Scherrer Institute (PSI) using 2 samples of the custom test card. No SEUs were observed under a total fluence of 7.4 E+11 and failure for cumulative effects only occurred after a TID of 400 Gy [10]. These results give a first confirmation of the immunity of the design to SEU and show a significant safety margin above the expected 100 Gy after 10 years of LHC operation. Additional larger scale radiation tests are planned in order to have a clear estimation of nanoFIP's cross section.
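The figures above allow a rough back-of-the-envelope estimate, sketched below. Two assumptions are made that the text does not state: the fluence is taken to be in protons per cm², and upsets are treated as a Poisson process, so that observing zero events bounds the cross section from above. The names in the code are hypothetical.

```python
import math

FLUENCE = 7.4e11           # total fluence from the text; p/cm^2 is ASSUMED
TID_AT_FAILURE_GY = 400.0  # dose at which cumulative failure occurred
TID_EXPECTED_GY = 100.0    # expected dose after 10 years of LHC operation


def cross_section_upper_limit(fluence: float, confidence: float = 0.95) -> float:
    """Upper limit on the cross section when zero events were observed,
    assuming Poisson statistics: P(0 events) = exp(-sigma * fluence)."""
    return -math.log(1.0 - confidence) / fluence


sigma_max = cross_section_upper_limit(FLUENCE)
margin = TID_AT_FAILURE_GY / TID_EXPECTED_GY
print(f"sigma < {sigma_max:.2e} cm^2 (95% CL), TID margin x{margin:.0f}")
```

Under these assumptions the zero-event result only bounds the cross section to the order of 1e-12 cm², which illustrates why larger-scale tests are needed for a proper estimate, while the dose margin is a factor of four.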

An added feature
Following a request from the Power Converters group, the main users of the device, an extra feature has been added to the original nanoFIP specification. In order to allow the remote reprogramming of the user logic through the fieldbus, nanoFIP has been equipped with a JTAG controller interface. This interface uses general purpose pins of nanoFIP that are directly connected to the JTAG Test Access Port (TAP) of the user logic device. The programming sequence is received by nanoFIP by means of a dedicated variable. This sequence corresponds directly to the bits that have to be issued to the TAP, so that nanoFIP does not need to perform any data processing. The development of this feature was the result of a joint effort between the user and the design team. The current implementation of this feature has been successfully tested with all main manufacturers of target devices and is pending validation in a radiation environment.
It is important to note that this feature is not intended to be used when the LHC machine is in operation. Nevertheless the radiation tests aim at verifying that the new logic inside nanoFIP does not compromise the operation of the user logic during irradiation, as well as checking the availability of the new functionality after irradiation.

Conclusion
CERN relies on WorldFIP technology as a high-performance, robust and open communication protocol for some of its industrial fieldbus and control applications in environments subjected to radiation. The commercial decisions of the main provider of the components presented a problem of availability for the operational requirements of the LHC in the coming years. In an example of collaboration between users and developers, CERN has put together a team that has delivered a solution to replace the most critical component with improved performance and increased reliability. The choice to keep the design simple has proven beneficial while allowing the flexibility to add functionality afterwards. The nanoFIP chip design is now available and is starting to be used in new designs by the equipment groups. The next steps of the in-sourcing project will have to address the possible obsolescence issues with the other components of the technology.