FAQ
 

This page answers the most commonly asked questions about TTP. Some of these questions have arisen from the inherently complex nature of time-triggered real-time systems that must meet high safety and fault tolerance requirements. Other questions concern basic assumptions about, and design rules for, dependable computer systems.

Application Domain, Migration Strategies

 

Is TTP suitable for a wide range of safety-critical applications?
Yes. TTP has been designed for a broad range of applications. The original focus was on backbone communication buses for automotive systems. For this kind of system, an important consideration is the ease with which existing as well as new safety-relevant applications can be integrated into the system. This was therefore one of the main design considerations of TTP at the time. When the aerospace industry showed an interest in developing safety-critical applications using TTP, the high safety standards required for their applications were integrated into the design of TTP. Today TTP is used across industries because it meets the highest safety requirements.

 

TTP can be used in different applications and industries. Three aspects are important here:

  • Safety: A high safety standard has to be an integral part of system design from the very beginning. Improved safety cannot be "tested into" the system or be retro-fitted at a later time. TTP, therefore, has the highest safety requirements built into it as an integral part of its design. This is a very important aspect of making an architecture future-proof. Migration can be started with a non-critical application and continued with safety-critical applications. This can be done without changing the architecture since both application types are supported by TTP.
  • Composability: An increase in system complexity requires that the architecture support testing and integration. In a system having the property of composability, it is guaranteed that extensions which modify existing functions or have new functions will only affect specified subsystems. There can be no side-effects that call for a system-wide test. TTP provides composability for real-time communication.
  • Data rate: Future applications will require considerably higher data rates. TTP has been designed so that the protocol itself places no limit on data rate or throughput. With TTP, therefore, new functions can be added even while reducing the number of gateways and the cost of the overall architecture.

Does TTP meet the requirements of the industry?
TTP was evaluated with respect to several requirements for next-generation automotive applications. Its suitability for these applications was established with regard to:

  • Safety
  • Cost-effectiveness
  • High data rates
  • Composability and ease of system integration
  • Flexibility in terms of vehicle platforms and model variations
  • Suitability for end-of-line programming (flash programming) and diagnosis
  • Extensibility in the field

Furthermore, TTP also meets aerospace requirements for Level A safety-critical applications.

 

What data rates does TTP support?
The TDMA bus access scheme is collision-free and puts no limit on the data rate. The communication controllers available today support 25 Mbit/s synchronous and 5 Mbit/s asynchronous transmission (asynchronous transmission is the method used over twisted-pair wiring; synchronous transmission uses Ethernet-like wiring). Data frames can carry a payload of up to 240 bytes each. TTP has demonstrated a net-to-gross data rate ratio of up to 85%. Prototype implementations have used 1 Gbit/s technology (lab experiment in 2002).

 

Does TTP limit the frame size (size of data bytes per transmission)? Do all TDMA transmissions need to have the same frame size?
TTP allows a free choice of the number of data bytes per transmission, which allows node configurations to be adapted to application requirements. As a result, configuration is highly flexible, and the bandwidth provided by the physical layer can be efficiently utilized.

 

What network topologies are supported by TTP?
TTP networks can contain up to 64 nodes. The cabling topology can be bus, star, or any combination of the two. Multiple stars or sub-buses on stars are also supported.

 

A redundant star topology with a bus guardian integrated into the star coupler has the advantage of combining the highest safety level with minimal cost. This approach offers a significant cost advantage compared to a node-local bus guardian architecture (about € 47 in a 10-node system).

 

For detailed information about TTP cost advantages please contact info@ttagroup.org.

 

How can existing applications be migrated to TTP?
In the aerospace industry today, time-triggered applications are already state-of-the-art. In the automotive industry, however, time-triggered applications will appear for the first time only in the next generation of products. This calls for an efficient migration strategy for existing CAN-based software. Apart from CAN-TTP-Gateway solutions, a CAN emulation layer can be used on top of TTP. In the emulation layer, the registers of a CAN controller module are emulated, and this allows for the re-use of existing CAN software. A small portion of the TTP bandwidth is reserved for CAN, and CAN messages themselves are transmitted within regular TTP frames (on this reserved portion of the bandwidth). On a TTP bus at 10 Mbit/s, approximately 5% of the net bandwidth needs to be reserved to emulate a high-speed CAN network at 500 kbit/s. An experimental hardware-based CAN emulation layer was implemented at the Vienna University of Technology in 2002.
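The 5% figure can be checked with simple arithmetic. The sketch below is only an upper-bound estimate: it compares raw bit rates one to one, assumes the emulated CAN bus is fully loaded, and ignores framing overhead on both the CAN and the TTP side. The function name is illustrative and not part of any TTP tool.

```python
def reserved_share(can_bit_rate: float, ttp_bit_rate: float) -> float:
    """Fraction of the TTP bit rate needed to carry a fully loaded CAN bus.

    Assumption: raw bit rates are compared directly; CAN and TTP framing
    overhead are both ignored.
    """
    if ttp_bit_rate <= 0:
        raise ValueError("ttp_bit_rate must be positive")
    return can_bit_rate / ttp_bit_rate


# 500 kbit/s high-speed CAN carried on a 10 Mbit/s TTP bus
share = reserved_share(500_000, 10_000_000)
print(f"{share:.1%}")
```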

 

Instead of a full CAN emulation hardware solution, a software solution is also possible. In this scenario, CAN signals and identifiers are transported over a middleware layer, which supports event channels on TTP.

 

Both options allow existing CAN-based software to be migrated onto a TTP-based system with minimal effort.

 

For detailed information about event data and migration strategy please contact info@ttagroup.org.

 

Safety, Availability, and Fault Tolerance

 

What methods were used to verify correct functionality and safety of TTP?
A large part of TTP has been formally verified. As a result, TTP is considered the most comprehensively verified communication protocol today, at least in the automotive industry. The following procedures were used:

  • Formal verification: The protocol's core algorithms for consistency, stability, and safety were checked for correctness with formal mathematical proofs. This kind of verification guarantees that all system states are checked, in contrast to testing, which checks only a subset of all possible system states. Formal verification has been, and continues to be, carried out in cooperation with SRI International and leading universities such as the University of Ulm, the University of York, the University of Paris, and the Vienna University of Technology. NASA has also supported this effort.
  • Millions of fault injection experiments: In two large projects funded by the European Commission (PDCS and FIT), different methods of physical and software-based fault injection were used on TTP systems to evaluate TTP's fault tolerance behavior and error detection properties. The experience gained during these projects was extremely helpful in improving the existing technology; it also demonstrated how cost-optimal safe solutions can be realized. The FIT project showed that a new concept of a bus guardian must meet the highest safety requirements. Based on these experiences, the intelligent star coupler bus guardian was developed, which guarantees optimal safety at minimal cost. In 2003 the SP Swedish National Testing and Research Institute conducted heavy-ion fault injection experiments to validate the Time-Triggered Architecture (TTA) with TTP-C2 communication controllers. According to their report, the fact that no fail silence violations were observed with TTP-C2 communication controllers in the star and the bus topologies makes TTA one of the most reliable distributed computer architectures for highly dependable real-time systems.

    If you would like to receive a copy of the SP report, please contact info@ttagroup.org.

  • Certification: The use of TTP in applications with the highest safety level requires that the protocol be certified. The development process and the design of the TTP-C2 chip model are both fully documented as required by aerospace standards. The documentation and processes of TTTech Computertechnik AG, combined with the relevant documentation and processes of the respective semiconductor manufacturer, form the basis for certifying products according to the RTCA software standards used by the FAA (Federal Aviation Administration) or the JAA (European Joint Aviation Authorities). The OSEKtime-based operating system TTP-OS is developed according to the RTCA software standard DO-178B Level A. The firmware providing the protocol functionality of the AS8202NF communication controller (based on TTP-C2NF) is certifiable in compliance with the DO-178B standard for Level A applications. TTP-Verify has been designed as a software verification tool in compliance with the software development standard RTCA DO-178B and supports the verification of safety-critical distributed control systems developed under RTCA DO-178B Level A. DO-178B Level A is one of the most stringent certification standards for developing safety-critical software.
  • Project experience and use in commercial applications: TTP has been used in projects and prototypes for more than 10 years. FPGA-based systems were first used in 1995, and TTP ASICs have been available since 1998. Also, several development projects have used TTP and gained experience in using the system. The first TTP-based systems were deployed in the field in 2002. The first TTP-based fly-by-wire systems will be in commercial production in 2004. Honeywell deploys TTP in full authority engine control systems on the Lockheed Martin F-16 and the Aermacchi M-346 trainer fighter. Honeywell's APEX integrated cockpit using TTP is applied in single-engine turboprop aircraft such as GROB Ranger G 160, EXTRA EA-500, IBIS Ae270 and several other airplanes. TTP is used in the Airbus A380 Mega-Airliner. Nord-Micro has selected TTP as communication protocol for the Airbus A380 cabin pressure control system. Since 2002, Alcatel has been using TTP as field bus protocol in its commercially produced railway signaling system ELEKTRA 2. The system is used in Switzerland, Austria and Hungary.

A total of more than 170 man years of development have been invested in the safety of TTP.

 

Do the formal verification methods verify the actual algorithms in the chip?
No, they verify the algorithms as they are specified in the current protocol specification, which is also the basis for chip implementations. Additional verification activities in the form of peer-reviews and conformance testing verify that the algorithms in the protocol specification are implemented correctly in the chip design.

 

Is the proof of correctness at the protocol level of any use if application-level checks and end-to-end checks have to be performed in addition anyway?

Consistency checks at the protocol level guarantee that all communicating nodes have identical information, or that an error signal is raised by the communication layer. This assurance can be used as a proven argument in a system safety case, which makes the proof of safety at the system level much simpler. The consistency checks at the protocol level do not make the application-level checks obsolete; they support them. If the protocol does not provide mechanisms such as acknowledgment and consistency checks, each application must provide these mechanisms separately if they are required for safety or integrity reasons. Doing this can mean more effort and cost for each application. Furthermore, if the proof of correctness is carried out, for the most part, at the software level, then every change in the software may require an additional round of verification. For safety-critical functions, the costs could become very high. Proof of consistency at the protocol level, therefore, can simplify the safety case at the system level.

 

If a communication controller and a bus guardian are integrated on a single chip, would the requirements for safety-critical applications be met?
No, fault injection experiments have shown that common mode faults with uncontained effects are all too likely in a single-chip solution, at least for highly safety-critical applications. What is needed is an external bus guardian that has higher levels of integrity than a single-chip solution. External, node-local bus guardians are currently not available for TTP, but a central guardian (i.e. star coupler) is. However, for non-safety-critical applications, a single-chip solution of communication controller with integrated bus guardian offers improved availability at minimal cost.

 

What is the fault tolerance strategy of TTP?
TTP is based on the so-called "single fault" hypothesis. A fault hypothesis is required to define and validate the fault tolerance properties of a fault-tolerant system. The fault hypothesis explains the fault model that is the basis for error detection and fault tolerance. The "single fault" hypothesis requires that any single fault (up to arbitrary failure modes resulting from an error of a complete node) is reliably detected and tolerated. Beyond that, many multiple faults can be detected and tolerated. In addition, the protocol specifies higher-level error detection for faults that are outside of the "single fault" hypothesis and therefore cannot be tolerated by TTP itself.

 

This leads to a three-level fault strategy, which can be adapted for each implementation and application level:

  • Operation in a fault-free scenario;
  • Error detection and integrated fault tolerance, for all single and some multiple faults;
  • Error detection (without appropriate fault tolerance strategy in the protocol layer) for all other multiple communication faults; such faults, e.g. a complete temporary loss of communication, are not handled by the integrated fault tolerance of TTP but need the attention of the application (never-give-up strategy).

What are "SOS faults" and how are they dealt with in TTP?
This class of rare faults is especially tricky to address since it brings the communication subsystem into a state of permanent disagreement about the actual fault. "SOS" stands for "slightly-off-specification" and indicates that the effect is marginal, i.e., the SOS fault is detected as a fault by some components (such as line drivers and communication controllers) but accepted as a non-fault by others.

 

Such faults, once they occur, can create errors so frequently that a distributed error detection mechanism cannot be relied upon to pick them all up and handle them appropriately. The only way to address this kind of fault is to prevent SOS signals and remove them completely from the system. Such prevention must be performed by a dedicated central unit, such as a star coupler.

 

What is a "babbling idiot" fault?
If a communication participant (node) in a distributed system does not communicate in the intended way but tries to monopolize the shared communication link by sending "more than allowed," this node is called a "babbling idiot." Many different faults can lead to a babbling idiot. They include faults in the application software of event-driven communication systems, but babbling idiot faults can occur in any kind of communication system.

 

In fail-safe systems, which can shut down when a babbling idiot fault occurs, only detection of the babbling idiot is required. In fail-operational systems, babbling idiot faults must, additionally, be tolerated. The only reliable way of detecting and tolerating this fault is to have a separate guarding unit, often called a bus guardian, which protects the bus from any illegal access.
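The guarding principle can be sketched as a gate that lets a node drive the bus only during that node's own TDMA slot. This is a minimal illustration, not the behavior of any actual guardian hardware: the class names, the schedule layout, and the microsecond time base are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    node: str       # node allowed to transmit in this slot
    start_us: int   # slot start, relative to the start of the round
    length_us: int  # slot duration


class BusGuardian:
    """Hypothetical gate that passes a transmission only inside the sender's slot."""

    def __init__(self, schedule, round_us: int) -> None:
        self.schedule = list(schedule)
        self.round_us = round_us

    def allows(self, node: str, time_us: int) -> bool:
        t = time_us % self.round_us  # position within the current round
        for slot in self.schedule:
            if slot.node == node and slot.start_us <= t < slot.start_us + slot.length_us:
                return True
        return False  # outside its own slot, the node is blocked


schedule = [Slot("A", 0, 100), Slot("B", 100, 100), Slot("C", 200, 100)]
guardian = BusGuardian(schedule, round_us=300)

print(guardian.allows("A", 50))    # A transmits in its own slot
print(guardian.allows("A", 150))   # A babbles during B's slot
```

Because the gate decision depends only on the static schedule and the guardian's own notion of time, a node that sends "more than allowed" cannot disturb the slots of other nodes.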

 

What happens in a TTP system when a communication error, or any other type of error, occurs?
In case of single faults, which are always tolerated, there is usually no reason for any kind of recovery. TTP detects and reports any single fault so that a specific system response can be initiated if necessary.

 

In case of multiple communication faults, especially so-called clique faults, a system restart may be the only option available to restore consistency in the system. This is because in such cases the communication system is unable to provide all nodes with correct and consistent data. When this happens, the application is notified of the fault and has to decide upon a reasonable course of action. For some applications, it may be useful to configure the system so as to continue operation with inconsistent data; in other cases, stand-alone operation in the safe state is more advisable. In whatever way the system chooses to proceed, it is the decision of the application, and not of TTP, to perform or not to perform a restart.

 

It should be noted that multiple faults occur rarely, if at all, in a well-designed communication system. Nevertheless, the possibility must be provided for, and a detection mechanism set up.

 

What is the "never-give-up" strategy of TTP?
TTP provides mechanisms to detect communication faults that cannot be tolerated by the communication layer (multiple communication faults, SOS faults). The never-give-up strategy is based on this error-detection capability and mandates a clean restart and recovery in all cases that cannot be handled by more efficient or faster recovery mechanisms.

 

Does TTP fault tolerance require whole nodes to be redundant, or can fault tolerance be made scalable?
This question concerns the system architecture rather than TTP itself. TTP facilitates strictly synchronous redundant functions and therefore enables "replica-deterministic" redundant components. Components can be complete nodes or individual software modules. Such redundant functions (subsystems) can be dual-, triple-, or even higher-level-redundant. It is also possible to design different modes of operation on a node where some important redundant functions are executed (and some non-important ones are left out) only if another node executing the important function fails. In this way, a very fine scale of fault tolerance can be designed.

 

Membership and Consistent Agreement

 

What are the benefits provided by the TTP membership service protocol?
The purpose of the TTP membership protocol is to provide reliable error detection of all communication faults. Such faults include transmission faults, reception faults, and even some that occur outside the scope of the communication layer fault hypothesis (such as clique faults during transmission). Faults in the application software are not covered by the membership service, even though some application faults may lead to a membership loss (for example, the setting of a faulty life sign, and the requesting of an invalid mode change). The purpose of communication-level membership is to provide fast and reliable error detection. Using communication-level membership, an application can detect the source of a problem more efficiently and reliably, and can perform fast and correct error handling and recovery.

 

A correct membership flag of a node M in the local communication controller thus indicates "no errors in the communication between node M and the other nodes." In combination with the consistency checking, it also indicates "every node in the membership has detected correct communication with node M." A node that is not in the membership has been diagnosed as "not communicating correctly," which is quite different from "the CPU on that node has not set its life sign flag." A communication error is diagnosed within a short time interval, short being defined as two TDMA slots for consistent faults and less than two TDMA rounds for arbitrary communication faults.

 

The decision to introduce a life sign in TTP and not let a communication controller transmit if this life sign is not serviced correctly is not a primary design element for the membership, but allows intentional phases of receive-without-transmission by the application, e.g. for reconfiguration or reintegration.

 

Membership information can be read incorrectly at a node, and this would lead to inconsistent activity on that node. How can you protect against this?
There is no guarantee that information from a communication controller will be read correctly by the CPU. This applies not only to membership information, but also to all data. This issue is therefore neither a membership issue nor a TTP issue. It can be addressed by specific hardware mechanisms that can reliably detect or prevent such a fault, or by not making the fail silence assumption for a node. Both approaches can be used with TTP.

 

How much of the membership mechanism is used in the formal verifications of TTP? What is the semantics of the membership in these verifications?
All formal verifications of TTP take account of the membership mechanism with regard to how it provides consistency of communication. (For a definition of consistency, see the discussion below.)

 

This means that a safety case for a distributed system using the membership mechanism can assume the system's consistency properties to be verified, provided that the mechanism has been correctly implemented (in hardware) and the implementation has been independently verified. Otherwise, each software implementation has to be verified individually, and again for each change made to the software.

 

How long does the agreement of the TTP membership service take in the worst case? Is this fast enough for any known application?
For consistent faults (i.e., faults which are seen in the same way by all nodes) two successive nodes (i.e., two sending slots with a typical combined length of 200 microseconds) are needed in the worst case. In the best case, which is the fault-free case, only one node is needed.

 

For inconsistent faults, the worst case is 1.5 rounds, with round being defined as the shortest time interval between any two transmissions of a node. Nodes may have a longer interval between two successive transmissions if multiplexing is used.

 

Both implementations are optimal with respect to timing. There is no faster way to provide single fault-tolerant acknowledgment and agreement over the bus, regardless of the application or communication protocol.
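The two worst-case figures above can be turned into a small calculation. The sketch assumes slots of equal length and that every node sends exactly once per round (no multiplexing), so that a round is simply the slot length times the number of slots; both are simplifications for illustration, and the function name is hypothetical.

```python
def worst_case_detection_us(slot_us: float, slots_per_round: int) -> dict:
    """Worst-case agreement times in microseconds.

    Assumptions: equal-length slots, one slot per node per round.
    """
    round_us = slot_us * slots_per_round
    return {
        "consistent_fault_us": 2 * slot_us,       # two successive sending slots
        "inconsistent_fault_us": 1.5 * round_us,  # 1.5 rounds
    }


# 100-microsecond slots (two slots = 200 microseconds), 16 slots per round
print(worst_case_detection_us(100.0, 16))
```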

 

It is of special importance that the acknowledgment logic is implemented in dedicated hardware. In a high-speed system with 16 nodes and a round duration of 1 millisecond, error detection and acknowledgment must be calculated on average every 62.5 microseconds (1 millisecond divided among 16 slots). When performing acknowledgment and consistency services in the application, different methods and slower acknowledgment cycles are used.

 

For detailed information about membership and consistent agreement please contact info@ttagroup.org.

 

What does "consistent transmission" mean? Does it mean that all nodes connected to the same medium receive data practically simultaneously?
The common definition of "consistency" includes much more than just the concept of "simultaneous reception." Most prominently, it includes the notion of "common knowledge," which is taken to mean that all information from a node is present on all other nodes that consider this information relevant. Also, that the information is present on "all other nodes" is taken to be guaranteed, not just assumed. The existence of such a guarantee would require two conditions to be satisfied: first, a mutual confirmation of reception (acknowledgment); and second, global acceptance of this confirmation (agreement).

 

Assumed or guaranteed consistency is a core element in the design of a distributed system. Only if there is consistency in lower layers (especially communication layers) can a higher layer (such as the application layer) assume that the information the system processes and produces is commonly available. Often, this requirement is assumed and not validated, which is an improper design choice for safety-relevant systems.

 

Why does TTP perform communication-level acknowledgment and consistency checks when it is necessary to perform application-level acknowledgment anyway?
The question can also be put like this: if a distributed application has to ensure that it is in a consistent state, why must the communication system also check that it is in a consistent state? Has the application-level checking not ensured that already?

 

The error detection mechanisms used by TTP to ensure consistency at the communication layer (membership, acknowledgment, clique detection) complement application-level agreement/acknowledgment. Checking if a specific sensor value was correctly sent and received is a different issue than checking if the control function using that sensor value has computed correct outputs, and if these outputs were correctly received by the actuation unit.

 

It is arguably correct that a higher-level error detection like "is the output of the control function correct?" will also detect a missing transmission of the sensor value, which was an input to that control function. But it will take longer to detect this error, and it may be more difficult to find out what went wrong in the first place (it was not the control function!). So for some functions the application-level agreement may be quite adequate, while for others the faster and more specific communication-level mechanisms may be required. This is especially true for fast control functions, where responses to input data have to be given with the same rate as the communication cycle.

 

Does membership reduce system availability? Does it force a node to restart after a single communication fault?
No, the membership service does not reduce system availability. Membership serves to detect all communication faults, including transmission and reception faults. If a node fails to transmit (typically due to noise during the transmission) and is therefore removed from the membership, every node, including this node, detects it. The node can retry transmission in the next round (not immediate retransmission!), and, if it succeeds, is included in the membership again. Faults which cannot be tolerated at the communication level (some multiple faults, clique faults), are reported and have to be handled at a higher layer. If the system decides that a restart is the best action to take in the event of a non-tolerable fault being reported, it must be understood that the restart is a system decision; the restart cannot be attributed to the membership service. The membership service simply detects the fault.
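The remove-and-rejoin behavior described above can be illustrated with a toy model. The sketch keeps one shared membership set and updates it after every slot; in a real cluster each controller maintains its own view and the protocol checks that these views agree, which is not modeled here. All names are hypothetical.

```python
def update_membership(membership: set, sender: str, frame_ok: bool) -> set:
    """Return the membership after the sender's slot has been evaluated."""
    updated = set(membership)
    if frame_ok:
        updated.add(sender)      # correct frame: sender is (again) a member
    else:
        updated.discard(sender)  # failed frame: sender loses membership
    return updated


nodes = ["A", "B", "C", "D"]
membership = set(nodes)

# Round 1: node B's frame is destroyed by noise. Round 2: B's retry succeeds.
rounds = [
    {"A": True, "B": False, "C": True, "D": True},
    {"A": True, "B": True, "C": True, "D": True},
]

history = []
for outcome in rounds:
    for sender in nodes:  # one slot per node, in schedule order
        membership = update_membership(membership, sender, outcome[sender])
    history.append(sorted(membership))

print(history)  # [['A', 'C', 'D'], ['A', 'B', 'C', 'D']]
```

The single transmission fault costs node B one round of membership; no restart is involved.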

 

Clock Synchronization

 

TTP is designed for systems with four or more nodes. How robust is the clock synchronization with only two or three nodes? Does it meet the requirements for safety-critical applications?
TTP systems with two or three nodes (as active members in the clock synchronization) work reasonably well, although the precision is lower than in larger systems. However, one of the core characteristics of a TTP system, single fault tolerance, is not met if there are fewer than four nodes. Single fault tolerance for asymmetric faults requires at least four nodes.

 

The safety of the TTP clock synchronization has been verified formally and experimentally. It can be guaranteed that (i) the synchronization condition is fulfilled, and (ii) any violation of this condition can be detected for all faults within the fault hypothesis. This guarantee is based on a formal proof.

 

What is the difference between rate and offset correction?
The clocks in a distributed system inevitably drift away from each other. The amount of drift is unpredictable and varies with temperature, voltage, and age of the crystals. Consequently, periodic resynchronization of the clocks is necessary if a notion of common time (synchronization) is to be established.

 

There are two ways to resynchronize clocks. One way is to periodically check the difference and adjust each clock so as to reduce the offset among them as much as possible. This method is called 'state' or 'offset' correction, and has to be repeated periodically. An example of this type of correction is the way we correct our wristwatches every so often. This kind of correction mechanism is a simple and robust one.

 

Another way to resynchronize clocks is to periodically check how fast the clocks drift away from each other, and adjust the rate of drift so that the times displayed stay "closer together." This method, called 'rate' correction, is akin to cruising behind another vehicle and periodically adjusting one's own speed to that of the vehicle in front so that the distance in between remains approximately constant.

 

These two mechanisms can be combined or alternated to suit different situations.
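The difference between the two mechanisms, and the effect of combining them, can be seen in a highly idealized simulation of one clock against a reference. The sketch assumes a constant drift and a perfect measurement at every resynchronization point, neither of which holds in a real system; the function and parameter names are illustrative only.

```python
def simulate(drift_ppm: float, period_us: float, periods: int,
             offset_corr: bool, rate_corr: bool) -> list:
    """Clock error in microseconds at the end of each period, before correcting."""
    error_us = 0.0
    residual_ppm = drift_ppm  # rate error that is still uncorrected
    samples = []
    for _ in range(periods):
        gained_us = residual_ppm * period_us / 1e6  # drift accumulated this period
        error_us += gained_us
        samples.append(round(error_us, 6))
        if rate_corr:
            # Adjust the clock rate by the drift measured over the last period.
            residual_ppm -= gained_us / period_us * 1e6
        if offset_corr:
            # Step the clock back onto the reference (the "wristwatch" method).
            error_us = 0.0
    return samples


for name, off, rate in [("none", False, False), ("offset", True, False),
                        ("rate", False, True), ("both", True, True)]:
    print(name, simulate(1000.0, 2000.0, 4, off, rate))
```

In this model, offset correction alone lets the same error build up again in every period, rate correction alone stops the error from growing but never removes what has already accumulated, and the combination removes both.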

 

In TTP, the clocks of the communication controllers are synchronized during runtime by offset correction only. The reason is that for this mechanism, in the specific form that is employed in TTP, a formal, mathematically proven assertion exists that no single fault in the system can bring the synchronization algorithm into an unstable state.
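One well-known way to make offset correction robust against a single faulty clock is a fault-tolerant average: the largest and smallest measured deviations are discarded and the rest are averaged. The sketch below illustrates that general technique only; the exact algorithm, the number of measurements, and the parameter k used by a TTP controller are defined by the protocol specification, and the values here are assumptions.

```python
def fault_tolerant_average(deviations_us, k: int = 1) -> float:
    """Average the measured clock deviations after dropping the k largest
    and the k smallest values, so that up to k arbitrary outliers cannot
    pull the correction term outside the range of the remaining clocks."""
    if k < 0 or len(deviations_us) <= 2 * k:
        raise ValueError("need more than 2*k measurements")
    ordered = sorted(deviations_us)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Deviations (in microseconds) measured against four other clocks;
# one of them (250.0) comes from a faulty node.
correction_us = fault_tolerant_average([-3.0, 1.0, 2.0, 250.0])
print(correction_us)  # 1.5
```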

 

Rate correction is available in TTP in two ways, indicated below. It should be noted that neither of these is part of the runtime synchronization mechanism inside the TTP protocol; both need support from the application.

 

1. External Rate Correction: Adjustment of the whole TTP cluster drift to an external clock, such as a GPS receiver. This does not improve synchronization among clocks within the TTP network; it keeps the network synchronized with reference to the external time source.

 

2. Clock Calibration: Static calibration of each TTP controller in the network. The drift rate measurement is performed before system runtime, and an appropriate adjustment value is stored in the TTP configuration data (MEDL). This minimizes the static amount of drift between the clocks, and is a suitable mechanism for dealing with drift that results from the aging of clock crystals. It does not, however, account for changes in drift during the drive cycle (e.g. due to temperature changes).

 

How much efficiency does TTP lose by not using internal rate correction for clock synchronization?
Rate correction offers optimized inter-frame gaps, i.e., the smallest possible gaps between any two subsequent transmissions in a TDMA network. TTP does not use internal rate correction, so how much efficiency is lost due to the resulting longer inter-frame gaps? This question is best answered with an example.

 

Assume a network of 25 ECUs, each of which sends 50 bytes per round. The duration of a round is 2 milliseconds, and the data rate is 10 Mbit/s. The busload for the network is 50%. Further, assume a worst-case drift of the individual clocks of +/- 1000 ppm. This could lead to drifts of as much as 4 microseconds per round if no rate correction is performed. With a safety margin for measurement and propagation delay jitter of over 100%, we can assume that 10 microseconds would make a reasonable precision requirement for this network.

 

10 microseconds represent half a percent of the round. That would be far worse than the value that could be achieved with rate correction (e.g. 1 microsecond).

 

The overall loss of efficiency depends on how often this interval of 1 or 10 microseconds occurs during the round. In a network where each node sends exactly once, we would get 25 intervals for 25 nodes: 250 microseconds or 12.5% of the round. We could take four times as much "gap" (so as, say, to prevent noise on the line that lasts longer than 10 microseconds from affecting two subsequent transmissions) and we would still have 50% of the round for data transmission: 25 ECUs, each of which sends 50 bytes.

 

The loss of efficiency therefore depends on the number of transmissions per unit of time, which in turn corresponds to the data length of these transmissions. As shown in the above example, even a rather coarse precision still allows for a high data efficiency.
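
The arithmetic of this example can be checked with a short script. The following Python sketch is only an illustration: the function and variable names are assumptions made here, and it models exactly one inter-frame gap per sending slot, as in the example above.

```python
# Figures taken from the example: 25 ECUs, 50 bytes each, 2 ms round, 10 Mbit/s.
NODES = 25
BYTES_PER_NODE = 50
ROUND_US = 2000.0        # round duration in microseconds
DATA_RATE_MBIT = 10.0    # 10 Mbit/s equals 10 bits per microsecond
DRIFT_PPM = 1000.0       # worst-case drift of a single clock (+/- 1000 ppm)


def payload_time_us(nodes, bytes_per_node, rate_mbit):
    """Time needed to transmit the payload of all nodes once."""
    return nodes * bytes_per_node * 8 / rate_mbit


def worst_case_drift_us(round_us, drift_ppm):
    """Two clocks drifting in opposite directions diverge at twice the rate."""
    return 2 * drift_ppm * 1e-6 * round_us


def gap_share(nodes, gap_us, round_us):
    """Fraction of the round spent in inter-frame gaps (one gap per slot)."""
    return nodes * gap_us / round_us


busload = payload_time_us(NODES, BYTES_PER_NODE, DATA_RATE_MBIT) / ROUND_US
drift = worst_case_drift_us(ROUND_US, DRIFT_PPM)
share_1us = gap_share(NODES, 1.0, ROUND_US)     # with rate correction
share_10us = gap_share(NODES, 10.0, ROUND_US)   # without rate correction
share_40us = gap_share(NODES, 40.0, ROUND_US)   # four times as much gap

print(f"busload {busload:.1%}, drift per round {drift:.1f} us")
print(f"gap share: {share_1us:.2%} / {share_10us:.2%} / {share_40us:.2%}")
```

With these inputs the script reproduces the figures quoted above: a busload of 50%, a worst-case drift of 4 microseconds per round, and 12.5% of the round spent on 10-microsecond gaps. Changing NODES or BYTES_PER_NODE shows how the loss grows as transmissions become shorter and more frequent.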

 

Does the TTP clock synchronization tolerate multiple faults?
The membership mechanism of TTP is capable of detecting any kind of communication fault that is not already detected and handled by other means. This makes the TTP clock synchronization very robust. The membership mechanism removes faulty nodes from the clock synchronization algorithm, thus ensuring that only consistent nodes are used for clock synchronization. In this respect, TTP tolerates even multiple faults, as long as they do not create other system-level problems. Such problems are very likely with multiple faults, however, because most functions do not have triple redundancy.

 

Flexibility

 

Can TTP send event-triggered messages?
TTP can send only time-triggered messages, but an application using TTP can send both time-triggered and event-triggered messages. The transmission of event-triggered messages is performed over an event channel (bandwidth is reserved for event transmissions inside the TDMA slots, and the messages use identifiers). A typical event channel mechanism is a CAN emulation, in which a CAN-compatible interface is provided and the CAN messages are transmitted inside TTP data frames. For a 500 kbit/s CAN, such an emulation would require about 5% of a 10 Mbit/s TTP system.

 

Composability of the TTP system is maintained when using event transmission in this way because bandwidth is not arbitrated among different nodes (as in CAN or Byteflight), but only among different functions within a node. Timing and bandwidth analysis for event transmissions is therefore done on a per-node basis and does not need system-level design.
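
The idea of node-local arbitration can be sketched in a few lines of Python. This is an illustration only, not the TTP or CAN-emulation implementation: the class name, the 2-byte per-message header, and the rule that the lowest identifier is sent first (borrowed from CAN) are all assumptions made for this sketch.

```python
import heapq

HEADER_BYTES = 2  # assumed per-message overhead (identifier and length)


class EventChannel:
    """Per-node event channel: a fixed byte budget inside the node's own slot."""

    def __init__(self, reserved_bytes):
        self.reserved_bytes = reserved_bytes
        self._queue = []   # heap of (identifier, sequence number, payload)
        self._seq = 0      # keeps submission order for equal identifiers

    def submit(self, identifier, payload):
        """Queue an event message from one of the node's functions."""
        if HEADER_BYTES + len(payload) > self.reserved_bytes:
            raise ValueError("message can never fit into the reserved bandwidth")
        heapq.heappush(self._queue, (identifier, self._seq, payload))
        self._seq += 1

    def fill_slot(self):
        """Pack queued messages, lowest identifier first, until the budget is used."""
        packed, used = [], 0
        while self._queue:
            identifier, _, payload = self._queue[0]
            size = HEADER_BYTES + len(payload)
            if used + size > self.reserved_bytes:
                break  # the rest waits for this node's next slot
            heapq.heappop(self._queue)
            packed.append((identifier, payload))
            used += size
        return packed


channel = EventChannel(reserved_bytes=12)
channel.submit(0x300, b"abcd")
channel.submit(0x100, b"12345678")
channel.submit(0x200, b"zz")
first_round = channel.fill_slot()
second_round = channel.fill_slot()
```

Because the byte budget belongs to one node, the worst-case delay of a message depends only on what the other functions of the same node submit. No other node can take this bandwidth away.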

 

How can typical bandwidth-intense event transmissions like diagnosis and download be performed in TTP?

Diagnosis data, like diagnosis requests and answers, are typically transmitted over event channels. A diagnostic channel for common diagnosis protocols requires about 1% per node of the net bandwidth of a 5 Mbit/s TTP system. This bandwidth is protected by the bus guardian, just like every other transmission.

 

For download and end-of-line programming, TTP offers a dedicated download mode. This mode requires a special programming device, a Download Master, and is performed in a simple master-slave fashion, utilizing the complete bus bandwidth for flash programming or similar. This is a maintenance functionality, which does not affect, or require, time-triggered operation.

 

Are the length and data size of each TDMA slot individually configurable?

Yes, the duration of slots and the number of bytes per slot are configured individually in the MEDL. The duration of a node's slot in relation to the length of the TDMA round represents the share of bandwidth that this node "owns" in the network. This allows flexible configuration and optimization of bandwidth usage.
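
As a small illustration of the "share of bandwidth" relation, the following Python sketch computes each node's share from its slot duration. The node names and durations are hypothetical, and the sketch assumes that the round consists only of the listed slots, with each slot duration already including its inter-frame gap.

```python
def bandwidth_shares(slot_durations_us):
    """Return each node's share of the round: slot duration / round duration."""
    round_us = sum(slot_durations_us.values())
    return {node: duration / round_us
            for node, duration in slot_durations_us.items()}


# Hypothetical configuration: three nodes with individually sized slots.
slots = {"node_a": 100, "node_b": 300, "node_c": 600}
shares = bandwidth_shares(slots)

for node, share in shares.items():
    print(f"{node}: {share:.0%} of the round")
```

In this hypothetical configuration, node_c owns 60% of the network bandwidth simply because its slot is six times as long as that of node_a.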

 

Can the two channels be used for different data?

Yes, TTP allows sending data on both channels (redundantly), or just on one channel (in which case more bandwidth is available). Any amount of "mixing" of redundant and non-redundant data is possible. Of course, fault tolerance is not provided for non-redundant data.

 

Can nodes be connected to only one channel?

In principle, this is possible, even for consistent TTP slots. It is, however, not recommended because it imposes application-specific restrictions on the single fault hypothesis at the protocol level. Spatial proximity faults can be more efficiently handled by using a TTP star architecture. Consistency between nodes not connected to the same channel is not supported as it contradicts the principle of consistent communication among all nodes.

 

Why can a node not send several times during one round?

A TTP round is defined as the shortest interval between two sending slots of any node. The most prominent reasons for this definition are a simple, and thus safer, bus guardian design and the verification of the safety mechanisms: it must be guaranteed that each node can influence acknowledgment and clock synchronization at most once per round in order to validate the “single fault” hypothesis. If nodes need a slower rate of retransmission, this is possible using the “multiplexing” feature. Since the size of each TDMA slot in TTP can be configured individually with up to 240 bytes per slot, and nodes with slower retransmission requirements can be defined as “multiplexed,” there is no reason for multiple sending slots in TTP.

 

Configuration Data (MEDL) and CPU Interface (CNI)

 

Does the MEDL (message descriptor list) have to be changed on all ECUs in the network when the communication design is changed?
No, not generally. Since the MEDL contains only information about slot sizes, and not data, the communication tables, and therefore the MEDL, need to be updated only when a slot size is changed. Not-yet-allocated, and therefore empty, slots are reserved for future ECUs. If an ECU is added, then only nodes that receive data from the new ECU have to be updated. As a result, only component testing is required before a new ECU is integrated into the system. The overall communication behavior is unchanged because the MEDL remains the same. Changing the MEDL, however, would necessitate a system retest.

 

A change in the MEDL would also be necessary if there is insufficient or no bandwidth left for new functionality. In such a situation, the remaining bandwidth at existing nodes must be combined to provide the additionally required bandwidth. This requires a complete, but probably quite small change in the system communication timing. Obviously this problem of "updating a nearly 100% full system" is a general issue and not directly related to TTP.

 

Does the use of a MEDL address the necessity of providing identical communication interfaces between platforms and suppliers and support for changes without reprogramming of nodes that are unaffected by a change?
Yes. This is because the MEDL defines only the length of slots, not the contents of slots. The contents of a slot (the size, position, and packing of signals inside data frames) are defined by a separate middleware layer (OSEKtime FT-COM). A suitable platform strategy can be specified accordingly.

 

Do all communicating nodes have to have all communication information (the complete communication matrix) in their MEDLs?
No, the MEDL contains only information about the slot lengths, not about the contents of the individual slots. Consequently, multi-vendor projects with restricted information about the communication interfaces are supported by this MEDL architecture.

 

How much memory does the MEDL require?
Experience has shown that the memory needed for the MEDL is not an issue. Very complex systems may have about 3 kB, but such systems will always have a large portion of software to pack and unpack the messages in a node, and the MEDL can be seen as being part of this software. For other communication systems, the information contained in the MEDL (communication controller configuration driver) is stored in the non-volatile memory of the CPU instead.

 

The protocol specification places no restriction on the structure and size of the MEDL, nor on the implementation of MEDL memory (RAM, Flash). Silicon manufacturers are therefore free to optimize their MEDL design for best efficiency and cost.

 

Does the CNI (communication network interface) require the CPU to access all communication data synchronously?
No, asynchronous access is possible using the non-blocking write protocol. Typically, a separate middleware layer (OSEKtime FT-COM) performs the interfacing to the CNI, thus removing the need for the application software to do any direct CNI access.
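
The general idea behind a non-blocking write scheme can be sketched as follows. This is a simplified, single-threaded Python model, not the CNI register layout of any TTP controller: the class, the field name ccf (a concurrency control field), and the `between` test hook are assumptions made for illustration. The writer is never delayed. The reader detects that a write interfered with its read and simply tries again.

```python
class NbwCell:
    """One shared data area guarded by a concurrency control field (CCF)."""

    def __init__(self, size):
        self.ccf = 0                  # even: data stable, odd: write in progress
        self.data = bytearray(size)

    def write(self, payload):
        """Writer side: never waits for the reader."""
        if len(payload) != len(self.data):
            raise ValueError("payload must match the size of the data area")
        self.ccf += 1                 # mark the start of the write (CCF is odd)
        self.data[:] = payload
        self.ccf += 1                 # mark the end of the write (CCF is even)


def nbw_read(cell, max_retries=10, between=None):
    """Reader side: retry until a snapshot was not disturbed by a write."""
    for attempt in range(max_retries):
        start = cell.ccf
        if start % 2:                 # a write is in progress, try again
            continue
        snapshot = bytes(cell.data)
        if between is not None:
            between(attempt)          # hook that lets a simulated writer interfere
        if cell.ccf == start:         # CCF unchanged, so the snapshot is consistent
            return snapshot, attempt
    raise RuntimeError("no consistent read within the retry budget")


cell = NbwCell(4)
cell.write(b"\x01\x02\x03\x04")
undisturbed = nbw_read(cell)


def interfere(attempt):
    """Simulate the controller writing new data during the first read attempt."""
    if attempt == 0:
        cell.write(b"\x09\x09\x09\x09")


disturbed = nbw_read(cell, between=interfere)
```

In the second call the first read attempt is discarded because the field changed during the read, and the second attempt returns the new data. In a time-triggered system the write instants are known in advance, so the number of retries can be bounded.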

 

Why does TTP put received data into memory locations defined by the MEDL, and not into message buffers using identifiers, like CAN?
For two reasons: memory efficiency and fault detection.

  • Memory efficiency: If transmissions can be up to 8 bytes long, as they are in CAN, not very much memory is wasted if a single byte message is received in an 8-byte buffer. If transmissions can be up to 200 bytes long, buffers will have to be 200 bytes long. However, many transmissions contain only a couple of bytes.
    The available communication memory (CNI) in TTP can be allocated in any pattern, regardless of the length of transmission. This allocation is configured in the MEDL. In a 2 kB memory area, for instance, there can be as many short or as many long buffers as are needed, so long as they all fit within 2 kB. If they do not, overlapping structures can be defined, in which data is overwritten at known intervals. So a TTP CNI is never "overfull." More CNI memory simply means fewer CPU activations to get the data out, and therefore more time for application tasks.
  • Fault detection: When identifiers are used, the problem of masquerading can occur: an ECU in the system transmits an identifier that has not been assigned to it. By adding check data to the transmission itself (which takes up bandwidth as well as CPU time at both the sender and the receiver), it can reliably be checked whether a message really comes from the sender it claims to come from. The problem is that, in the case of a masquerading fault, it usually cannot be determined which ECU is the faulty sender. Any ECU could be the faulty one.
    TTP does not use any identifiers at the protocol level. All transmissions are identified only by the time of transmission. Identifiers can be included in the transmissions to signal message content, but the sender cannot masquerade as another. It is always 100% certain which ECU sent a specific transmission, correct or faulty.
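
How identification by time of transmission rules out masquerading can be illustrated with a small Python sketch. It is a simplification made up for this page: the node names, the equal slot length of 100 microseconds, and the dictionary-based frame are assumptions, and real TTP slots can have individual lengths.

```python
SLOT_US = 100  # assumed: all slots have the same duration in this sketch
SCHEDULE = ["brake_front", "brake_rear", "steering", "gateway"]  # hypothetical


def sender_at(global_time_us):
    """Derive the sender from the global time alone, using the static schedule."""
    slot = (global_time_us // SLOT_US) % len(SCHEDULE)
    return SCHEDULE[slot]


def receive(global_time_us, frame):
    """Attribute a frame to the owner of the current slot.

    Any identifier inside the frame describes the content only. It is never
    used to decide who sent the frame.
    """
    sender = sender_at(global_time_us)
    claimed = frame.get("claimed_sender")
    return {
        "sender": sender,
        "masquerade_attempt": claimed is not None and claimed != sender,
    }


# A faulty "steering" node claims to be the gateway inside its own slot.
result = receive(250, {"claimed_sender": "gateway", "payload": b"\x00"})
```

The receiver attributes the frame to "steering" because that node owns the slot at time 250, whatever the frame itself claims. The faulty node is therefore identified directly.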

Why does TTP not use buffer management?
Buffers are feasible only for short data packets. With larger buffers (say 100 bytes), the buffer memory would be unused for most of the time since most data packets would be much shorter than the maximum length. In TTP, each buffer would need to have 240 bytes, especially if double buffering (for atomic read operation) and a reasonable number of buffers (at least 16 or 32) are to be supported. The TTP CNI uses address-mapped data frames instead, and therefore allows a perfect optimization of the memory usage of data packets.
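
The memory argument can be made concrete with a short calculation. In the following Python sketch, the 240-byte maximum, the 16 buffers, and the double buffering come from the text above. The list of frame lengths is purely hypothetical, and the back-to-back layout is only one possible address mapping.

```python
MAX_FRAME_BYTES = 240  # maximum slot size mentioned in the text


def fixed_buffer_memory(num_buffers, double_buffered=True):
    """Memory needed if every buffer must hold a maximum-length frame."""
    copies = 2 if double_buffered else 1
    return num_buffers * MAX_FRAME_BYTES * copies


def address_mapped_layout(frame_lengths):
    """Place frames back-to-back and return (start offsets, total bytes)."""
    offsets, cursor = [], 0
    for length in frame_lengths:
        offsets.append(cursor)
        cursor += length
    return offsets, cursor


# Hypothetical frame lengths (in bytes) for 16 frames in one round.
FRAMES = [8, 4, 16, 2, 240, 12, 6, 32, 8, 8, 4, 64, 2, 10, 20, 16]

buffered_bytes = fixed_buffer_memory(len(FRAMES))
offsets, mapped_bytes = address_mapped_layout(FRAMES)

print(f"fixed buffers: {buffered_bytes} bytes, address-mapped: {mapped_bytes} bytes")
```

For this hypothetical frame mix, sixteen double-buffered maximum-length buffers need 7680 bytes, while the address-mapped layout needs 452 bytes. The exact ratio depends entirely on the frame lengths assumed.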

 

Furthermore, the update rate for buffers may vary strongly, while the update rate of CNI frames is constant and known. If a buffer has not been read when a new matching message is received, the buffer contents have to be overwritten or the new data will have to be discarded. In TTP, the newest data for any location in the CNI DPRAM memory is always available, the CPU is not affected by a large interrupt load (especially on fast communication systems with short rounds), and no data will be lost even in the worst-case load scenario.

 

Operation of Partial Networks and Start-Up

 

Do the fault tolerance mechanisms of TTP cause problems if only parts of a network are present, as is often the case during tests?
Typically, they do not cause any problem. The distributed clock synchronization is the only mechanism that might be susceptible to problems for this reason. That is because it requires a working subset (at least two nodes) of the "actively synchronizing" set of nodes in the full network to be present. This requirement of the distributed clock synchronization can be fulfilled with rest bus simulation hardware (TTP nodes sending in the slots of the missing nodes) if necessary.

 

Can TTP start up with noise on a channel?

Yes, this is possible on a bus. When an intelligent star architecture is used, both channels can be noisy. Startup can still be performed, however, provided the noise does not destroy all communication.

 

Physical Layer

 

What dedicated TTP physical layers are available?
TTP has been tested on several existing physical layers. Most systems in use today have high-speed CAN transceivers or RS485. Except for MOST, no physical layer is currently available that supports automotive requirements at data rates higher than 1 Mbit/s. This applies to all communication protocols in use today. The physical layer specification for TTP has now been finalized. This specification is a result of the TTA-Group Physical Layer Working Group.

 

Does TTP support electrical and optical data transmission?
Yes, a system has been tested in which one channel was made of copper and the other of glass.

 

What is the difference between a dedicated TTP physical layer and RS485?
The dedicated TTP physical layer differs from RS485 in the following features:

  • Defined bus idle behavior without bus biasing
  • Excellent common mode behavior (-12 / +20V) and high speed
  • Fail-silent behavior when exceeding the permissible common-mode range
  • Wake-up function and sleep mode
  • Built-in diagnostics for electric bus faults
  • Compatible with 42V power supply (PowerNet)
  • Adaption to communication controller supply voltage (2.5V, 3.3V, 5V)

What are the requirements for a dedicated TTP physical layer?
The finalized specification takes into account all timing and level requirements for a dedicated TTP physical layer. As to the environment requirements, the same guidelines apply as for existing CAN networks. Additionally, higher safety and support of 42V power supply are included in the specification.

 

Was any testing done under close-to-real conditions?
Yes, e.g., the simulation and test setup of an aircraft bus architecture. This test involved various cable lengths (50 and 100 m) and investigated the aging effect on connectors. All functional and EMI/EMC tests, including lightning upset tests, were passed in accordance with RTCA DO-160D.

 

What EMI/EMC requirements were included in the design of a dedicated TTP physical layer?
The EMI/EMC requirements included in the design of a dedicated TTP physical layer were as follows: ISO-pulse-test (ISO7637-1, ISO7637-2, ISO7637-3), ESD protection, EMI, and EMC. Additionally, many environment requirements from the automotive industry were also taken into consideration.

 

Does TTP support energy management like shutdown and wake-up?
Yes, the physical layer supports these functions.

 

Is wake-up also supported by standard drivers?
Yes, in principle. Standard drivers such as high-speed CAN use dominant bits in the data stream for detecting wake-up. Due to MFM encoding, TTP supports wake-up only up to a certain data rate, depending on the component used.



Information