
Networking issues in distributed real-time systems.


Abstract

Networking touches every aspect of network infrastructure design, from the selection or synthesis of the interconnection topology to the choice of communication protocols and the way the network is deployed and maintained. A large body of literature addresses these issues. We add to it by examining two specific problems: the synthesis of networks that satisfy multiple properties, and the design of fault-tolerant communication services for high-speed networks.

Synthesizing networks that satisfy multiple requirements, such as high reliability, low diameter, and good embeddability, is a difficult problem for which no completely satisfactory solution exists. Our approach is a simple filtration process that takes a large number of randomly generated graphs as input. We use multiple filters, one per requirement, arranged so that each feeds the next; the final output is a short list of networks from which the designer can choose. Our experimental results show that this approach is both practical and powerful. Perhaps our most significant result is that this seemingly simple approach can generate networks that are serious competitors to several well-known traditional networks. We further demonstrate the practical applicability of these networks by considering how they can be used effectively in a packaging environment.

The interconnection network can have a dominant effect on the reliability of a distributed system. Existing network software has been optimized for performance but does not deal with network failures effectively. We have developed a lightweight fault detection and recovery technique that covers almost all network interface failures. Detection is based on software watchdog timers, and recovery is based on delta-logging.
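As an illustration of detection by software watchdog timer, the sketch below shows the general idea only: a monitored component periodically "kicks" the timer, and the host declares a failure once no kick has arrived within a timeout. The abstract does not describe how the thesis implements this on the network interface, so the class name `SoftwareWatchdog`, its methods, and the 1 ms timeout used in the example are assumptions chosen for illustration.

```python
import time


class SoftwareWatchdog:
    """Hypothetical watchdog: reports failure if the monitored component goes silent."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # illustrative value, not taken from the thesis
        self.clock = clock              # injectable so the logic can be tested without sleeping
        self.last_kick = clock()

    def kick(self):
        """Called by the monitored component whenever it makes progress."""
        self.last_kick = self.clock()

    def expired(self):
        """Polled by the host; True once no kick has arrived within the timeout."""
        return self.clock() - self.last_kick > self.timeout_s


# Example with a fake clock: the component kicks once, then goes silent.
fake_now = [0.0]
watchdog = SoftwareWatchdog(timeout_s=0.001, clock=lambda: fake_now[0])
watchdog.kick()
fake_now[0] = 0.005                     # 5 ms pass with no further kick
if watchdog.expired():
    print("interface presumed failed; start recovery")
```

Injecting the clock keeps the sketch deterministic; a real detector would run against a hardware or operating-system timer and would trigger the recovery path instead of printing.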
We have implemented these schemes as a fault tolerance layer over Myrinet, a commercially available networking technology. The implementation shows that a fault detection time of 1 ms and a complete recovery time of about 0.5 seconds can be achieved with a performance impact of less than 10%. We evaluated the effectiveness of our fault tolerance schemes using a versatile performance and recovery analysis tool called RAPIDS.
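The filtration process described above for network synthesis can be sketched as a chain of predicates applied to randomly generated graphs. This is a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not name the concrete filters, so the maximum-degree and diameter filters, the use of minimum degree as a crude stand-in for a reliability requirement, and all function names and thresholds here are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def random_graph(n, m, rng):
    """Return an adjacency dict for a random simple graph with n nodes and m edges."""
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def eccentricity(adj, source):
    """BFS from source; largest hop count, or None if some node is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()) if len(dist) == len(adj) else None


def diameter(adj):
    """Return the graph diameter, or None for a disconnected graph."""
    eccs = [eccentricity(adj, v) for v in adj]
    return None if None in eccs else max(eccs)


def min_degree_filter(limit):
    # Assumed proxy for reliability: every node keeps at least `limit` links.
    return lambda adj: min(len(nbrs) for nbrs in adj.values()) >= limit


def max_degree_filter(limit):
    # Assumed proxy for a cost or packaging constraint on ports per node.
    return lambda adj: max(len(nbrs) for nbrs in adj.values()) <= limit


def diameter_filter(limit):
    def keep(adj):
        d = diameter(adj)
        return d is not None and d <= limit
    return keep


def run_pipeline(candidates, filters):
    """Feed the survivors of each filter into the next; return the short list."""
    survivors = list(candidates)
    for keep in filters:
        survivors = [g for g in survivors if keep(g)]
    return survivors


rng = random.Random(0)
candidates = [random_graph(16, 32, rng) for _ in range(200)]
filters = [min_degree_filter(3), max_degree_filter(5), diameter_filter(3)]
shortlist = run_pipeline(candidates, filters)
print(len(shortlist), "of", len(candidates), "candidate graphs pass every filter")
```

Ordering the cheap degree checks before the more expensive diameter computation reduces the work done by later stages, which is one practical reason to arrange the filters so that each feeds the next.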
