ENEE 647: Design of Distributed Computer Systems

UMCP ENEE 647 In-Depth Course Description

Course Goals

Communication protocols; models of interprocess communication and synchronization in distributed operating systems; interprocess synchronization and communication primitives; remote procedure call protocols; electronic mail and store-and-forward communication; deadlock handling in distributed systems; processes and transactions in distributed systems; client-server models of computation; distributed shared memory; distributed file systems; recovery and fault tolerance; protection and communication security.

Course Prerequisites

ENEE 459S or an equivalent operating systems course.

Topic Prerequisites

Basic understanding of digital computer design, including processor, memory, and I/O organization and interconnections, and instruction set architecture. The student is expected to have basic knowledge of, and experience in, assembly language programming for at least one instruction set architecture. Basic probability theory is required for performance evaluation.

References

  1. Andrew S. Tanenbaum, "Distributed Operating Systems," Prentice-Hall, 1995.
  2. Andrew S. Tanenbaum, "Computer Networks," 3rd edition, Prentice-Hall, 1995.

Core Topics

  • Review of multiprocessor systems with and without shared memory and of multicomputer systems. Interconnections between system components, ranging from shared buses, crossbar switches, and dedicated buses to network interconnections. Examination of the impact of these architectures on operating system mechanisms, and of how network operating systems and distributed operating systems differ from centralized ones.
  • Client-server, peer-to-peer, and group communication models of distributed computation. Blocking vs. nonblocking and reliable vs. unreliable communication. Remote procedure call semantics.
  • Clock synchronization concepts and algorithms, mutual exclusion, election algorithms, distributed transactions, transaction atomicity (e.g., consistency, crash recovery), and deadlock handling in distributed systems.
  • Process models and threads, thread semantics, processor allocation, allocation algorithms, and processor scheduling in distributed systems.
  • Distributed file and directory system concepts, semantics of file sharing, and file caching and replication algorithms. Design and implementation problems and solutions; case studies: Network File System vs. Andrew File System, and log-structured file systems. Trends in distributed file systems.
  • Shared memory architectures (e.g., on-chip, bus-based, ring-based, NUMA); consistency models for shared memories; page-based shared memory systems; shared-variable memory systems; object-based shared memory.

Optional Topics

  • Security in distributed systems and networks, including: security threats in networks and distributed systems; introduction to cryptography; authentication protocols based on symmetric and asymmetric cryptography; digital signatures; access control in distributed systems; case studies (e.g., DCE security).
  • Fault tolerance in distributed systems, including: classification of system failures (e.g., omission, timing, Byzantine failures); synchronous vs. asynchronous systems; redundancy; active replication; primary-site backup; membership and agreement protocols.