Donald Yeung's Publications

Articles in Refereed Symposia, Conferences, and Workshops:

Wanli Liu and Donald Yeung. Using Aggressor Thread Information to Improve Shared Cache Management for CMPs. To appear in Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT-XVIII). Raleigh, NC. September 2009.
[pdf, gzip'd ps]

Xuanhua Li and Donald Yeung. Exploiting Value Prediction for Fault Tolerance. In Proceedings of the 3rd Workshop on Dependable Architectures (WDA-III). Lake Como, Italy. November 2008.
[pdf, gzip'd ps]

Xuanhua Li and Donald Yeung. Application-Level Correctness and its Impact on Fault Tolerance. In Proceedings of the 13th International Symposium on High-Performance Computer Architecture (HPCA-XIII). Phoenix, AZ. February 2007.
[pdf, ps]

Xuanhua Li and Donald Yeung. Exploiting Soft Computing for Increased Fault Tolerance. In Proceedings of the 2006 Workshop on Architectural Support for Gigascale Integration. Boston, MA. June 2006.
[pdf, ps]

Seungryul Choi and Donald Yeung. Learning-Based SMT Processor Resource Distribution via Hill-Climbing. In Proceedings of the 33rd International Symposium on Computer Architecture (ISCA-XXXIII). Boston, MA. June 2006.
[pdf, gzip'd ps]

Kursad Albayraktaroglu, Aamer Jaleel, Xue Wu, Manoj Franklin, Bruce Jacob, Chau-Wen Tseng, and Donald Yeung. BioBench: A Benchmark Suite of Bioinformatics Applications. In Proceedings of the 2005 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-V). Austin, TX. March 2005.
[pdf] [benchmark suite download]

Deepak N. Agarwal, Sumitkumar N. Pamnani, Gang Qu, and Donald Yeung. Transferring Performance Gain from Software Prefetching to Energy Reduction. In Proceedings of the 2004 International Symposium on Circuits and Systems (ISCAS2004). Vancouver, Canada. May 2004.
[pdf, gzip'd ps]

Dongkeun Kim, Steve Shih-wei Liao, Perry Wang, Juan del Cuvillo, Xinmin Tian, Xiang Zou, Hong Wang, Donald Yeung, Milind Girkar, and John Shen. Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors. In Proceedings of the 2004 International Symposium on Code Generation and Optimization with Special Emphasis on Feedback-Directed and Runtime Optimization (CGO2004). San Jose, CA. March 2004.
[pdf, gzip'd ps]

Deepak Agarwal, Wanli Liu, and Donald Yeung. Exploiting Application-Level Information to Reduce Memory Bandwidth Consumption. Fourth Workshop on Complexity-Effective Design. San Diego, CA. June 2003.
[pdf, gzip'd ps]

Dongkeun Kim and Donald Yeung. Design and Evaluation of Compiler Algorithms for Pre-Execution. In Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X). San Jose, CA. October 2002.
[pdf, gzip'd ps]

Gautham K. Dorai and Donald Yeung. Transparent Threads: Resource Allocation in SMT Processors for High Single-Thread Performance. In Proceedings of the 11th Annual International Conference on Parallel Architectures and Compilation Techniques (PACT-XI). Charlottesville, VA. September 2002.
[pdf, gzip'd ps]

Nicholas Kohout, Seungryul Choi, Dongkeun Kim, and Donald Yeung. Multi-Chain Prefetching: Effective Exploitation of Inter-Chain Memory Parallelism for Pointer-Chasing Codes. In Proceedings of the 10th Annual International Conference on Parallel Architectures and Compilation Techniques (PACT-X). Barcelona, Spain. September 2001.
[pdf, gzip'd ps]

Abdel-Hameed A. Badawy, Aneesh Aggarwal, Donald Yeung, and Chau-Wen Tseng. Evaluating the Impact of Memory System Performance on Software Prefetching and Locality Optimizations. In Proceedings of the 15th Annual International Conference on Supercomputing (ICS-XV). Sorrento, Italy. June 2001.
[pdf, gzip'd ps]

Nicholas Kohout, Seungryul Choi, and Donald Yeung. Multi-Chain Prefetching: Exploiting Memory Parallelism in Pointer-Chasing Codes. Solving the Memory Wall Problem Workshop. Vancouver, Canada. June 2000.
[pdf, gzip'd ps]

Donald Yeung. The Scalability of Multigrain Systems. In Proceedings of the 13th Annual International Conference on Supercomputing (ICS-XIII). Rhodes, Greece. June 1999.
[pdf, gzip'd ps]

Donald Yeung, Nicholas Kohout, Sujata Ramasubramanian, Ilya Khazanov, and Rishi Kurichh. Vortex: Irregular Data Stream Support for Data-Intensive Applications. Eighth Scalable Shared Memory Multiprocessors Workshop. Atlanta, GA. April 1999.
[abstract]

Andras Moritz, Donald Yeung, and Anant Agarwal. Exploring Optimal Cost-Performance Designs for Raw Microprocessors. In Proceedings of the 6th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM-VI). Napa, CA. April 1998.
[pdf, gzip'd ps]

Donald Yeung, John Kubiatowicz, and Anant Agarwal. MGS: A Multigrain Shared Memory System. In Proceedings of the 23rd Annual International Symposium on Computer Architecture (ISCA-XXIII). Philadelphia, PA. May 1996.
[pdf, gzip'd ps]

Anant Agarwal, Ricardo Bianchini, David Chaiken, Kirk L. Johnson, David Kranz, John Kubiatowicz, Beng-Hong Lim, Ken Mackenzie, and Donald Yeung. The MIT Alewife Machine: Architecture and Performance. In Proceedings of the 22nd Annual International Symposium on Computer Architecture (ISCA-XXII). Santa Margherita, Italy. June 1995.
[pdf, gzip'd ps]

Donald Yeung and Anant Agarwal. Experience with Fine-Grain Synchronization in MIMD Machines for Preconditioned Conjugate Gradient. In Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP-IV). San Diego, CA. May 1993.
[pdf, gzip'd ps]

Invited Conference Papers:

Stephen Crago, Janice Onanian McMahon, Chris Archer, Krste Asanovic, Richard Chaung, Keith Goolsbey, Mary Hall, Christos Kozyrakis, Kunle Olukotun, Una-May O'Reilly, Rick Pancoast, Viktor Prasanna, Rodric Rabbah, Steve Ward, and Donald Yeung. CEARCH: Cognition-Enabled Architecture. In Proceedings of the 10th Annual High Performance Embedded Computing Workshop. Lexington, MA. September 2006.
[pdf]

Articles in Refereed Journals:

Wanli Liu and Donald Yeung. Enhancing LTP-Driven Cache Management Using Reuse Distance Information. Journal of Instruction-Level Parallelism. Vol. 11. pp. 1-24. April 2009.
[pdf, gzip'd ps]

Seungryul Choi and Donald Yeung. Hill-Climbing SMT Processor Resource Distribution. ACM Transactions on Computer Systems. Vol. 27, No. 1. February 2009. (c) 2009 ACM
[pdf]
(ACM digital library distribution)

Xuanhua Li and Donald Yeung. Exploiting Application-Level Correctness for Low-Cost Fault Tolerance. Journal of Instruction-Level Parallelism. Vol. 10. pp. 1-28. September 2008.
[pdf, gzip'd ps]

Sumit Pamnani, Deepak Agarwal, Gang Qu, and Donald Yeung. Low Power System Design with Performance Enhancement Techniques--General Approach and Case Study. Journal of Circuits, Systems, and Computers. Vol. 16, No. 5. pp. 745-767. October 2007.
[pdf]

Dongkeun Kim and Donald Yeung. A Study of Source-Level Compiler Algorithms for Automatic Construction of Pre-Execution Code. ACM Transactions on Computer Systems. Vol. 22, No. 3. pp. 326-379. August 2004. (c) 2004 ACM
[pdf, gzip'd ps]
(ACM digital library distribution)

Abdel-Hameed A. Badawy, Aneesh Aggarwal, Donald Yeung, and Chau-Wen Tseng. The Efficacy of Software Prefetching and Locality Optimizations on Future Memory Systems. Journal of Instruction-Level Parallelism. Vol. 6. July 2004.
[pdf, gzip'd ps]

Seungryul Choi, Nicholas Kohout, Sumit Pamnani, Dongkeun Kim, and Donald Yeung. A General Framework for Prefetch Scheduling in Linked Data Structures and its Application to Multi-Chain Prefetching. ACM Transactions on Computer Systems. Vol. 22, No. 2. pp. 214-280. May 2004. (c) 2004 ACM
[pdf, gzip'd ps]
(ACM digital library distribution)

Gautham K. Dorai, Donald Yeung, and Seungryul Choi. Optimizing SMT Processors for High Single-Thread Performance. Journal of Instruction-Level Parallelism. Vol. 5. pp. 1-35. April 2003.
[pdf, gzip'd ps]

Andras Moritz, Donald Yeung, and Anant Agarwal. SimpleFit: A Framework for Analyzing Design Tradeoffs in Raw Architectures. IEEE Transactions on Parallel and Distributed Systems. Vol. 12, No. 6. pp. 730-742. June 2001.
[pdf, gzip'd ps]

Donald Yeung, John Kubiatowicz, and Anant Agarwal. Multigrain Shared Memory. ACM Transactions on Computer Systems. Vol. 18, No. 2. pp. 154-196. May 2000. (c) 2000 ACM
[pdf, gzip'd ps]
(ACM digital library distribution)

Anant Agarwal, Ricardo Bianchini, David Chaiken, Frederic T. Chong, Kirk L. Johnson, David Kranz, John D. Kubiatowicz, Beng-Hong Lim, Kenneth Mackenzie, and Donald Yeung. The MIT Alewife Machine. Proceedings of the IEEE. Vol. 87, No. 3. pp. 430-444. March 1999.
[pdf, gzip'd ps]

Anant Agarwal, John Kubiatowicz, David Kranz, Beng-Hong Lim, Donald Yeung, Godfrey D'Souza, and Mike Parkin. Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors. IEEE Micro. pp. 48-61. June 1993.
[pdf, gzip'd ps]

Chapters in Books:

Janice McMahon, Steve Crago, and Donald Yeung. Advanced Microprocessor Architectures. High Performance Embedded Computing Handbook: A Systems Perspective. CRC Press. 2008.

Yan Solihin and Donald Yeung. Data Cache Prefetching. Speculative Execution in High Performance Computer Architectures. CRC Press. 2005.

David Kranz, Beng-Hong Lim, Donald Yeung, and Anant Agarwal. Low-Cost Support for Fine-Grain Synchronization in Multiprocessors. Multithreading: A Summary of the State of the Art. Kluwer Academic Publishers. 1992.
[pdf, gzip'd ps]

Articles in Review:

Technical Reports:

Wanli Liu and Donald Yeung. Compatible Working Sets: the Case for Flexible Management of Shared Caches in CMPs. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2008-13. July 2008.
[pdf]

Wanli Liu and Donald Yeung. Enhancing LTP-Driven Cache Management Using Reuse Distance Information. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2007-33. June 2007.
[pdf]

Xuanhua Li and Donald Yeung. Application-Level Correctness and its Impact on Fault Tolerance. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2006-36. August 2006.
[pdf]

Meng-Ju Wu and Donald Yeung. Parallelization of the SSCA #3 Benchmark on the RAW Processor. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2006-42. August 2006.
[pdf]

Seungryul Choi and Donald Yeung. Hill-Climbing SMT Processor Resource Scheduler. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2005-30. May 2005.
[pdf]

Gautham K. Dorai, Donald Yeung, and Seungryul Choi. Optimizing SMT Processors for High Single-Thread Performance. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2003-07. January 2003.
[pdf, gzip'd ps]

Deepak Agarwal and Donald Yeung. Exploiting Application-Level Information to Reduce Memory Bandwidth Consumption. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2002-64. July 2002.
[pdf, gzip'd ps]

Dongkeun Kim and Donald Yeung. Using Program Slicing to Drive Pre-Execution on Simultaneous Multithreading Processors. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2001-49. June 2001.
[pdf, gzip'd ps]

Aneesh Aggarwal, Abdel-Hameed A. Badawy, Donald Yeung, and Chau-Wen Tseng. Evaluating the Impact of Memory System Performance on Software Prefetching and Locality Optimizations. University of Maryland Institute for Advanced Computer Studies Technical Report, UMIACS-TR-2000-57. July 2000.
[pdf, gzip'd ps]

Nicholas Kohout, Seungryul Choi, and Donald Yeung. Multi-Chain Prefetching: Exploiting Memory Parallelism in Pointer-Chasing Codes. University of Maryland Systems and Computer Architecture Group Technical Report, UMD-SCA-TR-2000-01. June 2000.
[pdf, gzip'd ps]

Donald Yeung. Multigrain Shared Memory. Ph.D. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. MIT/LCS Technical Report, MIT-LCS-TR-743. February 1998.
[pdf, gzip'd ps]

Donald Yeung, William J. Dally, and Anant Agarwal. How to Choose the Grain Size of a Parallel Computer. MIT/LCS Technical Report, MIT-LCS-TR-739. February 1994.
[pdf, gzip'd ps]

Donald Yeung. An Evaluation of Multiprocessor Support for Fine-Grain Synchronization in Preconditioned Conjugate Gradient. Master's Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. MIT/LCS Technical Report, MIT-LCS-TR-565. February 1993.
[pdf, gzip'd ps]

Unpublished:

Donald Yeung. Scalability of Multicast Communication over Wide-Area Networks. Area Exam, Massachusetts Institute of Technology. April 1996.
[pdf, gzip'd ps]

ACM permission notice:
The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

ACM copyright notice:
Copyright (c) 2000 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.

Last updated: September 2009 by Donald Yeung (yeung@eng.umd.edu)