------------------------- CSRD Reports --------------------------------------
NOTE:
Some reports are available in hard copy only. Please send mail to the CSRD
Librarian, librarian@csrd.uiuc.edu, to receive a copy via U.S. mail.
(The cost for a hard copy of a paper is $ .06 per page.)
Files with ".Z" extensions are UNIX "compress" archives - extract them
using the UNIX "uncompress" command.
Files with ".gz" extensions are archives compressed with "gzip" - extract
them using the "gunzip" command.
(*) Graphs or pictures not included. Please send email to CSRD Librarian,
librarian@csrd.uiuc.edu, to receive a hard copy via U.S. mail.
(**) Uuencoded version available in addition to compressed PostScript; look
for filenames with extension .uu .
(+) Report consists of many PostScript files archived as a tar file;
look for filenames with extension .tar .
rpt# - author - title
-----------------------------------------------------------------------------
811 - Eric J. Bina - Modifications to the UNIX File System Check
Program FSCK for Quicker Crash Recovery
813 - David Marcovitz - A Multiprocessor Cache Performance Metric
816 - E. Gallopoulos and - Solving Elliptic Equations on the Cedar
Ahmed Sameh Multiprocessor
819 - D. N.-M. Jayasimha - Communication and Synchronization in
Parallel Computation
821 - Thomas Martin Conte - The Simulation and Tuning of the Global
Memory Subsystem of a Multiprocessor
825 - V. Guarna, D. Gannon, - Faust: An Integrated Environment for the
D. Jablonowski, Y. Gaur, Development of Parallel Programs
and A. Malony
826 - Allan Tuchman and - Matrix Visualization in the Design of
Michael Berry Numerical Algorithms
827 - M. Berry, D. Chen, - The Perfect Club Benchmarks: Effective
P. Koss, D. Kuck, Performance Evaluation on Supercomputers
L. Pointer, etc.
833 - Peter Shirley and - Volume Visualization Methods for Scientific
Allan Tuchman Computing
835 - Zhiyuan Li and - Solution of a Divide-and-Conquer Maximin
Edward M. Reingold Recurrence
836 - D. Lilja, D. Marcovitz, - Memory Referencing Behavior and a Cache
and P.-C. Yew Performance Metric in a Shared Memory
Multiprocessor
838 - S. Abraham and - Analysis of the Twisted Cube Topology
K. Padmanabhan
839 - Perry Emrath, Sanjoy - Event Synchronization Analysis for Debugging
Ghosh, and David Padua Parallel Programs
840 - Z. Shen, Z. Li, and - An Empirical Study on Array Subscripts and
P.-C. Yew Data Dependences
841 - William Tsun-yuk Hsu - A Simple Synchronization Network for Large
and Pen-Chung Yew Multiprocessor Systems
845 - Lee-Hian Quek and - Efficient Space-Subdivision Methods In
D. D. Hearn Ray-Tracing Algorithms
846 - David J. Kuck - Keynote Address: 15th Annual Int'l Symp. On
Computer Architecture, 1988, Honolulu, HI
847 - Peter Koss - Application Performance on Supercomputers
849 - Peter Shirley and - Volume Visualization at the Center for
Henry Neeman Research and Development
850 - Ju-Ho Tang - Performance Evaluation of Vector Machine
Architectures
851 - M. Berry, H.-C. Chen, - Algorithmic Design on the Cedar
U. Meier, A. Tuchman, Multiprocessor
H. Wijshoff, G.-C. Yang,
and E. Gallopoulos
852 - Z. Li, P.-C. Yew, and - Data Dependence Analysis on Multi-Dimensional
C.-Q. Zhu Array References
854 - E. Gallopoulos and - On the Parallel Solution of Parabolic
Y. Saad Equations
859 - H. Neeman and A. Tuchman - Simulation Time Animation System
890 - Hsin-Chu Chen and - A Domain Decomposition Method for 3D
Ahmed Sameh Elasticity Problems
892 - Steven Karlovsky - Automatic Management of Programmable Caches
Algorithms and Experience
896 - Lynn Pointer - Perfect Report: 1
900 - David Lilja - Efficient Generation of Poisson Distributed
Random Numbers
902 - E. Gallopoulos, - Experiments with Elliptic Problem Solvers
G. Frank, and U. Meier on the Cedar Multicluster
903 - E. Gallopoulos and - Efficient Parallel Solutions of Parabolic
Y. Saad Equations: Implicit Methods
907 - Timothy Alden Davis - A Parallel Algorithm for Sparse Unsymmetric
Factorization
910 - Zhiyuan Li - Intraprocedural and Interprocedural Data
Dependence Analysis for Parallel Computing
911 - S. Midkiff, D. Padua, - Compiling Programs with User Parallelism
and R. Cytron
916 - Hoichi Cheong and - Compiler-Directed Cache Management In
Alexander Veidenbaum Multiprocessors
917 - Ding-Kai Chen - MaxPar: An Execution Driven Simulator for
Studying Parallel Systems
919 - Tim McDaniel - Xas Reference Manual
920 - Hsin-Chu Chen and - Performance of the Finite Strip Method for
Aifang He Structural Analysis on a Parallel Computer
923 - A. Malony, D. Reed, - Performance Measurement Intrusion and
and H. Wijshoff Perturbation Analysis
928 - David Padua - Problem Solving Environments for Parallel
Computing
929 - Robert D. Skeel - Macromolecular Dynamics on a Shared Memory
Multiprocessor
931 - Richard Barton - Xylem Scheduling
932 - Pen-Chung Yew - Summary of Cedar System Configuration
933 - T. Kerkhoven, A. Galick, - Efficient Numerical Simulation of Electron
J.H. Arends, Y. Saad, States in Quantum Wires
and U. Ravaioli
936 - Brian Bliss - Instrumentation of Fortran Programs for
Automatic Roundoff Error Analysis and
Performance Evaluation
937 - Hoichi Cheong and - Compiler-Directed Cache Management for
Alexander Veidenbaum Multiprocessors
938 - James R.B. Davies - Delta Project Quarterly Progress Report
942 - Ding-Kai Chen, Hong-Men - The Impact of Synchronization and
Su, and Pen-Chung Yew Granularity on Parallel Systems
943 - Hoichi Cheong and - Basic Components of the Parsim Simulator
Alexander Veidenbaum
945 - Hoichi Cheong - CHDL Subset Specifications
949 - Edward H. Gornish - Compile Time Analysis for Data Prefetching
950 - Hiroshi Suzuki - A Serial Communication Interface for a
Parallel Simulation System
951 - B. Bliss, M.-C. Brunet, - Automatic Parallel Program Instrumentation
and E. Gallopoulos With Applications in Performance And Error
Analysis
952 - William Tsun-yuk Hsu and - An Effective Synchronization Network for
Pen-Chung Yew Hot-spot Accesses (*)
953 - Hoichi Cheong - Compiler-Directed Cache Coherence
Strategies for Large-Scale Shared-Memory
Multiprocessor Systems
954 - David Lilja and - The Performance Potential of Fine-Grain and
Pen-Chung Yew Coarse-Grain Parallel Architectures
955 - C.-Q. Zhu, Z. Fang, and - A New Parallel Sorting Approach with
X. Li Sorting Memory Module
959 - Seth Abraham and - Constraint Based Evaluation of Computer
Krishnan Padmanabhan Networks
961 - David Hammerslag - Faust Library Browser User's Manual
964 - Lynn Pointer - Perfect: Performance Evaluation for Cost-
Effective Transformations Report 2
965 - G. Cybenko, L. Kipp, - Supercomputer Performance Evaluation and
L. Pointer and D. Kuck the Perfect Benchmarks(TM)
966 - R. Eigenmann, G. Jaxon, - Cedar Fortran and Its Compiler
J. Hoeflinger, D. Padua
967 - David Jablonowski - The Project Manager Library
968 - David Jablonowski - GMB: Graph Manager/Browser
969 - E. Gallopoulos and - Efficient Solution of Parabolic Equations
Y. Saad By Polynomial Approximation Methods
970 - Jose Moreira and - The Lidex Approach
Wilson Ruggiero
971 - Jose Moreira and - A Review of HDLs
Wilson Ruggiero
972 - Jose Moreira and - Lidex Tutorial
Wilson Ruggiero
973 - Jose Moreira and - Lidex Reference Manual
Wilson Ruggiero
974 - Jose Moreira and - Lidex Simulation Environment User's Manual
Wilson Ruggiero
975 - Jose Moreira and - ASM Description with Lidex
Wilson Ruggiero
976 - Jose Moreira and - Analysis of a SIMD Computer
Wilson Ruggiero
977 - Seth Abraham - Issues in the Architecture of Direct
Interconnection Schemes for Multiprocessors
978 - J. Laminie and U. Meier - Solving Navier-Stokes Equations on the Cedar
Multi-Cluster System
979 - Hsin-Chu Chen - Two Special Classes of Matrices
982 - P. Cappello, C. Koc, - Systolic Computation of Interpolating
and E. Gallopoulos Polynomials
983 - Zhiyu Shen, Zhiyuan Li, - An Empirical Study of Fortran Programs for
and Pen-Chung Yew Parallelizing Compilers
988 - Kwang Keun Yi - On-the-Fly Methods to Measure the Locality
of Programs
989 - David J. Lilja - The Impact of Parallel Loop Scheduling
Strategies on Prefetching in a Shared-Memory
Multiprocessor
990 - David J. Lilja and Pen- - Improving Memory Utilization in Cache
Chung Yew Coherence Directories
991 - Jyh-Herng Chow - Parallel Execution of Lisp Programs in the
Parcel Run Time Environment
995 - Mohammad Reza Haghighat - Symbolic Dependence Analysis For High
Performance Parallelizing Compilers
997 - Chia-Ling Lee - On Run-Time Systems for Parallel
Supercomputers
1005 - Dale Allan Schouten - An Overview of Interprocedural Analysis
Techniques for High Performance Parallelizing
Compilers
1006 - Peter Shirley, Allan - A Polygonal Approximation to Direct Scalar
Tuchman Volume Rendering
1009 - John Barrett Andrews - A Hardware Tracing Facility for a
Multiprocessing Supercomputer
1012 - Bruce Paul Leung - Issues on the Design of Parallelizing
Compilers
1014 - G.G. Hung, Y.C. Wen, - A Parallel Circuit Simulation Using
K. Gallivan, R. Saleh Hierarchical Relaxation
1016 - P. Sinvhal-Sharma and - CPROF: A Trace Based Profiler for Shared
Sanjay Sharma Memory Multiprocessor Systems
1018 - Peter Williams and - An A Priori Depth Ordering Algorithm for
Peter Shirley Meshed Polyhedra
1022 - C. Koc, P. Cappello, and - Decomposing Polynomial Interpolation for
E. Gallopoulos Systolic Arrays
1024 - Mei-Qin Chen and Ahmed - Conjugate Subspaces Decomposition and Its
Sameh Application in Solving Linear Systems with
Many Right-Hand Sides
1025 - B. Bliss, M.C. Brunet, - Automatic Parallel Program Instrumentation
and E. Gallopoulos with Applications in Performance and Error
Analysis
1026 - Ryan O'Neill McDonald - A Neural Network Approach to Phoneme
Recognition
1027 - David Krumme, - Gossiping in Minimal Time
George Cybenko,
and K. Venkataraman
1028 - Mei-Qin Chen - A Parallel Quasi-Newton Method for Partially
Separable Large-Scale Minimization
1029 - Youcef Saad - SPARSKIT: A Basic Tool Kit for Sparse Matrix
Computation
1032 - J.-H. Chow, and Luddy - Switch-Stacks: A Scheme for Microtasking
Harrison Nested Parallel Loops
1033 - Chuigang Fu - Evaluating the Effectiveness of Fortran
Vectorizers by Measuring Total Parallelism
1034 - Allen Davis Malony - Performance Observability
1035 - Ulrike Meier and - Parallelization and Performance of Conjugate
Rudolf Eigenmann Gradient Algorithms on the Cedar Hierarchical-
Memory Multiprocessor
1036 - Utpal Banerjee - Unimodular Transformations of Double Loops
1042 - Zahira Ammarguellat - A Control-Flow Normalization Algorithm and
Its Complexity
1044 - Gang Lou - Nested Iterative Methods for a Class of
Indefinite Systems
1045 - Victor Eijkhout - Analysis of Parallel Incomplete Point
Factorizations
1047 - S. Aslam, R. Bramley, - The Advanced Software Development and
H.C. Chen, G. Cybenko, Commercialization Project Progress Report
E. Gallopoulos, H. Gao, PR-1
A. Malony, A. Sameh,
T. Canfield, M. Minkoff,
C. Mueller, E. Plaskacz,
D. Weber, D. Anderson,
and I.U. Therios
1048 - Victor Eijkhout - Beware of Modified Incomplete Factorizations
1049 - Michael Waitsel Berry - Multiprocessor Sparse SVD Algorithms and
Applications
1051 - Pen-Chung Yew and - SEE: A System Evaluation Environment for
John Bruner Studying Parallel Systems
1052 - CSRD Staff - Perfect Report 2: Addendum 1
1053 - Elana Granston and - Detecting Redundant Accesses to Array Data
Alexander Veidenbaum
1056 - George N. Frank - Experiments on the Cedar Multicluster with
Parallel Block Cyclic Reduction and an
Application to Domain Decomposition Methods
1057 - David J. Lilja and - Combining Hardware and Software Cache
Pen-Chung Yew Coherence Strategies
1061 - Naomi Voegtli and - The Performance of Hierarchical Systems with
Pen-Chung Yew Wiring Constraints
1062 - Hock-Beng Lim and - Parallel Program Behavioral Study on a
Pen-Chung Yew Shared Multiprocessor (+)
1065 - Gang Lou - Parallel Methods for Solving Linear Systems
Via Overlapping Decomposition
1067 - Allan Tuchman, David - A System for Remote Data Visualization
Jablonowski, and George
Cybenko
1068 - David Jablonowski and - Vista Users Manual
Allan Tuchman
1070 - Carl J. Beckmann and - Broadcast Networks for Fast Synchronization
Constantine D.
Polychronopoulos
1071 - Henry Neeman - A Decomposition Algorithm For Visualizing
Irregular Grids
1073 - Elana Granston and - An Integrated Hardware/Software Solution
Alexander Veidenbaum for Effective Management of Local Storage in
High-Performance Systems
1074 - Pavlos Konas - Parallel Discrete Event Simulation on
Shared Memory Multiprocessors
1075 - John Bruner, Hoichi - Quarterly Progress Report: Chief Project
Cheong, Alexander
Veidenbaum, and Pen-
Chung Yew
1080 - Paul M. Petersen and - Experimental Evaluation of Some Data
David A. Padua Dependence Tests
1082 - Zhiyuan Li - Compiler Algorithms for Event Variable
Synchronization
1083 - Steve Sullivan - Vector and Parallel Implementations of the
Wavelet Transform
1084 - Elana Granston, - Design and Analysis of a Scalable, Shared-
Stephen Turner, and Memory System with Support for Burst Traffic
Alexander Veidenbaum
1085 - Gung-Chung Yang - DSPACK: A Parallel Direct Sparse Matrix
Package for Shared-Memory Multiprocessors
1086 - Gung-Chung Yang - PARASPICE: A Portable Parallel Circuit
Simulator
1087 - Gung-Chung Yang - An Integrated CAD System for Device Model
Design, Parameter Extraction, and Circuit
Simulation
1088 - Gung-Chung Yang - PARASPICE: A Parallel Circuit Simulator
for Shared-Memory Multiprocessors
1089 - S. Saarinen, R. Bramley,- Ill-Conditioning in Neural Network Training
G. Cybenko Problems
1093 - George Cybenko - Supercomputer Performance Trends and the
Perfect Benchmarks
1094 - Hsin-Chu Chen - Circulative Matrices of Degree O
1095 - Kwang-Keun Yi and - On-the-fly Circuit to Measure the Average
Luddy Harrison Working Set Size
1097 - Peter L. Williams - Visibility Ordering Meshed Polyhedra
1099 - Jyh-Herng Chow, and - Microtasking Recursive, Parallel Programs
Luddy Harrison
1100 - Gregory Jaxon, David - Project Summary: The Delta Program
Padua, and Paul Petersen Manipulation System
1101 - David J. Lilja, and - Architectures and Compiler Techniques for
Pen-Chung Yew Exploiting Parallelism in Loops
1104 - Elana Granston and - Signature-Based Polymorphism for C++
Vincent Russo
1107 - Allan Tuchman, George - VISTA: A System for Remote Data Visualization
Cybenko, David
Jablonowski, Brian
Bliss, and Sanjay Sharma
1109 - Victor Eijkhout - Beware of Unperturbed Modified Incomplete
Factorizations
1110 - John B. Andrews, and - An Analytical Approach to Performance/Cost
Constantine D. Modeling of Parallel Computers
Polychronopoulos
1111 - Carl J. Beckmann and - The Effect of Scheduling and Synchronization
Constantine D. Overhead on Parallel Loop Performance
Polychronopoulos
1115 - Jay Hoeflinger - Automatic Parallelization and Manual
Improvements of the Perfect Club Program For
Cedar
1118 - Perry Emrath, Sanjoy - Detecting Nondeterminacy in Parallel
Ghosh, and David Padua Programs
1119 - Albert T. Galick, - Iterative Solution of the Eigenvalue Problem
Thomas Kerkhoven and for a Dielectric Waveguide
Umberto Ravaioli
1121 - Victor Eijkhout and - The Role of the Strengthened C.B.S.
Panayot Vassilevski Inequality In Multilevel Methods
1122 - CSRD Staff - The Cedar Project
1124 - L. DeRose, K. Gallivan, - Parallel Ocean Circulation Modeling on
E. Gallopoulos, and Cedar
A. Navarra
1125 - Bret Andrew Marsolf - Large Grain Parallel Sparse System Solver
1129 - S. Aslam, H.C. Chen, - The Advanced Software Development and
G. Cybenko, H. Gao, Commercialization Project Progress Report
E. Gallopoulos, M. Ham, PR-2
A. Malony, A. Sameh,
S. Sharma, T. Canfield,
D. Leibfritz, M. Minkoff,
C. Mueller, E. Plaskacz
1130 - S. Aslam, E. Gallopoulos, - Experiments in Thermal Hydraulics
M. Ham, T. Canfield, M. Simulation: Multiprocessing Commix
Minkoff, & R. Blomquist
1131 - Allan Tuchman, David - Run-Time Visualization of Program Data
Jablonowski, and George
Cybenko
1132 - E. Gallopoulos - Algorithms and Applications Research at CSRD
1133 - Mario Furnari and - Run Time Management of Lisp Parallelism and
Constantine the Hierarchical Task Graph Program
Polychronopoulos Representation
1134 - U. Meier, G. Skinner, - A Collection of Codes for Sparse Matrix
J. Gunnels Computations
1136 - David John Lilja - Processor Parallelism Considerations and
Memory Latency Reduction in Shared Memory
Multiprocessors
1137 - Naomi Voegtli - Chebyshev Polynomial Preconditioning for the
Conjugate Gradient Method
1138 - Hock-Beng Lim - Characterization of Parallel Program
Behavior on a Shared-Memory Multiprocessor
(+)
1139 - Brian Edward Healy - Parallel and Vector Algorithms in Nonlinear
Structural Dynamics Using the Finite Element
Method
1141 - Milind Girkar - Formalizing Functional Parallelism
1142 - K. Gallivan, B. - MCSPARSE: A Parallel Sparse Unsymmetric
Marsolf, H. Wijshoff Linear System Solver
1144 - P. Chang, D. Lavery, - The Importance of Prepass Code Scheduling
W-M Hwu for Superscalar and Superpipelined Processors
1145 - Yen-Cheng Wen, Kyle - Parallel Event-Driven Waveform Relaxation
Gallivan, and Resve Saleh
1149 - B. Bliss - Interactive Steering Using the Application
Executive
1150 - Lawrence Rauchwerger - PERFECT: The Portably Instrumented Perfect
Benchmarks
1151 - J. Andrews, C. Beckmann,- Notification and Multicast Networks for
and D. Poulsen Synchronization and Coherence. (**)
1152 - P. Sharma, - Perfect Benchmarks(TM): Instrumented Version
L. Rauchwerger & J. Larson
1153 - Daeshik Lee - Boundary Method-Based Domain Decomposition
on Multiprocessors
1156 - William Tsun-yuk Hsu - The Impact of Wiring Constraints on
and Pen-Chung Yew Hierarchical Networks (*)
1157 - Jay Hoeflinger - Cedar Fortran Programmer's Handbook
1160 - George Nikolas Angouras - Scheduling of Parallel Programs on
Multiprogrammed Parallel Processor Systems
1166 - Jyh-Herng Chow and - Compile-time Analysis of Parallel Programs
Luddy Harrison that Share Memory
1167 - Luddy Harrison - Dynamic Control of Parallelism and
Jyh-Herng Chow Granularity in Executing Nested Parallel
Loops
1169 - S. Sharma, R. Bramley, - Evaluating, Visualizing and Analysing the
P. Sinvhal-Sharma, Parallel Program Performance
and G. Cybenko
1173 - Paul Petersen and - Machine-Independent Evaluation of
David Padua Parallelizing Compilers
1177 - Peter L. Williams - Is Interactive Direct Volume Rendering
Feasible for Nonrectilinear Volumes?
1178 - Rudolf Eigenmann - Toward a Methodology of Optimizing
Programs for High-Performance Computers
1179 - William Tsun-yuk Hsu - Performance Evaluation of Wire-Limited
and Pen-Chung Yew Hierarchical Networks (*)
1180 - Perry Emrath and Bret - mdb - Xylem Parallel Debugger User's Guide
Marsolf
1181 - Luiz Antonio DeRose - Parallel Ocean Circulation Modeling on Cedar
(M.S. thesis)
1182 - Milind Girkar - Functional Parallelism Theoretical
Foundations and Implementation
1183 - Michael Berry and - Scientific Benchmark Characterizations
George Cybenko
1184 - P. McClaughry and - Tools That Led to Increased Program
R. Eigenmann Performance
1189 - Hoichi Cheong - Life Span Strategy - A Compiler-Based
Approach to Cache Coherence
1190 - Randy Bramley - An Orthogonal Projection Algorithm for
Generalized Stokes Problem
1193 - R. Eigenmann, J. - Experience in the Automatic
Hoeflinger, Z. Li, Parallelization of Four Perfect-
and D. Padua Benchmark Programs
1194 - Jay Hoeflinger - Run-Time Dependence Testing by Integer
Sequence Analysis
1196 - Bruno Nitrosso - Porting of N3S (release)
1197 - Mohammad Haghighat, and - Symbolic Dependence Analysis for High-
Constantine Performance Parallelizing Compilers
Polychronopoulos
1200 - L. DeRose, K. Gallivan, - Experiments with an Ocean Circulation
and E. Gallopoulos Model on Cedar
1203 - V. Simoncini and - A Memory-Conserving Hybrid Method for
E. Gallopoulos Solving Linear Systems with Multiple
Right Sides (Extended Abstract)
1205 - V. Simoncini and - QMSTAB: A Quasi-Minimum Residual Approach
E. Gallopoulos for the BI-CGSTAB Algorithm
1207 - C. Beckmann and C. - Microarchitecture Support for Dynamic
Polychronopoulos Scheduling of Acyclic Task Graphs
1210 - Ulrike Meier Yang - Preconditioned Conjugate Gradient-Like
Methods for Nonsymmetric Linear Systems
1212 - Gregory Jaxon - Cedar Fortran Data Distribution - Using the
-D Runtime Library
1214 - Jose E. Moreira - Multiple Omega Networks for Parallel
Processing
1218 - William Blume and - Performance Analysis of Parallelizing
Rudolf Eigenmann Compilers on the Perfect Benchmarks(TM)
Programs
1221 - David C. Sehr and - Estimating the Inherent Parallelism in
Laxmikant V. Kale Prolog Programs
1222 - Luddy Harrison - PARCEL and MIPRAC: Parallelizers for
and Zahira Ammarguellat Symbolic and Numeric Programs
1223 - Luddy Harrison - The Design of Automatic Parallelizers
and Zahira Ammarguellat for Symbolic and Numeric Programs
1225 - Patrick Earl McClaughry - PTOPP - A Practical Toolset for the
Optimization of Parallel Programs
1227 - Luddy Harrison - A Program's Eye View of MIPRAC (**)
and Zahira Ammarguellat
1228 - Paul M. Petersen and - Dynamic Dependence Analysis: A Novel
David A. Padua Method for Data Dependence Evaluation
1231 - T.F. Chan, T. Szeto, - QMRCGSTAB: A Quasi-Minimal Residual
E. Gallopoulos, V. Variant of the Bi-CGSTAB Algorithm for
Simoncini, and C.H. Tong Nonsymmetric Systems
1237 - M. Haghighat and - Symbolic Program Analysis and
C. Polychronopoulos Optimization for Parallel Compilers
1238 - R. Netzer and S. Ghosh - Efficient Race Condition Detection for
Shared-Memory Programs with Post/Wait
Synchronization
1239 - J. Chow and Luddy - A General Framework for Analyzing Shared-
Harrison Memory Parallel Programs
1240 - S. Ho - MaxPar Extensions for Isolating Performance
Problems (*)
1241 - Z. Ammarguellat - A Control-Flow Normalization Algorithm
and Its Complexity (**)
1242 - V. Simoncini and E. - An Iterative Method for Nonsymmetric
Gallopoulos Systems with Multiple Right-Hand Sides
1243 - Peter L. Williams - Interactive Direct Volume Rendering of
Curvilinear and Unstructured Data
1244 - Kwang-Keun Yi and - Interprocedural Data Flow Analysis for
Luddy Harrison Compile-Time Memory Management
1245 - Li-Ling Chen and - Efficient Computation of the Fixpoints
Luddy Harrison that Arise in Complex Program Analysis
1246 - Jay Hoeflinger - Automatic Parallelization and Manual
Improvements of the Perfect Club Program
OCEAN for Cedar
1247 - Jay Hoeflinger - Automatic Parallelization and Manual
Improvements of the Perfect Club Program
TRFD for Cedar
1248 - H. Chen, H. Gao, and - WHAMS3D Project Progress Report PR-3:
G. Lai Parallel Implementations of WHAMS3D on
Two Shared-Memory Multiprocessors
1249 - William Joseph Blume - Success and Limitation in Automatic
Parallelization of the Perfect
Benchmarks(TM) Programs
1250 - U. Banerjee, R. - Automatic Program Parallelization
Eigenmann, A. Nicolau
and D.A. Padua
1251 - David A. Padua - Problem-Solving Environments for Parallel
Computers
1252 - Tsun-yuk Hsu - Multiprocessor Communications: Design and
Technology
1259 - E. Gallopoulos, E. - Future Research Directions in Problem
Houstis and J. R. Rice Solving Environments for Computational
Science
1260 - Kwangkeun Yi and - Automatic Generation and Management of
Luddy Harrison Interprocedural Program Analyses
1261 - D. Kuck, E. Davidson, - The Cedar System and an Initial Performance
D. Lawrie, A. Sameh, Study
D. Padua, P. Yew, et al.
1267 - Elana Granston and - Compile Time Techniques for Using the
Alexander Veidenbaum Priority Data Cache to Reduce Memory
Access Delays
1268 - Mahdi Seddighnezhad - Using a Cache in Place of a Cedar-Like Vector
Prefetch Unit
1270 - Sanjoy Ghosh - Automatic Detection of Nondeterminacy and
Scalar Optimization in Parallel Programs
1273 - Paul Petersen - Evaluation of Programs and Parallelizing
Compilers Using Dynamic Analysis Techniques
1274 - L. DeRose, K. Gallivan - Status Report: Parallel Ocean Circulation
and E. Gallopoulos Modeling on Cedar
1276 - R. Eigenmann and - Practical Tools for Optimizing Parallel
P. McClaughry Programs
1277 - P. Petersen and D. Padua - Static and Dynamic Evaluation of Data
Dependence Analysis
1278 - D. Calvetti, E. - Accuracy Control for Parallel Evaluation
Gallopoulos and L. of Matrix Rational Functions
Reichel
1279 - D. Padua and P. Petersen - Evaluation of Parallelizing Compilers
1281 - Brian Edward Usevitch - Perfect Reconstruction Filter Banks for
Adaptive Filtering and Coding
1282 - P. Emrath and B. Marsolf - MDB - A Parallel Debugger for Cedar
1283 - Kwangkeun Yi and Luddy - System Z1 Programming Manual
Harrison
1284 - U. M. Yang and K. A. - An Analysis of a Cedar Implementation
Gallivan of DYFESM
1288 - David Christopher Sehr - Automatic Parallelization of Prolog Programs
1289 - Yung-Chin Chen - Cache Design and Performance in a Large-Scale
Shared-Memory Multiprocessor System
1294 - David Jablonowski, Brian - VASE User's Manual Version 1.0
Bliss, John Bruner, and
Robert Haber
1298 - Albert Galick - Efficient Solution of Large Sparse
Eigenvalue Problems in Microelectronic
Simulation
1299 - Luddy Harrison - Generalized Iteration Space and the
Parallelization of Symbolic Programs
(Extended Abstract)
1301 - Jyh-Herng Chow - Compile-Time Analysis of Explicitly
Parallel Programs
1303 - Peng Tu and David Padua - Automatic Array Privatization
1306 - David A. Padua and - Polaris: A New Generation Parallelizing
Rudolf Eigenmann Compiler for MPP's
1310 - Kwangkeun Yi - Automatic Generation and Management of
Program Analysis
1314 - Mohammad Haghighat and - Symbolic Analysis: A Basis for
C. Polychronopoulos Parallelization, Optimization, and
Scheduling of Programs
1316 - Carl J. Beckmann and - Explicit Dynamic Scheduling: A Practical
Constantine D. Micro-Dataflow Architecture
Polychronopoulos
1317 - K.A. Faigin, J.P. - The Polaris Internal Representation
Hoeflinger, D.A. Padua,
P.M. Petersen, and
S.A. Weatherford
1318 - Xiaoge Wang - Incomplete Factorization Preconditioning
for Linear Least Squares Problems
1319 - D. Calvetti, L. Reichel, - Incomplete Partial Fractions for Parallel
and E. Gallopoulos Evaluation of Rational Matrix Functions
1323 - Gregg M. Skinner - Simulation of DNA Solvation on a Shared-
Memory Parallel Computer
1324 - Gregg M. Skinner - Finding and Exploiting Parallelism in a
Production Combustion Simulation
1325 - Lynn Choi and Pen-Chung - A Compiler Directed Cache Coherence Scheme
Yew With Improved Intertask Locality
1328 - Kyle Gallivan, Bret - The Parallel Solution of Nonsymmetric
Marsolf and Harry Sparse Linear Systems Using the H*
Wijshoff Reordering and Associated Factorization
1329 - L. Rauchwerger and - The PRIVATIZING DOALL Test: A Run-Time
D.A. Padua Technique for DOALL Loop Identification
and Array Privatization
1330 - David K. Poulsen and - Data Prefetching and Data Forwarding in
Pen-Chung Yew Shared Memory Multiprocessing
1331 - Ding-Kai Chen and - Statement Reordering for DOACROSS Loops
Pen-Chung Yew
1332 - William Blume and - Symbolic Analysis Techniques Needed for the
Rudolf Eigenmann Effective Parallelization of the Perfect
Benchmarks
1335 - Luis DeRose and David - An Inference Mechanism for the Compilation of
Padua Interactive Array Languages
1336 - Peng Tu and David Padua - Demand-Driven Symbolic Analysis
1337 - Jose E. Moreira and - Autoscheduling in a Shared Multiprocessor
Constantine D.
Polychronopoulos
1338 - Rudolf Eigenmann, - Restructuring Fortran Programs for Cedar.
Jay Hoeflinger,
Greg Jaxon, Zhiyuan Li,
David Padua
1339 - Lawrence Rauchwerger - Speculative Run-Time Parallelization of Loops
and David Padua
1340 - Kyle Gallivan and - Practical Issues Related to Developing
Bret Marsolf Object-Oriented Numerical Libraries
1341 - Kyle Gallivan, - On the Development of Libraries and Their
William Jalby, Bret Use in Applications
Marsolf, Ahmed Sameh
1342 - V. Simoncini and - Convergence Analysis of Block Iterative
E. Gallopoulos Methods
1343 - V. Simoncini and - Block Iterative Methods and Matrix
E. Gallopoulos Equations
1345 - William Blume, - The Range Test: A Dependence Test for
Rudolf Eigenmann Symbolic, Non-linear Expressions.
1346 - Carl Josef Beckmann - Hardware and Software for Functional
and Fine Grain Parallelism.
1347 - Grant Haab, Michael - Analysis and Exploration of the program FALSE
Klemme, Sharad Mehrotra,
Krishna Subramanian
1348 - W. Blume, R. Eigenmann, - Automatic Detection of Parallelism: A Grand
J. Hoeflinger, D. Padua, Challenge for High-Performance Computing
P. Petersen, L.
Rauchwerger, and P. Tu
1349 - Lawrence Rauchwerger and - Parallelizing WHILE Loops for Multiprocessor
David Padua Systems
1351 - Sharad Mehrotra and - A New Data Prefetch Mechanism for
Luddy Harrison Accelerating General-Purpose Computation
1352 - Sirpa Helena Saarinen - Modelling Functions From Sample Data With
Classification Applications
1359 - Gregg M. Skinner, and - Parallelization and Performance of a
Rudolf Eigenmann Combustion Chemistry Simulation
1372 - J. Moreira and C. - On the Implementation and Effectiveness of
Polychronopoulos Autoscheduling
1373 - J. Moreira and C. - Autoscheduling in a Distributed Shared-Memory
Polychronopoulos Environment
1374 - Ding-Kai Chen - Compiler Optimizations for Parallel Loops
With Fine-Grained Synchronization
1375 - W. Blume, R. Eigenmann, - Polaris: The Next Generation in
K. Faigin, J. Grout, Parallelizing Compilers
J. Hoeflinger, D. Padua,
P. Petersen, W. Pottenger,
L. Rauchwerger, P. Tu,
S. Weatherford
1377 - David K. Poulsen - Memory Latency Reduction via Data
Prefetching and Data Forwarding in Shared
Memory Multiprocessors
1378 - V. Simoncini and E. - A Hybrid Block GMRES Method for
Gallopoulos Nonsymmetric Systems with Multiple Right-
Hand Sides
1381 - William Blume and - Symbolic Range Propagation
Rudolf Eigenmann
1383 - Lawrence Rauchwerger - The Privatizing DOALL Test: A Run-Time
and David Padua Technique for DOALL Loop Identification
and Array Privatization
1384 - Edward Gornish - Adaptive and Integrated Data Cache
Prefetching for Shared-Memory Multiprocessors
1385 - D. Chen, D. Oesterreich, - An Efficient Algorithm for the Run-time
J. Torrellas, P. Yew Parallelization of DOACROSS Loops
1386 - J. Torrellas and Z. - The Performance of the Cedar Multistage
Zhang Switching Network
1387 - J. Torrellas, C. Xia, - Optimizing Instruction Cache Performance for
R. Daigle Operating System Intensive Workloads
1388 - Josep Torrellas and - Comparing the Performance and Programmability
David Koufaty of the DASH and Cedar Multiprocessors for
Scientific Loads
1389 - Peng Tu and David - Efficient Building and Placing of Gating
Padua Functions
1390 - Lawrence Rauchwerger - The LRPD Test: Speculative Run-Time
and David Padua Parallelization of Loops with Privatization
and Reduction Parallelization
1392 - Rudolf Eigenmann, - On the Automatic Parallelization of the
Jay Hoeflinger, and Perfect Benchmarks
David Padua
1393 - William Morton Pottenger - Induction Variable Substitution and Reduction
Recognition in the Polaris Parallelizing
Compiler, M.S. thesis
1396 - Bill Pottenger and - Parallelization in the Presence of
Rudolf Eigenmann Generalized Induction and Reduction
Variables
1399 - Peng Tu and David Padua - Gated SSA-Based Demand-Driven Symbolic
Analysis for Parallelizing Compilers
1400 - Lawrence Rauchwerger, - Run-Time Methods for Parallelizing Partially
Nancy M. Amato, David Parallel Loops
A. Padua
1403 - Hui Gao and John Larson - A Year's Profile of Academic Supercomputer
Users Using the CRAY Hardware Performance
Monitor
1404 - Jose Eduardo Moreira - On the Implementation and Effectiveness of
Autoscheduling for Shared-Memory
Multiprocessors (Ph.D. thesis)
1405 - William Blume, Rudolf - Polaris: Improving the Effectiveness of
Eigenmann, Keith Faigin, Parallelizing Compilers
John Grout, Jay
Hoeflinger, David Padua,
Paul Petersen, Bill
Pottenger, Lawrence
Rauchwerger, Peng Tu,
Stephen Weatherford
1407 - Ulrike Meier Yang and - A New Family of Preconditioned Iterative
Kyle A. Gallivan Solvers for Nonsymmetric Linear Systems
1408 - Ulrike Meier Yang - A Family of Preconditioned Iterative Solvers
for Sparse Linear Systems (Ph.D. thesis)
1410 - Peng Tu - Privatization and Distribution of Arrays
1411 - K. Gallivan, E. Grimme, - A Rational Lanczos Algorithm for Model
and P. Van Dooren Reduction
1412 - Masayuki Kuba - On the Parallelization of a CFD Code
1413 - Masayuki Kuba, - The Synergetic Effect of Compiler,
Constantine D. Architecture, and Manual Optimizations on
Polychronopoulos, and the Performance of CFD on Multiprocessors
Kyle Gallivan
1415 - Kyle Gallivan, Eric - Asymptotic Waveform Evaluation via a Lanczos
Grimme, Paul Van Dooren Method
1416 - Kyle Gallivan, Srikanth - On Solving Block Toeplitz Matrices Using
Thirumalai, Paul Van a Block Schur Algorithm (extended version)
Dooren
1418 - Kyle Gallivan, Srikanth - A Block Toeplitz Look-ahead Algorithm
Thirumalai, Paul Van
Dooren
1419 - Kyle Gallivan, Srikanth - QR Factorization of Rank Deficient Block
Thirumalai, Paul Van Toeplitz Matrices
Dooren
1420 - Kyle Gallivan, Eric - On Some Modifications of the Lanczos
Grimme, David Sorensen, Algorithm and the Relation With Pade
Paul Van Dooren Approximations
1421 - Kyle Gallivan, Eric - Pade Approximation of Large-Scale Dynamical
Grimme, Paul Van Dooren Systems with Lanczos Methods
1422 - Kyle Gallivan, Srikanth - High Performance Algorithms for Toeplitz and
Thirumalai, V. Vermaut, Block Toeplitz Matrices
Paul Van Dooren
1423 - Kyle Gallivan, Srikanth - A New Look-Ahead Schur Algorithm
Thirumalai, Paul Van
Dooren
1424 - Kyle Gallivan, Harry - Solving Large Nonsymmetric Sparse Linear
Wijshoff, Bret Marsolf Systems Using McSparse
1425 - Lynn Choi and Pen-Chung - Eliminating Stale Data References through
Yew Array Data-Flow Analysis
1426 - Bill Pottenger and - Idiom Recognition in the Polaris
Rudolf Eigenmann Parallelizing Compiler
1427 - Lynn Choi and Pen-Chung - Interprocedural Array Data-Flow Analysis for
Yew Cache Coherence
1428 - Dale Allan Schouten - Efficient Scheduling of Parallel Tasks in a
Multiprogramming Environment (Ph.D. thesis)
1429 - Bill Blume and Rudolf - Demand-driven, Symbolic Range Propagation
Eigenmann
1430 - Luiz DeRose, Kyle - A MATLAB Compiler and Restructurer for the
Gallivan, Bret Marsolf, Development of Scientific Libraries and
David Padua Applications
1431 - John Robert Grout - Inline Expansion for the Polaris Research
Compiler (M.S. thesis)
1432 - Peng Tu - Automatic Array Privatization and Demand-
Driven Symbolic Analysis (Ph.D. thesis)
1433 - William Joseph Blume - Symbolic Analysis Techniques for Effective
Automatic Parallelization (Ph.D. thesis)
1434 - Eduard Ayguade, Cristina - A Uniform Internal Representation for High-
Barrado, Jesus Labarta, Level and Instruction-Level Transformations
David Lopez, Susana
Moreno, David Padua,
Mateo Valero
1435 - Sunil Kim - Interconnection Networks and Data Prefetching
for Large-scale Multiprocessors: Design and
Performance
1436 - Adam Stuart Block - A Study of Instruction-Level Parallelism
Architectures and Overhead Analysis of
Parallel Execution
1437 - Luiz DeRose, Kyle - FALCON: An Environment for the Development
Gallivan, Stratis of Scientific Libraries and Applications
Gallopoulos, Bret
Marsolf, David Padua
1439 - Sharad Mehrotra and - A Close Look at a New Memory Access
Luddy Harrison Classification Scheme
1442 - William Blume, Rudolf - Effective Automatic Parallelization with
Eigenmann, Keith Faigin, Polaris
John Grout, Jay
Hoeflinger, David Padua,
Paul Petersen, William
Pottenger, Lawrence
Rauchwerger, Peng Tu,
Stephen Weatherford
1443 - Stephen Wilson Turner - Performance Analysis of Multiprocessor
Interconnection Networks Using A Burst-
Traffic Model
1444 - Lawrence Rauchwerger, - A Scalable Method for Run-Time Loop
Nancy M. Amato, David Parallelization
A. Padua
1448 - L. DeRose, K. Gallivan, - FALCON: A MATLAB Interactive Restructuring
E. Gallopoulos, B. Compiler
Marsolf, D. Padua
1449 - Jose Moreira, Dale - The Performance Impact of Granularity
Schouten, Constantine Control and Functional Parallelism
Polychronopoulos
1450 - Lawrence Rauchwerger - Run-Time Parallelization: A Framework for
Parallel Computation (Ph.D. thesis)
1451 - Kyle Gallivan, Bret - The Generation of Optimized Codes Using
Marsolf, Aart Bik, Nonzero Structure Analysis
Harry Wijshoff
1453 - Luiz DeRose, Kyle - An Environment for Interactive Development
Gallivan, E. Gallopoulos, of Software Using MATLAB (conference slides)
Bret Marsolf, David
Padua
1454 - David Williamson - Evaluation of Architectural Tradeoffs for
Multiple Context Processors (M.S. thesis)
1455 - Ramesh Yarlagadda - A Study of Scheduling Techniques for
Instruction Level Parallelism Processors
(M.S. thesis)
1456 - Bill Pottenger and - Targeting a Shared Address Space Version
Rudolf Eigenmann of the Seismic Benchmark Seis 1.1
1458 - Sharad Mehrotra and - Quantifying the Performance Potential of a
Luddy Harrison Data Prefetch Mechanism for Pointer-
Intensive and Numeric Programs
1459 - Lynn Choi and Pen-Chung - Compiler and Hardware Support for Cache
Yew Coherence in Large-Scale Multiprocessors:
Design Considerations and Performance
Evaluation
1460 - Masayuki Kuba, Marie- - Practical Parallelization of Molecular
Christine Brunet, and Dynamics on Shared and Distributed Memory
Constantine D. Machines
Polychronopoulos
1462 - Luiz DeRose and David - A MATLAB to Fortran 90 Translator and its
Padua Effectiveness
1464 - D. Koufaty, X. Chen, - Data Forwarding in Scalable Shared-Memory
D. Poulsen, J. Torrellas Multiprocessors
1465 - J. Torrellas - Scalable Shared-Memory Architectures
1466 - Z. Zhang, J. Torrellas - Speeding Up Irregular Applications in Shared-
Memory Multiprocessors: Memory Binding and
Group Prefetching
1467 - C. Xia, J. Torrellas - Improving the Data Cache Performance of
Multiprocessor Operating Systems
1468 - A. Raynaud, Z. Zhang, - Distance-Adaptive Update Protocols for
J. Torrellas Scalable Shared-Memory Multiprocessors
1469 - R. Daigle, C. Xia, - Low Perturbation Address Trace Collection
J. Torrellas for Operating System, Multiprogrammed, and
Parallel Workloads in Multiprocessors
1470 - V. Simoncini and E. - Convergence Properties of Block GMRES and
Gallopoulos Matrix Polynomials
1471 - Luiz DeRose and David - Accelerating MATLAB Programs with FALCON
Padua
1473 - W. Blume, R. Doallo, R. - Advanced Program Restructuring for High-
Eigenmann, J. Grout, J. Performance Computers with Polaris
Hoeflinger, T. Lawrence,
J. Lee, D. Padua, Y.
Paek, B. Pottenger, L.
Rauchwerger, P. Tu
1476 - Bret Marsolf - Investigation of the Page Fault Performance
of Cedar
1477 - Lynn Choi and Andrew - Integrating Networks with Memory Hierarchies
Chien in a Multicomputer Node Architecture
1478 - Lynn Choi and Andrew - The Design and Performance Evaluation of
Chien DI-multicomputer
1479 - Chun Xia - Exploiting Multiprocessor Memory Hierarchies
for Operating Systems (Ph.D. thesis)
1480 - Lynn Choi and Pen-Chung - Eliminating Stale Data References through
Yew Array Data-Flow Analysis
1481 - Lynn Choi and Pen-Chung - Interprocedural Array Data-Flow Analysis for
Yew Cache Coherence
1482 - Lynn Choi and Pen-Chung - Compiler and Hardware Support for Cache
Yew Coherence in Large-Scale Multiprocessors:
Design Considerations and Performance Study
1483 - Lynn Choi and Pen-Chung - Program Analysis for Cache Coherence Beyond
Yew Procedural Boundaries
1484 - Lynn Choi - Hardware and Compiler Support for Cache
Coherence in Large-Scale Multiprocessors
(Ph.D. thesis)
1488 - Sharad Mehrotra - Data Prefetch Mechanisms for Accelerating
Symbolic and Numeric Computation
(Ph.D. thesis)