Publications
“On the Existence Problem of Incomplete Factorisation Methods,”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-435, December 1999.
“Automatic Determination of Matrix-Blocks,”
LAPACK Working Note 151, University of Tennessee Computer Science Technical Report, no. UT-CS-01-458, January 2001.
“A Proposed Standard for Matrix Metadata,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-03-02, Submitted to ACM TOMS, November 2003.
“Numerical Metadata API Reference,”
Innovative Computing Laboratory Technical Report, February 2007.
“Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,”
ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019.
“Applying Aspect-Oriented Programming Concepts to a Component-based Programming Model,”
IPDPS 2003, Workshop on NSF-Next Generation Software, Nice, France, March 2003.
“Improvements in the Efficient Composition of Applications,”
IPDPS 2004, NGS Workshop (to appear), Santa Fe, 2004.
“Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information,”
EuroMPI, Chicago, IL, ACM, September 2017.
“From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming,”
Parallel Computing, vol. 38, no. 8, pp. 391-407, August 2012.
“Soft Error Resilient QR Factorization for Hybrid System with GPGPU,”
Journal of Computational Science, vol. 4, issue 6, pp. 457–464, November 2013.
“High Performance Dense Linear System Solver with Soft Error Resilience,”
IEEE Cluster 2011, Austin, TX, September 2011.
“Robustness of the Young/Daly Formula for Stochastic Iterative Applications,”
49th International Conference on Parallel Processing (ICPP 2020), Edmonton, AB, Canada, ACM Press, August 2020.
“Algorithm-Based Fault Tolerance for Dense Matrix Factorization,”
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2012, New Orleans, LA, USA, ACM, pp. 225-234, February 2012.
“Algorithm-based Fault Tolerance for Dense Matrix Factorizations,”
University of Tennessee Computer Science Technical Report, no. UT-CS-11-676, Knoxville, TN, August 2011.
“Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,”
First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.
“High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,”
ICCS 2012, Omaha, NE, June 2012.
“Soft Error Resilient QR Factorization for Hybrid System with GPGPU,”
Journal of Computational Science, Seattle, WA, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems at SC11, November 2011.
“Soft Error Resilient QR Factorization for Hybrid System,”
University of Tennessee Computer Science Technical Report, no. UT-CS-11-675, Knoxville, TN, July 2011.
“Soft Error Resilient QR Factorization for Hybrid System,”
UT-CS-11-675 (also LAPACK Working Note #252), no. ICL-CS-11-675, July 2011.
“Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,”
FOSS4G 2010, Barcelona, Spain, September 2010.
“OpenCL Evaluation for Numerical Linear Algebra Library Development,”
Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.
“Optimal Checkpointing Strategies for Iterative Applications,”
IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 3, pp. 507-522, March 2022.
“Providing GPU Capability to LU and QR within the ScaLAPACK Framework,”
University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
“Preconditioners for Batched Iterative Linear Solvers on GPUs,”
Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023.
“Task Based Cholesky Decomposition on Xeon Phi Architectures using OpenMP,”
International Journal of Computational Science and Engineering (IJCSE), vol. 17, no. 3, October 2018.
“JLAPACK - Compiling LAPACK Fortran to Java,”
Scientific Programming, vol. 7, no. 2, pp. 111-138, October 2002.
“Top500 Supercomputer Sites (13th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-425, June 1999.
“Disaster Survival Guide in Petascale Computing: An Algorithmic Approach,”
in Petascale Computing: Algorithms and Applications (to appear): Chapman & Hall - CRC Press, 2007.
“The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance Community,”
International Journal of High Performance Computing Applications (to appear), July 2009.
“High Performance Development for High End Computing with Python Language Wrapper (PLW),”
International Journal of High Performance Computing Applications, vol. 21, no. 3, pp. 360-369, 2007.
“Report on the TianHe-2A System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.
“High Performance Computing Trends, Supercomputers, Clusters, and Grids,”
Information Processing Society of Japan Symposium Series, vol. 2003, no. 14, pp. 55-58, January 2003.
“A Tribute to Gene Golub,”
Computing in Science and Engineering: IEEE, p. 5, January 2008.
“The Quest for Petascale Computing,”
Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.
“Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
“Self-Adapting Numerical Software and Automatic Tuning of Heuristics,”
Lecture Notes in Computer Science, vol. 2660, Melbourne, Australia, Springer Verlag, pp. 759-770, June 2003.
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, UT-CS-89-85, 2010.
“The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,”
International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017.
“Empirical Performance Tuning of Dense Linear Algebra Software,”
in Performance Tuning of Scientific Applications (to appear), 2010.
“Finite-choice Algorithm Optimization in Conjugate Gradients (LAPACK Working Note 159),”
University of Tennessee Computer Science Technical Report, UT-CS-03-502, January 2003.
“Performance of Various Computers Using Standard Linear Equations Software,”
University of Tennessee Computer Science Technical Report, no. CS-89-85, February 2013.
“HPCS Library Study Effort,”
University of Tennessee Computer Science Technical Report, UT-CS-08-617, January 2008.
“Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
IPDPS 2012, the 26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, IEEE Computer Society Press, May 2012.
“Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster),”
Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 2020.
“Report on the Oak Ridge National Laboratory's Frontier System,”
ICL Technical Report, no. ICL-UT-22-05, May 2022.
“Top500 Supercomputer Sites (15th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-442, June 2000.
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, 2011.
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
“Accelerating Numerical Dense Linear Algebra Calculations with GPUs,”
Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, CS-89-85, January 2004.