


CCF Transactions on High Performance Computing, Volume 7
Volume 7, Number 1, February 2025
- Hanzheng Liang, Chencheng Deng, Peng Zhang, Jianbin Fang, Tao Tang, Chun Huang:
An empirical performance evaluation of SYCL on ARM multi-core processors. 1-16
- Youxuan Xu, Tong Wu, Shigang Li, Xueying Wang, Jingjing Wang:
SparkAttention: high-performance multi-head attention for large models on Volta GPU architecture. 17-28
- Tao Huang, Yonggui Liang, Shubao Yu, Kexin Chen:
TxCocket: an innovative solution for efficient cross-node data transmission enabled by CXL-based shared memory. 29-42
- Wenhao Dai, Ziyi Jia, Yuesi Bai, Qingxiao Sun:
Convergence-aware operator-wise mixed-precision training. 43-57
- Jin Zhang, Jincheng Zhou, Xiang Zhang, Di Ma, Chunye Gong:
Fine-grained vectorized merge sorting on RISC-V: from register to cache. 58-71
- Muchun Peng, Qinglin Wang, Yuechao Liang, Weihao Guo, Shun Yang, Yaling Liang, Yongzhen Shi, Ligang Cao, Jie Liu:
GreenB+Tree: an energy-efficient B+tree for MIMD architectures. 72-84
Volume 7, Number 2, April 2025
- Pin Chen, Qing Mo, Zexin Xu, Xianwei Zhang, Yutong Lu:
Star-gen: an HPC-AI framework for constructing large-scale computational materials database. 85-99
- Wentao Feng, Shizhe Shang, Pengfei Li, Hailong Yang, Zhongzhi Luan, Depei Qian:
SyncNOVA: an end-to-end fine-grained profiling tool oN lOck behaVior detection and critical section diAgnosis. 100-113
- Ningxi Tian, Silu Huang, Xiaowen Xu:
Mixed precision block-Jacobi preconditioner: algorithms, performance evaluation and feature analysis. 114-128
- Jianfei Xu, Lianhua He, Zhong Jin:
Mixed precision SpMV on GPUs for irregular data with hierarchical precision selection. 129-141
- Wenlong Fan, Haobo Hua, Jiandong Shang, Zhuxin Wen, Hengliang Guo, Litao Zhang:
Optimizing 2D convolution for DCUs. 142-154
- Xiangyu Meng, Xun Wang, Mingzhen Li, Guangming Tan, Weile Jia:
An interpretable DeePMD-kit performance model for emerging supercomputers. 155-168
- Heming Zhong, Xiaojian Pan, Zengquang He, Haoling Wang, Dan Huang, Zhiguang Chen:
GPU acceleration for DNA sequence alignment algorithm and its application. 169-177
