Dezun Dong
2020 – today
- 2025
- [j44]Ke Wu, Dezun Dong, Weixia Xu:
A lightweight RDMA connection protocol based on post-hoc confirmation. J. Parallel Distributed Comput. 195: 104991 (2025)
- 2024
- [j43]Shaocong Wang, Xiaoyun Zhang, Changhong Wang, Ke Wu, Cunlu Li, Dezun Dong:
DRLAR: A deep reinforcement learning-based adaptive routing framework for network-on-chips. Comput. Networks 246: 110419 (2024)
- [j42]Xiaoyun Zhang, Dezun Dong, Cunlu Li, Shaocong Wang, Liquan Xiao:
A survey of machine learning for Network-on-Chips. J. Parallel Distributed Comput. 186: 104778 (2024)
- [j41]Ke Wu, Dezun Dong, Weixia Xu:
COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign. ACM Trans. Archit. Code Optim. 21(3): 49:1-49:26 (2024)
- [j40]Weiling Yang, Jianbin Fang, Dezun Dong, Xing Su, Zheng Wang:
Optimizing Full-Spectrum Matrix Multiplications on ARMv8 Multi-Core CPUs. IEEE Trans. Parallel Distributed Syst. 35(3): 439-454 (2024)
- [j39]Fan Yuan, Xiaojian Yang, Shengguo Li, Dezun Dong, Chun Huang, Zheng Wang:
Optimizing Multi-Grid Preconditioned Conjugate Gradient Method on Multi-Cores. IEEE Trans. Parallel Distributed Syst. 35(5): 768-779 (2024)
- [c106]Shuiyi He, Zicong Wang, Xuan Tang, Qiyao Sun, Dezun Dong:
Chimera: Leveraging Hybrid Offsets for Efficient Data Prefetching. PACT 2024: 144-155
- [c105]Xinfeng Deng, Li Zhou, Dezun Dong, Jibo Wei:
Enhancing Multi-Agent Communication Collaboration through GPT-Based Semantic Information Extraction and Prediction. ACM TUR-C 2024
- [c104]Hongze Zhou, Dinghuang Hu, Zejia Zhou, Guoyuan Yuan, Dezun Dong:
DDT: Dynamical Selective Dropping Threshold for Reactive Congestion Control. ACM TUR-C 2024
- [c103]Zhe Bai, Enda Yu, Dezun Dong, Pingjing Lu:
Enhancing Gradient Compression for Distributed Deep Learning. APNet 2024: 171-172
- [c102]Zhu Yuan, Guoyuan Yuan, Dezun Dong:
ACU: Aggregator-based Congestion control and link Utilization optimization strategy for multi-tenant in-network aggregation. APNet 2024: 194-195
- [c101]Yuan Lu, Dinghuang Hu, Guannan Zhang, Jie Shen, Dezun Dong:
Power of Insensitivity: Fixing Threshold Truncation of Switch Buffer Management Policies. CCGrid 2024: 640-641
- [c100]Deshun Bi, Shengguo Li, Dezun Dong, Peng Zhang, Jianbin Fang:
Optimizing SpMV on Heterogeneous Multi-Core DSPs through Improved Locality and Vectorization. ICPP 2024: 1145-1155
- [c99]Xiao Fu, Weiling Yang, Dezun Dong, Xing Su:
Optimizing Attention by Exploiting Data Reuse on ARM Multi-core CPUs. ICS 2024: 137-149
- [c98]Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Ge Li, Zhi Jin, Xiaoguang Mao, Xiangke Liao:
Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning. ICSE 2024: 39:1-39:13
- [c97]Kainan Yu, Xinxin Qi, Peng Zhang, Jianbin Fang, Dezun Dong, Ruibo Wang, Tao Tang, Chun Huang, Yonggang Che, Zheng Wang:
Optimizing General Matrix Multiplications on Modern Multi-core DSPs. IPDPS 2024: 964-975
- [c96]Dinghuang Hu, Dezun Dong:
Understanding Different Transport Coexistence in Datacenter Networks. IPDPS (Workshops) 2024: 1212-1213
- [c95]Xinbiao Gan, Guang Wu, Shenghao Qiu, Feng Xiong, Jiaqi Si, Jianbin Fang, Dezun Dong, Chunye Gong, Tiejun Li, Zheng Wang:
GraphCube: Interconnection Hierarchy-aware Graph Processing. PPoPP 2024: 160-174
- [c94]Xiaojian Yang, Shengguo Li, Fan Yuan, Dezun Dong:
DBSR: An Efficient Storage Format for Vectorizing Sparse Triangular Solvers on Structured Grids. SC 2024: 59
- [c93]Guangnan Feng, Jiabin Xie, Dezun Dong, Yutong Lu:
UNR: Unified Notifiable RMA Library for HPC. SC 2024: 105
- [i8]Enda Yu, Dezun Dong, Xiangke Liao:
Full-Stack Allreduce on Multi-Rail Networks. CoRR abs/2405.17870 (2024)
- [i7]Guangnan Feng, Jiabin Xie, Dezun Dong, Yutong Lu:
UNR: Unified Notifiable RMA Library for HPC. CoRR abs/2408.07428 (2024)
- [i6]Yanjing Wang, Lizhou Wu, Wentao Hong, Yang Ou, Zicong Wang, Sunfeng Gao, Jie Zhang, Sheng Ma, Dezun Dong, Xingyun Qi, Mingche Lai, Nong Xiao:
A Comprehensive Simulation Framework for CXL Disaggregated Memory. CoRR abs/2411.02282 (2024)
- 2023
- [j38]Yuan Lu, Guoyuan Yuan, Yang Bai, Dezun Dong, Renjie Zhou:
EagerCC: An ultra-low latency congestion control mechanism in datacenter networks. Comput. Networks 236: 110009 (2023)
- [j37]Aoxiang Feng, Dezun Dong, Fei Lei, Junchao Ma, Enda Yu, Ruiqi Wang:
In-network aggregation for data center networks: A survey. Comput. Commun. 198: 63-76 (2023)
- [j36]Wenxiang Yang, Xiangke Liao, Dezun Dong, Jie Yu:
Exploring job running path to predict runtime on multiple production supercomputers. J. Parallel Distributed Comput. 175: 109-120 (2023)
- [j35]Yemao Xu, Dezun Dong, Dongsheng Wang, Shi Xu, Enda Yu, Weixia Xu, Xiangke Liao:
SSD-SGD: Communication Sparsification for Distributed Deep Learning Training. ACM Trans. Archit. Code Optim. 20(1): 7:1-7:25 (2023)
- [j34]Enda Yu, Dezun Dong, Xiangke Liao:
Communication Optimization Algorithms for Distributed Deep Learning Systems: A Survey. IEEE Trans. Parallel Distributed Syst. 34(12): 3294-3308 (2023)
- [c92]Binyan Lan, Fei Lei, Ke Wu, Dezun Dong:
DFR: Dynamic-thresold Fault-tolerant Routing for Fat Tree. APNet 2023: 180-181
- [c91]Hongbing Tan, Jing Zhang, Libo Huang, Xiaowei He, Dezun Dong, Yongwen Wang, Liquan Xiao:
A Multi-level Parallel Integer/Floating-Point Arithmetic Architecture for Deep Learning Instructions. Euro-Par 2023: 260-274
- [c90]Xiaoyun Zhang, Yaohua Wang, Dezun Dong, Cunlu Li, Shaocong Wang, Liquan Xiao:
DeTAR: A Decision Tree-Based Adaptive Routing in Networks-on-Chip. Euro-Par 2023: 352-366
- [c89]Deshun Bi, Shengguo Li, Yichen Zhang, Xiaojian Yang, Dezun Dong:
Efficiently Running SpMV on Multi-core DSPs for Banded Matrix. ICA3PP (5) 2023: 201-220
- [c88]Binyan Lan, Fei Lei, Dezun Dong, Ke Wu, Xiaoyun Zhang:
DFAR: Dynamic-threshold Fault-tolerant Adaptive Routing for Fat Tree Networks. ICPADS 2023: 721-728
- [c87]Xiao Fu, Xing Su, Dezun Dong, Weiling Yang:
Characterize and Optimize Dense Linear Solver on Multi-core CPUs. ICPADS 2023: 1833-1842
- [c86]Deshun Bi, Xiaowen Tian, Shengguo Li, Dezun Dong:
Efficiently Running SpMV on Multi-Core DSPs for Block Sparse Matrix. ICPADS 2023: 1912-1919
- [c85]Guannan Zhang, Dinghuang Hu, Dezun Dong:
Rately: Accurate Data Center CC based on One-Way Delay. ICPADS 2023: 2759-2760
- [c84]Xiaojian Yang, Shengguo Li, Fan Yuan, Dezun Dong, Chun Huang, Zheng Wang:
Optimizing Multi-grid Computation and Parallelization on Multi-cores. ICS 2023: 227-239
- [c83]Ruiqi Wang, Dezun Dong, Fei Lei, Junchao Ma, Ke Wu, Kai Lu:
Roar: A Router Microarchitecture for In-network Allreduce. ICS 2023: 423-436
- [c82]Guangnan Feng, Dezun Dong, Shizhen Zhao, Yutong Lu:
GRAP: Group-level Resource Allocation Policy for Reconfigurable Dragonfly Network in HPC. ICS 2023: 437-449
- [c81]Yichen Zhang, Shengguo Li, Fan Yuan, Dezun Dong, Xiaojian Yang, Tiejun Li, Zheng Wang:
Memory-aware Optimization for Sequences of Sparse Matrix-Vector Multiplications. IPDPS 2023: 379-389
- [c80]Shaocong Wang, Xiaoyun Zhang, Dezun Dong, Cunlu Li, Zicong Wang, Zongmao Zhang:
LARE: A Linear Approximate Reinforcement Learning Based Adaptive Routing for Network-on-Chips. ISCAS 2023: 1-5
- [c79]Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Shaomeng Cao, Kechi Zhang, Zhi Jin:
Interpretation-based Code Summarization. ICPC 2023: 113-124
- [c78]Mingyang Geng, Dezun Dong, Pingjing Lu:
Hierarchical Semantic Graph Construction and Pooling Approach for Cross-language Code Retrieval. QRS Companion 2023: 393-402
- [c77]Mingyang Geng, Dezun Dong, Pingjing Lu:
Input Transformation for Pre-Trained-Model-Based Cross-Language Code Search. QRS Companion 2023: 403-412
- [c76]Pengyu Wang, Weiling Yang, Jianbin Fang, Dezun Dong, Chun Huang, Peng Zhang, Tao Tang, Zheng Wang:
Optimizing Direct Convolutions on ARM Multi-Cores. SC 2023: 70:1-70:13
- [i5]Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Ge Li, Zhi Jin, Xiaoguang Mao, Xiangke Liao:
An Empirical Study on Using Large Language Models for Multi-Intent Comment Generation. CoRR abs/2304.11384 (2023)
- 2022
- [j33]Ke Wu, Dezun Dong, Cunlu Li, Weixia Xu:
Revisiting network congestion avoidance through adaptive packet-chaining reservation. Comput. Networks 212: 109008 (2022)
- [j32]Shan Huang, Dezun Dong, Zejia Zhou, Hanyi Shi, Wenxiang Yang, Xiangke Liao:
FastCredit: Expediting credit-based congestion control in datacenters. Comput. Networks 214: 109126 (2022)
- [j31]Yuyang Wang, Dezun Dong, Fei Lei:
Understanding node connection modes in Multi-Rail Fat-tree. J. Parallel Distributed Comput. 167: 199-210 (2022)
- [j30]Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao:
CP-SGD: Distributed stochastic gradient descent with compression and periodic compensation. J. Parallel Distributed Comput. 169: 42-57 (2022)
- [j29]Cunlu Li, Dezun Dong, Xiangke Liao:
MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers. ACM Trans. Archit. Code Optim. 19(3): 33:1-33:23 (2022)
- [j28]Cunlu Li, Dezun Dong, Xiangke Liao, John Kim:
Hybrid Memory Buffer Microarchitecture for High-Radix Routers. IEEE Trans. Computers 71(11): 2888-2902 (2022)
- [j27]Fei Lei, Dezun Dong, Xiangke Liao:
Exploring the Galaxyfly Family to Build Flexible-Scale Interconnection Networks. IEEE Trans. Parallel Distributed Syst. 33(5): 1054-1068 (2022)
- [j26]Shengguo Li, Hao Jiang, Dezun Dong, Chun Huang, Jie Liu, Xia Liao, Xuguang Chen:
Efficient Data Redistribution Algorithms From Irregular to Block Cyclic Data Distribution. IEEE Trans. Parallel Distributed Syst. 33(12): 3667-3677 (2022)
- [c75]Jiaren Yu, Shan Huang, Guoyuan Yuan, Dezun Dong:
Reservoir: Enhance the Burst-flow Tolerance in Datacenter Networks. CBD 2022: 24-29
- [c74]Yukun Zhou, Dezun Dong, Zhengbin Pang, Junhong Ye, Feng Jin:
ERA: ECN-Ratio-Based Congestion Control in Datacenter Networks. CCGRID 2022: 771-774
- [c73]Yunyang Xu, Ke Wu, Dezun Dong, Cunlu Li, Liquan Xiao:
THperf: Enabling Accurate Network Latency Measurement for Tianhe-2 System. HPCC/DSS/SmartCity/DependSys 2022: 1554-1561
- [c72]Wenhao Gu, Xuchao Xie, Dezun Dong:
LTNoT: Realizing the Trade-Offs Between Latency and Throughput in NVMe over TCP. ICA3PP 2022: 412-432
- [c71]Jian Wang, Enda Yu, Dezun Dong, Zhengbin Pang:
DNNEmu: A Lightweight Performance Emulator for Distributed DNN Training. ICA3PP 2022: 722-736
- [c70]Wenhao Gu, Xuchao Xie, Wei Zhang, Dezun Dong:
A Transformable NVMeoF Queue Design for Better Differentiating Read and Write Request Processing. ICPADS 2022: 546-553
- [c69]Jiaqi Si, Xinbiao Gan, Tiaojie Xiao, Bo Yang, Dezun Dong, Zhengbin Pang:
STEGNN: Spatial-Temporal Embedding Graph Neural Networks for Road Network Forecasting. ICPADS 2022: 826-834
- [c68]Shan Huang, Dezun Dong, Lingbin Zeng, Zejia Zhou, Yukun Zhou, Xiangke Liao:
DC4: Reconstructing Data-Credit-Coupled Congestion Control for Data Centers. ICPP 2022: 61:1-61:11
- [c67]Guangnan Feng, Dezun Dong, Yutong Lu:
Optimized MPI collective algorithms for dragonfly topology. ICS 2022: 14:1-14:11
- [c66]Wenxiang Yang, Xiangke Liao, Dezun Dong, Jie Yu:
A Quantitative Study of the Spatiotemporal I/O Burstiness of HPC Application. IPDPS 2022: 1349-1359
- [c65]Yukun Zhou, Dezun Dong, Zhengbin Pang, Junhong Ye, Feng Jin:
Fast-Converging Congestion Control in Datacenter Networks. ISCC 2022: 1-7
- [c64]Mingyang Geng, Shangwen Wang, Dezun Dong, Shanzhi Gu, Fang Peng, Weijian Ruan, Xiangke Liao:
Fine-grained code-comment semantic interaction analysis. ICPC 2022: 585-596
- [c63]Wenhao Gu, Xuchao Xie, Dezun Dong:
Alleviating Performance Interference Through Intra-Queue I/O Isolation for NVMe-over-Fabrics. NPC 2022: 277-289
- 2021
- [j25]Shan Huang, Dezun Dong, Zejia Zhou, Xiangke Liao:
MP-CREDIT: Multi-path credit for high-speed data center transports. Comput. Networks 193: 108061 (2021)
- [j24]Yang Bai, Dinghuang Hu, Dezun Dong, Shan Huang, Xiangke Liao:
CCRP: Converging Credit-Based and Reactive Protocols in Datacenters. Int. J. Parallel Program. 49(5): 685-699 (2021)
- [j23]Jianbin Fang, Xiangke Liao, Chun Huang, Dezun Dong:
Performance Evaluation of Memory-Centric ARMv8 Many-Core Architectures: A Case Study with Phytium 2000+. J. Comput. Sci. Technol. 36(1): 33-43 (2021)
- [j22]Dinghuang Hu, Dezun Dong, Yang Bai, Shan Huang, Zejia Zhou, Zihao Wei, Xiangke Liao:
Harmonia: Explicit Congestion Notification and Credit-Reservation Transport Converged Congestion Control in Datacenters. J. Comput. Sci. Technol. 36(5): 1071-1086 (2021)
- [j21]Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao:
Communication optimization strategies for distributed deep neural network training: A survey. J. Parallel Distributed Comput. 149: 52-65 (2021)
- [j20]Cunlu Li, Dezun Dong, Shazhou Yang, Xiangke Liao, Guangyu Sun, Yongheng Liu:
CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers. ACM Trans. Archit. Code Optim. 18(4): 50:1-50:21 (2021)
- [c62]Yuyang Wang, Fei Lei, Dezun Dong:
Exploring Node Connection Modes in Multi-Rail Fat-tree. CLUSTER 2021: 811-812
- [c61]Changhong Wang, Dezun Dong, Zicong Wang, Xiaoyun Zhang, Zhenyu Zhao:
RELAR: A Reinforcement Learning Framework for Adaptive Routing in Network-on-Chips. CLUSTER 2021: 813-814
- [c60]Changhong Wang, Zicong Wang, Dezun Dong, Xiaoyun Zhang, Zhenyu Zhao:
A Novel Reinforcement Learning Framework for Adaptive Routing in Network-on-Chips. HPCC/DSS/SmartCity/DependSys 2021: 336-344
- [c59]Jiaqi Si, Xinbiao Gan, Hao Bai, Dezun Dong, Zhengbin Pang:
NEPG: Partitioning Large-Scale Power-Law Graphs. ICA3PP (3) 2021: 668-690
- [c58]Guoyuan Yuan, Renjie Zhou, Dezun Dong, Shan Huang:
Breaking One-RTT Barrier: Ultra-Precise and Efficient Congestion Control in Datacenter Networks. ICCCN 2021: 1-9
- [c57]Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao:
CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation. ICPP 2021: 79:1-79:10
- [c56]Xingyun Qi, Mingche Lai, Dezun Dong, Yi Dai, Junsheng Chang, Jijun Cao:
PFT: A Congestion Avoidance Method based on Proactive Flow Throttling at Endpoints. IM 2021: 572-578
- [c55]Yuyang Wang, Dezun Dong, Fei Lei:
MR-tree: A Parametric Family of Multi-Rail Fat-tree. IPCCC 2021: 1-9
- [c54]Weiling Yang, Jianbin Fang, Dezun Dong:
Characterizing Small-Scale Matrix Multiplications on ARMv8-based Many-Core Architectures. IPDPS 2021: 101-110
- [c53]Yanghai Wang, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao:
FastHorovod: Expediting Parallel Message-Passing Schedule for Distributed DNN Training. ISCC 2021: 1-7
- [c52]Renjie Zhou, Dezun Dong, Shan Huang, Yang Bai:
FastTune: Timely and Precise Congestion Control in Data Center Network. ISPA/BDCloud/SocialCom/SustainCom 2021: 238-245
- [c51]Junchao Ma, Dezun Dong, Cunlu Li, Ke Wu, Liquan Xiao:
PAARD: Proximity-Aware All-Reduce Communication for Dragonfly Networks. ISPA/BDCloud/SocialCom/SustainCom 2021: 255-262
- [c50]Yanghai Wang, Shuo Ouyang, Dezun Dong, Enda Yu, Xiangke Liao:
vSketchDLC: A Sketch on Distributed Deep Learning Communication via Fine-grained Tracing Visualization. NPC 2021: 28-39
- [c49]Renjie Zhou, Dezun Dong, Shan Huang, Zejia Zhou, Yang Bai:
Taming Congestion and Latency in Low-Diameter High-Performance Datacenters. NPC 2021: 229-242
- [c48]Junchao Ma, Dezun Dong, Cunlu Li, Ke Wu, Liquan Xiao:
Evaluation of Topology-Aware All-Reduce Algorithm for Dragonfly Networks. NPC 2021: 243-255
- [c47]Guoyuan Yuan, Dezun Dong, Xingyun Qi, Baokang Zhao:
MPICC: Multi-Path INT-Based Congestion Control in Datacenter Networks. NPC 2021: 256-268
- [c46]Weiling Yang, Jianbin Fang, Dezun Dong, Xing Su, Zheng Wang:
LIBSHALOM: optimizing small and irregular-shaped matrix multiplications on ARMv8 multi-cores. SC 2021: 72
- [i4]Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao:
CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation. CoRR abs/2106.10796 (2021)
- 2020
- [j19]Kang Jin, Dezun Dong, Cunlu Li, Libo Huang, Sheng Ma, Binzhang Fu:
DancerFly: An Order-Aware Network-on-Chip Router On-the-Fly Mitigating Multi-path Packet Reordering. Int. J. Parallel Program. 48(4): 730-749 (2020)
- [j18]Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao:
OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training. ACM Trans. Archit. Code Optim. 17(4): 30:1-30:26 (2020)
- [j17]Jie Yu, Wenxiang Yang, Fang Wang, Dezun Dong, Jinghua Feng, Yuqi Li:
Spatially Bursty I/O on Supercomputers: Causes, Impacts and Solutions. IEEE Trans. Parallel Distributed Syst. 31(12): 2908-2922 (2020)
- [c45]Yang Bai, Dezun Dong, Shan Huang, Zejia Zhou, Xiangke Liao:
SSP: Speeding up Small Flows for Proactive Transport in Datacenters. CLUSTER 2020: 153-161
- [c44]Dezun Dong, Ke Wu:
Reducing Tail Latency in Proactive Congestion Control via Moderate Speculation. HPCC/DSS/SmartCity 2020: 417-424
- [c43]Dinghuang Hu, Yang Bai, Dezun Dong, Shan Huang, Xiangke Liao:
Converging Credit-based and Reactive Datacenter Transport using ECN and RTT. HPCC/DSS/SmartCity 2020: 433-440
- [c42]Dezun Dong, Shan Huang, Zejia Zhou, Wenxiang Yang, Hanyi Shi:
FastCredit: Expediting Credit-based Proactive Transports in Datacenters. ICPADS 2020: 528-535
- [c41]Fei Lei, Dezun Dong, Xiangke Liao, José Duato:
Bundlefly: a low-diameter topology for multicore fiber. ICS 2020: 20:1-20:11
- [c40]Renjie Zhou, Guoyuan Yuan, Dezun Dong, Shan Huang:
APCC: Agile and Precise Congestion Control in Datacenters. ISPA/BDCloud/SocialCom/SustainCom 2020: 649-656
- [c39]Yang Bai, Dinghuang Hu, Dezun Dong, Shan Huang, Xiangke Liao:
CCRP: Converging Credit-Based and Reactive Protocols in Datacenters. NPC 2020: 420-434
- [e1]Dezun Dong, Xiaoli Gong, Cunlu Li, Dongsheng Li, Junjie Wu:
Advanced Computer Architecture - 13th Conference, ACA 2020, Kunming, China, August 13-15, 2020, Proceedings. Communications in Computer and Information Science 1256, Springer 2020, ISBN 978-981-15-8134-2
- [i3]Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao:
Communication Optimization Strategies for Distributed Deep Learning: A Survey. CoRR abs/2003.03009 (2020)
- [i2]Yemao Xu, Dezun Dong, Weixia Xu, Xiangke Liao:
OD-SGD: One-step Delay Stochastic Gradient Descent for Distributed Training. CoRR abs/2005.06728 (2020)
- [i1]Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao:
ssd-sgd: communication sparsification for distributed deep learning training. CoRR abs/2012.05396 (2020)
2010 – 2019
- 2019
- [j16]Kang Jin, Cunlu Li, Dezun Dong, Binzhang Fu:
HARE: History-Aware Adaptive Routing Algorithm for Endpoint Congestion in Networks-on-Chip. Int. J. Parallel Program. 47(3): 433-450 (2019)
- [j15]Yemao Xu, Dezun Dong, Weixia Xu, Xiangke Liao:
SketchDLC: A Sketch on Distributed Deep Learning Communication via Trace Capturing. ACM Trans. Archit. Code Optim. 16(2): 7:1-7:26 (2019)
- [c38]Yi Dai, Ke Wu, Mingche Lai, Qiong Li, Dezun Dong:
PPS: A Low-Latency and Low-Complexity Switching Architecture Based on Packet Prefetch and Arbitration Prediction. ICA3PP (1) 2019: 3-16
- [c37]Zihao Wei, Dezun Dong, Shan Huang, Liquan Xiao:
EC4: ECN and Credit-Reservation Converged Congestion Control. ICPADS 2019: 209-216
- [c36]Ke Wu, Dezun Dong, Cunlu Li, Shan Huang, Yi Dai:
Network Congestion Avoidance through Packet-chaining Reservation. ICPP 2019: 58:1-58:10
- [c35]