37th ICDE 2021: Online Event [Chania, Greece]
- 37th IEEE International Conference on Data Engineering, ICDE 2021, Chania, Greece, April 19-22, 2021. IEEE 2021, ISBN 978-1-7281-9184-3
Research Papers
Data Integration and Cleaning 1
- Panos Vassiliadis: Profiles of Schema Evolution in Free Open Source Software Projects. 1-12
- Peng Li, Xi Rao, Jennifer Blase, Yue Zhang, Xu Chu, Ce Zhang: CleanML: A Study for Evaluating the Impact of Data Cleaning on ML Classification Tasks. 13-24
- Yifeng Jin, Zijing Tan, Weijun Zeng, Shuai Ma: Approximate Order Dependency Discovery. 25-36
- Matteo Corain, Paolo Garza, Abolfazl Asudeh: DBSCOUT: A Density-based Method for Scalable Outlier Detection in Very Large Datasets. 37-48
- Jiaqing Liang, Suo Feng, Chenhao Xie, Yanghua Xiao, Jindong Chen, Seung-won Hwang: Bootstrapping Information Extraction via Conceptualization. 49-60
- Yinan Mei, Shaoxu Song, Chenguang Fang, Haifeng Yang, Jingyun Fang, Jiang Long: Capturing Semantics for Imputation with Pre-trained Language Models. 61-72
Graph Data Management 1
- Wentao Li, Min Gao, Fan Wu, Wenge Rong, Junhao Wen, Lu Qin: Manipulating Black-Box Networks for Centrality Promotion. 73-84
- Kai Wang, Wenjie Zhang, Xuemin Lin, Ying Zhang, Lu Qin, Yuting Zhang: Efficient and Effective Community Search on Large-scale Bipartite Graphs. 85-96
- Boge Liu, Fan Zhang, Wenjie Zhang, Xuemin Lin, Ying Zhang: Efficient Community Search with Size Constraint. 97-108
- Fangda Guo, Ye Yuan, Guoren Wang, Xiangguo Zhao, Hao Sun: Multi-attributed Community Search in Road-social Networks. 109-120
- Dong Wei, Ioannis Koutis, Senjuti Basu Roy: Peer Learning Through Targeted Dynamic Groups Formation. 121-132
- Mengxuan Zhang, Lei Li, Wen Hua, Xiaofang Zhou: Efficient 2-Hop Labeling Maintenance in Dynamic Small-World Networks. 133-144
Data Privacy
- Peng Tang, Rui Chen, Sen Su, Shanqing Guo, Lei Ju, Gaoyuan Liu: Differentially Private Publication of Multi-Party Sequential Data. 145-156
- Sepanta Zeighami, Gabriel Ghinita, Cyrus Shahabi: Secure Dynamic Skyline Queries Using Result Materialization. 157-168
- Shun Takagi, Tsubasa Takahashi, Yang Cao, Masatoshi Yoshikawa: P3GM: Private High-Dimensional Data Release via Privacy Preserving Phased Generative Model. 169-180
- Xinjian Luo, Yuncheng Wu, Xiaokui Xiao, Beng Chin Ooi: Feature Inference Attack on Model Predictions in Vertical Federated Learning. 181-192
- Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song: Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence. 193-204
- Jämes Ménétrey, Marcelo Pasin, Pascal Felber, Valerio Schiavoni: Twine: An Embedded Trusted Runtime for WebAssembly. 205-216
Crowdsourcing
- Chi Harold Liu, Chengzhe Piao, Xiaoxin Ma, Ye Yuan, Jian Tang, Guoren Wang, Kin K. Leung: Modeling Citywide Crowd Flows using Attentive Convolutional LSTM. 217-228
- Fariha Tabassum Islam, Tanzima Hashem, Rifat Shahriyar: A Privacy-Enhanced and Personalized Safe Route Planner with Crowdsourced Data and Computation. 229-240
- Yan Zhao, Jiannan Guo, Xuanhao Chen, Jianye Hao, Xiaofang Zhou, Kai Zheng: Coalition-based Task Assignment in Spatial Crowdsourcing. 241-252
- Baoyi An, Mingjun Xiao, An Liu, Xike Xie, Xiaofang Zhou: Crowdsensing Data Trading based on Combinatorial Multi-Armed Bandit and Stackelberg Game. 253-264
- Yan Zhao, Kai Zheng, Jiannan Guo, Bin Yang, Torben Bach Pedersen, Christian S. Jensen: Fairness-aware Task Assignment in Spatial Crowdsourcing: Game-Theoretic Approaches. 265-276
- Jingru Yang, Xiaoman Zhao, Ju Fan, Gong Chen, Chong Peng, Sheng Yao, Xiaoyong Du: A Human-in-the-loop Approach to Social Behavioral Targeting. 277-288
- Kaiyu Li, Guoliang Li, Yong Wang, Yan Huang, Zitao Liu, Zhongqin Wu: CrowdRL: An End-to-End Reinforcement Learning Framework for Data Labelling. 289-300
Spatial and Temporal Data Management 1
- Guanjie Zheng, Chang Liu, Hua Wei, Chacha Chen, Zhenhui Li: Rebuilding City-Wide Traffic Origin Destination from Road Speed Data. 301-312
- Yishu Wang, Ye Yuan, Hao Wang, Xiangmin Zhou, Congcong Mu, Guoren Wang: Constrained Route Planning over Large Multi-Modal Time-Dependent Networks. 313-324
- Di Chen, Ye Yuan, Wenjin Du, Yurong Cheng, Guoren Wang: Online Route Planning over Time-Dependent Road Networks. 325-335
- Mengxuan Zhang, Lei Li, Wen Hua, Rui Mao, Pingfu Chao, Xiaofang Zhou: Dynamic Hub Labeling for Road Networks. 336-347
- Haitao Yuan, Guoliang Li, Zhifeng Bao, Ling Feng: An Effective Joint Prediction Model for Travel Demands and Traffic Flows. 348-359
- Shuai Huang, Yong Wang, Tianyu Zhao, Guoliang Li: A Learning-based Method for Computing Shortest Path Distances on Road Networks. 360-371
Distributed Data Management 1
- Anran Li, Lan Zhang, Junhao Wang, Juntao Tan, Feng Han, Yaxuan Qin, Nikolaos M. Freris, Xiang-Yang Li: Efficient Federated-Learning Model Debugging. 372-383
- Pan Zhou, Qian Lin, Dumitrel Loghin, Beng Chin Ooi, Yuncheng Wu, Hongfang Yu: Communication-efficient Decentralized Machine Learning over Heterogeneous Networks. 384-395
- Fei Song, Khaled Zaouk, Chenghao Lyu, Arnab Sinha, Qi Fan, Yanlei Diao, Prashant J. Shenoy: Spark-based Cloud Data Analytics using Multi-Objective Optimization. 396-407
- Faisal Nawab: WedgeChain: A Trusted Edge-Cloud Store With Asynchronous (Lazy) Trust. 408-419
- Natasha Mittal, Faisal Nawab: CooLSM: Distributed and Cooperative Indexing Across Edge and Cloud Machines. 420-431
- Pedro Pedreira, Amit Dutta, Sergey Pershin, Lin Liu, Sushant Shringarpure, Jialiang Tan, Brian Landers, Ge Gao, Karen Pieper: Interactive Analytic DBMSs: Breaching the Scalability Wall. 432-443
Data Integration and Data Science
- Hazar Harmouch, Thorsten Papenbrock, Felix Naumann: Relational Header Discovery using Similarity Search in a Table Corpus. 444-455
- Yuyang Dong, Kunihiro Takeoka, Chuan Xiao, Masafumi Oyamada: Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach. 456-467
- Christos Koutras, George Siachamis, Andra Ionescu, Kyriakos Psarakis, Jerry Brons, Marios Fragkoulis, Christoph Lofi, Angela Bonifati, Asterios Katsifodimos: Valentine: Evaluating Matching Techniques for Dataset Discovery. 468-479
- Xiangyu Zou, Cai Deng, Wen Xia, Philip Shilane, Haoliang Tan, Haijun Zhang, Xuan Wang: Odess: Speeding up Resemblance Detection for Redundancy Elimination by Fast Content-Defined Sampling. 480-491
- Guo Zhong, Chi-Man Pun: Latent Low-rank Graph Learning for Multimodal Clustering. 492-503
- Sarah Masud, Subhabrata Dutta, Sakshi Makkar, Chhavi Jain, Vikram Goyal, Amitava Das, Tanmoy Chakraborty: Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter. 504-515
Graph Data Management 2
- Xingyu Yao, Yingxia Shao, Bin Cui, Lei Chen: UniNet: Scalable Network Representation Learning with Metropolis-Hastings Sampling. 516-527
- Shixun Huang, Yuchen Li, Zhifeng Bao, Zhao Li: Towards Efficient Motif-based Graph Partitioning: An Adaptive Sampling Approach. 528-539
- Himchan Park, Min-Soo Kim: LineageBA: A Fast, Exact and Scalable Graph Generation for the Barabási-Albert Model. 540-551
- Huan Zhao, Quanming Yao, Weiwei Tu: Search to aggregate neighborhood for graph neural network. 552-563
- Chaokun Wang, Binbin Wang, Bingyang Huang, Shaoxu Song, Zai Li: FastSGG: Efficient Social Graph Generation Using a Degree Distribution Generation Model. 564-575
- Lei Yang, Lei Zou: Noah: Neural-optimized A* Search Algorithm for Graph Edit Distance Computation. 576-587
Indexing
- Yuanzhe Hao, Xiongpai Qin, Yueguo Chen, Yaru Li, Xiaoguang Sun, Yu Tao, Xiao Zhang, Xiaoyong Du: TS-Benchmark: A Benchmark for Time Series Databases. 588-599
- R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, Renata Borovica-Gajic: DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. 600-611
- Kecheng Huang, Zhiping Jia, Zhaoyan Shen, Zili Shao, Feng Chen: Less is More: De-amplifying I/Os for Key-value Stores with a Log-assisted LSM-tree. 612-623
- Matheus Agio Nerone, Pedro Holanda, Eduardo C. de Almeida, Stefan Manegold: Multidimensional Adaptive & Progressive Indexes. 624-635
- Rongbiao Xie, Meng Li, Zheyu Miao, Rong Gu, He Huang, Haipeng Dai, Guihai Chen: Hash Adaptive Bloom Filter. 636-647
- Yuxiang Zeng, Yongxin Tong, Lei Chen: HST+: An Efficient Index for Embedding Arbitrary Metric Spaces. 648-659
Spatial and Temporal Data Management 2
- Chrysanthi Kosyfaki, Nikos Mamoulis, Evaggelia Pitoura, Panayiotis Tsaparas: Flow Computation in Temporal Interaction Networks. 660-671
- Kaijie Zhu, George Fletcher, Nikolay Yakovets: Leveraging Temporal and Topological Selectivities in Temporal-clique Subgraph Query Processing. 672-683
- Zheng Wang, Cheng Long, Gao Cong: Trajectory Simplification with Reinforcement Learning. 684-695
- Ziquan Fang, Yuntao Du, Lu Chen, Yujia Hu, Yunjun Gao, Gang Chen: E2DTC: An End to End Deep Trajectory Clustering Framework via Self-Training. 696-707
- Bolong Zheng, Lianggui Weng, Xi Zhao, Kai Zeng, Xiaofang Zhou, Christian S. Jensen: REPOSE: Distributed Top-k Trajectory Similarity Search with Local Reference Point Tries. 708-719
- Junyang Gao, Stavros Sintos, Pankaj K. Agarwal, Jun Yang: Durable Top-K Instant-Stamped Temporal Records with User-Specified Scoring Functions. 720-731
Data Management on New Hardware
- Andrew Crotty, Alex Galakatos, Connor Luckett, Ugur Çetintemel: The Case for In-Memory OLAP on "Wimpy" Nodes. 732-743
- Yuchen Li, Qiwei Zhu, Zheng Lyu, Zhongdong Huang, Jianling Sun: DyCuckoo: Dynamic Hash Tables on GPUs. 744-755
- Jaeyoung Do, Chen Luo, David B. Lomet: Programming an SSD Controller to Support Batched Writes for Variable-Size Pages. 756-767
- Saeed Kargar, Heiner Litz, Faisal Nawab: Predict and Write: Using K-Means Clustering to Extend the Lifetime of NVM Storage. 768-779
- Donghui Wang, Peng Cai, Weining Qian, Aoying Zhou: Discriminative Admission Control for Shared-everything Database under Mixed OLTP Workloads. 780-791
- David B. Lomet, Chen Luo: Efficiently Reclaiming Space in a Log Structured Store. 792-803
Stream Data Management 1
- Peng Jia, Pinghui Wang, Junzhou Zhao, Ye Yuan, Jing Tao, Xiaohong Guan: LogLog Filter: Filtering Cold Items within a Large Range over High Speed Data Streams. 804-815
- Taehyung Kwon, Inkyu Park, Dongjin Lee, Kijung Shin: SliceNStitch: Continuous CP Decomposition of Sparse Tensor Streams. 816-827
- Bogyeong Kim, Kyoseung Koo, Juhun Kim, Bongki Moon: DISC: Density-Based Incremental Clustering by Striding over Streaming Data. 828-839
- Dongjin Lee, Kijung Shin: Robust Factorization of Real-world Tensor Streams with Patterns, Missing Values, and Outliers. 840-851
- Muhammad Saad, Abraham Bernstein, Michael H. Böhlen, Daniele Dell'Aglio: Single Point Incremental Fourier Transform on 2D Data Streams. 852-863
- Ran Ben Basat, Gil Einziger, Michael Mitzenmacher, Shay Vargaftik: SALSA: Self-Adjusting Lean Streaming Analytics. 864-875
Knowledge Discovery
- Yueji Yang, Yuchen Li, Anthony K. H. Tung: NewsLink: Empowering Intuitive News Search with Knowledge Graphs. 876-887
- Na Li, Renyu Zhu, Xiaoxu Zhou, Xiangnan He, Wenyuan Cai, Ming Gao, Aoying Zhou: On Disambiguating Authors: Collaboration Network Reconstruction in a Bottom-up Manner. 888-899
- Pei Yi, Hong Xie, Yongkun Li, John C. S. Lui: A Bootstrapping Approach to Optimize Random Walk Based Statistical Estimation over Graphs. 900-911
- Xiang Li, Danhao Ding, Ben Kao, Yizhou Sun, Nikos Mamoulis: Leveraging Meta-path Contexts for Classification in Heterogeneous Information Networks. 912-923
- Rana Alotaibi, Chuan Lei, Abdul Quamar, Vasilis Efthymiou, Fatma Özcan: Property Graph Schema Optimization for Domain-Specific Knowledge Graphs. 924-935
- Jian Zeng, Leong Hou U, Xiao Yan, Mingji Han, Bo Tang: Fast Core-based Top-k Frequent Pattern Discovery in Knowledge Graphs. 936-947
Query Processing and Optimization 1
- Fan Zhang, Hanhua Chen, Hai Jin, Pedro Reviriego: The Logarithmic Dynamic Cuckoo Filter. 948-959
- Xiaolong He, Peng Cai, Xuan Zhou, Aoying Zhou: Continuously Bulk Loading over Range Partitioned Tables for Large Scale Historical Data. 960-971
- Jinfei Liu, Li Xiong, Qiuchen Zhang, Jian Pei, Jun Luo: Eclipse: Generalizing kNN and Skyline. 972-983
- Magnus Müller, Daniel Flachs, Guido Moerkotte: Memory-Efficient Key/Foreign-Key Join Size Estimation via Multiplicity and Intersection Size. 984-995
- Ce Zhang, Cheng Xu, Haixin Wang, Jianliang Xu, Byron Choi: Authenticated Keyword Search in Scalable Hybrid-Storage Blockchains. 996-1007
- Sofoklis Floratos, Mengbai Xiao, Hao Wang, Chengxin Guo, Yuan Yuan, Rubao Lee, Xiaodong Zhang: NestGPU: Nested Query Processing on GPU. 1008-1019
Data Management on New Hardware
- Fan Yang, Youmin Chen, Youyou Lu, Qing Wang, Jiwu Shu: Aria: Tolerating Skewed Workloads in Secure In-memory Key-value Stores. 1020-1031
- Junkai Liang, Yunpeng Chai: CruiseDB: An LSM-Tree Key-Value Store with Both Better Tail Throughput and Tail Latency. 1032-1043
- Zubeyr F. Eryilmaz, Aarati Kakaraparthy, Jignesh M. Patel, Rathijit Sen, Kwanghyun Park: FPGA for Aggregate Processing: The Good, The Bad, and The Ugly. 1044-1055
Stream Data Management 2
- Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet, Russel Pears: Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information. 1056-1067
- Lukasz Korycki, Bartosz Krawczyk: Concept Drift Detection from Multi-Class Imbalanced Data Streams. 1068-1079
- Keyu Yang, Yunjun Gao, Yifeng Shen, Baihua Zheng, Lu Chen: DisMASTD: An Efficient Distributed Multi-Aspect Streaming Tensor Decomposition. 1080-1091
Stream Data Management 3
- Bo Hui, Haiquan Chen, Da Yan, Wei-Shinn Ku: EDGE: Entity-Diffusion Gaussian Ensemble for Interpretable Tweet Geolocation Prediction. 1092-1103
- Shimin Di, Quanming Yao, Yongqi Zhang, Lei Chen: Efficient Relation-aware Scoring Function Search for Knowledge Graph Embedding. 1104-1115
- Meng-Chieh Lee, Catalina Vajiac, Aayushi Kulshrestha, Sacha Levy, Namyong Park, Cara Jones, Reihaneh Rabbany, Christos Faloutsos: INFOSHIELD: Generalizable Information-Theoretic Human-Trafficking Detection. 1116-1127
- Yansheng Wang, Yongxin Tong, Dingyuan Shi, Ke Xu: An Efficient Approach for Cross-Silo Federated Learning to Rank. 1128-1139
- Zhaoyue Cheng, Nick Koudas, Zhe Zhang, Xiaohui Yu: Efficient Construction of Nonlinear Models over Normalized Data. 1140-1151
- Çigdem Aslay, Martino Ciaperoni, Aristides Gionis, Michael Mathioudakis: Workload-aware Materialization for Efficient Variable Elimination on Bayesian Networks. 1152-1163
Spatial and Temporal Data
- Yuval Alfassi, Moshe Gabel, Gal Yehuda, Daniel Keren: A Distance-Based Scheme for Reducing Bandwidth in Distributed Geometric Monitoring. 1164-1175
- Qiyu Liu, Yanyan Shen, Lei Chen: LHist: Towards Learning Multi-dimensional Histogram for Massive Spatial Data. 1188-1199
- Guang Wang, Shuxin Zhong, Shuai Wang, Fei Miao, Zheng Dong, Desheng Zhang: Data-Driven Fairness-Aware Vehicle Displacement for Large-Scale Electric Taxi Fleets. 1200-1211
- Ting Wang, Xike Xie, Xin Cao, Torben Bach Pedersen, Yang Wang, Mingjun Xiao: On Efficient and Scalable Time-Continuous Spatial Crowdsourcing. 1212-1223
- Guanyao Li, Chih-Chieh Hung, Mengyun Liu, Linfei Pan, Wen-Chih Peng, S.-H. Gary Chan: Spatial-Temporal Similarity for Trajectories with Location Noise and Sporadic Sampling. 1224-1235
Data Integration and Cleaning 2
- Roee Shraga, Ofra Amir, Avigdor Gal: Learning to Characterize Matching Experts. 1236-1247
- Leonardo Gazzarri, Melanie Herschel: End-to-end Task Based Parallelization for Entity Resolution on Dynamic Data. 1248-1259
- Youfu Li, Jin Wang, Mingda Li, Ariyam Das, Jiaqi Gu, Carlo Zaniolo: KDDLog: Performance and Scalability in Knowledge Discovery by Declarative Queries with Aggregates. 1260-1271
- Alex Bogatu, Norman W. Paton, Mark Douthwaite, Stuart Davie, André Freitas: Cost-effective Variational Active Entity Resolution. 1272-1283
- Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava: Structured Object Matching across Web Page Revisions. 1284-1295
- Pei Wang, Weiling Zheng, Jiannan Wang, Jian Pei: Automating Entity Matching Model Development. 1296-1307
Graph Data Management 3
- Xiaoshuang Chen, Longbin Lai, Lu Qin, Xuemin Lin, Boge Liu: A Framework to Quantify Approximate Simulation on Graph Data. 1308-1319
- Zhengmin Lai, You Peng, Shiyu Yang, Xuemin Lin, Wenjie Zhang: PEFP: Efficient k-hop Constrained s-t Simple Path Enumeration on FPGA. 1320-1331
- Michael Yu, Lu Qin, Ying Zhang, Wenjie Zhang, Xuemin Lin: DPTL+: Efficient Parallel Triangle Listing on Batch-Dynamic Graphs. 1332-1343
- Xiaofan Li, Rui Zhou, Lu Chen, Yong Zhang, Chengfei Liu, Qiang He, Yun Yang: Finding a Summary for All Maximal Cliques. 1344-1355
- Kaixin Liu, Sibo Wang, Yong Zhang, Chunxiao Xing: An Efficient Algorithm for the Anchored k-Core Budget Minimization Problem. 1356-1367
- Geonmo Gu, Yehyun Nam, Kunsoo Park, Zvi Galil, Giuseppe F. Italiano, Wook-Shin Han: Scalable Graph Isomorphism: Combining Pairwise Color Refinement and Backtracking via Compressed Candidate Space. 1368-1379