BigData Conference 2013: Santa Clara, CA, USA
- Xiaohua Hu, Tsau Young Lin, Vijay V. Raghavan, Benjamin W. Wah, Ricardo Baeza-Yates, Geoffrey C. Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang, Rayid Ghani, Wei Fan, Ronny Lempel, Raghunath Nambiar:
2013 IEEE International Conference on Big Data (IEEE BigData 2013), 6-9 October 2013, Santa Clara, CA, USA. IEEE Computer Society 2013, ISBN 978-1-4799-1292-6
Conference Paper Presentations
- Amgad Madkour, Walid G. Aref, Saleh M. Basalamah:
Knowledge cubes - A proposal for scalable and semantically-guided management of Big Data. 1-7
- Pascal Bianchi, Stéphan Clémençon, Gemma Morral, Jérémie Jakubowicz:
On-line learning gossip algorithm in multi-agent systems with local decision rules. 6-14
- Peter Sanders, Sebastian Schlag, Ingo Müller:
Communication efficient algorithms for fundamental big data problems. 15-23
- Upa Gupta, Leonidas Fegaras:
Map-based graph analysis on MapReduce. 24-30
- Tao Luo, Yin Liao, Guoliang Chen, Yunquan Zhang:
P-DOT: A model of computation for big data. 31-37
- En-Hui Yang, Xiang Yu:
Transparent composite model for large scale image/video processing. 38-44
- Rui Han, Lei Nie, Moustafa Ghanem, Yike Guo:
Elastic algorithms for guaranteeing quality monotonicity in big data mining. 45-50
- Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, Pietro Michiardi:
HFSP: Size-based scheduling for Hadoop. 51-59
- Benedikt Elser, Alberto Montresor:
An evaluation study of BigData frameworks for graph processing. 60-67
- Bryan N. Lawrence, Victoria L. Bennett, J. Churchill, Martin Juckes, Philip Kershaw, Stephen Pascoe, Sam Pepler, M. Pritchard, Ag Stephens:
Storing and manipulating environmental big data with JASMIN. 68-75
- Hieu Hanh Le, Satoshi Hikida, Haruo Yokota:
Efficient gear-shifting for a power-proportional distributed data-placement method. 76-84
- Patrick Leyshock, David Maier, Kristin Tufte:
Agrios: A hybrid approach to big array analytics. 85-93
- Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis, Yang Li, Shulin Yan, Moustafa Ghanem, Yike Guo:
Building a generic platform for big sensor data application. 94-102
- Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen:
Locality-driven high-level I/O aggregation for processing scientific datasets. 103-111
- Dheeraj Kumar, Marimuthu Palaniswami, Sutharshan Rajasegarar, Christopher Leckie, James C. Bezdek, Timothy C. Havens:
clusiVAT: A mixed visual/numerical clustering algorithm for big data. 112-117
- Toshimori Honjo, Kazuki Oikawa:
Hardware acceleration of Hadoop MapReduce. 118-124
- Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh:
Optimizing the MapReduce framework on Intel Xeon Phi coprocessor. 125-130
- Eugen Feller, Lavanya Ramakrishnan, Christine Morin:
On the performance and energy efficiency of Hadoop deployment models. 131-136
- D. Michael Freemon:
Optimizing throughput on guaranteed-bandwidth WAN networks for the Large Synoptic Survey Telescope (LSST). 137-142
- Takuya Araki, Kazuyo Narita, Hiroshi Tamano:
Feliss: Flexible distributed computing framework with light-weight checkpointing. 143-149
- Jonas Dias, Eduardo S. Ogasawara, Daniel de Oliveira, Fábio Porto, Patrick Valduriez, Marta Mattoso:
Algebraic dataflows for big data analysis. 150-155
- Wei Yan, Yuan Xue, Bradley A. Malin:
Scalable and robust key group size estimation for reducer load balancing in MapReduce. 156-162
- Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, Changlin Long, Wenjuan Bi:
Robot: An efficient model for big data storage systems based on erasure coding. 163-168
- Chao Chen, Michael Lang, Yong Chen:
Multilevel Active Storage for big data applications in high performance computing. 169-174
- Chandima Hewa Nadungodage, Yuni Xia, Jaehwan John Lee, Myungcheol Lee, Choon Seo Park:
GPU accelerated item-based collaborative filtering for big-data applications. 175-180
- GuiXin Guo, Shuang Qiu, Zhiqiang Ye, Bingqiang Wang, Lin Fang, Mian Lu, Simon See, Rui Mao:
GPU-accelerated adaptive compression framework for genomics data. 181-186
- Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, Calton Pu:
An infrastructure for automating large-scale performance studies and data processing. 187-192
- Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, Pangfeng Liu:
Kylin: An efficient and scalable graph data processing system. 193-198
- Qunzhi Zhou, Yogesh Simmhan, Viktor K. Prasanna:
Towards hybrid online on-demand querying of realtime data with stateful complex event processing. 199-205
- Jiaran Zhang, Xiaohui Yu, Yang Liu, Liwei Lin:
DDSN: Duplicate detection to reduce both storage and bandwidth consumption. 206-211
- Aalap Tripathy, Ka Chon Ieong, Atish Patra, Rabi N. Mahapatra:
A reconfigurable computing architecture for semantic information filtering. 212-218
- Oyindamola O. Akande, Philip J. Rhodes:
Iteration aware prefetching for unstructured grids. 219-227
- Elad Yom-Tov, Mounia Lalmas, Ricardo Baeza-Yates, Georges Dupret, Janette Lehmann, Pinar Donmez:
Measuring inter-site engagement. 228-236
- Ting Chen, Kenjiro Taura:
A selective checkpointing mechanism for query plans in a parallel database system. 237-245
- Kyumars Sheykh Esmaili, Lluis Pamies-Juarez, Anwitaman Datta:
CORE: Cross-object redundancy for efficient data repair in storage systems. 246-254
- Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos, Panagiotis Karras, Nectarios Koziris:
H2RDF+: High-performance distributed joins over large-scale RDF graphs. 255-263
- Austin R. Benson, David F. Gleich, James Demmel:
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. 264-272
- Radu Tudoran, Alexandru Costan, Ramin Rezai Rad, Goetz Brasche, Gabriel Antoniu:
Adaptive file management for scientific workflows on the Azure cloud. 273-281
- Tian Guo, Thanasis G. Papaioannou, Karl Aberer:
Model-view sensor data management in the cloud. 282-290
- Anthony D. Fox, Christopher N. Eichelberger, James N. Hughes, Skylar Lyon:
Spatio-temporal indexing in non-relational distributed databases. 291-299
- Lefteris Sidirourgos, Martin L. Kersten, Peter A. Boncz:
Scientific discovery through weighted sampling. 300-306
- Stefan Pröll, Andreas Rauber:
Scalable data citation in dynamic, large databases: Model and reference implementation. 307-312
- Krish K. R., Aleksandr Khasymski, Guanying Wang, Ali Raza Butt, Gaurav Makkar:
On the use of shared storage in shared-nothing environments. 313-318
- Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, Dimitrios Gunopulos:
Self-adaptive event recognition for intelligent transport management. 319-325
- Leonardo Arturo Bautista-Gomez, Franck Cappello:
Improving floating point compression through binary masks. 326-331
- Junjie Chen, Philip C. Roth, Yong Chen:
Using pattern-models to guide SSD deployment for Big Data applications in HPC systems. 332-337
- Zhiquan Liu, Luo Luo, Wu-Jun Li:
Robust crowdsourced learning. 338-343
- Jialin Liu, Surendra Byna, Yong Chen:
Segmented analysis for reducing data movement. 344-349
- Simon Chan, Philip C. Treleaven, Licia Capra:
Continuous hyperparameter optimization for large-scale recommender systems. 350-358
- Hoang Vu Nguyen, Emmanuel Müller, Klemens Böhm:
4S: Scalable subspace search scheme overcoming traditional Apriori processing. 359-367
- Lars Arge, Michael T. Goodrich, Freek van Walderveen:
Computing betweenness centrality in external memory. 368-375
- Rong Gu, Furao Shen, Yihua Huang:
A parallel computing platform for training large scale neural networks. 376-384
- Raghvendra Mall, Rocco Langone, Johan A. K. Suykens:
Self-tuned kernel spectral clustering for large scale networks. 385-393
- Yuichiro Yasui, Katsuki Fujisawa, Kazushige Goto:
NUMA-optimized parallel breadth-first search on multicore single-node system. 394-402
- Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, Matthew Saltz:
A distributed vertex-centric approach for pattern matching in massive graphs. 403-411
- Lee Parnell Thompson, Weijia Xu, Daniel P. Miranker:
Fast scalable selection algorithms for large scale data. 412-420
- Yoshiki Sakai, Kenji Yamanishi:
An NML-based model selection criterion for general relational data modeling. 421-429
- Rajiv Khanna, Liang Zhang, Deepak Agarwal, Bee-Chung Chen:
Parallel matrix factorization for binary response. 430-438
- Desheng Zhang, Tian He, Yunhuai Liu, John A. Stankovic:
CallCab: A unified recommendation system for carpooling and regular taxicab services. 439-447
- Abhirup Chakraborty:
Top-K aggregation over a large graph using shared-nothing systems. 448-457
- Nemanja Djuric, Mihajlo Grbovic, Slobodan Vucetic:
Distributed confidence-weighted classification on MapReduce. 458-466
- Zhiwei Yu, Raymond K. Wong, Chi-Hung Chi:
Scalable context-aware role mining with MapReduce. 467-474
- Yusheng Xie, Zhengzhang Chen, Kunpeng Zhang, Chen Jin, Yu Cheng, Ankit Agrawal, Alok N. Choudhary:
Elver: Recommending Facebook pages in cold start situation without content features. 475-479
- Paul Logasa Bogen, Christopher T. Symons, Amber McKenzie, Robert M. Patton, Robert E. Gillen:
Massively scalable near duplicate detection in streams of documents using MDSH. 480-486
- Ahmet Erdem Sariyüce, Kamer Kaya, Erik Saule, Ümit V. Çatalyürek:
Incremental algorithms for closeness centrality. 487-492
- Bo Zhang, Zhongzhi Shi:
Classification of big velocity data via cross-domain Canonical Correlation Analysis. 493-498
- Frank K. H. A. Dehne, Q. Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, R. Zhou:
A distributed tree data structure for real-time OLAP on cloud architectures. 499-505
- Jiangling Yin, Andrew Foran, Jun Wang:
DL-MPI: Enabling data locality computation for MPI-based data-intensive applications. 506-511
- Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, Michael R. Lyu:
Sparse Poisson coding for high dimensional document clustering. 512-517
- Martin Weidner, Jonathan Dees, Peter Sanders:
Fast OLAP query execution in main memory on large data in a cluster. 518-524
- Xudong Zhang, Wayne Xin Zhao, Dongdong Shan, Hongfei Yan:
Group-Scheme: SIMD-based compression algorithms for web text data. 525-530
- Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, Ming-Syan Chen:
Efficient large graph pattern mining for big data in the cloud. 531-536
- Rui Wang, Kenneth Chiu:
A stream partitioning approach to processing large scale distributed graph datasets. 537-542
- Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, Sasa Petrovic:
Scalable distributed event detection for Twitter. 543-549
- Barbara Furletti, Lorenzo Gabrielli, Chiara Renso, Salvatore Rinzivillo:
Analysis of GSM calls data for understanding user mobility behavior. 550-555
- Haizhou Fu, HyeongSik Kim, Kemafor Anyanwu:
Scaling concurrency of personalized Semantic search over Large RDF data. 556-562
- Hui Miao, Xiangyang Liu, Bert Huang, Lise Getoor:
A hypergraph-partitioned vertex programming approach for large-scale consensus optimization. 563-568
- Simon Price, Peter A. Flach:
A Higher-order data flow model for heterogeneous Big Data. 569-574
- Daniel Trabold, Henrik Grosskreutz:
Parallel subgroup discovery on computing clusters - First results. 575-579
- Darakhshan J. Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, Rebecca N. Wright:
DP-WHERE: Differentially private modeling of human mobility. 580-588
- Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, Hsing-Kuo Pao:
Malicious URL filtering - A big data application. 589-596
- Maryam Shoaran, Alex Thomo, Jens H. Weber-Jahnke:
Zero-knowledge private graph summarization. 597-605
- Lei Shi, Qi Liao, Xiaohua Sun, Yarui Chen, Chuang Lin:
Scalable network traffic visualization using compressed graphs. 606-612
- Duncan Hodges, Sadie Creese:
Breaking the Arc: Risk control for Big Data. 613-621
- Tim Hegeman, Bogdan Ghit, Mihai Capota, Jan Hidders, Dick H. J. Epema, Alexandru Iosup:
The BTWorld use case for big data analytics: Description, MapReduce logical workflow, and empirical evaluation. 622-630
- Bin Liu, Haifeng Chen, Abhishek B. Sharma, Guofei Jiang, Hui Xiong:
Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems. 631-638
- Wei Lu, Gang Chen, Anthony K. H. Tung, Feng Zhao:
Efficiently extracting frequent subgraphs using MapReduce. 639-647
- Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino Pedreschi, Fosca Giannotti:
Explaining the product range effect in purchase data. 648-656
- Natasha Balac, Tamara B. Sipes, Nicole Wolter, Kenneth Nunes, Robert S. Sinkovits, Homa Karimabadi:
Large Scale predictive analytics for real-time energy management. 657-664
- Geoffrey C. Fox, Deepak R. Mani, Saumyadipta Pyne:
Parallel deterministic annealing clustering and its application to LC-MS data analysis. 665-673
- Diana Moise, Denis Shestakov, Gylfi Þór Gudmundsson, Laurent Amsaleg:
Terabyte-scale image similarity search: Experience and best practice. 674-682
- Matthieu-P. Schapranow, Hasso Plattner:
HIG - An in-memory database platform enabling real-time analyses of genome data. 691-696
- András Garzó, András A. Benczúr, Csaba István Sidló, Daniel Tahara, Erik Francis Wyatt:
Real-time streaming mobility analytics. 697-702
- Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, Norbert Zeh:
QuPARA: Query-driven large-scale portfolio aggregate risk analysis on MapReduce. 703-709
- Mauricio A. Hernández, Kirsten Hildrum, Prateek Jain, Rohit Wagle, Bogdan Alexe, Rajasekar Krishnamurthy, Ioana Roxana Stanoi, Chitra Venkatramani:
Constructing consumer profiles from social media data. 710-716
- Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, Jan-Ming Ho:
CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework. 717-722
- Jungsuk Kwac, Ram Rajagopal:
Demand response targeting using big data analytics. 683-690
- Adrian Albert, Ram Rajagopal:
Building dynamic thermal profiles of energy consumption for individuals and neighborhoods. 723-728
- Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe Chalfoun, Mary Brady:
Terabyte-sized image computations on Hadoop cluster platforms. 729-737
- Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach, Roi Saltzman:
A fast and scalable method for threat detection in large-scale DNS logs. 738-741
- Matthew Hayes, Sam Shah:
Hourglass: A library for incremental processing on Hadoop. 742-752
- Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, Wentao Tang:
Correlation-based performance analysis for full-system MapReduce optimization. 753-761
- Mihajlo Grbovic, Jon Malkin, Hirakendu Das:
Large scale ad latency analysis. 762-767
- Alessandro Morari, Vito Giovanni Castellana, David Haglin, John Feo, Jesse Weaver, Antonino Tumeo, Oreste Villa:
Accelerating semantic graph databases on commodity clusters. 768-772
- Peter Lubell-Doughtie, Jon Sondag:
Practical distributed classification using the Alternating Direction Method of Multipliers algorithm. 773-776
- Varun Sharma, Jeremy Carroll, Abhi Khune:
Scaling deep social feeds at Pinterest. 777-783
- Thibaud Chardonnens, Philippe Cudré-Mauroux, Martin Grund, Benoit Perroud:
Big data analytics on high Velocity streams: A case study. 784-787
Workshop 1: Distributed Storage Systems and Coding for Big Data
- Iryna Andriyanova, Alan Jule, Emina Soljanin:
The Code rebalancing problem for a storage-flexible Data Center Network. 1-6
- Wasim Ahmad Bhat, S. M. K. Quadri:
suvfs: A virtual file system in userspace that supports large files. 7-11
- Antonio Campello, Vinay A. Vaishampayan:
Reliability of erasure coded storage systems: A geometric approach. 12-16
- Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian, Vinay A. Vaishampayan:
Distributed storage evaluation on a three-wide inter-data center deployment. 17-22
- Vinay Deolalikar:
Paired-replicas with constant repair time: Loss functions and memorylessness. 23-27
- Kyumars Sheykh Esmaili, Aatish Chiniah, Anwitaman Datta:
Efficient updates in cross-object erasure-coded storage systems. 28-32
- Hanxu Hou, Kenneth W. Shum, Hui Li:
Construction of exact-BASIC codes for distributed storage systems at the MSR point. 33-38