School of Computing, Queen's University

Bibliography 2011

Original Bibliography by Prof. Patrick Martin for a graduate course on Data Management in Cloud Computing (CISC 839) with some additions from Mian and Al Harkan.

Publication Venues

The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010). May 17-20, Melbourne, Victoria, Australia.

IEEE Cloud 2010: The 3rd International Conference on Cloud Computing.

CLOUDCOMP 2009: First International Conference on Cloud Computing, October 19-21, 2009, Munich, Germany.

IEEE International Conference on Cloud Computing, 2009. CLOUD '09.

Virtual Conference on Cloud Computing. CLOUDSLAM 

CloudApp 2010: The First IEEE International Workshop on Emerging Applications for Cloud Computing. 
(acknowledgment: Haroon Malik)


Bégin, M., An EGEE Comparative Study: Grids and Clouds - Evolution or Revolution, in Enabling Grids for E-Science. 2008, CERN. (Probably the first academic perspective on cloud.)

M. Vouk. Cloud Computing – Issues, Research and Implementations, Journal of Computing and Information Technology – CIT 16(4), 235 – 246, 2008.

L. Vaquero, L. Rodero-Merino, J. Caceres and M. Lindner. A Break in the Clouds: Towards a Cloud Definition, ACM SIGCOMM Computer Communication Review 39(1), 50 – 55, January 2009.

M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica and M. Zaharia. Above the Clouds: A Berkeley View of Cloud Computing, Technical Report No. UCB/EECS-2009-28, Electrical Engineering and Computer Sciences, University of California at Berkeley, February 2009.

R. Buyya, C. Yeo, and S. Venugopal, Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities, Keynote Paper, Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications (HPCC 2008, IEEE CS Press, Los Alamitos, CA, USA), Sept. 25-27, 2008, Dalian, China.

A. Weiss. Computing in the clouds. netWorker 11, 4 (Dec. 2007), 16-25.

D. DeWitt and J. Gray. Parallel Database Systems: The Future of High Performance Database Systems. Communications of the ACM 35(6), 85 – 98, 1992.

M. Satyanarayanan. A Survey of Distributed File Systems, 1989.

B. Ooi and S. Parthasarathy (eds). Special Issue on Data Management on Cloud Computing Platforms, IEEE Data Engineering Bulletin 32(1), March 2009.

Service-Oriented Computing

M. Papazoglou and W. van den Heuvel. Service Oriented Architectures: Approaches, Technologies and Research Issues, VLDB Journal, vol. 16, no. 3, pp. 389-415, 2007.

M. Papazoglou. Service-Oriented Computing: Concepts, Characteristics and Directions, Proc of Fourth International Conference on Web Information Systems Engineering (WISE'03), 2003.

M. Huhns and M. Singh. Service-Oriented Computing: Key Concepts and Principles, IEEE Internet Computing, Jan-Feb 2005, pp. 75 – 81.

R. Fielding and R. Taylor. Principled Design of the Modern Web Architecture, ACM Trans on Internet Technology 2(2), May 2002, pp. 115 – 150.

N. Milanovic and M. Malek. Current Solutions for Web Service Composition, IEEE Internet Computing 8(6), 51 – 59, November 2004.

M. Little. Transactions and Web Services, Communications of the ACM 46(10), October 2003.

Steffen Staab, Wil M. P. van der Aalst, V. Richard Benjamins, Amit P. Sheth, John A. Miller, Christoph Bussler, Alexander Maedche, Dieter Fensel, Dennis Gannon: Web Services: Been There, Done That? IEEE Intelligent Systems 18(1): 72-85 (2003)

Michael L. Brodie: Illuminating the Dark Side of Web Services. VLDB 2003: 1046-1049

Virtualization and Resource Allocation

VRA1. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt and A. Warfield. Xen and the Art of Virtualization, Proc of 19th ACM Symposium on Operating Systems Principles, Bolton Landing NY, October 2003, pp. 164 – 177.

VRA2. D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff and D. Zagorodnov. The Eucalyptus Open-Source Cloud-Computing System, CCGRID 2009.

VRA3. A. Soror, U. Minhas, A. Aboulnaga, K. Salem, P. Kokosielis, and S. Kamath. Automatic Virtual Machine Configuration for Database Workloads. SIGMOD 2008.

VRA4. L. Ramakrishnan, D. Irwin, L. Grit, A. Yumerefendi, A. Iamnitchi, and J. Chase. Toward a Doctrine of Containment: Grid Hosting with Adaptive Resource Control. SC 2006.

D. Irwin, J. Chase, L. Grit, A. Yumerefendi, D. Becker, and K. Yocum. Sharing Networked Resources with Brokered Leases. USENIX Annual Conference 2006.

VMware. Virtualization Overview.

S. Aulbach, T. Grust, D. Jacobs, A. Kemper, and J. Rittinger. Multi-tenant Databases for Software as a Service: Schema-mapping Techniques. SIGMOD 2008.

U. Minhas, J. Yadav, A. Aboulnaga and K. Salem. Database Systems on Virtual Machines: How Much Do You Lose? Proc of IEEE 24th International Conference on Data Engineering Workshop, 35 – 41, April 2008.

Lagar-Cavilla, H.A., et al., SnowFlock: Rapid Virtual Machine Cloning for Cloud Computing, in EuroSys '09: Proceedings of the Fourth EuroSys Conference. 2009, ACM: New York. p. 1-12.

Distributed Data Processing

DDP1. J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters, Proc of USENIX 6th Symposium on Operating Systems Design and Implementation, San Francisco CA, Dec 2004.

DDP2. C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: A Not-So-Foreign Language for Data Processing. SIGMOD 2008.

DDP3. R. Pike, S. Dorward, R. Griesemer, and S. Quinlan. Interpreting the Data: Parallel Analysis with Sawzall. Scientific Programming 13(4), 2005.

DDP4. R. Chaiken, B. Jenkins, P. Larson, B. Ramsey, D. Shakib, S. Weaver, and J. Zhou. SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets. VLDB 2008.

DDP5. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed Data-parallel Programs from Sequential Building Blocks. EuroSys 2007.

DDP6. D. DeWitt, E. Robinson, S. Shankar, E. Paulson, J. Naughton, A. Krioukov, and J. Royalty. Clustera: An Integrated Computation and Data Management System. VLDB 2008.

DDP7. Parag Agrawal, Daniel Kifer, and Christopher Olston. Scheduling Shared Scans of Large Data Files. VLDB 2008.

DDP8. Lei Chen, Christopher Olston, and Raghu Ramakrishnan. Parallel Evaluation of Composite Aggregate Queries. ICDE 2008.

H. Yang, A. Dasdan, R. Hsiao, and D. Stott Parker Jr. Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters. SIGMOD 2007.

Apache. Hadoop web site.

R. Shankar and G. Narendra. MapReduce Programming with Apache Hadoop, JavaWorld, September 2008.

E. Friedman, P. Pawlowski and J. Cieslewicz. SQL/MapReduce: A Practical Approach to Self-describing, Polymorphic and Parallelizable User-defined Functions, VLDB 2009.

Shang, W.Y., et al., MapReduce as a General Framework to Support Research in Mining Software Repositories (MSR), in 2009 6th IEEE International Working Conference on Mining Software Repositories. 2009, IEEE: New York. p. 21-30.

Storage and Retrieval

SR1. S. Ghemawat, H. Gobioff, and S. Leung. The Google File System. SOSP 2003.

SR2. F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber. Bigtable: A Distributed Storage System for Structured Data. OSDI 2006.

SR3. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's Highly Available Key-Value Store. SOSP 2007.

SR4. M. Brantner, D. Florescu, D. Graf, D. Kossmann, and T. Kraska. Building a Database on S3. SIGMOD 2008.

SR5. B. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s Hosted Data Serving Platform. VLDB 2008.

SR6. A. Silberstein, B. Cooper, U. Srivastava, E. Vee, R. Yerneni, and R. Ramakrishnan. Efficient Bulk Insertion into a Distributed Ordered Table. SIGMOD 2008.

SR7. T. Kraska, M. Hentschel, G. Alonso and D. Kossmann. Consistency Rationing in the Cloud: Pay only when it matters, VLDB 2009.

SR8. M. Aguilera, W. Golab, and M. Shah. A Practical Scalable Distributed B-Tree. VLDB 2008.

C. Plattner and G. Alonso. Ganymed: Scalable Replication for Transactional Web Applications. Middleware 2004.

K. Daudjee and K. Salem. Lazy Database Replication with Snapshot Isolation. VLDB 2006.

M. Aguilera, A. Merchant, M. Shah, A. Veitch, and C. Karamanolis. Sinfonia: A New Paradigm for Building Scalable Distributed Systems. SOSP 2007.

E. Cecchet, G. Candea and A. Ailamaki. Middleware-based Replication: The Gaps between Theory and Practice, SIGMOD 2008.

Daniel J. Abadi: Data Management in the Cloud: Limitations and Opportunities. IEEE Data Eng. Bull. 32(1): 3-12 (2009)

Large-Scale Data Analysis / Warehousing

LSDA1. A. Pavlo, E. Paulson, A. Rasin, D. Abadi, D. DeWitt, S. Madden and M. Stonebraker. A Comparison of Approaches to Large-Scale Data Analysis, SIGMOD 2009.

LSDA2. J. Dean and S. Ghemawat. MapReduce: A Flexible Data Processing Tool, CACM 2010.


LSDA3. A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz and A. Rasin. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads, VLDB 2009.

LSDA4. M. Ahuja, C. Chen, R. Gottapu, J. Hallman, W. Hasan, R. Johnson, M. Kozyczak, R. Pabbati, N. Pandit, S. Pokuri and K. Uppala. Peta-Scale Data Warehousing at Yahoo!, SIGMOD 2009.

LSDA5. Y. Xu, P. Kostamaa, X. Zhou and L. Chen. Handling Data Skew in Parallel Joins in Shared-Nothing Systems, SIGMOD 2008.

LSDA6. G. Candea, N. Polyzotis and R. Vingralek. A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses, VLDB 2009.

LSDA7. J. Cohen, B. Dolan, M. Dunlap, J. Hellerstein and C. Welton. MAD Skills: New Analysis Practices for Big Data, VLDB 2009.

LSDA8. A. Gates, O. Natkovich, S. Chopra, P. Kamath, S. Narayanamurthy, C. Olston and B. Reed. Building a High-Level Data Flow System on top of MapReduce: The Pig Experience, VLDB 2009.

M. Stonebraker, D. Abadi, D. DeWitt, S. Madden, E. Paulson, A. Pavlo and A. Rasin. MapReduce and Parallel DBMSs: Friends or Foes? CACM 2010.

D. Lomet and M. Mokbel. Locking Key Ranges with Unbundled Transaction Services, VLDB 2009.

B. Panda, J. Herbach, S. Basu and R. Bayardo. PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce, VLDB 2009.


Rajkumar Buyya, Rajiv Ranjan and Rodrigo N. Calheiros, Modeling and Simulation of Scalable Cloud Computing Environments and the CloudSim Toolkit: Challenges and Opportunities, Proceedings of the 7th High Performance Computing and Simulation Conference (HPCS 2009, ISBN: 978-1-4244-4907-1, IEEE Press, New York, USA), Leipzig, Germany, June 21-24, 2009.

Anthony Sulistio, Chee Shin Yeo, Rajkumar Buyya: A Taxonomy of Computer-based Simulations and its Mapping to Parallel and Distributed Systems Simulation Tools. Softw., Pract. Exper. 34(7): 653-673 (2004)

Workload Execution

Paton, N. W., M. A. T. Aragão, et al. (2009). "Optimizing Utility in Cloud Computing through Autonomic Workload Execution." IEEE Data Engineering Bulletin 32(1): 51-58.

Grid vs Cloud

Foster, I., Y. Zhao, et al. (2008). Cloud Computing and Grid Computing 360-Degree Compared. Grid Computing Environments Workshop, 2008. GCE '08.

Cloud Security

The Information Warfare Monitor (Citizen Lab, Munk School of Global Affairs, University of Toronto and the SecDev Group, Ottawa) and the Shadowserver Foundation announce the release of "Shadows in the Cloud: An investigation into cyber espionage 2.0." Full report, April 6, 2010.