This workshop explores the important role of humans, in support roles, in making cloud computing useful in research settings. Cloud computing is clearly a type of cyberinfrastructure, which workshop organizer Craig Stewart defines as comprising "computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible" (2017). This workshop will focus on the "and people" part of cyberinfrastructure, and in particular on the role of people in supporting the use of commercial cloud resources in research.
The overall goal of this workshop is to share information about best practices, successes, and challenges in supporting research use of commercial clouds. To accomplish this, we will:
- Provide detailed, concrete examples of the effective use of cloud computing, comparing and contrasting use of cloud resources with traditional campus and national infrastructure alternatives.
- Provide a forum for a free exchange of ideas, challenges, and best practices in supporting the use of commercial cloud computing in advancing research across many disciplines.
- Foster networking among attendees and broaden the impact of efforts to put greater emphasis on the human component of cyberinfrastructure, including Campus Champions, CaRCC (the Campus Research Computing Consortium), and a new effort led by IU called Humanware, with the aim of building a larger and more effective community of experts.
Workshop Format
The format of presentations for the day will be:
- Introductory talk by Craig Stewart and Brian Voss
- A mixture of talks by accepted participants and CSREs
- An ending panel to discuss lessons learned and share information
Workshop Proceedings
Humanware: The Critical Role of People in Supporting Research in the Cloud
Brian D. Voss
Cyberinfrastructure is generally defined as consisting of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked by high-speed networks to enable scholarly innovation and discoveries. Usually, the "people" component is overlooked in favor of all the other "wares" ... hardware, software, vizware, and netware. The term "humanware" provides an analogous term to emphasize the important role that people play in supporting the use of cyberinfrastructure's other components. Humans have always been important to the deployment and effective use of technology. This paper seeks to provide historical perspective on the role of people in deploying technology, and to illuminate the challenges associated with emphasizing humanware's importance to the use of cloud-provided cyberinfrastructure.
Craig A. Stewart, Amy Apon, David Y. Hancock, Thomas Furlani, Alan Sill, Julie Wernert, David Lifka, Nicholas Berente, Thomas Cheatham, Shawn D. Slavin
In recent years, considerable attention has been given to assessing the value of investments in cyberinfrastructure (CI). This paper focuses on assessment of value measured in ways other than financial benefits - what might well be termed impact or outcomes. This paper is a companion to a paper presented at the PEARC'19 conference, which focused on methods for assessing financial returns on investment. In this paper we focus on methods for assessing impacts such as effect on publication production, importance of publications, and assistance with major scientific accomplishments as signified by major awards. We focus in particular on the role of humans in the loop - humanware. This includes a brief description of the roles humans play in facilitating use of research cyberinfrastructure - including clouds - and then a discussion of how those impacts have been assessed. Our conclusion overall is that there has been more progress in the past few years in developing methods for the quantitative assessment of financial returns on investment than there has been in assessing non-quantitative impacts. There are a few clear actions that many research institutions could take to start better assessing the non-financial impacts of investment in cyberinfrastructure. However, there is a great need for assessment efforts to turn more attention to the assessment of non-financial benefits of investment in cyberinfrastructure, particularly the benefits of investing in humans and the benefits to humans who are involved in supporting and using cyberinfrastructure, including clouds.
Gregor von Laszewski, Fugang Wang, Geoffrey C. Fox, Shawn Strande, Christopher Irving, Trevor Cooper, Dmitry Mishin, Michael L. Norman
The Comet petascale system is an XSEDE resource with the goal of serving a large user community. The Comet project has served a large number of users through traditional supercomputing as well as science gateways. In addition to these offerings, Comet also includes a nontraditional virtual machine framework that allows users to access entire Virtual Clusters instead of just focusing on individual virtual machines. The virtual machine framework integrates a custom administration interface, a novel virtual machine image management back-end, and industry standard hardware virtualization technology, and leverages the Comet resource manager and job scheduler to provide access to Comet compute nodes. However, accessing and managing it requires user input. In this paper, we summarize the human-in-the-loop efforts required as part of the cloud computing activities on Comet. This includes a discussion of how to get access, how to use the system, how to obtain support, and what lessons we learned from the operation of this facility for users.
Richard Knepper, Susan Mehringer, Adam Brazier, Brandon Barker, Resa Reynolds
Campus cloud resources represent significant resources for research computing tasks, with the caveat that transitioning to cloud contexts and scaling analyses is not always as simple as it might seem. We detail Red Cloud, Cornell's campus research cloud, and some of the work undertaken by the Center for Advanced Computing (CAC) to help researchers make use of cloud computing technologies. In 2015, Cornell CAC joined with two other universities to develop the Aristotle Cloud Federation, composed of separate campus cloud resources and data sources, supporting a range of science use cases. We discuss the lessons learned from helping researchers leverage both of these science cloud resources as well as leveraging other research cloud infrastructure and transitioning to public cloud.
Kate Keahey, Jason Anderson, Paul Ruth, Jacob Colleran, Cody Hammock, Joe Stubbs, Zhuo Zhen
Chameleon is a large-scale, deeply reconfigurable testbed built to support Computer Science experimentation. Unlike traditional systems of this kind, Chameleon has been configured using an adaptation of a mainstream open source infrastructure cloud system called OpenStack. In this paper, we discuss operational challenges for experimental testbeds and explain what impact they have on the profile of the operating team. We then discuss methods we developed to alleviate the operational burden and show how they can be used in practice. We conclude with a discussion of our interaction with the user community and describe how experimental platforms create a high potential for community involvement.
Yongwook Song, Xu Fu, Chris Richards
Humanware, the human component of cyberinfrastructure, is focused on understanding and developing the human expertise needed to support computationally-based research with the goal of maximizing efficiency, productivity, and return on investment associated with cyberinfrastructure. In this paper, we present an example use case describing the humanware challenges associated with leveraging cloud-based cyberinfrastructure to implement a machine learning software framework that classifies ambiguous time-series data sets. Our project demonstrates that collaboration between researchers and cyberinfrastructure experts significantly advanced our empirical research efforts and maximized the return on investment by utilizing a cost-efficient cloud-based cyberinfrastructure.
Dan Sholler
Researchers in various scientific disciplines are leveraging cloud computing resources to enhance the scale, speed, and portability of research processes and products. Like any other cyberinfrastructure, cloud computing deployment in scientific research requires arrangements of technologies, people, organizations, and institutions to ensure smooth functioning and sustainability. To date, published discussions of cloud computing's promise have often placed technological capabilities and financial benefits front-and-center. These accounts are in line with traditional return-on-investment models for assessing a new technology's viability. The attention to ROI, however, has left assessment of the "invisible" work (and costs) placed on human actors in reaching cloud computing's promise relatively unexplored. In this paper, I report on the findings of 45 interviews with career researchers, research support staff, and student researchers engaged in cloud vendor-enabled research. The purpose of conducting the interviews was to identify commonalities in the labor required to start, maintain, or migrate research processes to the cloud via vendors (e.g., AWS, Azure, and Google Cloud). Two central types of "invisible" labor emerged as themes across the interviews: absorbing the time costs of learning new skills to migrate research to the cloud and managing billing for multiple, decentralized projects. In the discussion, I contextualize these two types of seemingly mundane work in broader debates about the burden new cloud computing technologies might place on scientists and research support staff. I conclude by suggesting that continued documentation and analysis of cloud research-enabling labor is needed to ensure that these invisible costs are understood and, perhaps in the future, shared appropriately among vendors, universities, and researchers.
J. Eric Coulter, Eroma Abeysinghe, Sudhakar Pamidighantam, Marlon Pierce
We discuss our work providing resources for batch computing via the Jetstream cloud, in the form of SLURM clusters. While these are mainly used by science gateways, a few have been used in the more traditional command-line manner. The flexible nature of these clusters has also lent itself well to educational work, and has provided the basis for a very successful series of tutorials and workshops. This paper discusses the technical evolution of the Virtual Cluster product, and gives an overview of the science enabled. We discuss the challenges in supporting an ecosystem of these virtual clusters, and in supporting research on cloud resources in general.
John Mulligan
This paper outlines an attempt to migrate some humanistic research into the cloud. This undertaking raises a number of questions, but it will be clarifying to focus on two: why would humanists want to use the cloud, and why should the cloud have more active humanists on it? In order to properly answer these questions, the paper proceeds first by examining what we mean by "the cloud" when we talk about it in our roles as academic research computing specialists; second, by laying out my particular use case's processes and products in this context; and finally, by reflecting on why this might be significant for both research computing and what is broadly called the digital humanities.