The NIH has established a number of new cloud-based computing platforms to facilitate collaboration and maximize the scientific utility of genomic and linked biomedical data. Compared with pre-existing data sharing mechanisms, these platforms offer new and potentially more efficient alternatives for accessing, storing, and analyzing data, promising to widen access and promote data science equity. To better understand platform policies and procedures, as well as the experiences of platform users, we performed semi-structured interviews with 21 platform developers and 22 early adopters across five platforms. Interviews with developers focused on the promise of these new data sharing mechanisms, while interviews with users focused largely on the perceived benefits and challenges of cloud-based data sharing and analysis. The platform developers we interviewed routinely invoked the “democratizing” potential of cloud-based data sharing, while also recognizing barriers to achieving that promise. They understood democratization as a platform’s ability to broaden or expand data access and sharing, including with non-traditional users or institutions. Early adopters were more equivocal about platforms’ ability to democratize: interviewees noted that users must negotiate both a steep learning curve and the management of compute costs, either of which could interfere with broader adoption. These findings, and the observed differences in perspective between developers and users, raise important questions about whether cloud platforms can broaden both data access and data use, and thereby fulfill the claims that underlie their substantial public investment. Additional research will be needed to better understand whether and how cloud-based data sharing does indeed promote democratization.
Authors: Sarah Nelson, University of Washington; Jacklyn Dahlquist, University of Washington; Stephanie Malia Fullerton, University of Washington