The Modern Research Data Portal: a design pattern for networked, data-intensive science

Cited by: 21
Authors
Chard, Kyle [1, 2]
Dart, Eli [3]
Foster, Ian [1, 2]
Shifflett, David [1, 2]
Tuecke, Steven [1, 2]
Williams, Jason [1, 2]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
[2] Argonne Natl Lab, Lemont, IL 60439 USA
[3] Lawrence Berkeley Natl Lab, Energy Sci Network, Berkeley, CA USA
Source
PeerJ Computer Science | 2018
Funding
National Science Foundation (USA);
Keywords
Portal; High-speed network; Globus; Science DMZ; Data transfer node; GATEWAYS;
DOI
10.7717/peerj-cs.144
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs. We introduce the design pattern; explain how it leverages high-performance data enclaves and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities. Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
Pages: 30