A platform for eXtreme Analytics

Cited by: 1
Authors
Balmin, Andrey [1]
Beyer, Kevin [2]
Ercegovac, Vuk [1]
McPherson, John [1]
Oezcan, Fatma [1]
Pirahesh, Hamid [1]
Shekita, Eugene [3]
Sismanis, Yannis [1]
Tata, Sandeep [1]
Tian, Yuanyuan [1]
Affiliations
[1] IBM Res Div, Almaden Res Ctr, San Jose, CA 95120 USA
[2] Platfora Inc, San Mateo, CA 94401 USA
[3] Google Inc, Mountain View, CA 94043 USA
Keywords
DOI
10.1147/JRD.2013.2242693
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the rapid increase in the volume of data that enterprises produce, they are adopting large-scale data processing platforms such as Hadoop® to store and manage that data and to run deep analytics that yield actionable insights from their "big data." At IBM Research - Almaden, we have been helping enterprise customers build solutions exploiting data-intensive analytics. Our deep experience with actual users has led to an extensive understanding of the platform requirements needed to support these solutions, and our goal is to provide a powerful analytics platform, which we call the eXtreme Analytics Platform (XAP), that can be used to create solutions for customer problems that have not been economically feasible to solve until now. XAP provides Jaql [i.e., JavaScript® Object Notation (JSON) query language], a scripting language to specify data flows; tools and techniques to optimize the runtime execution of these flows; an improved task scheduler; connectors to data warehouses; and libraries for advanced analytics. Many of these technologies have been transferred to the IBM InfoSphere BigInsights™ product. In this paper, we describe the overall design principles and technology of XAP.
Pages: 11
Related papers
28 in total
[1]  
Abadi DJ, 2007, PROC INT CONF DATA, P441
[2]  
[Anonymous], 2005, Scientific Programming
[3]  
[Anonymous], 2006, NIPS
[4]  
[Anonymous], 2011, Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)
[5]  
[Anonymous], Q B IEEE TC DATA ENG
[6]  
[Anonymous], 2009, Proceedings of the VLDB Endowment
[7]  
Balakrishnan S., 2010, Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD'10, P1187
[8]  
Beyer KS, 2011, PROC VLDB ENDOW, V4, P1272
[9]  
Das S., 2010, ACM SIGMOD INT C MAN, P987, DOI 10.1145/1807167.1807275
[10]  
Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '04), P137