What you get
The BigData Analysis GE is intended to provide the means for analyzing both batch and stream data, in order to ultimately gain insights from such data.
The batch part has been widely developed through the adoption and/or in-house creation of the following tools:
- A Hadoop As A Service (HAAS) engine, either the "official" one based on Openstack's Sahara, or the light version based on a shared Hadoop cluster.
- Cosmos GUI, for managing Cosmos accounts.
- Cosmos Auth, managing OAuth2-based authentication tokens for Cosmos REST APIs.
- Cosmos Proxy, enforcing OAuth2-based authentication and authorization for Cosmos REST APIs.
- Cosmos Hive Auth Provider, a custom auth provider for HiveServer2 based on OAuth2.
- Tidoop API for MapReduce job submission and management.
- Cosmos Admin, a set of admin scripts.
- Cygnus, a tool for connecting Orion Context Broker with several data stores, HDFS and STH (see below) included, in order to build historical records of context data.
- STH Comet, a MongoDB-based context data store for short-term historical data.
The streaming part likewise combines adopted and in-house components:
- A Storm As A Service (SAAS) engine based on a shared Storm cluster.
- Sinfonier Backend.
- Sinfonier Frontend.
- Sinfonier Frontend API.
Why get it
Cosmos and Sinfonier are mainly addressed to service providers aiming to expose BigData Analysis GE-like services. For those service providers, data analysis is not a goal in itself; the goal is to provide ways for others to perform such analysis. This especially applies to the Openstack Sahara-based installation.
If you are a data scientist who wants to get insights from certain data, or a software engineer in charge of productizing an application based on a previous data science analysis, then please visit the User and Programmer Guide and/or go directly to the FIWARE Lab Global Instances of Cosmos and/or Sinfonier. There you will find an already deployed infrastructure ready to be used through the different APIs.
If you do not want to rely on the FIWARE Lab Global Instances of Cosmos and/or Sinfonier and you still want to use Hadoop and/or Storm, do not install Cosmos or Sinfonier; that would be like installing a complete cloud just to create a single virtual machine. Instead, simply install a private instance of Hadoop and/or Storm!
If you still have doubts, the flow diagram below is intended to help you identify which kind of Big Data user you are (if any).
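The guidance in the paragraphs above can be summarized as a small decision function. This is only an illustrative sketch of that guidance: the names `Recommendation` and `recommend`, and the two boolean inputs, are hypothetical and are not part of any Cosmos or Sinfonier API.

```python
from enum import Enum


class Recommendation(Enum):
    """Possible outcomes, paraphrasing the guidance in this section."""
    INSTALL_COSMOS_SINFONIER = "install Cosmos and/or Sinfonier"
    USE_GLOBAL_INSTANCES = "use the FIWARE Lab Global Instances"
    INSTALL_PRIVATE_CLUSTER = "install a private Hadoop and/or Storm instance"


def recommend(is_service_provider: bool,
              accepts_global_instances: bool) -> Recommendation:
    """Return a recommendation for a given kind of user (hypothetical helper).

    is_service_provider: True if the goal is to offer analysis services
        to others rather than to analyze data directly.
    accepts_global_instances: True if relying on the shared FIWARE Lab
        Global Instances is acceptable.
    """
    if is_service_provider:
        # Service providers are the main audience of Cosmos and Sinfonier.
        return Recommendation.INSTALL_COSMOS_SINFONIER
    if accepts_global_instances:
        # Data scientists and engineers can use the deployed infrastructure.
        return Recommendation.USE_GLOBAL_INSTANCES
    # Otherwise a plain private Hadoop/Storm installation is enough.
    return Recommendation.INSTALL_PRIVATE_CLUSTER


if __name__ == "__main__":
    print(recommend(is_service_provider=False,
                    accepts_global_instances=True).value)
```

The service-provider check comes first because, per the text above, it is the only case in which installing Cosmos or Sinfonier is recommended at all.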