What you get

The BigData Analysis GE is intended to provide the means for analyzing both batch and stream data, with the ultimate goal of gaining insights from that data.

The batch part has been extensively developed through the adoption and/or in-house creation of the following tools:

  • A Hadoop As A Service (HAAS) engine, either the "official" one based on OpenStack's Sahara or the light version based on a shared Hadoop cluster.
  • Cosmos GUI, for managing Cosmos accounts.
  • Cosmos Auth, managing OAuth2-based authentication tokens for Cosmos REST APIs.
  • Cosmos Proxy, enforcing OAuth2-based authentication and authorization for Cosmos REST APIs.
  • Cosmos Hive Auth Provider, a custom auth provider for HiveServer2 based on OAuth2.
  • Tidoop API for MapReduce job submission and management.
  • Cosmos Admin, a set of admin scripts.

The batch part is enriched with the following tools, which make up the so-called Cosmos Ecosystem:

  • Cygnus, a tool for connecting Orion Context Broker to several data stores, including HDFS and STH (see below), in order to build historical records of context data.
  • STH Comet, a MongoDB-based store for short-term historical context data.

The streaming part likewise combines adopted tools with components created in-house.

Why get it

Cosmos and Sinfonier are mainly aimed at service providers who want to expose BigData Analysis GE-like services. For those service providers, data analysis is not a goal in itself; the goal is to provide ways for others to perform such analysis. This applies especially to the OpenStack Sahara-based installation.

If you are a data scientist looking for insights into certain data, or a software engineer in charge of turning a data scientist's earlier analysis into a production application, please visit the User and Programmer Guide and/or go directly to the FIWARE Lab Global Instances of Cosmos and/or Sinfonier. There you will find an already deployed infrastructure, ready to be used through the different APIs.

If you don't rely on the FIWARE Lab Global Instances of Cosmos and/or Sinfonier and you simply want to use Hadoop and/or Storm, do not install Cosmos or Sinfonier: that would be like installing a complete cloud just to create a single virtual machine. Instead, simply install a private instance of Hadoop and/or Storm.

If you still have doubts, the flow diagram below is meant to help you identify which kind of Big Data user you are (if any).

