Concepts

Deep dive into ScienceBox charts and sub-components

ScienceBox is a complex project with many interdependencies, and it can be daunting for a user or a potential contributor. This section tries to make life easier by describing the core concepts behind the ScienceBox project and how the solution was built from the ground up.

1 - Architecture Reference

ScienceBox architecture and component reference

As already mentioned, ScienceBox is a software bundle packaged as a Helm chart to deploy CERN IT services on Kubernetes. These services are themselves complex software systems that are deployed independently at CERN. (Side note: not all of the services offered by ScienceBox run natively on Kubernetes at CERN.)

ScienceBox was created so that CERNBox, EOS, CVMFS, and SWAN could be deployed outside CERN with ease. Helm charts proved to be the most hassle-free way to ship all of these services in a package that can be easily deployed on a Kubernetes cluster, and were therefore chosen to package them. Beyond ease of deployment, a Helm chart is also highly configurable, so users can tailor the deployment to their needs.
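To make the configurability point concrete, the sketch below imitates, in simplified form, how user-supplied values are layered over a chart's defaults: nested mappings are merged key by key, while scalars and lists are replaced. It is an illustration only, not Helm's implementation, and every key and value in it is hypothetical rather than taken from the ScienceBox chart.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied.

    Nested mappings are merged key by key; any other value
    (scalar or list) in overrides replaces the default.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults and overrides; these keys are NOT from the ScienceBox chart.
chart_defaults = {
    "ingress": {"enabled": True, "hostname": "sciencebox.local"},
    "storage": {"replicas": 4},
}
user_overrides = {"ingress": {"hostname": "box.example.org"}}

effective = merge_values(chart_defaults, user_overrides)
print(effective)
```

Only the overridden key changes; untouched defaults such as the hypothetical `storage.replicas` are preserved, which is why a small override file is usually enough to adapt a deployment.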

ScienceBox is a single Helm chart that contains multiple subcharts, which together function as a whole. The Helm community refers to this practice as an “umbrella” chart, and it is the de facto standard for embedding several components into a single package. The ScienceBox chart declares dependencies on the CERNBox, EOS, and SWAN “sub-charts”. This can be visualized with the architecture below:

As seen in the above architecture, ScienceBox embeds all the individual components and configures them to run together. Along with the major components, ScienceBox also requires some “satellite components” to glue the services together. The detailed working of each service and its corresponding glue components is described in the respective subsections.

To summarize, the ScienceBox umbrella consists of the following sub-charts:

  • CERNBox Charts:
    • Revad Charts - Backbone of CERNBox, interoperability platform for sync and share systems.
      • 3 StorageProviders - Interface to EOS
      • AuthProvider - Revad Authentication service
    • CERNBox Web - Nginx server that serves CERNBox web.
    • OwnCloud Infinite Scale Charts - oCIS charts to run the oCIS extensions IDP and Proxy. The IDP (Identity Provider) is used for authentication.
  • EOS Charts:
    • 1 MGM - headnode of the cluster
    • 4 FST - storage daemons to write files’ payload
    • 3 QDB - highly available namespace and instance configuration
  • SWAN Charts
    • Fusex - EOS Client
    • JupyterHub - Upstream JupyterHub charts
  • CVMFS Charts
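As a rough illustration of the umbrella pattern, the following sketch models the four top-level sub-charts as entries in the `dependencies` list of a Helm `Chart.yaml` (API version `v2`) and evaluates each dependency's `condition`. The chart names, versions, repository URL, and condition keys are assumptions made for this example; the actual ScienceBox `Chart.yaml` may differ.

```python
from dataclasses import asdict, dataclass


@dataclass
class Dependency:
    name: str
    version: str      # hypothetical version
    repository: str   # hypothetical repository URL
    condition: str    # dotted values path that toggles the sub-chart


def umbrella_chart(name: str, deps: list) -> dict:
    """Build a dict shaped like a Helm v2 Chart.yaml with dependencies."""
    return {
        "apiVersion": "v2",
        "name": name,
        "version": "0.0.0",  # placeholder
        "dependencies": [asdict(d) for d in deps],
    }


def lookup(values: dict, dotted: str):
    """Resolve a dotted path in a nested dict; return None if it is absent."""
    node = values
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def enabled_subcharts(chart: dict, values: dict) -> list:
    """A sub-chart is disabled only if its condition resolves to False."""
    return [
        dep["name"]
        for dep in chart["dependencies"]
        if lookup(values, dep["condition"]) is not False
    ]


REPO = "https://charts.example.org"  # hypothetical
chart = umbrella_chart(
    "sciencebox",
    [Dependency(n, "0.0.0", REPO, f"{n}.enabled")
     for n in ("cernbox", "eos", "swan", "cvmfs")],
)

print(enabled_subcharts(chart, {"cvmfs": {"enabled": False}}))
```

With the hypothetical value `cvmfs.enabled: false`, the remaining three sub-charts stay enabled, since a condition that is absent from the values leaves a dependency switched on.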

Satellite Components:

2 - CERNBox Charts

CERNBox charts and description

This section gives a brief overview of how CERNBox functions as part of ScienceBox. Configuring and running CERNBox can be daunting, mainly because of the many microservices and satellite components that run as part of the deployment. This section aims to make the CERNBox deployment easier to understand.

The architecture below depicts the CERNBox deployment.

As seen in the architecture, running CERNBox on Kubernetes involves many components:

  • OCIS Proxy: Web proxy provided by ownCloud that forwards incoming requests to the Reva services.
  • OCIS IDP: OAuth provider by ownCloud, backed by an LDAP server.
  • CERNBox Web: CERNBox Web component.
  • Reva Services:
    • Storage Services: Public, User and Home services are the CERNBox storage services that interface with EOS.
    • Auth Service: Bearer service is the CERNBox authentication service.
  • MariaDB: Database that stores all the CERNBox share information.

All of the elements described above run as Kubernetes pods (managed by a Deployment or StatefulSet) and interact with each other via the Kubernetes Service mechanism.
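Communication through Kubernetes Services typically means that one component reaches another by the Service's cluster DNS name, which has the form `<service>.<namespace>.svc.<cluster-domain>`. The helper below is a minimal sketch of that naming scheme; the service name, namespace, and port used in the example are hypothetical and are not taken from the ScienceBox charts.

```python
def service_dns(service: str, namespace: str = "default",
                cluster_domain: str = "cluster.local") -> str:
    """Return the cluster-internal DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(service: str, port: int, namespace: str = "default",
                scheme: str = "http") -> str:
    """Return a URL that another pod could use to reach the Service."""
    return f"{scheme}://{service_dns(service, namespace)}:{port}"


# Hypothetical names: a database Service in a "sciencebox" namespace.
print(service_dns("cernbox-mariadb", "sciencebox"))
print(service_url("cernbox-web", 8080, "sciencebox"))
```

Because pods address each other by Service name rather than by pod IP, individual pods can be rescheduled or replaced without reconfiguring their peers.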

3 - SWAN Charts

SWAN charts and description

This section gives a brief overview of how SWAN is used as part of ScienceBox. Running SWAN as part of ScienceBox is relatively easy because we “mimic” the upstream SWAN deployment; the upstream SWAN documentation covers that deployment in detail. ScienceBox uses the upstream SWAN charts and configures them to authenticate against the custom OCIS IDP (which CERNBox also uses for authentication).

As seen in the architecture, whenever a request arrives at the /swan endpoint, the ingress routes it to the running SWAN instance, which then uses the OCIS IDP for authentication and the EOS deployment as its storage backend. The components involved are:

  • OCIS IDP: OAuth Provider
  • EOS: Storage Provider
  • SWAN: Upstream SWAN charts
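The routing step described above can be sketched as a longest-prefix match on the request path, similar in spirit to the `Prefix` path type of a Kubernetes Ingress, which matches whole path segments. Only the `/swan` prefix comes from the text; the backend names and the catch-all `/` route are assumptions added for illustration.

```python
from typing import Optional

# Hypothetical routing table: only the "/swan" prefix is taken from the text.
ROUTES = {
    "/swan": "swan",
    "/": "default-backend",
}


def route(path: str, routes: dict) -> Optional[str]:
    """Return the backend for the longest prefix matching whole path segments."""
    matches = [
        prefix for prefix in routes
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        return None
    return routes[max(matches, key=len)]


for request_path in ("/swan", "/swan/hub/login", "/swanky", "/index.html"):
    print(request_path, "->", route(request_path, ROUTES))
```

Matching on whole segments means `/swan/hub/login` reaches the SWAN backend, while a path such as `/swanky` falls through to the catch-all route.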