This is the documentation for the ScienceBox project.
We recommend starting with the Overview.
If you find a typo or want to contribute to the documentation, see the Contribution Guidelines.
ScienceBox is an integrated software bundle with storage and computing services for general-purpose and scientific use. It features a container-based version of distributed, scalable storage, sync and share functionality, and a web-based data analysis service. It can be deployed on a single machine or scaled out across multiple servers using Helm and Kubernetes.
ScienceBox delivers an integrated set of services for storage, sync and share, and data analysis, and enables their deployment on a Kubernetes cluster using the Helm package manager:
What is it good for?: ScienceBox uses Helm to package its services so that they can be installed on a Kubernetes cluster with a single command, instead of writing YAML manifests by hand.
What is it not good for?: ScienceBox is an all-in-one package for deploying CERN services, which brings its share of complexity, stacked services and inter-dependencies. If you only want to run SWAN, EOS or CERNBox independently, it is better to look at other deployment options.
Head to the Getting Started section to get started with ScienceBox.
ScienceBox uses current cloud-native technologies to package and distribute services that can run on any cloud, be it on-premise or commercial. It relies on Helm to manage and install all Kubernetes manifests. We use the concept of “Umbrella Charts”: ScienceBox declares dependencies on various upstream sub-charts, which makes it highly pluggable and configurable and gives users the flexibility to install and configure each sub-chart as they want.
There are two ways to install ScienceBox charts:
Before discussing the installation methods, note that certain prerequisites need to be installed:
To install and run ScienceBox on your Kubernetes cluster, a set of tools and software must be installed first. ScienceBox has been developed and tested on:
We provide a demonstrator version of ScienceBox, called mboxed, that installs all the Helm charts on minikube. Mboxed is a one-click installation of ScienceBox and can be considered a self-contained, containerized demo of cloud storage and computing services for scientific and general-purpose use.
A dedicated repository contains all the installation scripts and provides a single script that installs all the workloads on a minikube-based Kubernetes cluster.
Follow these steps to install ScienceBox on your cluster:
# clone the repo
$ git clone https://github.com/sciencebox/mboxed.git
$ cd mboxed
# install the required software
$ ./SetupInstall.sh
# Install sciencebox
$ ./ScienceBox.sh
After the installation, ScienceBox is available at https://${HOSTNAME}/sciencebox
ScienceBox can also be installed on a multi-node Kubernetes cluster. Helm simplifies the deployment process and gets your workloads up and running with just a couple of commands. (This section assumes that all the prerequisites are already installed on your machine.)
To install the ScienceBox umbrella chart on your Kubernetes cluster:
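Before running the Helm commands, it can help to confirm that the required command-line tools are actually on the PATH. The following is a minimal sketch, not part of ScienceBox itself: the function name `missing_tools` is hypothetical, and the tool list in the example (kubectl and helm) is an assumption based on the commands used in this guide.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that is not on PATH.
# (Hypothetical helper; not shipped with ScienceBox.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (the tool list is an assumption, adjust it to your setup):
#   missing="$(missing_tools kubectl helm)"
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

The helper only reports whether a tool is present; it does not check versions.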
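helm repo add sciencebox https://registry.cern.ch/chartrepo/sciencebox
# Helm 3 expects a release name before the chart reference; "sciencebox" is an example name
helm install sciencebox sciencebox/sciencebox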
Note that certain parameters must be configured before running the installation. The configuration for these parameters can be found here.
After the charts are installed, users can access ScienceBox at https://${HOSTNAME}/sciencebox, where they are greeted by a welcome screen.
The services can be accessed through the following URLs:
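One common way to configure a chart before installing it is to dump its default values, edit a copy, and pass that file to `helm install`. The sketch below uses the standard `helm show values` and `helm install --values` commands. The function names, the release name and the file name are placeholders chosen for illustration, and the actual parameters to change are the ones described in the configuration reference.

```shell
#!/bin/sh
# Dump the chart's default values so they can be reviewed and edited.
# Arguments: chart reference, output file.
dump_default_values() {
    helm show values "$1" > "$2"
}

# Install the chart under a release name using an edited values file.
# Arguments: release name, chart reference, values file.
install_with_values() {
    helm install "$1" "$2" --values "$3"
}

# Example (release name and file name are placeholders):
#   dump_default_values sciencebox/sciencebox my-values.yaml
#   "${EDITOR:-vi}" my-values.yaml
#   install_with_values sciencebox sciencebox/sciencebox my-values.yaml
```

Keeping the overrides in a file rather than in `--set` flags makes the configuration easier to review and to reuse for later upgrades.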
https://${HOSTNAME}/sciencebox
https://${HOSTNAME}/swan
https://${HOSTNAME}
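To quickly probe the URLs above after an installation, something like the following sketch could be used. The function names are hypothetical, and the `-k` flag assumes the deployment serves a self-signed certificate; drop it if your installation uses a trusted one.

```shell
#!/bin/sh
# Print the three service URLs listed above for a given host name.
service_urls() {
    host="$1"
    printf 'https://%s/sciencebox\n' "$host"
    printf 'https://%s/swan\n' "$host"
    printf 'https://%s\n' "$host"
}

# Print "<HTTP status> <URL>" for each endpoint.
# -k skips certificate verification (assumption: self-signed certificate).
check_services() {
    service_urls "$1" | while read -r url; do
        code="$(curl -k -s -o /dev/null -w '%{http_code}' "$url")"
        printf '%s %s\n' "$code" "$url"
    done
}

# Example:
#   check_services "$(hostname)"
```

curl reports a status of 000 when it cannot connect at all, which usually means the ingress is not reachable yet.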
ScienceBox is a complex project with many interdependencies, which can be daunting for a user or a potential contributor. This section tries to make life easier by describing the core concepts behind the ScienceBox project and how the solution was built from the ground up.
As already mentioned, ScienceBox is a software bundle packaged as a Helm chart to deploy CERN IT services on Kubernetes. These services are themselves complex pieces of software that are deployed independently at CERN. (Side note: not all of the services offered by ScienceBox run natively on Kubernetes at CERN.)
ScienceBox was created so that all of these services, namely CERNBox, EOS, CVMFS and SWAN, can be deployed outside CERN with ease. Helm charts proved to be the most hassle-free way to ship them as a package that can be easily deployed on a Kubernetes cluster, and were therefore chosen as the packaging solution. Besides easing deployment, a Helm chart is also highly configurable, so the deployment can be tailored to one's needs.
ScienceBox is a single Helm chart that contains multiple sub-charts, which together function as a whole. The Helm community refers to this practice as an “Umbrella” chart, and it is the de facto standard for embedding several components into a single package. The ScienceBox chart declares dependencies on the CERNBox, EOS, and SWAN “sub-charts”. This can be visualized with the architecture below:
As seen in the architecture above, ScienceBox embeds all the individual components and configures them to run together. Besides the major components, ScienceBox also requires some “satellite components” to glue the services together. The detailed workings of each service and its corresponding glue components are described in the respective subsections.
To summarize, the ScienceBox umbrella consists of the following sub-charts:
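The sub-charts that an umbrella chart depends on can be inspected with standard Helm commands. The sketch below assumes the `sciencebox` repository has already been added and that the chart declares its dependencies in `Chart.yaml` (Helm chart API v2); the function names are hypothetical.

```shell
#!/bin/sh
# Print the chart metadata (Chart.yaml) of a chart reference.
# For an umbrella chart, the "dependencies" section lists the sub-charts.
show_chart_metadata() {
    helm show chart "$1"
}

# Download and unpack a chart into the current directory, then list the
# sub-charts it depends on together with their versions and repositories.
# Arguments: chart reference, name of the unpacked chart directory.
list_subcharts() {
    helm pull "$1" --untar &&
        helm dependency list "./$2"
}

# Example:
#   show_chart_metadata sciencebox/sciencebox
#   list_subcharts sciencebox/sciencebox sciencebox
```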
Satellite Components:
This section gives a brief overview of how CERNBox functions as part of ScienceBox. Configuring and running CERNBox is a daunting task, mainly because of the many microservices and satellite components that run as part of the deployment. This section aims to make the CERNBox deployment easier to understand.
The architecture below depicts the CERNBox deployment.
As seen in the architecture, running CERNBox on Kubernetes involves many components:
Public, User and Home services are the CERNBox storage services that interface with EOS.
Bearer service is the CERNBox authentication service.
All of the elements described above run as Kubernetes pods (deployment/statefulset) and interact with each other via the Kubernetes service mechanism.
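To see these pods and services on a running cluster, the workloads in the target namespace can be listed with `kubectl get`. This is a minimal sketch: the function name is hypothetical, and the namespace is an assumption that defaults to `default`, so pass the namespace your release was installed into.

```shell
#!/bin/sh
# List the Deployments, StatefulSets, pods and Services in a namespace.
# Argument: namespace the chart was installed into (defaults to "default").
list_workloads() {
    kubectl get deployments,statefulsets,pods,services \
        --namespace "${1:-default}"
}

# Example:
#   list_workloads
```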
This section gives a brief overview of how SWAN is used as part of ScienceBox. Running SWAN as part of ScienceBox is relatively easy, since we “mimic” the upstream SWAN deployment. The upstream documentation for SWAN can be found here. ScienceBox uses the upstream SWAN charts and configures them to run with the custom OCIS IDP, which CERNBox also uses for authentication.
As seen in the architecture, whenever a request arrives at the /swan endpoint, the ingress routes it to the running SWAN instance, which then uses the OCIS IDP for authentication and the EOS deployment as its storage backend. The components involved are:
ScienceBox relies on Helm charts to template, package and deploy all the ScienceBox services. A Helm chart helps you define, install and upgrade a Kubernetes application.
All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose. Consult GitHub Help for more information on using pull requests.
Here’s a quick guide to get started with ScienceBox. It assumes you’re familiar with the GitHub workflow:
If you want to run your own local Kubernetes cluster to preview your changes as you work (note: we suggest using Minikube to run and test your services):
Follow the instructions in Getting started to clone and install ScienceBox and the other pre-requisite tools.
Clone the Mboxed repo:
git clone https://github.com/sciencebox/mboxed.git
Edit the etc/deploy.sh file in mboxed to point the helm install command to the locally checked-out ScienceBox charts.
Run ./ScienceBox.sh to install the charts into your local Kubernetes cluster for testing.
Continue with the usual GitHub workflow to edit files, commit them, push the changes up to your fork, and create a pull request.
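Before installing locally modified charts, it can be useful to validate and render them without touching the cluster. The sketch below relies on the standard `helm dependency update`, `helm lint` and `helm template` commands. The function names and the chart path in the example are hypothetical, so adjust the path to wherever your checkout keeps the umbrella chart.

```shell
#!/bin/sh
# Validate a locally checked-out chart before installing it.
# Argument: path to the chart directory.
check_local_chart() {
    # Fetch the sub-charts the chart depends on, then run static checks.
    helm dependency update "$1" &&
        helm lint "$1"
}

# Render the manifests locally without installing anything.
# Arguments: release name, path to the chart directory.
render_local_chart() {
    helm template "$1" "$2"
}

# Example (the path is a placeholder for your local checkout):
#   check_local_chart ./charts/sciencebox
#   render_local_chart sciencebox ./charts/sciencebox > rendered.yaml
```

Rendering to a file makes it easy to diff the generated manifests before and after a change.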
If you’ve found a problem in the docs, but you’re not sure how to fix it yourself, please create an issue in the ScienceBox repo. You can also create an issue about a specific page by clicking the Create Issue button in the top right hand corner of the page.