The platform handles all data in the form of datasets. A dataset can contain an entire tree of files and folders and can be encrypted or compressed. Each dataset has a set of metadata values that are indexed in an Elasticsearch instance and stored as iRODS metadata.
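To make the dual storage of metadata concrete, the following minimal sketch models a dataset record and derives two views from it: a flat document of the kind that could be sent to a search index, and a list of attribute-value-unit triples in the shape iRODS uses for metadata. The class name `Dataset`, its field names and both helper methods are hypothetical and chosen only for illustration; they are not the platform's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical dataset record; all field names are assumptions."""

    dataset_id: str
    root_path: str  # top of the dataset's tree of files and folders
    encrypted: bool = False
    compressed: bool = False
    metadata: dict[str, str] = field(default_factory=dict)

    def to_search_document(self) -> dict:
        """Flat document that could be indexed in a search engine."""
        return {
            "dataset_id": self.dataset_id,
            "encrypted": self.encrypted,
            "compressed": self.compressed,
            **self.metadata,
        }

    def to_avus(self) -> list[tuple[str, str, str]]:
        """Metadata as (attribute, value, unit) triples, sorted by attribute.

        The unit is left empty because the sketch carries no unit information.
        """
        return [(key, value, "") for key, value in sorted(self.metadata.items())]


example = Dataset(
    dataset_id="ds-001",
    root_path="/zone/project/ds-001",
    compressed=True,
    metadata={"title": "Test run", "creator": "alice"},
)
search_document = example.to_search_document()
avus = example.to_avus()
```

Deriving both views from one record keeps the indexed copy and the stored copy of the metadata consistent by construction, which is the property the dual storage described above depends on.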

The data management APIs exposed by the platform cover data upload, download, staging, and transfer, as well as metadata definition and modification. This section describes the services implementing these APIs and how they are deployed at each location.

Currently, iRODS zones are the main system for storing data and its metadata in the platform; integration with object storage systems such as S3 or Swift is planned.

Integration with the LEXIS Distributed Data Interface is realised through a component called the staging worker. The worker uses ordinary user-space SSH tools (SFTP, SCP, rsync) to transfer data and connects to local or remote object storage systems. It operates outside the HPC cluster and is usually deployed as a Docker container in a virtual machine.
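As an illustration of how a worker of this kind might drive user-space SSH tools, the sketch below assembles the argument list for an rsync or SCP transfer of a directory tree to a remote host. The function `build_transfer_command`, its parameters and the host and path values are assumptions made for this example and do not describe the staging worker's actual implementation. Only standard options are used: `-a` and `-e ssh` for rsync, and `-r` for scp.

```python
def build_transfer_command(
    tool: str, source: str, host: str, destination: str, user: str
) -> list[str]:
    """Return the argument list for copying `source` to `user@host:destination`.

    Hypothetical helper: it only builds the command and does not execute it.
    """
    remote = f"{user}@{host}:{destination}"
    if tool == "rsync":
        # -a preserves the directory tree, permissions and timestamps;
        # -e ssh selects SSH as the remote shell transport.
        return ["rsync", "-a", "-e", "ssh", source, remote]
    if tool == "scp":
        # -r copies directories recursively.
        return ["scp", "-r", source, remote]
    raise ValueError(f"unsupported transfer tool: {tool}")


command = build_transfer_command(
    tool="rsync",
    source="/staging/ds-001/",
    host="hpc.example.org",
    destination="/scratch/ds-001/",
    user="alice",
)
```

Returning an argument list rather than a shell string allows the command to be passed to `subprocess.run(command, check=True)` without shell quoting issues. Such a call needs only an SSH account on the target system, which is consistent with a worker that runs outside the HPC cluster.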