Genomic Platform - iGAP
Integrated Genomic Analysis Platform
At Zettagene, we believe that the analysis of genomic data should be straightforward and efficient. We do not want our users to have to worry about the security, performance, storage or integrity of their computing environment.
After several research projects in the genomic data field, we developed a solution that addresses the most important challenges that genomics poses to computer science. It uses the latest advances in Big Data technologies and cloud computing, and can be deployed on-premises or in the cloud.
Our Design Principles
Distributed Pipelines
We use standard, well-known pipelines for automated processing of genomic data, adapted to a distributed architecture for better performance and maintainability.
Multi-sample Analysis
We optimize query algorithms for the specific characteristics of genomic data to allow efficient data access. iGAP is designed to store large WES/WGS datasets and reprocess them efficiently on demand.
Open Source Technologies
iGAP is built on well-established Big Data technologies, integrated to provide an open but consistent architecture.
Common Data Model
Genomics cannot be only about files. We use a defined data model to bring data from different sources and pipelines into one unified data environment.
Data Security & Governance
All-or-nothing data access is not enough! iGAP provides enterprise-ready access mechanisms with granularity down to a single variant, plus audit capabilities for GDPR compliance.
Flexible Data Access
We provide a JupyterLab interface and the option to run your own analyses with external tools, combining variants, genomic intervals and phenotypic information.
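To give a feel for this kind of notebook analysis, here is a minimal sketch in plain Python that combines variants, genomic intervals and phenotypic information. It is an illustration only and does not use the iGAP API: the Variant and Interval classes, the variants_in_intervals function, and all sample names, interval names and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    sample_id: str
    contig: str
    pos: int   # 1-based position
    ref: str
    alt: str


@dataclass(frozen=True)
class Interval:
    name: str
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def variants_in_intervals(variants, intervals, phenotypes):
    """Yield (interval name, variant, phenotype) for each variant inside an interval.

    `phenotypes` maps a sample id to a phenotype label; samples without
    an entry get None.
    """
    for v in variants:
        for iv in intervals:
            if v.contig == iv.contig and iv.start <= v.pos <= iv.end:
                yield iv.name, v, phenotypes.get(v.sample_id)


# Made-up example data, for illustration only.
variants = [
    Variant("S1", "chr1", 150, "A", "G"),
    Variant("S2", "chr1", 500, "C", "T"),
    Variant("S1", "chr2", 150, "G", "A"),
]
intervals = [Interval("REGION_A", "chr1", 100, 200)]
phenotypes = {"S1": "case", "S2": "control"}

for name, variant, phenotype in variants_in_intervals(variants, intervals, phenotypes):
    print(name, variant.sample_id, variant.pos, phenotype)
```

The nested loop keeps the sketch short. At genomic scale, an interval index or a distributed join would take its place.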
iGAP Features
Automated Secondary & Tertiary Analysis
iGAP uses standard bioinformatics pipelines (e.g. bcbio) prepared for highly distributed environments built on Apache Spark. This approach leads to significant performance gains. All workflows are automated and task execution can be monitored from a single place.
Optimized Genomic Operations
Regular databases and query algorithms were not designed for genomic data at scale. iGAP includes our analytics engine, SeQuiLa, providing efficient querying and processing of genomic intervals for depth of coverage analysis and quality control at scale.
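As an illustration of what a depth of coverage computation involves, the sketch below counts how many reads cover each stretch of a contig using a simple sweep over interval start and end points. This is not SeQuiLa's implementation or API. The depth_of_coverage function is a hypothetical name, and the 0-based, half-open coordinates are an assumption made for this example.

```python
from collections import defaultdict


def depth_of_coverage(reads):
    """Return {contig: [(start, end, depth), ...]} for covered regions.

    `reads` is an iterable of (contig, start, end) tuples with 0-based,
    half-open coordinates (an assumption of this sketch).
    """
    # Record +1 where a read starts covering and -1 where it stops.
    events = defaultdict(lambda: defaultdict(int))
    for contig, start, end in reads:
        events[contig][start] += 1
        events[contig][end] -= 1

    coverage = {}
    for contig, deltas in events.items():
        runs = []
        depth = 0
        prev_pos = None
        for pos in sorted(deltas):
            if prev_pos is not None and depth > 0:
                # Extend the previous run if it is adjacent and has the same depth.
                if runs and runs[-1][1] == prev_pos and runs[-1][2] == depth:
                    runs[-1] = (runs[-1][0], pos, depth)
                else:
                    runs.append((prev_pos, pos, depth))
            depth += deltas[pos]
            prev_pos = pos
        coverage[contig] = runs
    return coverage


# Made-up reads, for illustration only.
reads = [("chr1", 0, 10), ("chr1", 5, 15), ("chr1", 20, 25)]
print(depth_of_coverage(reads))
# {'chr1': [(0, 5, 1), (5, 10, 2), (10, 15, 1), (20, 25, 1)]}
```

The sweep sorts only the distinct start and end positions, so its cost depends on the number of reads rather than on the length of the contig.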
Secured by Design
Genomic data is among the most personal information there is. iGAP brings enterprise-level security, with data secured down to the single-variant level. With advanced metadata management and governance capabilities, you can classify, govern, and collaborate on data while meeting compliance requirements.
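The idea of variant-level access with an audit trail can be sketched as follows. This is a simplified illustration under stated assumptions, not iGAP's actual security mechanism: the filter_variants function, the region-based policy format, and the user names are all hypothetical.

```python
from datetime import datetime, timezone


def filter_variants(user, variants, policy, audit_log):
    """Return only the variants `user` may see and record the request.

    `policy` maps a user name to a list of (contig, start, end) regions,
    1-based and inclusive (a hypothetical policy format).
    `variants` are dicts with 'contig' and 'pos' keys.
    """
    allowed_regions = policy.get(user, [])
    visible = [
        v for v in variants
        if any(c == v["contig"] and s <= v["pos"] <= e for c, s, e in allowed_regions)
    ]
    # Append an audit entry for every request, including fully denied ones.
    audit_log.append({
        "user": user,
        "time": datetime.now(timezone.utc).isoformat(),
        "requested": len(variants),
        "returned": len(visible),
    })
    return visible


# Made-up policy and data, for illustration only.
policy = {"alice": [("chr1", 100, 200)]}
variants = [{"contig": "chr1", "pos": 150}, {"contig": "chr1", "pos": 300}]
audit_log = []

print(filter_variants("alice", variants, policy, audit_log))
print(filter_variants("bob", variants, policy, audit_log))
```

Users without a policy entry see nothing, so access is denied by default, and each call leaves an audit record whether or not any data was returned.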