I found this quote quite interesting:
From http://ceur-ws.org/Vol-2022/paper01.pdf:
“(Analysis) of NGS data is classified as:
- primary analysis is essentially responsible of producing raw data;
- secondary analysis is responsible of extracting (“calling”) the signal from raw data and align the signals to the reference genome; and
- tertiary analysis is responsible of a number of tasks all concerned with data integration.
The bioinformatics community has produced a huge number of tools for secondary analysis. So far, it has not been equally engaged in tertiary data analysis, which is clearly the most important aspect of future research.
(There are a) number and variety of tools for secondary analysis. Instead, only four systems are focused on tertiary analysis.”
- GMQL
- DeepBlue
- FireCloud (from Broad)
- SciDB (from Paradigm4)
Disclaimer: I work at Paradigm4, the company that maintains the SciDB computational array database.