Compressed Alignment File
What is a CRAM file?
Compressed alignment file format developed by the European Bioinformatics Institute (EBI), an outstation of the European Molecular Biology Laboratory; stores alignments of short DNA sequence reads produced by read aligners such as BWA, BBMap, and BLAT.
The CRAM file format is similar to the Binary Alignment/Map (BAM) file format but offers better compression and a restructured, column-oriented binary container format. The CRAM format was designed to reduce the disk footprint of alignment data as data volumes in genomics continue to grow.
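To illustrate what "column-oriented" means in general, the sketch below stores the same toy records once row by row and once field by field, then compresses both layouts with zlib. It is not the actual CRAM encoding: the record fields, the helper names (row_layout, column_layout, parse_columns), and the use of zlib are all assumptions made for illustration.

```python
import zlib

# Hypothetical toy records: (position, flag, read sequence).
# These are illustrative only and do not mirror real CRAM data series.
records = [
    (100 + 3 * i, 0 if i % 2 == 0 else 16, "ACGT"[i % 4] * 8)
    for i in range(1000)
]

def row_layout(recs):
    """Serialize one record per line, all fields interleaved."""
    return "\n".join(f"{p}\t{f}\t{s}" for p, f, s in recs).encode()

def column_layout(recs):
    """Serialize each field as its own line, so similar values sit together."""
    positions, flags, seqs = zip(*recs)
    columns = [
        ",".join(map(str, positions)),
        ",".join(map(str, flags)),
        ",".join(seqs),
    ]
    return "\n".join(columns).encode()

def parse_columns(blob):
    """Rebuild the original records from the column-oriented bytes."""
    pos_col, flag_col, seq_col = blob.decode().split("\n")
    return list(zip(
        map(int, pos_col.split(",")),
        map(int, flag_col.split(",")),
        seq_col.split(","),
    ))

# Compare the compressed sizes of the two layouts for this toy data.
row_size = len(zlib.compress(row_layout(records)))
col_size = len(zlib.compress(column_layout(records)))
print("row-oriented bytes:", row_size)
print("column-oriented bytes:", col_size)
```

Grouping values of the same kind tends to give a general-purpose compressor more regular input, although the sizes printed here depend entirely on the toy data and say nothing about real CRAM files.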
Short read alignment is the process of determining where in the genome a sequenced read originated. This can be difficult because the reference genome is very large and a read does not always match the reference exactly. Short read aligners search these enormous reference sequences for matches and near-matches. Because the resulting alignment data is so large, the compression provided by formats like CRAM is necessary for keeping file sizes manageable.
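The idea of searching a reference for matches and near-matches can be sketched with a naive scan that allows a fixed number of mismatches. This is only a minimal illustration: real aligners such as BWA rely on indexed data structures rather than a linear scan, and the function names, the toy sequences, and the max_mismatches parameter below are hypothetical.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_near_matches(reference, read, max_mismatches=1):
    """Return (start, mismatches) for every window within the mismatch limit."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        distance = hamming(window, read)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits

# Toy reference and read; real genomes and read sets are vastly larger.
reference = "ACGTTGCAACGTAGCT"
read = "ACGTA"
print(find_near_matches(reference, read))  # [(0, 1), (8, 0)]
```

The read matches exactly at position 8 and with one mismatch at position 0, which is the kind of ambiguity an aligner has to resolve when it reports where a read most likely came from.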