Getting Started
We have designed Sandy based on three principles:
- to be easy to install;
- to be easy to use;
- to reproduce the variability found in a real NGS assay.
Installation
Sandy is easy to install on the three most commonly used operating systems (OS): Linux, Apple’s macOS, and Microsoft Windows. For more details, see the Installation section.
Genome simulation
Sandy is easy to use because it requires only an input fasta file in a streamlined command line to simulate DNA and RNA sequencing for the Illumina, PacBio, and Oxford Nanopore platforms. The user only needs to provide the reference genomic data (for simulating DNA sequencing) or transcriptomic data (for simulating RNA sequencing) in fasta format and run the Sandy command-line interface. For example, to simulate whole-genome sequencing of the human genome on an Illumina HiSeq platform, users only need to type the following command:
Reference genome
If you don’t have the reference genome, first follow this step:
or
Sandy example for genome
The example uses the Illumina HiSeq quality profile with a read length of 101 bases and a coverage of 1x.
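To give a sense of scale for these settings: a coverage of 1x means the total number of sequenced bases is roughly equal to the genome length. The sketch below works through that arithmetic for 101-base reads. It is an illustration only, not Sandy's internal code; the function name `reads_for_coverage` and the round 3.1 Gb figure used for the human genome length are assumptions made for this example.

```python
import math


def reads_for_coverage(genome_length: int, read_length: int, coverage: float,
                       paired_end: bool = False) -> int:
    """Return the number of reads (or read pairs) needed for a mean coverage.

    Uses the standard relation: coverage = sequenced_bases / genome_length.
    """
    if genome_length <= 0 or read_length <= 0 or coverage <= 0:
        raise ValueError("genome_length, read_length and coverage must be positive")
    # A read pair contributes two reads' worth of bases per fragment.
    bases_per_unit = read_length * (2 if paired_end else 1)
    return math.ceil(coverage * genome_length / bases_per_unit)


# Assumption: a round 3.1 Gb value stands in for the human reference length.
HUMAN_GENOME_LENGTH = 3_100_000_000

single = reads_for_coverage(HUMAN_GENOME_LENGTH, read_length=101, coverage=1)
pairs = reads_for_coverage(HUMAN_GENOME_LENGTH, read_length=101, coverage=1,
                           paired_end=True)
print(f"single-end reads: {single:,}")
print(f"paired-end read pairs: {pairs:,}")
```

Under these assumptions, 1x coverage of a human-sized genome corresponds to tens of millions of 101-base reads, which helps when estimating run time and output size before starting a simulation.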
Sandy example for genome on Docker
Transcriptome simulation
It is also straightforward to simulate an RNA sequencing (RNAseq) run using Sandy. The line below is an example of an RNAseq simulation for the Illumina HiSeq platform with 30 million paired-end reads of 101 bases in length.
Reference annotation
If you don’t have the transcripts fasta file, first follow this step:
or