Manual of the latest release
Please contact Lin Huang <linhuang@cs.stanford.edu> for questions or comments.
I/O format
Reveel requires a mandatory input file
Reveel produces a BCF file containing the inferred genotypes of the query genomes. Compared to our previous release, we made the following improvements: (1) the REF field of the output BCF is consistent with that field of the
Indexing
This step partitions a chromosome into a series of non-overlapping segments.
Usage
Command line is
Arguments
Output
Options of command
Option name | Type | Details |
| | chunk a chromosome by the number of markers [default: chunk by positions] |
| | maximum number of markers per segment, valid if -m is set [default: 10000] |
| | segment length in bps, valid if -m is *not* set [default: 1000000] |
| | SNP desert length in bps [default: 200000] |
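The two chunking modes described in the options above can be illustrated with a minimal Python sketch. This is not Reveel's implementation: the function names are hypothetical, and the handling of SNP deserts (closing a segment when the gap to the next marker exceeds the desert length) is an assumption made for illustration only.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (first marker position, last marker position)


def chunk_by_markers(positions: Sequence[int], max_markers: int = 10000) -> List[Segment]:
    """Split sorted marker positions into segments of at most max_markers markers."""
    segments = []
    for i in range(0, len(positions), max_markers):
        block = positions[i:i + max_markers]
        segments.append((block[0], block[-1]))
    return segments


def chunk_by_position(positions: Sequence[int],
                      segment_length: int = 1_000_000,
                      desert_length: int = 200_000) -> List[Segment]:
    """Split sorted marker positions into segments shorter than segment_length bps.

    Assumption (not taken from the manual): a gap between consecutive markers
    that is longer than desert_length also closes the current segment.
    """
    segments = []
    start = prev = None
    for pos in positions:
        if start is None:
            start = pos
        elif pos - start >= segment_length or pos - prev > desert_length:
            segments.append((start, prev))  # close the current segment
            start = pos                     # open a new one at this marker
        prev = pos
    if start is not None:
        segments.append((start, prev))
    return segments
```

In both functions the resulting segments are non-overlapping, because every marker is assigned to exactly one segment.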
This step infers genotypes of the query genomes from genotype likelihoods, incorporating the reference panel into the genotyping process.
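To make the notion of genotype likelihoods concrete, the following Python sketch converts a phred-scaled likelihood triple into normalized probabilities. It assumes the likelihoods are stored in the standard VCF PL field; the manual does not state which field Reveel reads. The naive per-sample call shown here is only a baseline for illustration and is not the inference Reveel performs.

```python
from typing import List, Sequence


def pl_to_probabilities(pl: Sequence[int]) -> List[float]:
    """Convert phred-scaled likelihoods (VCF PL) to normalized probabilities.

    A PL value p corresponds to a relative likelihood of 10 ** (-p / 10).
    """
    raw = [10 ** (-p / 10) for p in pl]
    total = sum(raw)
    return [r / total for r in raw]


def naive_call(pl: Sequence[int]) -> int:
    """Return the index of the most likely genotype (0: hom-ref, 1: het, 2: hom-alt).

    This ignores all other samples and markers; it is a baseline, not Reveel's method.
    """
    probs = pl_to_probabilities(pl)
    return max(range(len(probs)), key=probs.__getitem__)
```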
With a reference panel, command line is
Without a reference panel, command line is
Argument name | Type | Details |
| | the genotypes of reference genomes in VCF/BCF format |
| | the genotype likelihoods of query genomes in VCF/BCF format |
| | the prefix of output filenames |
Option name | Type | Details |
| | no reference panel [default: using a reference panel] |
| | focus the computation on a region described as chr:beg-end |
| | mode, 0: regular; 2: with Beagle [default: 0]. This release supports Beagle 3.3.2. |
| | minimum number of SNPs processed in a batch [default: 5000] |
| | candidate threshold [default: 0.5]. This threshold differentiates likely polymorphic sites from constant sites. If the data set includes thousands of samples, we recommend setting this parameter to 1. A higher threshold gives better precision but lower recall of SNP calling. |
| | per-base error rate [default: 0.001]. A rough estimate is sufficient, because our experiments show that the tool is not sensitive to the estimation error. We used 0.001 in our experiments, and users can start with this value. |
| | number of refinement iterations [default: 10] |
| | special mode for memory usage estimation [default: genotype inference]. If this flag is set, Reveel allocates and frees memory as normal without doing the computation. This flag enables the memory usage to be estimated efficiently. |
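The region option above takes a string of the form chr:beg-end. As an illustration of that format only, the following Python sketch validates and splits such a string. The `Region` type and `parse_region` function are hypothetical helpers, and the strictness of the pattern (digits only, no thousands separators) is an assumption rather than a description of how Reveel parses the option.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    beg: int
    end: int


# chromosome name, a colon, then two integers separated by a hyphen
_REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse a 'chr:beg-end' string into its three components."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"region must look like chr:beg-end, got {text!r}")
    chrom, beg, end = match.group(1), int(match.group(2)), int(match.group(3))
    if beg > end:
        raise ValueError(f"region start {beg} is after region end {end}")
    return Region(chrom, beg, end)
```

Checking the string before launching a long run can catch a malformed region early.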
If Beagle is used, this step merges the outputs of Reveel and Beagle. Otherwise, this step converts the output of
Command line is
Argument name | Type | Details |
| | the genotype likelihoods of query genomes in VCF format |
| | the output of |
| | the prefix of output filenames. This should be identical to the <output> argument of |
| | dose.gz file(s) produced by Beagle. This argument is required only if Beagle is used. |
Sample 1: regular mode; with a reference panel; partition by the number of markers, no more than 12000 markers per segment
Sample 2: regular mode; without a reference panel; partition by position, no longer than 500000 bps per segment
Sample 3: with Beagle mode; without a reference panel; partition by position, no longer than 1000000 bps per segment
Sample 4: estimate the memory usage of
Sample 5: run Reveel in parallel
Start with
Apply the
Concatenate the output BCF files using BCFtools.
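The concatenation step can be scripted. The Python sketch below builds and runs a `bcftools concat` call that writes a single compressed BCF (`-O b`) to the path given by `-o`. The helper names and the file names in the usage note are hypothetical, and the sketch assumes that `bcftools` is on the PATH and that the per-segment files are listed in chromosomal order.

```python
import subprocess
from typing import List, Sequence


def build_concat_command(segment_files: Sequence[str], output_file: str) -> List[str]:
    """Build the bcftools command that concatenates per-segment BCF files."""
    if not segment_files:
        raise ValueError("at least one segment file is required")
    # -O b: write compressed BCF; -o: output path
    return ["bcftools", "concat", "-O", "b", "-o", output_file, *segment_files]


def concat_segments(segment_files: Sequence[str], output_file: str) -> None:
    """Run bcftools concat and raise CalledProcessError if it fails."""
    subprocess.run(build_concat_command(segment_files, output_file), check=True)
```

For example, `concat_segments(["seg1.bcf", "seg2.bcf"], "all.bcf")` would merge two hypothetical per-segment outputs into `all.bcf`.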
The released executable is compiled on 64-bit Ubuntu 12.04.4.
This release supports Beagle 3.3.2.
The memory usage and computation overhead of