Advanced Features
Parameter Priors
discoal supports uniform and exponential prior distributions for parameters, useful for ABC and machine learning applications.
Uniform Priors
Most parameters support uniform priors specified with -P flags:
| Flag | Parameters | Description |
|---|---|---|
| -Pt | low high | Prior on θ (mutation rate) |
| -Pr | low high | Prior on ρ (recombination rate) |
| -Pa | low high | Prior on α (selection coefficient) |
| -Pu | low high | Prior on τ (time since fixation) |
| -PuA | low high | Prior on uA (recurrent adaptive mutation) |
| -Px | low high | Prior on sweep position |
| -Pf | low high | Prior on f₀ (initial frequency) |
| -Pc | low high | Prior on final sweep frequency |
Example:
# Variable mutation and selection parameters
./discoal 20 1000 10000 -Pt 5 50 -Pa 100 2000 -ws 0.01 -x 0.5
Exponential Prior
The recombination rate can instead follow a truncated exponential prior, specified with -Pre as the mean followed by the upper bound:
# Exponential with mean 10, truncated at 100
./discoal 20 100 10000 -t 20 -Pre 10 100
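To make the shape of this prior concrete, the following is a minimal sketch of drawing from a truncated exponential. It assumes the mean refers to the untruncated distribution and that truncation works by rejecting draws above the bound; discoal's internal method may differ. The helper name sample_trunc_exp and the awk seed argument are illustrative and not part of discoal.

```shell
#!/bin/sh
# Hypothetical helper (not part of discoal): draw n values from an
# exponential with the given mean, rejecting any draw above the upper bound.
# Assumption: truncation by rejection; discoal may truncate differently.
sample_trunc_exp() {
    # $1 = mean, $2 = upper bound, $3 = number of draws, $4 = awk RNG seed
    awk -v mean="$1" -v upper="$2" -v n="$3" -v seed="$4" 'BEGIN {
        srand(seed)
        drawn = 0
        while (drawn < n) {
            x = -mean * log(1 - rand())   # inverse-CDF draw from Exp(mean)
            if (x <= upper) {             # keep only draws inside the bound
                printf "%.4f\n", x
                drawn++
            }
        }
    }'
}

# Five draws matching the example above: mean 10, truncated at 100
sample_trunc_exp 10 100 5 42
```

With a bound of ten times the mean, as in the example, very few draws are rejected, so the prior stays close to a plain exponential.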
Demographic Priors
Priors can also be placed on the time and size of population size changes:
# Prior on first size change: time 0.01-0.5, size 0.1-10
./discoal 20 100 10000 -t 20 -Pe1 0.01 0.5 0.1 10
# Prior on second size change
./discoal 20 100 10000 -t 20 -Pe1 0.01 0.5 0.1 10 -Pe2 0.5 2.0 0.5 5.0
Conditional Simulations
Simulate conditional on observing recombination in a specific region:
# Condition on recombination between sites 400-600
./discoal 20 100 1000 -t 10 -r 20 -C 400 600
The simulator will retry until this condition is met.
Tree Output Mode
Output genealogical trees in Newick format with -T:
./discoal 10 1 10000 -t 20 -r 10 -T
Output format:
//
[100](0:0.5,1:0.5);
[50]((0:0.2,2:0.2):0.3,1:0.5);
...
Each line shows: [number_of_sites]newick_tree
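For downstream processing, each tree line can be split into its site count and Newick string. The sketch below assumes the line layout shown above and that replicates are separated by // lines; parse_trees is a hypothetical helper name, and any line that does not start with [digits] is ignored.

```shell
#!/bin/sh
# Hypothetical helper: turn "[number_of_sites]newick_tree" lines into
# tab-separated "replicate, sites, tree" records.
parse_trees() {
    awk '
        /^\/\// { rep++; next }            # "//" starts a new replicate
        /^\[[0-9]+\]/ {
            close_idx = index($0, "]")      # position of the closing bracket
            sites = substr($0, 2, close_idx - 2)
            tree = substr($0, close_idx + 1)
            printf "%d\t%s\t%s\n", rep, sites, tree
        }
    '
}

# Usage with the sample lines shown above
# (with a real run: ./discoal 10 1 10000 -t 20 -r 10 -T | parse_trees)
printf '%s\n' '//' '[100](0:0.5,1:0.5);' '[50]((0:0.2,2:0.2):0.3,1:0.5);' | parse_trees
```

The tab-separated output can then be passed to any Newick-aware tool one tree at a time.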
Recording Recent Mutations
Record only mutations more recent than a specified time with -U:
# Only mutations from last 0.1 time units
./discoal 20 10 10000 -t 50 -U 0.1
Memory and Performance
Large-scale Simulations
For very large simulations:
Increase MAXSITES: Edit discoal.h and recompile
Sample size limit: The maximum sample size is now 65,535 (previously 254)
Performance Tuning
Time discretization: Lower -i values speed up sweeps at a potential accuracy cost
Memory efficiency: The current version uses 70-99% less memory than older versions
Parallel runs: Use different random seeds for embarrassingly parallel execution
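One way to set up such parallel runs is to generate one command per job, each with its own seed pair passed through -d. This is a sketch only: make_jobs, the output file names, and the seed scheme (base seed plus job index) are assumptions chosen for illustration, not a discoal convention.

```shell
#!/bin/sh
# Hypothetical helper: print one discoal command per job, each with a
# distinct pair of seeds. Nothing is executed here; the commands are only printed.
make_jobs() {
    # $1 = number of jobs, $2 = base seed; remaining args = discoal arguments
    n_jobs=$1
    base=$2
    shift 2
    i=1
    while [ "$i" -le "$n_jobs" ]; do
        seed1=$((base + i))               # first seed: base+1 .. base+n
        seed2=$((base + n_jobs + i))      # second seed: base+n+1 .. base+2n
        printf '%s\n' "./discoal $* -d $seed1 $seed2 > rep_$i.out"
        i=$((i + 1))
    done
}

# Four jobs sharing the same model but using different seeds
make_jobs 4 1000 20 10 10000 -t 20
```

The printed lines can be piped to sh to run one after another, or handed to a job scheduler to run concurrently.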
Setting Random Seeds
For reproducibility:
# Specify both seeds
./discoal 20 10 10000 -t 20 -d 12345 67890
Seeds must be positive integers less than 2^31-1.
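A wrapper script can check this constraint before launching a run. The sketch below encodes the rule as stated (positive, strictly less than 2^31-1 = 2147483647); valid_seed is a hypothetical helper, not something discoal provides.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a positive integer
# strictly less than 2^31-1 (2147483647).
valid_seed() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;          # reject empty or non-digit input
    esac
    # The length check keeps very long digit strings out of the numeric tests
    [ "${#1}" -le 10 ] && [ "$1" -gt 0 ] && [ "$1" -lt 2147483647 ]
}

# Check the seeds from the example above
for s in 12345 67890; do
    if valid_seed "$s"; then
        echo "seed $s ok"
    else
        echo "invalid seed: $s" >&2
    fi
done
```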
Debugging Features
Build with ancestry verification:
make discoal_debug
./discoal_debug 10 1 1000 -t 10 -r 10
This enables additional checks for debugging genealogy construction.