WGS über Alles(?)
Over the last three years the human Y DNA phylogenetic tree has exploded with the popularity of low-cost Next Generation Sequencing. Thousands of new branches have been discovered, largely as the result of FTDNA’s $575 Big Y. Haplogroup projects and sites such as YFull have crafted algorithms to estimate branch ages, which promise to help explain the degree of relation with matches.
A side effect of the popularity of a test like this is that advances in similar technologies are often overlooked. Since 2014 the price of a 30x WGS test has come down progressively from $1800 to around $1200. There is also the Veritas Genetics option at $999 with a doctor’s order, which may yield savings. Unfortunately, these are still rather large investments for the genetic genealogist.
FullGenomes has offered several WGS options over the years which trade off average coverage for a lower cost. The tests start at 7x coverage for $440. The cost per unit of coverage improves as you order more coverage: their 15x test is $730 and a 20x option is $850. YSEQ, a leader in Y DNA Sanger sequencing, has also joined in with a 15x WGS test priced at $740.
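To make the economy of the tiers concrete, here is a minimal sketch that divides each FullGenomes price quoted above by its coverage. The function name `cost_per_x` is just an illustrative helper.

```python
# Coverage (x) and price (USD) for the FullGenomes tiers quoted above.
tiers = {7: 440, 15: 730, 20: 850}


def cost_per_x(coverage, price):
    """Dollars paid for each 1x of average coverage."""
    return price / coverage


for coverage, price in tiers.items():
    print(f"{coverage}x: ${cost_per_x(coverage, price):.2f} per 1x of coverage")
```

The per-1x cost drops at each step up in coverage, which is the sense in which the larger tests are more economical.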
Tools for Comparison
One of the primary questions has always been whether there is enough effective read depth to be sure of comparisons. The statistics page at haplogroup-r.org attempts to answer this question using reports generated with Broad Institute’s CallableLoci tool. The reports contain a summary of regions evaluated to be low coverage, poorly mapped, and, most importantly, callable. The callable regions have at least 4 reads, of which 90% are scored as aligned with at least 90% accuracy. The summary reports the median callable loci over 1,500 NGS and WGS tests from various labs. A “combBED” column is also included to estimate the ability to compare a given test with Big Y results.
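The callable rule can be sketched in a few lines. This is a simplified reading of the description above, not the CallableLoci implementation: it assumes "90% accuracy" corresponds to a Phred-scaled mapping quality of 10, and that the 4-read minimum applies to all reads at the site. The function name and parameters are hypothetical.

```python
def is_callable(mapqs, min_depth=4, min_mapq=10, min_fraction=0.9):
    """Return True if a site looks callable under the simplified rule.

    mapqs        -- mapping qualities of the reads covering the site
    min_depth    -- minimum number of reads (assumed to count all reads)
    min_mapq     -- MAPQ 10 is a 10% error estimate, i.e. 90% accuracy
    min_fraction -- share of reads that must meet min_mapq
    """
    if len(mapqs) < min_depth:
        return False  # too few reads: low coverage
    well_mapped = sum(1 for q in mapqs if q >= min_mapq)
    return well_mapped / len(mapqs) >= min_fraction


print(is_callable([60, 60, 60, 60]))  # four well-mapped reads
print(is_callable([60, 60, 60]))      # only three reads
print(is_callable([60, 60, 0, 0]))    # enough reads, but half poorly mapped
```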
The “combBED” region was introduced by Adamov et al. (2015), Defining a New Rate Constant for Y-Chromosome SNPs based on Full Sequencing Data. It is the intersection of the regions from Poznik et al. (2013) and the Big Y White Paper. The result is a filter that can be used to identify 8,473,821 base pairs of the Y chromosome known to be targeted by Big Y and reliably sequenced using 100 base pair reads. As the haplogroup-r.org summary shows, the average Big Y captures 90% of these regions. While only two 15x to 20x WGS tests are available, they can be used to begin to form impressions of how well the results can be compared.
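For readers curious how two region sets are intersected, here is a minimal sketch. It assumes each set is a list of BED-style half-open `(start, end)` intervals on the same chromosome with no overlaps inside a set. The coordinates in the example are made up for illustration and are not the real Poznik or Big Y regions.

```python
def intersect(a, b):
    """Intersect two lists of half-open (start, end) intervals."""
    a, b = sorted(a), sorted(b)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two intervals overlap
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def total_bases(regions):
    """Number of base pairs covered by a list of intervals."""
    return sum(end - start for start, end in regions)


# Toy regions, not real coordinates.
set_a = [(0, 10), (20, 30)]
set_b = [(5, 25)]
shared = intersect(set_a, set_b)
print(shared, total_bases(shared))
```

In practice a tool such as bedtools would do this on the actual BED files; the sketch only shows the idea behind the filter.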
Low-Coverage WGS Comparison to Big Y Targeted Testing
The first thing to note is that both WGS resolutions handily exceed Big Y in callable alleles. From a variant discovery point of view this is the most important metric. SNP mutation rates are estimated in mutations per base pair per year. The more callable base pairs, the shorter the average number of years between detected mutations. In terms of genetic genealogy, the fewer years per mutation, the more useful the markers are in determining a time to most recent common ancestor.
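The relationship between callable base pairs and years per mutation can be sketched as a one-line formula. The rate used here, 0.82e-9 mutations per base pair per year, is an assumed placeholder and not a figure from this article; substitute whichever rate you trust. The function name is hypothetical.

```python
# ASSUMPTION: placeholder SNP mutation rate, in mutations per base pair per year.
ASSUMED_RATE = 0.82e-9


def years_per_mutation(callable_bp, rate=ASSUMED_RATE):
    """Average years between detectable mutations in one lineage."""
    return 1.0 / (rate * callable_bp)


# The 8,473,821 bp combBED region versus a hypothetical test with twice as many
# callable base pairs.
print(round(years_per_mutation(8_473_821), 1))
print(round(years_per_mutation(2 * 8_473_821), 1))
```

Doubling the callable base pairs halves the expected wait between mutations, which is why callable loci matter so much for dating recent common ancestors.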
The second factor is combBED loci. This metric estimates how well a test can be compared with a current generation Big Y test. Here we can see the 15x coverage option contains 1.7 million fewer bases in the target zone even though 1.8 million more bases were sampled. This indicates a 69.7% chance of missing a significant SNP that occurred in the last 500 years in a direct comparison. The 20x option fares much better. Its 7.4 million combBED loci fall within the 4% coefficient of variability in the Big Y tests. This makes the BAM file directly comparable with existing Big Y BAMs, while providing an additional 4.5 million locations to compare with future tests.
Price to Performance
While the WGS options are significant investments, one should keep in mind the overall price to performance ratio of the tests. The larger the “Loci/$” ratio, the more value offered by the testing platform.
|Test||Median Callable loci (Y DNA)||Price||Loci/$|
|Y Elite 2.1||13,758,146||$795||17,306|
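The “Loci/$” column is simply median callable loci divided by price. A minimal sketch using the Y Elite 2.1 row above (the function name is illustrative):

```python
def loci_per_dollar(median_callable_loci, price_usd):
    """Callable Y DNA loci obtained per dollar spent."""
    return median_callable_loci / price_usd


# Y Elite 2.1 figures from the table above.
print(round(loci_per_dollar(13_758_146, 795)))  # 17306
```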
As previously mentioned, Y Elite 2.1 continues to offer the best absolute value option for Y DNA variant discovery. What is interesting is that the 20x WGS option will also unlock your autosomal DNA and provide better value in terms of male-specific regions than the lowest cost targeted option. The test provides all the benefits of the $575 Big Y, $89 Family Finder, and $199 mtDNA full sequence test.
The affordability of new approaches to DNA sequencing continues to advance. While Big Y continues to offer the best option in terms of absolute price, a crossroads has been reached. 15x WGS offers more Y DNA variant discovery for each dollar invested. The 20x WGS results are comparable to Y Elite 2.1, which remains the value leader. However, the 20x WGS test also delivers your atDNA for just $55 more. Those autosomal results can be converted into a 23andMe V3 compatible report and uploaded into matching databases like GedMatch.
As WGS continues its march toward $100 per genome, the 20x WGS today offers better shelf life than the dedicated Y DNA tests. The overall coverage difference between 20x and 30x is not significant for Y DNA applications. How well the autosomal segments hold up needs further study.