Category Archives: Genetics

Genetic Tests for Predicting Clopidogrel Response Gain Traction: AHA – TCTMD

It's time for genetic testing of clopidogrel response to move into the mainstream, suggests a new scientific statement from the American Heart Association (AHA) that outlines the supporting evidence but also acknowledges the obstacles that still stand in the way of wider adoption.

Clopidogrel, the mainstay oral P2Y12 inhibitor, is a prodrug that's metabolized by the enzyme CYP2C19 before becoming biologically active. But a substantial part of the population (the prevalence varies by race/ethnicity) has a loss-of-function variation in the CYP2C19 gene. For decades, it's been known that patients with this allele have more platelet aggregation and ischemic events than noncarriers while on clopidogrel therapy.

Still, US and European guidelines addressing antiplatelet therapy in CAD haven't gone so far as to recommend routine genetic testing, though a few of these documents did give a nod to selective use in situations like dual antiplatelet therapy (DAPT) de-escalation after PCI for ACS.

Writing group chair Naveen L. Pereira, MD (Mayo Clinic, Rochester, MN), told TCTMD that one driver of their new AHA statement is the fact that the guidelines haven't yet addressed the latest published clinical trials (POPular Genetics, TAILOR-PCI, PHARMCLO, and IAC-PCI), observational studies, and meta-analyses.

"We felt that incorporating data from these studies and providing some guidance to clinicians by interpreting the data, which can be pretty complicated sometimes, would be helpful," said Pereira, who served as a co-principal investigator for the TAILOR-PCI trial.

Beyond this, the authors also collected information on the pharmacology and pharmacokinetics of P2Y12 inhibitors, both genetic and nongenetic determinants of patients' response to the drugs, as well as practicalities like reimbursement and how to choose among assays.

"Our conclusion was that the evidence to date supports genetic testing," Pereira noted. "But in an AHA statement, we cannot directly say, 'You should do genetic testing.' That's up to the guidelines."

As the document points out, "many clinicians have positive perceptions about pharmacogenetic testing and its clinical implications, [but fewer than] 10% adopt pharmacogenetic testing in their routine clinical practice, primarily because of a lack of clinical guidelines and pharmacogenetic education."

Indeed, only a very small fraction of practices preemptively genotype, said Pereira. For patients who go on clopidogrel only to fare poorly and experience an event, genetic testing is moot by that point, he explained, since the answer would be to simply give an alternative antiplatelet drug.

Why Clopidogrel Shouldn't Be Skipped

Increasingly, the oral P2Y12 inhibitor of choice isn't clopidogrel but ticagrelor or prasugrel, neither of which is dependent on CYP2C19. Some clinicians wonder, why not just avoid the problem of clopidogrel response entirely?

"There are physicians who say, 'I know that having a loss-of-function genotype is a problem when I give clopidogrel, but if I give ticagrelor or prasugrel to all my patients, I don't have to worry about genetic testing,'" Pereira commented. The problem with this blanket approach is that these drugs are more potent antiplatelets, so on the whole there will be an increase in bleeding incidence. "If you want to balance the ischemic and bleeding event risk, it appears that genetic-guided therapy [from the outset] would be an optimal strategy," he added.

Pereira pointed out that multiple studies have shown the cost-effectiveness of using genetic testing to guide antiplatelet therapy. Both clopidogrel and prasugrel are now generic, but not ticagrelor (Brilinta; AstraZeneca), which is considerably more expensive. Medicare considers genetic testing for CYP2C19 loss-of-function alleles to be medically necessary in certain situations, such as when an ACS patient is undergoing PCI, and thus covers its cost. Some commercial payers also offer reimbursement for the testing.

With the availability of point-of-care assays, the logistical hurdles to widespread adoption are also lower. Previously, it could take 2 or 3 days to get results after sending a blood sample for analysis, he noted, but now the testing can be done at bedside with a buccal swab, producing results within an hour.

Naturally, the field loves to see data, Pereira said. While it would be ideal to have a clinical trial comparing genetic-guided therapy versus no testing, with that design, there would be a lot of overlap, since perhaps 70% of patients in the testing arm and 100% of those in the control arm would be taking clopidogrel, with any difference driven by the 30% in the testing arm on another P2Y12 inhibitor. "You're going to need tens of thousands of patients to show a difference, so I think doing a trial like that is very difficult at this point," he said. It's easier to see the impact of testing when, as he pointed out was done in a prespecified analysis of TAILOR-PCI, only patients found to have a loss-of-function variant are compared: those given clopidogrel versus those given ticagrelor or prasugrel.


Overall, Pereira urged, "I think it's important to pay attention to evidence in a holistic way. . . . All the data, even though there's not that one big trial showing a difference, really points to [the need] to be careful giving loss-of-function patients clopidogrel." This is especially true when talking about the monotherapy that's happening with newer stents after DAPT de-escalation.

What he'd like to see next, said Pereira, is for guidelines to give specific advice on how to use CYP2C19 testing. Clinicians in the meantime should consider "looking at [point-of-care] platforms and see how they can incorporate that in their practices so it becomes easy and intuitive." Implementation, the statement adds, depends on the ease not only of performing the tests but also of interpreting their results, as well as knowledge about how to adjust therapy accordingly and the ability to integrate each patient's genetic status into the electronic health record for care teams to access.

But What About Platelet Function?

In a commentary published on the AHA's Professional Heart Daily website, Mark B. Effron, MD (John Ochsner Heart and Vascular Institute, New Orleans, LA), highlights the fact that platelet function testing (PFT) is another option for predicting who will benefit from a more-potent antiplatelet agent versus clopidogrel, or from de-escalation of therapy.

"In most institutions in the United States, it is easier to obtain results of a platelet aggregation test using VerifyNow than it is to obtain genomics on the patient," he writes. "Until . . . there are studies evaluating the benefit of an all-comers genomic strategy versus a directed PFT, there will still be controversy as to which is more appropriate in the management of patients receiving P2Y12 inhibitor therapy."

In their statement, Pereira and colleagues point out that platelet function testing and genetic testing each has advantages and disadvantages.

"The key advantage of PFT lies in directly defining the intermediate phenotype of interest (ie, levels of on-treatment platelet reactivity) for which studies have shown an association with clinical outcomes (ie, increased thrombotic and bleeding risks with high and low platelet reactivity, respectively)," they say. "Nevertheless, its clinical implementation has been challenging given the need for multiple repeated assessments due to potential of variability of results over time and the need for a patient to be on treatment for a certain length of time with a given antiplatelet agent (eg, for at least 1-2 weeks with clopidogrel) to be able to assess antiplatelet effects and define responsiveness adequately."

Effron agrees that a tailored approach is the way forward, though the exact strategy is still being debated. "Whether directed P2Y12 therapy is accomplished through genotype-guided antiplatelet therapy or through PFT," he says, "it is becoming clear that patient profiling is needed to determine the best therapy for the patient."


Bringing Gene Therapy to the Brain – The Scientist

This webinar will be hosted live and available on-demand

Thursday, August 8, 2024, 2:30 - 4:00 PM ET

The blood-brain barrier (BBB) is a semi-permeable membrane between the blood and the interstitium of the brain that regulates molecule and ion movement between the circulation and the brain. This barrier poses an obstacle to gene therapy delivery, as strategies that work for other organs may not necessarily be able to cross the BBB. In this webinar brought to you by The Scientist, Douglas Marchuk and Viviana Gradinaru will explain the obstacles posed by the BBB, as well as how overcoming the BBB allows them to investigate new approaches for combatting neurological disease.

Topics to be covered

Douglas A. Marchuk, PhD, James B. Duke Distinguished Professor, Department of Molecular Genetics and Microbiology, Duke University School of Medicine

Viviana Gradinaru, PhD, Lois and Victor Troendle Professor of Neuroscience and Biological Engineering; Director, Molecular and Cellular Neuroscience Center of the Tianqiao and Chrissy Chen Institute for Neuroscience; Director and Allen V.C. and Lenabelle Davis Leadership Chair, Merkin Institute for Translational Research, California Institute of Technology


UW initiative aims to bring together social sciences and genetics – Wisbusiness.com

Integrating the fields of genetics and social science is putting us on the right track for understanding the world better, a UW-Madison expert says.

The university's La Follette School of Public Affairs yesterday held a panel discussion on its Initiative in Social Genomics, which aims to bring together these disciplines to explore how genetics are connected to behavior, socio-economic outcomes and other factors.

Jason Fletcher, a professor of public affairs with the school, underlined the complexity involved with combining two nuanced areas of study into one discipline. Still, conducting research focused on just one while excluding the other fails to recognize that both genetics and social factors interact with one another, he said.

"By focusing on your one domain, you're not including all of the relevant factors," he said, noting that only in the past two decades or so has information from both fields been combined into the same data structures.

He said the university is making major investments into training more people to wrangle this firehose of data to conduct meaningful social genomics research.

"Because it is so complicated, the solutions so far have not been obvious, they've required a lot of work," he said. "And we're not there. We don't have the solutions yet, but I think that's the enterprise here, is that we need collaborations to build this bridge where both sides are building at the same time, and coming together."

Lauren Schmitz, an assistant professor of public affairs with the school, said that ever since the human genome was first fully mapped in 2003, we have many more questions than answers about what makes us tick. She noted rapid advancements in computing and genome sequencing have led to a flood of new genetic data that scientists are still working to understand.

"In part, sequencing the human genome wasn't the silver bullet humanity hoped for, because we realized that we can't study the human genome in isolation," she said yesterday. "If we want to gain a better understanding of how genetic diversity shapes who we are, we need to understand and get outside the lab, to study genetic diversity and our genes in the wild."

Conditions of work, environmental factors and even economic trends also powerfully shape our life outcomes, she noted.

Her own research, focused on aging and longevity, explores how social conditions and disadvantages affect biological age. She said scientists can now calculate biological age quite accurately based on analysis of epigenetics, or how various factors affect gene expression. With just a blood sample, they can calculate how life circumstances are accelerating or slowing down the aging process.

"This scientific explosion of data is really allowing us to see the impacts of public policy on the cellular level," she said.

In a 2022 study focused on the Great Depression, Schmitz sought to understand how this period of economic hardship affected biological aging.

"What we found is that individuals who were in utero, who were in the womb during the Great Depression, were aging faster decades later," she said. "And so here we see this really important connection between early life conditions and late-life aging."



Women have a higher genetic risk for PTSD, according to study by VCU and Swedish researchers – VCU News

By Olivia Trani

Women are twice as likely as men to develop post-traumatic stress disorder, but the factors contributing to this disparity have largely remained unsettled. A research team led by Virginia Commonwealth University and Lund University in Sweden conducted the largest twin-sibling study of PTSD to date to shed light on how genetics may play a role. Their results, published Tuesday in the American Journal of Psychiatry, are the first to demonstrate that women have a higher genetic risk for the disorder compared with men.

By analyzing health data from over 16,000 twin pairs and 376,000 sibling pairs, the research team discovered that heritability for PTSD was 7 percentage points higher in women (35.4%) than in men (28.6%). They also found evidence that the genes that make up the heritable risk for PTSD vary between the two sexes.

The researchers say their findings could inform strategies for PTSD prevention and intervention following a traumatic event, as well as help address stigmas related to women's mental health.

"Women are at higher risk for developing PTSD than men, even when controlling for the type of trauma, income level, social support and other environmental factors. Some of the theories as to why that is have frankly been unkind to women, such as attributing the sex difference to a weakness or lack of ability to cope," said Ananda B. Amstadter, Ph.D., a professor in the VCU School of Medicine's departments of Psychiatry and Human and Molecular Genetics and lead author of the study. "I think this study can help move the narrative that people can have an inherited biological risk for PTSD, and that this genetic risk is greater in women."

Nearly 70% of the global population is exposed to at least one traumatic event in their lifetime, such as physical or sexual assault, a motor vehicle accident, exposure to combat or a natural disaster. About 6% of those who are exposed to trauma develop PTSD. Amstadter's research focuses on understanding the conditions that might increase or decrease a person's risk of experiencing PTSD, particularly how a person's genes impact their risk.

"If you think of risk for PTSD like a pie chart, we're trying to better understand what factors make up the pieces of this pie," she said. "Some of the risk is influenced by a person's environment, such as the experiences they have while growing up. On the other hand, some of the risk will be influenced by the genes they inherit from their parents."

Previous research has looked into how genes influence the likelihood of developing PTSD, but the study conducted by Amstadter and her colleagues is the first of its kind to investigate how genetic risk varies by sex.

For this project, the research team examined anonymized clinical data from Swedish population-based registries. Their analysis consisted of more than 400,000 pairs of twins or siblings born up to two years apart in Sweden between 1955 and 1980. Studies on twins and siblings, because of their genetic similarities, can help researchers determine how a person's genes influence their risk for mental illnesses.

"Every time a person within this age group interacts with Sweden's health care system, whether that's visiting their primary care doctor, filling a prescription or going to the hospital, that information is recorded in their national registries. This kind of data is a really powerful tool for addressing questions related to genetic risk for medical conditions," Amstadter said. "Prior PTSD studies involving twins and siblings have typically only included a few thousand individuals. Because our sample size was so large in comparison, we were able to make calculations with a higher degree of certainty."

Through statistical modeling, the researchers calculated how much a person's genetic makeup influenced their likelihood of developing PTSD following a traumatic event. In finding that PTSD was 35.4% heritable in women but only 28.6% heritable in men, they demonstrated that women have a higher biological risk for PTSD.

Their models also revealed that the genes associated with PTSD were highly correlated (0.81) but not entirely the same between men and women. This suggests that the genetic underpinnings of sex hormones, like testosterone, estrogen and progesterone, may be involved in the development of PTSD. The research team is collaborating with the Psychiatric Genomics Consortium to identify the molecular genetic variants that may contribute to sex-specific pathways of risk.

Amstadter conducted the research at the Virginia Institute for Psychiatric and Behavioral Genetics at VCU alongside co-authors Shannon Cusack, Ph.D., a postdoctoral scholar; and Kenneth Kendler, M.D., the institute's director, professor of psychiatry and eminent scholar. They collaborated with Lund University co-authors Sara Lönn, Ph.D.; Jan Sundquist, M.D., Ph.D.; and Kristina Sundquist, M.D., Ph.D.



Genetics study points to potential treatments for restless leg syndrome – University of Cambridge news

Restless leg syndrome can cause an unpleasant crawling sensation in the legs and an overwhelming urge to move them. Some people experience the symptoms only occasionally, while others get symptoms every day. Symptoms are usually worse in the evening or at night-time and can severely impair sleep.

Despite the condition being relatively common (up to one in 10 older adults experience symptoms, while 2-3% are severely affected and seek medical help), little is known about its causes. People with restless leg syndrome often have other conditions, such as depression or anxiety, cardiovascular disorders, hypertension, and diabetes, but the reason why is not known.

Previous studies had identified 22 genetic risk loci, that is, regions of our genome that contain changes associated with increased risk of developing the condition. But there are still no known biomarkers, such as genetic signatures, that could be used to objectively diagnose the condition.

To explore the condition further, an international team led by researchers at the Helmholtz Munich Institute of Neurogenomics, Institute of Human Genetics of the Technical University of Munich (TUM) and the University of Cambridge pooled and analysed data from three genome-wide association studies. These studies compared the DNA of patients and healthy controls to look for differences more commonly found in those with restless leg syndrome. By combining the data, the team was able to create a powerful dataset with more than 100,000 patients and over 1.5 million unaffected controls.

The results of the study are published today in Nature Genetics.

Co-author Dr Steven Bell from the University of Cambridge said: "This study is the largest of its kind into this common but poorly understood condition. By understanding the genetic basis of restless leg syndrome, we hope to find better ways to manage and treat it, potentially improving the lives of many millions of people affected worldwide."

The team identified over 140 new genetic risk loci, increasing the number known eight-fold to 164, including three on the X chromosome. The researchers found no strong genetic differences between men and women, despite the condition being twice as common in women as it is in men. This suggests that a complex interaction of genetics and the environment (including hormones) may explain the gender differences we observe in real life.

Two of the genetic differences identified by the team involve genes known as glutamate receptors 1 and 4 respectively, which are important for nerve and brain function. These could potentially be targeted by existing drugs, such as anticonvulsants like perampanel and lamotrigine, or used to develop new drugs. Early trials have already shown positive responses to these drugs in patients with restless leg syndrome.

The researchers say it would be possible to use basic information like age, sex, and genetic markers to accurately rank who is more likely to have severe restless leg syndrome in nine cases out of ten.

To understand how restless leg syndrome might affect overall health, the researchers used a technique called Mendelian randomisation. This uses genetic information to examine cause-and-effect relationships. It revealed that the syndrome increases the risk of developing diabetes.

Although low levels of iron in the blood are thought to trigger restless leg syndrome (because they can lead to a fall in the neurotransmitter dopamine), the researchers did not find strong genetic links to iron metabolism. However, they say they cannot completely rule it out as a risk factor.

Professor Juliane Winkelmann from TUM, one of the senior authors of the study, said: "For the first time, we have achieved the ability to predict restless leg syndrome risk. It has been a long journey, but now we are empowered to not only treat but even prevent the onset of this condition in our patients."

Professor Emanuele Di Angelantonio, a co-author of the study and Director of the NIHR and NHS Blood and Transplant-funded Research Unit in Blood Donor Health and Behaviour, added: "Given that low iron levels are thought to trigger restless leg syndrome, we were surprised to find no strong genetic links to iron metabolism in our study. It may be that the relationship is more complex than we initially thought, and further work is required."

The dataset included the INTERVAL study of England's blood donors in collaboration with NHS Blood and Transplant.

A full list of funders can be found in the study paper.

Reference Schormair et al. Genome-wide meta-analyses of restless legs syndrome yield insights into genetic architecture, disease biology, and risk prediction. Nature Genetics; 5 June 2024; DOI: 10.1038/s41588-024-01763-1


Genetic association mapping leveraging Gaussian processes | Journal of Human Genetics – Nature.com

Gaussian Process (GP)

A Gaussian process (GP) is a type of stochastic process whose application in machine learning enables us to infer a nonlinear function \(f(x)\) over a continuous domain \(x\) (e.g., time and space). Precisely, \(f(x)\) is a draw from a GP if \(\{f(x_1),\ldots,f(x_N)\}\) follows an \(N\)-dimensional multivariate normal distribution for the \(N\) input data points \(\{x_i\}_{i=1}^{N}\). Denoting \(X=(x_1,\ldots,x_N)^{\top}\) and \(f=(f(x_1),\ldots,f(x_N))^{\top}\), a GP is formally written as

$$f \sim \mathcal{N}(m(X),k(X,X)),$$

where \(m(\cdot)\) denotes the mean function and \(k(\cdot,\cdot)\) denotes the kernel function [11]. The simplest kernel function would be the linear kernel, such that \(k(X,X)=\sigma^2XX^{\top}\), while the automatic relevance determination squared exponential (ARD-SE) kernel is defined as

$$k(x_j,x_k)=\sigma^2\exp\left[-\sum_{q=1}^{Q}\frac{(x_{jq}-x_{kq})^2}{2\rho_q}\right]$$

for the \((j,k)\) element of \(k(X,X)\), where \(x_j,x_k\in\mathbb{R}^{Q}\) are \(Q\)-dimensional input vectors. Here \(\sigma^2\) is the kernel variance parameter and \(\rho=(\rho_1,\ldots,\rho_Q)^{\top}\) is the vector of characteristic length scales, whose inverse determines the relevance of each element of the input vector. Typically, the mean function is defined as \(m(X)=0\).
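As a concrete illustration, the ARD-SE kernel above can be sketched in a few lines of NumPy. The function name `ard_se_kernel` and the example inputs are illustrative assumptions rather than part of the original method description; the exponent follows the formula as written, with \(2\rho_q\) in the denominator.

```python
import numpy as np

def ard_se_kernel(X1, X2, sigma2, rho):
    """ARD-SE kernel between rows of X1 (N1 x Q) and X2 (N2 x Q).

    Computes sigma2 * exp(-sum_q (x_jq - x_kq)^2 / (2 * rho_q)),
    following the formula in the text (hypothetical helper name).
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    rho = np.asarray(rho, dtype=float)
    diff = X1[:, None, :] - X2[None, :, :]             # shape (N1, N2, Q)
    scaled = np.sum(diff ** 2 / (2.0 * rho), axis=-1)  # shape (N1, N2)
    return sigma2 * np.exp(-scaled)

# Example with arbitrary values: 5 points in Q = 2 dimensions.
rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 2))
K = ard_se_kernel(X, X, sigma2=1.5, rho=[0.5, 2.0])
```

A large \(\rho_q\) flattens the kernel along dimension \(q\), so that dimension contributes little to the covariance, which is the sense in which the inverse length scale measures relevance.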

Because the GP yielding \(f(x)\) has various useful properties inherited from the normal distribution, it can be used to estimate a nonlinear function \(f(X)\) from output data \(y=(y_1,\ldots,y_N)^{\top}\) along a continuous factor \(X\). The extended linear model \(y=f(X)+\varepsilon\) is referred to as GP regression and is widely used in the machine learning framework [12]. This model can be used to map dynamic genetic associations for normalized gene expression or other common complex quantitative traits (e.g., human height) along the continuous factor \(x\) (e.g., cellular states or donor's age). Denoting the genotype vector by \(g=(g_1,\ldots,g_N)^{\top}\) and the kinship matrix among \(N\) individuals by \(R\), the mapping model, as proposed by us and others [8, 10], can be expressed as follows:

$$y=\alpha+\beta\odot g+\gamma+\varepsilon, \tag{1}$$

where

$$\alpha\sim\mathcal{N}(0,K),\quad \beta\sim\mathcal{N}(0,\delta_gK),\quad \gamma\sim\mathcal{N}(0,\delta_dK\odot R)$$

are all GPs with similar covariance matrices, where \(\odot\) denotes the element-wise product between two vectors or matrices with the same dimensions, \(K=k(X,X)\) denotes the covariance matrix with a kernel function, and \(\varepsilon\) denotes the residuals. Intuitively, \(\alpha\) models the average baseline change of \(y\) in relation to \(x\), while \(\beta\) represents the dynamic genetic effect along \(x\). The effect size \(\beta\) is multiplied by the genotype vector \(g\), indicating that the output \(y_i\) varies between different genotype groups (\(g_i\in\{0,1,2\}\)). In fact, the effect size \(\beta(x_i)\) is additive to the baseline \(\alpha(x_i)\) at each \(x_i\), which is the same as in standard association mapping. Here statistical hypothesis testing is performed under the null hypothesis of \(\delta_g=0\), as the strength of genetic association is determined by \(\delta_g\).

It is important to note that the model (1) includes a correction term \(\gamma\) that accounts for the between-donor variation of dynamic changes along \(x\), particularly when multiple data points are measured from the same donor or samples are taken from related donors. This term is essential for statistical calibration of the genetic effect \(\beta\), because other genetic associations scattered over the genome (trans effects) can confound the target genotype effect. Therefore, to adjust for the confounding effect, we need to include the extra GP \(\gamma\), which is drawn from a normal distribution with the covariance matrix of \(K\) multiplied by the kinship matrix \(R\).
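To make the roles of the three GPs in model (1) concrete, the following sketch simulates one trait vector from it. Everything specific here is an assumption chosen for illustration: a one-dimensional squared exponential kernel, unrelated donors with repeated samples (so \(R_{ik}=1\) when samples \(i\) and \(k\) share a donor), and arbitrary values for the variance parameters.

```python
import numpy as np

def se_kernel(x, sigma2=1.0, rho=0.1):
    """One-dimensional squared exponential kernel matrix k(X, X)."""
    d = x[:, None] - x[None, :]
    return sigma2 * np.exp(-d ** 2 / (2.0 * rho))

def draw_gp(cov, rng, jitter=1e-6):
    """Draw one sample from N(0, cov) via a jittered Cholesky factor."""
    L = np.linalg.cholesky(cov + jitter * np.eye(cov.shape[0]))
    return L @ rng.standard_normal(cov.shape[0])

rng = np.random.default_rng(42)
n_donors, n_per_donor = 10, 5
N = n_donors * n_per_donor

x = rng.uniform(size=N)                               # continuous factor
donor = np.repeat(np.arange(n_donors), n_per_donor)   # donor of each sample
# One genotype in {0, 1, 2} per donor, repeated for that donor's samples.
g = np.repeat(rng.binomial(2, 0.3, size=n_donors), n_per_donor).astype(float)

K = se_kernel(x)
R = (donor[:, None] == donor[None, :]).astype(float)  # assumed kinship
delta_g, delta_d, noise_var = 0.5, 0.2, 0.1           # assumed parameters

alpha = draw_gp(K, rng)                # baseline change along x
beta = draw_gp(delta_g * K, rng)       # dynamic genetic effect along x
gamma = draw_gp(delta_d * K * R, rng)  # between-donor correction term
eps = np.sqrt(noise_var) * rng.standard_normal(N)

y = alpha + beta * g + gamma + eps     # model (1)
```

Setting `delta_g` close to zero in this sketch corresponds to data generated under the null hypothesis, in which the genotype has no effect on the trait along \(x\).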

Here, the kinship matrix is estimated by \(\hat{R}=\sum_{l=1}^{L}\tilde{g}_l\tilde{g}_l^{\top}/L\) using genome-wide variants \(g_l\) (\(l=1,\ldots,L\)), where \(\tilde{g}_l\) is a standardized genotype vector (centered and scaled) based on the allele frequency at genetic variant \(l\), while \(L\) denotes the total number of variants across the genome [6]. The matrix is initially an \(N\times N\) dense matrix, but it can be simplified if donors are (sufficiently) unrelated. Introducing a design matrix of donor configuration, \(Z\in\mathbb{R}^{N\times N_d}\), for the \(N_d\) donors (i.e., \(z_{ij}=1\) if sample \(i\) is taken from donor \(j\); otherwise \(z_{ij}=0\)), the kinship matrix can then be approximated as \(R=ZZ^{\top}\). Thus, \(\gamma\) can be expressed as a linear combination of \(N_d\) independent GPs \(\{\gamma_j\sim\mathcal{N}(0,\delta_dK);\ j=1,\ldots,N_d\}\), such that \(\gamma=\sum_{j=1}^{N_d}\gamma_j\odot z_j\), where \(z_j\) denotes the \(j\)th column vector of \(Z\). This approximation is particularly useful for parameter estimation with large \(N_d\) (as discussed in section 2.4).
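The two forms of the kinship matrix described above can be sketched as follows. The helper names are hypothetical, and the standardization shown (centering by \(2p\) and scaling by \(\sqrt{2p(1-p)}\), with \(p\) the sample allele frequency) is one common convention assumed here; the text only states that genotypes are centered and scaled based on the allele frequency.

```python
import numpy as np

def estimate_kinship(G):
    """Estimate R_hat = sum_l g~_l g~_l^T / L from an N x L dosage matrix.

    Assumption: each variant is centered by 2p and scaled by
    sqrt(2p(1-p)); monomorphic variants are dropped beforehand.
    """
    G = np.asarray(G, dtype=float)
    p = G.mean(axis=0) / 2.0                 # sample allele frequencies
    keep = (p > 0.0) & (p < 1.0)
    G, p = G[:, keep], p[keep]
    G_std = (G - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return G_std @ G_std.T / G_std.shape[1]

def donor_design(donor_ids):
    """Build Z (N x N_d) with z_ij = 1 if sample i comes from donor j."""
    donors, idx = np.unique(donor_ids, return_inverse=True)
    Z = np.zeros((len(idx), len(donors)))
    Z[np.arange(len(idx)), idx] = 1.0
    return Z

# Dense estimate from simulated genotypes (6 individuals, 200 variants).
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(6, 200))
R_hat = estimate_kinship(G)

# Block approximation for unrelated donors with repeated samples.
Z = donor_design(["d1", "d1", "d2", "d3", "d3", "d3"])
R_approx = Z @ Z.T
```

In the block approximation, `R_approx` has ones exactly where two samples share a donor, so only the donor labels need to be stored rather than a dense \(N\times N\) matrix.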

When the sample size \(N\) is large, an ordinary GP faces a severe scalability issue due to the dimension of the dense matrix \(K\) being \(N\times N\), resulting in a total computational cost of \(\mathcal{O}(N^3)\). As a result, the application of GP in the GWAS field is hindered, as the sample sizes often reach a million these days. However, there are several alternatives to approximate the full GP model, including the Nyström approximation (low-rank approximation), Projected Process approximation [13], Sparse Pseudo-inputs GP [14], Fully Independent Training Conditional approximation and Variational Free Energy approximation [15]. In this section, we introduce a sparse GP approximation proposed by [16].

The sparse GP is a scalable model using the technique of inducing points [14]. Since the computational cost of the sparse GP is \(\mathcal{O}(NM^2)\) with \(M\) inducing points, we can greatly reduce the computational cost, which is essentially linear in \(N\) under the assumption of \(M\ll N\). Denoting the \(M\) inducing points by \(T=(t_1,\ldots,t_M)^{\top}\) and the corresponding GPs by \(u=(u(t_1),\ldots,u(t_M))^{\top}\), the joint distribution of \(f\) and \(u\) becomes a multivariate normal distribution. Therefore a lower bound of the conditional distribution \(p(y|u)\) can be written as

$$\begin{aligned}\log p(y|u) &= \log\int p(y|f)p(f|u)\,df \ge \int\left[\log p(y|f)\right]p(f|u)\,df\\ &= \log\mathcal{N}(y|\bar{f},\sigma^2I)-\frac{1}{2\sigma^2}\mathrm{tr}\{\tilde{K}_{NN}\}\equiv\mathcal{L}_1,\end{aligned}$$

where

$$\bar{f}=K_{NM}K_{MM}^{-1}u,\quad \tilde{K}_{NN}=K_{NN}-K_{NM}K_{MM}^{-1}K_{MN},$$

and

$$K_{NN}=k(X,X),\quad K_{NM}=k(X,T),\quad K_{MM}=k(T,T).$$

Therefore, the marginal distribution of the output y is approximated by

$$\begin{aligned}p(y) &= \int p(y|u)p(u)\,du \ge \int\exp\{\mathcal{L}_1\}p(u)\,du\\ &= \exp\left\{\log\mathcal{N}(y|0,V)-\frac{1}{2\sigma^2}\mathrm{tr}\{\tilde{K}_{NN}\}\right\}\equiv\exp\{\mathcal{L}_2\},\end{aligned}$$

where \(V=\sigma^2I+K_{NM}K_{MM}^{-1}K_{MN}\). The lower bound \(\mathcal{L}_2\) is referred to as the Titsias bound and can be used for parameter estimation as well as statistical hypothesis testing.
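The bound \(\mathcal{L}_2=\log\mathcal{N}(y|0,V)-\frac{1}{2\sigma^2}\mathrm{tr}\{\tilde{K}_{NN}\}\) can be evaluated directly for a small example. The sketch below is a naive dense implementation written for readability, assuming a one-dimensional squared exponential kernel, a small jitter on \(K_{MM}\) for numerical stability, and toy data; a scalable implementation would instead exploit the low-rank structure of \(V\) to reach the \(\mathcal{O}(NM^2)\) cost.

```python
import numpy as np

def se_kernel(A, B, sigma2=1.0, rho=0.01):
    """Squared exponential kernel between 1-D input arrays A and B."""
    d = A[:, None] - B[None, :]
    return sigma2 * np.exp(-d ** 2 / (2.0 * rho))

def log_normal_zero_mean(y, V):
    """log N(y | 0, V) for a dense covariance matrix V."""
    _, logdet = np.linalg.slogdet(V)
    quad = y @ np.linalg.solve(V, y)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + quad)

def titsias_bound(y, X, T, noise_var, jitter=1e-8):
    """L2 = log N(y | 0, V) - tr(K~_NN) / (2 * noise_var)."""
    K_NN = se_kernel(X, X)
    K_NM = se_kernel(X, T)
    K_MM = se_kernel(T, T) + jitter * np.eye(len(T))
    Q = K_NM @ np.linalg.solve(K_MM, K_NM.T)   # K_NM K_MM^{-1} K_MN
    V = noise_var * np.eye(len(y)) + Q
    return log_normal_zero_mean(y, V) - np.trace(K_NN - Q) / (2.0 * noise_var)

# Toy data: 8 inputs on a grid, noisy sine curve as output.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * X) + 0.3 * rng.standard_normal(8)
noise_var = 0.1

exact = log_normal_zero_mean(y, se_kernel(X, X) + noise_var * np.eye(8))
sparse = titsias_bound(y, X, T=X[::3], noise_var=noise_var)  # M = 3
full = titsias_bound(y, X, T=X, noise_var=noise_var)         # M = N
```

With inducing points at every input (`T = X`), \(\tilde{K}_{NN}\) vanishes and the bound matches the exact log marginal likelihood up to the jitter, while with fewer inducing points it stays below the exact value.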

Selecting the optimal number of inducing points \(M\) and their coordinates is crucial for accurately approximating a GP. Although a larger value of \(M\) provides a better approximation of GP, it is not feasible to increase \(M\) when \(N\) reaches hundreds of thousands in large-scale genetic association studies. Additionally, the accuracy of the GP is influenced by the complexity of nonlinearity of \(y\) and the dimension \(Q\) of input points \(x\). There are few approaches inferring an optimal value of \(M\) from data [17], but the size of the example used in the study is too small (48 genes × 437 samples) to be applied to real-world data. However, it is worth noting that the optimal coordinate of inducing points with a fixed \(M\) can be easily learned from data, as described in the next section.

Genetic association mapping involves performing tens of millions of hypothesis tests. Therefore, it is almost impossible to estimate the parameters of GPs from each pair of trait and variant across the genome, even with use of the sparse approximation mentioned in the last subsection. Furthermore, both the baseline \(\alpha\) and the correction term \(\gamma\) share the characteristic length parameter \(\rho=(\rho_1,\ldots,\rho_Q)^{\top}\) and the inducing points \(T\). This can lead to unstable optimization and prolonged parameter estimation times. To address this issue, we have previously proposed a three-step parameter estimation strategy for performing the statistical hypothesis testing [10]. In particular, estimating these shared parameters from the baseline model containing only \(\alpha\), using a quasi-Newton approach (such as the BFGS method), is sufficient in the first step, because the variance explained by \(\gamma\) is typically much smaller than that explained by \(\alpha\). The three steps are:

\(y=\alpha+\varepsilon\) (baseline model: H0) to estimate \(\rho\) and T.

\(y=\alpha+\gamma+\varepsilon\) (baseline model: H1) to estimate the variance parameters \(\delta_{d}\) and \(\sigma^{2}\). Here \(\hat{\rho}\) and \(\hat{T}\) estimated in H0 are plugged into H1.

\(y=\alpha+\beta\odot g+\gamma+\varepsilon\) (full model: H2) to test whether \(\delta_{g}=0\). Here \(\{\hat{\rho},\hat{T},\hat{\delta}_{d},\hat{\sigma}^{2}\}\) estimated in H0 and H1 are used.

Here the Titsias bounds for these models are given by

$$\mathcal{L}_{2}^{h}=\left\{\begin{array}{ll}\log \mathcal{N}(y|0,V)-\frac{1}{2\sigma^{2}}\mathrm{tr}\{\tilde{K}_{NN}\}, & h=H_{0},\\ \log \mathcal{N}(y|0,V_{d})-\frac{1}{2\sigma^{2}}\mathrm{tr}\{(1+\delta_{d})\tilde{K}_{NN}\}, & h=H_{1},\\ \log \mathcal{N}(y|0,V_{g})-\frac{1}{2\sigma^{2}}\mathrm{tr}\{(1+\delta_{d})\tilde{K}_{NN}+\delta_{g}G\tilde{K}_{NN}G\}, & h=H_{2},\end{array}\right.$$

where

$$V_{d}=V+\delta_{d}(K_{NM}K_{MM}^{-1}K_{MN})\odot R,\quad V_{g}=V_{d}+\delta_{g}GK_{NM}K_{MM}^{-1}K_{MN}G,$$

and \(G=\mathrm{diag}(g)\) denotes the diagonal matrix whose diagonal elements are given by the elements of g. The estimators \(\hat{\rho}\) and \(\hat{T}\) are obtained by maximizing \(\mathcal{L}_{2}^{H_{0}}\) with respect to \(\rho\) and T, and \(\hat{\delta}_{d}\) and \(\hat{\sigma}^{2}\) are obtained by maximizing \(\mathcal{L}_{2}^{H_{1}}\) with respect to \(\delta_{d}\) and \(\sigma^{2}\) given \(\hat{\rho}\) and \(\hat{T}\).
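The three covariance matrices differ only by additive terms, which the following minimal sketch makes explicit. The function name `build_covariances` and its argument layout are hypothetical, and the matrices are formed densely purely for illustration.

```python
import numpy as np

def build_covariances(K_NM, K_MM, R, g, sigma2, delta_d, delta_g):
    """Return (V, V_d, V_g) for the H0, H1 and H2 models, formed densely for clarity."""
    N = K_NM.shape[0]
    Q = K_NM @ np.linalg.solve(K_MM, K_NM.T)               # K_NM K_MM^{-1} K_MN
    V = sigma2 * np.eye(N) + Q                             # H0
    V_d = V + delta_d * (Q * R)                            # H1: elementwise product with kinship R
    V_g = V_d + delta_g * (g[:, None] * Q * g[None, :])    # H2: G Q G with G = diag(g)
    return V, V_d, V_g
```

Setting `delta_g = 0` recovers `V_d`, which is why the parameters estimated under H1 can be reused when testing each variant.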

It is worth noting that, when the kinship matrix R can be expressed as \(R=ZZ^{\top}\) with a lower rank matrix \(Z=(z_{1},\ldots,z_{N_{d}})\) with \(N_{d}<N\), we have

$$V_{d}=V+\delta_{d}\left(K_{NM}K_{MM}^{-1}K_{MN}\right)\odot (ZZ^{\top})=\sigma^{2}I+AB^{-1}A^{\top},$$

where

$$\begin{aligned}A &= (C,\ \mathrm{diag}(z_{1})C,\ \ldots,\ \mathrm{diag}(z_{N_{d}})C),\\ B &= \mathrm{diag}(K_{MM},\ \delta_{d}K_{MM},\ \ldots,\ \delta_{d}K_{MM}),\end{aligned}$$

and \(C=K_{NM}K_{MM}^{-1}\), and B becomes an \(M(N_{d}+1)\times M(N_{d}+1)\) block diagonal matrix. Since the computational complexity of H1 or H2 is \(\mathcal{O}(N_{d}^{2}M^{2}N)\), for large \(N_{d}\) such that \(MN_{d}>N\), the total complexity exceeds \(\mathcal{O}(N^{3})\) and we again face the scalability issue.

However, if the donors in the data are unrelated, we can significantly reduce the memory usage and the computational burden to \(\mathcal{O}(N_{d}M^{2}N)\). This is because the matrix A becomes a sparse matrix, with \(z_{i}^{\top}z_{i^{\prime}}=0\) for \(i\ne i^{\prime}\), resulting in \(NM(N_{d}-1)\) elements out of \(NMN_{d}\) being 0. Additionally, the non-zero elements of A are repeated and identical to the elements of C, and the block diagonal element of B is essentially \(K_{MM}^{-1}\).
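The low-rank rewriting rests on the identity \(Q\odot(ZZ^{\top})=\sum_{i}\mathrm{diag}(z_{i})\,Q\,\mathrm{diag}(z_{i})\), and the sparsity claim follows from each sample belonging to exactly one donor. Both can be checked numerically with a small sketch. The helper names are hypothetical, and Z is assumed here to be a 0/1 donor indicator matrix, which is one way to encode unrelated donors.

```python
import numpy as np

def hadamard_as_sum(Q, Z):
    """Q ⊙ (Z Z^T) written as sum_i diag(z_i) Q diag(z_i), with z_i the i-th column of Z."""
    return sum((z[:, None] * Q) * z[None, :] for z in Z.T)

def stacked_factor(C, Z):
    """A = (C, diag(z_1) C, ..., diag(z_Nd) C), an N x M(N_d + 1) matrix."""
    return np.hstack([C] + [z[:, None] * C for z in Z.T])

# Six samples from three unrelated donors: the columns of Z are orthogonal indicators.
donor = np.array([0, 0, 1, 1, 2, 2])
Z = np.eye(3)[donor]
```

In the donor blocks of `stacked_factor(C, Z)`, every row is non-zero in exactly one block, so \(NM(N_{d}-1)\) of the \(NMN_{d}\) entries vanish and the remaining entries are copies of C.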

To perform GWAS with a GP, it is crucial to reduce the computational time required to map a genetic association for each variant. The Score statistic to test \(\delta_{g}=0\) can be computed from the first derivative of \(\mathcal{L}_{2}^{H_{2}}\) with respect to \(\delta_{g}\), and the variance parameters \(\{\hat{\sigma}^{2},\hat{\delta}_{d}\}\) of \(V_{d}\) are estimated from \(\mathcal{L}_{2}^{H_{1}}\) only once and then reused for every variant to be tested. Therefore, it is well suited to testing tens of millions of variants independently. Using the fact that the first derivative of \(V_{g}^{-1}\) at \(\delta_{g}=0\) depends only on \(V_{d}\), such that

$$\left.\frac{\partial V_{g}^{-1}}{\partial \delta_{g}}\right\vert_{\delta_{g}=0}=-V_{d}^{-1}GK_{NM}K_{MM}^{-1}K_{MN}GV_{d}^{-1},$$

the Score statistic can be explicitly written as

$$S=y^{\top}\hat{V}_{d}^{-1}GK_{NM}K_{MM}^{-1}K_{MN}G\hat{V}_{d}^{-1}y,\tag{2}$$
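Equation (2) is a quadratic form that never requires the N×N matrix \(GK_{NM}K_{MM}^{-1}K_{MN}G\) to be formed. A minimal sketch follows. The function name is hypothetical, and `V_d` is passed as a dense matrix only to keep the example short.

```python
import numpy as np

def score_statistic(y, g, K_NM, K_MM, V_d):
    """S = y^T V_d^{-1} G K_NM K_MM^{-1} K_MN G V_d^{-1} y with G = diag(g)."""
    r = np.linalg.solve(V_d, y)      # V_d^{-1} y, computed once per trait
    b = K_NM.T @ (g * r)             # K_MN G V_d^{-1} y, an M-vector per variant
    return float(b @ np.linalg.solve(K_MM, b))
```

Because `r` does not depend on the genotype vector g, it can be cached across variants, leaving an \(\mathcal{O}(NM)\) product per variant.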

whose distribution is the generalized \(\chi^{2}\) distribution, that is, the distribution of a weighted sum of M independent \(\chi^{2}\) statistics, \(\sum_{m=1}^{M}\lambda_{m}\chi_{m}^{2}\) [8, 10]. It is known that the weights \(\lambda_{m}\) (m=1, …, M) are given by the non-negative eigenvalues of

$$K_{MM}^{-1/2}K_{MN}G\hat{V}_{d}^{-1}GK_{NM}K_{MM}^{-\top/2},$$

where \(K_{MM}^{-1/2}\) can be computed using the Cholesky decomposition of \(K_{MM}=K_{MM}^{\top/2}K_{MM}^{1/2}\).
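A sketch of the weight computation is given below. It assumes the lower-triangular Cholesky factor L with \(K_{MM}=LL^{\top}\) and forms the M×M matrix \(L^{-1}K_{MN}G\hat{V}_{d}^{-1}GK_{NM}L^{-\top}\). Which triangular factor corresponds to \(K_{MM}^{1/2}\) in the notation above is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def score_weights(g, K_NM, K_MM, V_d):
    """Weights lambda_m of the generalized chi-square distribution of the Score statistic."""
    L = np.linalg.cholesky(K_MM)             # K_MM = L L^T, L lower triangular
    B = np.linalg.solve(L, K_NM.T * g)       # L^{-1} K_MN G, shape (M, N)
    H = B @ np.linalg.solve(V_d, B.T)        # M x M symmetric positive semi-definite
    lam = np.linalg.eigvalsh(H)
    return np.clip(lam, 0.0, None)           # remove tiny negative round-off
```

The returned weights share their non-zero values with the eigenvalues of the N×N matrix \(\hat{V}_{d}^{-1}GK_{NM}K_{MM}^{-1}K_{MN}G\), so only an M×M eigenproblem has to be solved per variant.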

To compute the p-value from S, we can use the Davies exact method, implemented in the CompQuadForm package for R. Note that, if we use a linear kernel, S can be simplified as described in [8]. Although the Score-based approach is an easy and quick solution for genome-wide mapping, we should check the asymptotic behavior and the statistical calibration of the Score statistics by using a QQ-plot to verify that the p-values obtained from multiple variants follow a uniform distribution under the null hypothesis.
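For readers without R at hand, the tail probability of the weighted sum can be approximated by simulation. The sketch below is a plain Monte Carlo stand-in, not the Davies method named above. Its resolution is limited to roughly 1/`n_draws`, so it is suitable for sanity checks and calibration plots rather than genome-wide significance thresholds. The function name and defaults are hypothetical.

```python
import numpy as np

def mc_pvalue(S, lam, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_m lam_m * chi2_1 >= S)."""
    lam = np.asarray(lam, dtype=float)
    rng = np.random.default_rng(seed)
    draws = rng.chisquare(1.0, size=(n_draws, lam.size)) @ lam
    # Add-one correction keeps the estimate strictly positive.
    return (1 + np.sum(draws >= S)) / (1 + n_draws)
```

Applying such a function to statistics simulated under the null and sorting the resulting p-values against uniform quantiles gives the QQ-plot check described above.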

If a colocalisation analysis [18] or a Bayesian hierarchical model [19] is considered as a downstream analysis using the test statistics, a Bayes factor can also be computed using the Titsias bounds, such as

$$\log (BF)=\mathcal{L}_{2}^{H_{2}}-\mathcal{L}_{2}^{H_{1}}.$$

Here we would use some empirical values \(\delta_{g}\in\{0.01, 0.1, 0.5\}\) to average the Bayes factor, instead of integrating out \(\delta_{g}\) from \(\mathcal{L}_{2}^{H_{2}}\) [20].
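Averaging over a grid is best done on the log scale to avoid overflow. The sketch below assumes equal weights on the grid values and assumes that the average is taken over the Bayes factors themselves rather than over their logarithms. Both are assumptions, as is the function name.

```python
import numpy as np

def averaged_log_bf(l2_h2_grid, l2_h1):
    """log of the mean Bayes factor over a grid of delta_g values.

    l2_h2_grid: Titsias bounds under H2, one per grid value of delta_g.
    l2_h1:      Titsias bound under H1.
    """
    log_bfs = np.asarray(l2_h2_grid, dtype=float) - l2_h1
    m = log_bfs.max()                                  # log-sum-exp shift
    return float(m + np.log(np.mean(np.exp(log_bfs - m))))
```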

In real genetic association mapping, most genetic associations are indeed static and ubiquitous over the factor x. To capture such a static association, we can come up with the following model

$$y=\alpha_{0}1_{N}+\alpha+\beta_{0}g+\beta\odot g+\gamma_{0}+\gamma+\varepsilon,$$

where \(\alpha_{0}\) denotes the intercept, \(1_{N}\) denotes the N-dimensional vector of all 1s, \(\beta_{0}\) denotes the effect size of the static genetic association, and \(\gamma_{0}\sim\mathcal{N}(0,\sigma^{2}\delta_{d0}R)\) denotes the donor variation which confounds \(\beta_{0}\). For instance, in [8], the static genetic association \(\beta_{0}\) is modeled as a fixed effect, and the dynamic effect \(\beta\) is tested using the Score statistic. On the other hand, in [10], the authors modeled both the static and dynamic associations as a random effect to test via a Bayes factor. In this case, the covariance matrix K can be rewritten as

$$K^{*}=\sigma^{2}e^{-\rho_{0}}1_{N}1_{N}^{\top}+K$$

to estimate the model parameters in (1), and then the hypothesis \(\delta_{g}=0\) for \(\beta\) is tested.
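The modified kernel simply adds a constant to every entry of K, which corresponds to a random intercept shared by all samples. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def add_static_component(K, sigma2, rho0):
    """K* = sigma^2 * exp(-rho0) * 1 1^T + K."""
    N = K.shape[0]
    return sigma2 * np.exp(-rho0) * np.ones((N, N)) + K
```

As \(\rho_{0}\) grows, the constant term shrinks towards zero and \(K^{*}\) reduces to the purely dynamic kernel K.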

Note here that the kernel parameter \(\rho_{0}\) is not necessarily common and shared across \(\alpha\), \(\beta\) and \(\gamma\). Indeed, in [10], the authors estimated \(\hat{\rho}_{0}^{\alpha}\) and \(\hat{\rho}_{0}^{\gamma}\) independently in \(\mathcal{L}_{2}^{H_{1}}\). To compute the Score statistic, the authors assumed that \(\hat{\rho}_{0}^{\beta}=\hat{\rho}_{0}^{\gamma}\) for \(\beta\) and \(\gamma\), because the ratio of the static effect to the dynamic effect can be the same for cis and trans genetic effects.

In longitudinal studies, the factor x is typically observed explicitly (e.g., donors' age or the physical locations where samples were taken). This makes it straightforward to perform genetic association mapping along x using the Score statistics or Bayes factors, as described above. However, this is often not the case for molecular studies, and therefore we need to estimate the underlying biological state from the data.

In single-cell biology, the hidden cellular state x is often referred to as "pseudotime", and principal component analysis is normally used to estimate it as part of dimension reduction [21]. The Gaussian process latent variable model (GPLVM) is a strong alternative for extracting the pseudotime when the molecular phenotype gradually changes along pseudotime x in a nonlinear fashion [22, 23].

We have also proposed a GPLVM that uses the baseline model H0 to estimate the latent variable X from single-cell RNA-seq data (see Section 3 for more details). Let \(Y=(y_{1},\ldots,y_{J})\) be the gene expression matrix of J genes, whose jth column is the vector of gene expression for gene j. The Titsias lower bound of the GPLVM based on the baseline model H0 can then be written as

$$\log p(Y|X)\ge \log \mathcal{MN}(Y|0,\Sigma ,I+K_{NM}K_{MM}^{-1}K_{MN})-\frac{J}{2}\mathrm{tr}\{\tilde{K}_{NN}\}=\mathcal{L}_{2}.$$

To obtain the optimal cellular state \(\hat{X}\), this lower bound can be maximized with respect to \(\{\rho, X, T, \Sigma\}\) [10, 24]. Here \(\Sigma=\mathrm{diag}(\sigma_{j}^{2};j=1,\ldots,J)\) denotes the residual variance parameters of J genes, and \(\mathcal{MN}(\cdot)\) denotes the matrix normal distribution. To ensure the uniqueness of the model parameters, the variance parameter in the kernel function is set to \(\sigma^{2}=1\). In addition, to maintain the uniqueness of the latent variable estimation, a prior probability on X is required. It is quite common to assume independent standard normal distributions for each of the elements of X, that is, \(X\sim\mathcal{MN}(0,I,I)\) [24], although there are multiple alternatives to consider depending on the nature of the modeled data [10, 23].

In the parameter estimation, the limited-memory BFGS method can be used to implement GPLVM for large N. In addition, the stochastic variational Bayes approach can be used to fit GPLVM to larger data sets, while reducing the fitting time [25,26,27].

For a non-Gaussian output y, the Titsias bound \(\mathcal{L}_{2}\) is not analytically available. However, for the Poisson distribution case, a lower bound of the conditional probability \(p(y|u)\) can be computed as follows:

$$\mathcal{L}_{1}=\sum_{i}\left[-\log (y_{i}!)+y_{i}\bar{f}_{i}-\exp \left(\bar{f}_{i}+\frac{\tilde{k}_{ii}}{2}\right)\right],$$

where \(\tilde{k}_{ii}\) denotes the ith diagonal element of \(\tilde{K}_{NN}\). Let \(\nu_{i}\) and \(w_{i}\) be the working response and the iterative weight of the GLM for the ith sample, such that

$$\nu_{i}=\bar{f}_{i}+(y_{i}-w_{i})/w_{i}\quad \mathrm{and}\quad w_{i}=\exp \left(\bar{f}_{i}+\frac{\tilde{k}_{ii}}{2}\right)$$

for i=1, …, N, the optimal \(\hat{u}\) which maximizes \(\exp\{\mathcal{L}_{1}\}p(u)\) satisfies

$$\left(K_{MM}^{-1}+K_{MM}^{-1}K_{MN}WK_{NM}K_{MM}^{-1}\right)u=K_{MM}^{-1}K_{MN}W\nu ,\tag{3}$$

where \(W=\mathrm{diag}(w_{i};\ i=1,\ldots,N)\), which suggests

$$\nu |u\sim \mathcal{N}(\bar{f},W^{-1})$$

as described elsewhere [28]. Therefore, we can maximize

$$\mathcal{L}_{2}=\log \mathcal{N}(\nu |0,W^{-1}+K_{NM}K_{MM}^{-1}K_{MN})$$

with respect to \(\{\sigma^{2},\rho\}\), where \(u=\hat{u}\) is iteratively updated as in (3). Thus, to obtain the Score statistic for non-Gaussian y, we replace y with \(\nu\) and \(\hat{V}_{d}\) with \(W^{-1}+AB^{-1}A^{\top}\) in (2).
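The iteration behind (3) is an iteratively reweighted least squares update, sketched below under two stated assumptions: that \(\bar{f}=K_{NM}K_{MM}^{-1}u\), and that the right-hand side of (3) is \(K_{MM}^{-1}K_{MN}W\nu\) so that both sides are M-dimensional. The function names, the zero starting value and the fixed iteration count are illustrative choices, not the authors' implementation.

```python
import numpy as np
from math import lgamma

def poisson_l1(y, f_bar, k_tilde_diag):
    """L1 = sum_i [ -log(y_i!) + y_i f_i - exp(f_i + k~_ii / 2) ]."""
    log_fact = np.array([lgamma(v + 1.0) for v in y])
    return float(np.sum(-log_fact + y * f_bar - np.exp(f_bar + 0.5 * k_tilde_diag)))

def irls_update_u(y, K_NM, K_MM, k_tilde_diag, n_iter=50):
    """Iterate the working-response update for u until the fixed point of (3)."""
    K_inv = np.linalg.inv(K_MM)
    C = K_NM @ K_inv                          # assumption: f_bar = C u
    u = np.zeros(K_MM.shape[0])
    for _ in range(n_iter):
        f_bar = C @ u
        w = np.exp(f_bar + 0.5 * k_tilde_diag)     # iterative weights
        nu = f_bar + (y - w) / w                   # working response
        lhs = K_inv + C.T @ (w[:, None] * C)       # K^{-1} + C^T W C
        u = np.linalg.solve(lhs, C.T @ (w * nu))   # C^T W nu
    return u
```

At the fixed point, the gradient \(C^{\top}(y-w)-K_{MM}^{-1}u\) of \(\mathcal{L}_{1}+\log p(u)\) vanishes, which provides a simple convergence check.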

A binary output y is more complicated than the Poisson case, because it is not even possible to compute the \(\mathcal{L}_{1}\) bound analytically with the logit or probit link function. For the logit link function, several useful alternatives to the \(\mathcal{L}_{1}\) bound have been proposed [29]. For the probit link function, [30] proposed an approximation of \(\mathcal{L}_{1}\) using Gauss-Hermite quadrature. However, in both cases, the computational cost is much higher than in the Poisson case, and it is rather impractical to conduct a large genome-wide association mapping at this moment.

See the rest here:
Genetic association mapping leveraging Gaussian processes | Journal of Human Genetics - Nature.com

Minimally destructive hDNA extraction method for retrospective genetics of pinned historical Lepidoptera specimens … – Nature.com

Read the original:
Minimally destructive hDNA extraction method for retrospective genetics of pinned historical Lepidoptera specimens ... - Nature.com