Category Archives: Genetics

Women have a higher genetic risk for PTSD, according to study by VCU and Swedish researchers – VCU News

By Olivia Trani

Women are twice as likely as men to develop post-traumatic stress disorder, but the factors contributing to this disparity have largely remained unsettled. A research team led by Virginia Commonwealth University and Lund University in Sweden conducted the largest twin-sibling study of PTSD to date to shed light on how genetics may play a role. Their results, published Tuesday in the American Journal of Psychiatry, are the first to demonstrate that women have a higher genetic risk for the disorder compared with men.

By analyzing health data from over 16,000 twin pairs and 376,000 sibling pairs, the research team discovered that heritability for PTSD was 7 percentage points higher in women (35.4%) than in men (28.6%). They also found evidence that the genes that make up the heritable risk for PTSD vary between the two sexes.

The researchers say their findings could inform strategies for PTSD prevention and intervention following a traumatic event, as well as help address stigmas related to women's mental health.

"Women are at higher risk for developing PTSD than men, even when controlling for the type of trauma, income level, social support and other environmental factors. Some of the theories as to why that is have frankly been unkind to women, such as attributing the sex difference to a weakness or lack of ability to cope," said Ananda B. Amstadter, Ph.D., a professor in the VCU School of Medicine's departments of Psychiatry and Human and Molecular Genetics and lead author of the study. "I think this study can help move the narrative that people can have an inherited biological risk for PTSD, and that this genetic risk is greater in women."

Nearly 70% of the global population are exposed to at least one traumatic event in their lifetime, such as physical or sexual assault, a motor vehicle accident, exposure to combat or a natural disaster. About 6% of those who are exposed to trauma develop PTSD. Amstadter's research focuses on understanding the conditions that might increase or decrease a person's risk of experiencing PTSD, particularly how a person's genes impact their risk.

"If you think of risk for PTSD like a pie chart, we're trying to better understand what factors make up the pieces of this pie," she said. "Some of the risk is influenced by a person's environment, such as the experiences they have while growing up. On the other hand, some of the risk will be influenced by the genes they inherit from their parents."

Previous research has looked into how genes influence the likelihood of developing PTSD, but the study conducted by Amstadter and her colleagues is the first of its kind to investigate how genetic risk varies by sex.

For this project, the research team examined anonymized clinical data from Swedish population-based registries. Their analysis consisted of more than 400,000 pairs of twins or siblings born up to two years apart in Sweden between 1955 and 1980. Studies on twins and siblings, because of their genetic similarities, can help researchers determine how a person's genes influence their risk for mental illnesses.

"Every time a person within this age group interacts with Sweden's health care system, whether that's visiting their primary care doctor, filling a prescription or going to the hospital, that information is recorded in their national registries. This kind of data is a really powerful tool for addressing questions related to genetic risk for medical conditions," Amstadter said. "Prior PTSD studies involving twins and siblings have typically only included a few thousand individuals. Because our sample size was so large in comparison, we were able to make calculations with a higher degree of certainty."

Through statistical modeling, the researchers calculated how much a person's genetic makeup influenced their likelihood of developing PTSD following a traumatic event. In finding that PTSD was 35.4% heritable in women but only 28.6% heritable in men, they demonstrated that women have a higher biological risk for PTSD.

Their models also revealed that the genes associated with PTSD were highly correlated (0.81) but not entirely the same between men and women. This suggests that the genetic underpinnings of sex hormones, like testosterone, estrogen and progesterone, may be involved in the development of PTSD. The research team is collaborating with the Psychiatric Genomics Consortium to identify the molecular genetic variants that may contribute to sex-specific pathways of risk.

Amstadter conducted the research at the Virginia Institute for Psychiatric and Behavioral Genetics at VCU alongside co-authors Shannon Cusack, Ph.D., a postdoctoral scholar; and Kenneth Kendler, M.D., the institute's director, professor of psychiatry and eminent scholar. They collaborated with Lund University co-authors Sara Lönn, Ph.D.; Jan Sundquist, M.D., Ph.D.; and Kristina Sundquist, M.D., Ph.D.

Genetics study points to potential treatments for restless leg syndrome – University of Cambridge news

Restless leg syndrome can cause an unpleasant crawling sensation in the legs and an overwhelming urge to move them. Some people experience the symptoms only occasionally, while others get symptoms every day. Symptoms are usually worse in the evening or at night-time and can severely impair sleep.

Despite the condition being relatively common (up to one in 10 older adults experience symptoms, while 2-3% are severely affected and seek medical help), little is known about its causes. People with restless leg syndrome often have other conditions, such as depression or anxiety, cardiovascular disorders, hypertension, and diabetes, but the reason why is not known.

Previous studies had identified 22 genetic risk loci, that is, regions of our genome that contain changes associated with increased risk of developing the condition. But there are still no known biomarkers, such as genetic signatures, that could be used to objectively diagnose the condition.

To explore the condition further, an international team led by researchers at the Helmholtz Munich Institute of Neurogenomics, Institute of Human Genetics of the Technical University of Munich (TUM) and the University of Cambridge pooled and analysed data from three genome-wide association studies. These studies compared the DNA of patients and healthy controls to look for differences more commonly found in those with restless leg syndrome. By combining the data, the team was able to create a powerful dataset with more than 100,000 patients and over 1.5 million unaffected controls.

The results of the study are published today in Nature Genetics.

Co-author Dr Steven Bell from the University of Cambridge said: "This study is the largest of its kind into this common but poorly understood condition. By understanding the genetic basis of restless leg syndrome, we hope to find better ways to manage and treat it, potentially improving the lives of many millions of people affected worldwide."

The team identified over 140 new genetic risk loci, increasing the number known eight-fold to 164, including three on the X chromosome. The researchers found no strong genetic differences between men and women, despite the condition being twice as common in women as it is in men. This suggests that a complex interaction of genetics and the environment (including hormones) may explain the gender differences we observe in real life.

Two of the genetic differences identified by the team involve genes known as glutamate receptors 1 and 4 respectively, which are important for nerve and brain function. These could potentially be targeted by existing drugs, such as anticonvulsants like perampanel and lamotrigine, or used to develop new drugs. Early trials have already shown positive responses to these drugs in patients with restless leg syndrome.

The researchers say it would be possible to use basic information like age, sex, and genetic markers to accurately rank who is more likely to have severe restless leg syndrome in nine cases out of ten.

To understand how restless leg syndrome might affect overall health, the researchers used a technique called Mendelian randomisation. This uses genetic information to examine cause-and-effect relationships. It revealed that the syndrome increases the risk of developing diabetes.

Although low levels of iron in the blood are thought to trigger restless leg syndrome (because they can lead to a fall in the neurotransmitter dopamine), the researchers did not find strong genetic links to iron metabolism. However, they say they cannot completely rule it out as a risk factor.

Professor Juliane Winkelmann from TUM, one of the senior authors of the study, said: "For the first time, we have achieved the ability to predict restless leg syndrome risk. It has been a long journey, but now we are empowered to not only treat but even prevent the onset of this condition in our patients."

Professor Emanuele Di Angelantonio, a co-author of the study and Director of the NIHR and NHS Blood and Transplant-funded Research Unit in Blood Donor Health and Behaviour, added: "Given that low iron levels are thought to trigger restless leg syndrome, we were surprised to find no strong genetic links to iron metabolism in our study. It may be that the relationship is more complex than we initially thought, and further work is required."

The dataset included the INTERVAL study of England's blood donors in collaboration with NHS Blood and Transplant.

A full list of funders can be found in the study paper.

Reference: Schormair et al. Genome-wide meta-analyses of restless legs syndrome yield insights into genetic architecture, disease biology, and risk prediction. Nature Genetics; 5 June 2024; DOI: 10.1038/s41588-024-01763-1

Genetic association mapping leveraging Gaussian processes | Journal of Human Genetics – Nature.com

Gaussian Process (GP)

A Gaussian process (GP) is a type of stochastic process whose application in machine learning enables us to infer a nonlinear function \(f(x)\) over a continuous domain \(x\) (e.g., time or space). Precisely, \(f(x)\) is a draw from a GP if \(\{f(x_1), \ldots, f(x_N)\}\) follows an \(N\)-dimensional multivariate normal distribution for the \(N\) input data points \(\{x_i\}_{i=1}^{N}\). Denoting \(X = (x_1, \ldots, x_N)^\top\) and \(f = (f(x_1), \ldots, f(x_N))^\top\), a GP is formally written as

$$f \sim \mathcal{N}(m(X), k(X,X)),$$

where \(m(\cdot)\) denotes the mean function and \(k(\cdot,\cdot)\) denotes the kernel function [11]. The simplest kernel function would be the linear kernel, such that \(k(X, X) = \sigma^2 X X^\top\), while the automatic relevance determination squared exponential (ARD-SE) kernel is defined as

$$k(x_j, x_k) = \sigma^2 \exp\left[-\sum_{q=1}^{Q} \frac{(x_{jq} - x_{kq})^2}{2\rho_q}\right]$$

for the \((j, k)\) element of \(k(X, X)\), where \(x_j, x_k \in \mathbb{R}^{Q}\) are \(Q\)-dimensional input vectors. Here \(\sigma^2\) is the kernel variance parameter and \(\rho = (\rho_1, \ldots, \rho_Q)^\top\) is the vector of characteristic length scales, whose inverse determines the relevance of each element of the input vector. Typically, the mean function is defined as \(m(X) = 0\).
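As an illustration of the two kernels defined above, the following is a minimal NumPy sketch. The function names `ard_se_kernel` and `linear_kernel`, the parameter defaults and the toy inputs are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def ard_se_kernel(X1, X2, sigma2=1.0, rho=None):
    """ARD-SE kernel: sigma2 * exp(-sum_q (x_jq - x_kq)^2 / (2 * rho_q)).

    X1 and X2 are arrays of shape (N1, Q) and (N2, Q); rho has length Q.
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    Q = X1.shape[1]
    rho = np.ones(Q) if rho is None else np.asarray(rho, dtype=float)
    diff = X1[:, None, :] - X2[None, :, :]            # shape (N1, N2, Q)
    return sigma2 * np.exp(-0.5 * np.sum(diff ** 2 / rho, axis=-1))

def linear_kernel(X1, X2, sigma2=1.0):
    """Linear kernel: sigma2 * X1 X2^T."""
    return sigma2 * np.asarray(X1) @ np.asarray(X2).T

# Toy example: 5 points in Q = 2 dimensions (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
K = ard_se_kernel(X, X, sigma2=2.0, rho=[1.0, 4.0])
K_lin = linear_kernel(X, X)
```

A larger \(\rho_q\) flattens the kernel along dimension \(q\), so that dimension contributes less to the covariance, which is the sense in which the inverse length scale measures relevance.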

Because the GP yielding \(f(x)\) has various useful properties inherited from the normal distribution, it can be used to estimate a nonlinear function \(f(X)\) from output data \(y = (y_1, \ldots, y_N)^\top\) along a continuous factor \(X\). The extended linear model \(y = f(X) + \varepsilon\) is referred to as GP regression and is widely used in machine learning [12]. This model can be used to map dynamic genetic associations for normalized gene expression or other common complex quantitative traits (e.g., human height) along the continuous factor \(x\) (e.g., cellular state or donor's age). Denoting the genotype vector by \(g = (g_1, \ldots, g_N)^\top\) and the kinship matrix among \(N\) individuals by \(R\), the mapping model proposed by us and others [8, 10] can be expressed as follows:

$$y = \alpha + \beta \odot g + \gamma + \varepsilon, \tag{1}$$

where

$$\alpha \sim \mathcal{N}(0, K), \quad \beta \sim \mathcal{N}(0, \delta_g K), \quad \gamma \sim \mathcal{N}(0, \delta_d K \odot R)$$

are all GPs with similar covariance matrices, where \(\odot\) denotes the element-wise product between two vectors or matrices with the same dimensions, \(K = k(X, X)\) denotes the covariance matrix with a kernel function, and \(\varepsilon\) denotes the residuals. Intuitively, \(\alpha\) models the average baseline change of \(y\) in relation to \(x\), while \(\beta\) represents the dynamic genetic effect along \(x\). The effect size \(\beta\) is multiplied by the genotype vector \(g\), indicating that the output \(y_i\) varies between different genotype groups (\(g_i \in \{0, 1, 2\}\)). In fact, the effect size \(\beta(x_i)\) is additive to the baseline \(\alpha(x_i)\) at each \(x_i\), which is the same as in standard association mapping. Here statistical hypothesis testing is performed under the null hypothesis of \(\delta_g = 0\), as the strength of genetic association is determined by \(\delta_g\).
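To make the roles of the terms in model (1) concrete, here is a small simulation sketch. It assumes a one-dimensional factor, a squared exponential kernel with \(\sigma^2 = 1\), unrelated donors with repeated samples, and independent Gaussian residuals; the parameter values and the helper `draw_gp` are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_donors, per_donor = 20, 5
N = n_donors * per_donor
donor = np.repeat(np.arange(n_donors), per_donor)            # donor index of each sample
x = rng.uniform(0.0, 1.0, size=N)                            # continuous factor x
g = rng.integers(0, 3, size=n_donors)[donor].astype(float)   # genotype in {0, 1, 2}

K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1)      # SE kernel, rho = 0.1 (assumed)
R = (donor[:, None] == donor[None, :]).astype(float)         # block kinship for unrelated donors
delta_g, delta_d, noise_var = 0.5, 0.3, 0.1                  # assumed variance parameters

def draw_gp(cov, jitter=1e-6):
    """Draw one sample from N(0, cov) using a jittered Cholesky factor."""
    chol = np.linalg.cholesky(cov + jitter * np.eye(len(cov)))
    return chol @ rng.standard_normal(len(cov))

alpha = draw_gp(K)                    # baseline change along x
beta = draw_gp(delta_g * K)           # dynamic genetic effect along x
gamma = draw_gp(delta_d * K * R)      # donor-specific correction term
eps = rng.normal(0.0, np.sqrt(noise_var), size=N)
y = alpha + beta * g + gamma + eps    # model (1)
```

Setting `delta_g = 0` in this sketch removes the genotype-dependent curve, which corresponds to the null hypothesis being tested.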

It is important to note that the model (1) includes a correction term \(\gamma\) that accounts for the between-donor variation of dynamic changes along \(x\), particularly when multiple data points are measured from the same donor or samples are taken from related donors. This term is essential for statistical calibration of the genetic effect \(\beta\), because other genetic associations scattered over the genome (trans effects) can confound the target genotype effect. Therefore, to adjust for the confounding effect, we need to include the extra GP \(\gamma\), which is drawn from a normal distribution whose covariance matrix is \(K\) multiplied element-wise by the kinship matrix \(R\).

Here, the kinship matrix is estimated by \(\hat{R} = \sum_{l=1}^{L} \tilde{g}_l \tilde{g}_l^\top / L\) using genome-wide variants \(g_l\) (\(l = 1, \ldots, L\)), where \(\tilde{g}_l\) is a standardized genotype vector (centered and scaled) based on the allele frequency at genetic variant \(l\), while \(L\) denotes the total number of variants across the genome [6]. The matrix is initially an \(N \times N\) dense matrix, but it can be simplified if donors are (sufficiently) unrelated. Let us introduce a design matrix of donor configuration, \(Z \in \mathbb{R}^{N \times N_d}\), for the \(N_d\) donors (i.e., \(z_{ij} = 1\) if the sample \(i\) is taken from the donor \(j\); otherwise \(z_{ij} = 0\)). The kinship matrix can then be approximated as \(R = ZZ^\top\). Thus, \(\gamma\) can be expressed as a linear combination of \(N_d\) independent GPs \(\{\gamma_j \sim \mathcal{N}(0, \delta_d K);\ j = 1, \ldots, N_d\}\), such that \(\gamma = \sum_{j=1}^{N_d} \gamma_j \odot z_j\), where \(z_j\) denotes the \(j\)th column vector of \(Z\). This approximation is particularly useful for parameter estimation with large \(N_d\) (as discussed in section 2.4).
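The kinship estimator and its \(ZZ^\top\) approximation can be sketched as follows. The genotypes are simulated for unrelated donors, and standardization uses the true allele frequency \(p\) (mean \(2p\), variance \(2p(1-p)\)); these choices, and all variable names, are assumptions for illustration rather than the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)
n_donors, n_variants = 50, 2000

# Simulate unrelated donors: genotype counts in {0, 1, 2} at each variant.
freq = rng.uniform(0.1, 0.9, size=n_variants)
geno = rng.binomial(2, freq, size=(n_donors, n_variants)).astype(float)

# Standardize each variant by its allele frequency, then average outer products.
g_tilde = (geno - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
R_hat = g_tilde @ g_tilde.T / n_variants          # donors x donors kinship estimate

# With repeated samples per donor, Z maps samples to donors and R is approximated by Z Z^T.
samples_per_donor = 3
donor = np.repeat(np.arange(n_donors), samples_per_donor)
Z = np.eye(n_donors)[donor]                       # shape (N, N_d), one 1 per row
R_approx = Z @ Z.T                                # 1 within a donor, 0 between donors
```

For unrelated donors `R_hat` is close to the identity matrix, which is why replacing it with the block structure `Z @ Z.T` loses little information.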

When the sample size \(N\) is large, an ordinary GP faces a severe scalability issue because the dense matrix \(K\) has dimension \(N \times N\), resulting in a total computational cost of \(\mathcal{O}(N^3)\). As a result, the application of GP in the GWAS field is hindered, as sample sizes often reach a million these days. However, there are several alternatives to approximate the full GP model, including the Nyström approximation (low-rank approximation), Projected Process approximation [13], Sparse Pseudo-inputs GP [14], Fully Independent Training Conditional approximation and Variational Free Energy approximation [15]. In this section, we introduce a sparse GP approximation proposed by [16].

The sparse GP is a scalable model using the technique of inducing points [14]. Since the computational cost of the sparse GP is \(\mathcal{O}(NM^2)\) with \(M\) inducing points, we can greatly reduce the computational cost, which is essentially linear in \(N\) under the assumption of \(M \ll N\). Denoting the \(M\) inducing points by \(T = (t_1, \ldots, t_M)^\top\) and the corresponding GP values by \(u = (u(t_1), \ldots, u(t_M))^\top\), the joint distribution of \(f\) and \(u\) is a multivariate normal distribution. Therefore a lower bound of the log conditional distribution \(\log p(y \mid u)\) can be written as

$$\begin{aligned}\log p(y \mid u) &= \log \int p(y \mid f)\, p(f \mid u)\, df \ge \int \left[\log p(y \mid f)\right] p(f \mid u)\, df \\ &= \log \mathcal{N}(y \mid \bar{f}, \sigma^2 I) - \frac{1}{2\sigma^2} \mathrm{tr}\{\tilde{K}_{NN}\} \equiv \mathcal{L}_1,\end{aligned}$$

where

$$\bar{f} = K_{NM} K_{MM}^{-1} u, \quad \tilde{K}_{NN} = K_{NN} - K_{NM} K_{MM}^{-1} K_{MN},$$

and

$$K_{NN} = k(X, X), \quad K_{NM} = k(X, T), \quad K_{MM} = k(T, T).$$

Therefore, the marginal distribution of the output \(y\) is bounded below by

$$\begin{aligned}p(y) &= \int p(y \mid u)\, p(u)\, du \ge \int \exp\{\mathcal{L}_1\}\, p(u)\, du \\ &= \exp\left\{\log \mathcal{N}(y \mid 0, V) - \frac{1}{2\sigma^2} \mathrm{tr}\{\tilde{K}_{NN}\}\right\} \equiv \exp\{\mathcal{L}_2\},\end{aligned}$$

where \(V = \sigma^2 I + K_{NM} K_{MM}^{-1} K_{MN}\). The lower bound \(\mathcal{L}_2\) is referred to as the Titsias bound and can be used for parameter estimation as well as statistical hypothesis testing.
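The Titsias bound can be checked numerically against the exact log marginal likelihood on a small problem. The sketch below forms dense \(N \times N\) matrices for readability, so it does not achieve the \(\mathcal{O}(NM^2)\) cost; the kernel, data, jitter value and function names are assumptions made for this example.

```python
import numpy as np

def se_kernel(a, b, rho=0.1):
    """One-dimensional squared exponential kernel with sigma^2 = 1."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / rho)

def log_mvn(y, cov):
    """log N(y | 0, cov), computed through a Cholesky factor."""
    chol = np.linalg.cholesky(cov)
    a = np.linalg.solve(chol, y)
    return -0.5 * a @ a - np.log(np.diag(chol)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def titsias_bound(y, x, t, noise_var, rho=0.1, jitter=1e-6):
    """L_2 = log N(y | 0, V) - tr(K~_NN) / (2 sigma^2), with V = sigma^2 I + K_NM K_MM^-1 K_MN."""
    K_nm = se_kernel(x, t, rho)
    K_mm = se_kernel(t, t, rho) + jitter * np.eye(len(t))   # jitter for numerical stability
    Q_nn = K_nm @ np.linalg.solve(K_mm, K_nm.T)
    V = noise_var * np.eye(len(x)) + Q_nn
    trace_term = np.trace(se_kernel(x, x, rho) - Q_nn)
    return log_mvn(y, V) - trace_term / (2.0 * noise_var)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0.0, 1.0, 200))
noise_var = 0.1
y = np.sin(2 * np.pi * x) + rng.normal(0.0, np.sqrt(noise_var), 200)

exact = log_mvn(y, noise_var * np.eye(200) + se_kernel(x, x))
bound_few = titsias_bound(y, x, np.linspace(0.0, 1.0, 3), noise_var)
bound_more = titsias_bound(y, x, np.linspace(0.0, 1.0, 9), noise_var)
```

In this setup the bound never exceeds the exact value, and the nine-point grid, which contains the three-point grid, gives the tighter bound.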

Selecting the optimal number of inducing points \(M\) and their coordinates is crucial for accurately approximating a GP. Although a larger value of \(M\) provides a better approximation of GP, it is not feasible to increase \(M\) when \(N\) reaches hundreds of thousands in large-scale genetic association studies. Additionally, the accuracy of the GP is influenced by the complexity of the nonlinearity of \(y\) and the dimension \(Q\) of the input points \(x\). There are a few approaches for inferring an optimal value of \(M\) from data [17], but the size of the example used in the study is too small (48 genes × 437 samples) to be applied to real-world data. However, it is worth noting that the optimal coordinates of inducing points with a fixed \(M\) can be easily learned from data, as described in the next section.

Genetic association mapping involves performing tens of millions of hypothesis tests. Therefore, it is almost impossible to estimate the parameters of GPs for each pair of trait and variant across the genome, even with the sparse approximation mentioned in the last subsection. Furthermore, both the baseline \(\alpha\) and the correction term \(\gamma\) share the characteristic length parameter \(\rho = (\rho_1, \ldots, \rho_Q)^\top\) and the inducing points \(T\). This can lead to unstable optimization and prolonged parameter estimation times. To address this issue, we have previously proposed a three-step parameter estimation strategy for performing the statistical hypothesis testing [10]. In particular, optimizing \(\rho\) with respect to \(\alpha\) alone using a quasi-Newton approach (such as the BFGS method) is sufficient in the first step, because the variance explained by \(\gamma\) is typically much smaller than that explained by \(\alpha\). The three steps are:

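1. \(y = \alpha + \varepsilon\) (baseline model: \(H_0\)) to estimate \(\rho\) and \(T\).

2. \(y = \alpha + \gamma + \varepsilon\) (baseline model: \(H_1\)) to estimate the variance parameters \(\delta_d\) and \(\sigma^2\). Here \(\hat{\rho}\) and \(\hat{T}\) estimated in \(H_0\) are plugged into \(H_1\).

3. \(y = \alpha + \beta \odot g + \gamma + \varepsilon\) (full model: \(H_2\)) to test whether \(\delta_g = 0\). Here \(\{\hat{\rho}, \hat{T}, \hat{\delta}_d, \hat{\sigma}^2\}\) estimated in \(H_0\) and \(H_1\) are used.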

Here the Titsias bounds for these models are given by

$$\mathcal{L}_2^{h} = \left\{\begin{array}{ll}\log \mathcal{N}(y \mid 0, V) - \frac{1}{2\sigma^2} \mathrm{tr}\{\tilde{K}_{NN}\}, & h = H_0, \\ \log \mathcal{N}(y \mid 0, V_d) - \frac{1}{2\sigma^2} \mathrm{tr}\{(1 + \delta_d) \tilde{K}_{NN}\}, & h = H_1, \\ \log \mathcal{N}(y \mid 0, V_g) - \frac{1}{2\sigma^2} \mathrm{tr}\{(1 + \delta_d) \tilde{K}_{NN} + \delta_g G \tilde{K}_{NN} G\}, & h = H_2, \end{array}\right.$$

where

$$V_d = V + \delta_d (K_{NM} K_{MM}^{-1} K_{MN}) \odot R, \quad V_g = V_d + \delta_g G K_{NM} K_{MM}^{-1} K_{MN} G,$$

and \(G = \mathrm{diag}(g)\) denotes the diagonal matrix whose diagonal elements are given by the elements of \(g\). The estimators \(\hat{\rho}\) and \(\hat{T}\) are obtained by maximizing \(\mathcal{L}_2^{H_0}\) with respect to \(\rho\) and \(T\), and \(\hat{\delta}_d\) and \(\hat{\sigma}^2\) are obtained by maximizing \(\mathcal{L}_2^{H_1}\) with respect to \(\delta_d\) and \(\sigma^2\) given \(\hat{\rho}\) and \(\hat{T}\).

It is worth noting that, when the kinship matrix \(R\) can be expressed as \(R = ZZ^\top\) with a lower rank matrix \(Z = (z_1, \ldots, z_{N_d})\) with \(N_d\)

$$V_d = V + \delta_d \left(K_{NM} K_{MM}^{-1} K_{MN}\right) \odot (ZZ^\top) = \sigma^2 I + A B^{-1} A^\top,$$

where

$$\begin{aligned}A &= (C, \mathrm{diag}(z_1) C, \ldots, \mathrm{diag}(z_{N_d}) C), \\ B &= \mathrm{diag}(K_{MM}, \delta_d K_{MM}, \ldots, \delta_d K_{MM}),\end{aligned}$$

and \(C = K_{NM} K_{MM}^{-1}\), and \(B\) becomes an \(M(N_d + 1) \times M(N_d + 1)\) block diagonal matrix. Since the computational complexity of \(H_1\) or \(H_2\) is \(\mathcal{O}(N_d^2 M^2 N)\), for large \(N_d\) such that \(M N_d > N\), the total complexity is over \(\mathcal{O}(N^3)\) and we again face the scalability issue.

However, if the donors in the data are unrelated, we can significantly reduce the memory usage and the computational burden to \(\mathcal{O}(N_d M^2 N)\). This is because the matrix \(A\) becomes a sparse matrix, with \(z_i^\top z_{i'} = 0\) for \(i \ne i'\), resulting in \(NM(N_d - 1)\) elements out of \(NMN_d\) being 0. Additionally, non-zero elements of \(A\) are repeated and identical to the elements of \(C\), and the block diagonal element of \(B\) is essentially \(K_{MM}^{-1}\).

To perform GWAS with GP, it is crucial to reduce the computational time required to map a genetic association for each variant. The Score statistic to test \(\delta_g = 0\) can be computed from the first derivative of \(\mathcal{L}_2^{H_2}\) with respect to \(\delta_g\), and the variance parameters \(\{\hat{\sigma}^2, \hat{\delta}_d\}\) of \(V_d\) are estimated from \(\mathcal{L}_2^{H_1}\) only once for all variants to be tested. This makes the approach well suited to testing tens of millions of variants independently. Using the fact that the first derivative of \(V_g^{-1}\) at \(\delta_g = 0\) depends only on \(V_d\), such that

$$\left.\frac{\partial V_g^{-1}}{\partial \delta_g}\right\vert_{\delta_g = 0} = -V_d^{-1} G K_{NM} K_{MM}^{-1} K_{MN} G V_d^{-1},$$

the Score statistic can be explicitly written as

$$S = y^\top \hat{V}_d^{-1} G K_{NM} K_{MM}^{-1} K_{MN} G \hat{V}_d^{-1} y, \tag{2}$$

whose distribution is the generalized \(\chi^2\) distribution, that is, the distribution of a weighted sum of \(M\) independent \(\chi^2\) statistics, \(\sum_{m=1}^{M} \lambda_m \chi_m^2\) [8, 10]. It is known that the weights \(\lambda_m\) (\(m = 1, \ldots, M\)) are given by the non-negative eigenvalues of

$$K_{MM}^{-1/2} K_{MN} G \hat{V}_d^{-1} G K_{NM} K_{MM}^{-\top/2},$$

where \(K_{MM}^{-1/2}\) can be computed using the Cholesky decomposition \(K_{MM} = K_{MM}^{\top/2} K_{MM}^{1/2}\).

To compute the p-value from \(S\), we can use the Davies exact method, implemented in the CompQuadForm package for R. Note that, if we use a linear kernel, \(S\) can be simplified as described in [8]. Although the Score-based approach is an easy and quick solution for genome-wide mapping, its asymptotic behavior and statistical calibration should be checked with a QQ-plot, verifying that the p-values obtained from multiple variants follow a uniform distribution under the null hypothesis.
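The Score statistic (2) and its weights can be sketched in NumPy as below. Two simplifications are assumptions of this example and not the method described above: the donor term is omitted so that \(V_d = \sigma^2 I + K_{NM} K_{MM}^{-1} K_{MN}\), and the p-value is approximated by Monte Carlo sampling of the weighted \(\chi^2\) sum instead of the Davies method. The function name and the simulated data are hypothetical.

```python
import numpy as np

def score_statistic(y, g, K_nm, K_mm, V_d):
    """Return S from equation (2) and the weights of the generalized chi-square."""
    G_Knm = g[:, None] * K_nm                        # G K_NM with G = diag(g)
    chol = np.linalg.cholesky(K_mm)                  # K_MM = chol chol^T
    B = np.linalg.solve(chol, G_Knm.T)               # chol^{-1} K_MN G, shape (M, N)
    S = np.sum((B @ np.linalg.solve(V_d, y)) ** 2)   # y^T V^-1 G K_NM K_MM^-1 K_MN G V^-1 y
    lam = np.linalg.eigvalsh(B @ np.linalg.solve(V_d, B.T))
    return S, np.clip(lam, 0.0, None)                # clip tiny negative round-off

rng = np.random.default_rng(4)
N, M, noise_var = 100, 8, 0.5
x = rng.uniform(0.0, 1.0, N)
t = np.linspace(0.0, 1.0, M)
g = rng.integers(0, 3, N).astype(float)

def kern(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / 0.1)

K_nm = kern(x, t)
K_mm = kern(t, t) + 1e-6 * np.eye(M)
V_d = noise_var * np.eye(N) + K_nm @ np.linalg.solve(K_mm, K_nm.T)
y = rng.multivariate_normal(np.zeros(N), V_d)        # data simulated under the null

S, lam = score_statistic(y, g, K_nm, K_mm, V_d)

# Monte Carlo stand-in for the Davies method: sample sum_m lam_m * chi2_1.
sims = rng.chisquare(1, size=(20000, M)) @ lam
p_value = (np.sum(sims >= S) + 1) / (len(sims) + 1)
```

Because \(\hat{V}_d\) does not depend on the variant, only the products involving \(G\) need to be recomputed per variant in a genome-wide scan.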

If a colocalisation analysis [18] or a Bayesian hierarchical model [19] is considered as a downstream analysis using the test statistics, a Bayes factor can also be computed using the Titsias bounds, such as

$$\log(BF) = \mathcal{L}_2^{H_2} - \mathcal{L}_2^{H_1}.$$

Here we would use some empirical values \(\delta_g = \{0.01, 0.1, 0.5\}\) to average the Bayes factor, instead of integrating out \(\delta_g\) from \(\mathcal{L}_2^{H_2}\) [20].
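A minimal sketch of averaging the Bayes factor over a grid of \(\delta_g\) values is given below. It assumes \(\delta_d = 0\) (no donor term) to keep the example short, so the \(H_1\) bound coincides with the \(H_2\) bound at \(\delta_g = 0\), and it averages on the Bayes factor scale with equal weights; the function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def log_mvn(y, cov):
    """log N(y | 0, cov), computed through a Cholesky factor."""
    chol = np.linalg.cholesky(cov)
    a = np.linalg.solve(chol, y)
    return -0.5 * a @ a - np.log(np.diag(chol)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def bound_h2(y, g, K_nn, K_nm, K_mm, noise_var, delta_g):
    """Titsias bound for the full model with delta_d = 0 (assumption of this sketch)."""
    Q = K_nm @ np.linalg.solve(K_mm, K_nm.T)
    K_tilde = K_nn - Q
    G = np.diag(g)
    V_g = noise_var * np.eye(len(y)) + Q + delta_g * (G @ Q @ G)
    trace = np.trace(K_tilde) + delta_g * np.trace(G @ K_tilde @ G)
    return log_mvn(y, V_g) - trace / (2.0 * noise_var)

def averaged_log_bf(y, g, K_nn, K_nm, K_mm, noise_var, grid=(0.01, 0.1, 0.5)):
    """log of the mean Bayes factor over the grid, using a stable log-mean-exp."""
    l_h1 = bound_h2(y, g, K_nn, K_nm, K_mm, noise_var, 0.0)
    log_bfs = np.array([bound_h2(y, g, K_nn, K_nm, K_mm, noise_var, d) - l_h1 for d in grid])
    m = log_bfs.max()
    return m + np.log(np.mean(np.exp(log_bfs - m)))

rng = np.random.default_rng(5)
N, M, noise_var = 120, 8, 0.2
x = rng.uniform(0.0, 1.0, N)
t = np.linspace(0.0, 1.0, M)
g = rng.integers(0, 3, N).astype(float)

def kern(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / 0.1)

K_nn, K_nm = kern(x, x), kern(x, t)
K_mm = kern(t, t) + 1e-6 * np.eye(M)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, np.sqrt(noise_var), N)

log_bf = averaged_log_bf(y, g, K_nn, K_nm, K_mm, noise_var)
```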

In real genetic association mapping, most genetic associations are indeed static and ubiquitous over the factor \(x\). To capture such a static association, we can use the following model

$$y = \alpha_0 1_N + \alpha + \beta_0 g + \beta \odot g + \gamma_0 + \gamma + \varepsilon,$$

where \(\alpha_0\) denotes the intercept, \(1_N\) denotes the \(N\)-dimensional vector of all 1s, \(\beta_0\) denotes the effect size of the static genetic association, and \(\gamma_0 \sim \mathcal{N}(0, \sigma^2 \delta_{d0} R)\) denotes the donor variation which confounds \(\beta_0\). For instance, in [8], the static genetic association \(\beta_0\) is modeled as a fixed effect, and the dynamic effect \(\beta\) is tested using the Score statistic. On the other hand, in [10], the authors modeled both the static and dynamic associations as a random effect to test via a Bayes factor. In this case, the covariance matrix \(K\) can be rewritten as

$$K^{*} = \sigma^2 e^{-\rho_0} 1_N 1_N^\top + K$$

to estimate the model parameters in (1), and then the null hypothesis \(\delta_g = 0\) for the variance of \(\beta\) is tested.

Note here that the kernel parameter \(\rho_0\) is not necessarily common and shared across \(\alpha\), \(\beta\) and \(\gamma\). Indeed, in [10], the authors estimated \(\hat{\rho}_0^{\alpha}\) and \(\hat{\rho}_0^{\gamma}\) independently in \(\mathcal{L}_2^{H_1}\). To compute the Score statistic, the authors assumed that \(\hat{\rho}_0^{\beta} = \hat{\rho}_0^{\gamma}\) for \(\beta\) and \(\gamma\), because the ratio of the static effect to the dynamic effect can be the same for cis and trans genetic effects.

In longitudinal studies, the factor \(x\) is typically observed explicitly (e.g., donor's age or the physical locations where samples were taken). This makes it straightforward to perform genetic association mapping along \(x\) using the Score statistics or Bayes factors, as described above. However, this is often not the case for molecular studies, and therefore we need to estimate the underlying biological state from the data.

In single-cell biology, the hidden cellular state \(x\) is often referred to as "pseudotime", and principal component analysis is normally used to estimate it as part of dimension reduction [21]. The Gaussian process latent variable model (GPLVM) is a strong alternative for extracting the pseudotime when the molecular phenotype gradually changes along pseudotime \(x\) in a nonlinear fashion [22, 23].

We have also proposed a GPLVM that uses the baseline model \(H_0\) to estimate the latent variable \(X\) from single-cell RNA-seq data (see Section 3 for more details). Let \(Y = (y_1, \ldots, y_J)\) be the gene expression matrix of \(J\) genes, whose \(j\)th column is the vector of gene expression for gene \(j\). The Titsias lower bound of the GPLVM based on the baseline model \(H_0\) can be written as

$$p(Y \mid X) \ge \mathcal{MN}(Y \mid 0, \Sigma, I + K_{NM} K_{MM}^{-1} K_{MN}) - \frac{J}{2} \mathrm{tr}\{\tilde{K}_{NN}\} = \mathcal{L}_2.$$

To obtain the optimal cellular state \(\hat{X}\), this lower bound can be maximized with respect to \(\{\Sigma, X, T, \rho\}\) [10, 24]. Here \(\Sigma = \mathrm{diag}(\sigma_j^2;\ j = 1, \ldots, J)\) denotes the residual variance parameters of the \(J\) genes, and \(\mathcal{MN}(\cdot)\) denotes the matrix normal distribution. To keep the model parameters identifiable, the variance parameter in the kernel function is set to \(\sigma^2 = 1\). In addition, to maintain the uniqueness of the latent variable estimation, a prior probability on \(X\) is required. It is quite common to assume independent standard normal distributions for each of the elements of \(X \sim \mathcal{MN}(0, I, I)\) [24], although there are multiple alternatives to consider depending on the nature of the modeled data [10, 23].

In the parameter estimation, the limited-memory BFGS method can be used to implement GPLVM for large N. In addition, the stochastic variational Bayes approach can be used to fit GPLVM to larger data sets, while reducing the fitting time [25,26,27].

For a non-Gaussian output \(y\), the Titsias bound \(\mathcal{L}_2\) is not analytically available. However, for the Poisson distribution case, a lower bound of the log conditional probability \(\log p(y \mid u)\) can be computed as follows:

$$\mathcal{L}_1 = \sum_{i} \left[-\log(y_i!) + y_i \bar{f}_i - \exp\left(\bar{f}_i + \frac{\tilde{k}_{ii}}{2}\right)\right],$$

where \(\tilde{k}_{ii}\) denotes the \(i\)th diagonal element of \(\tilde{K}_{NN}\). Let \(\nu_i\) and \(w_i\) be the working response and the iterative weight of the GLM for the \(i\)th sample, such that

$$\nu_i = \bar{f}_i + (y_i - w_i)/w_i \quad \mathrm{and} \quad w_i = \exp\left(\bar{f}_i + \frac{\tilde{k}_{ii}}{2}\right)$$

for \(i = 1, \ldots, N\), the optimal \(\hat{u}\) which maximizes \(\exp\{\mathcal{L}_1\}\, p(u)\) satisfies

$$\left(K_{MM}^{-1} + K_{MM}^{-1} K_{MN} W K_{NM} K_{MM}^{-1}\right) u = K_{MM}^{-1} K_{MN} W \nu, \tag{3}$$

where \(W = \mathrm{diag}(w_i;\ i = 1, \ldots, N)\), which suggests

$$\nu \mid u \sim \mathcal{N}(\bar{f}, W^{-1})$$

as described elsewhere [28]. Therefore, we can maximize

$$\mathcal{L}_2 = \mathcal{N}(\nu \mid 0, W^{-1} + K_{NM} K_{MM}^{-1} K_{MN})$$

with respect to \(\{\sigma^2, \rho\}\), where \(u = \hat{u}\) is iteratively updated as in (3). Thus, to obtain the Score statistic for non-Gaussian \(y\), we replace \(y\) with \(\nu\) and set \(\hat{V}_d = W^{-1} + A B^{-1} A^\top\) in (2).

A binary output \(y\) is more complicated than the Poisson case, because it is impossible to analytically compute the \(\mathcal{L}_1\) bound with a logit or probit link function. For the logit link function, several useful alternatives to the \(\mathcal{L}_1\) bound have been proposed [29]. For the probit link function, [30] proposed an approximation of \(\mathcal{L}_1\) using Gauss-Hermite quadrature. However, in both cases, the computational cost is much higher than in the Poisson case, and it is rather impractical to conduct a large genome-wide association mapping at this moment.

Minimally destructive hDNA extraction method for retrospective genetics of pinned historical Lepidoptera specimens … – Nature.com

Read the original:
Minimally destructive hDNA extraction method for retrospective genetics of pinned historical Lepidoptera specimens ... - Nature.com

Restless legs syndrome tied to 140 ‘hotspots’ in the genome – Livescience.com

Researchers have uncovered more than 140 sections of the human genome tied to restless legs syndrome (RLS), a neurological condition that affects up to 10% of the U.S. population.

These stretches of DNA in the genome are known as genetic risk loci, and prior to the new study, only 22 were known to be tied to RLS. The new research, published Wednesday (June 5) in the journal Nature Genetics, increases that number to 164.

Three of the newfound risk loci are located on the X chromosome, which females typically carry two of in each cell while males carry only one. RLS is more common among women than men, but based on their new results, the researchers don't think this difference is explained by the trio of risk loci on the X chromosome.

"This study is the largest of its kind into this common but poorly understood condition," Steven Bell, co-senior study author and an epidemiologist at the University of Cambridge, said in a statement. "By understanding the genetic basis of restless legs syndrome, we hope to find better ways to manage and treat it, potentially improving the lives of many millions of people affected worldwide."

This discovery could also be used to help predict a person's risk of developing RLS, the study authors wrote in their paper.

RLS, also called Willis-Ekbom disease, causes people to experience an unpleasant crawling or creeping sensation in their legs, as well as the irresistible urge to move them. These sensations are often more intense in the evening or at night, while people are resting. The condition is thought to be underdiagnosed, and when it is diagnosed, its exact cause is often unknown. RLS can arise due to another condition, such as iron deficiency, kidney disease or Parkinson's, and it's likely tied to dysfunction in part of the brain that uses dopamine to control movement.

There is currently no cure for RLS, but certain treatments, such as anti-seizure drugs, can help ease a person's symptoms.

In the new study, the researchers pooled the data from several enormous genome-wide association studies, which compare the DNA of people with a given disease to that of people without it. In all, the new research included data from more than 116,000 people who had RLS and more than 1.5 million people without the condition.

Notably, all those included were of European ancestry, which may limit the relevance of the findings in other demographics.

The researchers found no strong differences in genetic risk factors between the sexes, even though RLS is more common in women. They think this suggests that RLS is governed by a combination of genetic, environmental and hormonal factors, so the genetic risk loci don't dictate a person's risk in isolation.

Among the newfound risk loci, the team hunted for genes that might already be targeted by existing approved drugs; the goal was to find treatments that could potentially be given to patients in the near future.

They found 13 risk loci targeted by existing drugs, including two genes that code for so-called glutamate receptors. These receptors are proteins found on nerve cells that play a vital role in the transmission of signals throughout the nervous system. Preliminary clinical trials suggest that targeting these two receptor genes with anti-epileptic drugs (namely, perampanel and lamotrigine) can benefit some patients with RLS.

In addition to identifying potential drugs, the team ran a statistical analysis to see if RLS raises the risk of any other conditions. This suggested that RLS may be a risk factor for developing type 2 diabetes, although past studies on the potential link have found mixed results. As such, "these results should not be overinterpreted," the researchers cautioned; they need to be confirmed in future research.

Despite their limitations, the findings may bring doctors one step closer to being able to predict someone's risk of developing RLS and understanding the wider impacts the condition has on people's health, the team said.

Read the rest here:
Restless legs syndrome tied to 140 'hotspots' in the genome - Livescience.com

Paired tumor-germline testing can enhance patient care, with guidance from genetics specialists – The Cancer Letter

Data from the IMROZ phase III trial demonstrated Sarclisa (isatuximab) in combination with standard-of-care bortezomib, lenalidomide and dexamethasone followed by Sarclisa-Rd (the IMROZ regimen) significantly reduced the risk of disease progression or death by 40%, compared to VRd followed by Rd in patients with newly diagnosed multiple myeloma not eligible for transplant.

Originally posted here:
Paired tumor-germline testing can enhance patient care, with guidance from genetics specialists - The Cancer Letter

Improved functional mapping of complex trait heritability with GSA-MiXeR implicates biologically specific gene sets – Nature.com

Link:
Improved functional mapping of complex trait heritability with GSA-MiXeR implicates biologically specific gene sets - Nature.com

Genetic Risk Score Revolutionizes TNBC Prediction in Black Women – Targeted Oncology

Black women in the U.S. often face a higher risk of developing aggressive breast cancer, particularly triple-negative breast cancer (TNBC), which can occur before routine screening is recommended. To address this disparity, accurate risk prediction methods are crucial. A multiple-ancestry polygenic risk score (MA-PRS), developed from genetic data of diverse populations, has shown promise in predicting overall breast cancer risk. In this study, researchers assessed the effectiveness of MA-PRS in predicting TNBC and early-onset TNBC in a large cohort of self-reported Black women.

Analyzing data from over 14,000 eligible participants, predominantly under 50 years old, the study found that MA-PRS significantly improved TNBC risk prediction beyond clinical factors alone. Specifically, women in the top 5% of MA-PRS distribution had roughly twice the risk of TNBC compared to the general population. Importantly, MA-PRS demonstrated comparable impact to mammographic density, a well-established risk factor for breast cancer.

The findings suggest that incorporating MA-PRS into breast cancer risk assessment could enhance early detection and potentially improve survival rates for TNBC among Black women. By accurately identifying those at elevated risk, interventions and screening strategies can be tailored more effectively, addressing a critical need in breast cancer management for this demographic.

Here, Holly Pederson, MD, breast medical oncologist at Cleveland Clinic, and Elisha Hughes, PhD, director of biostatistics at Myriad Genetics, discuss the findings and implications from this study presented at ASCO 2024.

Transcription:

0:05 | The polygenic score was a really powerful risk stratifier, or it really explains a lot of the genetic susceptibility that many women have for, you know, overall breast cancer and specifically triple-negative disease. About as powerful as everything else combined with the exception of maybe mammographic density, and the polygenic score and mammographic density are both, I would say, equally powerful risk stratifiers.

0:30 | This may change, help to change, screening recommendations even, because it shouldn't just be based on age, but also on ancestry and genetics. I mean, it only makes sense. The other, you know, the other main implication is that we are looking to evaluate young women and identify those families that seem as if they may have a heritable disorder to prevent future cancers. But we'd also love to identify the woman who might be at risk. And it's about 6% of women who really fall into that high-risk category. But that's an important 6%. So we'd like to make a difference there.

Go here to see the original:
Genetic Risk Score Revolutionizes TNBC Prediction in Black Women - Targeted Oncology

Gene variants and breast cancer risk in Black women – National Institutes of Health (NIH) (.gov)

June 4, 2024

Breast cancer is the most often diagnosed cancer in many parts of the world, including the U.S. More than 310,000 new cases are expected nationwide this year.

Black women tend to develop breast cancer at a younger age than White women. Black women are also more likely than Whites to die from the disease, and they are twice as likely to develop an aggressive subtype called triple-negative breast cancer. But despite the increased risks faced by women of African descent, most large-scale genetic studies of breast cancer to date have focused on women of European ancestry.

To better understand their unique genetic risks, a research team led by Dr. Wei Zheng of Vanderbilt University analyzed genetic data from over 40,000 females of African descent. About 18,000 had been diagnosed with breast cancer. The data were gathered as part of the NIH-funded African Ancestry Breast Cancer Genetic consortium, which combined data from 26 studies. Most participants (85%) were African Americans. The rest were from Barbados or Africa.

The researchers conducted a genome-wide association study (GWAS) to look for genetic variants that are found more often in participants with breast cancer than in those without. This is believed to be the largest GWAS study to date of breast cancer in this population. Results were reported in Nature Genetics on May 13, 2024.
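The core comparison behind a GWAS can be illustrated for a single variant. The Python sketch below is a simplified illustration, not the method used in the study: it takes hypothetical allele counts in cases and controls and computes an odds ratio and a one-degree-of-freedom chi-square test. Real analyses typically use regression models with covariates and apply a stringent genome-wide significance threshold (conventionally 5e-8) because millions of variants are tested.

```python
import math

def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    # Pearson chi-square: compare observed counts with the counts expected
    # if the allele were equally common in cases and controls.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value

# Hypothetical counts: the variant allele is seen 30 times among 100 case
# alleles and 20 times among 100 control alleles.
print(allelic_association(30, 70, 20, 80))
```

An odds ratio above 1 means the variant allele is more common in cases than in controls; the p-value indicates how surprising that difference would be if there were no true association.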

The analysis pinpointed 12 genetic regions, or loci, associated with breast cancer. Three of these loci were linked to the aggressive triple-negative cancer. About 8% of the women carried two genetic copies of risk variants in all three of these loci. Such women, the researchers found, were 4.2 times more likely to be diagnosed with triple-negative breast cancer than women who had only one or no copies of the variants.

Because this type of cancer lacks specific cell receptors often seen with breast cancer (like estrogen or HER2 receptors), there are fewer targeted options for treatment. These findings may help researchers identify new treatment targets.

The researchers also confirmed many breast cancer risk variants that were found earlier in other populations. And they identified an uncommon risk variant in the gene ARHGEF38, which had been previously linked to aggressive prostate and lung cancers.

The scientists used their findings to create polygenic risk scores (PRS) for breast cancer risk in females of African descent. PRS use genomic data to gauge the chance that a person will develop a certain medical condition. PRS created previously, using results from other populations, tend to perform poorly at predicting breast cancer risk for Black women. The new PRS, based on genomic data from African descendants, outperformed previous PRS at predicting breast cancer risk in this population.
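In its simplest form, a polygenic risk score is a weighted sum of the risk alleles a person carries. The Python sketch below shows that calculation under stated assumptions: the per-variant weights (commonly log odds ratios estimated in a GWAS) and the genotype dosages are hypothetical, and this is not the scoring model the researchers built.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical person genotyped at three variants, carrying 0, 1 and 2
# copies of the respective risk alleles.
dosages = [0, 1, 2]
# Hypothetical per-allele weights for the same three variants.
weights = [0.10, 0.20, 0.30]
print(polygenic_risk_score(dosages, weights))
```

Because the weights are estimated in a particular study population, a score built from one population's data can perform poorly in another, which is consistent with the article's point about scores derived from other populations.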

The findings and data could lead to improved detection of breast cancer in this at-risk population and provide clues for potential treatment targets. Studies with even larger, more diverse populations will be needed to further improve the prediction of breast cancer risk.

"We have worked with researchers from more than 15 institutions in the U.S. and Africa to establish this large genetic consortium," Zheng says. "Data put together in this consortium have been and will continue to be used by researchers around the world."

by Vicki Contie

References: Genome-wide association analyses of breast cancer in women of African ancestry identify new susceptibility loci and improve risk prediction. Jia G, Ping J, Guo X, Yang Y, Tao R, Li B, Ambs S, Barnard ME, Chen Y, Garcia-Closas M, Gu J, Hu JJ, Huo D, John EM, Li CI, Li JL, Nathanson KL, Nemesure B, Olopade OI, Pal T, Press MF, Sanderson M, Sandler DP, Shu XO, Troester MA, Yao S, Adejumo PO, Ahearn T, Brewster AM, Hennis AJM, Makumbi T, Ndom P, O'Brien KM, Olshan AF, Oluwasanu MM, Reid S, Butler EN, Huang M, Ntekim A, Qian H, Zhang H, Ambrosone CB, Cai Q, Long J, Palmer JR, Haiman CA, Zheng W. Nat Genet. 2024 May;56(5):819-826. doi: 10.1038/s41588-024-01736-4. Epub 2024 May 13. PMID: 38741014.

Funding: NIH's National Cancer Institute (NCI).

More here:
Gene variants and breast cancer risk in Black women - National Institutes of Health (NIH) (.gov)