Culture collection and isolate novelty
Lower airway bacteria were cultivated from bronchoscopic brushings from two asthmatics and three healthy individuals from the Celtic Fire Study (described below). We used a limited range of media with and without 0.5% mucin, followed by incubation in a standard atmosphere or an anaerobic workstation to capture 706 isolates. Those without overlapping 16S rRNA gene sequences were transferred to the Wellcome Sanger Institute and whole-genome sequenced, with assembly using Bactopia (v 1.4.11).
We cultured 651 isolates, 256 of which were successfully whole-genome sequenced. Of these, five sequences appeared mixed and were excluded. After removing duplicates on a 99.5% nucleotide identity threshold, 126 unique strains remained. The Bactopia quality report for the genome assemblies is reported in Supplementary Data1. Forty-four isolates were annotated to species level in accordance with MIGA24 (TypeMat and NCBIProk) and with GTDBtk. A further 30 species were identified by either MIGA (TypeMat and NCBIProk) or GTDBtk. The genome completeness and the contamination percentage were tested within the MIGA pipeline aligning 106 bacterial core genes25 (Supplementary Data2).
All isolates were assigned to genera in the TypeMat or NCBI prokaryotes database with P<0.05. Among these isolates, we classified 49 Streptococcus, ten Veillonella, nine each of Gemella and Rothia, eight Prevotella, six each of Neisseria, Micrococcus and Pauljensenia, five each of Haemophilus and Staphylococcus, three Granulicatella, two each of Actinomyces, Cutibacterium and Fusobacterium, and one each of Cupriavidus, Leptotrichia, Microbacterium and Niallia (Fig.1a).
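The genus tally above can be checked against the 126 unique strains with a few lines of arithmetic. This is only a bookkeeping sketch: the counts are copied from the text, and the variable names are illustrative.

```python
# Genus counts as listed in the text (Fig.1a).
genus_counts = {
    "Streptococcus": 49, "Veillonella": 10, "Gemella": 9, "Rothia": 9,
    "Prevotella": 8, "Neisseria": 6, "Micrococcus": 6, "Pauljensenia": 6,
    "Haemophilus": 5, "Staphylococcus": 5, "Granulicatella": 3,
    "Actinomyces": 2, "Cutibacterium": 2, "Fusobacterium": 2,
    "Cupriavidus": 1, "Leptotrichia": 1, "Microbacterium": 1, "Niallia": 1,
}

total_isolates = sum(genus_counts.values())   # should match the 126 unique strains
n_genera = len(genus_counts)                  # number of distinct genera listed
streptococcus_share = genus_counts["Streptococcus"] / total_isolates

print(total_isolates, n_genera, round(100 * streptococcus_share, 1))
```

The listed counts sum to 126 across 18 genera, with Streptococcus making up a little under 40% of the collection.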
a Culture collection phylogeny based on average nucleotide identities between genomes with 1000bp fragment length. Putatively novel species are highlighted in red (indicating that they are not related to any species in the TypeMat DB or NCBI Prok DB (P<0.05) when assessed using MIGA, and are either not assigned to a known species or have an incongruent species assignment using gtdbtk). Greyed-out isolates are not fully supported by MIGA and gtdbtk. Genome completeness and contamination are displayed as a bar chart. AMR finder was used to identify antimicrobial resistance genes at the protein level (red panel). Virulence factors were identified using the VFDB and Ariba databases and binned into 15 categories (heatmap). The asthma status of the host is indicated in the black asthma/control panel. Cultivation conditions are indicated in green circles for selected growth media, blue rectangles for aerobic, and white rectangles for anaerobic cultivation. Positive Gram staining for GNB, GNC, GPB, GPC, and other Gram staining is shown in black circles. A blue star indicates that neuraminidase activity was tested; the star is filled for a positive test and white for a negative test. b Taxonomic novelty as calculated by MIGA using TypeMat reference. The scatterplot shows support (P-value, vertical axis) for each taxon relative to complementary hypotheses that this taxon is a previously known one (red markers) or a novel one (cyan markers) at each taxonomic level (horizontal axis). Many of the isolates in the collection constitute novel species within known genera. c Composition of bacteria isolated and cultivated from five subjects. Counts are shown for all lineages from species level (outer circle) to phylum level (inner circle) in square brackets. The ETE3 toolkit was used to fetch taxonomic lineages for all genera of cultured isolates101. The number of unique species was summed up and visualised along with their lineages using Krona tools102.
We defined a new species when isolates could not be assigned to known species in reference databases24. We classified isolates as putatively novel species when they exhibited no close relation to any species in the TypeMat or NCBI Prokaryotic Databases, determined by the MIGA tool with a P-value threshold of 0.05 and an incongruent species assignment indicated by gtdbtk.
Fifty-two isolates could not be assigned with P<0.05 to known species in the reference databases24 (Fig.1b). Twenty-eight of the putative novel species were contained within the Streptococcus genus, six within Pauljensenia (not previously recognised to be prevalent in the airways), and four each within Neisseria and Gemella (Fig.1c and Supplementary Data1).
Comparison of the entire sequences of our streptococcal isolates with 2477 public Streptococcus spp. sequences showed that the organisms were widely distributed amongst S. infantis, S. oralis, S. mitis, S. pseudopneumoniae, S. sanguinis, S. parasanguinis, and S. salivarius (Supplementary Fig.2).
We used the eggNOG (evolutionary genealogy of genes, Non-supervised Orthologous Groups) mapper tool (as previously for large-scale systematic genome annotations26) to assign by transfer 5,531 Kegg Ontology (KO) annotations for the 126 isolates. We encoded these in a binary matrix indicating presence or absence (Supplementary Data3) and constructed an isolate phylogeny after removing 254 zero-variance KOs (either present or absent in all isolates) and reducing identical KO presence/absence to single examples before hierarchical clustering with the Manhattan distance metric and complete linkage. The Dynamic Tree Cut algorithm27 identified 15 clusters of isolates that recovered known phylogenetic relationships (Fig.2a). Based on the observed 16S rRNA gene sequence similarity, we further divided one Streptococcus cluster into two (Strep I and Strep II, Fig.2a). Relative KO enrichment was estimated for each of the 16 clusters by contingency table analysis.
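The matrix preparation and clustering steps described above can be illustrated on a toy presence/absence matrix. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the matrix is invented, the function names are hypothetical, and a fixed number of clusters stands in for the Dynamic Tree Cut step, which is not reimplemented here.

```python
from itertools import combinations

def informative_columns(matrix):
    """Drop zero-variance KO columns, then keep one copy of each distinct pattern."""
    columns = list(zip(*matrix))                        # one tuple per KO
    varying = [c for c in columns if len(set(c)) > 1]   # neither all-present nor all-absent
    unique = list(dict.fromkeys(varying))               # order-preserving de-duplication
    return [list(row) for row in zip(*unique)]          # back to one row per isolate

def manhattan(a, b):
    """Manhattan distance: number of KOs that differ for binary vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def complete_linkage(rows, n_clusters):
    """Agglomerative clustering; cluster distance is the largest member distance."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        def dist(pair):
            a, b = pair
            return max(manhattan(rows[i], rows[j])
                       for i in clusters[a] for j in clusters[b])
        a, b = min(combinations(range(len(clusters)), 2), key=dist)
        clusters[a] = clusters[a] + clusters[b]         # merge b into a (a < b)
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy matrix: 4 isolates (rows) x 6 KOs (columns). Column 0 is present in all
# isolates, column 5 is absent in all, and column 2 duplicates column 1.
toy = [
    [1, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
]
reduced = informative_columns(toy)
groups = complete_linkage(reduced, n_clusters=2)
print(reduced)
print(groups)
```

On the toy data three informative KO patterns remain and the isolates split into two groups of two. The same bookkeeping applies to the reported numbers: 5,531 annotated KOs minus 254 zero-variance KOs leaves 5,277 informative KOs.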
a Mapping of the 50 most abundant OTUs onto 126 novel airway isolates. Isolates are grouped into 16 clusters according to the distance and branching order of their inferred Kegg Ontology (KO) gene content. OTU/isolate nt identity is shown as 95–97% (light blue), 97–99% (medium blue), and 100% (dark blue). The complex relationship between OTUs and isolates reflects multiple copies of the 16S rRNA gene in different taxa, but in general, captures KO phylogenetic structures. b Comparison of abundance (left) and prevalence (right) of bacterial OTUs in populations from northern (CELF) and southern (BUS) hemispheres. The species distribution is similar between the CELF and BUS studies. c Comparison of abundance (left) and prevalence (right) of bacterial OTUs in the posterior oropharynx (ptOP) and the left lower lobe (LLL) in CELF subjects. The relative abundance of organisms in ptOP is very similar to those in the LLL, although absolute abundance is an order of magnitude lower in the LLL. Lower abundance OTUs in the CELF dataset are more prevalent in the upper than lower airways. d Spearman correlations between the abundance of organisms in the CELF ptOP samples, showing a high degree of positive and negative relationships between OTUs that is the basis of WGCNA network analysis. Common phyla are colour coded at the top of the matrix, and WGCNA modules (named for the most abundant membership) are at the bottom. Network module membership may be dominated by a single phylum (e.g., the Haemophilus or Streptococcus modules) or contain mixed phyla (e.g., the Veillonella module).
Annotation for the 5277 informative KOs (including duplicates removed during clustering) (Supplementary Data4) identified 247 uncharacterised proteins (Supplementary Data4). Features of particular interest among the known genes are summarised below.
Biofilm formation is a feature of respiratory pathogens, archetypically Pseudomonas spp. in patients with cystic fibrosis. Biofilm-associated genes were also common in the commensal collection (Supplementary File4b). Ninety genes were annotated with biofilm in their KO pathway descriptions, with cysE (serine O-acetyltransferase), vpsU (tyrosine-protein phosphatase), luxS (S-ribosylhomocysteine lyase), trpE (anthranilate synthase component I) and PYG (glycogen phosphorylase) present in >75% of isolates. Amongst the most abundant organisms, Haemophilus and Prevotella spp. had distinctive profiles of other biofilm pathway genes (Supplementary Data4).
Many of our isolates contained known genes for antimicrobial resistance (AMR) against tetracyclines and macrolides. Staphylococcus, Prevotella and Haemophilus spp. also possessed beta-lactam resistance (Fig.1a and Supplementary Data4). Virulence factors and toxins were concentrated in Streptococcus, Staphylococcus, Haemophilus, and Neisseria spp. (Fig.1a and Supplementary Data4). Although these annotations neither guarantee that the genes in question are expressed nor that they drive clinically relevant AMR or virulence, they do indicate such potential.
Competition between bacteria is fundamental to maintaining stable communities28. Genes with a KO pathway annotation for antibiotic synthesis (n=33) were present in many genera (Supplementary Data4). Arachin biosynthetic genes included acpP (acyl carrier protein), which was present in 120 isolates, and auaG in seven (mostly Staphylococcus spp.); rifB (rifamycin polyketide synthase) present in 20 (Veillonella and Staphylococcus spp.); BacF (bacilysin biosynthesis transaminase) present in 12 (Staphylococcus and Gemella spp.); and sgcE5 (enediyne biosynthesis protein E5) present in 12, mostly Haemophilus spp. Bacteriocin exporter genes blpB and blpA were present in 35 and 31 isolates respectively, predominantly Streptococcus and Pauljensenia spp. (Supplementary Data4).
Toxin and antitoxin genes were common in the collection (Supplementary Data4), without distinctive enrichment in particular genera. They included homologues of antitoxin YefM (57 isolates); exfoliative toxin A/B eta (57 isolates); toxin YoeB (51 isolates); antitoxins HigA-1 (31) and HigA (30); antitoxin PezA (26); toxin RtxA (15); antitoxin HipB (14); toxin YxiD (13); antitoxin CptB (12); antitoxin Phd (11); and toxin FitB (10). These have not been previously recognised in commensal organisms and differ from the toxin spectrum of known airway pathogens29. They may have significant influences on the mucosa as well as other organisms.
Nitric oxide (NO) is a central host signalling molecule in the airways, where it mediates bronchodilation, vasodilation, and ciliary beating30. NO exhibits cytostatic or cytocidal activity against many pathogenic microorganisms31 and NO elevation in exhaled breath is used as a clinical marker for lower airway inflammation. Many isolate genes encoded NO reductases (Supplementary Data4), including norB (27 isolates); norV (11), norQ (5), norC (1) and norR (1). The hmp gene, encoding a NO dioxygenase, was present in 39 organisms. These enzymes may mitigate the antimicrobial activities of NO or affect host bronchodilation and mucus flow.
Iron is an essential nutrient for humans and many microbes and is a catalyst for respiration and DNA replication32. Host regulation of iron distribution through many mechanisms serves as an innate immune mechanism against invading pathogens (nutritional immunity)32.
We identified 47 genes with iron in their KO name (Supplementary Data2f). Those found in >75% of isolates were afuC (iron (III) transport system ATP-binding protein), ABC.FEV.P (iron complex transport system permease protein), ABC.FEV.S (substrate-binding protein), and ABC.FEV.A (ATP-binding protein). A further 19 genes were identified as members of haem pathways (Supplementary Data4).
Haemophilus spp. require haem for aerobic growth and possess multiple mechanisms to obtain this essential nutrient. These genes may play essential roles in Haemophilus influenzae virulence33. In our isolate collection sitC and sitD (manganese/iron transport system permease proteins) and fieF (a ferrous-iron efflux pump) were only found in Haemophilus spp., as were ccmA, ccmB, ccmC, ccmD (haem exporter proteins A, B, C and D) and hutZ (haem oxygenase). These are potential therapeutic targets.
The sphingolipids constitute an important class of bioactive lipids and include ceramide and sphingosine-1-phosphate (S1P). Ceramide is a hub in sphingolipid metabolism and mediates growth inhibition, apoptosis, differentiation, and senescence. S1P is a key regulator of cell motility and proliferation34.
Sphingolipids play significant roles in host antiviral responses35,36 and resistance to intracellular bacteria37. Their importance in humans is exemplified by a major childhood asthma susceptibility locus that upregulates ORMDL3 expression38. ORMDL3 protein acts as a rate-limiting step in sphingolipid synthesis39 and the ORMDL3 locus greatly increases the risk of HRV-induced acute asthma40.
De novo synthesis of sphingolipids is recognised in human bowel bacteria41 and maintains intestinal homoeostasis and microbial symbiosis42. In the skin, commensal S. epidermidis sphingomyelinase makes a crucial contribution to skin barrier homoeostasis43. Based on KO annotations, we did not find obvious SPT homologues in our isolates but identified 12 genes with putative roles in sphingolipid metabolism (Supplementary Data4). Of these, SPHK (sphingosine kinase, present in 12 isolates) which metabolises sphingosine to produce S1P; and ASAH2 (neutral ceramidase, present in seven isolates) have potential roles in modifying host inflammation and repair. These may interact with the ORMDL3 disease risk alleles described above.
Several genes present in the isolates may directly affect host immunity. These were enriched in Prevotella spp. (Supplementary Data4) and included immune inhibitor A (ina), a neutral metalloprotease secreted to degrade antibacterial proteins; Spa (immunoglobulin G-binding protein A), sbi (immunoglobulin G-binding protein Sbi); omp31 (outer membrane immunogenic protein); blpL (immunity protein cagA); and impA (immunomodulating metalloprotease).
A conserved commensal antigen, β-hexosaminidase (HEXA_B), has a major role in induction of anti-inflammatory intestinal T lymphocytes44, and is present in 59 of our isolates with enrichment in Prevotella, Streptococcus and Pauljensenia spp.
Systemic lupus erythematosus (SLE) and Sjögren syndrome are chronic autoimmune inflammatory disorders with multiorgan effects. Lung involvement is common during the course of the disease45. Our Neisseria isolates contain a 60 kDa SS-A/Ro ribonucleoprotein (Supplementary Data4) that is an ortholog of the human RO60 gene, a frequent target of the autoimmune response in patients with SLE and Sjögren's syndrome.
Other bacterial genomes contain potential Ro orthologs46, and a bacterial origin of SLE autoimmunity has been suggested47. Here, the abundance of Neisseria spp. in human airways and their close proximity to the mucosa are of interest, as is a recent report that the lung microbiome regulates brain autoimmunity48, and an earlier observation that T cells become licensed in the lung to enter the central nervous system49.
It is relevant that products of cognate microbial-immune interactions in the airways have direct access to the general arterial circulation through the left side of the heart, whereas molecules and cells carried in venous blood from the gut undergo extensive filtration and metabolism in the liver before accessing more distant sites.
Most respiratory viruses, including SARS-CoV-2, have RNA genomes, and RNA-targeting CRISPR vectors have the potential to prevent or treat viral infections50. Type III RNA-targeting system elements (such as cas10, cas7, csm2 and csm5)51 are present in our isolates (particularly Fusobacteria and Prevotella spp.), as is the Type II system element cas9 (Supplementary Data4).
We sought context for our culture collection within the ecological variation of different geographic and anatomical locations. We studied airway microbial communities in 66 asthmatics and 44 normal subjects recruited from centres in Dublin (48 subjects), Swansea (46 subjects) and London (16 subjects) (collectively known as the Celtic Fire Study (CELF)). Swabs were taken from the posterior oropharynx (ptOPs) and bronchoscopic brushings from the left lower lobe (LLL) in all subjects. When tolerated, the left upper lobe (LUL) was also brushed in 52 subjects. We compared the European CELF microbial communities to 527 ptOP samples from an adult population sample in Busselton, West Australia (BUS)18. Operational Taxonomic Units (OTUs) were identified by sequencing the 16S rRNA gene amplicon and compared with the assembled genomes from our culture collection.
In the CELF ptOP samples, 17 operational taxonomic units (OTUs) covered >70% of the abundance and 41 OTUs covered >85% (Supplementary Data5). Coverage was less in LLL and LUL samples (respectively 64% and 50% at the 70% threshold), due to the expansion of H. influenzae (OTU Haemophilus_14694) and Tropheryma whipplei (OTU Glutamicibacter_5653) in the pulmonary samples, particularly those from asthmatics (Supplementary Data5).
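The coverage figures above count how many of the most abundant OTUs are needed before their cumulative share of total abundance passes a threshold. A minimal sketch of that calculation follows; the abundance values are invented for illustration and the function name is hypothetical.

```python
def otus_needed(abundances, threshold):
    """Smallest number of top-ranked OTUs whose summed share exceeds the threshold."""
    total = sum(abundances)
    running = 0
    for rank, value in enumerate(sorted(abundances, reverse=True), start=1):
        running += value
        if running / total > threshold:
            return rank
    return len(abundances)

# Invented abundance counts for seven OTUs in one sample set.
toy_abundances = [40, 25, 15, 10, 5, 3, 2]
n_for_70 = otus_needed(toy_abundances, 0.70)
n_for_85 = otus_needed(toy_abundances, 0.85)
print(n_for_70, n_for_85)
```

In the toy data three OTUs pass the 70% threshold and four pass 85%. A community dominated by a few expanded taxa reaches the same thresholds with fewer OTUs, so the count is a rough summary of how abundance is distributed across taxa.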
Fifteen of the 17 most abundant OTUs were mapped to at least one isolate using a 99% nucleotide (nt) identity, and eleven of the next 24 OTUs were mapped to a cultured organism. Genera of moderate abundance (2.8%-0.4% of the total) yet to be cultivated include Fusobacterium, Selenomonas, Alloprevotella, Porphyromonas, Leptotrichiaceae, Megasphaera, Lachnospiraceae, Solobacterium, and Capnocytophaga.
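Mapping an OTU to an isolate at a 99% nucleotide identity threshold reduces to comparing matched positions between two aligned sequences. The sketch below assumes the sequences are already aligned to equal length; the alignment tool used in the study is not stated here, and the sequences and function names are made up for illustration.

```python
def percent_identity(a, b):
    """Percentage of aligned columns with identical bases (gap-gap columns ignored)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in compared if x == y and x != "-")
    return 100.0 * matches / len(compared)

def maps_to(otu_seq, isolate_seq, threshold=99.0):
    """True when the OTU and isolate sequences meet the identity threshold."""
    return percent_identity(otu_seq, isolate_seq) >= threshold

# Invented 200-nt sequences: one mismatch (99.5%) and three mismatches (98.5%).
otu = "ACGT" * 50
isolate_close = "T" + otu[1:]
isolate_far = "TTT" + otu[3:]

print(percent_identity(otu, isolate_close), maps_to(otu, isolate_close))
print(percent_identity(otu, isolate_far), maps_to(otu, isolate_far))
```

With a 200-nt fragment, a single mismatch still maps at the 99% threshold while three mismatches do not, which shows how few differences separate a mapped OTU from an unmapped one.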
OTUs corresponding to isolates for Staphylococcus, Micrococcus and Cupriavidus spp. had minimal representation in the community OTU analyses, although Staphylococcus aureus is a recognised lung pathogen. Their appearance in the isolates may represent oral or skin contamination or assertive growth in culture.
Mapping of the 50 most abundant OTU sequences onto the 126 isolates revealed complex relationships that reflect multiple copies of the 16S rRNA gene in different taxa52 (Fig.2a). In general, however, OTU assignment reflected the principal KO phylogenetic structures and referencing of OTU communities to our isolate genomes may still inform on community functional capabilities.
The 16S rRNA gene sequences poorly detected the extensive diversity of Streptococcus spp. in airways, as noted previously18. However, combinations of OTUs can be seen to form barcodes (Fig.2a) that may refine Streptococcus spp. identification into their three main KO phylogenetic groups.
The taxa defined by OTUs and their relative abundances were similar in CELF ptOP and CELF LLL samples and to the normal population in BUS ptOP (Fig.2b, c). Other than the most abundant organisms, the prevalence of most OTUs was lower in the LLL than in the ptOP (Fig.2c). The mean bacterial burden was much higher in ptOP samples from CELF than in the LLL (log10 mean 7.86 ± 0.07 vs 5.06 ± 0.05), consistent with previous studies8,16,17.
Strong correlations and anti-correlations were present between the abundances of OTUs in data from each site (exemplified for CELF ptOP samples in Fig.2d, and previously shown for the BUS ptOP results18). We used WGCNA analysis to find networks (named arbitrarily with colours) within these correlated taxa. Network structures were consistent in the CELF and BUS ptOP communities (Supplementary Figs.3 and 4), but less distinct in the lower airway samples where taxa were less diverse and of lower abundance (Supplementary Fig.5).
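The correlations underlying this analysis are Spearman rank correlations between OTU abundances across samples. A minimal stdlib-only sketch is given below: it computes the coefficient as the Pearson correlation of average ranks, uses invented abundance vectors, and does not attempt the WGCNA module detection itself.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented abundances of three OTUs across five samples.
otu_a = [10, 200, 35, 80, 5]
otu_b = [3, 90, 20, 40, 1]      # rises and falls with otu_a
otu_c = [70, 2, 30, 9, 95]      # moves in the opposite direction

rho_ab = spearman(otu_a, otu_b)
rho_ac = spearman(otu_a, otu_c)
print(round(rho_ab, 3), round(rho_ac, 3))
```

Because only ranks are used, the coefficient is insensitive to the large differences in absolute abundance between OTUs, which suits the skewed distributions typical of 16S data.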
Networks often contained closely related species but also extended beyond phylogenetically related organisms (Fig.2d and Supplementary Fig.6). For example, in the CELF ptOP networks (Fig.2d and Supplementary Fig.6) there are phylogenetically homogeneous modules of Streptococci (blue, red and green-yellow), Gemella (magenta), Haemophilus (black and pink) and Granulicatella (purple).
Of interest is the brown module in the CELF ptOP samples, which contains multiple Prevotella and Veillonella spp. of high abundance. The presence of biofilm elements in Prevotella spp. described above supports a hypothesis that these organisms may adhere to form a basic commensal carpet of the airways18.
Both the CELF ptOP and BUS ptOP networks recovered the phylogenetic relationships found in the KO analysis amongst Streptococcus isolates. The three clusters of Streptococcus isolates (Strep. I-III) map to distinct sets of OTUs using sequence similarity (Fig.2a), and this similarity is also uncovered in the WGCNA network modules in both ptOP networks (Supplementary Fig.7).
Subtle alterations in bacterial community composition (dysbiosis53) are recognised in many diseases with microbial components. Community instability and inflammation in the presence of mild viral infections5 should be added to the recognised features of loss of diversity and pathobiont expansion in asthma and COPD. We, therefore, sought insights into airway dysbiosis in our subjects from genomic sequencing of the commensal organisms.
We explored underlying components of airway communities by using Dirichlet-Multinomial Mixtures (DMM)54 on all samples from the BUS and CELF subjects, finding that samples formed predominantly into two clusters (Airway Community Type 1 and 2: ACT1 and ACT2) (Fig.3a). The main drivers for the two pulmotype clusters were identified as Streptococcus, Veillonella, Prevotella and Haemophilus spp. in descending order of relative abundance across all samples. ACT1 was dominated by Streptococcus, Veillonella and Prevotella in 410 samples; whilst ACT2 was dominated by Streptococcus, Veillonella and Haemophilus in 478 samples (Fig.3a). Principal coordinates analysis based on Bray-Curtis distance (β-diversity) of the airway microbiota confirmed significant overall compositional differences between the two community type clusters (PERMANOVA P-value>0.001) (Fig.3b).
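The Bray-Curtis dissimilarity used for the ordination compares two samples by the summed absolute differences in taxon abundance relative to their combined total. The sketch below shows only that pairwise measure on invented profiles; it does not reproduce the DMM clustering, the ordination, or the PERMANOVA test.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

# Invented genus-level profiles in the order:
# Streptococcus, Veillonella, Prevotella, Haemophilus.
prevotella_rich = [50, 30, 15, 5]
haemophilus_rich = [45, 25, 5, 25]

bc_between = bray_curtis(prevotella_rich, haemophilus_rich)
bc_self = bray_curtis(prevotella_rich, prevotella_rich)
print(bc_between, bc_self)
```

Two profiles that share their dominant Streptococcus and Veillonella components but differ in Prevotella and Haemophilus give a modest non-zero dissimilarity, which is the kind of contrast the two community types describe.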
a Main drivers of Dirichlet-multinomial model-based airway communities. b Beta diversity based on Bray-Curtis dissimilarity principal coordinate analysis showing separation of the two communities. c Consistency of airway community assignment between samples of the same and different donors (left) and sampling sites (right). d Alpha diversity measures and correlations. e Univariate associations of CELF 16S samples binned on phylum level to metadata. f Proportion of community assignments between ptOP samples of different study origins, sampling sites and disease groups. g relative abundance of most abundant genera based on CELF samples 16S rRNA. h Univariate metabolite associations based on binning of CELF 16S rRNA sequences onto isolate annotation.
Congruence analysis of CELF samples (Fig.3c) confirmed consistency in assignment for samples coming from the same donor (P<0.005) or the same sampling site (P<0.005).
We performed univariate analysis to investigate the association between CELF subject metadata and potential indicators of dysbiosis, specifically, evenness and richness (Fig.3d), and bacterial abundance at the phylum level (Fig.3e). Features describing clinical phenotypes and sample origin were often strongly collinear. We, therefore, assessed each association found in turn for retained significance with each potential confounder, using a nested rank-transformed mixed model test55 and considering repeated sampling of patients as a random effect.
We saw pervasive effects both on alpha diversity and phylum level of the tested predictors (Fig.3d, e). Importantly, the Shannon index and richness were significantly decreased with asthma status and severity (MWU false-discovery rate (FDR)<0.1) (Fig.3d).
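The alpha diversity measures referred to here have simple definitions: richness is the number of taxa observed, the Shannon index is the entropy of the relative abundances, and evenness (here Pielou's, an assumed choice) scales the Shannon index by its maximum for that richness. The following sketch uses invented counts and natural logarithms; the study's exact settings are not stated in this section.

```python
import math

def richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Shannon index divided by ln(richness); 1.0 means perfectly even."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0

# Invented OTU counts: an even community and one dominated by a single taxon.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]

print(round(shannon(even_sample), 3), round(pielou_evenness(even_sample), 3))
print(round(shannon(dominated_sample), 3), round(pielou_evenness(dominated_sample), 3))
```

Both toy samples have the same richness, but expansion of a single taxon sharply lowers the Shannon index and evenness, which is the pattern expected when a pathobiont expands.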
We found an increase (although not significant) of the Proteobacteria Phylum associated with asthma status (Fig.3e), in line with the taxonomic profile of patients with asthma vs. healthy controls (Fig.3g). This is consistent with many reports of Proteobacteria excess in asthmatic airways8,9,56. Type 2 communities were enriched in subjects with positive asthma status in all sample sites and in CELF subjects overall (Fig.3f).
We examined the impact of the study, asthma status, and sampling site on the distribution of community types in the CELF thoracic samples, using logistic regression models with sex and age as control variables. The results indicated significant differences in ACT proportions across different sampling sites: LUL vs. OTS: odds ratio 95% confidence interval 0.135–0.444 (P-value: 3.1e-07); LLL vs. OTS: 0.049–0.249 (P-value: 5.0e-10). Statistical significance was more marked for the left upper lobe (FDR q-value<0.001) than the left lower lobe (q<0.10).
We extrapolated metabolic activities from binning 16S rRNA gene abundance onto the isolate KOs using PICRUSt57, revealing metabolite profiles that distinguished measures of diversity and location within upper or lower airways (Fig.3h), as well as distinctive features of asthma and dysbiosis.
In order to relate our mapped microbiome to its ecosystem, we sought host components of the microbial-mucosal interface by serial measurements of global gene expression and supernatant metabolomics during full human airway epithelial cell (HAEC) differentiation in an air-liquid interface (ALI) model. We hypothesised that the transition from monolayer to ciliated epithelium over 28 days would be accompanied by the progressive expression of genes and secretion of metabolites for managing the microbiota.
HAEC from a single donor were grown in triplicate and harvested on days 0, 2, 3, 7, 14, 21 and 28. Trans-epithelial resistance (TEER) rose from 7.4 ± 0.3 on day 0 to 1551 ± 113 on day 28, and MUC5AC mRNA production rose 30-fold over the same period (Supplementary Fig.8), indicating full epithelial development.
We found 2553 significantly changing transcripts organised into eight core temporal clusters of gene expression (Limma, 3.22.7) (Fig.4a and Supplementary Data6). Late peaks of expression were found in four clusters, three of which (CL2, CL4 and CL5) contained many genes likely to interact with the microbiome (Supplementary Data6). Transcripts in the other upgoing cluster (CL3) were elevated early and late in differentiation and were enriched for genes mediating cell mobility and localisation. Genes of particular interest in the other upgoing clusters are as follows.
a Global gene expression was measured 7 times over 28 days in an air-liquid model of epithelial differentiation (monolayer to ciliated epithelium). A total of 2,553 transcripts, summarised by 8 core temporal profiles, showed significant variation in abundance during mucociliary development. Hallmark functional roles are shown for each cluster. Clusters CL2, CL3, CL4 and CL5 show late peaks of expression and contain genes that can interact with the microbiome. Upregulated chemokines and immune-function genes are also noted within the clusters. b Metabolites (square) measured in the supernatant of the fully differentiated airway cells were linked to genes (circle) identified in bacterial isolates. Arrows indicate if the reactions were reversible or irreversible, with metabolites as substrates and products. These networks were built based on KEGG pathways. c Binary heatmap displaying the presence (1) or absence (0) of genes (columns) identified in the genomic sequences of bacterial isolates (rows). Bacterial isolates are organised into Kegg Ontology phylogeny clusters (see Fig.2). Gene annotations (top) indicate the frequency of the gene: frequent for genes in >75% of isolates, intermediate for genes in 25–75% of isolates and rare for those in <25% of isolates.
Mucosal mucins are central to mucosal function and integrity, providing a source of nutrients and sites for tethering of commensals58, whilst restricting the density of organisms through upward flow by beating cilia59. Interactions of mucins with microbiota play an important role in normal function58, and direct cross-talk between microbes and mucin production is likely59.
In our ALI model, progressive up-regulation of the major secreted respiratory mucins MUC5AC and MUC5B in CL2 was accompanied by the membrane-associated MUC20 (Supplementary Data6). In contrast, CL5 contained three membrane-associated mucins (MUC13, MUC15, MUC16). These mucins do not form gels and are anchored to the apical cell surface, where they present a glycoarray for selective interactions with the microbial environment58.
Within CL5 we also found 17 gene families and 175 genes with putative roles in ciliary function, ciliogenesis, or spermatogenesis (Supplementary Data6). Mutations in many of these genes are known to cause primary ciliary dyskinesia (PCD)60, which results in recurrent pulmonary infections. Other genes in this list are candidates for mutation in cases of PCD without known cause.
The most significant effects (top hits) in CL2 included ENPP4 (which promotes haemostasis); ALOX15 (which generates bioactive lipid mediators including eicosanoids); GLIPR2 (which enhances type-I IFNs); MPPED2 (a metallophosphoesterase active in infection); INSR (insulin receptor); and MIR223 (an inhibitor of neutrophil extracellular trap (NET) formation in infection) (Supplementary Data6).
Immune-related genes significantly expressed in CL5 included complement factor 6 (C6) which forms part of the membrane attack complex. C6 deficiency is associated with Neisseria spp. infections. CD38 was also highly expressed, and its product is an activator of B-cells and T-cells.
Top hits in CL4 include ADH1C, an alcohol dehydrogenase; GSTA2 with a known role in the detoxification of electrophilic carcinogens, environmental toxins and products of oxidative stress by conjugation with glutathione; ACE2, the SARS-CoV-2 binding site which cleaves angiotensins; and PIK3R3 which phosphorylates phosphatidylinositol to affect growth signalling pathways (Supplementary Data6).
CL4 contains five members of the cytochrome P450 families with potential roles in the detoxification of microbial products, including CYP2F1 (which modifies tryptophan toxins and xenobiotics); CYP4X1 (unknown substrates); CYP4Z1 (benzyl esters); CYP4F3 (Leukotriene B4); and CYP2C18 (sulfaphenazole). Also in CL4 were transporters SLC10A5 (substrate bile acids); SLC27A2 (fatty acids); SLC1A1 (glutamate); SLC4A11 (borate); SLC25A4 (ADP/ATP in mitochondria); SLC45A4 (sucrose); SLC25A28 (iron); and SLC39A11 (zinc).
Enrichment of genes for detoxification and transport was also present within CL2, which included CYP4B1 (substrate fatty acids and alcohols); CYP4V2 (fatty acids); CYP2A13 (nitrosamines); CYP2B6 (xenobiotics); CYP26A1 (retinoids); and CYP4F12 (arachidonic acids). Transporters included SLC40A1 (iron); SLC13A2 (citrate); SLC15A2 (small peptides); SLC12A7 (KCl co-transporter); and SLC35A5 (nucleoside sugars).
The bronchial mucosa is innervated with vagal sensory unmyelinated fibres that detect airway luminal substances and mediate smooth muscle tone, mucus secretion, and cough61. Airway sensory nerves are directly involved in immune or inflammatory responses, themselves releasing proinflammatory molecules (neurogenic inflammation)62,63. Neuroinflammation can change receptors, ion channels, neurochemistry, and fibre density64. It contributes to the disabling syndrome of cough hypersensitivity and chronic cough65.
A basis for innervation can be seen in top hits from CL2, which included ENPP5 and HECW2, which have putative roles in the development of airway sensory nerves (Supplementary Data6). Interestingly, CL2 and CL4 together contained ten members of the protocadherin beta gene family (PCDHB2, PCDHB3, PCDHB4, PCDHB5, PCDHB10, PCDHB12 and PCDHB18P in CL2; PCDHB13, PCDHB14, and PCDHB15 in CL4). Interactions between protocadherin beta extracellular domains specify self-avoidance in specific cell-to-cell neural connections66, and their abundant presence here may regulate singular neural-mucosal cell coherence.
Metabolites are central to biological signalling, and so we used the same time-series model of HAEC differentiation to measure levels of metabolites released into the culture media of the cells (Supplementary Data7).
We then mapped the ALI culture metabolites to enzymes in matching bacterial pathways identified within the KO of isolate genomes (Fig.4b), based on direct reactions, as substrates or products. Notable interactions include amino acids, nucleotides and compounds involved in energy metabolism. The metabolite-related KOs exhibited distinctive patterns within the isolate phylogeny (Fig.4c).
Enrichment of these KOs onto global human and bacterial KO pathways with iPath67 is shown in Supplementary Figs.9 and 10. These suggest that folate biosynthesis is ubiquitous amongst airway organisms, that valine, leucine and isoleucine metabolism is of intermediate importance, and that alanine, aspartate and glutamate metabolism are rare functions amongst the isolates.
Original post:
Genomic attributes of airway commensal bacteria and mucosa | Communications Biology - Nature.com