Towards the biogeography of prokaryotic genes

  • Luis Pedro Coelho
  • Renato Alves
  • Álvaro Rodríguez del Río
  • Pernille Neve Myers
  • Carlos P. Cantalapiedra
  • Joaquín Giner-Lamia
  • Thomas Sebastian Schmidt
  • Daniel R. Mende
  • Askarbek Orakov
  • Ivica Letunic
  • Falk Hildebrand
  • Thea Van Rossum
  • Sofia K. Forslund
  • Supriya Khedkar
  • Oleksandr M. Maistrenko
  • Shaojun Pan
  • Longhao Jia
  • Pamela Ferretti
  • Shinichi Sunagawa
  • Xing Ming Zhao
  • Henrik Bjørn Nielsen
  • Jaime Huerta-Cepas
  • Peer Bork
  • Fudan University
  • European Molecular Biology Laboratory
  • 28223
  • Technical University of Denmark
  • Technical University of Madrid
  • University of Hawai'i at Mānoa
  • Biobyte Solutions
  • Norwich Research Park
  • Max Delbrück Center for Molecular Medicine in the Helmholtz Association
  • Berlin Initiative of Health
  • Swiss Federal Institute of Technology Zurich
  • Clinical Microbiomics A/S
  • Yonsei University
  • University of Würzburg

Research output: Contribution to journal › Article › peer-review

Abstract

Microbial genes encode the majority of the functional repertoire of life on earth. However, despite increasing efforts in metagenomic sequencing of various habitats1–3, little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and use it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contain the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.

Original language: English
Pages (from-to): 252-256
Number of pages: 5
Journal: Nature
Volume: 601
Issue number: 7892
Publication status: Published - 13 Jan 2022
Externally published: Yes