TY - JOUR
T1 - A catalog of small proteins from the global microbiome
AU - Duan, Yiqian
AU - Santos-Júnior, Célio Dias
AU - Schmidt, Thomas Sebastian
AU - Fullam, Anthony
AU - de Almeida, Breno L.S.
AU - Zhu, Chengkai
AU - Kuhn, Michael
AU - Zhao, Xing Ming
AU - Bork, Peer
AU - Coelho, Luis Pedro
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Small open reading frames (smORFs) shorter than 100 codons are widespread and perform essential roles in microorganisms, where they encode proteins active in several cell functions, including signal pathways, stress response, and antibacterial activities. However, the ecology, distribution and role of small proteins in the global microbiome remain unknown. Here, we construct a global microbial smORFs catalog (GMSC) derived from 63,410 publicly available metagenomes across 75 distinct habitats and 87,920 high-quality isolate genomes. GMSC contains 965 million non-redundant smORFs with comprehensive annotations. We find that archaea harbor more smORFs proportionally than bacteria. We moreover provide a tool called GMSC-mapper to identify and annotate small proteins from microbial (meta)genomes. Overall, this publicly-available resource demonstrates the immense and underexplored diversity of small proteins.
UR - https://www.scopus.com/pages/publications/85202761622
U2 - 10.1038/s41467-024-51894-6
DO - 10.1038/s41467-024-51894-6
M3 - Article
C2 - 39214983
AN - SCOPUS:85202761622
SN - 2041-1723
VL - 15
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 7563
ER -