Communication

Construction of a phylogenetic matrix: scripts and guidelines for phylogenomics

  • Shiyu Du1,
  • Yinhuan Ding2,5,
  • Hu Li3,
  • Aibing Zhang4,
  • Arong Luo5,
  • Chaodong Zhu4,5,
  • Feng Zhang1
  • 1 Department of Entomology, College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China
  • 2 Department of Agronomy and Horticulture, Jiangsu Vocational College of Agriculture and Forestry, Nanjing 212400, China
  • 3 Department of Entomology and MOA Key Lab of Pest Monitoring and Green Management, College of Plant Protection, China Agricultural University, Beijing 100193, China
  • 4 College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  • 5 Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China

Published online: 2023-04-24

Abstract

Phylogenomics is a relatively new field that infers the evolutionary relationships of taxa from genome-scale data. As molecular data accumulate, potential bias may become the limiting factor in phylogenomic analyses. It is therefore particularly important to explore the sources of such bias by simple, convenient, time-saving and (relatively) robust means. Here, we present a set of custom scripts for the steps that precede tree reconstruction: extraction of USCO (universal single-copy ortholog) loci, multiple sequence alignment, trimming of poorly aligned regions, loci filtering, and creation of a concatenated matrix. The scripts are intended to simplify analytical pipelines and improve the accuracy of tree estimation. They employ a series of computationally efficient bioinformatic tools and can be run either from a standard BASH shell or through a visual interface with Windows-like ‘drag and drop’ operations on Linux systems. Most steps in the scripts are parallelized to accelerate analyses. These scripts provide a convenient solution for phylogenomic data preparation, data quality control, and detection of potential analytical errors. Details and script usage are provided at https://github.com/xtmtd/Phylogenomics/tree/main/scripts. A virtual machine image (.vmdk) integrates the operating system and the required environment. All tools and scripts can be downloaded from https://dx.doi.org/10.6084/m9.figshare.21283026. In addition, a video introduction and “walk-through” for each script are provided at https://space.bilibili.com/319699648/channel/seriesdetail?sid=2682055.
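To make the final step of the workflow concrete, the following is a minimal Python sketch of how per-locus alignments can be concatenated into a single matrix. It is not taken from the authors' scripts, which are BASH-based and rely on external tools; the function names (`parse_fasta`, `concatenate`), the toy sequences and the choice of `-` for missing taxa are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {taxon: sequence} dictionary."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the taxon name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(parts) for taxon, parts in records.items()}


def concatenate(
    loci: Dict[str, Dict[str, str]], missing: str = "-"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate aligned loci into one matrix.

    Taxa absent from a locus are padded with the `missing` character.
    Returns the matrix and a list of (locus, start, end) partitions
    using 1-based, inclusive coordinates.
    """
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    columns: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for locus in sorted(loci):
        aln = loci[locus]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            # Empty input or unequal sequence lengths: not a valid alignment.
            raise ValueError(f"locus {locus!r} is not a valid alignment")
        length = lengths.pop()
        for taxon in taxa:
            columns[taxon].append(aln.get(taxon, missing * length))
        partitions.append((locus, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in columns.items()}
    return matrix, partitions


# Toy input (hypothetical data): two aligned loci with incomplete taxon sampling.
toy_loci = {
    "locus1": parse_fasta(">A\nATG-\n>B\nATGC\n"),
    "locus2": parse_fasta(">A\nGG\n>C\nGA\n"),
}
toy_matrix, toy_partitions = concatenate(toy_loci)
for taxon, seq in toy_matrix.items():
    print(f">{taxon}\n{seq}")
```

The partition list records where each locus sits in the concatenated matrix, which is the kind of information a partitioned tree search typically needs. Padding missing taxa with gaps keeps every row the same length, at the cost of adding missing data that loci filtering would normally try to limit.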

Cite this article

Shiyu Du, Yinhuan Ding, Hu Li, Aibing Zhang, Arong Luo, Chaodong Zhu, Feng Zhang. Construction of a phylogenetic matrix: scripts and guidelines for phylogenomics[J]. Zoological Systematics, 2023, 48(2): 107-116. DOI: 10.11865/zs.2023201
