SNP density impact on kinship inference and IBS-machine learning optimization.
Yi Chuan 2026 Jun; 48(6):570-588.

Abstract

In recent years, forensic genetics studies have reported multiple panels containing varying numbers of single nucleotide polymorphisms (SNPs) for kinship inference. However, the impact of SNP number on inference performance and the application of machine learning algorithms have not been systematically explored. We therefore evaluated how SNP number affects kinship inference performance and how machine learning methods can optimize the identity-by-state (IBS) algorithm. We constructed multiple SNP panels containing 15,476 to 20,838 SNPs and, using simulated pedigrees, evaluated the performance of the likelihood ratio (LR) method and the IBS algorithm at each panel size. After selecting the optimal SNP panel, we validated it using real pedigrees and then combined the IBS algorithm with machine learning methods to enhance inference performance. For the LR method, the sensitivity in inferring sixth and seventh degree kinships showed a significant positive correlation with SNP number. For the IBS algorithm, the sensitivity in inferring fourth to seventh degree kinships also showed a significant positive correlation with SNP number, but the actual improvement was limited (an increase of only 0.5% to 2.2%). Based on these results, we selected the panel containing 20,838 SNPs (21K panel) as optimal. With the LR method, the 21K panel could accurately infer kinships within the sixth degree (sensitivity of 93.65% for sixth degree kinship inference); with the IBS algorithm, it could accurately infer kinships within the third degree (sensitivity of 86.79% for third degree kinship inference). After combining the IBS algorithm with machine learning, the sensitivity for fourth degree kinship inference improved from 69.10% to 87.66%, and the sensitivities for fifth and sixth degree kinships improved from 38.03% and 21.41% to 48.75% and 37.80%, respectively.
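
The abstract does not state how the IBS score is computed. As a rough illustration only, the sketch below uses the common convention for biallelic SNPs, assuming genotypes are coded as alternate-allele counts (0, 1 or 2), so that the number of alleles shared identical-by-state at a SNP is 2 minus the absolute genotype difference. The function names `ibs_counts` and `mean_ibs` and the toy genotypes are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def ibs_counts(g1: Sequence[int], g2: Sequence[int]) -> Tuple[int, int, int]:
    """Count SNPs at which two individuals share 0, 1 or 2 alleles IBS.

    Assumption: genotypes are biallelic and coded as alternate-allele
    counts (0, 1, 2); missing genotypes are not handled in this sketch.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    counts = [0, 0, 0]
    for a, b in zip(g1, g2):
        shared = 2 - abs(a - b)  # alleles shared IBS at this SNP
        counts[shared] += 1
    return counts[0], counts[1], counts[2]


def mean_ibs(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Average number of alleles shared IBS per SNP (between 0 and 2)."""
    ibs0, ibs1, ibs2 = ibs_counts(g1, g2)
    n = ibs0 + ibs1 + ibs2
    if n == 0:
        raise ValueError("no SNPs to compare")
    return (ibs1 + 2 * ibs2) / n


# Toy genotypes at four SNPs (hypothetical values, not from the study).
person_a = [0, 1, 2, 2]
person_b = [0, 1, 0, 1]
pair_counts = ibs_counts(person_a, person_b)
pair_score = mean_ibs(person_a, person_b)
```

In a workflow like the one described, per-pair summaries of this kind (for example the IBS0, IBS1 and IBS2 counts) could plausibly serve as input features for a machine learning classifier of kinship degree, although the abstract does not say which features or models the authors used.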

Authors and Affiliations

Dai L: School of Criminal Investigation, People's Public Security University of China, Beijing 100038, China. Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Tang ZC: School of Life Sciences, Jiangsu Normal University, Xuzhou 221116, China.
Jia Z: School of Criminal Investigation, People's Public Security University of China, Beijing 100038, China. Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Jiang L: Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Zhao CT: School of Criminal Investigation, People's Public Security University of China, Beijing 100038, China. Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Zhao ZY: School of Criminal Investigation, People's Public Security University of China, Beijing 100038, China. Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Zhao WT: Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.
Li CX: School of Criminal Investigation, People's Public Security University of China, Beijing 100038, China. Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China.

Pub Type(s)

Journal Article

Language

eng

PubMed ID

42309793