In recent years, multiple panels containing varying numbers of single nucleotide polymorphisms (SNPs) have been reported in forensic genetics for kinship inference. However, systematic exploration of the impact of SNP number on inference performance, and of the application of machine learning algorithms, remains lacking. Therefore, we evaluated the impact of SNP number on kinship inference performance and the optimization effects of machine learning methods on the identity-by-state (IBS) algorithm. We constructed multiple SNP panels with SNP numbers ranging from 15,476 to 20,838, and evaluated the performance of the likelihood ratio (LR) method and the IBS algorithm for kinship inference under different SNP numbers based on simulated pedigrees. After selecting the optimal SNP panel, we validated it using real pedigrees and further combined the IBS algorithm with machine learning methods to enhance inference performance. Our results showed that for the LR method, the sensitivity in inferring sixth and seventh degree kinships exhibited a significant positive correlation with SNP number. For the IBS algorithm, although the sensitivity in inferring fourth to seventh degree kinships showed a significant positive correlation with SNP number, the actual improvement was limited (an increase of only 0.5% to 2.2%). Based on these results, we determined the optimal panel to be the one containing 20,838 SNPs (21K panel). The 21K panel based on the LR method could accurately infer kinships within the sixth degree (with a sensitivity of 93.65% for sixth degree kinship inference), and the 21K panel based on the IBS algorithm could accurately infer kinships within the third degree (with a sensitivity of 86.79% for third degree kinship inference). After combining the IBS algorithm with machine learning, the sensitivity for fourth degree kinship inference improved from 69.10% to 87.66%, and the sensitivities for fifth and sixth degree kinships improved from 38.03% and 21.41% to 48.75% and 37.80%, respectively.
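The abstract reports IBS-based inference and per-degree sensitivity without giving formulas. As a rough illustration only, the sketch below shows one common way these quantities are computed. It assumes biallelic SNP genotypes coded as alternate-allele counts (0, 1, 2), a per-SNP IBS score of 2 minus the absolute genotype difference, and sensitivity defined as the fraction of pairs of a given true degree that are assigned that degree. The function names `mean_ibs` and `sensitivity` are hypothetical and are not taken from the article, whose exact scoring and thresholds may differ.

```python
from typing import Sequence


def ibs_per_snp(g1: int, g2: int) -> int:
    """Alleles shared identical-by-state at one biallelic SNP (0, 1, or 2).

    Assumption: genotypes are alternate-allele counts in {0, 1, 2}.
    """
    return 2 - abs(g1 - g2)


def mean_ibs(a: Sequence[int], b: Sequence[int]) -> float:
    """Average IBS score across all SNPs for a pair of individuals."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("genotype vectors must be non-empty and equal length")
    return sum(ibs_per_snp(x, y) for x, y in zip(a, b)) / len(a)


def sensitivity(true_degrees: Sequence[int],
                predicted_degrees: Sequence[int],
                degree: int) -> float:
    """Fraction of pairs with the given true degree that were predicted as it."""
    pairs = [(t, p) for t, p in zip(true_degrees, predicted_degrees) if t == degree]
    if not pairs:
        raise ValueError("no pairs with the requested true degree")
    correct = sum(1 for t, p in pairs if p == t)
    return correct / len(pairs)


if __name__ == "__main__":
    # Toy genotypes for two individuals at four SNPs (illustrative values only).
    person_a = [0, 1, 2, 2]
    person_b = [0, 1, 0, 1]
    print(mean_ibs(person_a, person_b))

    # Toy labels: three true fourth degree pairs, one true fifth degree pair.
    truth = [4, 4, 4, 5]
    predicted = [4, 4, 3, 5]
    print(sensitivity(truth, predicted, degree=4))
```

In a machine learning setting such as the one the abstract describes, summary statistics like the mean IBS score would typically serve as input features to a classifier; which features and models the authors used is not stated here.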
Abstract
Journal Article
eng
42309793
Dai, Lv, et al. "SNP Density Impact On Kinship Inference and IBS-machine Learning Optimization." Yi Chuan = Hereditas, vol. 48, no. 6, 2026, pp. 570-588.
Dai L, Tang ZC, Jia Z, et al. SNP density impact on kinship inference and IBS-machine learning optimization. Yi Chuan. 2026;48(6):570-588.
Dai, L., Tang, Z. C., Jia, Z., Jiang, L., Zhao, C. T., Zhao, Z. Y., Zhao, W. T., & Li, C. X. (2026). SNP density impact on kinship inference and IBS-machine learning optimization. Yi Chuan = Hereditas, 48(6), 570-588. https://doi.org/10.16288/j.yczz.25-287
Dai L, et al. SNP Density Impact On Kinship Inference and IBS-machine Learning Optimization. Yi Chuan. 2026;48(6):570-588. PubMed PMID: 42309793.
TY - JOUR
T1 - SNP density impact on kinship inference and IBS-machine learning optimization.
AU - Dai,Lv,
AU - Tang,Zi-Chen,
AU - Jia,Zhen,
AU - Jiang,Li,
AU - Zhao,Chuan-Tong,
AU - Zhao,Zhi-Yuan,
AU - Zhao,Wen-Ting,
AU - Li,Cai-Xia,
PY - 2026/6/18/medline
PY - 2026/6/18/pubmed
PY - 2026/6/17/entrez
KW - IBS method
KW - kinship inference
KW - likelihood ratio method
KW - machine learning
SP - 570
EP - 588
JF - Yi chuan = Hereditas
JO - Yi Chuan
VL - 48
IS - 6
N2 - In recent years, multiple panels containing varying numbers of single nucleotide polymorphisms (SNPs) have been reported in forensic genetics for kinship inference. However, systematic exploration of the impact of SNP number on inference performance, and of the application of machine learning algorithms, remains lacking. Therefore, we evaluated the impact of SNP number on kinship inference performance and the optimization effects of machine learning methods on the identity-by-state (IBS) algorithm. We constructed multiple SNP panels with SNP numbers ranging from 15,476 to 20,838, and evaluated the performance of the likelihood ratio (LR) method and the IBS algorithm for kinship inference under different SNP numbers based on simulated pedigrees. After selecting the optimal SNP panel, we validated it using real pedigrees and further combined the IBS algorithm with machine learning methods to enhance inference performance. Our results showed that for the LR method, the sensitivity in inferring sixth and seventh degree kinships exhibited a significant positive correlation with SNP number. For the IBS algorithm, although the sensitivity in inferring fourth to seventh degree kinships showed a significant positive correlation with SNP number, the actual improvement was limited (an increase of only 0.5% to 2.2%). Based on these results, we determined the optimal panel to be the one containing 20,838 SNPs (21K panel). The 21K panel based on the LR method could accurately infer kinships within the sixth degree (with a sensitivity of 93.65% for sixth degree kinship inference), and the 21K panel based on the IBS algorithm could accurately infer kinships within the third degree (with a sensitivity of 86.79% for third degree kinship inference). After combining the IBS algorithm with machine learning, the sensitivity for fourth degree kinship inference improved from 69.10% to 87.66%, and the sensitivities for fifth and sixth degree kinships improved from 38.03% and 21.41% to 48.75% and 37.80%, respectively.
SN - 0253-9772
UR - https://www.unboundmedicine.com/prime/citation/42309793/SNP_density_impact_on_kinship_inference_and_IBS-machine_learning_optimization.
DB - PRIME
DP - Unbound Medicine
ER -


