Leveraging Bayesian networks and information theory to learn risk factors for breast cancer metastasis.
BMC Bioinformatics. 2020 Jul 10; 21(1):298.

Abstract

BACKGROUND

Even though we have established a few risk factors for metastatic breast cancer (MBC) through epidemiologic studies, these risk factors have not proven to be effective in predicting an individual's risk of developing metastasis. Therefore, identifying critical risk factors for MBC continues to be a major research imperative, and one which can lead to advances in breast cancer clinical care. The objective of this research is to leverage Bayesian Networks (BN) and information theory to identify key risk factors for breast cancer metastasis from data.

METHODS

We develop the Markov Blanket and Interactive risk factor Learner (MBIL) algorithm, which learns single and interactive risk factors having a direct influence on a patient's outcome. We evaluate the effectiveness of MBIL using simulated datasets, and compare MBIL with the BN learning algorithms Fast Greedy Search (FGS), the PC algorithm (PC), and the CPC algorithm (CPC). We apply MBIL to learn risk factors for 5-year breast cancer metastasis using a clinical dataset we curated. We evaluate the learned risk factors by consulting breast cancer experts and the literature. We further evaluate the effectiveness of MBIL at learning risk factors for breast cancer metastasis by comparing it to the BN learning algorithms Necessary Path Condition (NPC) and Greedy Equivalence Search (GES).
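To illustrate why interactive risk factors need separate treatment, here is a minimal Python sketch of an information-theoretic score. It is not the MBIL algorithm, whose scoring function the abstract does not describe. The helper names (`entropy`, `information_gain`) and the toy variables `a`, `b`, and `outcome` are assumptions made for illustration. The sketch shows that two variables can each carry no information about an outcome on their own while determining it jointly.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(xs, ys):
    """Mutual information I(X;Y) = H(Y) - H(Y|X) for discrete sequences."""
    n = len(ys)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    # H(Y|X): entropy of Y within each value of X, weighted by frequency.
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(ys) - h_cond


# Hypothetical toy data: the outcome depends only on the combination of a and b.
a = [0, 0, 1, 1] * 25
b = [0, 1, 0, 1] * 25
outcome = [x ^ y for x, y in zip(a, b)]

ig_a = information_gain(a, outcome)                 # a alone
ig_b = information_gain(b, outcome)                 # b alone
ig_ab = information_gain(list(zip(a, b)), outcome)  # a and b jointly

print(f"IG(a)={ig_a:.3f}  IG(b)={ig_b:.3f}  IG(a,b)={ig_ab:.3f}")
```

In this toy case a search that scores variables one at a time would discard both `a` and `b`, which is the kind of situation an interaction-aware learner is meant to handle.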

RESULTS

On the simulated datasets containing 2000 records, the average Jaccard index was 0.705, 0.272, 0.228, and 0.147 for MBIL, FGS, PC, and CPC, respectively. MBIL, NPC, and GES all learned that grade and lymph_nodes_positive are direct risk factors for 5-year metastasis. Only MBIL and NPC found that surgical_margins is a direct risk factor. Only NPC found that invasive is a direct risk factor. MBIL learned that HER2 and ER interact to directly affect 5-year metastasis. Neither GES nor NPC learned that HER2 and ER are direct risk factors.
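For readers unfamiliar with the metric, the Jaccard index between a learned set of direct risk factors and the true set is the size of their intersection divided by the size of their union. The following Python sketch shows the computation. The function name `jaccard_index` and the placeholder variable names are assumptions, and the example sets are hypothetical rather than taken from the study.

```python
def jaccard_index(learned, true):
    """Jaccard index |learned ∩ true| / |learned ∪ true| for two collections."""
    learned, true = set(learned), set(true)
    union = learned | true
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 1.0
    return len(learned & true) / len(union)


# Hypothetical example: four true direct risk factors, of which three are
# recovered, plus one false positive.
true_parents = {"A", "B", "C", "D"}
learned_parents = {"A", "B", "C", "E"}

score = jaccard_index(learned_parents, true_parents)
print(score)  # 3 shared / 5 in the union = 0.6
```

A score of 1 means the learned set matches the true set exactly, and a score of 0 means the two sets share nothing.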

DISCUSSION

The results involving simulated datasets indicated that MBIL can learn direct risk factors substantially better than standard Bayesian network learning algorithms. An application of MBIL to a real breast cancer dataset identified both single and interactive risk factors that directly influence breast cancer metastasis, which can be investigated further.

Author Affiliations

Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Blvd, Pittsburgh, PA, 15217, USA. xij6@pitt.edu.
Department of Pathology, University of Pittsburgh and Pittsburgh VA Health System, Pittsburgh, PA, USA. UPMC Hillman Cancer Center, Pittsburgh, PA, USA.
UPMC Hillman Cancer Center, Pittsburgh, PA, USA. Division of Hematology/Oncology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Blvd, Pittsburgh, PA, 15217, USA.
Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Blvd, Pittsburgh, PA, 15217, USA.
Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.

Pub Type(s)

Journal Article

Language

eng

PubMed ID

32650714

Citation

Jiang, Xia, et al. "Leveraging Bayesian Networks and Information Theory to Learn Risk Factors for Breast Cancer Metastasis." BMC Bioinformatics, vol. 21, no. 1, 2020, p. 298.
Jiang X, Wells A, Brufsky A, et al. Leveraging Bayesian networks and information theory to learn risk factors for breast cancer metastasis. BMC Bioinformatics. 2020;21(1):298.
Jiang, X., Wells, A., Brufsky, A., Shetty, D., Shajihan, K., & Neapolitan, R. E. (2020). Leveraging Bayesian networks and information theory to learn risk factors for breast cancer metastasis. BMC Bioinformatics, 21(1), 298. https://doi.org/10.1186/s12859-020-03638-8
Jiang X, et al. Leveraging Bayesian Networks and Information Theory to Learn Risk Factors for Breast Cancer Metastasis. BMC Bioinformatics. 2020 Jul 10;21(1):298. PubMed PMID: 32650714.
TY  - JOUR
T1  - Leveraging Bayesian networks and information theory to learn risk factors for breast cancer metastasis.
AU  - Jiang, Xia
AU  - Wells, Alan
AU  - Brufsky, Adam
AU  - Shetty, Darshan
AU  - Shajihan, Kahmil
AU  - Neapolitan, Richard E
Y1  - 2020/07/10/
PY  - 2019/06/06/received
PY  - 2020/07/02/accepted
PY  - 2020/7/12/entrez
PY  - 2020/7/12/pubmed
PY  - 2020/7/12/medline
KW  - Bayesian network
KW  - Breast cancer
KW  - Interaction
KW  - Metastasis
KW  - Risk factor
SP  - 298
EP  - 298
JF  - BMC bioinformatics
JO  - BMC Bioinformatics
VL  - 21
IS  - 1
SN  - 1471-2105
UR  - https://www.unboundmedicine.com/medline/citation/32650714/Leveraging_Bayesian_networks_and_information_theory_to_learn_risk_factors_for_breast_cancer_metastasis
L2  - https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03638-8
DB  - PRIME
DP  - Unbound Medicine
ER  -