Cite this article
  • LI Yi, XIONG Xuan, ZHANG Yuan. Weighted gene co-expression network analysis for data mining of breast cancer biomarkers[J]. Acad J Sec Mil Med Univ, 2019, 40(9): 1001-1009
Mining breast cancer-related disease targets using weighted gene co-expression networks
LI Yi1, XIONG Xuan2, ZHANG Yuan2*
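(1. Department of Breast Surgery, Sichuan Academy of Medical Sciences·Sichuan Provincial People's Hospital, Chengdu 610072;
2. Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Academy of Medical Sciences·Sichuan Provincial People's Hospital, Chengdu 610072
*Corresponding author)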
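Abstract:
Objective To mine disease targets associated with age at diagnosis and tumor stage in breast cancer by weighted gene co-expression network analysis (WGCNA) of data from the public database The Cancer Genome Atlas (TCGA). Methods Gene chip expression data and the corresponding clinical indicators for 53 Asian and 126 African breast cancer cases were obtained from TCGA. The WGCNA package in R was then used to construct a co-expression network for each of the 2 populations and to identify modules significantly correlated with age at diagnosis and tumor stage. Functional enrichment was performed with the online tool DAVID, and survival analysis with the online tool UALCAN. Results WGCNA yielded 11 modules significantly associated with tumor stage and age at diagnosis. Intersecting the 11 modules gave 42 candidate genes; gene ontology (GO) enrichment analysis with DAVID showed that these candidate genes were mainly enriched in protein binding function. Nine of the 42 candidate genes, identified by WGCNA as hub genes, were submitted to UALCAN for differential expression and survival analysis, which finally screened out 2 candidate biomarkers (ERLIN2 and ASH2L). The expression of these 2 genes differed significantly between normal and cancer tissues (P<0.01), and their expression levels affected the survival of breast cancer patients (P<0.05). Conclusion Data mining is an efficient and economical way to search for biomarkers or disease targets. This study identified ERLIN2 and ASH2L as candidate biomarkers for breast cancer through data mining; they can be taken forward to large-sample clinical validation and mechanistic investigation.
Key words:  weighted gene co-expression network analysis  breast neoplasms  data mining  biological tumor markers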
DOI:10.16781/j.0258-879x.2019.09.1001
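Received: 2019-05-21  Revised: 2019-06-26
Funding: National Key Clinical Specialty Construction Project in Clinical Pharmacy (30305030698), Basic Scientific Research Funds for Provincial Public-Welfare Research Institutes of Sichuan Academy of Medical Sciences (30504010425), Youth Talent Fund of Sichuan Academy of Medical Sciences·Sichuan Provincial People's Hospital (2017QN15), General Project of Sichuan Provincial Health and Family Planning Commission (18PJ554).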
Weighted gene co-expression network analysis for data mining of breast cancer biomarkers
LI Yi1,XIONG Xuan2,ZHANG Yuan2*
(1. Department of Breast Surgery, Sichuan Academy of Medical Sciences·Sichuan Provincial People's Hospital, Chengdu 610072, Sichuan, China;
2. Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Academy of Medical Sciences·Sichuan Provincial People's Hospital, Chengdu 610072, Sichuan, China
*Corresponding author)
Abstract:
Objective To identify disease targets of breast cancer related to age at diagnosis and tumor stage by weighted gene co-expression network analysis (WGCNA) of data from the public database The Cancer Genome Atlas (TCGA). Methods We obtained breast cancer gene chip expression data and the corresponding clinical data of 53 Asian and 126 African patients from the TCGA database. The WGCNA package in R was used to construct a co-expression network for each of the two populations, and the modules significantly related to age at diagnosis and tumor stage were identified. The online tool DAVID was used for functional enrichment and the online tool UALCAN for survival analysis. Results WGCNA yielded 11 modules significantly related to tumor stage and age at diagnosis. Forty-two candidate genes were obtained by intersecting the 11 modules. Gene ontology (GO) enrichment analysis with DAVID showed that these genes were mainly involved in protein binding function. Nine of the 42 candidate genes were identified as hub genes by WGCNA. These 9 genes were submitted to UALCAN for differential expression analysis and survival analysis, and 2 candidate biomarkers (ERLIN2 and ASH2L) were screened out. The expression of the 2 genes differed significantly between normal tissues and breast cancer tissues (P<0.01), and their expression levels significantly influenced the survival of breast cancer patients (P<0.05). Conclusion Mining public databases for biomarkers or therapeutic targets is an efficient and cost-effective research method. In this study, ERLIN2 and ASH2L were identified as candidate biomarkers for breast cancer through data mining; both still require validation in large clinical samples and mechanistic investigation.
Key words:  weighted gene co-expression network analysis  breast neoplasms  data mining  biological tumor markers
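
The core steps named in the abstract (building a weighted co-expression network, relating modules to a clinical trait, ranking hub genes, and intersecting gene sets across populations) can be illustrated with a minimal sketch. This is not the authors' pipeline: the study used the WGCNA package in R, whereas the sketch below is plain Python with NumPy, runs on synthetic data, and omits the topological overlap matrix and module detection by hierarchical clustering. The function names, the soft-threshold power `beta=6`, the unsigned network type, and the gene labels `GENE_A` to `GENE_C` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor(i, j)|**beta.

    expr: array of shape (samples, genes). beta is a hypothetical choice.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj


def connectivity(adj):
    """Sum of connection weights per gene; high values suggest hub genes."""
    return adj.sum(axis=1)


def module_eigengene(expr_module):
    """First principal component (across samples) of a module's
    standardized expression matrix."""
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # Orient the sign so the eigengene follows the module's mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


def module_trait_correlation(eigengene, trait):
    """Pearson correlation between a module eigengene and a clinical trait."""
    return float(np.corrcoef(eigengene, trait)[0, 1])


def intersect_candidates(*gene_sets):
    """Genes shared by every supplied gene set."""
    return set.intersection(*(set(g) for g in gene_sets))


# Synthetic demonstration: 40 samples, 10 genes. Genes 0-4 form a module
# driven by the trait; genes 5-9 are pure noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=40)
module_genes = trait[:, None] + 0.5 * rng.normal(size=(40, 5))
noise_genes = rng.normal(size=(40, 5))
expr = np.hstack([module_genes, noise_genes])

adj = soft_threshold_adjacency(expr, beta=6)
hub_ranking = np.argsort(connectivity(adj))[::-1]  # most connected first
r = module_trait_correlation(module_eigengene(expr[:, :5]), trait)
shared = intersect_candidates({"GENE_A", "GENE_B"}, {"GENE_B", "GENE_C"})
print(hub_ranking[:5], round(r, 3), shared)
```

Raising the absolute correlation to a power suppresses weak, noisy correlations while preserving strong ones, which is why the trait-driven genes in the synthetic data end up with the highest connectivity. In a real analysis the power would be chosen from the data rather than fixed in advance, and modules would come from clustering rather than being known beforehand.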