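Establishing a Prognostic Model for Early-Stage Non-Small-Cell Lung Cancer Based on Biological Functional Cluster Analysis of Microarray Data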
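Abstract
Background and Objectives
     Patients with non-small-cell lung cancer (NSCLC) at the same TNM stage vary widely in prognosis. Even among early-stage (stage I and II) patients who have undergone radical resection, survival is clearly lower than expected. This indicates that the current TNM staging system, which is based on anatomical features, does not adequately reflect prognosis. A series of studies has therefore examined the biological differences of the tumor tissue itself, aiming to find gene signatures that indicate poor prognosis.
     Prognostic gene signatures have long been a research focus in this field, and a number of studies have been reported. However, the staging committee of the IASLC (International Association for the Study of Lung Cancer) did not adopt such results in the latest (7th edition) staging revision, mainly because these prognostic methods were not yet considered mature. Most results fail validation in an independent cohort, or at least show much lower sensitivity and specificity in the validation cohort than in the training cohort.
     This study aimed to explore how the biological characteristics of early-stage NSCLC tumor tissue differ between patients with different prognoses. First, a model was built following the routine machine learning workflow. In addition, to avoid limitations of machine learning such as overfitting, the study stepped back from the sheer volume of data and focused on the patterns behind it, comprehensively analyzing the categories of biological function related to early-stage NSCLC prognosis. Taking biological functional clustering built on a large body of data as the entry point, a functional prognostic model was established from a coherent combination of multiple genes.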
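     Methods
     Part 1 Early-stage NSCLC prognostic model established by machine learning process
     With approval from the ethics committee of Guangdong Provincial People's Hospital and signed informed consent from patients, 120 tumor tissue specimens and 53 matched normal tissue specimens were collected from NSCLC patients who underwent surgery between April 2003 and June 2006. Patients were followed up routinely and follow-up data were recorded in full. Tumor tissue was purified according to pathological assessment so that tumor cells made up more than 80% of all cells. Tumor RNA was extracted by routine methods, and genome-wide expression profiling was performed on Affymetrix U133 Plus 2.0 arrays.
     This study focused on prognostic modeling for early-stage NSCLC. A total of 86 early-stage (stage I/II) patients were enrolled. Those with postoperative overall survival of less than 2.5 years were assigned to the high-risk group, and those with postoperative disease-free survival of more than 5 years to the low-risk group; 50 cases in total entered the modeling dataset.
     Over the same period, 127 tumor tissue specimens were collected (105 of them overlapping with the 120 samples above). After the same pathological assessment, tumor DNA was extracted by routine methods, and data were acquired and quality-checked following the Agilent Oligonucleotide Array-Based CGH for Genomic DNA Analysis protocol. Copy number variation in early-stage NSCLC was analyzed to identify genes with high-frequency (>10%) copy number events.
     The prognostic model was built by a routine machine learning approach: candidate genes for the risk score (RS) model were screened by the following three conditions, and the model was then built by forward selection:
     (1) Genes with marked differential expression between matched tumor and normal tissue.
     (2) Genes with copy number aberration (CNA) events in more than 10% of tumor samples.
     (3) Genes with P<0.05 in univariate Cox regression survival analysis.
     The prognostic performance of the model was assessed in the training cohort and in an independent validation cohort.
     Part 2 Early-stage NSCLC prognostic model established by gene functional cluster
     The materials and data processing for the expression arrays and comparative genomic hybridization (CGH) arrays were the same as in Part 1. For the CGH array results, this part focused on CNA events of chromosomal segments present in more than 3% of samples. By linking chromosomal segments to the genes with focal amplification or focal deletion within them, the biological function categories that play important roles in the development and progression of NSCLC were studied.
     For the expression arrays, the study took biological functional cluster analysis as its focus. Based on the dataset from our center and combined with four auxiliary datasets, the main functional categories related to early-stage NSCLC prognosis and their representative genes were identified. A functional prognostic model was constructed manually to fully reflect the relationship between each functional category and early-stage NSCLC prognosis, and was validated in an independent validation cohort. Ways of improving the discriminative performance of the prognostic model were then analyzed.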
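     Results
     Part 1 Early-stage NSCLC prognostic model established by machine learning process
     1. Establishment of the machine learning prognostic model
     The three conditions together yielded 22 candidate genes for the model:
     (1) Genes with marked differential expression between matched tumor and normal tissue: 2383
     (2) Genes with CNA events in more than 10% of tumor samples: 953
     (3) Genes with P<0.05 in univariate Cox regression survival analysis: 1381
     Forward selection gave a four-gene RS model: RS=(CLDN11×0.777)+(SATB1×1.379)+(ANLN×1.334)+(NUF2×-0.651)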
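     2. Prognostic performance of the machine learning model in the training cohort
     (1) All 50 samples:
     Log rank test: P<0.001; specificity: 24/28=85.7%; sensitivity: 18/22=81.8%; accuracy: (24+18)/50=84.0%.
     (2) Samples at the same TNM stage (38 stage I samples):
     P<0.001; specificity: 22/25=88.0%; sensitivity: 11/13=84.6%; accuracy: (22+11)/38=86.8%.
     (3) 18 stage I adenocarcinoma (AC) samples:
     P=0.001; specificity: 13/14=92.9%; sensitivity: 3/4=75.0%; accuracy: (13+3)/18=88.9%.
     3. Prognostic performance of the machine learning model in the independent dataset published by Lee et al. (Korea)
     (1) Lee et al. dataset (all 70 samples):
     P=0.013; specificity: 23/35=65.7%; sensitivity: 23/35=65.7%; accuracy: (23+23)/70=65.7%.
     (2) Lee et al. dataset (31 AC samples):
     P=0.072; specificity: 11/18=61.1%; sensitivity: 10/13=76.9%; accuracy: (11+10)/31=67.7%.
     (3) Lee et al. dataset (39 squamous cell carcinoma (SCC) samples):
     P=0.063; specificity: 12/21=57.1%; sensitivity: 13/18=72.2%; accuracy: (12+13)/39=64.1%.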
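     4. For the datasets of Lee et al., Hou et al. and Bild et al., the four genes chosen by machine learning were refitted by multivariate Cox regression, and the prognostic performance of the rebuilt models was analyzed:
     (1) Lee et al. dataset (all 70 samples):
     RS formula: RS=(CLDN11×0.079)+(SATB1×0.065)+(ANLN×0.681)+(NUF2×-0.353)
     P=0.004; specificity: 23/35=65.7%; sensitivity: 23/35=65.7%; accuracy: (23+23)/70=65.7%.
     (2) Bild et al. dataset (all 52 samples):
     RS formula: RS=(CLDN11×-0.019)+(SATB1×0.110)+(ANLN×0.275)+(NUF2×-0.074)
     The combination of these four genes was not sufficient to assess prognosis (P=0.892).
     (3) Hou et al. dataset (all 48 samples):
     RS formula: RS=(CLDN11×-0.029)+(SATB1×-0.014)+(ANLN×-0.070)+(NUF2×0.264)
     The combination of these four genes was not sufficient to assess prognosis (P=0.713).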
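     5. Limitations of gene selection in the machine learning process
     Starting from the full set of 50 samples, 2 samples were removed in each of two separate runs, giving two differently composed experimental sets of 48 samples each, denoted experimental set 1 and experimental set 2. Univariate Cox regression survival analysis showed:
     (1) Genes with P<0.05 in experimental set 1: 1358.
     (2) Genes with P<0.05 in experimental set 2: 1359.
     (3) Genes shared by experimental set 1 and experimental set 2: 1130.
     (4) Genes shared by experimental set 1 and the full set of 50 samples: 1186.
     (5) Genes shared by experimental set 2 and the full set of 50 samples: 1240.
     (6) Genes shared by all three datasets: 1113, which differs by 268 genes from the 1381 genes obtained from the full set of 50 samples.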
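     Part 2 Early-stage NSCLC prognostic model established by gene functional cluster
     1. NSCLC tumor samples show marked DNA copy number variation at the genome-wide level
     Analysis of genes with high-frequency CNA showed that the main biological function categories related to the development and progression of NSCLC were cell proliferation/differentiation, cell cycle, apoptosis, cell adhesion, and immune response, among others.
     2. Biological function categories and representative genes with good prognostic value for early-stage NSCLC
     Cell cycle: ANLN, BUB1B and CCDC99;
     Cell proliferation: DUSP4, STIL and MKI67;
     Cell adhesion: HMMR and CD9;
     Apoptosis: KIAA0101 and BIRC5;
     Immune response: CD1A and C5;
     Blood coagulation: F12 and PGDS;
     Metabolism: LPGAT1 and PPARGC1A.
     Among these, cell cycle and cell proliferation genes were the most important prognosis-associated genes, but they could not substitute for the contribution of the other functional categories.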
     3. Model built on biological function (using the 16 genes) and its prognostic performance
     (1) Data from this study (all 50 samples):
     RS formula: RS=(MKI67×-1.227)+(ANLN×1.296)+(BUB1B×0.700)+(CCDC99×2.048)+(DUSP4×-0.853)+(STIL×-2.255)+(HMMR×-0.483)+(CD9×-2.083)+(KIAA0101×2.907)+(BIRC5×-1.371)+(CD1A×-0.108)+(C5×-1.333)+(LPGAT1×1.853)+(PPARGC1A×1.765)+(F12×-0.393)+(PGDS×0.246)
     Log rank test: P<0.001; specificity: 25/28=89.3%; sensitivity: 19/22=86.4%; accuracy: (25+19)/50=88.0%.
     (2) Data from this study (38 stage I samples):
     P<0.001; specificity: 22/25=88.0%; sensitivity: 12/13=92.3%; accuracy: (22+12)/38=89.5%.
     (3) Data from this study (18 stage I AC samples):
     P<0.001; specificity: 13/14=92.9%; sensitivity: 4/4=100.0%; accuracy: (13+4)/18=94.4%.
     (4) Lee et al. dataset (all 70 samples):
     RS formula: RS=(MKI67×-0.024)+(ANLN×0.414)+(BUB1B×0.986)+(CCDC99×0.765)+(DUSP4×-0.001)+(STIL×0.762)+(HMMR×-0.429)+(CD9×-0.261)+(KIAA0101×-0.401)+(BIRC5×-0.490)+(CD1A×0.291)+(C5×-0.316)+(LPGAT1×-0.142)+(PPARGC1A×0.796)+(F12×-0.009)+(PGDS×0.648)
     Log rank test: P<0.001; specificity: 26/35=74.3%; sensitivity: 26/35=74.3%; accuracy: (26+26)/70=74.3%.
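     4. Analysis of differences in the sensitivity and specificity of the prognostic models
     When samples were ranked in descending order of the risk score (RS) given by a prognostic model, the misclassified samples were concentrated mainly in a gray area in the middle of the ranking. The width of this gray area, and how tightly the errors cluster within it, were directly related to the prognostic performance of the model. This suggests how the sensitivity and specificity of the model might be improved.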
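     Conclusions
     1. The model built by machine learning gave good prognostic discrimination in the training cohort (sensitivity and specificity above 80%, independent of TNM stage), but its performance in the independent validation cohort, which was also East Asian, was insufficient.
     2. Two of the four genes in the machine learning model are cell cycle genes, and cell cycle and cell proliferation genes are the two categories most closely related to early-stage NSCLC prognosis. However, cell cycle and cell proliferation genes alone are not sufficient to determine prognosis. Five further categories are related to early-stage NSCLC prognosis: cell adhesion, apoptosis, immune response, metabolism, and blood coagulation.
     3. A functional prognostic model composed of 16 representative genes from the 7 prognosis-related functional categories separated the independent validation cohort better than the machine learning model, and performed no worse even on the training cohort of the machine learning model. This indicates that combining several related biological functions can improve prognostic performance.
     4. Analysis of why the prognostic models made wrong calls in different datasets showed that, when prognosis of NSCLC patients is assessed from genes, the misclassified samples are concentrated mainly in the middle range of the risk score, the gray area.
     5. According to their biological functions, genes can be divided into two broad classes: genes associated with good prognosis (Gene_prognosis-positive) and genes associated with poor prognosis (Gene_prognosis-negative). The final prognosis depends on the contest between the high-risk and low-risk functional states. If neither clearly dominates, prognosis cannot be determined from the gene model; this is the gray area of prognostic judgment. This also points to directions for improving the functional model:
     (1) Narrow the gray area as far as possible.
     (2) Determine the cut-off values that separate the high-risk group and the low-risk group, respectively, from the current gray area.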
Background and Objectives
     Patients with non-small-cell lung cancer (NSCLC) at the same TNM stage can differ widely in prognosis. Even patients with early-stage NSCLC show lower-than-expected survival after surgical resection, indicating that the current TNM staging system does not adequately predict outcome. Studies of tumor biological characteristics have therefore emerged, aiming to identify prognostic gene signatures.
     Prognostic gene signatures are a research focus in this field, and a variety of related studies have been reported. However, the IASLC International Staging Committee did not include these results in the latest (7th) edition of the TNM staging system, mainly because the results were not yet stable enough. For example, prognostic models established on a training cohort often failed validation in an independent cohort, or at least showed reduced sensitivity and specificity.
     Our research focused on the biological characteristics related to early-stage NSCLC prognosis. On the one hand, we established a prognostic model by the routine machine learning process. On the other hand, we tried to focus on the patterns revealed by the data rather than on the vast amount of data itself, and studied the biological functions related to early-stage NSCLC prognosis. Functional clustering was chosen as the entry point for array data analysis to establish a prognostic model for early-stage NSCLC.
     Methods
     Part 1 Early-stage NSCLC prognostic model established by machine learning process
     A total of 120 NSCLC samples and 53 matched normal tissues were obtained with informed consent between April 2003 and June 2006. Follow-up was carried out routinely and recorded in detail. Tumor samples were purified based on histological assessment to ensure that tumor cells accounted for more than 80% of all cells. Affymetrix U133 Plus 2.0 arrays were used for high-throughput gene expression analysis.
     Our research focused on establishing a prognostic model for early-stage NSCLC. In total, 86 samples (stage Ⅰ-Ⅱ) were enrolled. Samples with overall survival (OS) of less than 2.5 years were assigned to the high-risk group, and samples with disease-free survival (DFS) of more than 5 years were assigned to the low-risk group. The 50 samples with a definite prognostic subgroup were analyzed to establish the model.
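The grouping rule can be sketched in Python as follows. This is an illustration only: the function and argument names are hypothetical, and the handling of censored follow-up (a short OS counts only when death was actually observed) is an assumption that the abstract does not specify.

```python
from typing import Optional

HIGH_RISK_MAX_OS_YEARS = 2.5   # from the text: OS of less than 2.5 years
LOW_RISK_MIN_DFS_YEARS = 5.0   # from the text: DFS of more than 5 years


def assign_risk_group(os_years: float, death_observed: bool,
                      dfs_years: float) -> Optional[str]:
    """Return 'high', 'low', or None when no definite group applies."""
    # Assumption: a short OS only defines high risk if the death was observed,
    # not if the patient was merely lost to follow-up early.
    if death_observed and os_years < HIGH_RISK_MAX_OS_YEARS:
        return "high"
    # More than 5 years without disease defines low risk.
    if dfs_years > LOW_RISK_MIN_DFS_YEARS:
        return "low"
    # Everything else has no definite prognostic subgroup and is excluded.
    return None
```

Under this rule, samples that fall into neither group are left out of model building, which is consistent with 50 of the 86 enrolled samples being used.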
     A further 127 samples obtained in the same period (105 in common with the 120 samples above) were processed with the Agilent Oligonucleotide Array-Based CGH for Genomic DNA Analysis protocol to obtain copy number aberration (CNA) results.
     The prognostic model was established by a routine machine learning process. The first step was candidate gene screening according to the three requirements below. The second step was establishment of the risk score (RS) formula by forward selection.
     (1) Differentially expressed genes between matched tumor and normal tissues
     (2) Genes with CNA in more than 10% of tumor samples
     (3) Genes with P<0.05 in univariate Cox regression of all samples
     The prognostic efficacy of the machine learning prognostic model was assessed in the training and validation cohorts respectively.
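A minimal sketch of the screening step, assuming that the three requirements are combined by intersecting the three gene lists (the abstract implies this but does not state it). The helper names and the toy gene lists are hypothetical, and the forward selection step is not shown.

```python
from typing import Dict, Iterable, List, Set


def frequent_cna_genes(cna_calls: Dict[str, List[bool]],
                       min_fraction: float = 0.10) -> Set[str]:
    """Genes whose CNA frequency across tumor samples exceeds min_fraction."""
    return {
        gene for gene, calls in cna_calls.items()
        if calls and sum(calls) / len(calls) > min_fraction
    }


def screen_candidates(diff_expressed: Iterable[str],
                      cna_frequent: Iterable[str],
                      cox_significant: Iterable[str]) -> Set[str]:
    """Keep only genes that meet all three requirements (assumed intersection)."""
    return set(diff_expressed) & set(cna_frequent) & set(cox_significant)


# Toy input, not real data: True marks a sample carrying a CNA in that gene.
toy_cna_calls = {
    "GENE_A": [True, True, False, False, False],    # CNA in 2 of 5 samples
    "GENE_B": [False, False, False, False, False],  # CNA in 0 of 5 samples
    "GENE_C": [True, False, False, False, False],   # CNA in 1 of 5 samples
}
toy_candidates = screen_candidates(
    diff_expressed={"GENE_A", "GENE_B", "GENE_C"},
    cna_frequent=frequent_cna_genes(toy_cna_calls),
    cox_significant={"GENE_A", "GENE_B"},
)
```

Here only the hypothetical GENE_A passes all three filters, since GENE_B has no frequent CNA and GENE_C is not significant in the Cox analysis.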
     Part 2 Early-stage NSCLC prognostic model established by gene functional cluster
     Expression arrays and CGH arrays were processed in the same way as in Part 1. For the CGH array data, we focused on CNA present in more than 3% of samples. Biological functions correlated with NSCLC were assessed through genes with focal amplification or focal deletion.
     For the expression array analysis, our research focused on biological functional clustering in order to identify the prognosis-correlated functions and representative genes. We established the model manually in order to represent the full range of functions related to early-stage NSCLC prognosis. The functional prognostic model was also assessed in an independent validation cohort, in order to provide insights for further refinement.
     Results
     Part 1 Early-stage NSCLC prognostic model established by machine learning process
     1. Establishment of machine learning prognostic model
     Twenty-two candidate genes were identified by the following three conditions:
     (1) Differentially expressed genes between matched tumor and normal tissues: 2383 genes
     (2) Genes with CNA in more than 10% of tumor samples: 953 genes
     (3) Genes with P<0.05 in univariate Cox regression: 1381 genes
     RS formula obtained by forward selection: RS=(CLDN11×0.777)+(SATB1×1.379)+(ANLN×1.334)+(NUF2×-0.651)
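To make the formula concrete, the sketch below evaluates RS as a weighted sum of expression values using the coefficients reported above. The expression values in the example are hypothetical, and the abstract states neither how expression was normalized before weighting nor which RS cut-off separates the risk groups.

```python
from typing import Dict, Mapping

# Coefficients of the four-gene model as reported in the text.
FOUR_GENE_COEFFICIENTS: Dict[str, float] = {
    "CLDN11": 0.777,
    "SATB1": 1.379,
    "ANLN": 1.334,
    "NUF2": -0.651,
}


def risk_score(expression: Mapping[str, float],
               coefficients: Mapping[str, float] = FOUR_GENE_COEFFICIENTS) -> float:
    """Weighted sum of expression values; raises KeyError if a gene is missing."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


# Hypothetical sample with unit expression for every gene, so the score
# reduces to the sum of the four coefficients.
example_expression = {"CLDN11": 1.0, "SATB1": 1.0, "ANLN": 1.0, "NUF2": 1.0}
example_rs = risk_score(example_expression)
```

Passing a different coefficient mapping to `risk_score` evaluates the refitted models in the same way, which makes it easy to see how much the weights change between datasets.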
     2. Validation of machine learning prognostic model in training cohort
     (1) All 50 samples:
     Log rank test: P<0.001; specificity: 24/28=85.7%; sensitivity: 18/22=81.8%; accuracy: (24+18)/50=84.0%.
     (2) Samples in the same TNM stage (38 samples in stage I):
     P<0.001; specificity: 22/25=88.0%; sensitivity: 11/13=84.6%; accuracy: (22+11)/38=86.8%.
     (3) 18 adenocarcinoma (AC) samples in stage I:
     P<0.001; specificity: 13/14=92.9%; sensitivity: 3/4=75.0%; accuracy: (13+3)/18=88.9%.
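The three figures reported for each subset follow directly from the counts of correctly classified samples. The sketch below reproduces the arithmetic, assuming that the high-risk group is treated as the positive class, so that the denominators of sensitivity and specificity are the numbers of high-risk and low-risk samples respectively. The function and variable names are illustrative.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    specificity: float
    sensitivity: float
    accuracy: float


def classification_metrics(low_correct: int, low_total: int,
                           high_correct: int, high_total: int) -> Metrics:
    """Specificity, sensitivity and accuracy from per-group correct counts."""
    return Metrics(
        specificity=low_correct / low_total,      # low-risk samples called low
        sensitivity=high_correct / high_total,    # high-risk samples called high
        accuracy=(low_correct + high_correct) / (low_total + high_total),
    )


# Counts taken from the three training-cohort subsets above.
all_samples = classification_metrics(24, 28, 18, 22)
stage_one = classification_metrics(22, 25, 11, 13)
stage_one_ac = classification_metrics(13, 14, 3, 4)
```

With only 4 high-risk samples in the stage I AC subset, a single misclassification moves sensitivity by 25 percentage points, so that figure should be read with caution.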
     3. Validation of machine learning prognostic model in the Lee et al. dataset
     (1) All 70 samples:
     P=0.013; specificity: 23/35=65.7%; sensitivity: 23/35=65.7%; accuracy: (23+23)/70=65.7%.
     (2) 31 AC samples:
     P=0.072; specificity: 11/18=61.1%; sensitivity: 10/13=76.9%; accuracy: (11+10)/31=67.7%.
     (3) 39 squamous cell carcinoma (SCC) samples:
     P=0.063; specificity: 12/21=57.1%; sensitivity: 13/18=72.2%; accuracy: (12+13)/39=64.1%.
     4. Validation of the four-gene Cox regression model (refitted with the genes chosen by the machine learning model) in the Lee et al., Hou et al. and Bild et al. datasets
     (1) Lee et al. dataset (all 70 samples):
     RS formula: RS=(CLDN11×0.079)+(SATB1×0.065)+(ANLN×0.681)+(NUF2×-0.353)
     P=0.004; specificity: 23/35=65.7%; sensitivity: 23/35=65.7%; accuracy: (23+23)/70=65.7%.
     (2) Bild et al. dataset (all 52 samples):
     RS formula: RS=(CLDN11×-0.019)+(SATB1×0.110)+(ANLN×0.275)+(NUF2×-0.074)
     The model failed to predict sample prognosis (P=0.892).
     (3) Hou et al. dataset (all 48 samples):
     RS formula: RS=(CLDN11×-0.029)+(SATB1×-0.014)+(ANLN×-0.070)+(NUF2×0.264)
     The model failed to predict sample prognosis (P=0.713).
     5. Limitations of the candidate gene screening process in machine learning model establishment
     Two separate experiment groups (48 samples each) were obtained by randomly removing 2 samples from the original cohort (50 samples). Univariate Cox regression showed:
     (1) Genes with P<0.05 in experiment group 1: 1358 genes.
     (2) Genes with P<0.05 in experiment group 2: 1359 genes.
     (3) Genes with P<0.05 in both experiment group 1 and group 2: 1130 genes.
     (4) Genes with P<0.05 in both experiment group 1 and the original cohort (50 samples): 1186 genes.
     (5) Genes with P<0.05 in both experiment group 2 and the original cohort (50 samples): 1240 genes.
     (6) Genes with P<0.05 in all three groups: 1113 genes, which differs by 268 genes from the 1381 genes of the original cohort.
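The instability shown above can be expressed as simple set arithmetic. The sketch below checks the reported difference of 268 genes and shows how the overlap would be computed from gene lists; the helper name and the toy gene names are hypothetical.

```python
from typing import Set


def shared_genes(*gene_sets: Set[str]) -> Set[str]:
    """Genes reported as significant in every one of the given sets."""
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Counts reported in the text.
FULL_COHORT_SIGNIFICANT = 1381  # P<0.05 in the original 50 samples
SHARED_BY_ALL_THREE = 1113      # P<0.05 in both 48-sample groups and the full cohort

# Genes from the full-cohort list that were lost after removing 2 samples.
unstable_genes = FULL_COHORT_SIGNIFICANT - SHARED_BY_ALL_THREE
stable_fraction = SHARED_BY_ALL_THREE / FULL_COHORT_SIGNIFICANT

# Toy example with hypothetical gene names.
toy_shared = shared_genes({"G1", "G2", "G3"}, {"G2", "G3", "G4"}, {"G3", "G5"})
```

Removing just 2 of 50 samples therefore changes roughly one in five of the genes that pass the univariate filter, which is the limitation this section describes.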
     Part 2 Early-stage NSCLC prognostic model established by gene functional cluster
     1. NSCLC samples showed marked DNA copy number aberrations
     NSCLC-related biological functions identified from CGH data analysis were: cell proliferation/differentiation, cell cycle, cell apoptosis, cell adhesion, immune response, etc.
     2. Biological functions and representative genes correlated with early-stage NSCLC prognosis, based on functional clustering
     Cell cycle correlated genes: ANLN, BUB1B and CCDC99;
     Cell proliferation correlated genes: DUSP4, STIL and MKI67;
     Cell adhesion correlated genes: HMMR and CD9;
     Cell apoptosis correlated genes: KIAA0101 and BIRC5;
     Immune response correlated genes: CD1A and C5;
     Blood coagulation correlated genes: F12 and PGDS;
     Metabolism correlated genes: LPGAT1 and PPARGC1A.
     Among them, cell cycle and cell proliferation correlated genes were the most important, yet they could not stand in for genes of the other biological functions.
     3. Validation of functional prognostic model
     (1) For training cohort:
     RS formula: RS=(MKI67×-1.227)+(ANLN×1.296)+(BUB1B×0.700)+(CCDC99×2.048)+(DUSP4×-0.853)+(STIL×-2.255)+(HMMR×-0.483)+(CD9×-2.083)+(KIAA0101×2.907)+(BIRC5×-1.371)+(CD1A×0.108)+(C5×-1.333)+(LPGAT1×1.853)+(PPARGC1A×1.765)+(F12×-0.393)+(PGDS×0.246)
     Log rank test: P<0.001; specificity: 25/28=89.3%; sensitivity: 19/22=86.4%; accuracy: (25+19)/50=88.0%.
     (2) Lee et al. dataset:
     RS formula: RS=(MKI67×-0.024)+(ANLN×0.414)+(BUB1B×0.986)+(CCDC99×0.765)+(DUSP4×-0.001)+(STIL×0.762)+(HMMR×-0.429)+(CD9×-0.261)+(KIAA0101×-0.401)+(BIRC5×-0.490)+(CD1A×0.291)+(C5×-0.316)+(LPGAT1×-0.142)+(PPARGC1A×0.796)+(F12×-0.009)+(PGDS×0.648)
     Log rank test: P<0.001; specificity: 26/35=74.3%; sensitivity: 26/35=74.3%; accuracy: (26+26)/70=74.3%.
     4. Analysis of differences in the sensitivity and specificity of the prognostic models
     When the samples were ordered by descending RS value, the misclassified samples mainly clustered in a gray area in the middle. The width of the gray area, and how tightly the errors concentrated within it, were closely related to the prognostic performance of the model. This provided insights for further refinement of the prognostic model.
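One way to locate such a gray area is to find the range of RS values spanned by the misclassified samples. The sketch below is illustrative only: it assumes that samples with RS at or above a single cut-off are called high-risk, the cut-off value and all scores are hypothetical, and the abstract does not describe how the gray area was actually delimited.

```python
from typing import List, Optional, Tuple


def gray_zone(scores: List[float], is_high_risk: List[bool],
              cutoff: float) -> Optional[Tuple[float, float]]:
    """RS range (lowest, highest) covering every misclassified sample."""
    if len(scores) != len(is_high_risk):
        raise ValueError("scores and labels must have the same length")
    # A sample is misclassified when its call (RS >= cutoff) disagrees
    # with its observed risk group.
    wrong = [s for s, high in zip(scores, is_high_risk) if (s >= cutoff) != high]
    if not wrong:
        return None
    return min(wrong), max(wrong)


# Hypothetical scores in descending order with their observed groups.
toy_scores = [3.1, 2.4, 0.4, 0.1, -0.2, -1.5, -2.8]
toy_labels = [True, True, False, True, True, False, False]
toy_zone = gray_zone(toy_scores, toy_labels, cutoff=0.0)
```

In this toy ranking the two errors sit at RS 0.4 and -0.2, so the gray area is the narrow band around the cut-off while the extremes of the ranking are classified correctly.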
     Primary conclusions
     1. The machine learning prognostic model can predict the prognosis of training cohort samples (with sensitivity and specificity above 80%, independent of the TNM staging system). However, its efficacy in the independent validation cohort is not high enough, even though that cohort was also drawn from an East Asian population.
     2. Two of the four genes in the machine learning model are cell cycle correlated genes, and cell cycle and cell proliferation genes show the highest correlation with early-stage NSCLC prognosis. Yet cell cycle and cell proliferation genes alone are not enough to determine prognosis. Five other kinds of biological function are also correlated with NSCLC prognosis: cell adhesion, cell apoptosis, immune response, metabolism and blood coagulation.
     3. Sixteen representative genes of 7 biological functions were used to establish the functional prognostic model, which showed better prediction efficacy than the machine learning model in the independent validation cohort.
     4. Analysis of why the prognostic models made wrong calls showed that the misclassified samples mainly clustered in the middle gray area. The width of the gray area, and how tightly the errors concentrated within it, were closely related to the prognostic performance of the model.
     5. Prognosis-related genes can be divided into two sets: genes with positive effects (Gene_prognosis-positive) and genes with negative effects (Gene_prognosis-negative). The final result depends on the contest between the two gene sets. If neither is clearly stronger than the other, the sample falls into the gray area. This provided insights for further refinement of the prognostic model:
     (1) Try to narrow down the gray area
     (2) Try to identify the cut-offs that separate the high-risk group and the low-risk group, respectively, from the gray area
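The two refinement directions amount to replacing a single RS cut-off with a lower and an upper cut-off. A minimal sketch under that assumption follows; the cut-off values are hypothetical and are not taken from the study.

```python
def classify_with_gray_area(score: float, low_cutoff: float,
                            high_cutoff: float) -> str:
    """Three-way call: 'high-risk', 'low-risk', or 'gray' between the cut-offs."""
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    if score >= high_cutoff:
        return "high-risk"
    if score <= low_cutoff:
        return "low-risk"
    # Neither gene set clearly dominates: no call is made.
    return "gray"


# Hypothetical cut-offs bounding the gray area.
toy_calls = [classify_with_gray_area(s, low_cutoff=-0.5, high_cutoff=0.5)
             for s in (2.4, 0.1, -1.5)]
```

Narrowing the gray area corresponds to moving the two cut-offs closer together without letting misclassified samples back into the high-risk or low-risk calls.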
References
1. Jemal A, Bray F, Center MM, et al. Global cancer statistics. CA:A Cancer Journal for Clinicians 2011:caac.20107v1.
    2. Detterbeck FC, Boffa DJ, Tanoue LT. The new lung cancer staging system. Chest 2009;136:260.
    3. Mok TS, Wu YL, Thongprasert S, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med 2009;361:947-57.
    4. Rosell R, Moran T, Queralt C, et al. Screening for epidermal growth factor receptor mutations in lung cancer. The New England journal of medicine 2009;361:958.
    5. Zhu CQ, Pintilie M, John T, et al. Understanding prognostic gene expression signatures in lung cancer. Clinical lung cancer 2009; 10:331-40.
    6. Ramaswamy S. Translating cancer genomics into clinical oncology. New England Journal of Medicine 2004;350:1814-6.
    7. Paik S, Kim C, Song Y, et al. Technology insight:application of molecular techniques to formalin-fixed paraffin-embedded tissues from breast cancer. Nature Clinical Practice Oncology 2005;2:246-54.
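    8. Lin JY, Yang XN, Yang JJ, et al. Gene chip screening of molecular targets involved simultaneously in different stages of lung adenocarcinoma carcinogenesis [in Chinese]. Cancer Research and Clinic 2005;3:145-47.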
    9. Michiels S, Koscielny S, Hill C. Interpretation of microarray data in cancer. British journal of cancer 2007;96:1155-8.
    10. Affymetrix:Statistical Algorithms Description Document[M], Santa Clara,2002.
    11. Li C, Wong WH. Model-based analysis of oligonucleotide arrays:model validation, design issues and standard error application. Genome Biol 2001;2:0032.1-.11.
    12. Cope LM, Irizarry RA, Jaffee HA, et al. A benchmark for Affymetrix GeneChip expression measures. Bioinformatics 2004;20:323.
    13. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences 2001;98:13790.
    14. Chen HY, Yu SL, Chen CH, et al. A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med 2007;356:11-20.
    15. Beer DG, Kardia SLR, Huang CC, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nature Medicine 2002;8:816-24.
    16. Lau SK, Boutros PC, Pintilie M, et al. Three-gene prognostic classifier for early-stage non-small-cell lung cancer. Journal of Clinical Oncology 2007;25:5562.
    17. Petty RD, Nicolson MC, Kerr KM, et al. Gene Expression Profiling in Non-Small Cell Lung Cancer. Clinical Cancer Research 2004; 10:3237.
    18. Blackhall FH, Pintilie M, Wigle DA, et al. Stability and heterogeneity of expression profiles in lung cancer specimens harvested following surgical resection. Neoplasia (New York, NY) 2004;6:761.
    19. Bachtiary B, Boutros PC, Pintilie M, et al. Gene expression profiling in cervical cancer:an exploration of intratumor heterogeneity. Clinical Cancer Research 2006;12:5632.
    20. Scicchitano MS, Dalmas DA, Bertiaux MA, et al. Preliminary comparison of quantity, quality, and microarray performance of RNA extracted from formalin-fixed, paraffin-embedded, and unfixed frozen tissue samples. Journal of Histochemistry & Cytochemistry 2006;54:1229.
    21. Shedden K, Taylor JMG, Enkemann SA, et al. Gene expression-based survival prediction in lung adenocarcinoma:a multi-site, blinded validation study. Nature Medicine 2008;14:822-7.
    22. Brown KR, Jurisica I. Online predicted human interaction database. Bioinformatics 2005;21:2076.
    23. Brown KR, Jurisica I. Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biology 2007;8:R95.
    24. Cabin RJ, Mitchell RJ. To Bonferroni or not to Bonferroni:when and how are the questions. Bulletin of the Ecological Society of America 2000;81:246-8.
    25. Garber ME, Troyanskaya OG, Schluens K, et al. Diversity of gene expression in adenocarcinoma of the lung. Proceedings of the National Academy of Sciences 2001;98:13784.
    26. Miura K, Bowman ED, Simon R, et al. Laser capture microdissection and microarray expression analysis of lung adenocarcinoma reveals tobacco smoking-and prognosis-related molecular profiles. Cancer Research 2002;62:3244.
    27. Wigle DA, Jurisica I, Radulovich N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Research 2002;62:3005.
    28. Gordon GJ, Jensen RV, Hsiao LL, et al. Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Research 2002;62:4963.
    29. Ramaswamy S, Ross KN, Lander ES, et al. A molecular signature of metastasis in primary solid tumors. Nature Genetics 2002;33:49-54.
    30. Bianchi F, Nuciforo P, Vecchi M, et al. Survival prediction of stage Ⅰ lung adenocarcinomas by expression of 10 genes. Journal of Clinical Investigation 2007;117:3436.
    31. Endoh H, Tomida S, Yatabe Y, et al. Prognostic model of pulmonary adenocarcinoma by expression profiling of eight genes as determined by quantitative real-time reverse transcriptase polymerase chain reaction. Journal of Clinical Oncology 2004;22:811.
    32. Liu J, Blackhall F, Seiden-Long I, et al. Modeling of lung cancer by an orthotopically growing H460SM variant cell line reveals novel candidate genes for systemic metastasis. Oncogene 2004;23:6316-24.
    33. Parmigiani G, Garrett-Mayer ES, Anbazhagan R, et al. A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clinical Cancer Research 2004; 10:2922.
    34. Tomida S, Koshikawa K, Yatabe Y, et al. Gene expression-based, individualized outcome prediction for surgically treated lung cancer patients. Oncogene 2004;23:5360-70.
    35. Lu Y, Lemon W, Liu PY, et al. A gene expression signature predicts survival of patients with stage I non-small cell lung cancer. PLoS medicine 2006;3:e467.
    36. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Research 2006;66:7466.
    37. Blackhall FH, Wigle DA, Jurisica I, et al. Validating the prognostic value of marker genes derived from a non-small cell lung cancer microarray study. Lung Cancer 2004;46:197-204.
    38. Larsen JE, Pavey SJ, Passmore LH, et al. Expression profiling defines a recurrence signature in lung squamous cell carcinoma. Carcinogenesis 2006;28:760.
    39. Larsen JE, Pavey SJ, Passmore LH, et al. Gene expression signature predicts recurrence in lung adenocarcinoma. Clinical Cancer Research 2007; 13:2946.
    40. Lau SK, Boutros PC, Pintilie M, et al. Three-gene prognostic classifier for early-stage non-small-cell lung cancer. Journal of Clinical Oncology 2007;25:5562.
    41. Sun Z, Wigle DA, Yang P. Non-Overlapping and Non-Cell-Type-Specific Gene Expression Signatures Predict Lung Cancer Survival. Journal of Clinical Oncology 2008;26:877.
    42. Skrzypski M, Jassem E, Taron M, et al. Three-gene expression signature predicts survival in early-stage squamous cell carcinoma of the lung. Clinical Cancer Research 2008;14:4794.
    43. Lee ES, Son DS, Kim SH, et al. Prediction of recurrence-free survival in postoperative non-small cell lung cancer patients by using an integrated model of clinical information and gene expression. Clinical Cancer Research 2008; 14:7397.
    44. Hou J, Aerts J, Den Hamer B, et al. Gene expression-based classification of non-small cell lung carcinomas and survival prediction. PLoS One 2010;5:e10312.
    45. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 2005;439:353-7.
    46. Zhu CQ, Ding K, Strumpf D, et al. Prognostic and predictive gene signature for adjuvant chemotherapy in resected non-small-cell lung cancer. Journal of Clinical Oncology 2010;28:4417.
    47. Massion PP, Kuo WL, Stokoe D, et al. Genomic copy number analysis of non-small cell lung cancer using array comparative genomic hybridization. Cancer Research 2002;62:3636.
    48. Weir BA, Woo MS, Getz G, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature 2007;450:893-8.
    49. Zhao W, Fang G. Anillin is a substrate of anaphase-promoting complex/cyclosome (APC/C) that controls spatial contractility of myosin during late cytokinesis. Journal of Biological Chemistry 2005;280:33516.
    50. Suzuki C, Daigo Y, Ishikawa N, et al. ANLN plays a critical role in human lung carcinogenesis through the activation of RHOA and by involvement in the phosphoinositide 3-kinase/AKT pathway. Cancer Research 2005;65:11314.
    51. Nakamura Y, Tanaka F, Haraguchi N, et al. Clinicopathological and biological significance of mitotic centromere-associated kinesin overexpression in human gastric cancer. British journal of cancer 2007;97:543-9.
    52. Kawamura R, Pope LH, Christensen MO, et al. Mitotic chromosomes are constrained by topoisomerase II-sensitive DNA entanglements. The Journal of Cell Biology 2010; 188:653.
    53. Yu H. Cdc20:a WD40 activator for a cell cycle degradation machine. Molecular cell 2007;27:3-16.
    54. Duchrow M, Schlüter C, Wohlenberg C, et al. Molecular characterization of the gene locus of the human cell proliferation-associated nuclear protein defined by monoclonal antibody Ki67. Cell proliferation 1996;29:1-12.
    55. Christov C, Trivier E, Krude T. Noncoding human Y RNAs are overexpressed in tumours and required for cell proliferation. British journal of cancer 2008;98:981-8.
    56. Shiota M, Izumi H, Tanimoto A, et al. Programmed cell death protein 4 down-regulates Y-box binding protein-1 expression via a direct interaction with Twist1 to suppress cancer cell growth. Cancer Research 2009;69:3148.
    57. Kumar A, Girimaji SC, Duvvari MR, et al. Mutations in STIL, encoding a pericentriolar and centrosomal protein, cause primary microcephaly. The American Journal of Human Genetics 2009;84:286-90.
    58. Glienke W, Maute L, Wicht J, et al. Curcumin inhibits constitutive STAT3 phosphorylation in human pancreatic cancer cell lines and downregulation of survivin/BIRC5 gene expression. Cancer Investigation 2009;28:166-71.
    59. Silke J, Vaux DL. Two kinds of BIR-containing protein-inhibitors of apoptosis, or required for mitosis. Journal of cell science 2001;114:1821.
    60. Yuan RH, Jeng YM, Pan HW, et al. Overexpression of KIAA0101 predicts high stage, early tumor recurrence, and poor prognosis of hepatocellular carcinoma. Clinical Cancer Research 2007;13:5368.
    61. LaTulippe E, Satagopan J, Smith A, et al. Comprehensive gene expression analysis of prostate cancer reveals distinct transcriptional programs associated with metastatic disease. Cancer Research 2002;62:4499.
    62. Bloomston M, Frankel WL, Petrocca F, et al. MicroRNA Expression Patterns to Differentiate Pancreatic Adenocarcinoma From Normal Pancreas and Chronic Pancreatitis. JAMA:The Journal of the American Medical Association 2007;297:1901-8.
    63. Chung CH. Gene Expression Profiles Identify Epithelial-to-Mesenchymal Transition and Activation of Nuclear Factor-κB Signaling as Characteristics of a High-risk Head and Neck Squamous Cell Carcinoma. Cancer Research 2006;66:8210-8.
    64. Brundage MD, Davies D, Mackillop WJ. Prognostic factors in non-small cell lung cancer:a decade of progress. Chest 2002;122:1037-57.
    65. Mountain CF. Revisions in the International System for Staging Lung Cancer. Chest 1997;111:1710-7.
    66. Goldstraw P, Crowley J, Chansky K, et al. The IASLC Lung Cancer Staging Project:proposals for the revision of the TNM stage groupings in the forthcoming (seventh) edition of the TNM Classification of malignant tumours. J Thorac Oncol 2007;2:706-14.
    67. Spira A, Ettinger DS. Multidisciplinary management of lung cancer. N Engl J Med 2004;350:379-92.
    68. Spiro SG, Silvestri GA. One hundred years of lung cancer. Am J Respir Crit Care Med 2005;172:523-9.
    69. Pfannschmidt J, Muley T, Bulzebruck H, et al. Prognostic assessment after surgical resection for non-small cell lung cancer:experiences in 2083 patients. Lung Cancer 2007;55:371-7.
    70. Fang D, Zhang D, Huang G, et al. Results of surgical resection of patients with primary lung cancer:a retrospective analysis of 1,905 cases. Ann Thorac Surg 2001;72:1155-9.
    71. Battafarano RJ, Piccirillo JF, Meyers BF, et al. Impact of comorbidity on survival after surgical resection in patients with stage I non-small cell lung cancer. J Thorac Cardiovasc Surg 2002; 123:280-7.
    72. Ginsberg RJ, Rubinstein LV. Randomized trial of lobectomy versus limited resection for T1 N0 non-small cell lung cancer. Lung Cancer Study Group. Ann Thorac Surg 1995;60:615-22, discussion 22-3.
    73. Arriagada R, Bergman B, Dunant A, et al. Cisplatin-based adjuvant chemotherapy in patients with completely resected non-small-cell lung cancer. N Engl J Med 2004;350:351-60.
    74. Betticher DC. Adjuvant and neoadjuvant chemotherapy in NSCLC:a paradigm shift. Lung Cancer 2005;50S2:S9-S16.
    75. Kato H, Ichinose Y, Ohta M, et al. A randomized trial of adjuvant chemotherapy with uracil-tegafur for adenocarcinoma of the lung. N Engl J Med 2004;350:1713-21.
    76. Le Chevalier T, Arriagada R, Pignon JP, et al. Should adjuvant chemotherapy become standard treatment in all patients with resected non-small-cell lung cancer? Lancet Oncol 2005;6:182-4.
    77. Tsuboi M, Ohira T, Saji H, et al. The present status of postoperative adjuvant chemotherapy for completely resected non-small cell lung cancer. Ann Thorac Cardiovasc Surg 2007; 13:73-7.
    78. Winton T, Livingston R, Johnson D, et al. Vinorelbine plus cisplatin vs. observation in resected non-small-cell lung cancer. N Engl J Med 2005;352:2589-97.
    79. El-Sherif A, Gooding WE, Santos R, et al. Outcomes of sublobar resection versus lobectomy for stage I non-small cell lung cancer:a 13-year analysis. Ann Thorac Surg 2006;82:408-15 discussion 15-6.
    80. Keenan RJ, Landreneau RJ, Maley RH, Jr, et al. Segmental resection spares pulmonary function in patients with stage I lung cancer. Ann Thorac Surg 2004;78:228-33 discussion 28-33.
    81. Koike T, Yamato Y, Yoshiya K, et al. Intentional limited pulmonary resection for peripheral T1 N0 M0 small-sized lung cancer. J Thorac Cardiovasc Surg 2003;125:924-8.
    82. Nakamura H, Kawasaki N, Taguchi M, et al. Survival following lobectomy vs limited resection for stage I lung cancer:a meta-analysis. Br J Cancer 2005;92:1033-7.
    83. Okada M, Koike T, Higashiyama M, et al. Radical sublobar resection for small-sized non-small cell lung cancer:a multicenter study. J Thorac Cardiovasc Surg 2006;132:769-75.
    84. D'amico TA, Massey M, Herndon JE II, et al. A biologic risk model for stage Ⅰ lung cancer:immunohistochemical analysis of 408 patients with the use of ten molecular markers. J Thorac Cardiovasc Surg 1999;117:736-43.
    85. Harpole DH, Jr, Herndon JE Ⅱ, Wolfe WG, et al. A prognostic model of recurrence and death in stage I non-small cell lung cancer utilizing presentation, histopathology, and oncoprotein expression. Cancer Res 1995;55:51-6.
    86. Chen G, Gharib TG, Wang H, et al. Protein profiles associated with survival in lung adenocarcinoma. Proc Natl Acad Sci U S A 2003;100:13537-42.
    87. D'Amico TA. Molecular biologic staging of lung cancer. Ann Thorac Surg 2008;85:S737-42.
    88. Coussens LM, Werb Z. Inflammation and cancer. Nature 2002;420:860-7.
    89. Mantovani A, Allavena P, Sica A, et al. Cancer-related inflammation. Nature 2008;454:436-44.
    90. Jones PA, Baylin SB. The epigenomics of cancer. Cell 2007;128:683-92.
    91. Chin L, Gray JW. Translating insights from the cancer genome into clinical practice. Nature 2008;452:553-63.
    92. Schena M, Shalon D, Davis RW, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995;270:467-70.
    93. Lashkari DA, DeRisi JL, McCusker JH, et al. Yeast microarrays for genome wide parallel genetic and gene expression analysis. Proc Natl Acad Sci U S A 1997;94:13057-62.
    94. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 2001;98:13790-5.
    95. Garber ME, Troyanskaya OG, Schluens K, et al. Diversity of gene expression in adenocarcinoma of the lung. Proc Natl Acad Sci U S A 2001;98:13784-9.
    96. Giordano TJ, Shedden KA, Schwartz DR, et al. Organ-specific molecular classification of primary lung, colon, and ovarian adenocarcinomas using gene expression profiles. Am J Pathol 2001;159:1231-8.
    97. Yamagata N, Shyr Y, Yanagisawa K, et al. A training-testing approach to the molecular classification of resected non-small cell lung cancer. Clin Cancer Res 2003;9:4695-704.
    98. Gordon GJ, Jensen RV, Hsiao LL, et al. Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Res 2002;62:4963-7.
    99. McDoniels-Silvers AL, Stoner GD, Lubet RA, et al. Differential expression of critical cellular genes in human lung adenocarcinomas and squamous cell carcinomas in comparison to normal lung tissues. Neoplasia 2002;4:141-50.
    100. Beer DG, Kardia SL, Huang CC, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 2002;8:816-24.
    101. Miura K, Bowman ED, Simon R, et al. Laser capture microdissection and microarray expression analysis of lung adenocarcinoma reveals tobacco smoking- and prognosis-related molecular profiles. Cancer Res 2002;62:3244-50.
    102. Potti A, Mukherjee S, Petersen R, et al. A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N Engl J Med 2006;355:570-80.
    103. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res 2006;66:7466-72.
    104. Wigle DA, Jurisica I, Radulovich N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Res 2002;62:3005-8.
    105. Borczuk AC, Shah L, Pearson GD, et al. Molecular signatures in biopsy specimens of lung cancer. Am J Respir Crit Care Med 2004;170:167-74.
    106. Gordon GJ, Richards WG, Sugarbaker DJ, et al. A prognostic test for adenocarcinoma of the lung from gene expression profiling data. Cancer Epidemiol Biomarkers Prev 2003;12:905-10.
    107. Inamura K, Fujiwara T, Hoshida Y, et al. Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization. Oncogene 2005;24:7105-13.
    108. Lu Y, Lemon W, Liu PY, et al. A gene expression signature predicts survival of patients with stage I non-small cell lung cancer. PLoS Med 2006;3:e467.
    109. Ramaswamy S, Ross KN, Lander ES, et al. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003;33:49-54.
    110. Tomida S, Koshikawa K, Yatabe Y, et al. Gene expression-based, individualized outcome prediction for surgically treated lung cancer patients. Oncogene 2004;23:5360-70.
    111. Yanaihara N, Caplen N, Bowman E, et al. Unique microRNA molecular profiles in lung cancer diagnosis and prognosis. Cancer Cell 2006;9:189-98.
    112. Glinsky GV, Berezovska O, Glinskii AB. Microarray analysis identifies a death-from-cancer signature predicting therapy failure in patients with multiple types of cancer. J Clin Invest 2005;115:1503-21.
    113. Guo L, Ma Y, Ward R, et al. Constructing molecular classifiers for the accurate prognosis of lung adenocarcinoma. Clin Cancer Res 2006;12:3344-54.
    114. Hayes DN, Monti S, Parmigiani G, et al. Gene expression profiling reveals reproducible human lung adenocarcinoma subtypes in multiple independent patient cohorts. J Clin Oncol 2006;24:5079-90.
    115. Kakiuchi S, Daigo Y, Ishikawa N, et al. Prediction of sensitivity of advanced non-small cell lung cancers to gefitinib (Iressa, ZD1839). Hum Mol Genet 2004;13:3029-43.
    116. Harris TJR, McCormick F. The molecular pathology of cancer. Nat Rev Clin Oncol 2010;7:251-65.
