An implication network-based methodology was used to identify biomarkers by modeling crosstalk with major lung cancer signaling pathways. Specifically, the methodology comprises the following steps: (1) identifying genes significantly associated with lung cancer survival; (2) from the survival genes identified in Step 1, selecting candidate genes that are differentially expressed in smokers versus non-smokers; (3) from these candidate genes, constructing gene coexpression networks based on prediction logic for the smoker group and the non-smoker group, respectively; (4) identifying smoking-mediated differential components, i.e., the unique gene coexpression patterns specific to each group; and (5) from the differential components, identifying genes directly co-expressed with major lung cancer signaling hallmarks.
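Steps 4 and 5 can be illustrated with a minimal sketch. It assumes each group's coexpression network has already been reduced to a set of undirected gene pairs; the prediction-logic construction in Step 3 is not reproduced here. The gene labels (`G1`, `H1`, ...) and the helper names `edge`, `differential_edges`, and `hallmark_neighbors` are hypothetical placeholders, not data or code from the study.

```python
def edge(a, b):
    """Represent an undirected coexpression relation as an unordered pair."""
    return frozenset((a, b))


def differential_edges(smoker_edges, nonsmoker_edges):
    """Step 4 (sketch): keep the relations that appear in only one group."""
    smoker_only = smoker_edges - nonsmoker_edges
    nonsmoker_only = nonsmoker_edges - smoker_edges
    return smoker_only, nonsmoker_only


def hallmark_neighbors(edges, hallmarks):
    """Step 5 (sketch): genes sharing a differential edge with a hallmark gene."""
    neighbors = set()
    for pair in edges:
        a, b = tuple(pair)
        if a in hallmarks and b not in hallmarks:
            neighbors.add(b)
        elif b in hallmarks and a not in hallmarks:
            neighbors.add(a)
    return neighbors


# Toy networks with placeholder labels; H1 and H2 stand in for signaling hallmarks.
smoker = {edge("G1", "H1"), edge("G2", "G3"), edge("G4", "H2")}
nonsmoker = {edge("G2", "G3"), edge("G5", "H1")}
hallmarks = {"H1", "H2"}

smoker_only, nonsmoker_only = differential_edges(smoker, nonsmoker)
candidates = hallmark_neighbors(smoker_only | nonsmoker_only, hallmarks)
print(sorted(candidates))
```

In this toy case the pair shared by both groups (`G2`, `G3`) drops out as non-differential, and only genes linked to a hallmark through a group-specific relation remain as candidates.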
A smoking-associated 6-gene signature was identified for prognosis of lung cancer from a training cohort (n = 256). The 6-gene signature could separate lung cancer patients into two risk groups with distinct post-operative survival (log-rank P < 0.04, Kaplan-Meier analyses) in three independent cohorts (n = 427). The expression-defined prognostic prediction is strongly related to smoking association and smoking cessation (P < 0.02; Pearson's chi-squared tests). The 6-gene signature is an accurate prognostic factor (hazard ratio = 1.89, 95% CI: [1.04, 3.43]) compared to common clinical covariates in multivariate Cox analysis. The 6-gene signature also provides an accurate diagnosis of lung cancer, with an overall accuracy of 73% in a cohort of smokers (n = 164). The coexpression patterns derived from the implication networks were validated against interactions reported in the literature, retrieved with STRING8, Ingenuity Pathway Analysis, and Pathway Studio.
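As a reading aid for the reported hazard ratio, the following sketch back-calculates an approximate standard error and z-statistic from the published interval. It assumes the 95% CI is a Wald interval that is symmetric on the log scale, which is the usual convention for Cox models but is not stated in the text; the variable names are illustrative only.

```python
import math

# Values as reported: hazard ratio and 95% confidence limits.
hr, ci_low, ci_high = 1.89, 1.04, 3.43

# Under a log-scale Wald interval, the limits are exp(log(hr) +/- 1.96 * se).
log_hr = math.log(hr)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Wald z-statistic and two-sided p-value from the standard normal distribution.
z = log_hr / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# Consistency check: the geometric mean of the limits should recover the HR.
geometric_mid = math.sqrt(ci_low * ci_high)

print(f"se = {se:.3f}, z = {z:.2f}, p = {p_two_sided:.3f}")
print(f"geometric midpoint of CI = {geometric_mid:.2f}")
```

Under this assumption the geometric midpoint of the interval agrees with the reported hazard ratio to two decimal places, and the lower limit lying above 1 corresponds to a z-statistic of roughly 2.1.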
The pathway-based approach identified a smoking-associated 6-gene signature that predicts lung cancer risk and survival. This gene signature has potential clinical implications in the diagnosis and prognosis of lung cancer in smokers.