PyMS: a Python toolkit for processing of gas chromatography-mass spectrometry (GC-MS) data. Application and comparative study of selected tools
  • Authors: Sean O’Callaghan (1) (2)
    David P De Souza (1) (2)
    Andrew Isaac (3)
    Qiao Wang (4)
    Luke Hodkinson (5)
    Moshe Olshansky (6)
    Tim Erwin (7)
    Bill Appelbe (8)
    Dedreia L Tull (1) (2)
    Ute Roessner (2) (7)
    Antony Bacic (1) (2) (9)
    Malcolm J McConville (1) (10) (2)
    Vladimir A Likić (1) (2)
  • Journal: BMC Bioinformatics
  • Publication year: 2012
  • Publication date: December 2012
  • Volume: 13
  • Issue: 1
  • Full-text size: 708 KB
  • Affiliations:

    1. Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
    2. Metabolomics Australia, Bio21 Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
    3. Victorian Life Science Computational Initiative, The University of Melbourne, Parkville, Victoria, 3010, Australia
    4. National ICT Australia (NICTA), The University of Melbourne, Parkville, Victoria, 3010, Australia
    5. Centre for Astrophysics & Supercomputing, Swinburne University of Technology, Hawthorn, Victoria, 3122, Australia
    6. Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria, 3052, Australia
    7. Australian Centre for Plant and Functional Genomics, School of Botany, The University of Melbourne, Parkville, Victoria, 3010, Australia
    8. Victorian Partnership for Advanced Computing, 110 Victoria Street, Carlton South, Victoria, 3053, Australia
    9. ARC Centre of Excellence in Plant Cell Walls, School of Botany, The University of Melbourne, Parkville, Victoria, 3010, Australia
    10. Department of Biochemistry and Molecular Biology, The University of Melbourne, Parkville, Victoria, 3010, Australia
  • ISSN: 1471-2105
Abstract
Background: Gas chromatography–mass spectrometry (GC-MS) is a technique frequently used in targeted and non-targeted measurements of metabolites. Most existing software tools for processing raw instrument GC-MS data tightly integrate the data processing methods with a graphical user interface, which facilitates interactive data processing. While interactive processing remains critically important in GC-MS applications, high-throughput studies increasingly require command-line tools suitable for scripting customized, high-throughput processing pipelines.

Results: PyMS comprises a library of functions, developed in Python, for processing instrument GC-MS data. PyMS currently provides a complete set of GC-MS processing functions, including reading of standard data formats (ANDI-MS/NetCDF and JCAMP-DX), noise smoothing, baseline correction, peak detection, peak deconvolution, peak integration, and peak alignment by dynamic programming. A novel common ion single quantitation algorithm allows automated, accurate quantitation of GC-MS electron impact (EI) fragmentation spectra when a large number of experiments are being analyzed. PyMS implements parallel processing for by-row and by-column data processing tasks based on the Message Passing Interface (MPI), allowing processing to scale across multiple CPUs in distributed computing environments. A set of specifically designed experiments was performed in-house and used to compare the performance of PyMS with that of three widely used software packages for GC-MS data processing (AMDIS, AnalyzerPro, and XCMS).

Conclusions: PyMS is a novel software package for processing raw GC-MS data, particularly suitable for scripting customized processing pipelines and for data processing in batch mode. PyMS provides limited graphical capabilities and can be used both for routine data processing and for interactive/exploratory data analysis. In real-life GC-MS data processing scenarios PyMS performs as well as or better than leading software packages. We demonstrate data processing scenarios that are simple to implement in PyMS, yet difficult to achieve with many conventional GC-MS data processing packages. Automated sample processing and quantitation with PyMS can provide substantial time savings compared to more traditional interactive software systems that tightly integrate data processing with the graphical user interface.
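The first steps of the processing chain named in the abstract (noise smoothing, baseline correction, peak detection) can be illustrated with a minimal, scriptable sketch. It does not use the PyMS API and is not claimed to reproduce the PyMS algorithms: every function name here is hypothetical, a moving average stands in for the smoothing filter, a top-hat (morphological opening) stands in for the baseline estimate, and the chromatogram is synthetic.

```python
import math


def moving_average(signal, window=5):
    """Smooth with a centred moving average; edges use a truncated window."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def _sliding(signal, half, func):
    """Apply func (min or max) over a centred window of half-width `half`."""
    n = len(signal)
    return [func(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def tophat_baseline_correct(signal, half_width=40):
    """Subtract a morphological opening (erosion, then dilation) from the signal."""
    opened = _sliding(_sliding(signal, half_width, min), half_width, max)
    return [s - b for s, b in zip(signal, opened)]


def detect_peaks(signal, threshold):
    """Return indices of local maxima whose intensity exceeds `threshold`."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def synthetic_chromatogram(n=300):
    """Assumed test data: Gaussian peaks at scans 100 and 200 on a drifting baseline."""
    def gauss(x, centre, height, width):
        return height * math.exp(-((x - centre) ** 2) / (2 * width ** 2))
    return [50 + 0.1 * x + gauss(x, 100, 500, 5) + gauss(x, 200, 300, 5)
            for x in range(n)]


# A three-step batch pipeline over one ion chromatogram.
raw = synthetic_chromatogram()
smoothed = moving_average(raw, window=5)
corrected = tophat_baseline_correct(smoothed, half_width=40)
peaks = detect_peaks(corrected, threshold=50)
print(peaks)
```

On this noise-free synthetic trace the script reports the two peak apexes at scans 100 and 200. Because each step is a plain function, the same three calls can be looped over many files without a graphical interface, which is the batch-mode usage the abstract describes.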
