FBMN with MZmine

Introduction

The main documentation for Feature-Based Molecular Networking (FBMN) can be accessed here. See our preprint on bioRxiv.

Below we describe how to use MZmine2 v2.51 with the FBMN workflow on GNPS. This documentation was originally written for v2.33, and differences between the two software versions are noted where they apply.

Mass spectrometry processing with MZmine

Citations and development

This work builds on the efforts of many colleagues. Please cite their work:

Nothias, L.F. et al. Feature-based Molecular Networking in the GNPS Analysis Environment. bioRxiv 812404 (2019).

Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837 (2016).

Katajamaa, M., Miettinen, J. & Oresic, M. MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22, 634–636 (2006).

Pluskal, T., Castillo, S., Villar-Briones, A. & Oresic, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11, 395 (2010).

The development of the features used in the pipeline is publicly accessible here.

Installation

Download the latest version of MZmine (v2.33 at minimum) at https://github.com/mzmine/mzmine2/releases.

Data Processing with MZmine for FBMN

In MZmine, a sequence of steps is performed to process the mass spectrometry data. Here we present the key steps required to process LC-MS/MS data acquired in non-targeted mode (data-dependent acquisition). For convenience, we also provide a batch file (XML format) that can be imported directly into MZmine.

IMPORTANT: MZmine parameters will vary depending on the instrument used, the acquisition parameters, and the samples studied. The following documentation serves as a basic guideline for using MZmine with the FBMN workflow.

Please consult the resources below for more details on MZmine processing:

  • The video tutorial below: Quick MZmine2 Export to GNPS for FBMN

Convert your LC-MS/MS Data to an Open Format

MZmine accepts different input formats. We recommend converting your files to mzML format before MZmine2 processing. See the documentation here.

Processing Steps

Below is a schematic representation of the LC-MS/MS data processing steps with MZmine (thanks to Daniel Petras!):

complete workflow view

Below is the overview of the LC-MS/MS data processing steps in the MZmine batch mode:

img

Batch Import

Here are some MZmine batch files that are compatible with the FBMN workflow. These batch files can be imported into MZmine (Batch mode):

Instrument | Gradient Length | Matrix Type | Sample Size | Download
Bruker Maxis HD qTof | 10 Min | Stool | 20 | Batch
Please contribute and send batch files for various instruments | | | | Batch

Processing Steps

Below is a walk-through of all the steps.

1. Import Files

Go to Menu: Raw data methods > Raw data import > Select the files

img

2. Mass Detection

This step creates mass lists from your raw LC-MS/MS data (non-targeted mode).

Perform mass detection on MS level 1: Menu: Raw data methods > Feature Detection > Mass detection > Set filters: MS level 1.

(version 2.33) Menu: Raw data methods > Mass detection > Set filter : MS level 1

IMPORTANT: Set an appropriate intensity threshold. You can use the preview window to assess the right threshold on your data. As a rule of thumb, the value should at least correspond to the minimum value set for triggering the MS2 scan event. (Example: maXis QTOF: 1E3; Q-Exactive: 1E4)

Perform mass detection on MS level 2. The same mass list name must be used.

Go to Menu: Raw data methods > Feature Detection > Mass detection > Set filters: MS level 2.

(version 2.33) Go to: Raw data methods > Mass detection > Set filter : MS level 2.

IMPORTANT: Make sure to set an intensity threshold representative of the noise level in the MS2 spectra. This is typically lower than for MS1. (Example: maXis QTOF: 1E2; LTQ-XL Orbitrap: 1E4; Q-Exactive: 0). If in doubt, set it to 0.
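As a rough illustration of what the intensity threshold does at both MS levels, the sketch below keeps only the data points of a centroided scan at or above a noise level. The function name `detect_masses` and the example scans are hypothetical, and MZmine's mass detectors are more elaborate than this simple cut-off.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def detect_masses(scan: List[Peak], noise_level: float) -> List[Peak]:
    """Keep the data points whose intensity is at or above noise_level."""
    return [(mz, intensity) for mz, intensity in scan if intensity >= noise_level]


# Hypothetical centroided MS1 scan: (m/z, intensity) pairs.
ms1_scan = [(150.05, 250.0), (301.14, 5200.0), (302.14, 900.0), (455.29, 12000.0)]

# A 1E3 threshold discards the two low-intensity data points.
ms1_mass_list = detect_masses(ms1_scan, noise_level=1e3)

# A threshold of 0 keeps every data point, as suggested for MS2 when in doubt.
ms2_scan = [(85.03, 40.0), (127.04, 310.0)]
ms2_mass_list = detect_masses(ms2_scan, noise_level=0)
```

A threshold set too high on MS2 removes fragment ions that would otherwise contribute to spectral matching, which is why a low value (or 0) is the safer choice there.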

img

3. Build Chromatogram (LC-MS feature detection part 1)

Starting with MZmine 2.39, the original Chromatogram builder is considered deprecated. It has been replaced with the ADAP Chromatogram Builder. The ADAP Module includes parameters for the Min group size # of scans and Group intensity threshold. Further explanation for these parameters can be found in the ADAP Tutorial. If you use the ADAP Chromatogram Builder, please cite the publication below.

Myers, O.D. et al. One Step Forward for Reducing False Positive and False Negative Compound Identifications from Mass Spectrometry Metabolomics Data: New Algorithms for Constructing Extracted Ion Chromatograms and Detecting Chromatographic Peaks. Anal. Chem. 89(17), 8696–8703 (2017).

Go to: Raw data methods > Feature Detection > Chromatogram builder OR ADAP Chromatogram builder

(version 2.33) Go to: Raw data methods > Chromatogram builder

4. Deconvolve the Chromatogram (LC-MS feature detection part 2)

Go to Menu: Feature list methods > Feature detection > Chromatogram deconvolution

(version 2.33) Go to Menu: Peak list methods > Peak detection > Chromatogram deconvolution

IMPORTANT: tick both options "m/z range for MS2 scan pairing (Da)" and "RT range for MS2 scan pairing (min)". The values have to be defined according to your experimental setup (expected MS mass accuracy and chromatographic peak width).

Example for a UHPLC column (1.7 µm C18, 50 × 2.1 mm, flow rate of 0.5 mL/min):

  • maXis-QTOF: 12 min gradient, 0.02 Da and 0.15 min

  • Q-Exactive: 5 min gradient, 0.01 Da and 0.1 min
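To make the effect of the two pairing ranges concrete, here is a minimal sketch of how MS2 scans could be assigned to a feature. It assumes that each range is the total width of a window centred on the feature m/z and apex retention time; that interpretation, along with the `Feature`, `MS2Scan` and `pair_ms2_scans` names, is an assumption for illustration and not taken from the MZmine source code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    mz: float  # feature m/z
    rt: float  # retention time at the peak apex, in minutes


@dataclass
class MS2Scan:
    scan_number: int
    precursor_mz: float
    rt: float  # minutes


def pair_ms2_scans(feature: Feature, scans: List[MS2Scan],
                   mz_range: float, rt_range: float) -> List[MS2Scan]:
    """Return the MS2 scans inside the m/z and RT windows around the feature.

    Assumption: each range is the total window width, centred on the feature.
    """
    half_mz = mz_range / 2
    half_rt = rt_range / 2
    return [
        s for s in scans
        if abs(s.precursor_mz - feature.mz) <= half_mz
        and abs(s.rt - feature.rt) <= half_rt
    ]


# Hypothetical feature and MS2 scans, using the maXis example values above.
feature = Feature(mz=301.1410, rt=3.20)
scans = [
    MS2Scan(101, 301.1450, 3.22),  # inside both windows
    MS2Scan(102, 301.1600, 3.21),  # precursor m/z too far away
    MS2Scan(103, 301.1405, 3.60),  # retention time too far away
]
paired = pair_ms2_scans(feature, scans, mz_range=0.02, rt_range=0.15)
```

Windows that are too wide attach MS2 spectra from neighbouring ions to the feature, while windows that are too narrow leave features without any MS2 spectrum.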

img

5. Group isotopes and co-eluting ions

Use the "Isotopic peaks grouper" [recommended] or another alternative (such as the CAMERA module).

Go to Menu: Feature list methods > Isotopes > Isotopic peaks grouper

(version 2.33) Go to Menu: Peak list methods > Isotopes > Isotopic peaks grouper.

IMPORTANT: The retention time and m/z tolerances depend on your expected peak shapes, duty cycle time, and MS mass accuracy. (Example: maXis QTOF, 10 min gradient: 0.1 min, 0.02 m/z; Q-Exactive, 5 min gradient: 0.05 min, 0.01 m/z)
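The role of the two tolerances can be illustrated with a simplified isotope grouping sketch. It chains peaks that sit about 1.00335 Da apart (divided by the charge) and co-elute within the retention time tolerance. The `group_isotopes` function, the peak dictionaries and the spacing constant are assumptions for illustration; the MZmine module offers further options that are not modelled here.

```python
from typing import Dict, List

ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference, in Da


def group_isotopes(peaks: List[Dict[str, float]], mz_tol: float,
                   rt_tol: float, charge: int = 1) -> List[List[Dict[str, float]]]:
    """Group peaks into isotope patterns; each group starts with its lowest m/z peak."""
    remaining = sorted(peaks, key=lambda p: p["mz"])
    groups = []
    while remaining:
        seed = remaining.pop(0)
        group = [seed]
        n = 1
        while True:
            expected_mz = seed["mz"] + n * ISOTOPE_SPACING / charge
            match = next(
                (p for p in remaining
                 if abs(p["mz"] - expected_mz) <= mz_tol
                 and abs(p["rt"] - seed["rt"]) <= rt_tol),
                None,
            )
            if match is None:
                break
            remaining.remove(match)
            group.append(match)
            n += 1
        groups.append(group)
    return groups


# Hypothetical peaks: three isotopes of one ion, one unrelated peak with a
# similar m/z but a different retention time, and one unrelated ion.
peaks = [
    {"mz": 301.1410, "rt": 3.20},
    {"mz": 302.1444, "rt": 3.21},
    {"mz": 303.1477, "rt": 3.20},
    {"mz": 302.1440, "rt": 5.00},
    {"mz": 450.2000, "rt": 3.20},
]
groups = group_isotopes(peaks, mz_tol=0.01, rt_tol=0.05)
```

Here the peak at 5.00 min is not merged into the pattern even though its m/z fits, which is the behaviour the retention time tolerance is meant to guarantee.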

6. Order the peaklists

Go to Menu: Peak list methods > Order peak lists.

IMPORTANT: This step ensures the reproducibility of the processing. The aligned peak list will change slightly if it is not performed.

7. LC-MS feature alignment (Peaklist alignment)

In this step, the peak lists from each sample are aligned into one aligned peak list. The alignment is performed iteratively, starting from the first peak list selected (see MZmine documentation). For that reason, make sure the first sample is suitable (not a negative control), or manually put a representative peaklist in the first position.

Go to Menu: Feature list methods > Alignment > Join aligner

(version 2.33) Go to Menu: Peak list methods > Alignment > Join aligner
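To show why the order of the peak lists matters in this step, the sketch below implements a deliberately simplified greedy alignment. It is not MZmine's Join aligner: the scoring, the `join_align` name and the data layout are assumptions chosen for brevity. Each row keeps the m/z and retention time of the peak that created it, so the first peak list defines the reference coordinates.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float, float]  # (m/z, retention time in min, peak area)


def join_align(peak_lists: Dict[str, List[Peak]],
               mz_tol: float, rt_tol: float) -> List[dict]:
    """Align peak lists sample by sample, in the order given by the dict."""
    rows: List[dict] = []
    for sample, peaks in peak_lists.items():
        used = set()  # rows already assigned a peak from this sample
        for mz, rt, area in peaks:
            best, best_score = None, None
            for i, row in enumerate(rows):
                if i in used:
                    continue
                dmz = abs(row["mz"] - mz)
                drt = abs(row["rt"] - rt)
                if dmz <= mz_tol and drt <= rt_tol:
                    score = dmz / mz_tol + drt / rt_tol  # lower is better
                    if best_score is None or score < best_score:
                        best, best_score = i, score
            if best is None:
                # No compatible row: this peak starts a new row.
                rows.append({"mz": mz, "rt": rt, "areas": {sample: area}})
                used.add(len(rows) - 1)
            else:
                rows[best]["areas"][sample] = area
                used.add(best)
    return rows


# Hypothetical peak lists; sampleA is processed first and sets the row coordinates.
peak_lists = {
    "sampleA": [(301.141, 3.20, 1000.0), (450.200, 5.10, 500.0)],
    "sampleB": [(301.143, 3.25, 800.0), (520.300, 6.00, 300.0)],
}
rows = join_align(peak_lists, mz_tol=0.01, rt_tol=0.1)
```

If a blank or negative control came first, its few peaks would define the initial rows, which is the situation the recommendation above is meant to avoid.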

8. (Optional) Detect Missing Peaks / Gap Filling

Gap filling retrieves the intensity of a peak in all the samples, even if it was not detected in a previous processing step. Go to Menu: Feature list methods > Gap filling > Peak finder (multi-threaded).

(version 2.33) Go to Menu: Peak list methods > Gap filling > Peak finder (multi-threaded).

IMPORTANT: This step is optional. Use the multi-threaded peak finder for fast processing.

9. (Optional) Filter the Peaklist to MS/MS Peaks

Depending on the number of features in the aligned peaklist, it is possible to filter the peaklist to keep only features with a minimum number of occurrences ("Minimum peaks in a row") or a minimum number of isotopic peaks for the feature ("Minimum peaks in an isotope pattern"), or to "Keep only peaks with MS2 scan (GNPS)".

Go to Menu: Feature list methods > Filtering > Feature list rows filter > Select the filters

(version 2.33) Go to Menu: Peak list methods > Filtering > Peak list row filter > Select the filters

IMPORTANT: If you use a filter, we recommend also using the filter "Reset the peak number ID".

IMPORTANT: This step was mandatory in the prototype versions of FBMN with MZmine. The filter "Keep only peaks with MS2 scan (GNPS)" is now optional.
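The row filters described in this step can be sketched as follows. The `filter_rows` function and the row layout (an `id`, a dictionary of per-sample areas and a `has_ms2` flag) are hypothetical stand-ins for the aligned peak list; the sketch only mirrors the "Minimum peaks in a row", "Keep only peaks with MS2 scan (GNPS)" and "Reset the peak number ID" options.

```python
from typing import List


def filter_rows(rows: List[dict], min_peaks_in_row: int = 1,
                ms2_only: bool = False, reset_ids: bool = True) -> List[dict]:
    """Filter aligned rows and optionally renumber the remaining row IDs."""
    kept = []
    for row in rows:
        # A peak counts as present in a sample when its area is above zero.
        n_detected = sum(1 for area in row["areas"].values() if area > 0)
        if n_detected < min_peaks_in_row:
            continue
        if ms2_only and not row["has_ms2"]:
            continue
        kept.append(dict(row))  # shallow copy so the input rows keep their IDs
    if reset_ids:
        for new_id, row in enumerate(kept, start=1):
            row["id"] = new_id
    return kept


# Hypothetical aligned peak list with two samples.
rows = [
    {"id": 1, "areas": {"A": 1000.0, "B": 800.0}, "has_ms2": True},
    {"id": 2, "areas": {"A": 500.0, "B": 0.0}, "has_ms2": True},
    {"id": 3, "areas": {"A": 200.0, "B": 300.0}, "has_ms2": False},
    {"id": 4, "areas": {"A": 50.0, "B": 70.0}, "has_ms2": True},
]
filtered = filter_rows(rows, min_peaks_in_row=2, ms2_only=True)
```

Renumbering after filtering gives the remaining rows consecutive IDs, so that the quantification table and the MS/MS summary exported later refer to the same set of features.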

10. Use the GNPS Export module

Use the dedicated module "Submit to/Export for GNPS" in MZmine under Feature list methods > Export/Import to export the needed files:

  • the feature quantification table (.CSV file format) with LC-MS feature intensities.
  • the MS/MS spectral summary (.MGF file), with a representative MS/MS spectrum per LC-MS feature. The MS/MS spectrum corresponds either to the most intense MS/MS spectrum found for the feature or to the merged spectrum (new feature!).
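The "most intense MS/MS" choice mentioned above can be illustrated with a short sketch. It assumes that "most intense" means the highest summed fragment intensity; the export module may use a different criterion, and the `pick_representative` function is hypothetical.

```python
from typing import List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs


def total_intensity(spectrum: Spectrum) -> float:
    """Sum of all fragment intensities in one MS/MS spectrum."""
    return sum(intensity for _, intensity in spectrum)


def pick_representative(spectra: List[Spectrum]) -> Spectrum:
    """Return the spectrum with the highest summed intensity.

    Raises ValueError if the feature has no MS/MS spectrum at all.
    """
    return max(spectra, key=total_intensity)


# Hypothetical MS/MS spectra paired with a single LC-MS feature.
spectra = [
    [(85.03, 100.0), (127.04, 300.0)],
    [(85.03, 900.0), (127.04, 2500.0)],
    [(85.03, 50.0)],
]
representative = pick_representative(spectra)
```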

Select the latest "filtered aligned peaklist" generated and go to Menu: Peak list methods > Export > Export for/Submit to GNPS

img

See an example of files generated by the export module using the workflow: here.

The files can be uploaded to the GNPS web platform, and a Feature-Based Molecular Networking job can be launched directly.

IMPORTANT: Submitting the files directly to GNPS and launching an FBMN job on the fly is convenient for quick data analysis, but the files will not be saved in your personal account on GNPS and are periodically deleted, which will prevent future cloning of the jobs. If you do not provide a username and password, you are also limited to basic parameter presets. For that reason, we recommend uploading your files with the FTP uploader (see documentation) and preparing your job directly on GNPS (you must be logged in first).

img

In the "Export for/Submit to GNPS" module, select the option: "Submit to GNPS"

  • [Optional] Metadata file: specify the path to the metadata table in GNPS format. See documentation here

  • Select the parameters presets for the GNPS job.

  • [Optional] Email: specify the email to forward the job link

  • [Optional] Annotation edges (experimental feature that will be described later).

  • [Optional] Open website: if ticked, will open the job webpage.

ADDITIONAL NOTES: The feature table must contain at least the row ID, the row m/z, and row retention time, along with the sample columns. It is currently mandatory for the sample name headers to have the following format: "filename Peak area". Depending on the steps used in MZmine the sample name header can be "filename baseline-corrected Peak area", but this has to be changed back to "filename Peak area".
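The header requirements in the note above can be checked before upload with a few lines of Python. This is a minimal sketch: the exact column strings "row ID", "row m/z" and "row retention time" are assumed from the wording of the note, and the helper names are hypothetical.

```python
from typing import List

REQUIRED = ["row ID", "row m/z", "row retention time"]  # assumed header strings
SUFFIX = " Peak area"
EXTRA_LABEL = " baseline-corrected"


def fix_header(name: str) -> str:
    """Turn 'filename baseline-corrected Peak area' into 'filename Peak area'."""
    if not name.endswith(SUFFIX):
        return name
    stem = name[: -len(SUFFIX)]
    if stem.endswith(EXTRA_LABEL):
        stem = stem[: -len(EXTRA_LABEL)]
    return stem + SUFFIX


def check_and_fix_headers(headers: List[str]) -> List[str]:
    """Return corrected headers, or raise ValueError if required columns are missing."""
    fixed = [fix_header(h) for h in headers]
    missing = [column for column in REQUIRED if column not in fixed]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    if not any(h.endswith(SUFFIX) for h in fixed):
        raise ValueError("no sample column ending in ' Peak area' found")
    return fixed


# Hypothetical header row of an exported feature quantification table.
headers = [
    "row ID", "row m/z", "row retention time",
    "sample1.mzML baseline-corrected Peak area",
    "sample2.mzML Peak area",
]
fixed_headers = check_and_fix_headers(headers)
```

The corrected header row can then be written back as the first line of the CSV file, for example with the standard `csv` module.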

Video Tutorial - Quick MZmine Export to GNPS for FBMN

The workflow for Feature Based Molecular Networking in GNPS is different from the classic molecular networking workflow. Access the FBMN workflow here (you need to be logged in first!).

FBMN in GNPS

The main documentation of the Feature Based Molecular Networking workflow on GNPS can be consulted on that page.

Basically, you will need to upload the files produced by MZmine (test files are accessible here):

  • The feature quantification table (.CSV file format).
  • The MS/MS spectral summary (.MGF file format).
  • [Optional] The metadata table - described here

There are several additional normalization options specific to feature-based data. Feature intensities can be normalized per LC-MS run and aggregated by group using either a sum or an average (recommended).
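As an illustration of these two options, the sketch below normalizes each run to its total feature intensity and then aggregates the runs of each group. Dividing by the run total is an assumption made for this example, since the exact normalization applied by GNPS is not specified here, and the function names and data layout are hypothetical.

```python
from typing import Dict

Table = Dict[str, Dict[int, float]]  # sample name -> {feature ID: intensity}


def normalize_per_run(table: Table) -> Table:
    """Divide each intensity by the summed intensity of its LC-MS run."""
    normalized = {}
    for sample, intensities in table.items():
        total = sum(intensities.values())
        normalized[sample] = {
            fid: (value / total if total else 0.0)
            for fid, value in intensities.items()
        }
    return normalized


def aggregate_by_group(table: Table, groups: Dict[str, str],
                       how: str = "mean") -> Table:
    """Combine the runs of each group with a mean (default) or a sum."""
    if how not in ("mean", "sum"):
        raise ValueError("how must be 'mean' or 'sum'")
    collected: Dict[str, Dict[int, list]] = {}
    for sample, intensities in table.items():
        group = groups[sample]
        for fid, value in intensities.items():
            collected.setdefault(group, {}).setdefault(fid, []).append(value)
    return {
        group: {
            fid: (sum(values) / len(values) if how == "mean" else sum(values))
            for fid, values in features.items()
        }
        for group, features in collected.items()
    }


# Hypothetical feature table with three runs in two groups.
table = {
    "A": {1: 10.0, 2: 30.0},
    "B": {1: 20.0, 2: 20.0},
    "C": {1: 5.0, 2: 15.0},
}
groups = {"A": "case", "B": "case", "C": "control"}

normalized = normalize_per_run(table)
group_means = aggregate_by_group(normalized, groups, how="mean")
group_sums = aggregate_by_group(normalized, groups, how="sum")
```

With a sum, groups that contain more runs get larger values simply because of their size, whereas an average keeps groups of different sizes comparable.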

img

Here is an example FBMN job with files resulting from MZmine2 processing of a subset of the American Gut Project.

Video Tutorial - Analyze FBMN jobs in GNPS

This video presents

FBMN in Cytoscape

Cytoscape is an open-source software platform used to visualize, analyze, and annotate molecular networks from GNPS. See the documentation here.

Tutorials

See our tutorial on using MZmine2 for FBMN analysis of a cohort from the American Gut Project, and our tutorial on running an FBMN analysis on GNPS.

Page contributors

Louis Felix Nothias (UCSD), Daniel Petras (UCSD), Ming Wang (UCSD), Ivan Protsyuk (EMBL, Heidelberg, Germany).

Join the GNPS Community!