FBMN with MZmine
Introduction¶
The main documentation for Feature-Based Molecular Networking can be accessed here. See our article.
Below we describe how to use MZmine2 v2.51 with the FBMN workflow on GNPS. We have previously written the documentation for v2.33 and have noted differences in the software versions.
Mass spectrometry processing with MZmine¶
Citations and development¶
Recommended Citations
This work builds on the efforts and tools from our many colleagues, please cite their work:
Nothias, L.-F., Petras, D., Schmid, R. et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods 17, 905–908 (2020).
Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837 (2016).
Katajamaa, M., Miettinen, J. & Oresic, M. MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22, 634–636 (2006).
Pluskal, T., Castillo, S., Villar-Briones, A. & Oresic, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11, 395 (2010).
The development of the features used in the pipeline is publicly accessible here.
Installation¶
Download the latest version of MZmine (v2.33 minimum) at https://github.com/mzmine/mzmine2/releases.
Data Processing with MZmine for FBMN¶
In MZmine, a sequence of steps is performed to process the mass spectrometry data. Here we present the key steps required to process LC-MS/MS data acquired in non-targeted mode (data dependent acquisition). For convenience, we also provide a batch file (XML format) that can be imported directly into MZmine.
Warning
MZmine parameters will vary depending on the instrument used, the acquisition parameters, and the samples studied. The following documentation serves as a basic guideline for using MZmine with the FBMN workflow.
Please consult the resources below for more details on MZmine processing:
- The official documentation http://mzmine.github.io/documentation.html,
- The MZmine tutorial by Pierre-Marie Allard and Joelle Houriet from the University of Geneva.
- ADAP User Manual
- The video tutorial below about MZmine2 processing for Feature Based Molecular Networking. Note that this video is slightly outdated, so please refer to the steps described in this documentation.
- The video tutorial below about Quick MZMine2 Export to GNPS for FBMN
Convert your LC-MS/MS Data to an Open Format¶
MZmine accepts different input formats. We recommend converting your files to mzML format before MZmine2 processing. See the documentation here.
Processing Steps¶
Below is the schematic representation of the LC-MS/MS data processing steps with MZmine (thanks Daniel Petras !):
Below is the overview of the LC-MS/MS data processing steps in the MZmine batch mode:
Batch Import¶
Here are some MZmine batch files that are compatible with the FBMN workflow. These batch files can be imported into MZmine (Batch mode):
Instrument | Gradient Length | Matrix Type | Sample Size | Download |
---|---|---|---|---|
Bruker Maxis HD qTof | 10 Min | Stool | 20 | Batch |
Please contribute and send batch files for various instruments | | | | Batch |
Processing Steps¶
Below is a walk-through of all the steps.
1. Import Files¶
Go to Menu: Raw data methods > Raw data import > Select the files
2. Mass Detection¶
This step creates mass lists from your raw LC-MS/MS data (non-targeted mode).
Perform mass detection on MS level 1: Menu: Raw data methods > Feature Detection > Mass detection > Set filters: MS level 1.
(version 2.33) Menu: Raw data methods > Mass detection > Set filter : MS level 1
Important
Set an appropriate intensity threshold. You can use the preview window to assess the right threshold on your data. As a rule of thumb, the value should at least correspond to the minimum intensity set for triggering the MS2 scan event. (Example: MAXIS-QTOF: 1E3; Q-Exactive: 1E4)
Perform mass detection on MS level 2. The same mass list name must be used.
Go to Menu: Raw data methods > Feature Detection > Mass detection > Set filters: MS level 2.
(version 2.33) Go to: Raw data methods > Mass detection > Set filter : MS level 2.
Important
Make sure to set an intensity threshold representative of noise level in the MS2 spectra. This is typically lower than for MS1. (Example: maXis QTOF: 1E2; LTQ-XL Orbitrap 1E4, Q-Exactive: 0). If you have any doubt, set it to 0.
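To make the effect of the two thresholds concrete, here is a minimal Python sketch of threshold-based mass detection. It is not MZmine's implementation: the `detect_masses` helper and the example spectra are hypothetical, and it assumes centroided data where a peak is kept when its intensity is at or above the noise level.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def detect_masses(spectrum: List[Peak], noise_level: float) -> List[Peak]:
    """Keep only the centroided peaks at or above the noise level."""
    return [(mz, intensity) for mz, intensity in spectrum if intensity >= noise_level]


# Hypothetical centroided spectra, for illustration only.
ms1_scan = [(301.1410, 8.2e3), (302.1443, 1.5e3), (415.2100, 4.0e2)]
ms2_scan = [(121.0290, 3.5e2), (153.0180, 9.0e1), (283.1300, 1.2e3)]

# MS1 uses the higher threshold, MS2 the lower one (maXis-style example values).
ms1_mass_list = detect_masses(ms1_scan, noise_level=1e3)
ms2_mass_list = detect_masses(ms2_scan, noise_level=1e2)
print(len(ms1_mass_list), len(ms2_mass_list))
```

Setting the MS2 threshold too high removes low-intensity fragment ions from the exported spectra, which is why a value of 0 is the safe choice when in doubt.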
3. Build Chromatogram (LC-MS feature detection part 1)¶
Starting with MZmine 2.39, the original Chromatogram builder is considered deprecated. It has been replaced with the ADAP Chromatogram Builder. The ADAP Module includes parameters for the Min group size # of scans and Group intensity threshold. Further explanation for these parameters can be found in the ADAP Tutorial. If you use the ADAP Chromatogram Builder, please cite the publication below.
Myers, O.D. et al, One Step Forward for Reducing False Positive and False Negative Compound Identifications from Mass Spectrometry Metabolomics Data: New Algorithms for Constructing Extracted Ion Chromatograms and Detecting Chromatographic Peaks. Anal. Chem. 89, 17, 8696-8703 (2017).
Go to: Raw data methods > Feature Detection > Chromatogram builder OR ADAP Chromatogram builder
(version 2.33) Go to: Raw data methods > Chromatogram builder
4. Deconvolve the Chromatogram (LC-MS feature detection part 2)¶
Go to Menu: Feature list methods > Feature detection > Chromatogram deconvolution
(version 2.33) Go to Menu: Peak list methods > Peak detection > Chromatogram deconvolution
Important
Tick both options "m/z range for MS2 scan pairing (Da)" and "RT range for MS2 scan pairing (min)". The values have to be defined according to your experimental setup (expected MS mass accuracy and chromatographic peak width).
Example for a UHPLC column (1.7 µm C18, 50 × 2.1 mm, flow rate of 0.5 mL/min):
- maXis-QTOF: 12 min gradient, 0.02 Da and 0.15 min
- Q-Exactive: 5 min gradient, 0.01 Da and 0.1 min
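The role of the two pairing ranges can be sketched in Python as follows. This is an illustration only: the `Feature` and `Ms2Scan` classes and the `is_paired` function are hypothetical, and the sketch assumes each value acts as a symmetric tolerance around the feature (plus or minus the value). Check how your MZmine version interprets the range before relying on exact numbers.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float      # feature m/z
    rt_min: float  # feature retention time in minutes


@dataclass
class Ms2Scan:
    precursor_mz: float
    rt_min: float


def is_paired(feature: Feature, scan: Ms2Scan,
              mz_range_da: float, rt_range_min: float) -> bool:
    """True when the MS2 scan falls inside both tolerance windows (assumed symmetric)."""
    mz_ok = abs(scan.precursor_mz - feature.mz) <= mz_range_da
    rt_ok = abs(scan.rt_min - feature.rt_min) <= rt_range_min
    return mz_ok and rt_ok


# Hypothetical feature and scans, using the maXis-QTOF example values above.
feature = Feature(mz=301.141, rt_min=5.20)
scans = [
    Ms2Scan(precursor_mz=301.150, rt_min=5.25),  # inside both windows
    Ms2Scan(precursor_mz=301.200, rt_min=5.25),  # m/z too far
    Ms2Scan(precursor_mz=301.141, rt_min=5.60),  # RT too far
]
paired = [is_paired(feature, s, mz_range_da=0.02, rt_range_min=0.15) for s in scans]
print(paired)
```

Ranges that are too narrow leave features without an MS/MS spectrum, while ranges that are too wide can attach spectra from neighboring ions.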
5. Group isotopes and co-eluting ions¶
Use the "Isotopic peaks grouper" [recommended] or other alternative (such as the CAMERA module).
Go to Menu: Feature list methods > Isotopes > Isotopic peaks grouper
(version 2.33) Go to Menu: Peak list methods > Isotopes > Isotopic peaks grouper.
Important
This depends on your expected peak shapes, duty cycle time and the MS mass accuracy. (Example: MAXIS-QTOF, 10 min gradient, 0.1 min, 0.02 m/z; Q-Exactive, 5 min gradient, 0.05 min, 0.01 m/z)
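The idea behind isotope grouping with an RT tolerance and an m/z tolerance can be sketched as below. This is not the "Isotopic peaks grouper" code: the function name, the `max_isotopes` limit, and the use of the 13C/12C mass difference as the isotope spacing are assumptions made for illustration.

```python
ISOTOPE_DISTANCE = 1.003355  # approximate 13C - 12C mass difference in Da


def is_isotope_of(mono_mz: float, mono_rt: float,
                  cand_mz: float, cand_rt: float,
                  mz_tol: float, rt_tol: float,
                  charge: int = 1, max_isotopes: int = 3) -> bool:
    """True when the candidate co-elutes with the monoisotopic peak and sits
    at an expected isotope spacing, within the given tolerances."""
    if abs(cand_rt - mono_rt) > rt_tol:
        return False
    for n in range(1, max_isotopes + 1):
        expected_mz = mono_mz + n * ISOTOPE_DISTANCE / charge
        if abs(cand_mz - expected_mz) <= mz_tol:
            return True
    return False


# Hypothetical peaks, using the maXis-QTOF example tolerances (0.1 min, 0.02 m/z).
print(is_isotope_of(301.1410, 5.20, 302.1445, 5.22, mz_tol=0.02, rt_tol=0.1))
print(is_isotope_of(301.1410, 5.20, 302.3000, 5.22, mz_tol=0.02, rt_tol=0.1))
```

The first candidate is accepted as an isotope peak; the second is rejected because its m/z does not match any expected spacing.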
6. Order the peaklists¶
Go to Menu: Peak list methods > Order peak lists.
Important
This is to ensure the reproducibility of the processing. Indeed, the aligned peak list will change slightly if that step is not performed.
7. LC-MS feature alignment (Peaklist alignment)¶
In this step, the peak lists from each sample are combined into a single aligned peak list. The alignment is performed iteratively, starting from the first peak list selected (see MZmine documentation). For that reason, make sure the first sample is suitable (not a negative control), or manually put a representative peak list in the first position.
Go to Menu: Feature list methods > Alignment > Join aligner
(version 2.33) Go to Menu: Peak list methods > Alignment > Join aligner
8. (Optional) Detect Missing Peaks / Gap Filling¶
Gap filling retrieves the intensity of a peak in all the samples, even if it was not detected in a previous processing step. Go to Menu: Feature list methods > Gap filling > Peak finder (multi-threaded).
(version 2.33) Go to Menu: Peak list methods > Gap filling > Peak finder (multi-threaded).
Important
This step is optional. Use the multi-threaded peak finder for fast processing.
9. (Optional) Filter the Peaklist to MS/MS Peaks¶
Depending on the number of features in the aligned peak list, it is possible to filter the peak list to keep only features with a minimum number of occurrences ("Minimum peaks in a row") or a minimum number of isotopic peaks for the feature ("Minimum peaks in an isotope pattern"), or to "Keep only peaks with MS2 scan (GNPS)".
Go to Menu: Feature list methods > Filtering > Feature list rows filter > Select the filters
(version 2.33) Go to Menu: Peak list methods > Filtering > Peak list row filter > Select the filters
Important
If you use a filter, we recommend also using the filter "Reset the peak number ID".
Important
Note that this step was mandatory in the prototype versions of FBMN with MZmine. The filter "Keep only peaks with MS2 scan (GNPS)" is now optional.
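The combined effect of these filters can be sketched in Python. The row structure (`id`, `areas`, `has_ms2`) and the `filter_rows` function are hypothetical and only mimic the behavior described above; MZmine applies the filters internally on its own feature list objects.

```python
def filter_rows(rows, min_peaks_in_row, require_ms2=False, reset_ids=True):
    """Keep rows detected in enough samples, optionally only those with an MS2 scan,
    and optionally renumber the kept rows starting from 1."""
    kept = []
    for row in rows:
        detected = sum(1 for area in row["areas"] if area > 0)
        if detected < min_peaks_in_row:
            continue
        if require_ms2 and not row["has_ms2"]:
            continue
        kept.append(dict(row))  # copy so the input rows are left untouched
    if reset_ids:
        for new_id, row in enumerate(kept, start=1):
            row["id"] = new_id
    return kept


# Hypothetical aligned feature list with three samples per row.
rows = [
    {"id": 1, "areas": [100.0, 0.0, 0.0], "has_ms2": True},
    {"id": 2, "areas": [50.0, 60.0, 70.0], "has_ms2": False},
    {"id": 3, "areas": [10.0, 20.0, 0.0], "has_ms2": True},
]
print([r["id"] for r in filter_rows(rows, min_peaks_in_row=2)])
print([r["id"] for r in filter_rows(rows, min_peaks_in_row=2, require_ms2=True)])
```

Resetting the IDs after filtering keeps the row numbering contiguous in the filtered list.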
10. Use the GNPS Export module¶
Use the dedicated module "Submit to/Export for GNPS" in MZmine under Feature list methods > Export/Import to export the needed file:
- the feature quantification table (.CSV file format) with LC-MS feature intensities.
- the MS/MS spectral summary (.MGF file), with a representative MS/MS spectrum per LC-MS feature. The MS/MS spectrum corresponds either to the most intense MS/MS spectrum found for the feature, or to the merged spectrum (new feature !)
Select the latest "filtered aligned peaklist" generated and go to Menu: Peak list methods > Export > Export for/Submit to GNPS
See an example of files generated by the export module using the workflow: here.
The files can be uploaded to the GNPS web-platform and Feature-Based Molecular Networking job can be directly launched¶
IMPORTANT: While submitting the files directly to GNPS and launching a FBMN job on the fly is convenient for quick data analysis, the files will not be saved in your personal account on GNPS and are periodically deleted, which will prevent future cloning of the jobs. If you do not provide a username and password, you are also limited to basic parameter presets. For that reason, we recommend uploading your files with the FTP uploader (see documentation) and preparing your job directly on GNPS (you must be logged in first).
In the "Export for/Submit to GNPS" module, select the option: "Submit to GNPS"
- [Optional] Metadata file: specify the path to the metadata table in GNPS format. See documentation here
- Select the parameters presets for the GNPS job.
- [Optional] Email: specify the email to forward the job link
- [Optional] Annotation edges (experimental feature that will be described later).
- [Optional] Open website: if ticked, will open the job webpage.
ADDITIONAL NOTES: The feature table must contain at least the row ID, the row m/z, and row retention time, along with the sample columns. It is currently mandatory for the sample name headers to have the following format: "filename Peak area". Depending on the steps used in MZmine the sample name header can be "filename baseline-corrected Peak area", but this has to be changed back to "filename Peak area".
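The header requirements above can be checked and repaired with a short script. The following Python sketch is one possible approach, not an official GNPS tool: the function names and the `EXTRA_SUFFIXES` list are hypothetical, and the example table is made up for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ("row ID", "row m/z", "row retention time")
AREA_SUFFIX = " Peak area"
# Hypothetical list of processing suffixes to strip; extend it for your own batch.
EXTRA_SUFFIXES = (" baseline-corrected",)


def normalize_header(header):
    """Rewrite sample columns to the 'filename Peak area' form."""
    cleaned = []
    for name in header:
        if name.endswith(AREA_SUFFIX):
            sample = name[: -len(AREA_SUFFIX)]
            for extra in EXTRA_SUFFIXES:
                if sample.endswith(extra):
                    sample = sample[: -len(extra)]
            name = sample + AREA_SUFFIX
        cleaned.append(name)
    return cleaned


def missing_columns(header):
    """Return the required columns that are absent from the header."""
    return [col for col in REQUIRED_COLUMNS if col not in header]


def fix_feature_table(csv_text):
    """Normalize the header of a feature table given as CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows[0] = normalize_header(rows[0])
    missing = missing_columns(rows[0])
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up feature table with one header that needs fixing.
example = (
    "row ID,row m/z,row retention time,"
    "sampleA.mzML baseline-corrected Peak area,sampleB.mzML Peak area\n"
    "1,301.141,5.2,1200.0,950.0\n"
)
fixed = fix_feature_table(example)
print(fixed.splitlines()[0])
```

To apply it to a real export, read the .csv file into a string, pass it to `fix_feature_table`, and write the result to a new file so the original export is preserved.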
Video Tutorial - Quick MZMine Export to GNPS for FBMN.¶
The workflow for Feature Based Molecular Networking in GNPS is different from the classic molecular networking workflow. Access the FBMN workflow here (You need to be logged in first !)
FBMN in GNPS¶
The main documentation of the Feature Based Molecular Networking workflow on GNPS can be consulted on that page. The workflow for Feature Based Molecular Networking in GNPS is different from the "classic" molecular networking workflow. Access the FBMN workflow here (You need to be logged in first !).
Basically, you will need to upload the files produced by MZmine (test files are accessible here):
- The feature quantification table (.CSV file format).
- The MS/MS spectral summary (.MGF file format)
- [Optional] The metadata table - described here
There are several additional normalization options specifically for feature detection. We can normalize the features per LC/MS run and aggregate by groups with either a sum or average (recommended).
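As a rough illustration of these two options, the Python sketch below scales each LC/MS run and then aggregates runs by group. The exact normalization GNPS applies is not described here, so the sketch assumes each run is divided by its total intensity; the function names and the example data are hypothetical.

```python
def normalize_per_run(table):
    """Scale each run so its feature intensities sum to 1 (assumed normalization)."""
    normalized = {}
    for run, intensities in table.items():
        total = sum(intensities)
        normalized[run] = [x / total if total > 0 else 0.0 for x in intensities]
    return normalized


def aggregate_by_group(table, groups, how="average"):
    """Combine the runs of each group feature by feature, using a sum or an average."""
    aggregated = {}
    for group, runs in groups.items():
        per_feature = list(zip(*(table[run] for run in runs)))
        if how == "sum":
            aggregated[group] = [sum(values) for values in per_feature]
        elif how == "average":
            aggregated[group] = [sum(values) / len(runs) for values in per_feature]
        else:
            raise ValueError(f"Unknown aggregation: {how}")
    return aggregated


# Made-up table: two runs, two features each, both runs in one metadata group.
table = {"runA": [10.0, 30.0], "runB": [20.0, 20.0]}
groups = {"G1": ["runA", "runB"]}

normalized = normalize_per_run(table)
print(aggregate_by_group(normalized, groups, how="average"))
print(aggregate_by_group(normalized, groups, how="sum"))
```

Averaging keeps group values comparable when groups contain different numbers of runs, which is one reason the average is the recommended option.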
Here is an example FBMN job with files resulting from MZmine2 processing of a subset of the American Gut Project.
ADDITIONAL NOTES: If you want to run Feature-Based Molecular Networking with only part of the samples you processed in MZmine, it is possible to filter the exported files (.csv and .mgf) to keep only the features of interest. In the .csv file, the rows of specific features can simply be deleted. Within FBMN in GNPS, the workflow will automatically consider only the set of features present in the .csv file.
For other purposes, if you would like to filter down the .mgf file, there are two options of Python scripts to filter the file. One of them can be run in the Terminal, while the other can be run in a Jupyter notebook.
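For readers who want to see the principle, here is a minimal Python sketch of such a filter. It is not one of the scripts mentioned above: the `filter_mgf` function is hypothetical, and it assumes each spectrum block sits between `BEGIN IONS` and `END IONS` and carries a `FEATURE_ID=` line matching the row ID of the feature table. Verify this on your own export before using it.

```python
def filter_mgf(mgf_text, keep_ids):
    """Return only the spectrum blocks whose FEATURE_ID is in keep_ids (a set of strings)."""
    kept_blocks = []
    block = []
    feature_id = None
    inside = False
    for line in mgf_text.splitlines():
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            inside = True
            block = [line]
            feature_id = None
            continue
        if not inside:
            continue  # ignore anything outside a spectrum block
        block.append(line)
        if stripped.startswith("FEATURE_ID="):
            feature_id = stripped.split("=", 1)[1]
        elif stripped == "END IONS":
            if feature_id in keep_ids:
                kept_blocks.append("\n".join(block))
            inside = False
    return "\n\n".join(kept_blocks) + ("\n" if kept_blocks else "")


# Made-up MGF content with three features.
example_mgf = """BEGIN IONS
FEATURE_ID=1
PEPMASS=301.141
121.029 350.0
END IONS

BEGIN IONS
FEATURE_ID=2
PEPMASS=415.210
153.018 90.0
END IONS

BEGIN IONS
FEATURE_ID=3
PEPMASS=283.130
95.049 1200.0
END IONS
"""
filtered = filter_mgf(example_mgf, keep_ids={"1", "3"})
print(filtered.count("BEGIN IONS"))
```

The set of IDs to keep can be read from the row ID column of the filtered .csv file so that both files describe the same features.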
Video Tutorial - Analyze FBMN jobs in GNPS¶
This video presents
FBMN in Cytoscape¶
Cytoscape is an open source software platform used to visualize, analyze and annotate molecular networks from GNPS. See the documentation here
Tutorials¶
See our tutorial on using MZmine2 for FBMN analysis of a cohort from the American Gut Project, and our tutorial on running a FBMN analysis on GNPS.
Join the GNPS Community !¶
- For feature requests, or to report bugs, please open an "Issue" on the CCMS-UCSD/GNPS_Workflows GitHub repository.
- To contribute to the GNPS documentation, please use GitHub by forking the CCMS-UCSD/GNPSDocumentation repository, and make a "Pull Request" with the changes.
Page Contributors¶