FBMN with OpenMS
Introduction to FBMN with OpenMS¶
The main documentation for Feature-Based Molecular Networking can be accessed here. See our article.
Below we are describing how to use OpenMS with the FBMN workflow on GNPS.
This work builds on the efforts and tools from our many colleagues, please cite their work:
Nothias, L.-F., Petras, D., Schmid, R. et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods 17, 905–908 (2020).
Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837 (2016).
Röst, H. L. et al. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat. Methods 13, 741–748 (2016). https://doi.org/10.1038/nmeth.3959
The OpenMS website at https://www.openms.de/
The GNPSExport TOPP tool code can be found on OpenMS GitHub repository and the lastest version on this fork repository.
The code for the Open-GNPS pipeline (ProteoSAFe workflow and python wrapper) is available on this GitHub repository.
Feature Detection with OpenMS for the FBMN¶
We have developed an OpenMS-GNPS pipeline that can be used for the processing of metabolomics LC-MS data with OpenMS and Feature Based Molecular Networking (FBMN). This pipeline uses OpenMS tools and the GNPSExport (part of OpenMS TOPP tools).
In brief, after running an OpenMS "metabolomics" pipeline, the GNPSExport TOPP tool can be used on the consensusXML file and corresponding mzML files to generate the files needed for FBMN on GNPS. These two files are:
- The feature quantification table (.TXT format) which is generated with the TextExport tool.
- The MS2 spectral summary file (.MGF format) which is generated with the GNPSExport tool.
Running the OpenMS-GNPS pipeline on GNPS web-platform¶
The OpenMS-GNPS pipeline is an experimetal workflow deployed currently on proteomics2.ucsd.edu [https://proteomics2.ucsd.edu/ProteoSAFe/]). The job can be configured as follows:
- Log into https://proteomics2.ucsd.edu (Create an account if necessary).
- Access the lastest version of the OpenMS + GNPS workflow (release 20) at this address.
- In Import Data Files, select the input mzML files (prefered) or mzXML files (not recommended, because the pipeline would have to perform conversion to mzML)
- [TO BE RELEASED] Select the parameters from the presets
HPLC-Q-Exactive, UHPLC-Q-Exactive, HPLC-QTOF, UHPLC-Q-Exactive. The corresponding OpenMS configuration files (.INI files) are available from that [GitHub repository] (https://github.com/Bioinformatic-squad-DorresteinLab/openms-gnps-workflow/presets/)). Alternatively, you can upload your OpenMS TOPP tool parameter file (.INI files). Note that parameter files can be updated with a text editor or with the INIFileEditor TOPP tool.
- Wait for the job to complete. You will get a notification.
- Download the main output files with "Download OpenMS Output Files" and/or all the files with "Download Workflow Files".
- Unzip the archive.
- Examine the consensusXML file (in the filefilter folder) with TOPPAS.
gnpsexportfolder contains the MS/MS spectral summary file (.MGF file)
textexporterfolder contains the feature quantification table (.TXT file).
- Click a FBMN workflow on GNPS by clicking on "Run a Feature-Based Molecular Networking at GNPS" Note that you have to be logged in to GNPS first, https://gnps.ucsd.edu). See documentation for the FBMN workflow on GNPS.
- Alternatively, upload these files via FTP for FBMN on GNPS, or use them for other annotation tools (DEREPLICATOR, Mass2Motif, NAP.
Running the OpenMS-GNPS pipeline with OpenMS TOPP tools¶
Installation and requirement¶
Installing OpenMS TOPP tools: See the OpenMS website at https://www.openms.de/
Representative OpenMS-GNPS workflow¶
A representative OpenMS-GNPS workflow would sequencially use these OpenMS TOPP tools:
- Input the mzML files.
- Run the FeatureFinderMetabo tool on the mzML files.
- Run the IDMapper tool on the featureXML and mzML files.
- Run the MapAlignerPoseClustering tool on the featureXML files.
- Run the MetaboliteAdductDecharger tool on the featureXML files.
- Run the FeatureLinkerUnlabeledKD tool or FeatureLinkerUnlabeledQT, on the featureXML files and output a consensusXML file.
- Run the FileFilter tool on the consensusXML file to keep only consensusElements with at least MS/MS scan (peptide annotation).
- Run the GNPSExport tool on the "filtered consensusXML file" to export an .MGF file.
- Run the TextExport tool on the "filtered consensusXML file" to export a .TXT file.
- Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. See the main documentation for using FBMN on GNPS.
Requirements for the OpenMS-GNPS pipeline¶
- The IDMapper has to be run on the featureXML files, in order to associate MS2 scan(s) (peptide annotation) with each feature. These peptide annotations are used by the GNPSExport.
- The FileFilter has to be run on the consensusXML file, prior to the GNPSExport, in order to remove consensusElements without MS2 scans (peptide annotation).
- Note that mass accuracy and the retention time window for the pairing between MS2 scan(s) and a LC-MS feature or consensusElement is defined at the IDMapper tool step for features.
The GNPSExport TOPP tool¶
The GNPSExport TOPP tool can be run on the consensusXML file and corresponding mzML files to generate a MS/MS spectral file (MGF format) and corresponding feature quantification table (.TXT format) that contains LC-MS peak area intensities.
For each consensusElement in the consensusXML file, GNPSExport produces one representative consensus MS/MS spectrum (named peptide annotation in OpenMS jargon) saved to the MS/MS spectral summary file (.MGF file). Several modes for the generation of the consensus MS/MS spectrum are available and described below. Note that these parameters are defined in the GNPSExport parameters file (.INI file, available with that link).
Usage and Options for GNPSExport TOPP tool:¶
GNPSExport -ini iniFile-GNPSExport.ini -in_cm filefilter.consensusXML -in_mzml inputFile0.mzML inputFile1.mzML -out GNPSExport_output.mgf
Options for the consensus MS/MS spectra:
Most intense [default]:
most_intense- For each consensusElement, the GNPSExport will output the most intense MS/MS scan (with the highest precursor ion intensity) as representative MS/MS spectrum.
full_spectra- For each consensusElement, the GNPSExport will output All MS/MS scans.
merged_spectra- For each consensusElement, the GNPSExport will merge all eligible MS/MS scans into one representative MS/MS spectrum. Eligible MS/MS scans have a pairwise cosine similarity with the MS/MS scan of highest precursor intensity above the Cosine Similarity Treshold. The fragment ions of merged MS/MS scans are binned in m/z (or Da) range defined by the Binning width parameter.
- Cosine Similarity Treshold:
merged_spectra:cos_similarity(float, default: 0.9) - Parameter that defines Cosine Similarity Treshold for the pairwise cosine similarity between the MS/MS scan with the highest precursor intensity and the other MS/MS scans.
- Binning width:
merged_spectra:ms2_binned_size(float, default: 0.02 Daltons) - Parameter that defines the Binning width of fragment ions during the merging of eligible MS/MS spectra.
- Cosine Similarity Treshold:
Running the FBMN workflow with OpenMS files¶
After the processing with OpenMS, the output files can be used to run the Feature-Based Molecular Networking workflow on GNPS either using the [Superquick FBMN start page] (http://dorresteinappshub.ucsd.edu:5050/featurebasednetworking) or the standard interface of the FBMN workflow (you need to be logged in GNPS first).
Make sure to select the correct table source (OpenMS).
The main documentation for FBMN can be accessed here. For metadata format requirements, see this page
Join the GNPS Community !¶
- For feature request, or to report bugs, please open an "Issue" on the CCMS-UCSD/GNPS_Workflows GitHub repository.
- To contribute to the GNPS documentation, please use GitHub by forking the CCMS-UCSD/GNPSDocumentation repository, and make a "Pull Request" with the changes.