molDiscovery - a probabilistic model for in silico database search of MS/MS¶
- molDiscovery - a probabilistic model for in silico database search of MS/MS
- Manuscripts and Citations
- Accessing molDiscovery workflow on GNPS
- STEP 1: Go to GNPS and login
- STEP 2: Access molDiscovery
- STEP 3: Select input spectra
- STEP 4: Set up parameters
- STEP 5: Submit job
- STEP 6: View results
- Feedback and Bug Reports
- Contribute to the Documentation
- Page Contributions
molDiscovery is a probabilistic model for in silico database search of small molecule MS/MS. It improves DEREPLICATOR+ by
- using an efficient fragmentation algorithm
- using log likelihood ratio as metabolite spectrum matching score
- an efficient implementation for indexing
As a result, molDiscovery improves both efficiency and accuracy of the database search.
Manuscripts and Citations¶
Liu Cao et al. "MolDiscovery: Learning Mass Spectrometry Fragmentation of Small Molecules." bioRxiv.
The GitHub repository provides latest source code and release.
Accessing molDiscovery workflow on GNPS¶
The molDiscovery are currently available on GNPS as a workflow.
STEP 1: Go to GNPS and login¶
Go to http://gnps.ucsd.edu, and create/login to your account.
STEP 2: Access molDiscovery¶
Derectly go to the workflow page.
STEP 3: Select input spectra¶
Select/Import a mass spectrometry file: The input format accepted are mzML/mzXML/MGF. Note that have the option to import an existing GNPS dataset (by clicking on "Share Files") or upload your own data (by clicking on "Upload Files"). See the corresponding GNPS documentation for FTP upload.
After getting done with selecting files, click on Finish Selection.
STEP 4: Set up parameters¶
Set up the parameters for the molDiscovery job: Select a title for your job, and adjust parameters depending on your data and the mass spectrometer used.
-- Precursor Ion Mass Tolerance: This value specifies how much fragment ions can be shifted from their expected m/z values. Default value is ± 0.01 Da.
-- Fragment ion Mass Tolerance: This value specifies how much fragment ions can be shifted from their expected m/z values. Default value is ± 0.01 Da.
-- Predifined database: the default small molecule structure database is AllDB, which contains 720K compounds.
-- Custom DB file: if a custom database is provided (via File selection or URL), the predefined DB choice is ignored.
-- Max charge: Maximum charge allowed. The default is 2.
-- Fragmentation Model: the default model is 2-1-3, which indicates there will be at most two bridges, one 2-cut, and three cuts in total.
-- Min significant score: the minimum score to report a significant metabolite spectrum match. The default value is 10.
STEP 5: Submit job¶
Add an email for notification. Then click on submit.
You will get an email with the link to the results when your job is finished. You can also check the status of your job in "Jobs" section.
STEP 6: View results¶
See the molDiscovery job status: The status of the job can be vizualized in your account job list, and through the link in provided in email you will received when the job is done.
The job can be Cloned, for further parameters modifications. Summary Statistics and Workflow Parameters can be consulted on the respective links.
Click on View Unique Metabolites (recommended), to have get the list of annotated molecules.
Click on the View All MSM (Metabolites-Spectrum Match) to have detailed view of the Metabolites-spectrum matches.
Explore the molDiscovery annotations: Annotations are sorted by molDiscovery score (default), mass, compound name, etc. These fields can also be searched with the filter function.
Feedback and Bug Reports¶
Contribute to the Documentation¶
- For informations/feature request, please open an "Issue" on the CCMS-UCSD/GNPSDocumentation GitHub repository.
- To contribute directly to the GNPS documentation, fork the CCMS-UCSD/GNPSDocumentation repository, and make a "Pull Request".