Fig. 2

Schematic overview of the use of NPClassScore in integrative omics mining. Functionality that previously existed in the NPLinker workflow is shaded in grey. First, BGCs and MS/MS spectra are clustered and dereplicated by BiG-SCAPE and GNPS molecular networking, respectively. Co-occurrence scoring (standardised Metcalf) then generates ranked candidate BGC-MS/MS spectrum links by correlating the presence/absence patterns of strains that contain a BGC and/or an MS/MS spectrum. The non-shaded area depicts the NPClassScore workflow, which we integrated into the NPLinker platform. We incorporated structure-based classification predictions into the integrative omics mining workflow using CANOPUS from the SIRIUS platform and MolNetEnhancer, which predict ClassyFire and NPClassifier ontologies, and used antiSMASH and BiG-SCAPE for the genome-based chemical compound class ontologies. Given the predicted classes of a BGC and an MS/MS spectrum, NPClassScore outputs a score based on the matched genome- and structure-based ontologies in MIBiG. The scores from NPClassScore are best used to filter candidate BGC-MS/MS spectrum links with an NPClassScore cut-off and then to rerank the candidate lists previously ranked by co-occurrence scoring.
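
The filter-then-rerank step described in the caption can be illustrated with a minimal Python sketch. This is not the NPLinker API: the `CandidateLink` type, its field names, the `filter_and_rerank` helper, and the cut-off value of 0.25 are all hypothetical and chosen only for illustration. The sketch also assumes that "rerank" means keeping the links that pass the cut-off and ordering them by their co-occurrence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateLink:
    """Hypothetical record for one BGC-MS/MS spectrum candidate link."""
    bgc_id: str
    spectrum_id: str
    metcalf_score: float    # standardised co-occurrence score
    np_class_score: float   # class-matching score (assumed: higher is better)


def filter_and_rerank(links, cutoff):
    """Drop links below the class-score cut-off, then order the rest
    by co-occurrence score, highest first."""
    kept = [link for link in links if link.np_class_score >= cutoff]
    return sorted(kept, key=lambda link: link.metcalf_score, reverse=True)


# Illustrative values only, not taken from any dataset.
links = [
    CandidateLink("bgc1", "spec1", metcalf_score=3.0, np_class_score=0.10),
    CandidateLink("bgc2", "spec1", metcalf_score=2.0, np_class_score=0.80),
    CandidateLink("bgc3", "spec1", metcalf_score=2.5, np_class_score=0.50),
]

ranked = filter_and_rerank(links, cutoff=0.25)
for rank, link in enumerate(ranked, start=1):
    print(rank, link.bgc_id, link.metcalf_score, link.np_class_score)
```

In this toy example the link with the highest co-occurrence score (`bgc1`) is removed because its class score falls below the cut-off, so the remaining links move up one position in the ranked list.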