Quality assessment of a clinical next-generation sequencing melanoma panel within the Italian Melanoma Intergroup (IMI)

Background Identification of somatic mutations in key oncogenes in melanoma is important to guide the effective and efficient use of personalized anticancer treatment. Conventional methods analyze only a few genes per run and are therefore unable to screen multiple genes simultaneously. Next-Generation Sequencing (NGS) technologies enable sequencing of multiple cancer-driving genes in a single assay, with reduced costs, lower DNA input, and increased mutation detection sensitivity. Methods We designed a customized IMI somatic gene panel for targeted sequencing of actionable melanoma mutations; this panel was tested on three different NGS platforms using 11 metastatic melanoma tissue samples, in a blinded manner, between two EMQN quality-certified laboratories. Results The detection limit of our assay was set at a Variant Allele Frequency (VAF) of 10% with a coverage of at least 200x. All somatic variants detected by all NGS platforms with a VAF ≥ 10% were also validated by an independent method. The IMI panel achieved very good concordance among the three NGS platforms. Conclusion This study demonstrated that, using the main sequencing platforms currently available in the diagnostic setting, the IMI panel can be adopted among different centers, providing comparable results. Supplementary Information The online version contains supplementary material available at 10.1186/s13000-020-01052-5.


Introduction
Malignant melanoma is one of the most aggressive, drug-resistant human cancers, and its incidence has risen persistently during the last few decades, particularly in the Caucasian population [1]. According to GLOBOCAN, 287,723 new cases of melanoma of the skin occurred worldwide in 2018 (1.6% of all cancers), with 60,712 reported deaths (GLOBOCAN 2018) [2]. In 2020, it is estimated that around 377,000 new cancer cases will be diagnosed in Italy and, among them, 14,863 cases are expected to be melanomas (AIOM, AIRTUM, I numeri del cancro in Italia 2020, available at: https://www.fondazioneaiom.it/wp-content/uploads/2020/10/2020_Numeri_Cancro-pazienti-web.pdf). Several tumor suppressor genes and/or oncogenes have been reported to be involved in melanomagenesis [3][4][5][6]. Of great interest are the RAS-RAF-MEK-ERK, PI3K/PTEN and c-Kit pathways, since patients harboring activating mutations in the BRAF, NRAS and KIT genes could benefit from targeted treatment options or tailored combinations of targeted therapies and immunotherapies. The identification of variants predictive of response or resistance to systemic treatments is already recommended for proper management of advanced melanoma, and molecular testing is a priority in determining the course of therapy. Indeed, molecular testing for actionable mutations is mandatory in patients with advanced disease (unresectable stage III or stage IV) and highly recommended in high-risk resected disease (stage IIc, stage IIIb-IIIc). In case of a BRAF wild-type tumor, NRAS and c-KIT (mucosal and acral lentiginous primaries) testing should be performed (Italian Association of Medical Oncology/AIOM Guidelines Melanoma - 2019, available at: https://www.aiom.it/linee-guida-aiom-melanoma-2019/; National Comprehensive Cancer Network/NCCN clinical practice guidelines in oncology: melanoma - 2019, available at: https://www.nccn.org/professionals/physician_gls/pdf/cutaneous_melanoma.pdf) [7].
To date, various molecular strategies are available for mutational analysis of the BRAF gene, such as Sanger Sequencing (SS), real-time PCR, high-resolution melting analysis, Peptide Nucleic Acid (PNA)-mediated real-time PCR clamping, digital PCR, pyrosequencing, and immunohistochemistry. Each technique detects mutations in a single gene per run, with its own sensitivity, specificity, and limit of detection [17][18][19][20][21][22][23][24]. Initially, the Cobas 4800 BRAF V600 Mutation Test (Roche Molecular Systems) and the THxID™-BRAF kit (BioMerieux, Inc.) were the only FDA-approved assays for the BRAF V600E mutation and for BRAF V600E/V600K mutations, respectively, in DNA samples extracted from Formalin-Fixed Paraffin-Embedded (FFPE) human melanoma tissue (http://www.fda.gov/companiondiagnostics) [25][26][27]. The advent of high-throughput Next-Generation Sequencing (NGS) technology has revolutionized the understanding of cancer biology and improved personalized treatment strategies in a large variety of human cancers, including melanoma. NGS targeted gene sequencing panels may represent an attractive method for hospitals and clinics, since they can simultaneously screen disease-related mutations in multiple genes per run, thus reducing both reagent costs and the DNA quantity required, with enough sensitivity and specificity to detect somatic variants with frequencies higher than 5%. In the clinical setting, the application of NGS targeted gene panels requires analytical validation to ensure the detection of somatic variants and high quality of sequencing results [28]. NGS methods for testing cancer-related genes have been rapidly adopted by clinical laboratories [29], but no consensus on the use of NGS tests and on the validation of a customized panel in clinical practice for melanoma has yet been established in Italy.
A consensus was reported by the AIOM 2019 guidelines, but only for BRAF mutations (AIOM Guidelines for Melanoma -version 2019, available at: https://www.aiom.it/linee-guida-aiommelanoma-2019/).
Here, we present the design of a customized panel that analyzes target regions of 25 genes frequently mutated in melanoma, selected on the basis of literature evidence [5], and its mutational concordance across three different NGS platforms. By using three NGS platforms commonly available in research and clinical centers, this multicenter study aims to develop quality controls to be adopted by IMI centers.

Sample collection
We selected a total of 11 metastatic melanoma cases, 5 treated at the IRCCS Ospedale Policlinico San Martino (Genoa, Italy) and 6 treated at the Unit of Cancer Genetics, National Research Council (CNR) (Sassari, Italy). Both centers have passed previous External Quality Assessment (EQA) tests conducted by both the Italian Association of Medical Oncology (AIOM) and The European Molecular Genetics Quality Network (EMQN). These quality assurance procedures are currently widely recognized systems for assessing the performance of a laboratory, allowing laboratories to demonstrate consensus with their peers and providing information on inter-method comparability.
All samples were FFPE tissues, except for two fresh frozen tumor samples. All tumor samples were evaluated by pathologists for the presence of adequate tumor cell content (≥70%). The clinical characteristics of the metastatic melanoma patients are reported in Table 1. All specimens had already been screened for the presence of BRAF exon 15 mutations by SS and by a Real Time PCR assay (PNAClamp™ BRAF Mutation Detection Kit; Panagene, Daejeon, Korea) or the Therascreen™ BRAF Pyro assay (Qiagen, Valencia, CA) for molecular diagnostic purposes.
All patients were informed about the use of their tumour tissue samples for mutation analyses, gave permission to collect tissue specimens for such purposes, and signed a written consent form. The study was approved by the local Ethics Committees of the institutions involved in this study (National Research Council and Ospedale Policlinico San Martino). Medical records were used for collecting clinical and pathological data (clinical presentation, tumour size and characteristics; Table 1).

DNA extraction and quality control
[...] and Agilent 2200 TapeStation system using the Genomic DNA ScreenTape assay (Agilent Technologies, Santa Clara, CA, USA). gDNA fragmentation status was evaluated with the Agilent 2200 TapeStation system using the Genomic DNA ScreenTape assay (Agilent Technologies, Santa Clara, CA, USA), which produces a DNA Integrity Number (DIN). The DIN of the gDNA samples ranged from 2.9 to 8.6.
All DNA samples belonging to each laboratory were distributed in a blind-coded manner to the other.

Melanoma panel design
The "IMI Somatic Panel" (IAD79062) was created with the Ion AmpliSeq™ Designer™ tool (available at https://ampliseq.com/login/login.action) to facilitate the identification of the genetic regions most significantly associated with melanoma; the chosen targets of 35.13 kb were entered into the online tool and the resulting 343 amplicons (ranging from 125 to 175 bp) were divided by the online designer into three primer pools to maximize target specificity [30].

Illumina
Overall, 30 ng of gDNA per sample was used for library construction with the IMI Somatic Panel (three primer pools) and the AmpliSeq Library PLUS for Illumina kit (Illumina Inc., San Diego, CA, USA), following the manufacturer's instructions. Cycling conditions were set according to the DNA type and the number of primer pairs per pool: 23 cycles with an extension time of 4 min in the first multiplex PCR, and seven cycles in the second, optional PCR. Sample libraries were combined and diluted to 2 nM, denatured with fresh 0.2 N NaOH, and diluted to 8.4 pM by addition of Illumina HT1 buffer. The libraries, spiked with 1% PhiX (8.4 pM), were then sequenced on an Illumina MiSeq™ instrument using the 300-cycle (2 × 150 paired-end) MiSeq Reagent Kit v2 (Illumina).
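The dilution from 2 nM to 8.4 pM follows the usual C1·V1 = C2·V2 relationship. The sketch below only illustrates that arithmetic; the function name and the 5 µL starting volume are assumptions for illustration and are not values reported in this study.

```python
def final_volume_ul(c_start_pm: float, v_start_ul: float, c_target_pm: float) -> float:
    """Total volume (µL) needed to dilute a library to c_target_pm.

    Uses C1 * V1 = C2 * V2; concentrations are in pM, volumes in µL.
    """
    if c_target_pm <= 0 or c_target_pm > c_start_pm:
        raise ValueError("target must be positive and not exceed the start concentration")
    return c_start_pm * v_start_ul / c_target_pm


# Hypothetical example: 5 µL of the 2 nM (2000 pM) pooled library.
# The 5 µL input volume is an assumption made only for illustration.
start_pm = 2000.0
input_ul = 5.0
total_ul = final_volume_ul(start_pm, input_ul, c_target_pm=8.4)
added_ul = total_ul - input_ul  # NaOH plus HT1 buffer added overall
print(f"dilution factor: {start_pm / 8.4:.1f}x")
print(f"final volume: {total_ul:.1f} µL ({added_ul:.1f} µL added)")
```

In practice the denaturation and dilution volumes are fixed by the manufacturer's protocol, so this calculation is only a consistency check on the stated concentrations.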

PGM™ Ion Torrent
gDNA from the 11 tumor samples was amplified using the Ion AmpliSeq™ Library Kit 2.0 (ThermoFisher Scientific), starting from 30 ng of gDNA and barcoding each sample following the manufacturer's instructions. Cycling conditions were

Proton™ Ion Torrent
The eleven libraries were generated starting from 30 ng of input DNA with the Ion AmpliSeq Library Kit 2.0, according to the manufacturer's instructions, barcoded with Ion Xpress Barcode Adapters, diluted to a final concentration of 50 pM, and pooled together. Template preparation and chip loading were performed on the Ion Chef; PI™ v2 BC chips were subsequently sequenced on the Ion Proton™ instrument using the Ion PI™ IC 200 Kit.

Bioinformatics analysis
The Variant Caller (VC) analysis for each sample was carried out using the informatics solution integrated with each specific NGS platform (Ion Torrent or Illumina). For Ion Torrent platforms, initial variant calls from the Ion AmpliSeq™ sequencing data were generated using Torrent Suite v.5.10.1 (ThermoFisher Scientific) with the VC plug-in (VC v.5.10.1.20) and the "Generic - PGM (3xx) - Somatic - Low Stringency" parameters. Moreover, Ion Reporter™ Software was used for variant annotation.
Illumina data were analyzed using BaseSpace (Illumina) to convert *.bcl files into FASTQ files, which contain base calls and quality information for all reads passing filter. DNA Amplicon App v.2.1.0 was used for alignment in the targeted regions (specified in a manifest file), or the Burrows-Wheeler Aligner across the entire genome. We selected the option "Somatic Variant Caller" with a Variant Allele Frequency (VAF) threshold of 0.01 (Percentage) and a depth threshold of 10. The tertiary analysis was carried out using BaseSpace Variant Interpreter.

Sanger Sequencing (SS) validation
All NGS variants with a frequency higher than 10% were validated by SS using primer sets designed with the Primer3-Plus tool (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). All primer sequences are reported in Table 2. The PCR reactions were performed by amplifying 40 ng of gDNA in a final volume of 15.5 μL containing 200 μmol/L dNTPs, 10× Taq buffer, 0.322 μM of each PCR primer, and 1.5 U of Taq Hot Start (Qiagen). The PCR program consisted of 10 min at 95°C and 35 cycles of 30 s at 95°C, 30 s at the primer-specific annealing temperature, and 30 s at 72°C, followed by 5 min at 72°C. Purified products were sequenced, using the same primers as for the PCR amplification, with the BigDye Terminator v1.1 cycle sequencing kit (Applied Biosystems) under the following conditions: 1 μl BigDye Terminator v1.1, 2 μl sequencing buffer 5X, 3.2 pmol forward or reverse primer, 1.5 μl purified PCR product and 4 μl sterile water, to a final reaction volume of 10.5 μl. Cycle sequencing was performed with an initial denaturation step at 96°C for 10 s followed by 25 cycles of 96°C for 10 s and 60°C for 3 min on a GeneAmp® PCR System 9700 (Applied Biosystems). The sequencing products were separated by capillary electrophoresis in an automated sequencer (ABI 3130XL Genetic Analyzer, Applied Biosystems) with a 36 cm capillary and POP-7™ polymer, according to the manufacturer's instructions. Data were analyzed with Sequencing Analysis Software version 5.3.1 (Applied Biosystems).

NGS concordance
The concordance of variant calls across the three NGS approaches was measured with the Intra-class Correlation Coefficient (ICC) [32], using the irr package within the R computational environment [33,34]. The ICC was calculated considering cut-offs of 200x depth of coverage and a VAF of 10.0%, and then recalculated using only the VAF criterion.
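To make the concordance measure concrete, the following is a minimal sketch of an ICC computed on a variants-by-platforms table of VAFs. It is written in Python with NumPy rather than with the R package used in the study, and it assumes the two-way, absolute-agreement, single-measurement form of the ICC; the text does not state which ICC model was selected, so this choice, the function name, and the example values are all assumptions.

```python
import numpy as np


def icc_two_way_agreement(ratings) -> float:
    """ICC for absolute agreement, two-way model, single measurement.

    ratings: array of shape (n_variants, n_platforms), one VAF per cell.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between variants
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between platforms
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols                # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical VAFs (%) for four variants on PGM, Proton and MiSeq.
# A variant not called by a platform is entered as 0.
vaf = np.array([
    [45.0, 47.0, 44.0],
    [12.0, 11.0, 13.0],
    [30.0, 28.0, 31.0],
    [0.0, 0.0, 48.0],
])
print(round(icc_two_way_agreement(vaf), 3))
```

Each row is one variant and each column one platform, so a variant called by a single platform (last row) lowers the coefficient because the platforms disagree on its VAF.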

Results
The NGS analysis was performed using a specific multiple-gene panel constructed by the Italian Melanoma Intergroup, the IMI Somatic Panel, arranged in three primer pools and designed using the Ion AmpliSeq Designer to explore the mutational status of selected regions (343 amplicons; amplicon range: 125-175 bp; coverage 100%) within the 25 genes reported as the most frequently mutated in melanomas by The Cancer Genome Atlas (TCGA) and subsequent NGS-based studies [5,14].

Analytical performance
We evaluated the performance of somatic variant detection by the three NGS platforms using the 11 tumor samples that had been blindly sequenced in the two centers.
The combination of variant calls from the three platforms identified a total of 126 exonic genetic variants among the different systems, irrespective of coverage and VAF (Additional File 5; Fig. 4a). By setting a coverage ≥200x and VAF ≥10%, a total of 36 variants were called by the three systems (PGM™, Proton™, and MiSeq™) (Table 4). Therefore, concordance was calculated based on our assay detection limit (coverage ≥200x and VAF ≥10%) on these 36 variants. Despite different coverage depending on the platform used and pipeline of analysis, considering a minimum coverage of 200x and a VAF [...] Ser12TrpfsTer14)) in one tumor sample (ID #10), but both variants had a coverage of 108x and were thus excluded by our detection limit (Additional File 4). Interestingly, the two CDKN2A genetic variants started at the same chromosome position with considerably different VAFs. Since one of the two had been called by Illumina with a VAF of 48.1%, we decided to validate it by SS. SS confirmed the presence at this chromosome position (NM_001195132: chr9:21974792) of p.(Ser12TrpfsTer14), with a VAF of around 50%, instead of p.Ser12Leu. A possible explanation for the incorrect call could be the position of the variant (GRCh37.p13; chr9:21974792), located at the last base of the designed amplicon. The region in which Illumina called the CDKN2A variant was covered at a similar (105X) and higher (250X) depth by PGM™ and Proton™, respectively, and therefore we considered this variant as called at a frequency of 0% by these two platforms. In light of these findings, we re-assessed the concordance between the three platforms, dropping the coverage cut-off and including all 37 variants with VAF higher than 10% (Fig. 4b), and obtained an ICC of 0.863 between the three platforms (95%CI = 0.779-0.922, p < 0.01).
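The selection logic described above (apply the detection limit, then record 0% for a platform that did not call a variant) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout, the function names, and the rule "keep a variant if at least one platform call passes the cut-offs" are assumptions, and the BRAF values are invented. Only the MiSeq CDKN2A call (VAF 48.1%, coverage 108x) is taken from the text.

```python
from collections import defaultdict

PLATFORMS = ("PGM", "Proton", "MiSeq")
MIN_COVERAGE = 200  # reads
MIN_VAF = 10.0      # percent


def passes_detection_limit(call: dict, use_coverage: bool = True) -> bool:
    """True if a single call meets the VAF (and optionally coverage) cut-off."""
    if call["vaf"] < MIN_VAF:
        return False
    return call["coverage"] >= MIN_COVERAGE if use_coverage else True


def vaf_matrix(calls: list, use_coverage: bool = True) -> dict:
    """Map each retained variant to its VAF per platform (0.0 if not called)."""
    by_variant = defaultdict(dict)
    for c in calls:
        by_variant[c["variant"]][c["platform"]] = c
    matrix = {}
    for variant, per_platform in by_variant.items():
        # Assumption: keep the variant if any platform call passes the cut-offs.
        if any(passes_detection_limit(c, use_coverage) for c in per_platform.values()):
            matrix[variant] = [per_platform.get(p, {"vaf": 0.0})["vaf"] for p in PLATFORMS]
    return matrix


calls = [
    # Hypothetical calls for one variant seen on all three platforms.
    {"variant": "BRAF p.V600E", "platform": "PGM", "vaf": 35.2, "coverage": 1500},
    {"variant": "BRAF p.V600E", "platform": "Proton", "vaf": 36.0, "coverage": 3200},
    {"variant": "BRAF p.V600E", "platform": "MiSeq", "vaf": 34.1, "coverage": 900},
    # Call reported in the text: MiSeq only, VAF 48.1%, coverage 108x.
    {"variant": "CDKN2A c.35delC", "platform": "MiSeq", "vaf": 48.1, "coverage": 108},
]
print(vaf_matrix(calls))                      # coverage and VAF cut-offs
print(vaf_matrix(calls, use_coverage=False))  # VAF cut-off only
```

With both cut-offs the CDKN2A call is dropped because of its 108x coverage; with the VAF criterion alone it is retained, with 0% entered for the two Ion Torrent platforms.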

Discussion
As the number of actionable genes in melanoma is steadily growing, there is an increasing need to perform multi-gene mutation testing in molecular diagnostics. Several NGS panels are commercially available, but these panels often contain genes or hotspots that are not of particular interest for molecular diagnostics because of their uncertain clinical significance, or lack genes or hotspots specific to the tumor types studied. Today, only two commercial NGS panels are specifically designed to test somatic mutations in melanoma. However, these panels, [...] BRAF negative cases is recommended in clinical routine for the selection of targeted therapy and/or inclusion in clinical trials, and all these exons are included in the three panels discussed here. The panel described here is intended for research purposes. It has already been used in research studies performed within the Italian Melanoma Intergroup, with the analyses performed in a single center [30].
We therefore obtained a panel with a total size of 35.13 kb, made up of three primer pools and requiring a limited amount of DNA (30 ng), offering sufficiently extensive and clinically relevant mutational profiling in a cost-efficient way. We then evaluated, in a bicentric study, the concordance of this custom NGS panel in the identification of clinically relevant somatic genetic variants in melanoma patients using three different benchtop sequencers. To do this, we tested the panel on the NGS platforms most commonly available in laboratories: the Ion Torrent PGM™ and Ion Proton™ sequencers from ThermoFisher and the MiSeq™ benchtop sequencer from Illumina. Notably, at the time of the "IMI somatic panel" design, the Ion Torrent S5 XL sequencer (ThermoFisher Scientific) was not available in the two centers for evaluation as an additional NGS platform; because of the limited availability of DNA from the eleven study samples, a different set of patients was subsequently tested on the S5 XL. In any case, the S5 XL sequencer employs the same chemistry as the Ion Torrent PGM™ and the Ion Torrent Proton™, so its inclusion would not be relevant to our analysis. In fact, although several platforms available for routine diagnostic applications can perform high-throughput analysis within a few days, with considerably reduced costs compared to SS [35], two of these are mainly used in clinical laboratories: the Ion Torrent and Illumina systems.
We also estimated the total cost for the analysis of a single patient with the "IMI somatic panel" using the three different sequencing platforms. The cost for testing 25 genes using the "IMI somatic panel" was €270 (loading 3 samples on chip 316v2), €337 (loading all samples on Miseq Reagent Nano kit v2), and €398 (loading all samples on Ion PI Chip Kit V2) per sample for PGM™, Illumina, and Proton™, respectively, not taking into account panel primers, DNA extraction and quantity/quality control, labor time and bioinformatics analysis costs.
All platforms used in this study demonstrated comparable performance in the detection of somatic variants from the DNA samples tested, reaching a mean amplicon coverage higher than 1897x and an average uniformity greater than 87.6%. The Proton™ platform showed higher NGS quality metrics than the other two platforms. This could be due to the smaller number of samples loaded, which allowed a higher coverage than that obtained on the other platforms.
Our analysis revealed that some amplicons consistently fail to reach a coverage >200x across all samples and NGS platforms. Of note, two amplicons (CDKN2A-226,642,480 and MAP2K1-233,667,219) were consistently covered at less than 200x in half of the samples analyzed, suggesting that some amplicons in the "IMI somatic panel" design have an intrinsic limitation in the coverage they can achieve.
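The check described above, flagging amplicons that fall below 200x in at least half of the samples, can be sketched as follows. The function name, the data layout, the amplicon names and the depth values are hypothetical and serve only to illustrate the rule.

```python
def low_coverage_amplicons(coverage: dict, min_cov: int = 200, min_fraction: float = 0.5) -> list:
    """Return amplicons below min_cov in at least min_fraction of samples.

    coverage maps an amplicon ID to a list of per-sample read depths.
    """
    flagged = []
    for amplicon, depths in coverage.items():
        n_low = sum(d < min_cov for d in depths)
        if depths and n_low / len(depths) >= min_fraction:
            flagged.append(amplicon)
    return sorted(flagged)


# Hypothetical depths for four samples; amplicon names are placeholders.
example = {
    "AMPL_A": [1500, 2100, 1800, 950],
    "AMPL_B": [120, 80, 450, 150],
    "AMPL_C": [190, 600, 700, 820],
}
print(low_coverage_amplicons(example))
```

Running the same check separately per platform would show whether an amplicon underperforms everywhere, which points to the design, or only on one instrument.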
Published data have shown that uneven coverage of amplicons is associated with GC bias introduced during PCR amplification of the library, cluster amplification, or sequencing. In fact, the GC content of the amplified region is also critical for sequencing performance on both Illumina and Ion Torrent platforms [36][37][38][39][40]. However, only the CDKN2A-226,642,480 amplicon displayed a GC content higher than 90%, which explains its lower coverage, while the MAP2K1-233,667,219 amplicon showed a GC content of 33% [39]. Moreover, amplicon length cannot explain this lack of coverage either, since the "IMI somatic panel" design has an amplicon range of 125-175 bp. Finally, gDNA degradation status did not influence the NGS quality data either, since the three NGS platforms showed different coverage for the same sample (Additional File 1), irrespective of DIN, although, unsurprisingly, DIN values were lower in FFPE than in fresh frozen samples. By contrast, some amplicons consistently show a coverage < 200x across all samples and NGS platforms, regardless of sample DIN.
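For reference, the GC content of an amplicon is simply the percentage of G and C bases in its sequence. The short sketch below shows the calculation; the function name is an assumption and the sequences are toy placeholders, not the actual panel amplicons.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Placeholder sequences, not the actual panel amplicons.
print(gc_percent("GCGCGGCCGCAT"))  # GC-rich toy example
print(gc_percent("ATTATAGCATTA"))  # AT-rich toy example
```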
Regardless of the NGS quality metrics, the three NGS platforms achieved very good concordance (ICC of 0.901; 95%CI: 0.837-0.945, p < 0.01) considering a 200x depth of coverage and a VAF of 10.0%. Ion Torrent NGS platforms are known to have a higher per-base error rate and lower base-calling accuracy than Illumina sequencing platforms. Moreover, the Ion Torrent platforms tend to misread the length of homopolymers compared to other platforms (e.g. Illumina) [36,37,41]. Unlike the two Ion Torrent platforms, in one tumor sample the Illumina platform called two different genetic variants [NM_001195132: c.35C > T (p.Ser12Leu) and c.35delC (p.Ser12TrpfsTer14)] at the same position of a CDKN2A amplicon (AMPL-225530996). Although the coverage of this CDKN2A amplicon was similar across the three platforms (105x, 230x and 108x for the PGM™, Proton™ and Illumina sequencers, respectively), the two variants were called only by the Illumina platform. Interestingly, the p.(Ser12TrpfsTer14) CDKN2A variant was confirmed by SS at a VAF of around 50.0%.
A possible explanation for this phenomenon is the well-documented tendency of Ion Torrent's current semiconductor sequencing platforms to show a higher indel error rate, particularly after long homopolymeric stretches, compared to Illumina platforms [41,42]. In fact, Illumina's overall indel error rate is the lowest of all NGS technologies. Moreover, paired-end sequencing is more sensitive and accurate than single-end sequencing, because it greatly facilitates alignment and makes it possible, among other things, to detect deletions, duplications or insertions in the patient's DNA. The reason why Illumina miscalled the variant and identified it as an SNV at a frequency of around 50% could be that both genetic variants were located at the end of the amplicon AMPL-225530996. The risk of false negative variants, as well as the allele drop-out phenomenon, could be reduced by a tiling primer design that results in multiple overlapping amplicons for each target, to ensure the correct identification of all variants present in the target regions of the panel design. Moreover, this bias could be mitigated by decreasing the number of samples sequenced in the same NGS run, which would increase coverage per sample while raising the sequencing cost per sample. Specific regions refractory to NGS, such as AMPL-225530996, need to be sequenced by SS and/or validated by alternative assays, in order to cover the gap and to validate the NGS data [43].
All these observations justify the need to improve analytical solutions to detect somatic mutations with high confidence and to avoid false positives or inaccurate calls. Both the miscalling of some variants located at the end of amplicons and the insufficient coverage of some regions highlight the importance of validating variants by an independent test before clinical application. Moreover, NGS results should not be transferred to clinical reports and practice without acceptable validation. It is fundamental to confirm the genetic variation on newly extracted DNA from the same sample using another NGS platform, SS, or another suitable technique, in order to exclude false positive results. Indeed, in our study, all variants called at a VAF higher than 10% were further confirmed by SS (Table 2). Moreover, all samples were previously screened for the presence of mutations in BRAF exon 15 by a Real Time PCR assay (PNAClamp™ BRAF Mutation Detection Kit; Panagene, Daejeon, Korea) and the Therascreen™ BRAF Pyro assay (Qiagen, Valencia, CA) (Table 1). In fact, the PNAClamp™ and Therascreen™ tests were performed as part of the routine diagnostic approach, and the outcome of these tests was documented in the patient report file and communicated to the medical oncologists. The technique used to validate the results should be included in the NGS report. Finally, all variants should be annotated and reported according to the HGVS [44] and, for diagnostic purposes, only those genes with an established (i.e. published and confirmed) relationship between the aberrant genotype and melanoma should be included in the analysis. The information provided in the NGS report should be limited to the disease status, its targets, the names of the genes tested, their reportable ranges, as well as the analytical sensitivity and specificity of the technique [45,46].
By contrast, variants not linked with melanoma, or gene variants not requested by the medical oncologist, should not be reported. It should also be emphasized that the interpretation of the pathogenicity of a variant must be limited to the evidence of its role in melanoma tumorigenesis at the time of the report, and that it could change over time as new information becomes available.
Massive efforts should be made to unify the interpretation and reporting of NGS molecular results among laboratories. In this context, a joint consensus recommendation for the interpretation and reporting of sequence variants in cancer was published [44].
The IMI somatic panel represents a relevant, highly scalable, and robust tool that is easy to implement and can be fully adapted to daily clinical practice for determining actionable gene mutations in melanoma, with very good concordance among the three NGS platforms in detecting somatic variants with frequencies higher than 10% at a coverage of 200x. However, further validation studies on a greater number of samples from metastatic melanoma patients are required. Currently, the screening of clinically actionable mutations is performed on FFPE tumor biopsies, but the amount of tumor tissue is often limited, and DNA quality may not always be optimal. We showed that this panel can be applied to the analysis of FFPE tumor tissue with varying degrees of DNA degradation. In fact, for all the samples, gDNA obtained from routine molecular testing of BRAF in metastatic melanoma and extracted with different methods in the two laboratories proved to be good reference material for the evaluation of this panel.

Conclusions
Since the advent of targeted therapy, treatment decisions are increasingly based on the molecular features of the tumor. Hence, laboratories need comprehensive molecular testing covering all actionable melanoma mutations using only a limited amount of tumor tissue, mostly FFPE tissue, in a time- and cost-effective manner and with good performance. We show that the IMI panel, which includes all established and several candidate melanoma driver genes, has optimal concordance in the detection of actionable melanoma mutations using the three main NGS platforms available in research and clinical centers. We also achieved good sequencing performance across amplicons and hotspot variants within the 25 genes of our custom NGS panel, obtaining an average amplicon coverage above 1800x with all three platforms.
Although our study is limited by the small number of samples analyzed, it showed a high level of concordance in the mutational patterns detected by the panel between two centers using different extraction methods and NGS platforms, and it helped to identify the challenges and opportunities of using center-specific platforms and protocols to analyze the same samples with the same panel. To the best of our knowledge, this is the first study in which the concordance obtained using a custom NGS melanoma panel was evaluated in a bicentric study with three different NGS platforms. This study may lay the groundwork for developing collaborations and for sharing the positive controls analyzed here with other centers working together within the Italian Melanoma Intergroup.
[...] in order to highlight the most significant data in COSMIC, only scores ≥0.7 are classified as 'Pathogenic' whereas mutations are classed as 'Neutral' if the score is ≤0.5 [47]. The "Effect" column reports the effect of the nucleotide change on the protein. The last three columns of the table report the GnomAD Frequency, the predicted effect on the protein based on SIFT, and the conservation score, namely GERP. The converted rankscore is reported for SIFT. To obtain the rankscore, Sorting Intolerant from Tolerant (SIFT) scores were first converted to SIFTnew = (1-SIFTori), then ranked among all SIFTnew scores in dbNSFP. The rankscore is the ratio of the rank of the SIFTnew score to the total number of SIFTnew scores in dbNSFP. If there are multiple scores, only the largest (most damaging) rankscore is presented. Rankscores range from 0.02654 to 0.87932. Genomic Evolutionary Rate Profiling (GERP) is a conservation score calculated by quantifying substitution deficits across multiple alignments of orthologues using the genomes of 35 mammals. It ranges from −12.3 to 6.17, with 6.17 being the most conserved [48].
Additional file 4. List of exonic genetic variants called by MiSeq™ Illumina Variant Interpreter for the eleven tumor samples. All variants are annotated with the gene ID and locus RefSeq, together with the variant allele and the nature of the allele call (heterozygous or homozygous); the mutation nomenclature is based on the convention recommended by the Human Genome Variation Society (http://www.hgvs.org/mutnomen/). Frequency data indicate the percentage of the variant allele detected by Illumina. Moreover, variants are annotated with dbSNP (rs number) or COSMIC v86 database identifiers, together with the FATHMM score. Functional scores for individual mutations from FATHMM-MKL are in the form of a single p-value, ranging from 0 to 1. Scores above 0.5 are deleterious, but in order to highlight the most significant data in COSMIC, only scores ≥0.7 are classified as 'Pathogenic' whereas mutations are classed as 'Neutral' if the score is ≤0.5 [47]. The "Effect" column reports the effect of the nucleotide change on the protein. The last three columns of the table report the GnomAD Frequency, the predicted effect on the protein based on SIFT, and the conservation score, namely GERP. The converted rankscore is reported for SIFT. To obtain the rankscore, Sorting Intolerant from Tolerant (SIFT) scores were first converted to SIFTnew = (1-SIFTori), then ranked among all SIFTnew scores in dbNSFP. The rankscore is the ratio of the rank of the SIFTnew score to the total number of SIFTnew scores in dbNSFP. If there are multiple scores, only the largest (most damaging) rankscore is presented. Rankscores range from 0.02654 to 0.87932. Genomic Evolutionary Rate Profiling (GERP) is a conservation score calculated by quantifying substitution deficits across multiple alignments of orthologues using the genomes of 35 mammals. It ranges from −12.3 to 6.17, with 6.17 being the most conserved [48].
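The score handling described in these captions can be illustrated with a short sketch. It applies the FATHMM-MKL cut-offs quoted above and reproduces the SIFT rankscore conversion on a toy list of scores. The function names, the "Unclassified" label for scores between 0.5 and 0.7, and the tie-handling rule are assumptions; in dbNSFP the ranking is performed over all scores in the database, not over a small list as done here.

```python
def fathmm_class(score: float) -> str:
    """Label a FATHMM-MKL score using the COSMIC cut-offs quoted above."""
    if score >= 0.7:
        return "Pathogenic"
    if score <= 0.5:
        return "Neutral"
    return "Unclassified"  # placeholder label for scores between 0.5 and 0.7


def sift_rankscores(sift_scores: list) -> list:
    """Convert SIFT scores to rankscores: rank of (1 - score) over the total.

    Tied scores share the highest rank in this sketch, which is an assumption.
    """
    converted = [1.0 - s for s in sift_scores]
    total = len(converted)
    return [sum(other <= value for other in converted) / total for value in converted]


# Toy SIFT scores; lower SIFT means more damaging, so a higher rankscore.
toy_sift = [0.0, 0.05, 1.0, 0.3]
print(sift_rankscores(toy_sift))
print([fathmm_class(s) for s in (0.95, 0.6, 0.2)])
```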