Table 2.

Variants identified by km using our AML catalog in the Leucegene and TCGA cohorts.

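| Dataset | Mutation name | Km type: Ins | Km type: Del | Km type: Indel | Km type: Sub | Km type: ITD | Km type: I&I | Variant | Target | Number of samples |
|---|---|---|---|---|---|---|---|---|---|---|
| Leucegene | IDH1 R132 | 0 | 0 | 0 | 32 | 0 | 0 | 32 | 437 | 437 |
| | DNMT3A R882 | 0 | 0 | 0 | 64 | 0 | 0 | 64 | 436 | |
| | NPM1 4-bp ins | 22 | 0 | 1 | 0 | 103 | 13 | 139 | 437 | |
| | FLT3–ITD | 1033388354162 † | | | | | | | 429 | |
| | FLT3–TKD | 0 | 4 | 0 | 31 | 0 | 0 | 34 | 434 | |
| | MYC T58A/P59R | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 437 | |
| | NUP98–NSD1 | 7ᵃ | 0 | 0 | 0 | 0 | 0 | 7 | 6 | |
| | NSD1–NUP98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | |
| | KMT2A–PTD | 10ᵇ | 0 | 0 | 0 | 0 | 0 | 10 | 15 | |
| TCGA (AML) | IDH1 R132 | 0 | 0 | 0 | 11 | 0 | 0 | 11 | 148 | 151 |
| | DNMT3A R882 | 0 | 0 | 0 | 12 | 0 | 0 | 12 | 149 | |
| | NPM1 4-bp ins | 6 | 0 | 0 | 0 | 28 | 6 | 40 | 151 | |
| | FLT3–ITD | 7605222018100 † | | | | | | | 142 | |
| | FLT3–TKD | 0 | 1 | 0 | 11 | 0 | 0 | 12 | 149 | |
| | MYC T58A/P59R | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 139 | |
| | NUP98–NSD1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| | NSD1–NUP98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | |
| | KMT2A–PTD | 3ᵇ | 0 | 0 | 0 | 0 | 0 | 3 | 0 | |
| TCGA (non-AML) | IDH1 R132 | 0 | 0 | 0 | 394 | 0 | 0 | 394 | 9,850 | 10,256 |
| | DNMT3A R882 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9,267 | |
| | NPM1 4-bp ins | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10,232 | |
| | FLT3–ITD | 0 | 0 | 0 | 9 | 0 | 0 | 9 | 163 | |
| | FLT3–TKD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 361 | |
| | MYC T58A/P59R | 0 | 0 | 0 | 5 | 0 | 0 | 5 | 9,204 | |
| | NUP98–NSD1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| | NSD1–NUP98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| | KMT2A–PTD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |

  • † The Ins through Variant values of this row are run together in the source and cannot be separated unambiguously; they are shown unsplit.
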
  • a Fusion with exon 12 found as an insertion in the target sequence.

  • b Tandem duplication extended with exon 9 or 9 and 10.

  • Each dataset is split into two parts: reference targets and variant targets (italic). The “target” column reports the number of samples expressing the target sequence. The “variant” column shows the number of samples in which at least one variant of the target sequence is found. As a variant target sequence represents a mutated sequence (Fig 1E), mutated sample counts are indicated in bold. The “km type” columns give the specific types of variants detected. Of note, several types of variants can be identified in a given sample. As expected, IDH1 SNVs are found in AML samples and in non-AML lower grade glioma (LGG) samples (see Table S1).
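
The relationship between the “km type”, “variant” and “target” columns can be illustrated with a minimal Python sketch. It is not km's own code or output format: the `summarize` function, the per-sample dictionary, and the sample names are hypothetical and are used only to show why the per-type counts of a row can add up to more than its “variant” count when one sample carries several types of variants.

```python
from collections import Counter

# Column names taken from the table header.
KM_TYPES = ("Ins", "Del", "Indel", "Sub", "ITD", "I&I")


def summarize(calls, expressed):
    """Build one table row for a single target (hypothetical helper).

    calls:     dict mapping sample name -> set of km types found in that sample
               (assumed input layout, not km's actual output).
    expressed: set of sample names that express the target sequence.
    """
    type_counts = Counter()
    for types in calls.values():
        # A sample contributes at most once to each km type column.
        type_counts.update(set(types))

    row = {t: type_counts.get(t, 0) for t in KM_TYPES}
    # "Variant": samples with at least one variant, whatever its type.
    row["Variant"] = sum(1 for types in calls.values() if types)
    # "Target": samples expressing the target sequence.
    row["Target"] = len(expressed)
    return row


# Toy data: sample s2 carries two types of variants, s3 carries none.
calls = {"s1": {"Sub"}, "s2": {"Del", "Sub"}, "s3": set()}
expressed = {"s1", "s2", "s3", "s4"}
row = summarize(calls, expressed)
print(row)
```

In this toy example the per-type counts sum to 3 while only 2 samples carry a variant, which mirrors rows of the table where the km type columns exceed the “variant” column.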