1 Preparación

1.1 Instalación MYSQL

Antes de empezar, es necesario descargar e instalar en tu ordenador MySQL y MySQL Workbench.

Haciendo click en los nombres te dirigirá a los links para la descarga de estos programas.

1.2 Usuario y contraseña

Una vez descargados e instalados, debes indicar desde MySQL Workbench tu usuario y contraseña para poder acceder a la base de datos en la que darás de alta tu pipeline. Para ello, abre MySQL Workbench y añade una conexión nueva en MySQL Connections.
MySQL Connections esta es la información que debemos introducir para poder acceder con usuario y contraseña a Delphi:

  • host: delphi-db.cv40yqws6ak9.eu-north-1.rds.amazonaws.com
  • port: 3306
  • user: variant_user
  • password: JNnLynnFDK

2 Dar de alta un pipeline

2.1 Introducción

Esta guía indica como dar de alta un pipeline completp con MySQL Workbench a partir de Pipelines, Guías, Criterios y Métricas disponibles en las base de datos de Delphi.

2.1.1 Bases de datos en Delphi

Tenemos en total tres bases de datos:

  • delphi-dev

  • delphi-release

  • delphi

dev se usa para hacer pruebas, release es la base de datos más estable y delphi es para externalizar. Para este caso, en el que queremos dar de alta un pipeline completo, se trabajará en delphi-dev.

2.1.2 Estructura de los datos

Cada pipeline se subdivide en guías, que pueden subdividirse en criterios y estos contienen métricas. Los Pipeline en delphi-dev tiene un ID, nombre, descripción, autor, IDs para sus Guidelines e IDs para su filtrado.

2.1.3 Insertar de novo

Puede que nuestro pipeline considere Criterios, Métricas, etc que ya estén definidos en las bases de datos de Delphi y por contra también habrán otros que todavía no existen en las BBDD. Es importante diferenciarlos bien ya que solo se introducirán los nuevos. Es decir: identifica las métricas y criterios preexistentes.

2.1.4 ID de referencia

Es importante conocer los ID con los que vas a tener que trabajar para dar de alta tu pipeline. Para el caso de Hereditary Cancer (HC), el pipeline de referencia es 3, las guías de referencia son 5 para la clasificación y 2 para el filtrado.

2.2 1. Insertar Guidelines

Para conocer las distintas Guidelines que hay en Delphi podemos hacer la siguiente consulta:

SELECT * FROM Guideline;

2.2.1 1.1 Clasificación

Es importante adaptar los valores de cada columna acorde a la clasificación con la que queremos trabajar. Por ejemplo, la Guía de clasificación de HC tiene el ID: 5. Por lo que puedo consultar la información de la fila que me interesa a traves de la siguiente consulta:

SELECT * FROM Guideline WHERE id = 5;

De este modo, haciendo click con el botón derecho del mouse puedo copiar el contenido de la fila de interés con Copy Row (unquoted), así será más fácil hacer los INSERT de la `Guideline ya existente.

# Classification guideline 5 - HC
INSERT IGNORE INTO Guideline(id, title, author, applicability, url, rules, evaluation_type) VALUES 
(5, Score Based ACMG-AMP based classification guidelines for Heriditary Cancer, Richards et al, Mendelian disorders, 10.1038/gim.2015.30, {"benign": {"lower": -9999999999, "upper": -6}, "likely benign": {"lower": -6, "upper": -1}, "VUS-ice cold": {"lower": 0, "upper": 0}, "VUS-cold": {"lower": 1, "upper": 1}, "VUS-cool": {"lower": 2, "upper": 2}, "VUS-tepid": {"lower": 3, "upper": 3}, "VUS-warm": {"lower": 4, "upper": 4}, "VUS-hot": {"lower": 5, "upper": 5}, "likely pathogenic": {"lower": 6, "upper": 9}, "pathogenic": {"lower": 10, "upper": 9999999999}}, score);

2.2.2 1.2 Filtrado

El filtrado se puede dejar tal cual está hecho si estamos trabajando bajo un ID ya conocido. Por ejemplo, la Guía de filtrado de HC (Scores) tiene el ID: 2

# Filtering guideline 2 - HC
INSERT IGNORE INTO Guideline(id, title, author, applicability, url, rules, evaluation_type) VALUES 
(2, Filtering guidelines for Heriditary Cancer, PROS UPV, NULL, NULL, {"pass": [{"pathogenic_filtering": 1}]}, boolean);

2.3 2. Insertar Criterions.

Para conocer los distintas Criterions que hay en Delphi podemos hacer la siguiente consulta:

SELECT * FROM Criterion;

Y si queremos concretar la consulta para solo ver los criterios relacionados con una guideline en particular podemos hacer esta otra consulta:

SELECT c.*
FROM Criterion c
INNER JOIN Guideline_Criterion gc ON gc.criterion_id = c.id
WHERE gc.guideline_id = 5;

Esta consulta toma todos los registros de la tabla Criterion (c.*), los une con Guideline_Criterion (gc) por el id del criterio, en este caso es el 5 y de este modo filtra solo aquellos donde la guideline_id es 5.

También podemos conocer los Criterion de filtrado que comprenden a la guia 2 de filtrado para Hereditary Cancer:

SELECT c.*
FROM Criterion c
INNER JOIN Guideline_Criterion gc ON gc.criterion_id = c.id
WHERE gc.guideline_id = 2;

SELECT c.* FROM Criterion c INNER JOIN Guideline_Criterion gc ON gc.criterion_id = c.id WHERE gc.guideline_id = 2;

2.3.1 2.1 Criterions de novo

Tras descartar los criterios preexistentes, insertamos los criterios de novo con el siguiente formato:

# Add OMX criterions - HC
INSERT IGNORE INTO Criterion (id, name, dimension, description, pass_rule, strength, direction, suggested_strength, min_accepted_strength, max_accepted_strength) VALUES 
(89, OM2.1, Variant Type and Location, Protein length changes as a result of in-frame deletions in a known oncogene, AND, moderate, oncogenic, NULL, NULL, NULL),
(90, OM2.2, Variant Type and Location, Protein length changes as a result of in-frame insertions in a known oncogene, AND, moderate, oncogenic, NULL, NULL, NULL),
(91, OM2.3, Variant Type and Location, Protein length changes as a result of reading frame deletions and stop-loss variation in a known tumour suppressor gene, AND, moderate, oncogenic, NULL, NULL, NULL),
(92, OM2.4, Variant Type and Location, Protein length changes as a result of reading frame insertions and stop-loss variation in a known tumour suppressor gene, AND, moderate, oncogenic, NULL, NULL, NULL),
(93, OM3, Variant Type and Location,  Located in one of the hotspots on cancerhotspots.org with less than 50 samples that have a somatic variation at the same amino acid position  and the count of identical amino acid changes on cancerhotspots.org is at least 10, AND, moderate, oncogenic, NULL, NULL, NULL),

2.3.2 2.2 Criterions preexistentes

Los criterios que ya existan en la base de datos no hace falta insertarlos del mismo modo que se describe en el punto anterior, simplemente se establecerá la relación de sus ID con el ID de la guía a la que queremos que pertenezca.

2.4 3. Asociar Criterions a Guideline-Criterions

En esta parte, se empareja el ID de la Guía con cada uno de los ID de los Criterios pertenecientes. Tenemos que incluir los ID de los criterios que hemos creado de novo así como los ID de los Criterios preexistentes que no nos ha hecho falta incluir en el paso 2.

2.4.1 3.1 Clasificación

Se asocian los Criterions de clasificación a la Guide de clasificación que pertenecen a la dimensión: filtering.

# Bind filtering Guideline_Criterion
INSERT IGNORE INTO Guideline_Criterion(guideline_id, criterion_id) VALUES
(5, 88),
(5, 89),
(5, 90),
(5, 91),
(5, 92),
(5, 93),

2.4.2 3.1 Filtrado

Se asocian los Criterions de filtrado a la Guide de filtrado

## Filtering guideline 2 - HC
### Bind new and preexisitent Filtering criterions with Filtering guideline
INSERT IGNORE INTO Guideline_Criterion(guideline_id, criterion_id) VALUES
(2, 4),
(2, 7),
(2, 9),
(2, 19);

2.5 4. Insertar Métricas (Metric)

Antes de insertar las métricas de nuestro pipeline, es importante comprobar que no estén ya incluidas en la base de datos. Además, en los casos en los que encontremos coincidencias, debemos asegurarnos de que sean exactamente iguales. Para comprobarlo, se puede realizar la siguiente consulta:

SELECT c.name as 'c_name', m.* FROM Metric m
INNER JOIN Criterion_Metric cm ON cm.Metric_id = m.id
INNER JOIN Criterion c ON c.id = cm.Criterion_id
INNER JOIN Guideline_Criterion gc ON gc.criterion_id = c.id
WHERE gc.guideline_id = 5
ORDER BY c.name ASC;

Este SELECT en SQL está extrayendo información de varias tablas relacionadas entre sí, y su objetivo es recuperar todas las métricas (m.*) asociadas a una guía concreta (en este caso, la que tiene id = 5), junto con el nombre del criterio al que pertenece cada métrica.

2.5.1 2.1 Metrics de novo

Tras descartar los criterios preexistentes, insertamos las métricas nuevas con el siguiente formato:

# Add Metrics - OMx criterion - HC 
INSERT IGNORE INTO Metric (id, name, description, columns_pattern, min_percentage_data_fulfillment, data_evaluation_condition) VALUES 
(119, inframe_deletion, The variant is an in-frame deletion., ImpactPrediction.description@SnpEff, 1, {"operation": "IN", "value": ["conservative_inframe_deletion","disruptive_inframe_deletion"]}
(120, inframe_insertion, The variant is an in-frame insertion., ImpactPrediction.description@SnpEff, 1, {"operation": "IN", "value": (["conservative_inframe_insertion","disruptive_inframe_insertion"]}),
(121, same_residue_cancerhotspots_greater_50, Located in one of the hotspots on cancerhotspots.org with less than 50 samples showing somatic variation at the same amino acid position (residue)., Count.Mutations@CancerHotspots, 1, {"operation": "MINOR", "value": "49"}),
(122, same_aa_change_greaterequal_10, Same count of amino acid changes on cancerhotspots.org in at least 10 samples., Count.Variant@CancerHotspots, 1, {"operation": "MAJOR", "value": "10"}),
(123, 'oncogene', 'Deletion/insertion occurs in an oncogene', 'Gene.official_symbol@SnpEff', 1, '{"operation": "IN", "value": ["A1CF", "ABL1", "ABL2", "ACKR3", "ACVR1", "AFDN", "AFF3", "AFF4", "AKT1", "AKT2", "AKT3", "ALK", "APOBEC3B", "AR", "ARAF", "ARHGAP5", "ARNT", "ATF1", "ATP1A1", "BCL11A", "BCL11B", "BCL2", "BCL2L12", "BCL3", "BCL6", "BCL9", "BCL9L", "BCORL1", "BIRC3", "BIRC6", "BMPR1A", "BRAF", "BRD3", "BRD4", "BTK", "CACNA1D", "CALR", "CARD11", "CBL", "CCND1", "CCND2", "CCND3", "CCNE1", "CCR4", "CCR7", "CD28", "CD74", "CD79A", "CD79B", "CDH17", "CDK4", "CDK6", "CDKN1A", "CHD4", "CHST11", "CIC", "CREB1", "CREB3L2", "CREBBP", "CRLF2", "CRTC1", "CSF1R", "CSF3R", "CTNNA2", "CTNNB1", "CTNND2", "CUX1", "CXCR4", "CYSLTR2", "DAXX", "DDB2", "DDIT3", "DDR2", "DDX5", "DDX6", "DEK", "DGCR8", "EGFR", "ELF4", "ELK4", "EPAS1", "ERBB2", "ERBB3", "ERBB4", "ERG", "ESR1", "ETV1", "ETV4", "ETV5", "EWSR1", "EZH2", "FCGR2B", "FCRL4", "FES", "FEV", "FGFR1", "FGFR2", "FGFR3", "FGFR4", "FLI1", "FLT3", "FLT4", "FOXA1", "FOXL2", "FOXO1", "FOXO3", "FOXO4", "FOXP1", "FOXR1", "FSTL3", "FUBP1", "GATA1", "GATA2", "GATA3", "GLI1", "GNA11", "GNAQ", "GNAS", "GPC3", "GRM3", "H3F3A", "H3F3B", "HEY1", "HGF", "HIF1A", "HIP1", "HIST1H3B", "HLF", "HMGA1", "HMGA2", "HNRNPA2B1", "HOXA11", "HOXA13", "HOXA9", "HOXC11", "HOXC13", "HOXD11", "HOXD13", "HRAS", "IDH1", "IDH2", "IKBKB", "IKZF3", "IL6ST", "IL7R", "IRF4", "IRS4", "JAK1", "JAK2", "JAK3", "JUN", "KAT6A", "KAT7", "KCNJ5", "KDM5A", "KDM6A", "KDR", "KIT", "KLF4", "KMT2A", "KMT2D", "KNSTRN", "KRAS", "LCK", "LEF1", "LMO1", "LMO2", "LPP", "LYL1", "LYN", "MACC1", "MAF", "MAFB", "MALAT1", "MALT1", "MAML2", "MAP2K1", "MAP2K2", "MAP2K4", "MAP3K1", "MAP3K13", "MAPK1", "MDM2", "MDM4", "MECOM", "MET", "MITF", "MLLT10", "MN1", "MPL", "MRTFA", "MSI2", "MTCP1", "MTOR", "MUC16", "MUC4", "MYB", "MYC", "MYCL", "MYCN", "MYD88", "MYOD1", "NCOA2", "NFATC2", "NFE2L2", "NFKB2", "NKX2-1", "NOTCH1", "NOTCH2", "NPM1", "NR4A3", "NRAS", "NSD2", "NSD3", "NT5C2", "NTRK1", "NTRK2", "NTRK3", "NUP98", "NUTM1", "OLIG2", "P2RY8", "PABPC1", "PAX3", "PAX5", "PBX1", "PDCD1LG2", "PDGFB", "PDGFRA", "PDGFRB", "PIK3CA", "PIK3CB", "PIM1", "PLAG1", "PLCG1", "POLQ", "POU2AF1", "POU5F1", "PPM1D", "PRDM16", "PREX2", "PRKACA", "PRKAR1A", "PSIP1", "PTK6", "PTPN11", "QKI", "RAC1", "RAD21", "RAF1", "RAP1B", "RAP1GDS1", "RARA", "RECQL4", "REL", "RET", "RHOA", "ROS1", "RRAS2", "RSPO3", "RUNX1", "RUNX1T1", "SALL4", "SET", "SETBP1", "SETDB1", "SF3B1", "SGK1", "SH3GL1", "SIX1", "SIX2", "SKI", "SMO", "SND1", "SOX2", "SRC", "SRSF2", "SRSF3", "SSX1", "SSX2", "SSX4", "STAT3", "STAT5B", "STAT6", "STIL", "SUZ12", "SYK", "TAF15", "TAL1", "TAL2", "TBL1XR1", "TBX3", "TCF3", "TCF7L2", "TCL1A", "TEC", "TERT", "TET1", "TFE3", "TFEB", "TLX1", "TLX3", "TMSB4X", "TNC", "TNFRSF17", "TP53", "TP63", "TRIM24", "TRIM27", "TRRAP", "U2AF1", "UBR5", "USP6", "USP8", "WAS", "WT1", "WWTR1", "XPO1", "ZEB1", "ZNF521", "AIP", "DIS3L2", "DLST", "EGLN1", "EGLN2", "EPCAM", "ERCC1", "ERCC6", "FAN1", "FANCB", "FANCI", "FANCL", "FANCM", "GALNT12", "GCM2", "GPR101", "GREM1", "HNF1B", "HOXB13", "KIF1B", "MAD2L2", "MC1R", "MDH2", "MLH3", "MRE11", "MSH3", "MSR1", "NFIX", "POLH", "PTCH2", "RAD51", "RAD51C", "RAD51D", "RECQL", "RPS20", "SCG5", "SLC25A11", "SLX4", "UBE2T", "XRCC2", "XRCC3"]}'),
(124, 'tumour_suppressor_gene', 'Deletion/insertion/missing stop codon occurs in a tumour suppressor gene.', 'Gene.official_symbol@SnpEff', 1, '{"operation": "IN", "value": ["ABI1", "ACVR1B", "ACVR2A", "AMER1", "APC", "APOBEC3B", "ARHGAP26", "ARHGAP35", "ARHGEF10", "ARHGEF10L", "ARHGEF12", "ARID1A", "ARID1B", "ARID2", "ARNT", "ASXL1", "ASXL2", "ATM", "ATP1A1", "ATP2B3", "ATR", "ATRX", "AXIN1", "AXIN2", "B2M", "BAP1", "BARD1", "BAX", "BAZ1A", "BCL10", "BCL11B", "BCL9L", "BCOR", "BCORL1", "BIRC3", "BLM", "BMPR1A", "BRCA1", "BRCA2", "BRIP1", "BTG1", "BTG2", "BTK", "BUB1B", "CAMTA1", "CARS", "CASP3", "CASP8", "CASP9", "CBFA2T3", "CBFB", "CBL", "CBLB", "CCDC6", "CCNB1IP1", "CCNC", "CD274", "CDC73", "CDH1", "CDH10", "CDH11", "CDK12", "CDKN1A", "CDKN1B", "CDKN2A", "CDKN2C", "CDX2", "CEBPA", "CHD2", "CHEK2", "CIC", "CIITA", "CLTC", "CLTCL1", "CNBP", "CNOT3", "CNTNAP2", "CPEB3", "CREB3L1", "CREBBP", "CSMD3", "CTCF", "CTNNA1", "CUL3", "CUX1", "CYLD", "DAXX", "DDB2", "DDX10", "DDX3X", "DICER1", "DNM2", "DNMT3A", "DROSHA", "EBF1", "EED", "EIF3E", "ELF3", "ELF4", "ELL", "EP300", "EPAS1", "EPS15", "ERBB4", "ERCC2", "ERCC3", "ERCC4", "ERCC5", "ESR1", "ETNK1", "ETV6", "EXT1", "EXT2", "EZH2", "FANCA", "FANCC", "FANCD2", "FANCE", "FANCF", "FANCG", "FAS", "FAT1", "FAT4", "FBLN2", "FBXO11", "FBXW7", "FEN1", "FES", "FH", "FHIT", "FLCN", "FOXL2", "FOXO1", "FOXO3", "FOXO4", "FUS", "GATA1", "GATA3", "GPC3", "GPC5", "GRIN2A", "HNF1A", "HOXA11", "HOXA9", "ID3", "IGF2BP2", "IKZF1", "IKZF3", "IRF4", "IRS4", "JAK1", "KAT6B", "KDM5C", "KDM6A", "KEAP1", "KLF4", "KLF6", "KMT2C", "KMT2D", "KNL1", "LARP4B", "LATS1", "LATS2", "LEF1", "LEPROTL1", "LRIG3", "LRP1B", "LZTR1", "MALAT1", "MAP2K4", "MAP3K1", "MAP3K13", "MAX", "MED12", "MEN1", "MGMT", "MLF1", "MLH1", "MRTFA", "MSH2", "MSH6", "MUTYH", "MYH9", "N4BP2", "NAB2", "NBN", "NCOA4", "NCOR1", "NCOR2", "NDRG1", "NF1", "NF2", "NFE2L2", "NFKB2", "NFKBIE", "NKX2-1", "NOTCH1", "NOTCH2", "NRG1", "NTHL1", "NTRK1", "PABPC1", "PALB2", "PATZ1", "PAX5", "PBRM1", "PER1", "PHF6", "PHOX2B", "PIK3R1", "PML", "PMS2", "POLD1", "POLE", "POLG", "POLQ", "POT1", "PPARG", "PPP2R1A", "PPP6C", "PRDM1", "PRDM2", "PRF1", "PRKAR1A", "PTCH1", "PTEN", "PTK6", "PTPN13", "PTPN6", "PTPRB", "PTPRC", "PTPRD", "PTPRK", "PTPRT", "QKI", "RAD17", "RAD21", "RAD50", "RAD51B", "RANBP2", "RB1", "RBM10", "RECQL4", "RFWD3", "RHOA", "RHOH", "RMI2", "RNF43", "ROBO2", "RPL10", "RPL22", "RPL5", "RSPO2", "RUNX1", "RUNX1T1", "RXRA", "SBDS", "SDHA", "SDHAF2", "SDHB", "SDHC", "SDHD", "SETD1B", "SETD2", "SFPQ", "SFRP4", "SH2B3", "SIRPA", "SLC34A2", "SMAD2", "SMAD3", "SMAD4", "SMARCA4", "SMARCB1", "SMARCD1", "SMARCE1", "SMC1A", "SOCS1", "SOX21", "SPEN", "SPOP", "STAG1", "STAG2", "STAT5B", "STK11", "SUFU", "SUZ12", "TBL1XR1", "TBX3", "TCF3", "TENT5C", "TERT", "TET1", "TET2", "TGFBR2", "TMEM127", "TNFAIP3", "TNFRSF14", "TP53", "TP63", "TPM3", "TRAF7", "TRIM24", "TRIM33", "TSC1", "TSC2", "USP44", "VHL", "WIF1", "WNK2", "WRN", "WT1", "XPA", "XPC", "YWHAE", "ZBTB16", "ZFHX3", "ZMYM3", "ZNF331", "ZNRF3", "ZRSR2"]}');

2.5.2 2.2 Metrics preexistentes

Las métricas que ya existan en la base de datos no hace falta que sean insertadas del mismo modo que se describe en el punto anterior, simplemente se establecerá la relación de sus ID con el ID del criterio a la que queremos que pertenezca.

2.6 5. Asociar Metrics a Criterions

Puesto que una o varias métricas deben pertenecer a un criterio, en este paso hay que establecer la relación de métricas a criterio del siguiente modo:

# Bind Metrics to  OMx Criterion -HC 
INSERT IGNORE INTO Criterion_Metric (criterion_id, metric_id) VALUES
(88, 30),
(89, 119),
(89, 123),
(90, 120),
(90, 123),
(91, 119),
(91, 38),
(91, 124),
(92, 120),
(92, 38),
(92, 124),
(93, 121),
(93, 122);

2.7 6. Insertar genes de Interés

Primero evitamos los duplicados:

# Genes Of Interest
# ----------------

# Add a constraint to prevent duplicate genes
ALTER TABLE GeneOfInterest ADD CONSTRAINT uq_gene_name UNIQUE (name);

y así podemos insertar nuestro listado de genes de interés:

# Add Genes to GeneOfInterest - HC
INSERT IGNORE INTO GeneOfInterest (name) VALUES
('A1CF'),
('ABL1'),
('ABL2'),
('ACKR3'),
('ACVR1'),
('AFDN'),
('AFF3'),
('AFF4'),
('AKT1'),
('AKT2'),
('AKT3'),
('ALK'),
('APOBEC3B'),
('AR'),
('ARAF'),
('ARHGAP5'),
('ARNT'),
('ATF1'),
('ATP1A1'),
('BCL11A'),
('BCL11B'),
('BCL2'),
('BCL2L12'),
('BCL3'),
('BCL6'),
('BCL9'),
('BCL9L'),
('BCORL1'),
('BIRC3'),
('BIRC6'),
('BMPR1A'),
('BRAF'),
('BRD3'),
('BRD4'),
('BTK'),
('CACNA1D'),
('CALR'),
('CARD11'),
('CBL'),
('CCND1'),
('CCND2'),
('CCND3'),
('CCNE1'),
('CCR4'),
('CCR7'),
('CD28'),
('CD74'),
('CD79A'),
('CD79B'),
('CDH17'),
('CDK4'),
('CDK6'),
('CDKN1A'),
('CHD4'),
('CHST11'),
('CIC'),
('CREB1'),
('CREB3L2'),
('CREBBP'),
('CRLF2'),
('CRTC1'),
('CSF1R'),
('CSF3R'),
('CTNNA2'),
('CTNNB1'),
('CTNND2'),
('CUX1'),
('CXCR4'),
('CYSLTR2'),
('DAXX'),
('DDB2'),
('DDIT3'),
('DDR2'),
('DDX5'),
('DDX6'),
('DEK'),
('DGCR8'),
('EGFR'),
('ELF4'),
('ELK4'),
('EPAS1'),
('ERBB2'),
('ERBB3'),
('ERBB4'),
('ERG'),
('ESR1'),
('ETV1'),
('ETV4'),
('ETV5'),
('EWSR1'),
('EZH2'),
('FCGR2B'),
('FCRL4'),
('FES'),
('FEV'),
('FGFR1'),
('FGFR2'),
('FGFR3'),
('FGFR4'),
('FLI1'),
('FLT3'),
('FLT4'),
('FOXA1'),
('FOXL2'),
('FOXO1'),
('FOXO3'),
('FOXO4'),
('FOXP1'),
('FOXR1'),
('FSTL3'),
('FUBP1'),
('GATA1'),
('GATA2'),
('GATA3'),
('GLI1'),
('GNA11'),
('GNAQ'),
('GNAS'),
('GPC3'),
('GRM3'),
('H3F3A'),
('H3F3B'),
('HEY1'),
('HGF'),
('HIF1A'),
('HIP1'),
('HIST1H3B'),
('HLF'),
('HMGA1'),
('HMGA2'),
('HNRNPA2B1'),
('HOXA11'),
('HOXA13'),
('HOXA9'),
('HOXC11'),
('HOXC13'),
('HOXD11'),
('HOXD13'),
('HRAS'),
('IDH1'),
('IDH2'),
('IKBKB'),
('IKZF3'),
('IL6ST'),
('IL7R'),
('IRF4'),
('IRS4'),
('JAK1'),
('JAK2'),
('JAK3'),
('JUN'),
('KAT6A'),
('KAT7'),
('KCNJ5'),
('KDM5A'),
('KDM6A'),
('KDR'),
('KIT'),
('KLF4'),
('KMT2A'),
('KMT2D'),
('KNSTRN'),
('KRAS'),
('LCK'),
('LEF1'),
('LMO1'),
('LMO2'),
('LPP'),
('LYL1'),
('LYN'),
('MACC1'),
('MAF'),
('MAFB'),
('MALAT1'),
('MALT1'),
('MAML2'),
('MAP2K1'),
('MAP2K2'),
('MAP2K4'),
('MAP3K1'),
('MAP3K13'),
('MAPK1'),
('MDM2'),
('MDM4'),
('MECOM'),
('MET'),
('MITF'),
('MLLT10'),
('MN1'),
('MPL'),
('MRTFA'),
('MSI2'),
('MTCP1'),
('MTOR'),
('MUC16'),
('MUC4'),
('MYB'),
('MYC'),
('MYCL'),
('MYCN'),
('MYD88'),
('MYOD1'),
('NCOA2'),
('NFATC2'),
('NFE2L2'),
('NFKB2'),
('NKX2-1'),
('NOTCH1'),
('NOTCH2'),
('NPM1'),
('NR4A3'),
('NRAS'),
('NSD2'),
('NSD3'),
('NT5C2'),
('NTRK1'),
('NTRK2'),
('NTRK3'),
('NUP98'),
('NUTM1'),
('OLIG2'),
('P2RY8'),
('PABPC1'),
('PAX3'),
('PAX5'),
('PBX1'),
('PDCD1LG2'),
('PDGFB'),
('PDGFRA'),
('PDGFRB'),
('PIK3CA'),
('PIK3CB'),
('PIM1'),
('PLAG1'),
('PLCG1'),
('POLQ'),
('POU2AF1'),
('POU5F1'),
('PPM1D'),
('PRDM16'),
('PREX2'),
('PRKACA'),
('PRKAR1A'),
('PSIP1'),
('PTK6'),
('PTPN11'),
('QKI'),
('RAC1'),
('RAD21'),
('RAF1'),
('RAP1B'),
('RAP1GDS1'),
('RARA'),
('RECQL4'),
('REL'),
('RET'),
('RHOA'),
('ROS1'),
('RRAS2'),
('RSPO3'),
('RUNX1'),
('RUNX1T1'),
('SALL4'),
('SET'),
('SETBP1'),
('SETDB1'),
('SF3B1'),
('SGK1'),
('SH3GL1'),
('SIX1'),
('SIX2'),
('SKI'),
('SMO'),
('SND1'),
('SOX2'),
('SRC'),
('SRSF2'),
('SRSF3'),
('SSX1'),
('SSX2'),
('SSX4'),
('STAT3'),
('STAT5B'),
('STAT6'),
('STIL'),
('SUZ12'),
('SYK'),
('TAF15'),
('TAL1'),
('TAL2'),
('TBL1XR1'),
('TBX3'),
('TCF3'),
('TCF7L2'),
('TCL1A'),
('TEC'),
('TERT'),
('TET1'),
('TFE3'),
('TFEB'),
('TLX1'),
('TLX3'),
('TMSB4X'),
('TNC'),
('TNFRSF17'),
('TP53'),
('TP63'),
('TRIM24'),
('TRIM27'),
('TRRAP'),
('U2AF1'),
('UBR5'),
('USP6'),
('USP8'),
('WAS'),
('WT1'),
('WWTR1'),
('XPO1'),
('ZEB1'),
('ZNF521'),
('AIP'),
('DIS3L2'),
('DLST'),
('EGLN1'),
('EGLN2'),
('EPCAM'),
('ERCC1'),
('ERCC6'),
('FAN1'),
('FANCB'),
('FANCI'),
('FANCL'),
('FANCM'),
('GALNT12'),
('GCM2'),
('GPR101'),
('GREM1'),
('HNF1B'),
('HOXB13'),
('KIF1B'),
('MAD2L2'),
('MC1R'),
('MDH2'),
('MLH3'),
('MRE11'),
('MSH3'),
('MSR1'),
('NFIX'),
('POLH'),
('PTCH2'),
('RAD51'),
('RAD51C'),
('RAD51D'),
('RECQL'),
('RPS20'),
('SCG5'),
('SLC25A11'),
('SLX4'),
('UBE2T'),
('XRCC2'),
('XRCC3'),
('ABI1'),
('ACVR1B'),
('ACVR2A'),
('AMER1'),
('APC'),
('APOBEC3B'),
('ARHGAP26'),
('ARHGAP35'),
('ARHGEF10'),
('ARHGEF10L'),
('ARHGEF12'),
('ARID1A'),
('ARID1B'),
('ARID2'),
('ARNT'),
('ASXL1'),
('ASXL2'),
('ATM'),
('ATP1A1'),
('ATP2B3'),
('ATR'),
('ATRX'),
('AXIN1'),
('AXIN2'),
('B2M'),
('BAP1'),
('BARD1'),
('BAX'),
('BAZ1A'),
('BCL10'),
('BCL11B'),
('BCL9L'),
('BCOR'),
('BCORL1'),
('BIRC3'),
('BLM'),
('BMPR1A'),
('BRCA1'),
('BRCA2'),
('BRIP1'),
('BTG1'),
('BTG2'),
('BTK'),
('BUB1B'),
('CAMTA1'),
('CARS'),
('CASP3'),
('CASP8'),
('CASP9'),
('CBFA2T3'),
('CBFB'),
('CBL'),
('CBLB'),
('CCDC6'),
('CCNB1IP1'),
('CCNC'),
('CD274'),
('CDC73'),
('CDH1'),
('CDH10'),
('CDH11'),
('CDK12'),
('CDKN1A'),
('CDKN1B'),
('CDKN2A'),
('CDKN2C'),
('CDX2'),
('CEBPA'),
('CHD2'),
('CHEK2'),
('CIC'),
('CIITA'),
('CLTC'),
('CLTCL1'),
('CNBP'),
('CNOT3'),
('CNTNAP2'),
('CPEB3'),
('CREB3L1'),
('CREBBP'),
('CSMD3'),
('CTCF'),
('CTNNA1'),
('CUL3'),
('CUX1'),
('CYLD'),
('DAXX'),
('DDB2'),
('DDX10'),
('DDX3X'),
('DICER1'),
('DNM2'),
('DNMT3A'),
('DROSHA'),
('EBF1'),
('EED'),
('EIF3E'),
('ELF3'),
('ELF4'),
('ELL'),
('EP300'),
('EPAS1'),
('EPS15'),
('ERBB4'),
('ERCC2'),
('ERCC3'),
('ERCC4'),
('ERCC5'),
('ESR1'),
('ETNK1'),
('ETV6'),
('EXT1'),
('EXT2'),
('EZH2'),
('FANCA'),
('FANCC'),
('FANCD2'),
('FANCE'),
('FANCF'),
('FANCG'),
('FAS'),
('FAT1'),
('FAT4'),
('FBLN2'),
('FBXO11'),
('FBXW7'),
('FEN1'),
('FES'),
('FH'),
('FHIT'),
('FLCN'),
('FOXL2'),
('FOXO1'),
('FOXO3'),
('FOXO4'),
('FUS'),
('GATA1'),
('GATA3'),
('GPC3'),
('GPC5'),
('GRIN2A'),
('HNF1A'),
('HOXA11'),
('HOXA9'),
('ID3'),
('IGF2BP2'),
('IKZF1'),
('IKZF3'),
('IRF4'),
('IRS4'),
('JAK1'),
('KAT6B'),
('KDM5C'),
('KDM6A'),
('KEAP1'),
('KLF4'),
('KLF6'),
('KMT2C'),
('KMT2D'),
('KNL1'),
('LARP4B'),
('LATS1'),
('LATS2'),
('LEF1'),
('LEPROTL1'),
('LRIG3'),
('LRP1B'),
('LZTR1'),
('MALAT1'),
('MAP2K4'),
('MAP3K1'),
('MAP3K13'),
('MAX'),
('MED12'),
('MEN1'),
('MGMT'),
('MLF1'),
('MLH1'),
('MRTFA'),
('MSH2'),
('MSH6'),
('MUTYH'),
('MYH9'),
('N4BP2'),
('NAB2'),
('NBN'),
('NCOA4'),
('NCOR1'),
('NCOR2'),
('NDRG1'),
('NF1'),
('NF2'),
('NFE2L2'),
('NFKB2'),
('NFKBIE'),
('NKX2-1'),
('NOTCH1'),
('NOTCH2'),
('NRG1'),
('NTHL1'),
('NTRK1'),
('PABPC1'),
('PALB2'),
('PATZ1'),
('PAX5'),
('PBRM1'),
('PER1'),
('PHF6'),
('PHOX2B'),
('PIK3R1'),
('PML'),
('PMS2'),
('POLD1'),
('POLE'),
('POLG'),
('POLQ'),
('POT1'),
('PPARG'),
('PPP2R1A'),
('PPP6C'),
('PRDM1'),
('PRDM2'),
('PRF1'),
('PRKAR1A'),
('PTCH1'),
('PTEN'),
('PTK6'),
('PTPN13'),
('PTPN6'),
('PTPRB'),
('PTPRC'),
('PTPRD'),
('PTPRK'),
('PTPRT'),
('QKI'),
('RAD17'),
('RAD21'),
('RAD50'),
('RAD51B'),
('RANBP2'),
('RB1'),
('RBM10'),
('RECQL4'),
('RFWD3'),
('RHOA'),
('RHOH'),
('RMI2'),
('RNF43'),
('ROBO2'),
('RPL10'),
('RPL22'),
('RPL5'),
('RSPO2'),
('RUNX1'),
('RUNX1T1'),
('RXRA'),
('SBDS'),
('SDHA'),
('SDHAF2'),
('SDHB'),
('SDHC'),
('SDHD'),
('SETD1B'),
('SETD2'),
('SFPQ'),
('SFRP4'),
('SH2B3'),
('SIRPA'),
('SLC34A2'),
('SMAD2'),
('SMAD3'),
('SMAD4'),
('SMARCA4'),
('SMARCB1'),
('SMARCD1'),
('SMARCE1'),
('SMC1A'),
('SOCS1'),
('SOX21'),
('SPEN'),
('SPOP'),
('STAG1'),
('STAG2'),
('STAT5B'),
('STK11'),
('SUFU'),
('SUZ12'),
('TBL1XR1'),
('TBX3'),
('TCF3'),
('TENT5C'),
('TERT'),
('TET1'),
('TET2'),
('TGFBR2'),
('TMEM127'),
('TNFAIP3'),
('TNFRSF14'),
('TP53'),
('TP63'),
('TPM3'),
('TRAF7'),
('TRIM24'),
('TRIM33'),
('TSC1'),
('TSC2'),
('USP44'),
('VHL'),
('WIF1'),
('WNK2'),
('WRN'),
('WT1'),
('XPA'),
('XPC'),
('YWHAE'),
('ZBTB16'),
('ZFHX3'),
('ZMYM3'),
('ZNF331'),
('ZNRF3'),
('ZRSR2');

2.7.1 Usa python para el cambio de formato:

Para transformar una lista de strings en este formato que necesita MYSQL para insertar todos los genes de interés, se puede hacer fácilmente usando el siguiente código de Python:

# Lista original
lista = ("ABI1", "ACVR1B", "ACVR2A", "AMER1", "APC", "APOBEC3B", "ARHGAP26", "ARHGAP35", "ARHGEF10", "ARHGEF10L", "ARHGEF12", "ARID1A", "ARID1B", "ARID2", "ARNT", "ASXL1", "ASXL2", "ATM", "ATP1A1", "ATP2B3", "ATR", "ATRX", "AXIN1", "AXIN2", "B2M", "BAP1", "BARD1", "BAX", "BAZ1A", "BCL10", "BCL11B", "BCL9L", "BCOR", "BCORL1", "BIRC3", "BLM", "BMPR1A", "BRCA1", "BRCA2", "BRIP1", "BTG1", "BTG2", "BTK", "BUB1B", "CAMTA1", "CARS", "CASP3", "CASP8", "CASP9", "CBFA2T3", "CBFB", "CBL", "CBLB", "CCDC6", "CCNB1IP1", "CCNC", "CD274", "CDC73", "CDH1", "CDH10", "CDH11", "CDK12", "CDKN1A", "CDKN1B", "CDKN2A", "CDKN2C", "CDX2", "CEBPA", "CHD2", "CHEK2", "CIC", "CIITA", "CLTC", "CLTCL1", "CNBP", "CNOT3", "CNTNAP2", "CPEB3", "CREB3L1", "CREBBP", "CSMD3", "CTCF", "CTNNA1", "CUL3", "CUX1", "CYLD", "DAXX", "DDB2", "DDX10", "DDX3X", "DICER1", "DNM2", "DNMT3A", "DROSHA", "EBF1", "EED", "EIF3E", "ELF3", "ELF4", "ELL", "EP300", "EPAS1", "EPS15", "ERBB4", "ERCC2", "ERCC3", "ERCC4", "ERCC5", "ESR1", "ETNK1", "ETV6", "EXT1", "EXT2", "EZH2", "FANCA", "FANCC", "FANCD2", "FANCE", "FANCF", "FANCG", "FAS", "FAT1", "FAT4", "FBLN2", "FBXO11", "FBXW7", "FEN1", "FES", "FH", "FHIT", "FLCN", "FOXL2", "FOXO1", "FOXO3", "FOXO4", "FUS", "GATA1", "GATA3", "GPC3", "GPC5", "GRIN2A", "HNF1A", "HOXA11", "HOXA9", "ID3", "IGF2BP2", "IKZF1", "IKZF3", "IRF4", "IRS4", "JAK1", "KAT6B", "KDM5C", "KDM6A", "KEAP1", "KLF4", "KLF6", "KMT2C", "KMT2D", "KNL1", "LARP4B", "LATS1", "LATS2", "LEF1", "LEPROTL1", "LRIG3", "LRP1B", "LZTR1", "MALAT1", "MAP2K4", "MAP3K1", "MAP3K13", "MAX", "MED12", "MEN1", "MGMT", "MLF1", "MLH1", "MRTFA", "MSH2", "MSH6", "MUTYH", "MYH9", "N4BP2", "NAB2", "NBN", "NCOA4", "NCOR1", "NCOR2", "NDRG1", "NF1", "NF2", "NFE2L2", "NFKB2", "NFKBIE", "NKX2-1", "NOTCH1", "NOTCH2", "NRG1", "NTHL1", "NTRK1", "PABPC1", "PALB2", "PATZ1", "PAX5", "PBRM1", "PER1", "PHF6", "PHOX2B", "PIK3R1", "PML", "PMS2", "POLD1", "POLE", "POLG", "POLQ", "POT1", "PPARG", "PPP2R1A", "PPP6C", "PRDM1", "PRDM2", "PRF1", "PRKAR1A", "PTCH1", "PTEN", "PTK6", "PTPN13", "PTPN6", "PTPRB", "PTPRC", "PTPRD", "PTPRK", "PTPRT", "QKI", "RAD17", "RAD21", "RAD50", "RAD51B", "RANBP2", "RB1", "RBM10", "RECQL4", "RFWD3", "RHOA", "RHOH", "RMI2", "RNF43", "ROBO2", "RPL10", "RPL22", "RPL5", "RSPO2", "RUNX1", "RUNX1T1", "RXRA", "SBDS", "SDHA", "SDHAF2", "SDHB", "SDHC", "SDHD", "SETD1B", "SETD2", "SFPQ", "SFRP4", "SH2B3", "SIRPA", "SLC34A2", "SMAD2", "SMAD3", "SMAD4", "SMARCA4", "SMARCB1", "SMARCD1", "SMARCE1", "SMC1A", "SOCS1", "SOX21", "SPEN", "SPOP", "STAG1", "STAG2", "STAT5B", "STK11", "SUFU", "SUZ12", "TBL1XR1", "TBX3", "TCF3", "TENT5C", "TERT", "TET1", "TET2", "TGFBR2", "TMEM127", "TNFAIP3", "TNFRSF14", "TP53", "TP63", "TPM3", "TRAF7", "TRIM24", "TRIM33", "TSC1", "TSC2", "USP44", "VHL", "WIF1", "WNK2", "WRN", "WT1", "XPA", "XPC", "YWHAE", "ZBTB16", "ZFHX3", "ZMYM3", "ZNF331", "ZNRF3", "ZRSR2")

# Transformación
resultado = ",\n".join(f"('{item}')" for item in lista) + ";"

print(resultado)
## ('ABI1'),
## ('ACVR1B'),
## ('ACVR2A'),
## ('AMER1'),
## ('APC'),
## ('APOBEC3B'),
## ('ARHGAP26'),
## ('ARHGAP35'),
## ('ARHGEF10'),
## ('ARHGEF10L'),
## ('ARHGEF12'),
## ('ARID1A'),
## ('ARID1B'),
## ('ARID2'),
## ('ARNT'),
## ('ASXL1'),
## ('ASXL2'),
## ('ATM'),
## ('ATP1A1'),
## ('ATP2B3'),
## ('ATR'),
## ('ATRX'),
## ('AXIN1'),
## ('AXIN2'),
## ('B2M'),
## ('BAP1'),
## ('BARD1'),
## ('BAX'),
## ('BAZ1A'),
## ('BCL10'),
## ('BCL11B'),
## ('BCL9L'),
## ('BCOR'),
## ('BCORL1'),
## ('BIRC3'),
## ('BLM'),
## ('BMPR1A'),
## ('BRCA1'),
## ('BRCA2'),
## ('BRIP1'),
## ('BTG1'),
## ('BTG2'),
## ('BTK'),
## ('BUB1B'),
## ('CAMTA1'),
## ('CARS'),
## ('CASP3'),
## ('CASP8'),
## ('CASP9'),
## ('CBFA2T3'),
## ('CBFB'),
## ('CBL'),
## ('CBLB'),
## ('CCDC6'),
## ('CCNB1IP1'),
## ('CCNC'),
## ('CD274'),
## ('CDC73'),
## ('CDH1'),
## ('CDH10'),
## ('CDH11'),
## ('CDK12'),
## ('CDKN1A'),
## ('CDKN1B'),
## ('CDKN2A'),
## ('CDKN2C'),
## ('CDX2'),
## ('CEBPA'),
## ('CHD2'),
## ('CHEK2'),
## ('CIC'),
## ('CIITA'),
## ('CLTC'),
## ('CLTCL1'),
## ('CNBP'),
## ('CNOT3'),
## ('CNTNAP2'),
## ('CPEB3'),
## ('CREB3L1'),
## ('CREBBP'),
## ('CSMD3'),
## ('CTCF'),
## ('CTNNA1'),
## ('CUL3'),
## ('CUX1'),
## ('CYLD'),
## ('DAXX'),
## ('DDB2'),
## ('DDX10'),
## ('DDX3X'),
## ('DICER1'),
## ('DNM2'),
## ('DNMT3A'),
## ('DROSHA'),
## ('EBF1'),
## ('EED'),
## ('EIF3E'),
## ('ELF3'),
## ('ELF4'),
## ('ELL'),
## ('EP300'),
## ('EPAS1'),
## ('EPS15'),
## ('ERBB4'),
## ('ERCC2'),
## ('ERCC3'),
## ('ERCC4'),
## ('ERCC5'),
## ('ESR1'),
## ('ETNK1'),
## ('ETV6'),
## ('EXT1'),
## ('EXT2'),
## ('EZH2'),
## ('FANCA'),
## ('FANCC'),
## ('FANCD2'),
## ('FANCE'),
## ('FANCF'),
## ('FANCG'),
## ('FAS'),
## ('FAT1'),
## ('FAT4'),
## ('FBLN2'),
## ('FBXO11'),
## ('FBXW7'),
## ('FEN1'),
## ('FES'),
## ('FH'),
## ('FHIT'),
## ('FLCN'),
## ('FOXL2'),
## ('FOXO1'),
## ('FOXO3'),
## ('FOXO4'),
## ('FUS'),
## ('GATA1'),
## ('GATA3'),
## ('GPC3'),
## ('GPC5'),
## ('GRIN2A'),
## ('HNF1A'),
## ('HOXA11'),
## ('HOXA9'),
## ('ID3'),
## ('IGF2BP2'),
## ('IKZF1'),
## ('IKZF3'),
## ('IRF4'),
## ('IRS4'),
## ('JAK1'),
## ('KAT6B'),
## ('KDM5C'),
## ('KDM6A'),
## ('KEAP1'),
## ('KLF4'),
## ('KLF6'),
## ('KMT2C'),
## ('KMT2D'),
## ('KNL1'),
## ('LARP4B'),
## ('LATS1'),
## ('LATS2'),
## ('LEF1'),
## ('LEPROTL1'),
## ('LRIG3'),
## ('LRP1B'),
## ('LZTR1'),
## ('MALAT1'),
## ('MAP2K4'),
## ('MAP3K1'),
## ('MAP3K13'),
## ('MAX'),
## ('MED12'),
## ('MEN1'),
## ('MGMT'),
## ('MLF1'),
## ('MLH1'),
## ('MRTFA'),
## ('MSH2'),
## ('MSH6'),
## ('MUTYH'),
## ('MYH9'),
## ('N4BP2'),
## ('NAB2'),
## ('NBN'),
## ('NCOA4'),
## ('NCOR1'),
## ('NCOR2'),
## ('NDRG1'),
## ('NF1'),
## ('NF2'),
## ('NFE2L2'),
## ('NFKB2'),
## ('NFKBIE'),
## ('NKX2-1'),
## ('NOTCH1'),
## ('NOTCH2'),
## ('NRG1'),
## ('NTHL1'),
## ('NTRK1'),
## ('PABPC1'),
## ('PALB2'),
## ('PATZ1'),
## ('PAX5'),
## ('PBRM1'),
## ('PER1'),
## ('PHF6'),
## ('PHOX2B'),
## ('PIK3R1'),
## ('PML'),
## ('PMS2'),
## ('POLD1'),
## ('POLE'),
## ('POLG'),
## ('POLQ'),
## ('POT1'),
## ('PPARG'),
## ('PPP2R1A'),
## ('PPP6C'),
## ('PRDM1'),
## ('PRDM2'),
## ('PRF1'),
## ('PRKAR1A'),
## ('PTCH1'),
## ('PTEN'),
## ('PTK6'),
## ('PTPN13'),
## ('PTPN6'),
## ('PTPRB'),
## ('PTPRC'),
## ('PTPRD'),
## ('PTPRK'),
## ('PTPRT'),
## ('QKI'),
## ('RAD17'),
## ('RAD21'),
## ('RAD50'),
## ('RAD51B'),
## ('RANBP2'),
## ('RB1'),
## ('RBM10'),
## ('RECQL4'),
## ('RFWD3'),
## ('RHOA'),
## ('RHOH'),
## ('RMI2'),
## ('RNF43'),
## ('ROBO2'),
## ('RPL10'),
## ('RPL22'),
## ('RPL5'),
## ('RSPO2'),
## ('RUNX1'),
## ('RUNX1T1'),
## ('RXRA'),
## ('SBDS'),
## ('SDHA'),
## ('SDHAF2'),
## ('SDHB'),
## ('SDHC'),
## ('SDHD'),
## ('SETD1B'),
## ('SETD2'),
## ('SFPQ'),
## ('SFRP4'),
## ('SH2B3'),
## ('SIRPA'),
## ('SLC34A2'),
## ('SMAD2'),
## ('SMAD3'),
## ('SMAD4'),
## ('SMARCA4'),
## ('SMARCB1'),
## ('SMARCD1'),
## ('SMARCE1'),
## ('SMC1A'),
## ('SOCS1'),
## ('SOX21'),
## ('SPEN'),
## ('SPOP'),
## ('STAG1'),
## ('STAG2'),
## ('STAT5B'),
## ('STK11'),
## ('SUFU'),
## ('SUZ12'),
## ('TBL1XR1'),
## ('TBX3'),
## ('TCF3'),
## ('TENT5C'),
## ('TERT'),
## ('TET1'),
## ('TET2'),
## ('TGFBR2'),
## ('TMEM127'),
## ('TNFAIP3'),
## ('TNFRSF14'),
## ('TP53'),
## ('TP63'),
## ('TPM3'),
## ('TRAF7'),
## ('TRIM24'),
## ('TRIM33'),
## ('TSC1'),
## ('TSC2'),
## ('USP44'),
## ('VHL'),
## ('WIF1'),
## ('WNK2'),
## ('WRN'),
## ('WT1'),
## ('XPA'),
## ('XPC'),
## ('YWHAE'),
## ('ZBTB16'),
## ('ZFHX3'),
## ('ZMYM3'),
## ('ZNF331'),
## ('ZNRF3'),
## ('ZRSR2');

2.8 7. Crear el Pipeline

Para conocer las distintas Guidelines que hay en Delphi podemos hacer la siguiente consulta:

SELECT * FROM Pipeline;

Es importante adaptar los valores de cada columna acorde a la clasificación con la que queremos trabajar. Por ejemplo, el pipeline para Hereditary Cancer tiene el ID: 3. Por lo que puedo consultar la información de la fila que me interesa a traves de la siguiente consulta:

SELECT * FROM Pipeline WHERE id = 3;

De este modo se puede acceder fácilmente a la informacion que nos interese y crear el pipeline del siguiente modo:

# Create pipeline (id 3) - HC
INSERT IGNORE INTO Pipeline (id, name, description, filtering_guideline, classification_guideline, report_template_id) VALUES
(3, hereditary_cancer_scores, Pipeline for heriditary cancer with scored based evaluation, 2, 5, 3);

2.9 8. Asociar genes al Pipeline

Asociar genes de interés al nuevo pipeline, usar como referencia los genes de interés de la guía 5, del pipeline 3)

# Add Genes Of Interest to the pipeline - OMx - HC
INSERT IGNORE INTO Pipeline_GeneOfInterest(Pipeline_id, GeneOfInterest_id) SELECT 3, GeneOfInterest.id FROM GeneOfInterest WHERE GeneOfInterest.name IN ("A1CF", "ABL1", "ABL2", "ACKR3", "ACVR1", "AFDN", "AFF3", "AFF4", "AKT1", "AKT2", "AKT3", "ALK", "APOBEC3B", "AR", "ARAF", "ARHGAP5", "ARNT", "ATF1", "ATP1A1", "BCL11A", "BCL11B", "BCL2", "BCL2L12", "BCL3", "BCL6", "BCL9", "BCL9L", "BCORL1", "BIRC3", "BIRC6", "BMPR1A", "BRAF", "BRD3", "BRD4", "BTK", "CACNA1D", "CALR", "CARD11", "CBL", "CCND1", "CCND2", "CCND3", "CCNE1", "CCR4", "CCR7", "CD28", "CD74", "CD79A", "CD79B", "CDH17", "CDK4", "CDK6", "CDKN1A", "CHD4", "CHST11", "CIC", "CREB1", "CREB3L2", "CREBBP", "CRLF2", "CRTC1", "CSF1R", "CSF3R", "CTNNA2", "CTNNB1", "CTNND2", "CUX1", "CXCR4", "CYSLTR2", "DAXX", "DDB2", "DDIT3", "DDR2", "DDX5", "DDX6", "DEK", "DGCR8", "EGFR", "ELF4", "ELK4", "EPAS1", "ERBB2", "ERBB3", "ERBB4", "ERG", "ESR1", "ETV1", "ETV4", "ETV5", "EWSR1", "EZH2", "FCGR2B", "FCRL4", "FES", "FEV", "FGFR1", "FGFR2", "FGFR3", "FGFR4", "FLI1", "FLT3", "FLT4", "FOXA1", "FOXL2", "FOXO1", "FOXO3", "FOXO4", "FOXP1", "FOXR1", "FSTL3", "FUBP1", "GATA1", "GATA2", "GATA3", "GLI1", "GNA11", "GNAQ", "GNAS", "GPC3", "GRM3", "H3F3A", "H3F3B", "HEY1", "HGF", "HIF1A", "HIP1", "HIST1H3B", "HLF", "HMGA1", "HMGA2", "HNRNPA2B1", "HOXA11", "HOXA13", "HOXA9", "HOXC11", "HOXC13", "HOXD11", "HOXD13", "HRAS", "IDH1", "IDH2", "IKBKB", "IKZF3", "IL6ST", "IL7R", "IRF4", "IRS4", "JAK1", "JAK2", "JAK3", "JUN", "KAT6A", "KAT7", "KCNJ5", "KDM5A", "KDM6A", "KDR", "KIT", "KLF4", "KMT2A", "KMT2D", "KNSTRN", "KRAS", "LCK", "LEF1", "LMO1", "LMO2", "LPP", "LYL1", "LYN", "MACC1", "MAF", "MAFB", "MALAT1", "MALT1", "MAML2", "MAP2K1", "MAP2K2", "MAP2K4", "MAP3K1", "MAP3K13", "MAPK1", "MDM2", "MDM4", "MECOM", "MET", "MITF", "MLLT10", "MN1", "MPL", "MRTFA", "MSI2", "MTCP1", "MTOR", "MUC16", "MUC4", "MYB", "MYC", "MYCL", "MYCN", "MYD88", "MYOD1", "NCOA2", "NFATC2", "NFE2L2", "NFKB2", "NKX2-1", "NOTCH1", "NOTCH2", "NPM1", "NR4A3", "NRAS", "NSD2", "NSD3", "NT5C2", "NTRK1", "NTRK2", "NTRK3", "NUP98", "NUTM1", "OLIG2", "P2RY8", "PABPC1", "PAX3", "PAX5", "PBX1", "PDCD1LG2", "PDGFB", "PDGFRA", "PDGFRB", "PIK3CA", "PIK3CB", "PIM1", "PLAG1", "PLCG1", "POLQ", "POU2AF1", "POU5F1", "PPM1D", "PRDM16", "PREX2", "PRKACA", "PRKAR1A", "PSIP1", "PTK6", "PTPN11", "QKI", "RAC1", "RAD21", "RAF1", "RAP1B", "RAP1GDS1", "RARA", "RECQL4", "REL", "RET", "RHOA", "ROS1", "RRAS2", "RSPO3", "RUNX1", "RUNX1T1", "SALL4", "SET", "SETBP1", "SETDB1", "SF3B1", "SGK1", "SH3GL1", "SIX1", "SIX2", "SKI", "SMO", "SND1", "SOX2", "SRC", "SRSF2", "SRSF3", "SSX1", "SSX2", "SSX4", "STAT3", "STAT5B", "STAT6", "STIL", "SUZ12", "SYK", "TAF15", "TAL1", "TAL2", "TBL1XR1", "TBX3", "TCF3", "TCF7L2", "TCL1A", "TEC", "TERT", "TET1", "TFE3", "TFEB", "TLX1", "TLX3", "TMSB4X", "TNC", "TNFRSF17", "TP53", "TP63", "TRIM24", "TRIM27", "TRRAP", "U2AF1", "UBR5", "USP6", "USP8", "WAS", "WT1", "WWTR1", "XPO1", "ZEB1", "ZNF521", "AIP", "DIS3L2", "DLST", "EGLN1", "EGLN2", "EPCAM", "ERCC1", "ERCC6", "FAN1", "FANCB", "FANCI", "FANCL", "FANCM", "GALNT12", "GCM2", "GPR101", "GREM1", "HNF1B", "HOXB13", "KIF1B", "MAD2L2", "MC1R", "MDH2", "MLH3", "MRE11", "MSH3", "MSR1", "NFIX", "POLH", "PTCH2", "RAD51", "RAD51C", "RAD51D", "RECQL", "RPS20", "SCG5", "SLC25A11", "SLX4", "UBE2T", "XRCC2", "XRCC3"), 
("ABI1", "ACVR1B", "ACVR2A", "AMER1", "APC", "APOBEC3B", "ARHGAP26", "ARHGAP35", "ARHGEF10", "ARHGEF10L", "ARHGEF12", "ARID1A", "ARID1B", "ARID2", "ARNT", "ASXL1", "ASXL2", "ATM", "ATP1A1", "ATP2B3", "ATR", "ATRX", "AXIN1", "AXIN2", "B2M", "BAP1", "BARD1", "BAX", "BAZ1A", "BCL10", "BCL11B", "BCL9L", "BCOR", "BCORL1", "BIRC3", "BLM", "BMPR1A", "BRCA1", "BRCA2", "BRIP1", "BTG1", "BTG2", "BTK", "BUB1B", "CAMTA1", "CARS", "CASP3", "CASP8", "CASP9", "CBFA2T3", "CBFB", "CBL", "CBLB", "CCDC6", "CCNB1IP1", "CCNC", "CD274", "CDC73", "CDH1", "CDH10", "CDH11", "CDK12", "CDKN1A", "CDKN1B", "CDKN2A", "CDKN2C", "CDX2", "CEBPA", "CHD2", "CHEK2", "CIC", "CIITA", "CLTC", "CLTCL1", "CNBP", "CNOT3", "CNTNAP2", "CPEB3", "CREB3L1", "CREBBP", "CSMD3", "CTCF", "CTNNA1", "CUL3", "CUX1", "CYLD", "DAXX", "DDB2", "DDX10", "DDX3X", "DICER1", "DNM2", "DNMT3A", "DROSHA", "EBF1", "EED", "EIF3E", "ELF3", "ELF4", "ELL", "EP300", "EPAS1", "EPS15", "ERBB4", "ERCC2", "ERCC3", "ERCC4", "ERCC5", "ESR1", "ETNK1", "ETV6", "EXT1", "EXT2", "EZH2", "FANCA", "FANCC", "FANCD2", "FANCE", "FANCF", "FANCG", "FAS", "FAT1", "FAT4", "FBLN2", "FBXO11", "FBXW7", "FEN1", "FES", "FH", "FHIT", "FLCN", "FOXL2", "FOXO1", "FOXO3", "FOXO4", "FUS", "GATA1", "GATA3", "GPC3", "GPC5", "GRIN2A", "HNF1A", "HOXA11", "HOXA9", "ID3", "IGF2BP2", "IKZF1", "IKZF3", "IRF4", "IRS4", "JAK1", "KAT6B", "KDM5C", "KDM6A", "KEAP1", "KLF4", "KLF6", "KMT2C", "KMT2D", "KNL1", "LARP4B", "LATS1", "LATS2", "LEF1", "LEPROTL1", "LRIG3", "LRP1B", "LZTR1", "MALAT1", "MAP2K4", "MAP3K1", "MAP3K13", "MAX", "MED12", "MEN1", "MGMT", "MLF1", "MLH1", "MRTFA", "MSH2", "MSH6", "MUTYH", "MYH9", "N4BP2", "NAB2", "NBN", "NCOA4", "NCOR1", "NCOR2", "NDRG1", "NF1", "NF2", "NFE2L2", "NFKB2", "NFKBIE", "NKX2-1", "NOTCH1", "NOTCH2", "NRG1", "NTHL1", "NTRK1", "PABPC1", "PALB2", "PATZ1", "PAX5", "PBRM1", "PER1", "PHF6", "PHOX2B", "PIK3R1", "PML", "PMS2", "POLD1", "POLE", "POLG", "POLQ", "POT1", "PPARG", "PPP2R1A", "PPP6C", "PRDM1", "PRDM2", "PRF1", "PRKAR1A", "PTCH1", "PTEN", "PTK6", "PTPN13", "PTPN6", "PTPRB", "PTPRC", "PTPRD", "PTPRK", "PTPRT", "QKI", "RAD17", "RAD21", "RAD50", "RAD51B", "RANBP2", "RB1", "RBM10", "RECQL4", "RFWD3", "RHOA", "RHOH", "RMI2", "RNF43", "ROBO2", "RPL10", "RPL22", "RPL5", "RSPO2", "RUNX1", "RUNX1T1", "RXRA", "SBDS", "SDHA", "SDHAF2", "SDHB", "SDHC", "SDHD", "SETD1B", "SETD2", "SFPQ", "SFRP4", "SH2B3", "SIRPA", "SLC34A2", "SMAD2", "SMAD3", "SMAD4", "SMARCA4", "SMARCB1", "SMARCD1", "SMARCE1", "SMC1A", "SOCS1", "SOX21", "SPEN", "SPOP", "STAG1", "STAG2", "STAT5B", "STK11", "SUFU", "SUZ12", "TBL1XR1", "TBX3", "TCF3", "TENT5C", "TERT", "TET1", "TET2", "TGFBR2", "TMEM127", "TNFAIP3", "TNFRSF14", "TP53", "TP63", "TPM3", "TRAF7", "TRIM24", "TRIM33", "TSC1", "TSC2", "USP44", "VHL", "WIF1", "WNK2", "WRN", "WT1", "XPA", "XPC", "YWHAE", "ZBTB16", "ZFHX3", "ZMYM3", "ZNF331", "ZNRF3", "ZRSR2");

2.10 9. Asociar archivos/fuentes al pipeline

Asociar la Base de Datos al nuevo pipeline, usando como referencia los genes de interés de la guía 5, del pipeline 3

Para conocer las distintas File_ids que hay en los diferentes Pipelines de Delphi podemos hacer la siguiente consulta:

SELECT * FROM Pipeline_File;

Es importante adaptar los valores de estos IDs a la clasificación con la que queremos trabajar. Por ejemplo, el pipeline para Hereditary Cancer tiene el ID: 3. Por lo que solo me interesan los IDs de los archivos asociados a este pipeline en concreto

Para transformar los datos copiados desde la consulta de MYSQL (Copy (Unquoted)) de las filas de interés en este formato que necesitamos para asociar los archivos al ID del pipeline, se puede hacer fácilmente usando el siguiente código de Python:

# Entrada de texto (simulando tu input)
entrada = """
3, 64
3, 65
3, 66
3, 67
3, 68
3, 69
3, 70
3, 71
3, 72
3, 73
3, 74
3, 75
3, 81
3, 82
3, 83
3, 84
3, 85
3, 88
3, 89
3, 90
3, 99
3, 100
3, 207
3, 208
3, 722
3, 723
3, 727
3, 728
3, 729
3, 733
"""

# Procesamiento
lineas = entrada.strip().split('\n')  # Divide por líneas
resultado = ",\n".join([f"({linea.strip()})" for linea in lineas]) + ";"

# Salida
print(resultado)
## (3, 64),
## (3, 65),
## (3, 66),
## (3, 67),
## (3, 68),
## (3, 69),
## (3, 70),
## (3, 71),
## (3, 72),
## (3, 73),
## (3, 74),
## (3, 75),
## (3, 81),
## (3, 82),
## (3, 83),
## (3, 84),
## (3, 85),
## (3, 88),
## (3, 89),
## (3, 90),
## (3, 99),
## (3, 100),
## (3, 207),
## (3, 208),
## (3, 722),
## (3, 723),
## (3, 727),
## (3, 728),
## (3, 729),
## (3, 733);
# Add Sources to Pipeline (EXCLUDING CancerHotSpots id 67)
INSERT IGNORE INTO Pipeline_File (Pipeline_id, File_id) VALUES 
(3, 64),
(3, 65),
(3, 66),
(3, 67),
(3, 68),
(3, 69),
(3, 70),
(3, 71),
(3, 72),
(3, 73),
(3, 74),
(3, 75),
(3, 81),
(3, 82),
(3, 83),
(3, 84),
(3, 85),
(3, 88),
(3, 89),
(3, 90),
(3, 99),
(3, 100),
(3, 207),
(3, 208),
(3, 722),
(3, 723),
(3, 727),
(3, 728),
(3, 729),
(3, 733);