Outline (Day 1 - Nov 15)
• Introduction to Cancer Genomics Data
• Cancer Genomics Data portals
• The cBioPortal for cancer genomics
• Access/Explore/Analyse data through the portal
• Access/Explore/Analyse data using the CGDS-R library
Outline (Day 2 - Nov 16)
• Journal Club
• Cancer molecular heterogeneity
• Disease-model matching: does the genetics matter?
• Cancer Genomics Data portals for tumor models
• The CCLP data portal
• Cell line molecular profiles and drug sensitivity data
• Download and explore cell line and human tumor data for model selection and treatment design
Outline (Day 3 - Nov 17)
• Basic understanding of RNA-Seq data processing.
• Dimensionality reduction.
• Differential expression.
• Introduction to single-cell RNA-Seq
Outline (Day 4 - Nov 18)
• Immune signatures in gene expression data.
• Prediction of immune cell fractions.
• Prediction of peptide-HLA interactions applied to neo-antigens
Hands-on cancer genomics data
Giovanni Ciriello
15/11/16
“Intro to (Computational) Cancer Genomics”
Focus on Data
• What it is
• How to read/interpret
• How to explore
• How to manipulate
Focus on Data Portals
• To download
• To explore
• To analyze
• To integrate
Day 1
• Cancer genomics data (and data generators)
• Data portals for cancer genomics data
• Enabling curated and systematic access
Day 2
• Exploring cancer genomic heterogeneity
• The importance of model selection
• Data portals for cancer models and model feature interrogation
Why?
1. It will save you a lot of time
2. The Human Relevance
3. Disease-model matching
Cancer Genomics
Genetics
• DNA mutations
• Copy Number Alterations
• Translocation (Gene Fusion)
Epigenetics
• DNA Methylation
• Histone methylation
• Chromatin structural changes
Transcriptomics
• miRNA
• lncRNA
• mRNA-seq: gene expression, isoform quantification, splicing junctions
• Single-cell seq
Proteomics
• RPPA
• Mass-spec
• Single-cell phosphoprotein (CyTOF)
Cancer Genomics
Genetics: DNA mutations
• From Sanger to Next Generation Sequencing (Illumina technology)
• Targeted sequencing
• Whole-exome sequencing
• Whole-genome sequencing
Cancer Genomics
Genetics: DNA mutations
[Figure: Targeted, WES, and WGS sequencing compared; axis labels: Coverage, Number of mutations]
Targeted
• Clinical setting
• Selected gene/codon panels
• No need for germline
• High resolution / high depth (500/1000x)
WES
• Research setting
• All coding genes
• International consortia
• Discovery of driver mutations
• Currently, the most widespread type of DNA-seq data
WGS
• Research setting
• Growing availability (ICGC)
• Non-coding mutations
• Structural variants
• Clonality inference
• Challenging interpretability
Cancer Genomics
Genetics: Copy Number Alterations
• Array Comparative Genomic Hybridization (aCGH)
• Affymetrix SNP 6.0 (~2M probes)
• Segments of uniform copy number status
• GISTIC: recurrent copy number alterations
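Segments of uniform copy number status are usually mapped back onto genes before any recurrence analysis. A minimal sketch of that mapping follows. The segment coordinates, log2 ratios, and gene interval are illustrative values, and the rule used here (take the segment that overlaps the gene the most) is a simplifying assumption, not the GISTIC method.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    log2_ratio: float  # tumor/normal copy number ratio, log2 scale


def gene_copy_number(chrom: str, gene_start: int, gene_end: int,
                     segments: List[Segment]) -> Optional[float]:
    """Return the log2 ratio of the segment with the largest overlap, or None."""
    best_overlap = 0
    best_value = None
    for seg in segments:
        if seg.chrom != chrom:
            continue
        # Length of the intersection between gene and segment (<= 0 if disjoint).
        overlap = min(gene_end, seg.end) - max(gene_start, seg.start)
        if overlap > best_overlap:
            best_overlap = overlap
            best_value = seg.log2_ratio
    return best_value


# Hypothetical segmentation of chromosome 7 with one amplified segment.
segments = [
    Segment("7", 1, 50_000_000, 0.02),
    Segment("7", 50_000_001, 60_000_000, 1.8),
    Segment("7", 60_000_001, 159_000_000, -0.05),
]

# Illustrative gene interval falling inside the amplified segment.
gene_value = gene_copy_number("7", 55_000_000, 55_200_000, segments)
print(gene_value)
```

Genes spanning a breakpoint get a single value under this rule, which is one reason real pipelines handle them more carefully.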
Cancer Genomics
Epigenetics: DNA methylation
• Addition of a methyl group to the 5-carbon of cytosine
• Illumina Infinium array 27K → 450K → 800K
• Probing DNA methylation preferentially at CpG promoters, but now also gene body / up-downstream gene regions
$$\beta = \frac{\text{methylated molecules}}{\text{all probed molecules}}$$
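The β value can be computed per probe as a simple ratio. The sketch below assumes that "all probed molecules" is the sum of a methylated and an unmethylated signal; the probe names and intensities are made up, and array pipelines may add an offset to the denominator that is omitted here.

```python
def beta_value(methylated: float, unmethylated: float) -> float:
    """Fraction of probed molecules that are methylated, between 0 and 1."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("no signal for this probe")
    return methylated / total


# Hypothetical probes: (methylated signal, unmethylated signal).
probes = {
    "probe_A": (900.0, 100.0),  # mostly methylated
    "probe_B": (50.0, 950.0),   # mostly unmethylated
}

betas = {name: beta_value(m, u) for name, (m, u) in probes.items()}
print(betas)
```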
Cancer Genomics
Transcriptomics: RNA-seq
• RNA-seq has taken over microarrays
• Statistical analyses often have not
• RNA-seq data: Negative Binomial distribution
• Log-transformation / qq-transformation
  • to mimic a normal distribution and
  • use statistics that assume a normal distribution
[Figure: RNA-seq expression distributions for CDK4 and PTEN]
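As an illustration of the log-transformation step, the sketch below applies log2(count + 1) to a handful of made-up expression values. The pseudocount of 1 is a common convention assumed here so that zero counts stay defined; it is not stated on the slide.

```python
import math
import statistics


def log2_transform(counts, pseudocount=1.0):
    """Apply log2(count + pseudocount) to each value."""
    return [math.log2(c + pseudocount) for c in counts]


# Hypothetical, strongly right-skewed expression values for one gene.
raw = [0, 3, 7, 15, 31, 1023]
logged = log2_transform(raw)

# On the raw scale the mean is pulled far from the median by one large value;
# on the log scale the two are much closer.
print(statistics.mean(raw), statistics.median(raw))
print(statistics.mean(logged), statistics.median(logged))
```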
Cancer Genomics
Proteomics: RPPA (Reverse Phase Protein Array)
• Select antibody panel
• ~120 protein antibodies
• ~60 phospho-protein antibodies
• Readout of signaling/pathway activity
[Figure: AKT pS473 and PTEN (deletion/mutation)]
The genomics revolution (2001)
The cancer genomics revolution
• ICGC (December 2015)
• The Cancer Genome Atlas
Cancer Genomics Data Portals
Genomic Data Commons (GDC)
• Good feature selection for case dataset building
• Good feature selection for data download
• One file per sample: no data matrix download
• Example of MAF file
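A MAF file is a tab-delimited table of mutation calls, so it can be read with a plain CSV parser. The sketch below counts, per gene, how many samples carry a non-silent mutation. The file content is a tiny made-up example restricted to three standard MAF columns (Hugo_Symbol, Variant_Classification, Tumor_Sample_Barcode); the sample barcodes and the choice to skip "Silent" calls are assumptions for illustration.

```python
import csv
import io

# Hypothetical MAF content; real files have many more columns and rows.
maf_text = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n"
    "TP53\tMissense_Mutation\tSAMPLE-01\n"
    "TP53\tNonsense_Mutation\tSAMPLE-02\n"
    "EGFR\tMissense_Mutation\tSAMPLE-01\n"
    "EGFR\tSilent\tSAMPLE-03\n"
)


def mutated_samples_per_gene(handle, skip_classes=("Silent",)):
    """Count distinct samples with a retained mutation in each gene."""
    # Comment lines starting with '#' precede the column header.
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")
    samples = {}
    for row in reader:
        if row["Variant_Classification"] in skip_classes:
            continue
        samples.setdefault(row["Hugo_Symbol"], set()).add(
            row["Tumor_Sample_Barcode"]
        )
    return {gene: len(barcodes) for gene, barcodes in samples.items()}


counts = mutated_samples_per_gene(io.StringIO(maf_text))
print(counts)
```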
• GDC Data Portal
• https://gdc-portal.nci.nih.gov/
• NO data matrix per dataset
• NO analysis capabilities
• NO browsing by gene
• YES controlled access to RAW data
• YES filtering criteria on patients
• YES cross-cohort file search
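Because the download is one file per sample with no data matrix per dataset, a gene-by-sample matrix has to be assembled locally. A minimal sketch, assuming each sample file has already been parsed into a gene-to-value dictionary; the sample names and values are hypothetical.

```python
def build_matrix(per_sample):
    """Merge per-sample {gene: value} dicts into gene rows over sorted samples.

    Genes missing from a sample are filled with None.
    """
    genes = sorted({gene for values in per_sample.values() for gene in values})
    samples = sorted(per_sample)
    rows = {
        gene: [per_sample[sample].get(gene) for sample in samples]
        for gene in genes
    }
    return samples, rows


# Hypothetical parsed content of two per-sample files.
per_sample = {
    "SAMPLE-01": {"EGFR": 12.1, "PTEN": 3.4},
    "SAMPLE-02": {"EGFR": 8.7, "CDK4": 5.0},
}

samples, matrix = build_matrix(per_sample)
print(samples)
for gene, values in matrix.items():
    print(gene, values)
```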
ICGC Data Portal (International Cancer Genome Consortium)
• Browse by project
• Multiple selection mechanisms
• Analytical tools
• Similar case study selection as for GDC
• Case study selection combined with easy bulk data download!
• Specific mutation type and mutation occurrences
• No simple graphical overview
• Not all the information is in the same place
• ICGC Data Portal (official ICGC portal)
• https://dcc.icgc.org/
• Best combination of case study selection and bulk download
• Provides analytical tools
• Limited gene search (one at a time)
• Data/information not always in the same place (convoluted)
FireBrowse (output of the FireHose data analysis pipeline from the Broad Institute)
• Gene Expression Box
• Study Summary Box
• Select cohort for data download
• Index for flat data files
• Index for data analysis files
• FireBrowse (TCGA)
• http://firebrowse.org/
• Basic data repository for TCGA data, plenty of flat files to download
• Plus download of analyses from the AWG
• All analyses are run from pipeline, hence no post-processing
1. Select one or more cancer studies at once
2. Select the data type (this may vary between studies)
3. Select the patient set
4. Query your gene(s) of interest
Handling complexity
• Event call abstraction: an event either occurs or not
• Complete information is organized and easily accessible
• Integrate multiple data types
• Data analysis
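The event call abstraction can be sketched as collapsing several data types into one yes/no call per sample. In the example below a gene counts as altered if it is mutated or carries a discrete copy number call of -2 or 2. Reading -2 as deep deletion and 2 as high-level amplification is an assumed encoding for this sketch, and the cohort is made up.

```python
def is_altered(mutated: bool, cna_call: int) -> bool:
    """Binary event call: the gene is either altered in a sample or not."""
    # Assumed encoding: -2 = deep deletion, 2 = high-level amplification.
    return mutated or cna_call in (-2, 2)


def alteration_frequency(cohort):
    """Return per-sample calls and the fraction of altered samples."""
    calls = {
        sample: is_altered(data["mutated"], data["cna"])
        for sample, data in cohort.items()
    }
    frequency = sum(calls.values()) / len(calls) if calls else 0.0
    return calls, frequency


# Hypothetical calls for one gene across four samples.
cohort = {
    "S1": {"mutated": True, "cna": 0},
    "S2": {"mutated": False, "cna": 2},
    "S3": {"mutated": False, "cna": -1},  # shallow loss, not called as an event
    "S4": {"mutated": False, "cna": 0},
}

calls, frequency = alteration_frequency(cohort)
print(calls, frequency)
```

Integrating more data types under this scheme only means adding further conditions to `is_altered`.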
Cross-studies query/comparisons
• Select multiple cancer studies to query genetic alterations in a specific gene
• Aggregating data to reveal mutational hotspots
• Explore mutation at the structural level
[Figure: EGFR mutations across human cancers]
• Gene-specific expression landscape across cancers
[Figure: EGFR mRNA expression across human cancers]
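Aggregating data across studies to reveal mutational hotspots amounts to counting how often each protein position is hit once all studies are pooled. A minimal sketch, with hypothetical study names and a made-up list of mutated positions; the threshold `min_count` is an arbitrary illustration, not a statistical test.

```python
from collections import Counter


def hotspots(mutations, min_count=2):
    """Pool (study, protein position) records and keep recurrent positions."""
    counts = Counter(position for _study, position in mutations)
    return {pos: n for pos, n in counts.items() if n >= min_count}


# Hypothetical mutation records for a single gene, pooled from three studies.
mutations = [
    ("study_A", 858),
    ("study_B", 858),
    ("study_C", 289),
    ("study_A", 289),
    ("study_B", 289),
    ("study_C", 100),
]

recurrent = hotspots(mutations)
print(recurrent)
```

A position seen once in each of several studies only stands out after pooling, which is the point of the cross-study query.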