Human Stem Cell Genome Project

Welcome to the Human Stem Cell Genome Project data portal. This is an online resource for researchers to search, download and visualize high coverage whole genome sequence data generated from more than a hundred publicly available pluripotent cell lines. The overarching goal of this project is to capture the genetic variation across widely used stem cell lines and to enable rational selection of these lines for downstream applications of interest.

To use this resource, users can search across cell lines for sequence genetic variants (point mutations and indels) or structural genetic variants (duplications, deletions, and copy number neutral loss of heterozygosity events). Any of these fields can be interrogated for variants of interest, and each cell line can be studied separately. Data resulting from a search can be downloaded as a CSV file. Due to data sharing restrictions, not all cell lines included in the manuscript accompanying this portal could be included here. If a cell line was not assessed for a certain type of variant, "n/a" is returned. For a brief overview of the function of this portal, please see "Tutorial".
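As a rough illustration of working with a downloaded search result, the sketch below reads a CSV export and treats "n/a" cells as missing values. The column names and the sample rows are hypothetical and chosen only for this example; the portal's actual export header may differ.

```python
import csv
import io

# Hypothetical export: the column names and rows below are assumptions
# for illustration, not the portal's actual CSV layout.
SAMPLE_CSV = """cell_line,gene,variant_type,structural_variant
LINE_A,GENE1,missense,n/a
LINE_B,GENE2,indel,deletion
"""

def load_search_results(handle):
    """Read an exported search CSV, mapping "n/a" cells to None."""
    rows = []
    for row in csv.DictReader(handle):
        cleaned = {}
        for key, value in row.items():
            # "n/a" marks a variant type that was not assessed for the line.
            if value is None or value.strip().lower() == "n/a":
                cleaned[key] = None
            else:
                cleaned[key] = value
        rows.append(cleaned)
    return rows

results = load_search_results(io.StringIO(SAMPLE_CSV))

# Cell lines for which structural variants were not assessed.
not_assessed = [r["cell_line"] for r in results if r["structural_variant"] is None]
print(not_assessed)
```

Mapping "n/a" to None keeps "not assessed" distinct from "assessed, no variant found", which matters when counting variants per line.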

To gain full access to the features included in this portal, please first apply for access. Once you have been approved, please visit our Terra workspace to access sequence data.

Citation: If you use this data portal, please cite our most recent publication

Source code is freely available on GitHub or Zenodo and is free to use. We hope it will be a resource for users who wish to create similar data portals to facilitate the exploration and dissemination of their own data.

Study Summary

Overview

The data below summarises essential information from the 143 human pluripotent stem cell lines included in this study. The number of sequence variants shown below corresponds to predicted missense or loss of function (LOF) variants. Similarly, the number of structural variants refers to large (> 1 Mbp) and small (1 kbp to 1 Mbp) variants that overlap protein coding genes. Where these variants were not ascertained, N/A is returned. To search for details about variants of interest, please see “Primary Data Access” to apply for permission to access raw and analysed data. Once access has been granted, a searchable “Whole Genome” page will become visible. Please see Tutorial to learn more about the functionality of this searchable resource. Briefly, users can search across all cell lines for a particular gene or genetic variant, or within a cell line of interest to ascertain all variants present in that line.
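The two structural variant size classes described above can be expressed as a small helper. This is a minimal sketch: the function name is hypothetical, and treating the 1 kbp and 1 Mbp boundaries as inclusive for the "small" class is an assumption, since the text does not say how boundary values are handled.

```python
KBP = 1_000
MBP = 1_000_000

def classify_structural_variant(length_bp):
    """Return 'large', 'small', or None for a variant length in base pairs.

    Assumption: 'small' includes both endpoints (1 kbp and 1 Mbp);
    'large' is strictly greater than 1 Mbp, as stated in the overview.
    """
    if length_bp > MBP:
        return "large"
    if KBP <= length_bp <= MBP:
        return "small"
    return None  # shorter than 1 kbp: outside both summary size classes

print(classify_structural_variant(2_500_000))
print(classify_structural_variant(50_000))
```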

Note that for all cell lines obtained from University of California Los Angeles, we can only provide summary statistics on this page due to data use limitations.

Objectives

  • Use whole genome sequencing to map the genetic architecture of stem cell lines at the level of sequence and structural variants
  • Catalog inherited and acquired, rare and common variation in stem cell lines
  • Develop a framework for using genetic data to select cell lines for basic and clinical applications

Cell Lines

Cell Line Sample Institution NIH Reg. # Sex Chromosome Genotype Mean Seq. Coverage Median Seq. Coverage Banked? Sequence Variants Structural Variants (all)
CHB10 CHB10_P24_140728 Children's Hospital Corporation 9 XY 32.296 33 true 8956 339
CHB11 CHB11_P25_140722 Children's Hospital Corporation 10 XX 36.149 36 true 8972 345
CHB12 CHB12_P23_140809 Children's Hospital Corporation 11 XX 33.416 34 true 8960 317
CHB4 CHB4_P22_150729 Children's Hospital Corporation 4 XY 33.582 34 true 8834 N/A
CHB5 CHB5_P25_140801 Children's Hospital Corporation 5 XX 32.404 32 true 9223 412
CHB6 CHB6_P37_140722 Children's Hospital Corporation 6 XX 32.132 32 true 9036 366
CHB8 CHB8_P26_140723 Children's Hospital Corporation 7 XX 31.752 32 true 8719 358
CHB9 CHB9_P24_140728 Children's Hospital Corporation 8 XY 27.504 28 true 8961 365
CSES12 CSES12_P14_140616 Cedars-Sinai Medical Center 111 XY 26.662 27 true 8897 453
CSES15 CSES15_P26_140611 Cedars-Sinai Medical Center 114 XY 31.061 31 true 8981 1366
CSES2 CSES2_P18_140611 Cedars-Sinai Medical Center 106 XX 50.256 51 true 8802 323
CSES25 CSES25_P26_150119 Cedars-Sinai Medical Center 122 XX 30.581 30 true 8957 556
CSES4 CSES4_P27_140616 Cedars-Sinai Medical Center 107 XY 33.911 35 true 9119 338
CSES6 CSES6_P15_140809 Cedars-Sinai Medical Center 134 XX 32.13 32 true 8981 361
CSES7 CSES7_P14_140722 Cedars-Sinai Medical Center 108 XX 31.268 31 true 9029 335
CT1 CT1_P40_140809 University of Connecticut Sch. of Med./Dnt. 68 XX 32.707 33 true 9101 331
CT2 CT2_P11_140622 University of Connecticut Sch. of Med./Dnt. 69 XX 35.368 36 true 9097 332
CT3 CT3_P17_140809 University of Connecticut Sch. of Med./Dnt. 70 XX 25.674 26 true 8908 365
CT4 CT4_P15_140622 University of Connecticut Sch. of Med./Dnt. 71 XX 32.94 33 true 8965 342
ESI017 ESI017_P31_140619 BioTime, Inc. 93 XX 40.61 41 true 8898 373
ESI035 ESI035_P36_140617 BioTime, Inc. 129 XX 33.306 34 true 8905 466
ESI049 ESI049_P33_140612 BioTime, Inc. 130 XY 32.181 33 true 9074 341
ESI051 ESI051_P35_140610 BioTime, Inc. 131 XX 31.484 32 true 8835 344
ESI053 ESI053_P28_140611 BioTime, Inc. 132 XX 32.541 33 true 8866 314
Elf1 Elf1_P12_140624 University of Washington 156 XX 31.845 32 true 8889 371
Genea15 Genea15_P20_150110 Genea Biocells 228 XY 26.457 27 true 8773 330
Genea16 Genea16_P17_150104 Genea Biocells 229 XX 29.437 29 true 8884 348
Genea2 Genea2_P19_150119 Genea Biocells 151 XY 26.184 27 true 8794 343
Genea42 Genea42_P19_150107 Genea Biocells 231 XX 29.507 30 true 9301 391
Genea43 Genea43_P12_150104 Genea Biocells 232 XY 26.007 26 true 9173 1449
Genea47 Genea47_P10_150104 Genea Biocells 230 XX 27.771 28 true 8928 305
Genea48 Genea48_P8_150110 Genea Biocells 152 XY 26.979 27 true 8911 3795
Genea52 Genea52_P11_150110 Genea Biocells 234 XY 25.713 26 true 8867 344
Genea57 Genea57_P13_150110 Genea Biocells 233 XX 27.985 28 true 8820 315
HS1001 HS1001_P14 Karolinska Institute NA XY 26.916 26 false 8841 346
HS346 HS346_P25_140704 Karolinska Institute 201 XX 32.438 33 true 8807 331
HS401 HS401_P25_140619 Karolinska Institute 202 XY 30.109 31 true 8853 3342
HS420 HS420_P24_140623 Karolinska Institute 203 XY 32.977 34 true 8713 338
HS975 HS975_P8 Karolinska Institute NA XX 29.182 29 false 9015 363
HS980 HS980_P8 Karolinska Institute NA XX 29.253 29 false 9177 373
HS983a HS983a_P7 Karolinska Institute NA XY 30.873 31 false 9033 385
HS999 HS999_P12 Karolinska Institute NA XX 28.305 28 false 9117 690
HUES42 HUES42_P22_131008 Harvard University 157 XY 31.803 32 true 8933 362
HUES44 HUES44_P16_131008 Harvard University 158 XX 33.355 33 true 8845 299
HUES45 HUES45_P21_131206 Harvard University 76 XX 31.863 32 true 8912 356
HUES48 HUES48_P23_150902 Harvard University 53 XX 36.268 36 true 8869 4832
HUES49 HUES49_P25_130903 Harvard University 54 XX 35.19 35 true 9086 396
HUES53 HUES53_P16_131008 Harvard University 55 XY 25.187 25 true 8748 359
HUES62 HUES62_P18_131007 Harvard University 65 XX 28.444 28 true 8966 326
HUES63 HUES63_P15_140416 Harvard University 66 XY 33.521 34 true 8928 326
HUES64 HUES64_P14_131009 Harvard University 67 XY 28.813 29 true 8871 340
HUES65 HUES65_P15_131027 Harvard University 56 XY 34.053 35 true 9011 340
HUES66 HUES66_P15_131015 Harvard University 57 XX 27.341 27 true 8946 1427
HUES68 HUES68_P27_131112 Harvard University 176 XX 23.289 23 true 8915 853
HUES69 HUES69_P21_130922 Harvard University 178 XX 27.219 27 true 8859 328
HUES70 HUES70_P27_131019 Harvard University 177 XX 26.109 26 true 8844 310
HUES71 HUES71_P28_150210 Harvard University 281 XX 28.906 29 true 9158 6214
HUES72 HUES72_P20_150119 Harvard University 282 XX 28.833 29 true 8916 2210
HUES73 HUES73_P28_150210 Harvard University 283 XY 27.759 28 true 8984 2490
HUES74 HUES74_P7_150201 Harvard University 304 XY 25.709 26 true 8996 343
HUES75 HUES75_P7_150124 Harvard University 280 XX 26.144 26 true 9000 335
I3 I3_P27_150210 Technion R&D Foundation 204 XX 27.265 28 true 9037 297
I4 I4_P29_150210 Technion R&D Foundation 205 XX 27.626 28 true 9028 266
KCL019 KCL019_P8_141010 King's College London 270 XX 30.992 31 false 8870 329
KCL020 KCL020_P30_111121 King's College London 271 XY 28.914 29 false 8868 329
KCL022 KCL022_P22_111209 King's College London 264 XY 29.464 29 false 8867 8057
KCL031 KCL031_P17_120710 King's College London 263 XY 26.884 27 false 8923 322
KCL032 KCL032_P7_141015 King's College London 266 XX 27.101 27 false 8906 346
KCL033 KCL033_P17_120408 King's College London 267 XX 26.94 26 false 8909 339
KCL034 KCL034_P17_131020 King's College London 268 XY 25.662 26 false 8772 331
KCL037 KCL037_P7_141015 King's College London 269 XY 29.83 30 false 8863 322
KCL038 KCL038_P9_141015 King's College London 265 XY 60.914 62 false 8988 327
KCL039 KCL039_P8_141015 King's College London 274 XY 28.798 28 false 8957 349
KCL040 KCL040_P22_120905 King's College London 272 XX 29.378 29 false 8961 344
MShef11 MShef11_P18_140715 University of Sheffield NA XY 26.999 27 false 8937 370
MShef12 MShef12_P16 University of Sheffield NA XX 30.924 31 false 9023 362
MShef13 MShef13_P16_140715 University of Sheffield NA XY 26.463 27 false 8896 352
MShef14 MShef14_P13_130816 University of Sheffield NA XX 30.452 30 false 8767 308
MShef2 MShef2_P18_111121 University of Sheffield NA XX 28.302 28 false 8838 290
MShef3 MShef3_P10_140803 University of Sheffield NA XX 30.139 30 false 8954 2275
MShef4 MShef4_P35_130524 University of Sheffield NA XY 30.212 30 false 8859 341
MShef5 MShef5_P8_140803 University of Sheffield NA XY 24.821 25 false 8852 365
MShef7 MShef7_P13_140715 University of Sheffield NA XY 30.134 30 false 8736 342
MShef8 MShef8_P14_140715 University of Sheffield NA XY 29.31 29 false 8871 218
Man11 Man11_P27 University of Manchester NA XX 26.693 27 false 8831 392
Man12 Man12_P24 University of Manchester NA XY 27.666 28 false 8799 359
Mel1 Mel1_P20_150107 University of Queensland 139 XY 25.96 26 true 8782 331
Mel2 Mel2_P28_150121 University of Queensland 140 XX 26.531 27 true 8989 566
Mel3 Mel3_P38_150121 University of Queensland 141 XX 29.055 29 true 8918 2156
Mel4 Mel4_P37_150119 University of Queensland 142 XX 28.203 28 true 8894 340
Mshef10 Mshef10_P22 University of Sheffield NA XY 30.591 31 false 8851 414
RUES1 RUES1_P25_140730 The Rockefeller University 12 XY 33.097 33 true 8912 392
RUES2 RUES2_P18_170730 The Rockefeller University 13 XX 30.891 31 true 8989 2332
Shef3-2 Shef3-2_P13_140801 University of Sheffield 77 XY 32.263 32 true 8869 359
Shef6-1 Shef6-1_P12_140725 University of Sheffield 78 XX 49.628 50 true 8853 5034
UCLA1 UCLA1_P15_150817 University of California Los Angeles 58 XX 32.071 30 true 8961 N/A
UCLA10 UCLA10_P15_150817 University of California Los Angeles 146 XY 33.473 34 true 9144 N/A
UCLA11 UCLA11_P16_150821 University of California Los Angeles 185 XY 51.72 44 true 8775 N/A
UCLA12 UCLA12_P16_150819 University of California Los Angeles 186 XX 48.328 49 true 8995 N/A
UCLA13 UCLA13_P16_150817 University of California Los Angeles 293 XY 47.479 48 true 9041 N/A
UCLA14 UCLA14_P20_150817 University of California Los Angeles 294 XX 44.79 45 true 9008 N/A
UCLA15 UCLA15_P19_150823 University of California Los Angeles 295 XX 45.304 46 true 8961 N/A
UCLA16 UCLA16_P16_150817 University of California Los Angeles 296 XX 48.243 49 true 8951 N/A
UCLA17 UCLA17_P18_150823 University of California Los Angeles 297 XX 40.265 41 true 8960 N/A
UCLA18 UCLA18_P26_150817 University of California Los Angeles 298 XX 47.789 47 true 8879 N/A
UCLA2 UCLA2_P14_150817 University of California Los Angeles 59 XY 46.456 43 true 8903 N/A
UCLA3 UCLA3_P16_150817 University of California Los Angeles 60 XX 35.986 36 true 9081 N/A
UCLA4 UCLA4_P16_150819 University of California Los Angeles 87 XX 32.598 31 true 9160 N/A
UCLA5 UCLA5_P26_150819 University of California Los Angeles 88 XX 35.975 34 true 8808 N/A
UCLA6 UCLA6_P20_150817 University of California Los Angeles 89 XY 43.127 44 true 8837 N/A
UCLA7 UCLA7_P21_150817 University of California Los Angeles 143 XX 51.041 52 true 8956 N/A
UCLA8 UCLA8_P15_150817 University of California Los Angeles 144 XX 35.998 35 true 8972 N/A
UCLA9 UCLA9_P16_150821 University of California Los Angeles 145 XX 43.273 41 true 8920 N/A
UCSF4 UCSF4_P19_140613 University of California San Francisco 44 XX 28.793 29 true 8999 318
UM121-7 UM121-7_P10_150925 University of Michigan 291 XY 44.911 46 true 8879 1695
UM14-1 UM14-1_P26_150115 University of Michigan 162 XY 27.329 28 true 8765 340
UM22-2 UM22-2_P26_150107 University of Michigan 209 XX 28.397 29 true 8957 374
UM33-4 UM33-4_P25_150107 University of Michigan 279 XX 28.752 29 true 8879 354
UM4-6 UM4-6_P15_150116 University of Michigan 147 XY 31.755 32 true 8960 382
UM77-2 UM77-2_P12_150107 University of Michigan 278 XX 30.007 30 true 9113 331
UM78-2 UM78-2_P13_150922 University of Michigan 288 XY 32.436 32 true 9052 1699
WA01 WA01_P23_140528 WiCell Research Institute 43 XY 29.39 30 true 8809 1125
WA09 WA09_P25_140528 WiCell Research Institute 62 XX 27.878 28 true 9178 365
WA13 WA13_P33_140530 WiCell Research Institute 63 XY 31.771 32 true 8949 341
WA14 WA14_P21_131206 WiCell Research Institute 64 XY 32.657 33 true 8935 352
WA15 WA15_P31_140529 WiCell Research Institute 96 XY 34.231 35 true 8829 343
WA17 WA17_P12_140530 WiCell Research Institute 98 XY 31.759 32 true 8875 314
WA18 WA18_P9_140530 WiCell Research Institute 99 XY 35.953 37 true 8935 361
WA19 WA19_P11_140529 WiCell Research Institute 100 XY 33.876 35 true 8845 308
WA20 WA20_P23_140613 WiCell Research Institute 101 XX 31.841 32 true 8798 5012
WA21 WA21_P28_140528 WiCell Research Institute 102 XY 27.507 28 true 8802 360
WA22 WA22_P14_140530 WiCell Research Institute 103 XX 31.617 32 true 9035 358
WA23 WA23_P20_140606 WiCell Research Institute 104 XY 30.548 31 true 8822 10590
WA24 WA24_P15_140528 WiCell Research Institute 105 XY 28.215 29 true 8768 352
WA25 WA25_P14_140606 WiCell Research Institute 196 XX 26.874 27 true 8847 320
WA26 WA26_P15_140606 WiCell Research Institute 197 XX 24.057 24 true 8852 336
WA27 WA27_P15_140613 WiCell Research Institute 198 XX 31.943 32 true 8982 552
WA7 WA7_P33_140529 WiCell Research Institute 61 XX 32.126 32 true 9097 339
WIBR1 WIBR1_P34_140608 Whitehead Institute for Biomedical Research 74 XX 31.364 32 true 9018 1007
WIBR2 WIBR2_P33_140617 Whitehead Institute for Biomedical Research 75 XX 31.879 32 true 8971 328
WIBR3 WIBR3_P23_140624 Whitehead Institute for Biomedical Research 79 XX 32.186 33 true 8840 342
WIBR5 WIBR5_P13_140611 Whitehead Institute for Biomedical Research 81 XX 32.151 32 true 8787 346
WIBR6 WIBR6_P11_140617 Whitehead Institute for Biomedical Research 82 XX 34.679 35 true 8678 423
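Because institution names in the table contain spaces, splitting a row naively on whitespace does not recover the columns. One possible approach, sketched below, takes the first two tokens as cell line and sample, the last seven tokens as the fixed-width columns, and joins whatever remains as the institution. The class and function names are hypothetical, and the sketch assumes rows are laid out exactly as in the table above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CellLineSummary:
    cell_line: str
    sample: str
    institution: str
    nih_reg: Optional[int]             # "NA" in the table becomes None
    sex_genotype: str
    mean_coverage: float
    median_coverage: float
    banked: bool
    sequence_variants: int
    structural_variants: Optional[int]  # "N/A" in the table becomes None

def _optional_int(token):
    return None if token in ("NA", "N/A") else int(token)

def parse_row(line):
    """Parse one whitespace-separated table row into a CellLineSummary."""
    tokens = line.split()
    if len(tokens) < 10:
        raise ValueError(f"expected at least 10 fields, got {len(tokens)}")
    # The last seven columns never contain spaces, so read them from the right.
    nih, sex, mean, median, banked, seq, sv = tokens[-7:]
    return CellLineSummary(
        cell_line=tokens[0],
        sample=tokens[1],
        institution=" ".join(tokens[2:-7]),
        nih_reg=_optional_int(nih),
        sex_genotype=sex,
        mean_coverage=float(mean),
        median_coverage=float(median),
        banked=(banked == "true"),
        sequence_variants=int(seq),
        structural_variants=_optional_int(sv),
    )

rows = [parse_row(text) for text in (
    "CT1 CT1_P40_140809 University of Connecticut Sch. of Med./Dnt. 68 XX 32.707 33 true 9101 331",
    "HS1001 HS1001_P14 Karolinska Institute NA XY 26.916 26 false 8841 346",
    "UCLA1 UCLA1_P15_150817 University of California Los Angeles 58 XX 32.071 30 true 8961 N/A",
)]
for row in rows:
    print(row.cell_line, "|", row.institution, "|", row.structural_variants)
```

Reading the fixed columns from the right avoids needing a list of known institution names.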

You can exclude results from searches by placing the "!" character before the term.
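The exclusion syntax might behave roughly like the sketch below. Only the "!" prefix comes from the portal text; the case-insensitive substring matching, the requirement that all plain terms match, and the function name are assumptions made for illustration and may not reflect the portal's actual search logic.

```python
def matches(query, text):
    """Return True if text satisfies the query.

    Assumed semantics: terms are space-separated; a plain term must occur
    in the text, and a term prefixed with "!" must not occur in it.
    """
    haystack = text.lower()
    for term in query.lower().split():
        if term.startswith("!"):
            excluded = term[1:]
            if excluded and excluded in haystack:
                return False
        elif term not in haystack:
            return False
    return True

lines = [
    "HUES64 Harvard University",
    "WA09 WiCell Research Institute",
    "HUES66 Harvard University",
]

# Keep Harvard lines, but drop HUES66.
kept = [line for line in lines if matches("harvard !hues66", line)]
print(kept)
```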

Methods (Cell Banking and Analysis)

Tutorial

Please watch this short video tutorial to see some basic features of the portal.

Primary Data Access

Whole genome data are accessible via DUOS, listed as dataset DUOS-000121. If you wish to verify variants identified in this whole genome sequencing dataset against an independently generated whole exome sequencing (WES) dataset covering most of the samples included in this portal, these data are available in controlled access databases (dbGaP accession phs001343.v1.p1; EGA accession EGAS00001002400) and are searchable on the "Whole Genome" page (accessible once logged in).

Contact Us

To obtain access to the primary or analysed data, please request access via DUOS as explained in Primary Data Access. If you have specific questions about the portal architecture or would like to replicate it for your own data, commented code is available on GitHub. If you have specific questions about the analysis or conclusions, please contact one of the corresponding authors at

  • fm436 [at] medschl.cam.ac.uk
  • eggan [at] mcb.harvard.edu
  • mccarroll [at] genetics.med.harvard.edu

Exome Search

Exomes are available via AnVIL (see the data access tab for more information). Exomes are aligned to GRCh38, whereas the whole genomes shown in this application are aligned to GRCh37.