Identifying pseudo-autosomal regions using the Ensembl API

Pseudo-autosomal regions (PAR)

The pseudo-autosomal regions are homologous DNA sequences on the (human) X and Y chromosomes. They allow these sex chromosomes to pair and cross over during meiosis the same way autosomes do. Because these genomic regions are identical between X and Y, they are often stored only once.

To pull out the coordinates of the pseudo-autosomal regions (PARs), you can run the following query on the Ensembl core database:

Code:

SELECT
  (SELECT sr.name FROM seq_region sr WHERE sr.seq_region_id = ae.seq_region_id) AS chrom_1,
  ae.seq_region_start AS start_1,
  ae.seq_region_end AS end_1,
  (SELECT sr.name FROM seq_region sr WHERE sr.seq_region_id = ae.exc_seq_region_id) AS chrom_2,
  ae.exc_seq_region_start AS start_2,
  ae.exc_seq_region_end AS end_2
FROM assembly_exception ae
WHERE ae.exc_type = 'PAR';

For the human database with schema version 61 (assembly GRCh37/hg19), the query returns each PAR on Y together with the location of the corresponding region on X:

+---------+----------+----------+---------+-----------+-----------+
| chrom_1 | start_1  | end_1    | chrom_2 | start_2   | end_2     |
+---------+----------+----------+---------+-----------+-----------+
| Y       |    10001 |  2649520 | X       |     60001 |   2699520 |
| Y       | 59034050 | 59373566 | X       | 154931044 | 155270560 |
+---------+----------+----------+---------+-----------+-----------+

For the old assembly (NCBI36/hg18) you will get:

+---------+----------+----------+---------+-----------+-----------+
| chrom_1 | start_1  | end_1    | chrom_2 | start_2   | end_2     |
+---------+----------+----------+---------+-----------+-----------+
| Y       |        1 |  2709520 | X       |         1 |   2709520 |
| Y       | 57443438 | 57772954 | X       | 154584238 | 154913754 |
+---------+----------+----------+---------+-----------+-----------+

You can alternatively use the API:

Code:

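# $db is assumed to be a DBAdaptor already connected to the human core database
my $aefa  = $db->get_AssemblyExceptionFeatureAdaptor();
my $sa    = $db->get_SliceAdaptor();
my $slice = $sa->fetch_by_region("chromosome", "Y");
# Each feature gives the name of the matching chromosome and its start/end on the slice
foreach my $ae (@{ $aefa->fetch_all_by_Slice($slice) }) {
  print join("\t", $ae->display_id, $ae->start, $ae->end), "\n";
}

For Y this prints:

X	10001	2649520
X	59034050	59373566

or, when the slice is fetched for X instead:

Y	60001	2699520
Y	154931044	155270560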

So to translate from Y to X PAR locations you can use the following for GRCh37 / hg19:

Y 10001 - 2649520      <->  X 60001 - 2699520, band Xp22.33
Y 59034050 - 59373566  <->  X 154931044 - 155270560, band Xq28

and for NCBI36 / hg18:

Y 1 - 2709520          <->  X 1 - 2709520, band Xp22.33
Y 57443438 - 57772954  <->  X 154584238 - 154913754, band Xq28
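
The translation tables above can be turned into a small lookup helper. The following Python sketch is not part of the Ensembl API. The names `PAR_TABLE`, `y_to_x` and `x_to_y` are made up for illustration, and it assumes that positions inside a PAR map between Y and X by a constant offset, which is what the identical interval lengths in the tables suggest.

```python
# PAR intervals copied from the tables above, 1-based and inclusive:
# (y_start, y_end, x_start, x_end)
PAR_TABLE = {
    "GRCh37": [
        (10001, 2649520, 60001, 2699520),            # PAR-1, band Xp22.33
        (59034050, 59373566, 154931044, 155270560),  # PAR-2, band Xq28
    ],
    "NCBI36": [
        (1, 2709520, 1, 2709520),                    # PAR-1, band Xp22.33
        (57443438, 57772954, 154584238, 154913754),  # PAR-2, band Xq28
    ],
}


def y_to_x(pos, assembly="GRCh37"):
    """Return the X position matching a Y PAR position, or None outside the PARs."""
    for y_start, y_end, x_start, _x_end in PAR_TABLE[assembly]:
        if y_start <= pos <= y_end:
            # Assumption: constant offset within each PAR
            return x_start + (pos - y_start)
    return None


def x_to_y(pos, assembly="GRCh37"):
    """Return the Y position matching an X PAR position, or None outside the PARs."""
    for y_start, _y_end, x_start, x_end in PAR_TABLE[assembly]:
        if x_start <= pos <= x_end:
            return y_start + (pos - x_start)
    return None
```

For example, `y_to_x(10001)` gives the first X base of PAR-1 on GRCh37, and a position outside both PARs gives `None`.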

Please note that these coordinates do not agree with the definitions at the GRC and NCBI. The PAR-2 end coordinates differ by 10 kb (chrX: 155,260,560 vs. 155,270,560; chrY: 59,363,566 vs. 59,373,566). The difference is caused by the 10 kb telomeric (gap) region, which needs to be included in the PAR-2 definition to represent this arrangement correctly.
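
The 10 kb offset can be checked with a few lines of arithmetic. This Python sketch uses only the PAR-2 end coordinates quoted above. The names `TELOMERE_GAP`, `PAR2_END_WITH_TELOMERE` and `par2_end_without_telomere` are hypothetical and chosen for illustration only.

```python
# Length of the telomeric gap region mentioned above, in bases
TELOMERE_GAP = 10_000

# PAR-2 end coordinates (GRCh37 / hg19) that include the telomeric gap
PAR2_END_WITH_TELOMERE = {"X": 155270560, "Y": 59373566}


def par2_end_without_telomere(chrom):
    """Return the PAR-2 end coordinate with the 10 kb telomeric gap removed."""
    return PAR2_END_WITH_TELOMERE[chrom] - TELOMERE_GAP
```

Subtracting the gap from either chromosome's end coordinate yields the smaller of the two values quoted for that chromosome.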

See also the telomere and centromere definition notes.
