Genome Browser


keying0520


Article link: https://blog.csdn.net/keying0520/article/details/6476436


The Genome Browser provides dozens of aligned annotation tracks that have been computed at UCSC or have been provided by outside collaborators. In addition to these standard tracks, it is also possible for users to upload their own annotation data for temporary display in the browser. These custom annotation tracks are viewable only on the machine from which they were uploaded and are automatically discarded 48 hours after the last time they are accessed. Optionally, users can make custom annotations viewable by others as well.

Custom tracks are a wonderful tool for research scientists using the Genome Browser. Because space is limited in the Genome Browser track window, many excellent genome-wide tracks cannot be included in the standard set of tracks packaged with the browser. Other tracks of interest may be excluded from distribution because the annotation track data is too specific to be of general interest or can't be shared until journal publication. Many individuals and labs have contributed custom tracks to the Genome Browser website for use by others. To view a list of these custom annotation tracks, click the Custom Tracks link on the Genome Browser home page.

Custom annotation tracks are similar to standard tracks, but never become part of the MySQL genome database. Each track has its own controller and persists even when not displayed in the Genome Browser window, e.g. if the position changes to a range that no longer includes the track. Typically, custom annotation tracks are aligned under corresponding genomic sequence, but they can also be completely unrelated to the data. For example, a track can be displayed under a long sequence consisting of millions of Ns.

Genome Browser annotation tracks are based on files in line-oriented format. Each line in the file defines a display characteristic for the track or defines a data item within the track. Annotation files contain three types of lines: browser lines, track lines, and data lines. Empty lines and those starting with "#" are ignored.
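The three line types can be told apart mechanically. The following Python sketch is only an illustration of the rules just described, not the Genome Browser's own parser; the function name classify_line is hypothetical, and treating a line as a browser or track line purely by its first word (and tolerating leading whitespace before "#") is an assumption.

```python
def classify_line(line: str) -> str:
    """Return 'ignored', 'browser', 'track', or 'data' for one annotation-file line."""
    stripped = line.strip()
    # Empty lines and comment lines carry no information.
    if not stripped or stripped.startswith("#"):
        return "ignored"
    first_word = stripped.split(None, 1)[0]
    if first_word == "browser":
        return "browser"
    if first_word == "track":
        return "track"
    # Anything else is assumed to be a data line in one of the supported formats.
    return "data"


sample = [
    "# my custom track",
    "browser position chr22:20100000-20100900",
    'track name=coords description="Chromosome coordinates list" visibility=2',
    "",
    "chr22\t20100000\t20100100",
]
kinds = [classify_line(line) for line in sample]
print(kinds)
```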

To construct an annotation file and display it in the Genome Browser, follow these steps:

Step 1. Format the data set
Formulate your data set as a tab-separated file using one of the formats supported by the Genome Browser. Annotation data can be in standard GFF format or in a format designed specifically for the Human Genome Project or UCSC Genome Browser, including BEDGRAPH, GTF, PSL, BED, bigBed, WIG, bigWig, MAF, and microarray (BED15). GFF and GTF files must be tab-delimited rather than space-delimited to display correctly. Chromosome references must be of the form chrN (the parsing of chromosome names is case-sensitive). You may include more than one data set in your annotation file; these need not be in the same format.

Step 2. Define the Genome Browser display characteristics
Add one or more optional browser lines to the beginning of your formatted data file to configure the overall display of the Genome Browser when it initially shows your annotation data. Browser lines allow you to configure such things as the genome position that the Genome Browser will initially open to, the width of the display, and the configuration of the other annotation tracks that are shown (or hidden) in the initial display. NOTE: If the browser position is not explicitly set in the annotation file, the initial display will default to the position setting most recently used by the user, which may not be an appropriate position for viewing the annotation track.

Step 3. Define the annotation track display characteristics
Following the browser lines--and immediately preceding the formatted data--add a track line to define the display attributes for your annotation data set. Track lines enable you to define annotation track characteristics such as the name, description, colors, initial display mode, use score, etc. If you have included more than one data set in your annotation file, insert a track line at the beginning of each new set of data.

Example 1:
Here is an example of a simple annotation file that contains a list of chromosome coordinates.

browser position chr22:20100000-20100900

track name=coords description="Chromosome coordinates list" visibility=2

chr22 20100000 20100100

chr22 20100011 20100200

chr22 20100215 20100400

chr22 20100350 20100500

chr22 20100700 20100800

chr22 20100700 20100900


Example 2:
Here is an example of an annotation file that defines 2 separate annotation tracks in BED format. The first track displays blue one-base tick marks every 10000 bases on chr 22. The second track displays red 100-base features alternating with blank space in the same region of chr 22.

browser position chr22:20100000-20140000

track name=spacer description="Blue ticks every 10000 bases" color=0,0,255

chr22 20100000 20100001

chr22 20110000 20110001

chr22 20120000 20120001

track name=even description="Red ticks every 100 bases, skip 100" color=255,0,0

chr22 20100000 20100100 first

chr22 20100200 20100300 second

chr22 20100400 20100500 third
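A regularly spaced file like Example 2 is tedious to type by hand and easy to generate. The Python sketch below rebuilds a similar two-track annotation file as a string; the helper names bed_track and tick_rows are hypothetical, and unlike Example 2 the second track here omits the optional name column.

```python
def bed_track(name, description, color, rows):
    """Return the lines of one track: a track line followed by tab-separated BED rows."""
    r, g, b = color
    lines = [f'track name={name} description="{description}" color={r},{g},{b}']
    lines += ["\t".join(str(field) for field in row) for row in rows]
    return lines


def tick_rows(chrom, start, end, step, width):
    """Yield (chrom, chromStart, chromEnd) features of `width` bases every `step` bases."""
    for pos in range(start, end, step):
        yield (chrom, pos, pos + width)


lines = ["browser position chr22:20100000-20140000"]
# Blue one-base ticks every 10000 bases.
lines += bed_track("spacer", "Blue ticks every 10000 bases", (0, 0, 255),
                   tick_rows("chr22", 20100000, 20130000, 10000, 1))
# Red 100-base features alternating with 100 bases of blank space.
lines += bed_track("even", "Red ticks every 100 bases, skip 100", (255, 0, 0),
                   tick_rows("chr22", 20100000, 20100600, 200, 100))
annotation_text = "\n".join(lines) + "\n"
print(annotation_text)
```

The resulting text can be saved to a file or pasted into the data box in the same way as a hand-written annotation file.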


Example 3:
This example shows an annotation file containing one data set in BED format. The track displays features with multiple blocks, a thick end and thin end, and hatch marks indicating the direction of transcription. The track labels display in green (0,128,0), and the gray level of each feature reflects the score value of that line. NOTE: The track name line in this example has been split over 2 lines for documentation purposes. If you paste this example into the Genome Browser, you must remove the line break to display the track successfully. Click here for a copy of this example that can be pasted into the browser without editing.

browser position chr22:1000-10000

browser hide all

track name="BED track" description="BED format custom track example" visibility=2

color=0,128,0 useScore=1

chr22 1000 5000 itemA 960 + 1100 4700 0 2 1567,1488, 0,2512

chr22 2000 7000 itemB 200 - 2200 6950 0 4 433,100,550,1500 0,500,2000,3500


Step 4. Display your annotation track in the Genome Browser
To view your annotation data in the Genome Browser, open the Genome Browser home page and click the Genome Browser link in the top menu bar. On the Gateway page that displays, select the genome and assembly on which your annotation data is based, then click the "add custom tracks" button. (Note: if the Gateway displays the "manage custom tracks" button instead, see Displaying and Managing Custom Tracks for information on how to display your track.)

On the Add Custom Tracks page, load the annotation track data or URL for your custom track into the upper text box and the track documentation (optional) into the lower text box, then click the Submit button. Tracks may be loaded by entering text, a URL, or a pathname on your local computer. For more information on these methods, as well as information on creating and adding track documentation, see Loading a Custom Track into the Genome Browser.

If you encounter difficulties displaying your annotation, read the section Troubleshooting Annotation Display Problems.

Step 5. (Optional) Add details pages for individual track features
After you've constructed your track and have successfully displayed it in the Genome Browser, you may wish to customize the details pages for individual track features. The Genome Browser automatically creates a default details page for each feature in the track containing the feature's name, position information, and a link to the corresponding DNA sequence. To view the details page for a feature in your custom annotation track (in full, pack, or squish display mode), click on the item's label in the annotation track window.

You can add a link from a details page to an external web page containing additional information about the feature by using the track line url attribute. In the annotation file, set the url attribute in the track line to point to a publicly available page on a web server. The url attribute substitutes each occurrence of '$$' in the URL string with the name defined by the name attribute. You can take advantage of this feature to provide individualized information for each feature in your track by creating HTML anchors that correspond to the feature names in your web page.
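The '$$' substitution is a plain text replacement. As a rough illustration (the function name details_url is hypothetical, and the sketch assumes the feature name is inserted as-is, without any URL encoding), the link for one feature could be derived like this, using the URL from Example 4:

```python
def details_url(url_template: str, item_name: str) -> str:
    """Replace every '$$' in the track-line url value with the feature's name."""
    return url_template.replace("$$", item_name)


template = "http://genome.ucsc.edu/goldenPath/help/clones.html#$$"
link = details_url(template, "cloneA")
print(link)
```

Because the name lands after the '#', the target page needs an anchor with exactly that name for the link to jump to the right entry.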

Example 4:
Here is an example of a file in which the url attribute has been set to point to the file http://genome.ucsc.edu/goldenPath/help/clones.html. The '#$$' appended to the end of the file name in the example points to the HTML NAME tag within the file that matches the name of the feature (cloneA, cloneB, etc.). NOTE: The track line in this example has been split over 2 lines for documentation purposes. If you paste this example into the browser, you must remove the line break to display the track successfully. Click here for a copy of this example that can be pasted into the browser without editing.

browser position chr22:10000000-10020000

browser hide all

track name=clones description="Clones" visibility=2

color=0,128,0 useScore=1

url="http://genome.ucsc.edu/goldenPath/help/clones.html#$$"

chr22 10000000 10004000 cloneA 960

chr22 10002000 10006000 cloneB 200

chr22 10005000 10009000 cloneC 700

chr22 10006000 10010000 cloneD 600

chr22 10011000 10015000 cloneE 300

chr22 10012000 10017000 cloneF 100


Step 6. (Optional) Share your annotation track with others
The previous steps showed you how to upload annotation data for your own use on your own machine. However, many users would like to share their annotation data with members of their research group on different machines or with colleagues at other sites. To learn how to make your Genome Browser annotation track viewable by others, read the section Sharing Your Annotation Track with Others.

Loading a Custom Track into the Genome Browser

Using the Genome Browser's custom track upload and management utility, annotation tracks may be added for display in the Genome Browser, deleted from the Genome Browser, or updated with new data and/or display options. You may also use this interface to upload and manage custom track sets for multiple genome assemblies.

To load a custom track into the Genome Browser:

Step 1. Open the Add Custom Tracks page
Click the "add custom tracks" button on the Genome Browser Gateway page. (Note: if one or more tracks have already been uploaded during the current Browser session, additional tracks may be loaded on the Manage Custom Tracks page. In this case, the button on the Gateway page will be labeled "manage custom tracks" and will automatically direct you to the track management page. See Displaying and Managing Custom Tracks for more information.)

Step 2. Load the custom track data
The Add Custom Tracks page contains separate sections for uploading custom track data and optional custom track descriptive documentation. Load the annotation data into the upper section by one of the following methods:

Enter one or more URLs for custom tracks (one per line) in the data text box. The Genome Browser supports both the HTTP and FTP (passive-only) protocols.
Click the "Browse" button directly above the data text box, then choose a custom track file from your local computer, or type the pathname of the file into the "upload" text box adjacent to the "Browse" button. The custom track data may be compressed by any of the following programs: gzip (.gz), compress (.Z), or bzip2 (.bz2). Files containing compressed data must include the appropriate suffix in their names.
Paste or type the custom track data directly into the data box. Because the text in this box will not be saved to a file, this method is not recommended unless you have a copy of the data elsewhere.
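For large files, compressing before upload saves transfer time. The sketch below writes a small annotation file with Python's standard gzip module and reads it back to confirm the round trip; the file name coords.bed.gz and the temporary directory are placeholders chosen for illustration. The only requirement taken from the text above is that the name ends in the suffix matching the compression program (.gz here).

```python
import gzip
import os
import tempfile

annotation_text = (
    'track name=coords description="Chromosome coordinates list" visibility=2\n'
    "chr22\t20100000\t20100100\n"
    "chr22\t20100011\t20100200\n"
)

out_dir = tempfile.mkdtemp()
# Hypothetical file name; the .gz suffix tells the browser how the data is compressed.
path = os.path.join(out_dir, "coords.bed.gz")

# "wt" opens the gzip stream in text mode so a str can be written directly.
with gzip.open(path, "wt", encoding="ascii") as handle:
    handle.write(annotation_text)

# Read the file back to check that it decompresses to the original text.
with gzip.open(path, "rt", encoding="ascii") as handle:
    round_trip = handle.read()

print(path, round_trip == annotation_text)
```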

Multiple custom tracks may be uploaded at one time on the Add Custom Tracks page through one of the following methods:

Put all the tracks into the same file (rather than separate files), then load the file via the Browse button.
Place your track files in a web-accessible location on your server, then load them into the Genome Browser by pasting their URLs into the data box.

Step 3. (Optional) Load the custom track description page
If desired, you can provide optional descriptive text (in plain or HTML format) to accompany your custom track. This text will be displayed when a user clicks the track's description button on the Genome Browser annotation tracks page. Descriptive text may be loaded by one of the following methods:

Click the "Browse" button directly above the documentation text box, then choose a text file from your local computer, or type the pathname of the file into the "upload" text box adjacent to the "Browse" button.
Paste or type the descriptive text directly into the documentation text box. Note that the text in this box will not be saved to a file; therefore, this method is not recommended except for temporary documentation purposes.
If your descriptive text is located on a website, you can reference it from your custom track file by defining the track line attribute "htmlUrl": htmlUrl=<external_url>. In this case, there is no need to insert anything into the documentation text box.

To format your description page in a style that is consistent with standard Genome Browser tracks, click the template link below the documentation text box for an HTML template that may be copied and pasted into a file for editing.

If you load multiple custom tracks simultaneously using one of the methods described in Step 2, a track description can be associated only with the last custom track loaded, unless you upload the descriptive text using the track line "htmlUrl" attribute described above.

Step 4. Upload the track
Click the Submit button to load your custom track data and documentation into the Genome Browser. If the track uploads successfully, you will be directed to the custom track management page where you can display your track, update an uploaded track, add more tracks, or delete uploaded tracks. If the Genome Browser encounters a problem while loading your track, it will display an error. See the section Troubleshooting Annotation Display Problems for help in diagnosing custom track problems.

Displaying and Managing Custom Tracks

After a custom track has been successfully loaded into the Genome Browser, you can display it -- as well as manage your entire custom track set -- via the options on the Manage Custom Tracks page. This page automatically displays when a track has been uploaded into the Genome Browser (see Loading a Custom Track into the Genome Browser). Alternatively, you can access the track management page by clicking the "manage custom tracks" button on the Gateway or Genome Browser annotation tracks pages. (Note that the track management page is available only if at least one track has been loaded during the current browser session; otherwise, this button is labeled "add custom tracks" and opens the Add Custom Tracks page.)

The table on the Manage Custom Tracks page shows the current set of uploaded custom tracks for the genome and assembly specified at the top of the page. If tracks have been loaded for more than one genome assembly, pulldown lists are displayed; to view the uploaded tracks for a different assembly, select the desired genome and assembly option from the lists.

The following track information is displayed in the Manage Custom Tracks table:

Name: a hyperlink to the Update Custom Track page where you can update your track configuration and data.
Description: the value of the "description" attribute from the track line, if present. If no description is included in the input file, this field contains the track name.
Type: the track type, determined by the Browser based on the format of the data.
Doc: displays "Y" (Yes) if a description page has been uploaded for the track; otherwise the field is blank.
Items: the number of data items in the custom track file. An item count is not displayed for tracks lacking individual items (e.g. wiggle format data).
Pos: the default chromosomal position defined by the track file in either the browser line "position" attribute or the first data line. Click this link to open the Genome Browser or Table Browser at the specified position (Note: only the chromosome name is shown in this column). The Pos column remains blank if the track lacks individual items (e.g. wiggle format data) and the browser line "position" attribute hasn't been set.

Displaying a custom track in the Genome Browser
Click the "go to genome browser" button to display the entire custom track set for the specified genome assembly in the Genome Browser. By default, the browser will open to the position specified in the browser line "position" attribute or first data line of the first custom track in the table, or the last-accessed Genome Browser position if the track is in wiggle data format. To open the display at the default position for another track in the list, click the track's position link in the Pos column.

Viewing a custom track in the Table Browser
Click the "go to table browser" button to access the data for the custom track set in the Table Browser. The custom tracks will be listed in the "Custom Tracks" group pulldown list.

Loading additional custom tracks
To load a new custom track into the currently displayed track set, click the "add custom tracks" button. To change the genome assembly to which the track should be added, select the appropriate options from the pulldown lists at the top of the page. For instructions on adding a custom track on the Add Custom Tracks page, see Loading a Custom Track into the Genome Browser.

Removing one or more custom tracks
To remove custom tracks from the uploaded track set, click the checkboxes in the "delete" column for all tracks you wish to remove, then click the "delete" button. A custom track may also be removed by clicking the "Remove custom track" button on the track's description page. Note: removing the track from the Genome Browser does not delete the track file from your server or local disk.

Updating a custom track
To update the stored information for a loaded custom track, click the track's link in the "Name" column in the Manage Custom Tracks table. A custom track may also be updated by clicking the "Update custom track" button on the track's description page.

The Update Custom Track page provides sections for modifying the track configuration information (the browser lines and track lines), the annotation data, and the descriptive documentation that accompanies the track. Existing track configuration lines are displayed in the top "Edit configuration" text box. In the current implementation of this utility, the existing annotation data is not displayed. Because of this, the data cannot be incrementally edited through this interface, but instead must be fully replaced using one of the data entry methods described in Loading a Custom Track into the Genome Browser. If description text has been uploaded for the track, it will be displayed in the track documentation edit box, where it may be edited or completely replaced. Once you have completed your updates, click the Submit button to upload the new data into the Genome Browser.

If the data or description text for your custom track was originally loaded from a file on your hard disk or server, you should first edit the file, then reload it from the Update Custom Track page using the "Browse" button. Note that edits made on this page to description text uploaded from a file will not be saved to the original file on your computer or server. Because of this, we recommend that you use the documentation edit box only for changes made to text that was typed or pasted in.

Browser Lines

Browser lines configure the overall display of the Genome Browser window when your annotation file is uploaded. Each line defines one display attribute. Browser lines have the format:

browser attribute_name attribute_value(s)

For example, if the browser line browser position chr22:1-20000 is included in the annotation file, the Genome Browser window will initially display the first 20000 bases of chr 22.

The following browser line attribute name/value options are available. The value track_name must be set to the name of the primary table on which the track is based. To identify this table, open up the Table Browser, select the correct genome assembly, then select the track name from the track list. The table list will show the primary table. Alternatively, the primary table name can be obtained from a mouseover on the track name in the track control section.

Note that composite track subtracks are not valid track_name values. To find the symbolic name of a composite track, look in the tableName field in the trackDb table, or mouseover the track name in the track control section. It is not possible to display only a subset of the subtracks at this time.

position <position> - Determines the part of the genome that the Genome Browser will initially open to, in chromosome:start-end format.
pix <width> - Sets the width of the Genome Browser display image, in pixels.
hide all - Hides all annotation tracks except for those listed in the custom track file.
hide <track_name(s)> - Hides the listed tracks. Multiple track names should be space-separated.
dense all - Displays all tracks in dense mode. NOTE: Use the "all" option cautiously. If the browser display includes a large number of tracks or a large position range, this option may overload your browser's resources and cause an error or timeout.
dense <track_name(s)> - Displays the specified tracks in dense mode. Symbolic names must be used. Multiple track names should be space-separated.
pack all - Displays all tracks in pack mode. See NOTE for "dense all".
pack <track_name(s)> - Displays the specified tracks in pack mode. Symbolic names must be used. Multiple track names should be space-separated.
squish all - Displays all tracks in squish mode. See NOTE for "dense all".
squish <track_name(s)> - Displays the specified tracks in squish mode. Symbolic names must be used. Multiple track names should be space-separated.
full all - Displays all tracks in full mode. See NOTE for "dense all".
full <track_name(s)> - Displays the specified tracks in full mode. Symbolic names must be used. Multiple track names should be space-separated.

Note that the Genome Browser will open to the range defined in the Gateway page position or search term box or the position saved as the default unless the browser line position attribute is defined in the annotation file. Although this attribute is optional, it's recommended that you set this value in your annotation file to ensure that the track will appear in the display range when it is uploaded into the Genome Browser.
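One way to follow this recommendation is to derive the position line from the data itself. The Python sketch below is an illustration under stated assumptions: the helper name browser_position_line is hypothetical, all items are assumed to lie on one chromosome, and the start and end values are copied straight from the data lines, as Example 1 does.

```python
def browser_position_line(rows):
    """Build a 'browser position' line spanning all (chrom, start, end) rows."""
    chroms = {chrom for chrom, _, _ in rows}
    if len(chroms) != 1:
        raise ValueError("rows must all be on the same chromosome")
    chrom = chroms.pop()
    start = min(row_start for _, row_start, _ in rows)
    end = max(row_end for _, _, row_end in rows)
    return f"browser position {chrom}:{start}-{end}"


# The data lines from Example 1.
rows = [
    ("chr22", 20100000, 20100100),
    ("chr22", 20100011, 20100200),
    ("chr22", 20100215, 20100400),
    ("chr22", 20100350, 20100500),
    ("chr22", 20100700, 20100800),
    ("chr22", 20100700, 20100900),
]
position_line = browser_position_line(rows)
print(position_line)
```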

Track Lines

Track lines define the display attributes for all lines in an annotation data set. If more than one data set is included in the annotation file, each group of data must be preceded by a track line that describes the display characteristics for that set of data. A track line begins with the word track, followed by one or more attribute=value pairs. Unlike browser lines - in which each attribute is defined on a separate line - all of the track attributes for a given set of data are listed on one line with no line breaks. The inadvertent insertion of a line break into a track line will generate an error when you attempt to upload the annotation track into the Genome Browser.

The following track line attribute=value pairs are defined in the Genome Browser:

name=<track_label> - Defines the track label that will be displayed to the left of the track in the Genome Browser window, and also the label of the track control at the bottom of the screen. The name can consist of up to 15 characters, and must be enclosed in quotes if the text contains spaces. We recommend that the track_label be restricted to alpha-numeric characters and spaces to avoid potential parsing problems. The default value is "User Track".
description=<center_label> - Defines the center label of the track in the Genome Browser window. The description can consist of up to 60 characters, and must be enclosed in quotes if the text contains spaces. The default value is "User Supplied Track".
visibility=<display_mode> - Defines the initial display mode of the annotation track. Values for display_mode include: 0 - hide, 1 - dense, 2 - full, 3 - pack, and 4 - squish. The numerical values or the words can be used, i.e. full mode may be specified by "2" or "full". The default is "1".
color=<RRR,GGG,BBB> - Defines the main color for the annotation track. The track color consists of three comma-separated RGB values from 0-255. The default value is 0,0,0 (black).
itemRgb=On - If this attribute is present and is set to "On", the Genome Browser will use the RGB value shown in the itemRgb field in each data line of the associated BED track to determine the display color of the data on that line.
altColor=<RRR,GGG,BBB> - Defines the secondary color for the track. The alternate color consists of three comma-separated RGB values from 0-255. The default is a lighter shade of whatever the color attribute is set to.
useScore=<use_score> - If this attribute is present and is set to 1, the score field in each of the track's data lines will be used to determine the level of shading in which the data is displayed. The track will display in shades of gray unless the color attribute is set to 100,50,0 (shades of brown) or 0,60,120 (shades of blue). The default setting for useScore is "0". This table shows the Genome Browser's translation of BED score values into shades of gray:

    Score range: ≤ 166 | 167-277 | 278-388 | 389-499 | 500-611 | 612-722 | 723-833 | 834-944 | ≥ 945
    Shade: nine gray levels, lightest for the lowest range and darkest for the highest.
group=<group> - Defines the annotation track group in which the custom track will display in the Genome Browser window. By default, group is set to "user", which causes custom tracks to display at the top of the window.
priority=<priority> - When the group attribute is set, defines the display position of the track relative to other tracks within the same group in the Genome Browser window. If group is not set, the priority attribute defines the track's order relative to other custom tracks displayed in the default group, "user".
db=<UCSC_assembly_name> - When set, indicates the specific genome assembly for which the annotation data is intended; the custom track manager will display an error if a user attempts to load the track onto a different assembly. Any valid UCSC assembly ID may be used (e.g. hg18, mm8, felCat1). The default setting is blank, allowing the custom track to be displayed on any assembly.
offset=<offset> - Defines a number to be added to all coordinates in the annotation track. The default is "0".
url=<external_url> - Defines a URL for an external link associated with this track. This URL will be used in the details page for the track. Any '$$' in this string will be substituted with the item name. There is no default for this attribute.
htmlUrl=<external_url> - Defines a URL for an HTML description page to be displayed with this track. There is no default for this attribute. A template for a standard format HTML track description is here.
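Since a track line is a single row of attribute=value pairs with optional quoting, it can be checked before upload. The Python sketch below uses the standard shlex module to split the line while honoring quotes, then applies the length limits and visibility values listed above. The function names parse_track_line and check_track_attrs are hypothetical, and the Genome Browser's own parser may accept or reject lines differently.

```python
import shlex

VISIBILITY_NAMES = {"0": "hide", "1": "dense", "2": "full", "3": "pack", "4": "squish"}


def parse_track_line(line):
    """Split a track line into a dict of attribute -> value, honoring quoted values."""
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")
    attrs = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected attribute=value, got {token!r}")
        attrs[key] = value
    return attrs


def check_track_attrs(attrs):
    """Return a list of problems based on the limits described in the list above."""
    problems = []
    if len(attrs.get("name", "")) > 15:
        problems.append("name is longer than 15 characters")
    if len(attrs.get("description", "")) > 60:
        problems.append("description is longer than 60 characters")
    visibility = attrs.get("visibility", "1")
    if visibility not in VISIBILITY_NAMES and visibility not in VISIBILITY_NAMES.values():
        problems.append(f"unknown visibility {visibility!r}")
    return problems


# The track line from Example 3, joined back onto one line.
line = ('track name="BED track" description="BED format custom track example" '
        "visibility=2 color=0,128,0 useScore=1")
attrs = parse_track_line(line)
print(attrs, check_track_attrs(attrs))
```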

BED Lines

BED format provides a flexible way to define the data lines that are displayed in an annotation track. BED lines have three required fields and nine additional optional fields. The number of fields per line must be consistent throughout any single set of data in an annotation track. The order of the optional fields is binding: lower-numbered fields must always be populated if higher-numbered fields are used.

If your data set is BED-like, but it is very large and you would like to keep it on your own server, you should use the bigBed data format.

The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
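Because chromStart counts from 0 and chromEnd is excluded, coordinates taken from a source that numbers bases from 1 and includes both endpoints need to be shifted before they are written as BED. A minimal Python sketch of that conversion follows; the function names are hypothetical.

```python
def one_based_to_bed(first_base: int, last_base: int):
    """Convert a 1-based inclusive range to BED (chromStart, chromEnd)."""
    if first_base < 1 or last_base < first_base:
        raise ValueError("expected 1 <= first_base <= last_base")
    # Only the start shifts; the inclusive 1-based end equals the exclusive 0-based end.
    return first_base - 1, last_base


def bed_length(chrom_start: int, chrom_end: int) -> int:
    """Number of bases covered by a BED feature."""
    return chrom_end - chrom_start


start, end = one_based_to_bed(1, 100)  # the first 100 bases of a chromosome
print(start, end, bed_length(start, end))
```

A convenient consequence of this convention is that the feature length is simply chromEnd minus chromStart, with no "+1" correction.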

The 9 additional optional BED fields are:

name - Defines the name of the BED line. This label is displayed to the left of the BED line in the Genome Browser window when the track is open to full display mode or directly to the left of the item in pack mode.
score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). This table shows the Genome Browser's translation of BED score values into shades of gray:

    Score range: ≤ 166 | 167-277 | 278-388 | 389-499 | 500-611 | 612-722 | 723-833 | 834-944 | ≥ 945
    Shade: nine gray levels, lightest for the lowest range and darkest for the highest.
strand - Defines the strand - either '+' or '-'.
thickStart - The starting position at which the feature is drawn thickly (for example, the start codon in gene displays).
thickEnd - The ending position at which the feature is drawn thickly (for example, the stop codon in gene displays).
itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0). If the track line itemRgb attribute is set to "On", this RGB value will determine the display color of the data contained in this BED line. NOTE: It is recommended that a simple color scheme (eight colors or fewer) be used with this attribute to avoid overwhelming the color resources of the Genome Browser and your Internet browser.
blockCount - The number of blocks (exons) in the BED line.
blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.

See Example 3 for a demonstration of a custom track that uses a complete BED12 definition.
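The block fields and the score shading can be worked through in a few lines of Python. The sketch below is illustrative only: parse_blocks and shade_bin are hypothetical helper names, the nine bins are taken from the score ranges listed above, and numbering the bins 0 (lightest) to 8 (darkest) is a convention chosen here.

```python
from bisect import bisect_left

# Upper bounds of the first eight score ranges; scores of 945 and above fall in the last bin.
SCORE_UPPER_BOUNDS = [166, 277, 388, 499, 611, 722, 833, 944]


def shade_bin(score: int) -> int:
    """Return the shade bin for a BED score: 0 is the lightest gray, 8 the darkest."""
    return bisect_left(SCORE_UPPER_BOUNDS, score)


def parse_blocks(fields):
    """Return absolute (start, end) pairs for the blocks of a 12-field BED line."""
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # Trailing commas are tolerated, as in the itemA line of Example 3.
    sizes = [int(item) for item in fields[10].split(",") if item]
    starts = [int(item) for item in fields[11].split(",") if item]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockSizes and blockStarts must each have blockCount entries")
    # blockStarts are relative to chromStart, so shift them to absolute positions.
    return [(chrom_start + rel, chrom_start + rel + size)
            for rel, size in zip(starts, sizes)]


# The itemA data line from Example 3.
item_a = "chr22 1000 5000 itemA 960 + 1100 4700 0 2 1567,1488, 0,2512".split()
blocks = parse_blocks(item_a)
print(blocks, shade_bin(int(item_a[4])))
```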

Example 5:
This example shows an annotation track that uses the itemRgb attribute to individually color each data line. In this track, the color scheme distinguishes between items named "Pos*" and those named "Neg*". See the usage note in the itemRgb description above for color palette restrictions. NOTE: The track and data lines in this example have been reformatted for documentation purposes. Click here for a copy of this example that can be pasted into the browser without editing.

browser position chr7:127471196-127495720

browser hide all

track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2

itemRgb="On"

chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0

chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0

chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0

chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0

chr7 127475864 127477031 Neg1 0 - 127475864 127477031 0,0,255

chr7 127477031 127478198 Neg2 0 - 127477031 127478198 0,0,255

chr7 127478198 127479365 Neg3 0 - 127478198 127479365 0,0,255

chr7 127479365 127480532 Pos5 0 + 127479365 127480532 255,0,0

chr7 127480532 127481699 Neg4 0 - 127480532 127481699 0,0,255


PSL Lines

PSL lines represent alignments, and are typically taken from files generated by BLAT or psLayout. See the BLAT documentation for more details. All of the following fields are required on each data line within a PSL file:

matches - Number of bases that match that aren't repeats
misMatches - Number of bases that don't match
repMatches - Number of bases that match but are part of repeats
nCount - Number of 'N' bases
qNumInsert - Number of inserts in query
qBaseInsert - Number of bases inserted in query
tNumInsert - Number of inserts in target
tBaseInsert - Number of bases inserted in target
strand - '+' or '-' for query strand. In mouse, second '+' or '-' is for genomic strand
qName - Query sequence name
qSize - Query sequence size
qStart - Alignment start position in query
qEnd - Alignment end position in query
tName - Target sequence name
tSize - Target sequence size
tStart - Alignment start position in target
tEnd - Alignment end position in target
blockCount - Number of blocks in the alignment
blockSizes - Comma-separated list of sizes of each block
qStarts - Comma-separated list of starting positions of each block in query
tStarts - Comma-separated list of starting positions of each block in target
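As an illustration of the field order above, here is a minimal Python sketch that splits one PSL data line into named fields. The names PslRecord and parse_psl_line are hypothetical, and the sketch assumes the 21 fields are whitespace-separated and appear on a single line.

```python
from collections import namedtuple

PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
STRING_FIELDS = {"strand", "qName", "tName"}
LIST_FIELDS = {"blockSizes", "qStarts", "tStarts"}

PslRecord = namedtuple("PslRecord", PSL_FIELDS)


def parse_psl_line(line):
    """Parse one PSL data line into a PslRecord."""
    values = line.split()
    if len(values) != len(PSL_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(PSL_FIELDS), len(values)))
    parsed = []
    for name, value in zip(PSL_FIELDS, values):
        if name in STRING_FIELDS:
            parsed.append(value)
        elif name in LIST_FIELDS:
            # Comma-separated lists usually end with a trailing comma.
            parsed.append([int(v) for v in value.split(",") if v])
        else:
            parsed.append(int(value))
    return PslRecord(*parsed)


# First alignment from the PSL example in this section, rejoined onto one line.
line = ("59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22 "
        "47748585 13073589 13073753 2 48,20, 171,1042, 34674832,34674976,")
rec = parse_psl_line(line)
print(rec.qName, rec.tName, rec.tStart, rec.tEnd, rec.blockSizes)
```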

Example 6:
Here is an example of an annotation track in PSL format. Note that line breaks have been inserted into this example for documentation display purposes. Click here for a copy of this example that can be pasted into the browser without editing.

browser position chr22:13073000-13074000

browser hide all

track name=fishBlats description="Fish BLAT" visibility=2

useScore=1

59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22

47748585 13073589 13073753 2 48,20, 171,1042, 34674832,34674976,

59 7 0 0 1 55 1 55 +- FS_CONTIG_26780_1 2825 2456 2577 chr22

47748585 13073626 13073747 2 21,45, 2456,2532, 34674838,34674914,

59 7 0 0 1 55 1 55 -+ FS_CONTIG_26780_1 2825 2455 2676 chr22

47748585 13073727 13073848 2 45,21, 249,349, 13073727,13073827,

Click here to display this track in the Genome Browser.

Be aware that the coordinates for a negative strand in a PSL line are handled in a special way. In the qStart and qEnd fields, the coordinates indicate the position where the query matches from the point of view of the forward strand, even when the match is on the reverse strand. However, in the qStarts list, the coordinates are reversed.

Example 7:
Here is a 31-mer (positions 0 through 30) containing 2 blocks that align on the minus strand and 2 blocks that align on the plus strand (this can sometimes happen in response to assembly errors):

0         1         2         3  tens position in query
0123456789012345678901234567890  ones position in query
            ++++          +++++  plus strand alignment on query
    --------    ----------       minus strand alignment on query

Plus strand:

qStart=12

qEnd=31

blockSizes=4,5

qStarts=12,26

Minus strand:

qStart=4

qEnd=26

blockSizes=10,8

qStarts=5,19

Essentially, the minus strand blockSizes and qStarts are what you would get if you reverse-complemented the query. However, the qStart and qEnd are not reversed. To convert one to the other:

qStart = qSize - revQEnd

qEnd = qSize - revQStart
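The conversion can be checked against the minus strand values in Example 7 with a small Python sketch. The function name is hypothetical, and qSize = 31 is an assumption: it is the query size for which the formulas reproduce qStart=4 and qEnd=26.

```python
def forward_query_blocks(q_size, block_sizes, rev_q_starts):
    """Map minus-strand query blocks back to forward-strand coordinates.

    rev_q_starts are positions on the reverse-complemented query, as stored
    in the qStarts list. Each block is converted with
    start = qSize - revEnd and end = qSize - revStart.
    """
    blocks = []
    for size, rev_start in zip(block_sizes, rev_q_starts):
        rev_end = rev_start + size
        blocks.append((q_size - rev_end, q_size - rev_start))
    # Reversing the strand also reverses block order, so sort by start.
    return sorted(blocks)


# Minus strand values from Example 7: blockSizes=10,8 and qStarts=5,19.
blocks = forward_query_blocks(31, [10, 8], [5, 19])
print(blocks)
print("qStart =", blocks[0][0], "qEnd =", blocks[-1][1])
```

The first block start and the last block end recover the qStart and qEnd fields, which are always given from the point of view of the forward strand.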

GFF Lines

GFF (General Feature Format) lines are based on the GFF standard file format. GFF lines have nine required fields that must be tab-separated. If the fields are separated by spaces instead of tabs, the track will not display correctly. For more information on GFF format, refer to http://www.sanger.ac.uk/Software/formats/GFF.

Here is a brief description of the GFF fields:

seqname - The name of the sequence. Must be a chromosome or scaffold.
source - The program that generated this feature.
feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
start - The starting position of the feature in the sequence. The first base is numbered 1.
end - The ending position of the feature (inclusive).
score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). If there is no score value, enter ".".
strand - Valid entries include '+', '-', or '.' (for don't know/don't care).
frame - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
group - All lines with the same group are linked together into a single item.
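The nine fields can be read with a short parser. The sketch below is illustrative only: the function names are hypothetical, and the second helper converts to the 0-based, end-exclusive convention used by BED lines.

```python
GFF_FIELDS = ["seqname", "source", "feature", "start", "end",
              "score", "strand", "frame", "group"]


def parse_gff_line(line):
    """Parse one GFF data line into a dict; raise ValueError if malformed."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(GFF_FIELDS):
        raise ValueError(
            "expected 9 tab-separated fields, got %d "
            "(were tabs replaced by spaces?)" % len(values))
    record = dict(zip(GFF_FIELDS, values))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # "." means that no score was supplied.
    record["score"] = None if record["score"] == "." else float(record["score"])
    if record["strand"] not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-', or '.'")
    return record


def to_zero_based_half_open(record):
    """GFF start is 1-based and end is inclusive, so only the start shifts."""
    return record["start"] - 1, record["end"]


rec = parse_gff_line("chr22\tTeleGene\tenhancer\t10000000\t10001000\t500\t+\t.\tTG1")
print(rec["feature"], rec["group"], to_zero_based_half_open(rec))
```

Splitting strictly on tabs means that a line whose tabs were converted to spaces fails with a clear error instead of being silently misread.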

Example 8:
Here's an example of a GFF-based track. Click here for a copy of this example that can be pasted into the browser without editing. NOTE: Paste operations on some platforms will replace tabs with spaces, which will result in an error when the GFF track is uploaded. If you encounter an error when loading a GFF track, check that the data lines contain tabs rather than spaces.

browser position chr22:10000000-10020000

browser hide all

track name=regulatory description="TeleGene(tm) Regulatory Regions" visibility=2

chr22 TeleGene enhancer 10000000 10001000 500 + . TG1

chr22 TeleGene promoter 10010000 10010100 900 + . TG1

chr22 TeleGene promoter 10020000 10025000 800 - . TG2

Click here to display this track in the Genome Browser.

GTF Lines

GTF (Gene Transfer Format) is a refinement to GFF that tightens the specification. The first eight GTF fields are the same as GFF. The group field has been expanded into a list of attributes. Each attribute consists of a type/value pair. Attributes must end in a semi-colon, and be separated from any following attribute by exactly one space.

The attribute list must begin with the two mandatory attributes:

gene_id value - A globally unique identifier for the genomic source of the sequence.
transcript_id value - A globally unique identifier for the predicted transcript.

Example 9:
Here is an example of the ninth field in a GTF data line:

gene_id "Em:U62317.C22.6.mRNA"; transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1;

For more information on this format, see http://genes.cse.wustl.edu/GTF2.html.

The Genome Browser groups together GTF lines that have the same transcript_id value. It only looks at features of type exon and CDS.
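The attribute syntax and the grouping rule can be sketched in Python as follows. This is not the Genome Browser's own code: the function names are hypothetical, the sample data lines are invented, and the parser assumes that attribute values never contain a semicolon.

```python
from collections import defaultdict


def parse_gtf_attributes(attr_field):
    """Split the ninth GTF field into a dict of attribute type -> value."""
    attrs = {}
    for part in attr_field.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing semicolon
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def group_by_transcript(lines):
    """Group exon and CDS features by their transcript_id value."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] not in ("exon", "CDS"):
            continue  # only exon and CDS features are considered
        attrs = parse_gtf_attributes(fields[8])
        groups[attrs["transcript_id"]].append(
            (fields[2], int(fields[3]), int(fields[4])))
    return dict(groups)


# Invented data lines, for illustration only.
sample = [
    'chr22\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr22\tdemo\tCDS\t120\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr22\tdemo\tstart_codon\t120\t122\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
]
print(group_by_transcript(sample))
```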

Microarray Format

The datasets for the built-in microarray tracks in the Genome Browser are stored in BED15 format, an extension of BED format that includes three additional fields: expCount, expIds, and expScores. To display correctly in the Genome Browser, microarray tracks require the setting of several attributes in the trackDb file associated with the track's genome assembly. Each microarray track set must also have an associated microarrayGroups.ra configuration file that contains additional information about the data in each of the arrays.

User-created microarray custom tracks are similar in format to BED custom tracks, with the addition of three required track line parameters in the header (expNames, expScale, and expStep) that mimic the trackDb and microarrayGroups.ra settings of built-in microarray tracks.

For a complete description of the microarray track format and an explanation of how to construct a microarray custom track, see the Genome Browser Wiki.

Sharing Your Annotation Track with Others

To make your Genome Browser annotation track viewable by people on other machines or at other sites, follow the steps below. (Note that some of the URL examples in this section have been broken up into 2 lines for documentation display purposes.)

Step 1. Put your formatted annotation file on your web site. Be sure that the file permissions allow it to be read by others.

Step 2. Construct a URL that will link this annotation file to the Genome Browser. The URL must contain 3 pieces of information specific to your annotation data:

The species or genome assembly on which your annotation data is based. To automatically display the most recent assembly for a given organism, set the org parameter: e.g. org=human. To specify a particular genome assembly for an organism, use the db parameter, db=database_name, where database_name is the UCSC code for the genome assembly. For a list of these codes, see the Genome Browser FAQ. Examples of this include: db=hg16 (Human July 2003 assembly), db=mm6 (Mouse Mar. 2005 assembly).
The genome position to which the Genome Browser should initially open. This information is of the form position=chr_position, where chr_position is a chromosome number, with or without a set of coordinates. Examples of this include: position=chr22, position=chr22:15916196-31832390.
The URL of the annotation file on your web site. This information is of the form hgt.customText=URL, where URL points to the annotation file on your website. An example of an annotation file URL is http://genome.ucsc.edu/goldenPath/help/test.bed.
If your website requires a username and password for access, you can add them to the URL. The syntax for URLs that include the username and password is this: http://user:password@server.com/path/file?options. Be aware that this raises security issues. The username and password given in the URL are potentially visible to anyone capable of snooping on the TCP connection.

Combine the above pieces of information into a URL of the following format (organism_name, chr_position, and URL stand for the information specific to your annotation file):

http://genome.ucsc.edu/cgi-bin/hgTracks?org=organism_name&

position=chr_position&hgt.customText=URL.

Example 10:
The following URL will open up the Genome Browser window to display chr 22 of the latest human genome assembly and will show the annotation track pointed to by the URL http://genome.ucsc.edu/goldenPath/help/test.bed:

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human&position=chr22&

hgt.customText=http://genome.ucsc.edu/goldenPath/help/test.bed
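The same URL can be assembled programmatically. Here is a minimal Python sketch using the standard library's urlencode; the function name custom_track_url is hypothetical, and accepting exactly one of org or db is an assumption of the sketch rather than a documented requirement.

```python
from urllib.parse import urlencode

HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def custom_track_url(track_url, position, org=None, db=None):
    """Build a Genome Browser link that loads a remote annotation file."""
    if (org is None) == (db is None):
        raise ValueError("give exactly one of org or db")
    params = {"org": org} if org is not None else {"db": db}
    params["position"] = position
    params["hgt.customText"] = track_url
    # Keep ':' and '/' unescaped so the result reads like the examples here.
    return HGTRACKS + "?" + urlencode(params, safe=":/")


print(custom_track_url(
    "http://genome.ucsc.edu/goldenPath/help/test.bed",
    position="chr22",
    org="human",
))
```

Building the query string this way yields the URL on a single line, which avoids pasting a link that was split in two for display.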

Step 3. Provide the URL to others. To upload a custom annotation track pointed to by a URL into the Genome Browser, paste the URL into the large text edit box on the Add Custom Tracks page, then click the Submit button.

If you'd like to share your annotation track with a broader audience, send the URL for your track—along with a description of the format, methods, and data used—to the UCSC Genome mailing list genome@soe.ucsc.edu.