1-DAV-202 Data Management 2023/24
Previously 2-INF-185 Data Source Integration



Difference between revisions of "Genomika: Rozvojové projekty"


Revision as of 09:18, 12 April 2018

MalGlo group

User trackDb, code management

  • Think about how to better manage changes to the browser code in future instances of the course
  • Explore the possibility of each user having their own trackDb
  • Start by reading the short note in /kentsrc/trackDb/makefile on the genomika server:
# Browser supports multiple trackDb's so that individual developers
# can change things rapidly without stepping on other people's toes. 
...
  • Write a manual describing how to make your suggested changes, and test it

Rfam

  • Rfam http://rfam.xfam.org/ is a database of families of non-coding RNAs
  • It contains a covariance model for each family
  • The database can be downloaded and searched against a genome using the Infernal tool http://eddylab.org/infernal/
  • Do this search, then convert the output to an appropriate format and display it in the browser
  • Possibly use the BEDdetail format https://genome.ucsc.edu/FAQ/FAQformat.html#format1.7
  • After clicking on an Rfam match, there should be some display of additional information about the match and a link to the Rfam database. You can achieve this by the following lines in trackDb.ra:
type bedDetail 14
url http://rfam.xfam.org/family/$$
urlLabel Rfam:

Example of the BEDdetail format for an Rfam match (items should be tab-separated; the last column starts at "truncated:")

chrom chromStart chromEnd name score strand thickStart thickEnd reserved blockCount blockSizes chromStarts id description
contigA 75109 75380 Fungi_SRP-1 1002 - 75109 75109 0 1 271 0 RF01502 truncated: no, E-value: 3.5e-19
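
The conversion from Infernal output to a line like the one above could be sketched as follows. This is a minimal sketch, not a tested pipeline: it assumes the hits come from cmsearch run with --tblout (so the target is the genome sequence and the query is the Rfam model) and that the columns are in the standard tblout order listed in the comment. The function name and the mapping of bit score to BED score (rounded and clamped to 0-1000) are illustrative choices, so the score will not necessarily match the example above.

```python
def tblout_to_beddetail(line):
    """Convert one cmsearch --tblout hit line to a 14-column bedDetail line.

    Returns None for blank lines and for comment lines starting with '#'.
    """
    if not line.strip() or line.startswith("#"):
        return None
    # Assumed column order: target name, accession, query name, accession,
    # mdl, mdl from, mdl to, seq from, seq to, strand, trunc, pass, gc,
    # bias, score, E-value, inc, description of target.
    f = line.split(None, 17)
    chrom, family, rfam_id = f[0], f[2], f[3]
    seq_from, seq_to = int(f[7]), int(f[8])
    strand, trunc = f[9], f[10]
    bit_score, evalue = float(f[14]), f[15]
    # Infernal coordinates are 1-based and inclusive, and on the minus strand
    # "seq from" is larger than "seq to"; BED is 0-based and half-open.
    start = min(seq_from, seq_to) - 1
    end = max(seq_from, seq_to)
    # Illustrative choice: use the rounded bit score, clamped to 0..1000.
    score = max(0, min(1000, round(bit_score)))
    description = f"truncated: {trunc}, E-value: {evalue}"
    fields = [chrom, start, end, family, score, strand,
              start, start,          # thickStart, thickEnd
              0, 1, end - start, 0,  # reserved, blockCount, blockSizes, chromStarts
              rfam_id, description]
    return "\t".join(str(x) for x in fields)
```

Lines for which the function returns None can simply be skipped when writing the BED file.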
  • Further things which you might want to explore:
    • Remove matches that correspond to tRNAscan-SE matches (try tool overlapSelect)
    • From several overlapping matches keep only the strongest (try tool overlapSelect)
    • More ambitious: Explore creating an image of each RNA structure and somehow linking it to the info page for the match (as in the non-coding RNA track in the human genome browser - see for example http://genome-euro.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr1%3A16520585%2D16520658, display the non-coding RNA track and click on the tRNA match)
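
As an illustration of "keep only the strongest of several overlapping matches", here is a minimal Python sketch of a greedy filter. It is a stand-in for experimenting on small inputs, not a description of what overlapSelect does; the function name and the tuple layout are assumptions, and strand is ignored.

```python
def keep_strongest(hits):
    """Keep the highest-scoring hit from each group of overlapping hits.

    hits: list of (chrom, start, end, name, score) tuples with 0-based,
    half-open coordinates, as in BED.
    Greedy strategy: visit hits by decreasing score and keep a hit only if
    it does not overlap any hit already kept on the same sequence.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h[4], reverse=True):
        chrom, start, end = hit[0], hit[1], hit[2]
        clash = any(k[0] == chrom and start < k[2] and k[1] < end
                    for k in kept)
        if not clash:
            kept.append(hit)
    # Return the survivors in genome order.
    return sorted(kept, key=lambda h: (h[0], h[1]))
```

The check against all kept hits is quadratic in the worst case, which should be acceptable for the number of Rfam matches in a small genome.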

Information for users

  • Each track should provide basic information for users in the HTML document displayed after clicking on the track name or the left bar of the browser image.
  • The information should summarize what is displayed, what the source of the data was, what program was used to produce the results, etc.
    • Keep it less technical, with a link to your GitHub wiki page for the track for potential developers replicating your work
  • See examples for tracks on the http://genome-euro.ucsc.edu/ browser
  • Also, the genome as a whole should have a description page. On the title page of http://genome-euro.ucsc.edu/ you see details of the selected assembly, e.g. for the guinea pig genome you see the text:
Guinea pig Genome Browser - cavPor3 assembly
The Feb. 2008 Cavia porcellus draft assembly (Broad Institute cavPor3) was produced by the Broad Institute at MIT and Harvard.
...
  • You should create some explanatory text for your species and genome and make it display on the title page
    • This already works for Yarrowia lipolytica on the genomika server, so you can try to find out how it was done
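
A track description page of the kind described above could be generated from a few plain-text fields, for example as in the following sketch. The section headings, the function name and its parameters are assumptions made for illustration; the sketch only produces an HTML fragment and does not cover where the browser expects the file to be placed.

```python
from html import escape


def track_description_html(description, source, method, wiki_url):
    """Build a short, user-oriented HTML fragment describing one track.

    All arguments are plain text; they are escaped before being placed
    into the HTML. wiki_url should point to the developer-oriented page.
    """
    parts = [
        "<h2>Description</h2>",
        f"<p>{escape(description)}</p>",
        "<h2>Data source</h2>",
        f"<p>{escape(source)}</p>",
        "<h2>Methods</h2>",
        f"<p>{escape(method)}</p>",
        "<p>Technical details for developers: "
        f'<a href="{escape(wiki_url, quote=True)}">{escape(wiki_url)}</a></p>',
    ]
    return "\n".join(parts) + "\n"
```

Keeping the technical details behind the single link at the end follows the suggestion above to keep the user-facing text less technical.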

MalSym group

Clickable genes

  • If you click on a gene or other displayed item in a well set up genome browser, you get a page with more information about this item
  • This does not work satisfactorily on our genomika browser
  • Look at all tracks displaying gene information in four browsers:
    • sacCer3 in the original UCSC genome browser (http://genome-euro.ucsc.edu/cgi-bin/hgTracks?db=sacCer3), tracks NCBI RefSeq, SGD Genes, Ensembl Genes
    • sacCer3 in our genomika genome browser (http://genomika.compbio.fmph.uniba.sk/cgi-bin/hgTracks?db=sacCer3), tracks
    • yarLip1 in our genomika genome browser (http://genomika.compbio.fmph.uniba.sk/cgi-bin/hgTracks?db=yarLip1), tracks Ens. Genes (L), RefSeq Genes (L)
    • malSym1 in our genomika genome browser (http://genomika.compbio.fmph.uniba.sk/cgi-bin/hgTracks?db=malSym1), track
  • For each explored track find out what gets displayed after clicking on a gene, whether there are any error messages, and whether the page contains a link to the source database (e.g. Ensembl, RefSeq, NCBI, SGD)
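
If you save the HTML of the page shown after clicking on a gene, the two checks above (error messages and links to source databases) can be partly automated. The following is a minimal sketch using only the Python standard library; the list of error phrases is a guess and the host names are assumptions about where the source databases live, so adjust both to what you actually observe.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed host names of the source databases (RefSeq is hosted at NCBI).
SOURCE_HOSTS = {
    "Ensembl": "ensembl.org",
    "NCBI": "ncbi.nlm.nih.gov",
    "SGD": "yeastgenome.org",
}
# Hypothetical phrases that suggest the page reports a problem.
ERROR_HINTS = ("error", "can't find", "couldn't")


class _PageCollector(HTMLParser):
    """Collect link targets and visible text from an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_data(self, data):
        self.text.append(data)


def audit_details_page(html_text):
    """Report which source databases a page links to and whether its text
    contains one of the error phrases."""
    parser = _PageCollector()
    parser.feed(html_text)
    parser.close()
    found = set()
    for href in parser.hrefs:
        host = urlparse(href).hostname or ""
        for db, domain in SOURCE_HOSTS.items():
            if host == domain or host.endswith("." + domain):
                found.add(db)
    text = " ".join(parser.text).lower()
    return {
        "source_links": sorted(found),
        "has_error": any(hint in text for hint in ERROR_HINTS),
    }
```

A simple phrase match can give false alarms (for example a gene description containing the word "error"), so the result is a starting point for manual inspection rather than a verdict.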


Information for users

  • Each track should provide basic information for users in the HTML document displayed after clicking on the track name or the left bar of the browser image.
  • The information should summarize what is displayed, what the source of the data was, what program was used to produce the results, etc.
    • Keep it less technical, with a link to your GitHub wiki page for the track for potential developers replicating your work
  • See examples for tracks on the http://genome-euro.ucsc.edu/ browser
  • Also, the genome as a whole should have a description page. On the title page of http://genome-euro.ucsc.edu/ you see details of the selected assembly, e.g. for the guinea pig genome you see the text:
Guinea pig Genome Browser - cavPor3 assembly
The Feb. 2008 Cavia porcellus draft assembly (Broad Institute cavPor3) was produced by the Broad Institute at MIT and Harvard.
...
  • You should create some explanatory text for your species and genome and make it display on the title page
    • This already works for Yarrowia lipolytica on the genomika server, so you can try to find out how it was done