AnnoLnc2 FAQ

If you have more questions, please contact [email protected].

What is AnnoLnc2?

AnnoLnc2 is a web server for integratively annotating novel human and mouse lncRNAs. Designed as a one-stop portal, it accepts human and mouse lncRNA sequences as input, and generates a full spectrum of annotations covering sequence and structure features, regulation, expression, protein interaction, genetic association, evolution and subcellular localization. Furthermore, heterogeneous annotations are integrated to facilitate unveiling biologically meaningful clues.
Why did you develop AnnoLnc2?

In recent years, long noncoding RNAs (lncRNAs) have been shown to be as essential and widespread as proteins. While the number of newly identified lncRNAs keeps increasing, their functions remain largely elusive. Although several lncRNA databases exist, they cannot handle newly identified lncRNAs. We therefore developed this server to help biologists interrogate novel lncRNAs from various perspectives and discover useful clues underlying heterogeneous annotations. Of course, AnnoLnc2 also accepts known lncRNAs.
Is AnnoLnc2 a web server or a database?

AnnoLnc2 provides on-the-fly analysis of input sequences rather than simple data retrieval. Every valid input sequence is first mapped to the human genome to determine its genomic location, and downstream analyses follow. For efficiency, we employ a "cache" strategy. Note that even if a submitted sequence differs from a cached lncRNA by a single base, AnnoLnc will treat it as a novel sequence and reanalyze it. In short, AnnoLnc2 is both a database, in that it holds base knowledge for known lncRNAs, and a web server, in that it can annotate novel lncRNAs.
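
The exact-match behaviour of the cache can be illustrated with a minimal sketch. This is not the server's actual implementation: the class name `AnnotationCache`, the use of a SHA-256 digest as the key, and the `annotate` callback are all assumptions made for illustration. The only point taken from the answer above is that a sequence differing by one base misses the cache.

```python
import hashlib


class AnnotationCache:
    """Hypothetical cache keyed by the exact submitted sequence."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(sequence: str) -> str:
        # Assumption: the key is a digest of the sequence exactly as submitted,
        # so any single-base difference yields a different key.
        return hashlib.sha256(sequence.encode("ascii")).hexdigest()

    def get_or_compute(self, sequence: str, annotate):
        """Return cached annotations, or run `annotate` and store the result."""
        k = self.key(sequence)
        if k not in self._store:
            self._store[k] = annotate(sequence)
        return self._store[k]
```

With this sketch, submitting the same sequence twice runs the (placeholder) analysis once, while changing a single base triggers a fresh analysis.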
How does each annotation module of AnnoLnc work?

Please see the details on the methods page.
How do I submit lncRNA sequences? What are the requirements for input sequences?
There are 3 ways to submit lncRNA sequences:
- Paste lncRNA sequences into the large input box on the home page.
- Upload a FASTA file through the web service API.
- Download the standalone version of AnnoLnc and run it on your own computer.
Submission requirements (AnnoLnc will discard sequences that don't meet them):
- FASTA format: your sequences must be in FASTA format. That is to say, sequence names are required.
- Number of sequences: if you paste sequences, up to 100 at a time; if you upload a FASTA file, up to 500 sequences at a time.
- Name requirement: sequence names should be shorter than 100 characters. Only characters in [A-Za-z0-9_.,-] are allowed. Illegal characters will be discarded.
- Sequence requirement: your sequences should be longer than 20 bp and shorter than 100,000 bp. Only characters that occur in DNA and RNA sequences are allowed.
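
If you want to pre-check a submission locally, the requirements above can be sketched as follows. This is an illustrative check, not the server's own validator: the function names are hypothetical, and the accepted nucleotide alphabet (A, C, G, T, U and N, case-insensitive) is an assumption, since the FAQ only says "characters in DNA and RNA sequences". The per-submission limits (100 pasted, 500 uploaded) are not enforced here.

```python
import re

# Characters outside the allowed name alphabet are stripped, as the FAQ states.
ILLEGAL_NAME_CHARS = re.compile(r"[^A-Za-z0-9_.,-]")
# Assumption: the accepted nucleotide alphabet is A, C, G, T, U, N.
VALID_SEQUENCE = re.compile(r"[ACGTUN]+", re.IGNORECASE)


def parse_fasta(text):
    """Return (name, sequence) pairs; sequence lines without a header are ignored."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def is_valid(name, sequence):
    """Name shorter than 100 characters; sequence longer than 20 and shorter than 100,000."""
    return (
        0 < len(name) < 100
        and 20 < len(sequence) < 100_000
        and VALID_SEQUENCE.fullmatch(sequence) is not None
    )


def filter_submission(text):
    """Clean names, then keep only records that meet the stated requirements."""
    kept = []
    for raw_name, sequence in parse_fasta(text):
        name = ILLEGAL_NAME_CHARS.sub("", raw_name)
        if is_valid(name, sequence):
            kept.append((name, sequence))
    return kept
```

For example, a record named `lnc 1!` would be kept under the name `lnc1`, while a 4-base sequence or one containing a character such as `X` would be dropped.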
Why don't some of my sequences have "Locus" in the "Overview" page?
There are two situations:
- The analysis has only just started and the sequence hasn't been mapped to the human genome yet. In this case, the page refreshes automatically every 15 seconds.
- If all the analyses have finished and there is still no "Locus", the sequence cannot be located in the human genome hg38. It may not be a human sequence.
Why do some of my sequences have multiple "Locus" in the "Overview" page?

It's because this sequence can be mapped to multiple loci in the human genome hg19. AnnoLnc will run analyses for all the loci.
How do I interpret the annotation results?

Click the help icon on the annotation page to see a detailed explanation of each annotation result. If you have more questions, please contact [email protected].
How are the ChIP-Seq datasets analyzed?

AnnoLnc2 integrates TF binding sites from the GTRD database, comprising 91.7 million ChIP-Seq-based binding sites for 1,339 human transcription factors (TFs) and 80.8 million sites for 738 mouse TFs. For each input transcript, AnnoLnc searches for putative TF binding sites within 5 kb upstream and 1 kb downstream, and reports all sites according to their position relative to the transcript: "upstream transcriptional start site (TSS)", "overlap with TSS", "inside the lncRNA loci", "overlap with transcriptional end site (TES)" and "downstream TES".
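
The five position categories can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the server's code: coordinates are taken to be 0-based, half-open intervals on the forward strand, a site counts as "within" the search window if it overlaps it at all, and the function name `classify_site` is hypothetical.

```python
UPSTREAM_WINDOW = 5_000    # 5 kb upstream of the TSS
DOWNSTREAM_WINDOW = 1_000  # 1 kb downstream of the TES


def classify_site(site_start, site_end, tx_start, tx_end, strand):
    """Classify a TF binding site relative to a transcript.

    Returns one of the five category labels, or None if the site lies
    entirely outside the search window.
    """
    if strand == "+":
        tss, tes = tx_start, tx_end
        s, e = site_start, site_end
    else:
        # Mirror the coordinates so transcription always runs left to right.
        tss, tes = -tx_end, -tx_start
        s, e = -site_end, -site_start

    # Assumption: partial overlap with the window is enough to report a site.
    if e <= tss - UPSTREAM_WINDOW or s >= tes + DOWNSTREAM_WINDOW:
        return None
    if e <= tss:
        return "upstream TSS"
    if s < tss:
        # Also covers a site that spans the whole transcript.
        return "overlap with TSS"
    if e <= tes:
        return "inside the lncRNA loci"
    if s < tes:
        return "overlap with TES"
    return "downstream TES"
```

Note that for a minus-strand transcript the TSS is the rightmost genomic coordinate, so a site to the right of the transcript is classified as upstream.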
How are the RNA-Seq datasets analyzed?

Please see the details here.
How are the CLIP-Seq datasets analyzed?

Please see the details here.
What is the integrated view?
The integrated view is a set of pre-tuned custom tracks in the UCSC genome browser. Because all tracks share the same genomic coordinates in the browser view, spatial correlations across different kinds of annotations are easy to spot. Transcript-level annotations, including transcript structure, TF binding sites, miRNA binding sites, protein binding sites and SNP locations, are sent to the UCSC genome browser by URL. PhyloP scores and conserved elements are presented as UCSC local tracks. Note that only annotations near the lncRNA can be displayed. For the definition of "near", please refer to the details of each module on the methods page.
Tracks are:
- User submitted sequence: the transcript structure of the lncRNA.
- Trait associated SNPs.
- PhyloP score of mammals: the UCSC local track.
- Conserved elements: the UCSC local track.
- Protein binding sites: binding sites from the same kind of sample (the same cell type treated under the same condition) are merged. Different colors indicate different cell lines.
- miRNA binding sites: binding sites are colored red if they are supported by AGO CLIP-Seq data.
- Transcriptional regulation: TF binding sites from the same kind of sample (the same cell type treated under the same condition) are merged. Different colors indicate different cell lines.
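
As a rough illustration of how custom tracks can be passed to the UCSC genome browser by URL, the sketch below builds an `hgTracks` link using the browser's `db`, `position` and `hgt.customText` URL parameters. The function name, the example track URL and the choice of assembly are placeholders, not values used by AnnoLnc2.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def integrated_view_url(db, chrom, start, end, track_url, flank=0):
    """Build a UCSC genome browser link that loads a custom track file.

    db        -- assembly name, e.g. "hg38" (placeholder choice)
    track_url -- hypothetical URL of a custom track file (BED, bedGraph, ...)
    flank     -- extra bases to show on each side of the region
    """
    position = f"{chrom}:{max(1, start - flank)}-{end + flank}"
    query = urlencode({
        "db": db,
        "position": position,
        "hgt.customText": track_url,
    })
    return f"{UCSC_HGTRACKS}?{query}"


if __name__ == "__main__":
    print(integrated_view_url(
        "hg38", "chr1", 1000, 2000,
        "https://example.org/tracks/lnc1.bed",  # placeholder track file
    ))
```

Opening the resulting link displays the region with the custom track loaded alongside the browser's own tracks.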
Is there an introduction or tutorial in Chinese?

The WeChat official account "医学数据库百科" has published an article introducing AnnoLnc2; click here to view it.
