BLAST is a wonderful utility for sequence analysis. Bioinformatic analysis of thousands of sequences without BLAST is unthinkable. But if your data set is really big, analyzing the BLAST results is another Herculean job. That is why people write BLAST parsers: tools that condense your results into a compact format so that you can easily analyse them. Perl is one of the most popular languages that researchers use to write parsers, so Perl scripts are unavoidable tools. NCBI BLAST is the most commonly used bioinformatics software for sequence analysis. You can also find an online NCBI BLAST parser on the GreenGene server.
Please remember that I have tested these NCBI BLAST parsers with the standalone NCBI BLAST program. They may or may not work with output from the web version of NCBI BLAST.
There are two possible scenarios for parsing your BLAST results:
- You want to get the information about all your queries.
- You want to get the information about only those queries which have some hits.
Although you can use the -outfmt option in your BLAST+ query to get output in tabular, tabular with comments, or comma-separated value formats, it would still not be easy to sort out your sequences. So let's parse your NCBI BLAST results. These parsers were originally written by Dr. Xiaodong Bai.
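If you already have BLAST+ output in the tabular-with-comments format (-outfmt 7), the two scenarios above can be separated with a few lines of code. The following is a minimal Python sketch, not one of Dr. Bai's Perl parsers. It assumes that every query block contains a `# Query:` line followed by a `# N hits found` line, which is how BLAST+ writes this format as far as I know. The function name `split_queries_by_hits` and the sample text are made up for illustration.

```python
import re

# Matches comment lines such as "# 0 hits found" or "# 12 hits found".
HITS_LINE = re.compile(r"^#\s*(\d+)\s+hits found")


def split_queries_by_hits(lines):
    """Return (with_hits, without_hits) lists of query IDs."""
    with_hits, without_hits = [], []
    current_query = None
    for line in lines:
        line = line.strip()
        if line.startswith("# Query:"):
            # Keep only the first word of the query line as the query ID.
            rest = line[len("# Query:"):].split()
            current_query = rest[0] if rest else ""
            continue
        match = HITS_LINE.match(line)
        if match and current_query is not None:
            target = with_hits if int(match.group(1)) > 0 else without_hits
            target.append(current_query)
            current_query = None
    return with_hits, without_hits


# Made-up sample in the style of -outfmt 7 output (assumption, not real data).
sample_output = """\
# BLASTN 2.12.0+
# Query: seq1 demo sequence
# Database: demo_db
# 2 hits found
seq1\tAB000001.1\t98.5\t200\t3\t0\t1\t200\t50\t249\t1e-95\t350
seq1\tAB000002.1\t91.0\t180\t16\t0\t5\t184\t10\t189\t2e-60\t230
# BLASTN 2.12.0+
# Query: seq2
# Database: demo_db
# 0 hits found
# BLAST processed 2 queries
"""

with_hits, without_hits = split_queries_by_hits(sample_output.splitlines())
print("with hits:", with_hits)
print("without hits:", without_hits)
```

This only tells you which queries have hits. It does not give you the query length, hit description, or frame that the Perl parsers below report.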
Dependencies
Both Perl NCBI BLAST parser scripts depend on the BioPerl module Bio::SearchIO, so make sure BioPerl is installed on your computer.
- Extract queries with hits or without hits from an NCBI BLAST result file
ncbiblastparser.pl
Script name | Download |
---|---|
blastparser.pl |
Usage
perl ncbiblastparser.pl <blast-result-file> <no of hits> <result-file name>
If my BLAST result is saved in 'blast-result.txt' and I want the top 5 hits in my parsed result file 'parsed-result.txt', then my command will be
perl ncbiblastparser.pl blast-result.txt 5 parsed-result.txt
parsed-result.txt will contain the following information about hits and queries:
query_name\query_length\Hit accession_number\Hit length\Hit description\E value\bit score\frame\query_start
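If you would rather work with these nine fields in a script than by eye, you could load them into named records. This is a rough Python sketch that assumes the parsed file is tab-delimited with the fields in the order listed above and has no header line. I have not checked this against the parser's actual output, and the field names, the function `read_parsed_rows`, and the sample row are my own.

```python
# Field names follow the column order listed above (names are my own choice).
FIELDS = [
    "query_name", "query_length", "hit_accession", "hit_length",
    "hit_description", "e_value", "bit_score", "frame", "query_start",
]


def read_parsed_rows(lines):
    """Turn tab-delimited lines into dictionaries keyed by FIELDS."""
    rows = []
    for line in lines:
        values = line.rstrip("\n").split("\t")
        if len(values) != len(FIELDS):
            continue  # skip blank or incomplete lines
        rows.append(dict(zip(FIELDS, values)))
    return rows


# One made-up row for illustration (assumption, not real BLAST data).
sample_lines = [
    "contig_1\t850\tAB000001\t1200\tdemo protein\t1e-50\t200\t1\t15\n",
    "\n",
]

rows = read_parsed_rows(sample_lines)
for row in rows:
    print(row["query_name"], row["hit_accession"], float(row["e_value"]))
```

With real data you would pass an open file instead of `sample_lines`, for example `read_parsed_rows(open("parsed-result.txt"))`.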
Finally, you may want to sort out or extract the queries without any hits. You can simply import the tab-delimited parsed file into Excel and sort it alphabetically, then copy those queries for further processing. Follow this bioinformatics video tutorial for more.
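If you prefer to skip Excel, the same sort-and-extract step might be scripted. The Python sketch below rests on an assumption I have not verified: that a query without hits appears in the parsed file as a line whose hit accession column (the third column) is empty or missing. If the parser marks such queries differently, the test inside the loop has to change. The function name and sample lines are made up.

```python
def queries_without_hits(lines):
    """Return an alphabetically sorted list of query names with no hit rows."""
    seen = set()
    with_hit = set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        name = cols[0].strip()
        if not name:
            continue  # skip blank lines
        seen.add(name)
        # Assumption: a real hit always has a non-empty third column.
        if len(cols) > 2 and cols[2].strip():
            with_hit.add(name)
    return sorted(seen - with_hit)


# Made-up lines for illustration (assumption, not real parser output).
sample_lines = [
    "contig_3\t420\n",
    "contig_1\t850\tAB000001\t1200\tdemo protein\t1e-50\t200\t1\t15\n",
    "contig_2\t610\t\t\t\t\t\t\t\n",
]

no_hit_queries = queries_without_hits(sample_lines)
print("\n".join(no_hit_queries))
```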