Ronald Robertson

Sanjay Singh

Scientist/Writer
  • Email: sanjaysingh765@gmail.com
  • Social: @lampatlex
  • Visitor: Since 1982
  • Location: Kentucky, USA



NCBI BLAST parser: Extract query and best hits





BLAST is a wonderful utility for sequence analysis, and bioinformatic analysis of thousands of sequences without it is unthinkable. But if your data set is really big, analyzing the BLAST results is another Herculean job, which is why people write BLAST parsers. A BLAST parser is a tool that condenses your results into a compact summary so that you can easily analyse the final output. Perl is one of the languages researchers most often use to write such parsers, so Perl scripts are hard to avoid. NCBI BLAST is the most commonly used bioinformatics software for sequence analysis. You can also find an online NCBI BLAST parser on the GreenGene server.




  • How to install BLAST on Windows: HERE

  • How to run BLAST on Windows: HERE



Please remember that I have tested these NCBI BLAST parsers with the standalone NCBI BLAST program. They may or may not work with output from the web version of NCBI BLAST.



There are two possible scenarios for parsing your BLAST results.




  • You want information about all of your queries.

  • You want information only about those queries that have hits.



Although you can use the -outfmt option in a BLAST+ command to get output in tabular, tabular-with-comments, or comma-separated formats, it is still not easy to sort out your sequences from that output. So let's parse your NCBI BLAST results. These parsers were originally written by Dr. Xiaodong Bai.
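To make the tabular route concrete, here is a minimal Python sketch (not the Perl parser this post is about) that groups BLAST+ tabular output by query and keeps the best few rows for each. It assumes the default 12-column layout of -outfmt 6 or 7; the function name top_hits is made up for illustration. Each row is one alignment, so a subject with several alignments can appear more than once.

```python
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6 or 7).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(lines, n):
    """Return {query_id: [row, ...]} with at most n best rows per query.

    Hypothetical helper: 'lines' is any iterable of text lines, for
    example an open file handle on the tabular BLAST output.
    """
    by_query = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the '#' comment lines written by -outfmt 7.
        if not line or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        by_query[row["qseqid"]].append(row)
    # Rank each query's rows by bit score, highest first, and keep n.
    return {
        query: sorted(rows, key=lambda r: r["bitscore"], reverse=True)[:n]
        for query, rows in by_query.items()
    }


if __name__ == "__main__":
    demo = [
        "q1\ts1\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t180\n",
        "q1\ts2\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t120\n",
    ]
    for query, rows in top_hits(demo, 1).items():
        for row in rows:
            print(query, row["sseqid"], row["evalue"], row["bitscore"])
```

Note that queries with no hits produce no rows at all in this format, which is one reason the first scenario above (information about all queries) needs more than the tabular output alone.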



Dependencies

Both Perl NCBI BLAST parser scripts depend on the BioPerl module Bio::SearchIO, so make sure BioPerl is installed on your computer.







  1. NCBI BLAST parser 1

  2. Final Steps






  1. Extract queries with or without hits from an NCBI BLAST result file


ncbiblastparser.pl













Script name Download
blastparser.pl



Usage



perl ncbiblastparser.pl <blast-result-file> <no of hits> <result-file name> 



If my BLAST result is saved in 'blast-result.txt' and I want the top 5 hits written to the parsed result file 'parsed-result.txt', then my command will be

perl ncbiblastparser.pl blast-result.txt 5 parsed-result.txt 



parsed-result.txt will contain the following information about hits and queries

query_name\query_length\Hit accession_number\Hit length\Hit description\E value\bit score\frame\query_start
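If you would rather post-process the parsed file with a script, the following Python sketch reads it into named records. The column names come from the list above. Two things are assumptions: that the columns are separated by tabs, and the names ParsedHit and read_parsed, which are made up for illustration. Values are kept as text because the post does not specify the exact number formats.

```python
from collections import namedtuple

# Column names follow the list above; tab separation is an assumption.
FIELDS = ["query_name", "query_length", "hit_accession", "hit_length",
          "hit_description", "e_value", "bit_score", "frame", "query_start"]
ParsedHit = namedtuple("ParsedHit", FIELDS)


def read_parsed(lines):
    """Yield one ParsedHit per line that has exactly nine tab-separated fields.

    Lines with a different number of fields are skipped. If the file
    starts with a nine-column header row, it is yielded like any other
    record, so drop the first record in that case.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != len(FIELDS):
            continue
        yield ParsedHit(*parts)


if __name__ == "__main__":
    demo = ["q1\t300\tAB123\t1200\tsome protein\t1e-20\t95.5\t0\t12\n"]
    for hit in read_parsed(demo):
        print(hit.query_name, hit.hit_accession, hit.e_value)
```

To run it on a real file, pass an open file handle, for example read_parsed(open("parsed-result.txt")).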





Finally, you may want to sort out or extract the queries without any hits. You can simply import the tabulated parsed file into Excel and sort it alphabetically, then copy those queries for further processing. Follow this Bioinformatics video tutorial for more.
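As an alternative to sorting in Excel, the queries without hits can be found by a simple set difference: take every query ID from the FASTA file you used as BLAST input and remove those that have at least one hit. The Python sketch below shows the idea. The function names are made up for illustration, and it assumes the query name reported by BLAST is the first word of the FASTA header. How the parsed file marks queries without hits is not specified here, so the list of queries with hits has to be collected by you.

```python
def fasta_ids(lines):
    """Return the sequence IDs (first word after '>') in file order."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:                     # ignore a bare '>' line
                ids.append(words[0])
    return ids


def queries_without_hits(all_ids, ids_with_hits):
    """Return the IDs in all_ids that never occur in ids_with_hits.

    The order of all_ids is preserved; duplicates in ids_with_hits
    (one entry per hit) are harmless.
    """
    found = set(ids_with_hits)
    return [query for query in all_ids if query not in found]


if __name__ == "__main__":
    fasta = [">q1 first sequence\n", "ACGT\n", ">q2\n", "GGCC\n"]
    print(queries_without_hits(fasta_ids(fasta), ["q1", "q1"]))
```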










