Ronald Robertson

Sanjay Singh

Scientist/Writer
  • Email: sanjaysingh765@gmail.com
  • Social: @lampatlex
  • Visitor: Since 1982
  • Location: Kentucky, USA



How to retrieve NCBI GenBank records with a range of accession numbers






I have previously discussed how to download sequences from the NCBI database when you have a list of accession numbers. But what if you have a range of accession numbers (e.g. EF100000 to EF102000)? Searching NCBI GenBank by a single accession number pulls up a record without any problem, but you obviously would not want to do this for thousands of EST records. So what is the easiest way to retrieve all of these records from GenBank at once when you provide a range of accession numbers? There are two ways to perform this task: you can use GenBank’s web interface or, if you are comfortable with it, the command line. For the command line option you can use, as usual, a PERL script to query GenBank.






1. GenBank’s web interface 


This is the easiest way to download multiple sequences from NCBI GenBank if you have a range of accession numbers. It can be done in a few steps:




  • Go to the NCBI webpage

  • Choose the database (protein, nucleotide, EST, GSS)

  • Enter the range of accession numbers followed by the [accn] tag



For example, for the accession number range EF100000 to EF102000, the query is EF100000:EF102000[accn]. Click HERE for an actual example.
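The range query described above can also be assembled programmatically, which helps when there are several ranges to look up. The following is only a minimal sketch: the function names (`split_accession`, `build_range_query`, `range_size`) are made up for illustration, and it assumes accessions of the simple "letters followed by digits" form such as EF100000.

```python
import re


def split_accession(accession):
    """Split an accession such as 'EF100000' into its prefix and number."""
    match = re.fullmatch(r"([A-Za-z_]+)(\d+)", accession)
    if match is None:
        raise ValueError(f"unexpected accession format: {accession!r}")
    prefix, digits = match.groups()
    return prefix, int(digits)


def build_range_query(first, last):
    """Return a range query such as 'EF100000:EF102000[accn]'."""
    first_prefix, first_number = split_accession(first)
    last_prefix, last_number = split_accession(last)
    if first_prefix != last_prefix or first_number > last_number:
        raise ValueError("range must share one prefix and run from low to high")
    return f"{first}:{last}[accn]"


def range_size(first, last):
    """Count the accession numbers spanned by the range, both ends included."""
    return split_accession(last)[1] - split_accession(first)[1] + 1


print(build_range_query("EF100000", "EF102000"))  # EF100000:EF102000[accn]
print(range_size("EF100000", "EF102000"))         # 2001
```

Note that the range EF100000 to EF102000 spans 2001 accession numbers, because both ends are included. The number of records actually returned may be lower if some accessions in the range do not exist in the chosen database.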









2. Command line option


If you don't like to leave your desktop, then the command line is made for you. Download the NCBI search PERL script and search the database right from your shell.
















Script name Download
NCBI search.pl



Usage


perl ncbi_search.pl -q "EF100000:EF102000[accn]" -o results.txt -d nucleotide -r fasta -m 2000



In this command, the PERL script searches the nucleotide database with the query EF100000:EF102000[accn] and saves up to 2000 sequences in the results.txt file in FASTA format.
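After a download like this, it can be worth checking how many records actually landed in the output file. The following is a minimal sketch that counts FASTA header lines (lines starting with `>`). The function names and the sample records are made up for illustration and are not part of the PERL script.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def count_fasta_file(path):
    """Count the FASTA records stored in the file at the given path."""
    with open(path, encoding="utf-8") as handle:
        return count_fasta_records(handle)


# Made-up sample data standing in for the contents of an output file.
sample = [
    ">record_one hypothetical description",
    "ACGTACGT",
    ">record_two hypothetical description",
    "GGCCTTAA",
    "TTAACCGG",
]
print(count_fasta_records(sample))  # 2
```

For a real run, `count_fasta_file("results.txt")` would give the number of sequences saved, which can then be compared with the value passed to -m.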



Options

-q [STRING] : raw query text (Required).
-o [FILE] : output file to create (Required).
-d [STRING] : name of the NCBI database to search, such as 'nucleotide', 'pubmed' (Required).
-r [STRING] : the type of information requested. For sequences, 'fasta' is often used.
-m [INTEGER] : the maximum number of records to return (Optional).
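For readers curious how a search like this can be done without the PERL script, here is a rough Python sketch built on the NCBI E-utilities `esearch` and `efetch` endpoints. It is not the script's actual implementation. The function names are hypothetical, and the mapping of the options above onto E-utilities parameters (-q to `term`, -d to `db`, -r to `rettype`, -m to `retmax`) is an assumption made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(query, database, max_records):
    """Build an esearch URL that stores the hits on the NCBI history server."""
    params = {
        "db": database,
        "term": query,
        "retmax": max_records,
        "usehistory": "y",
        "retmode": "json",
    }
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(database, web_env, query_key, report, max_records):
    """Build an efetch URL that retrieves the stored hits as plain text."""
    params = {
        "db": database,
        "WebEnv": web_env,
        "query_key": query_key,
        "rettype": report,
        "retmode": "text",
        "retmax": max_records,
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download_range(query, output_path, database="nucleotide",
                   report="fasta", max_records=2000):
    """Search, fetch, and save the records; return the total hit count."""
    with urlopen(esearch_url(query, database, max_records)) as response:
        result = json.load(response)["esearchresult"]
    fetch = efetch_url(database, result["webenv"], result["querykey"],
                       report, max_records)
    with urlopen(fetch) as response, open(output_path, "wb") as out:
        out.write(response.read())
    return int(result["count"])


# Building the URL needs no network access; download_range() does.
print(esearch_url("EF100000:EF102000[accn]", "nucleotide", 2000))
```

Calling `download_range("EF100000:EF102000[accn]", "results.txt")` would then roughly mirror the command shown earlier. A production version should also respect NCBI's usage guidelines on request rates.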



I'm a Scientist with 20+ years of broad experience in molecular biology and bioinformatics. Extensive experience in the regulation of metabolic pathways in plants. Proven success in the design, development, and deployment of several result-oriented research projects. Have worked with both academic and industrial collaborators.
