Ulrich, L.E. and Zhulin, I.B. Bioinformatics (2014)


SeqDepot Overview


2 Dec 2014: Domain architecture images (PNG and SVG) are now drawn using pfam27 hits.

10 Oct 2014: SeqDepot has been updated with the non-redundant database.


SeqDepot massively simplifies the retrieval of primary sequences and associated secondary data by:

  1. consolidating protein sequences and precomputed data,
  2. providing access via a well-documented REST API, and
  3. supplying tools to rapidly integrate knowledge of interest into your research.

Despite many excellent existing resources and databases, gathering biological information remains a tedious and error-prone task. SeqDepot streamlines this process by minimizing the data retrieval and consumption burden, so that you can focus on what's important.


SeqDepot can benefit anyone working with protein sequence data; however, because all access must be done via the REST API, basic programming skills are helpful. Thus, bioinformaticians and computational biologists will likely benefit the most.

I want to do X, but do not know how to program!

We have also developed a handy Perl program, sdQuery, that performs many useful tasks, such as retrieving precomputed data for FASTA sequences and downloading images. Give it a try and read the documentation to learn its capabilities.

Get Started

To demonstrate SeqDepot's capabilities and how to use it effectively, we developed a query interface that communicates with the SeqDepot database using the public API.

We recommend you start by experimenting with different parameters and different sample inputs to develop a feel for how it works.

Awesome features!

Intrinsic identifiers derived solely from the raw sequence enable rapid, fool-proof querying independent of external cross-references (although SeqDepot handles that too).
Duplicate sequences are merged into a single record (along with any relevant metadata). This improves database performance and facilitates data retrieval.
Using the REST API, easily fetch entire or partial record data in a single request to a semantically meaningful URI. Batch retrieval of up to 1,000 queries per request is also supported.
Each sequence contains predictions derived from 19 distinct analytical tools, including Pfam / SMART domains, transmembrane regions, signal peptides, SuperFamily, and several more.
Results are encoded in JSON and may readily be converted into native data structures for downstream manipulation.
Query using a wide variety of external database identifiers, such as GenBank (GI), UniProt, or PDB, or by MD5 hexadecimal digest.
Want to run SeqDepot locally? Download the database and crank up your own local copy powered by MongoDB.
Easily generate PNG or SVG images visualizing a sequence's domain architecture.
Download sdQuery to perform many common tasks, such as retrieving precomputed data for FASTA sequences and downloading images. Perl / Python modules simplify server interaction with several easy-to-use methods.
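
Three of the features above (sequence-derived identifiers, the 1,000-query batch limit, and JSON results) can be sketched in Python. This is only a minimal illustration under stated assumptions, not SeqDepot's own client code: the function names `md5_hex` and `batches` are hypothetical, the whitespace and case normalization applied before hashing is an assumption, the sample JSON payload and its field names are invented for illustration, and no real endpoint is contacted.

```python
import hashlib
import json
from typing import Iterator, List

# Batch limit taken from the feature list above (1,000 queries per request).
MAX_BATCH = 1000


def md5_hex(sequence: str) -> str:
    """Return the MD5 hexadecimal digest of a protein sequence.

    Assumption: whitespace is stripped and letters are upper-cased before
    hashing, so the same residues always give the same digest. The exact
    normalization SeqDepot applies may differ.
    """
    cleaned = "".join(sequence.split()).upper()
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()


def batches(ids: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical payload: the field names below are placeholders and do not
# describe the real SeqDepot record schema.
sample_payload = '{"id": "example", "length": 5, "predictions": {"toolA": [[1, 5]]}}'

# JSON text becomes ordinary dicts, lists, strings, and numbers.
record = json.loads(sample_payload)

if __name__ == "__main__":
    digest = md5_hex("mk tay")
    print(digest)  # 32 hexadecimal characters
    queries = [digest] * 2500
    print([len(chunk) for chunk in batches(queries)])
    print(record["predictions"]["toolA"])
```

Each chunk produced by `batches` could then be submitted as one batch request, and each response decoded with `json.loads` as shown for the sample payload.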