Programming interface to BCSDB 3
Attention: the API has not been ported from BCSDB version 2.
If you need an API for automated cross-project data exchange and are ready to fund a joint project, please let us know via the feedback form. The estimated project length is 2-3 months (1-3 seminars to agree on data formats, plus either a visiting-scientist stay at your institution or remote interaction between programmers on your side and ours).
Programming interface to BCSDB 2
Maintenance of BCSDB version 2 ceased in 2008. This description is for reference only.
This interface is based on SOAP (Simple Object Access Protocol) and includes the following:
Data retrieval functions
- to return a branched associative array with all the data for a given record. The input parameter is a BCSDB ID.
- to return a GLYDE 1.2 XML encoded structure from the record with the given BCSDB ID.
- to return a PubMed XML encoded bibliography of the primary paper from the record with the given BCSDB ID.
Search functions
- to search for records using a substructure or a set of multiple substructures as a criterion. The input parameter is a substructure in BCSDB linear encoding (see Structure encoding rules for details), in GLYDE 1.2 XML encoding (see below), or in the form of the exchange array datablock (as a PHP array or as XML). These search functions return an array of IDs or error messages.
- to search for records using an exact structure as a criterion. The input parameter is a structure in BCSDB encoding or GLYDE 1.2 XML data. These functions return an array of IDs, which usually contains a single ID.
- to search for records using keywords. The input parameters are: an array of keywords (item order does not matter) that should all be present in the record; an option to search for these terms in titles and abstracts too; an option to allow variations in word endings. The function returns an array of IDs.
- to search for records using bibliography. The input parameters are either a PubMed XML description (more details here: PubMed XML tagged data, click here for DTD) or separate bibliographic data (an array of author names combined with AND by default, with initials optionally included; journal name with wildcards allowed; publication year; volume number; start page). Both variants of the function also accept an array of terms to search for in titles and, optionally, abstracts. The functions return an array of IDs.
- to search for records using NMR data. The input parameters are: a space-separated list of chemical shifts, the similarity threshold used to filter the data, and the nucleus identifier ("H" or "C").
The function returns an array of subrecords, each containing an ID, a structure, the NMR spectrum for the given nucleus, and a similarity value. The higher the similarity, the better the correlation between the input subspectrum and the full NMR spectrum of the compound. More comments are available inside the sample PHP client.
- to search for records using an NCBI TaxID. The input parameter is the NCBI TaxID of the microorganism, which is converted to a genus and species name on the server side and used to search BCSDB. The function returns an array of IDs.
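As an illustration of the NMR data search inputs described above, the following is a minimal Python sketch of how a client might prepare the three parameters and order the returned subrecords. The dictionary keys (`shifts`, `threshold`, `nucleus`, `similarity`), the threshold scale, and both helper names are assumptions made for this example; the actual parameter names and output layout are documented only in the comments of the sample PHP client.

```python
def build_nmr_query(shifts, threshold, nucleus):
    """Collect the three NMR-search inputs (key names are hypothetical)."""
    if nucleus not in ("H", "C"):
        raise ValueError('nucleus must be "H" or "C"')
    # The chemical shifts are passed as one space-separated string.
    shift_list = " ".join(str(float(s)) for s in shifts)
    return {"shifts": shift_list, "threshold": float(threshold), "nucleus": nucleus}


def rank_hits(subrecords):
    """Sort returned subrecords so the highest similarity comes first."""
    return sorted(subrecords, key=lambda r: r["similarity"], reverse=True)


# Example values are made up for illustration only.
query = build_nmr_query([102.4, 77.1, 61.8], threshold=0.8, nucleus="C")
print(query["shifts"])
```

Sorting by similarity on the client side simply puts the spectra that correlate best with the input subspectrum at the top of the list.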
Conversion functions
- to convert a linear-encoded structure (as returned by the database) to the expanded IUPAC view
- to convert a linear-encoded structure (as returned by the database) to the GLYDE 1.2 XML data
- to convert the GLYDE 1.2 XML data to a linear-encoded structure
NMR prediction function
- to predict a 13C NMR assignment table for the given structure. The input parameter is a structure in BCSDB encoding. The output is the assignment table with residue names and linkages tracked and subspectra and accuracies calculated for each residue. The total sorted spectrum and overall consistency are also returned. Please refer to the comments inside the sample PHP client for the detailed output format.
To view the WSDL description of these services click here.
More details about the format of these calls, their parameters, etc. are provided as comments in the sample PHP client that you can download below. If you are programming in another language, your calls must be functionally similar to those of this client program. The base SOAP functionality is provided by the include file nusoap.php (for PHP; please be sure to use version 1.66 exclusively) or by the SOAP::Lite module for Perl. The base XML functionality is provided by the include file xml22.inc (for PHP).
Here you can download the ZIP-packed client program in PHP (the main file is csdb2_client.php; the rest are support files that you never have to edit), the SOAP include file (nusoap.php, ver 1.66) and the XML include file (xml22.inc).
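For readers working in a language other than PHP or Perl, the sketch below shows in Python what a generic SOAP 1.1 request envelope looks like when built with the standard library. It is only an outline under several assumptions: the operation name `Retrieve_GLYDE` is borrowed from the performance table and may not match the name in the WSDL, and the service namespace `urn:example-bcsdb`, the parameter name `id`, and the record ID 1234 are placeholders. The real message layout (including any RPC encoding used by nusoap) should be taken from the WSDL and the sample PHP client.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_envelope(operation, params, service_ns="urn:example-bcsdb"):
    """Build a SOAP 1.1 request envelope and return it as UTF-8 bytes."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Operation element and parameter names are placeholders, not the real schema.
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


request = build_envelope("Retrieve_GLYDE", {"id": 1234})
print(request.decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP POST to the service endpoint; the endpoint URL and SOAPAction header are not given here and would also come from the WSDL.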
GLYDE XML structure description
Two-way conversion between BCSDB linear encoding and the GLYDE 1.2 XML format is supported. GLYDE is a standard language for GLYcan Data Exchange. To learn more about GLYDE click here. The DTD schema for GLYDE 1.2 is available here.
An example of a GLYDE description of a structure:
<?xml version="1.0"?>
<glycan>
<residue name="GlcNAc" anomer="b" chirality="D" ring_form="p" linking_atom="1" link="4" type="glycosyl" repeat="n">
<residue name="P" anomer="null" chirality="null" ring_form="null" linking_atom="O" link="3" type="non-carbohydrate">
<residue name="Fuc" anomer="a" chirality="L" ring_form="p" linking_atom="1" link="O" type="glycosyl" repeat_end="1"/>
</residue>
<residue name="Neu5Ac" anomer="a" chirality="null" ring_form="p" linking_atom="2" link="6" type="glycosyl"/>
</residue>
</glycan>
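The nested residue elements in the example above can be read with any XML parser. As a minimal Python sketch, the code below parses that same example with the standard library and prints each residue indented by its nesting depth, together with its `linking_atom` and `link` attributes exactly as they appear in the XML. The function name `walk` is introduced only for this illustration, and no interpretation of the attribute values beyond what the DTD defines is implied.

```python
import xml.etree.ElementTree as ET

# The GLYDE 1.2 example from this page, reproduced verbatim.
GLYDE_EXAMPLE = """<?xml version="1.0"?>
<glycan>
<residue name="GlcNAc" anomer="b" chirality="D" ring_form="p" linking_atom="1" link="4" type="glycosyl" repeat="n">
<residue name="P" anomer="null" chirality="null" ring_form="null" linking_atom="O" link="3" type="non-carbohydrate">
<residue name="Fuc" anomer="a" chirality="L" ring_form="p" linking_atom="1" link="O" type="glycosyl" repeat_end="1"/>
</residue>
<residue name="Neu5Ac" anomer="a" chirality="null" ring_form="p" linking_atom="2" link="6" type="glycosyl"/>
</residue>
</glycan>"""


def walk(residue, depth=0):
    """Yield (depth, attributes) for a residue and all residues nested in it."""
    yield depth, residue.attrib
    for child in residue.findall("residue"):
        yield from walk(child, depth + 1)


root = ET.fromstring(GLYDE_EXAMPLE)
for top in root.findall("residue"):
    for depth, attrs in walk(top):
        print("  " * depth + f'{attrs["name"]} '
              f'(linking_atom={attrs["linking_atom"]}, link={attrs["link"]})')
```

A depth-first walk mirrors the way branches are nested in the XML: each child element is a residue attached to its parent element.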
Performance data
These are the performance data for some typical requests. The operation time was measured on the client side (100 Mbit LAN) as the interval between timestamps taken immediately before and after the call to the server function.
operation | average time, sec | server->client traffic, bytes | client->server traffic, bytes | data transferred |
---|---|---|---|---|
GLYDE2CSDB_conversion | 0.198 | 481 | 1533 | 1 string |
Substructure_search | 0.246 | 1120 | 562 | 11 IDs |
Exact_structure_search | 0.176 | 467 | 599 | 1 ID |
NMR_data_search | 0.261 | 1865 | 634 | 4 blocks (ID, structure, NMR spectrum, similarity) |
Direct_bibliography_search | 0.227 | 514 | 842 | 3 IDs |
Bibliography_search | 0.138 | 465 | 2121 | 1 ID |
Keyword_search | 0.143 | 1061 | 745 | 33 IDs |
Taxonomy_search | 0.086 | 4300 | 560 | 212 IDs |
Retrieve_all_data | 0.216 | 4994 | 543 | 1 block (all data) |
Retrieve_GLYDE | 0.187 | 2026 | 555 | 1 XML string |
Retrieve_PubMed | 0.184 | 3909 | 557 | 1 XML string |
BCSDB2JUPAC_conversion | 0.226 | 678 | 630 | 1 multiline string |
BCSDB2GLYDE_conversion | 0.180 | 1569 | 591 | 1 XML string |
C13_NMR_prediction | 0.190 | 589 | 1972 | 1 block (names, lineage, subspectra, free subspectra, consistencies, whole spectrum, total consistency, warnings) |