CSDB usage: advanced features

This section describes details of additional CSDB user operations, which are available from the Extras and Maintenance sections of the menu. For basic help, refer to the CSDB usage help.
 

Content:

Extras:
  • RDF feed
  • Data submission
  • Structure translation
  • Feedback

Maintenance:
  • Data export
  • Data import
  • Structure validation
  • Database initialization
  • Update of lists
  • Data removal
    RDF feed

    The main CSDB data can be exported in the Resource Description Framework (RDF), a semantic web data model designed for the conceptual description of information. The data are exported as a set of triples (a serialized RDF feed) and use the "GlycoRDF" ontology developed during Biohackathon 2012 (Toyama, Japan) and Biohackathon 2013 (Dalian, China). The current version of GlycoRDF is documented in the following resources: the ontology visualized online, participants & publications, the ontology in OWL, and the documentation.

    To generate an RDF feed, select an appropriate serialization format (Turtle, RDF/XML, RDF/JSON, N-Triples) from the drop-down list at the bottom of the ID search form and click Make RDF. The web service that produces RDF feeds is located at http://csdb.glycoscience.ru/integration/make_rdf.php and accepts its parameters in the URL query string.

    As an example, run http://csdb.glycoscience.ru/integration/make_rdf.php?id_list=21720&mode=record&clean=0&db=bacterial&format=turtle and look at the RDF feed. Please note that the service is designed to make the URLs in the complete RDF feed valid and is not intended for bulk generation of data. If you need the whole of CSDB in RDF, download the RDF dump (zipped TTL files, generated on 29 Nov 2013).
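
    The example request above can also be assembled programmatically. The following is a minimal sketch, assuming only the parameter names and values visible in that example URL (id_list, mode, clean, db, format); the function name make_rdf_url and its default values are illustrative and not part of CSDB. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Address of the RDF web service, as given in the text above.
BASE_URL = "http://csdb.glycoscience.ru/integration/make_rdf.php"


def make_rdf_url(id_list, mode="record", clean=0, db="bacterial", fmt="turtle"):
    """Build a request URL for the RDF web service.

    The parameter names are taken from the example URL in this section;
    the defaults simply repeat the values used there (an assumption,
    not documented service defaults).
    """
    params = {
        "id_list": id_list,
        "mode": mode,
        "clean": clean,
        "db": db,
        "format": fmt,
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Reproduces the example URL for record 21720.
    print(make_rdf_url("21720"))
```

    Because the service is not intended for bulk generation of data, such a helper is best used for individual records only.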

    Data submission

    Users can submit their own data to CSDB on the data submission page. Please fill in the fields with published data only. The meaning of each field is explained beside it, and additional tools are provided to help with author spelling, bibliographic, structural and taxonomic references, structure encoding and NMR spectra assignment encoding. The obligatory fields are shown in bold.

    After submission, an automatic error check is performed and the report is returned in a separate window, so you can correct the data and submit again. If either the compound or the publication is already present in the database in another context, cross-references are generated automatically. In this case, you are advised to follow these references to check how the existing data correlate with what you are submitting. After successful validation of the submitted record, a CSDB dump file is displayed and sent to the CSDB staff for manual curation, approval and upload.

    If you are familiar with the CSDB dump format and wish to submit massive blocks of data, the best way is to send your dump file to Philip Toukach and wait for an error report before the next step.

    Structure translation

    The structure translator converts structure descriptions to and from other glycan encoding languages.
    The Translate from GlycoCT section explains how to perform translation in an automated way (via API) and provides a converter for user input. You can copy and paste a GlycoCT condensed code into the text box and press Convert to obtain a CSDB linear encoding.

    The Translate from CSDB section exports structures in four formats (GlycoCT condensed, GlycoCT XML, GlydeII XML, LinUCS) with or without monomeric namespace translation by the selected engine (GlycomeDB, MonosaccharideDB). This tool is implemented on the Deutsches Krebsforschungszentrum server.

    Feedback

    The feedback form is intended for contacting the CSDB staff. Please provide your name and email and select a contact reason.

    Pressing the Submit feedback button sends the data to the CSDB team. If your feedback requires an answer, please allow 10 days.

    Data export

    This password-protected feature is located in the Maintenance subdivision and is intended for exporting CSDB data in a record-oriented form (text CSDB dump file; see also Dump format). To perform an export, specify the range of CSDB record IDs and press the Make dump button. Use commas to separate individual IDs, e.g. 1,2, and hyphens to specify ID ranges, e.g. 1-10. If the Warn... checkbox is checked, warnings about missing IDs will be included in the dump. The data are output to the browser, from where you can copy them.

    Please note that our server cannot handle massive exports. A better way to export large parts (or all) of the database is to request a dump using the feedback form.
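
    The ID syntax described above (commas between IDs, hyphens for ranges) can be illustrated with a short sketch. This is only one way to interpret such a specification, written for illustration; the function name parse_id_spec is hypothetical and the code is not the parser used by CSDB.

```python
def parse_id_spec(spec):
    """Expand a specification such as "1,2,5-7" into a list of integer IDs.

    Commas separate items; an item of the form "a-b" denotes the
    inclusive range from a to b. Illustrative only, not CSDB code.
    """
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas and surrounding spaces
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(part))
    return ids


if __name__ == "__main__":
    print(parse_id_spec("1,2"))      # two individual IDs
    print(parse_id_spec("1-10"))     # one inclusive range
    print(parse_id_spec("1,2,5-7"))  # both forms mixed
```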

    Data import

    This password-protected feature is located in the Maintenance subdivision and is intended for importing data. The data should be provided as a file in a record-oriented form (text CSDB dump file; see also Dump format). If the Clear database checkbox is checked, the whole database except the residue-related tables and the journal list will be cleared before import. Please use this option only to re-import a complete dump of CSDB.

    Pressing the Import button starts the validation procedure for every record in the provided dump and imports the data if no errors are found. By default, the validator reports results in a verbose form, i.e. its output contains critical error messages, non-critical warnings or successful import reports for every record. The Suppress non-critical warnings checkbox can be used to simplify the report by excluding warnings that do not impede the data import. The generated report is output to the browser and can be used for subsequent correction of errors in the dump.
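
    The effect of the Suppress non-critical warnings checkbox on the report can be pictured with a small sketch. The message tuples, severity labels and the function name build_report below are assumptions made for illustration; the actual report format produced by CSDB is not described in this section.

```python
def build_report(messages, suppress_warnings=False):
    """Format validator messages into report lines.

    `messages` is a list of (record_id, severity, text) tuples, where
    severity is "error", "warning" or "ok". This layout is hypothetical.
    """
    lines = []
    for record_id, severity, text in messages:
        if suppress_warnings and severity == "warning":
            continue  # warnings do not impede import, so they may be hidden
        lines.append(f"record {record_id}: {severity.upper()}: {text}")
    return lines


if __name__ == "__main__":
    # Invented sample messages, for demonstration only.
    sample = [
        (1, "ok", "imported"),
        (2, "warning", "optional field is empty"),
        (3, "error", "structure could not be parsed"),
    ]
    for line in build_report(sample, suppress_warnings=True):
        print(line)
```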

    Structure validation

    The structure validation tool is provided separately from the validator built into the import feature because more than half of the errors in user dumps are errors in the structure encoding. You should check structures with this tool before any further operations with dumps. Please provide a CSDB dump (see also Dump format) containing structure(s), select an appropriate level of strictness using the checkboxes below and press the Check button to display a report. The validator checks the content of the ST1 and ST2 fields in the dump for correct syntax, spelling of monomeric names, and chemical and topological validity of the structure.

    If you uncheck the Display unexplained alias errors checkbox, error messages for structures containing an alias that is not explained after the double slash (Subst etc.) will not be included in the report. Unless the Display non-critical parsing warnings checkbox is checked, warnings that do not prevent the structure encoding from parsing are not displayed.

    If a structure contains errors that can be corrected automatically (selection of main chain, side chain order etc.), they are normalized, and a link to the dump with normalized structures is provided above the report. Please use this pre-processed dump as a basis for subsequent error corrections. Reports on structure normalizations are output together with other messages if the Display structure auto-correction reports checkbox is checked.

    Database initialization

    This password-protected Maintenance feature is intended for database initialization. It rebuilds the database structure and clears its content. If the short mode of initialization is selected, the residue-related tables and the journal list remain intact.

    The checkboxes Add residue information... and Add journal information... instruct the engine to pre-fill the corresponding tables with service data after initialization. These data are located in the RESIDUES.TXT and JOURNALS.TXT files, respectively. If you have edited these files (e.g. added a journal or a residue), please use the forms at the bottom of the page to upload them to the server and then reinitialize the database in the full mode with the Add... checkboxes checked.

    Update of lists

    Selecting the Update lists item from the menu forces the database engine to regenerate the service files used in the web front-end: number of structures and publications, lists of genera, species, strains, organisms, journals etc. This procedure is called automatically after any operation that affects the database content. A manual list update can be used to monitor an import process started in another browser window.

    Data removal

    This password-protected feature is located in the Maintenance subdivision and allows removal of specified records from the database. To delete records, specify the range of the CSDB record IDs and press the Delete button. If the Warn... checkbox is checked, warnings about trying to delete IDs that are not present in the database will be included in the report.

    If an associated publication is not present in any other records, the publication, corresponding article ID, authors and other related data are also removed from the database.
    If an associated compound is not present in any other records, the structure, corresponding compound ID, residue instances and rows in the connection table are also removed from the database.
    If an associated organism is not present in any other records, the organism, corresponding organism ID, genus, species and other related data are also removed from the database.
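
    The removal rules above amount to deleting the requested records and then dropping any publication, compound or organism that no remaining record refers to. The following in-memory sketch illustrates that logic under simplifying assumptions: each record is a dictionary with one publication, one compound and one organism key, and the function name delete_records is hypothetical. It does not reflect the actual CSDB database schema.

```python
def delete_records(records, ids_to_delete):
    """Delete records and report entities that became unreferenced.

    `records` maps a record ID to a dict with the keys "publication",
    "compound" and "organism" (a simplified, hypothetical layout).
    Returns (remaining_records, orphans, missing_ids).
    """
    to_delete = set(ids_to_delete)
    # IDs requested for deletion but absent from the database (cf. Warn...).
    missing = sorted(to_delete - set(records))
    remaining = {rid: rec for rid, rec in records.items() if rid not in to_delete}

    orphans = {}
    for kind in ("publication", "compound", "organism"):
        removed = {records[rid][kind] for rid in to_delete if rid in records}
        still_used = {rec[kind] for rec in remaining.values()}
        # Only entities that no remaining record refers to are dropped.
        orphans[kind] = sorted(removed - still_used)
    return remaining, orphans, missing


if __name__ == "__main__":
    # Invented sample data, for demonstration only.
    db = {
        1: {"publication": "P1", "compound": "C1", "organism": "O1"},
        2: {"publication": "P1", "compound": "C2", "organism": "O1"},
        3: {"publication": "P2", "compound": "C3", "organism": "O2"},
    }
    remaining, orphans, missing = delete_records(db, [2, 3, 99])
    print(sorted(remaining))
    print(orphans)
    print(missing)
```

    In this sample, publication P1 and organism O1 survive because record 1 still refers to them, while the entities used only by records 2 and 3 are reported as removable.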
