Commit afb329c6 authored by Matija Obreza

Documentation

parent 7705555e
......@@ -19,6 +19,10 @@ In this manual, all URLs are pointing to the Genesys sandbox environment at http
include::sections/security.adoc[]
include::sections/accession-api.adoc[]
include::sections/api-accession.adoc[]
include::sections/api-crop.adoc[]
== Acknowledgements
Special thanks go to Luca Matteis and Richard Bruskiewich from Bioversity International who have contributed
to the original documentation of the APIs.
Accession passport data basics
==============================
December 2015: Documentation commit {buildNumber}
:revnumber: {projectVersion}
:doctype: book
:toc: left
:toclevels: 5
:icons: font
:numbered:
:source-highlighter: pygments
:pygments-css: class
:pygments-linenums-mode: table
[[intro]]
Introduction
------------
This manual contains basic information on commonly used standards for accession
documentation and formats for data exchange. It documents Genesys extensions to
the standards.
https://www.genesys-pgr.org[Genesys PGR] (Plant Genetic Resources) is a free online global portal accessible at
link:$$https://www.genesys-pgr.org$$[www.genesys-pgr.org]
that allows the exploration of the world’s crop diversity through a single website. The
data published on Genesys follows the <<mcpd,Multi-crop Passport Descriptors>> standard.
The manual introduces
* <<wiews,FAO WIEWS>> database and <<wiews-instcode,WIEWS Institute codes>>
* FAO/Bioversity <<mcpd,Multi-crop Passport Descriptors>>
* <<mcpd-genesys,Genesys extensions>> to MCPD
* <<other-standards,Other standards>> relevant to accession documentation
include::sections/accedoc.adoc[]
include::sections/wiews.adoc[]
include::sections/mcpd.adoc[]
include::sections/iso.adoc[]
......@@ -67,3 +67,11 @@ include::sections/backup.adoc[]
include::sections/recovery.adoc[]
:leveloffset: 0
include::sections/wiews.adoc[]
include::sections/mcpd.adoc[]
include::sections/security.adoc[]
include::sections/api-accession.adoc[]
include::sections/api-crop.adoc[]
[[accedoc]]
== Accession documentation in genebanks
Collections of PGRFA material in genebanks document at least the following for each accession:
* <<accedoc-accenumb,Accession number>>
* Acquisition date (`ACQDATE`), the date when the accession entered the collection
* <<accedoc-other,Other accession identifiers>>
* <<accedoc-tax,Taxonomy>>
* <<accedoc-storage,Storage and maintenance>>
A single accession is usually maintained as several individual *inventories* or lots, where each inventory
follows different management policies and is maintained in different conditions (e.g. cryo and in vitro,
or base and active collection).
Inventory management is a topic of genebank collection management and is not further described here.
[[accedoc-accenumb]]
=== Accession number
The accession number is the unique identifier assigned to the material as it enters
the collection. This identifier generally has three components:
Prefix + Sequence number + Suffix
The *prefix* is commonly used to differentiate between different crop collections
maintained by the genebank.
.Some prefixes used by http://www.iita.org[IITA] genebank
* `TMe` Cassava _Manihot esculenta_ collection
* `TVSu` Bambara groundnut _Vigna subterranea_ collection
* `TZm` Maize _Zea mays_ collection
The *sequence number* is assigned manually or by a computer system to ensure there are
no duplicates. Some institutes prefer to zero-pad the number (e.g. `00000102`).
The *suffix* allows differentiating samples of the same original material. The
exact meaning of the suffix is different for every institute.
[cols="1,1,1,2", options="header"]
.Example accession numbers
|===
|Prefix|Sequence number|Suffix|Accession number
|TMe|419||TMe-419
|TVSu|13||TVSu-13
|===
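The composition rule above can be sketched in a few lines of Python. The function name `compose_accession_number`, the `-` separator (taken from the examples in the table) and the way a suffix is appended are assumptions for illustration only; each institute follows its own convention.

[source,python]
----
def compose_accession_number(prefix, sequence, suffix="", pad=0, sep="-"):
    """Join prefix, sequence number and optional suffix into one identifier."""
    number = str(sequence).zfill(pad)  # pad=0 leaves the number unpadded
    parts = [prefix, number]
    if suffix:
        parts.append(suffix)  # assumed: the suffix is joined with the same separator
    return sep.join(parts)


print(compose_accession_number("TMe", 419))   # TMe-419
print(compose_accession_number("TVSu", 13))   # TVSu-13
----

Passing `pad=8` would produce a zero-padded sequence number such as `00000102`.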
[[accedoc-other]]
=== Other accession identifiers
Material enters the collection through collecting missions, from breeding programs,
or by acquisition from other institutes. In each case, the material will already have some
identifier assigned by the collector, breeder or other institute.
*Accession name* is the vernacular name of the material and is commonly captured by
the collector or assigned by the breeder.
[[accedoc-coll]]
==== Collected material
Genebank accessions obtained through collecting missions should maintain data about the site and dates of the
collecting and collector information.
[[accedoc-bred]]
==== Breeders' material
Lines developed by breeding programs of the institute may be included in the collection. Information provided by the breeders
should include the pedigree or other ancestral information of the material, along with names and identifiers used by the breeding
program and the codes and names of institutes that developed the material.
[[accedoc-acq]]
==== Acquisitions
Material coming from other institutes and genebanks must be accompanied by accession passport data
as documented in the source genebank.
NOTE: *Country of origin* is the country where the material was collected or bred, not the country of the source genebank.
Accession documentation should capture any identifiers provided by the source institute. This data
allows for validation and curation of passport data between the genebanks and allows researchers
to obtain material from either collection.
[[accedoc-tax]]
=== Taxonomy
Accession genus, species, species author, subtaxon and subtaxon authority are usually
known, but are subject to change after expert identification or change in taxonomic system.
https://npgsweb.ars-grin.gov/gringlobal/taxon/abouttaxonomy.aspx[GRIN Taxonomy for Plants] and the Mansfeld database can serve for validating accession taxa.
[[accedoc-storage]]
=== Storage and maintenance
Ex situ genebanks maintain PGR material as seed, in the field, in vitro, cryo or in DNA collections.
Inventories (lots) of one accession may be managed by different methods (e.g. seed and cryo).
See <<mcpd-storage,Storage>> in the MCPD standard on how to capture multiple types of storage.
[[chApiAccession]]
== Managing Passport Data
Passport data is based on the FAO Multi-Crop Passport Descriptors <<mcpd2>> format.
Accession records are *upserted*, meaning that when the matching accession record
. exists, it will be updated
. does not exist, a new record will be created
Accession data in the database will be updated with whatever data is provided in the
request JSON.
=== Accession identity
Prior to full adoption of Permanent Unique Identifiers for Germplasm, accessions could be
identified by the holding institute code (INSTCODE) and the accession number (ACCENUMB).
Genebanks maintaining two or more crop collections would sometimes reuse the same
accession number across collections, so that the number is unique only within one collection.
Genesys uses the *instCode*, *acceNumb* and *genus* triplet to uniquely identify an
accession in an institute:
[source,json,linenums]
----
{
"instCode": "NGA039", <1>
"acceNumb": "TMp-123", <2>
"genus": "Musa" <3>
}
----
<1> Holding institute code (INSTCODE)
<2> Accession number (ACCENUMB)
<3> Genus (GENUS)
=== JSON data model
The JSON data model of accession passport data closely follows <<mcpd2, MCPD>> definitions.
By default, institutes in Genesys are configured to "Use unique accession numbers within the institute".
The accession JSON object must provide two identifying elements: `instCode` and `acceNumb`.
In cases where accession numbers are not unique within the institute, `genus` is used to identify
the unique accession within the institute. Then the Accession JSON object must always provide three
identifying elements: `instCode`, `acceNumb` and `genus`.
All other fields are optional.
[source,json,linenums]
----
{
"instCode": "XYZ111",
"acceNumb": "M12345",
"genus": "Musa",
"species": "acuminata",
"spauthor": "Colla",
"subtaxa": "var. sumatrana",
"subtauthor": "(Becc.) Nasution",
"orgCty": ...,
"acqDate": "20010301",
"mlsStat": true,
"inTrust": false,
"available": true,
"historic": false,
"storage": [10, 20],
"sampStat": 200,
"duplSite": "BEL084",
"bredCode": ...,
"ancest": ....,
"remarks": [ "remark1", "remark2" ],
"acceUrl": "https://my-genebank.org/accession/1",
"geo": {
... <1>
},
"coll": {
... <2>
}
}
----
<1> JSON object with geographic data
<2> JSON object with collecting data
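The identification rule described in this section can be sketched as follows. The function name `accession_identity` and the `unique_acce_numbs` flag are hypothetical; the flag stands in for the institute setting "Use unique accession numbers within the institute".

[source,python]
----
def accession_identity(record, unique_acce_numbs=True):
    """Return the tuple of fields that identifies an accession record."""
    keys = ["instCode", "acceNumb"]
    if not unique_acce_numbs:
        keys.append("genus")  # genus disambiguates reused accession numbers
    missing = [key for key in keys if not record.get(key)]
    if missing:
        raise ValueError("missing identifying fields: " + ", ".join(missing))
    return tuple(record[key] for key in keys)


record = {"instCode": "XYZ111", "acceNumb": "M12345", "genus": "Musa"}
print(accession_identity(record))                           # two elements
print(accession_identity(record, unique_acce_numbs=False))  # three elements
----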
=== Clearing existing values
To reset or clear an existing value in the accession passport data, it should be provided
as `null`. Not providing a field means the field in the database should not be modified.
[source,json,linenums]
----
{
"instCode": "NGA039",
"acceNumb": "TMp-123",
"genus": "Musa",
"orgCty": null <1>
}
----
<1> Country of origin of accession is cleared by sending a `null` value.
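When the request body is generated by a program, the difference between an omitted field and a `null` field has to survive serialization. A minimal Python sketch, in which the helper name `build_update` and its `clear` argument are illustrative assumptions:

[source,python]
----
import json


def build_update(inst_code, acce_numb, genus, clear=(), **changes):
    """Build an accession object; fields named in `clear` are sent as null."""
    record = {"instCode": inst_code, "acceNumb": acce_numb, "genus": genus}
    record.update(changes)  # only fields passed here are included at all
    for field in clear:
        record[field] = None  # Python None is serialized as JSON null
    return record


payload = json.dumps(build_update("NGA039", "TMp-123", "Musa", clear=["orgCty"]))
print(payload)
----

Fields that are neither changed nor cleared never appear in the payload, so their stored values are left untouched.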
=== Insert or update accessions
REST endpoint URL `/api/v0/acn/{instCode}/upsert` allows for inserting new accessions
or updating existing records in Genesys. It accepts a JSON array of Accession JSON objects.
Sending an array allows batches of 50 or 100 accessions to be submitted in one call, which reduces
HTTP overhead and improves performance.
NOTE: Only `instCode` and `acceNumb` are required (and, in some cases, `genus`).
NOTE: If a property is set to `null`, the existing value will be removed from the database.
NOTE: The server will return an error when the `instCode` of any JSON object does not match the `instCode` in the URL!
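Batching can be prepared on the client before any request is made. The sketch below only builds the endpoint path and the JSON bodies; the function name `upsert_batches` and the default batch size of 50 are assumptions, and sending the requests with an authenticated HTTP client is left out.

[source,python]
----
import json


def upsert_batches(inst_code, accessions, batch_size=50):
    """Yield (path, JSON body) pairs, one pair per batch of accession objects."""
    # Check locally what the server would otherwise reject.
    for accession in accessions:
        if accession.get("instCode") != inst_code:
            raise ValueError("instCode mismatch: %r" % accession.get("instCode"))
    path = "/api/v0/acn/%s/upsert" % inst_code
    for start in range(0, len(accessions), batch_size):
        yield path, json.dumps(accessions[start:start + batch_size])


records = [{"instCode": "NGA039", "acceNumb": "TMe-%d" % i} for i in range(1, 121)]
for path, body in upsert_batches("NGA039", records):
    print(path, len(json.loads(body)))
----

With 120 records and a batch size of 50, this prepares three request bodies holding 50, 50 and 20 accessions.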
=== Deleting accessions
With the introduction of permanent identifiers for accession records in Genesys we have
also introduced the *Accession Archive*. The Archive holds passport data for accession records
that have been deleted from the active database.
REST endpoint URL `/api/v0/acn/{instCode}/delete` accepts an array of `instCode`, `acceNumb`, `genus` triplets
and deletes the corresponding accession records from Genesys. The *DELETE* permission is required for this operation.
NOTE: The delete operation will fail if C&E data exists for any of the listed accessions.
.Delete 3 accessions from active database
[source,http,linenums]
----
POST /api/v0/acn/SYR002/delete
[{
"instCode": "SYR002",
"acceNumb": "12345",
"genus": "Vicia"
}, {
"instCode": "SYR002",
"acceNumb": "12345",
"genus": "Vicia"
}, {
"instCode": "SYR002",
"acceNumb": "IG 1",
"genus": "Vicia"
}]
----
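A request body of this shape can be derived from full accession records. In the sketch below the helper name `delete_payload` is hypothetical, and dropping repeated triplets is a client-side precaution, not documented server behaviour.

[source,python]
----
import json

TRIPLET_KEYS = ("instCode", "acceNumb", "genus")


def delete_payload(records):
    """Reduce records to unique instCode/acceNumb/genus triplets, keeping order."""
    seen = set()
    triplets = []
    for record in records:
        key = tuple(record[name] for name in TRIPLET_KEYS)
        if key not in seen:
            seen.add(key)
            triplets.append(dict(zip(TRIPLET_KEYS, key)))
    return json.dumps(triplets)


records = [
    {"instCode": "SYR002", "acceNumb": "12345", "genus": "Vicia", "species": "faba"},
    {"instCode": "SYR002", "acceNumb": "12345", "genus": "Vicia"},
    {"instCode": "SYR002", "acceNumb": "IG 1", "genus": "Vicia"},
]
print(delete_payload(records))
----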
[bibliography]
- [[[mcpd2]]] Alercia, A; Diulgheroff, S; Mackay, M.
http://www.bioversityinternational.org/e-library/publications/detail/faobioversity-multi-crop-passport-descriptors-v2-mcpd-v2/[FAO/Bioversity Multi-Crop Passport Descriptors V.2]. 2012.
[[chApiAccession]]
[[chApiCrop]]
== Managing Crop data
......@@ -6,6 +6,11 @@ Genesys maintains a database of crops and crop groups (e.g. forages). In additio
description, each crop defines a list of taxonomic rules that determine which taxonomies are
included (or excluded) in the group.
Crops and crop groups are referred to and identified by the crop's *short name*. The short name
placeholder in documentation below is marked by `{shortName}`. The short name should have no spaces
and it should contain US-ASCII letters and digits only (a-z, A-Z, 0-9).
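A client can check a candidate short name before calling the API. This sketch reads the rule strictly as "ASCII letters and digits only"; the function name `is_valid_short_name` is an assumption, and the server may apply different checks.

[source,python]
----
import re

# One or more US-ASCII letters or digits, nothing else.
SHORT_NAME = re.compile(r"[A-Za-z0-9]+")


def is_valid_short_name(name):
    """Return True when the name has no spaces or non-ASCII characters."""
    return SHORT_NAME.fullmatch(name) is not None


print(is_valid_short_name("musa"))        # True
print(is_valid_short_name("sweet potato"))  # False
----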
[NOTE]
.Crop Taxonomic rules
=====================================================================
......@@ -126,10 +131,36 @@ include::{snippets}/crop-create/request-fields.adoc[]
The response is a single crop record as stored on the server.
==== `curl` example
.Example request to register a new crop
include::{snippets}/crop-create/curl-request.adoc[]
=== Localization of crop title and description
The `i18n` field of the JSON crop object is a string containing a JSON-encoded, two-level
dictionary. The first-level keys are `name` (for the name field)
and `description` (for the description field); the second-level keys are language tags,
which in the example below are two-letter ISO 639-1 codes.
For example:
[source,json,linenums]
----
{
"name": {
"en": "Musa",
"es": "Musa",
"ru": "Муса",
"zh": "穆萨"
},
"description": {
"en": "Bananas and plantains",
"es": "Los bananos y plátanos",
"ru": "Бананы и бананы",
"zh": "香蕉和大蕉"
}
}
----
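Because `i18n` holds a string and not a nested object, the dictionary has to be serialized before it is placed in the crop object. A minimal Python sketch; the `shortName` key in the crop object is an assumed field name used only to make the example complete.

[source,python]
----
import json

translations = {
    "name": {"en": "Musa", "es": "Musa"},
    "description": {
        "en": "Bananas and plantains",
        "es": "Los bananos y plátanos",
    },
}

crop = {
    "shortName": "musa",  # assumed field name, for illustration only
    # Serialize the dictionary first, so that `i18n` carries a string.
    "i18n": json.dumps(translations, ensure_ascii=False),
}

body = json.dumps(crop, ensure_ascii=False)

# Reading the value back takes two decoding steps.
decoded = json.loads(json.loads(body)["i18n"])
print(decoded["description"]["es"])
----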
=== Updating taxonomic rules
......@@ -166,38 +197,3 @@ This will remove the crop and crop rules from the system.
.Deleting a crop
include::{snippets}/crop-delete/curl-request.adoc[]
== Managing Passport Data
Accession records are *upserted*, meaning that when the matching accession record
. exists, it will be updated
. does not exist, a new record will be created
Accession data in the database will be updated with whatever data is provided in the
request JSON.
TIP: If you want to clear or un-set a value, upsert it as *null*.
`curl` Call
And this thing HTTP request
include::{snippets}/crop-create/http-request.adoc[]
Request fields
include::{snippets}/crop-create/request-fields.adoc[Kaboom]
HTTP Response
include::{snippets}/crop-create/http-response.adoc[]
Response fields
include::{snippets}/crop-create/response-fields.adoc[]
\ No newline at end of file
[[other-standards]]
== Other relevant standards
[[iso-3166]]
=== ISO-3166 Country codes
https://en.wikipedia.org/wiki/ISO_3166[ISO-3166] standard defines 'Codes for the representation of
names of countries and their subdivisions'.
https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3[ISO-3166-1 alpha-3] codes are three-letter country codes. The
Wikipedia page contains the listing of valid country codes.
Genesys uses http://download.geonames.org/export/dump/countryInfo.txt as the source of ISO-3166 country codes.
[[un-m49]]
=== UN M.49
UN defines standard country or area codes and geographical regions for statistical use:
* http://unstats.un.org/unsd/methods/m49/m49.htm
* http://unstats.un.org/unsd/methods/m49/m49alpha.htm
[[chSecurityModel]]
== Security model
[[chSecurity]]
== Security
Access to selected resources in Genesys is protected and user permissions are checked before
any API action is executed. Each organization contributing data to Genesys will have
......@@ -11,10 +11,6 @@ Permission to access and manage data for the organization is granted by
helpdesk@genesys-pgr.org upon request. Please contact helpdesk@genesys-pgr.org with the list
of WIEWS codes of institutes you wish to manage.
[[chOAuth]]
== Security and OAuth
To access resources with the APIs described in this manual, you will first need to
create a user account. The simplest way is to https://sandbox.genesys-pgr.org/google/login[use your Google+ account];
alternatively, you can https://sandbox.genesys-pgr.org/registration[create an account manually].
......@@ -101,6 +97,31 @@ or include it in the request URL as a query string parameter:
$ curl 'https://sandbox.genesys-pgr.org/api/v0/me?access_token=OAUTH-ACCESS-TOKEN'
----
=== Using the refresh token
OAuth access tokens have a fairly short lifetime. When an access token expires, the
refresh token can be used to obtain a new access token. The refresh token is returned as
part of the JSON response when the verification code is used to obtain the access token:
[source,json]
----
{
"access_token": "28d96a4e-9a31-479b-abc8-17ee1e8c9906",
"token_type": "bearer",
"refresh_token": "2583fd78-bd88-4c2b-afc0-fb231b37d95f",
"expires_in": 43199,
"scope": "read write"
}
----
The refresh token must be securely persisted and can be used to request a new access token
when the access token expires:
----
$ curl 'https://sandbox.genesys-pgr.org/oauth/token?grant_type=refresh_token&client_id=CLIENTID&client_secret=SECRET&redirect_uri=oob&refresh_token=OAUTH-REFRESH-TOKEN'
----
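The same request URL can be assembled in code, together with a check on whether the current access token is about to expire. This is a sketch under assumptions: the helper names and the 60-second safety margin are illustrative, and `CLIENTID`, `SECRET` and `OAUTH-REFRESH-TOKEN` are the placeholders used in the `curl` call.

[source,python]
----
from urllib.parse import urlencode

TOKEN_URL = "https://sandbox.genesys-pgr.org/oauth/token"


def needs_refresh(token_response, issued_at, now, margin=60):
    """True when the token expires within `margin` seconds of `now`."""
    return now >= issued_at + token_response["expires_in"] - margin


def refresh_url(client_id, client_secret, refresh_token):
    """Build the token endpoint URL for the refresh_token grant."""
    query = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": "oob",
        "refresh_token": refresh_token,
    })
    return TOKEN_URL + "?" + query


token = {"expires_in": 43199}
print(needs_refresh(token, issued_at=0, now=43150))
print(refresh_url("CLIENTID", "SECRET", "OAUTH-REFRESH-TOKEN"))
----

`urlencode` also escapes any reserved characters in the client secret or token, which a hand-built query string would not.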
== Client errors
......
[[wiews]]
== FAO WIEWS
The http://www.fao.org/wiews-archive/wiews.jsp[World Information and Early Warning System (WIEWS)]
on Plant Genetic Resources for Food and Agriculture (PGRFA) was established by FAO
as a worldwide dynamic mechanism to foster information exchange among Member Countries and as an instrument for the periodic
assessment of the State of the World's PGRFA.
The FAO WIEWS database contains basic information about institutes working with PGRFA. The data includes
full names, acronyms, website links and contact information.
Genesys regularly updates the list of institutes from the FAO WIEWS database and makes them accessible
at https://www.genesys-pgr.org/wiews/active.
NOTE: This data cannot be directly managed through Genesys; changes must be applied to the WIEWS database.
[[wiews-instcode]]
=== WIEWS Institute Codes
The FAO WIEWS code of an institute consists of the 3-letter <<iso-3166,ISO 3166-1 alpha-3>> country code of
the country where the institute is located plus a number (e.g. COL001, USA1004).
The Multi-Crop Passport Descriptors standard relies on WIEWS codes.
The automated import of institute data allows Genesys to present individual pages for
genebanks registered in FAO WIEWS database.
[cols="1,2", options="header"]
.Direct access to genebank pages using WIEWS code
|===
|WIEWS Code
|Genesys URL
|COL001
|https://www.genesys-pgr.org/wiews/COL001
|NGA039
|https://www.genesys-pgr.org/wiews/NGA039
|===
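The code format and the URL pattern in the table can be combined into a small helper. The regular expression is an assumption derived from the examples (three uppercase letters followed by digits), and it does not check that the code is actually registered in WIEWS.

[source,python]
----
import re

# Three uppercase letters (country code) followed by one or more digits.
WIEWS_CODE = re.compile(r"[A-Z]{3}[0-9]+")


def genesys_institute_url(code):
    """Return the Genesys page URL for a well-formed WIEWS institute code."""
    if WIEWS_CODE.fullmatch(code) is None:
        raise ValueError("not a WIEWS institute code: %r" % code)
    return "https://www.genesys-pgr.org/wiews/" + code


print(genesys_institute_url("COL001"))
print(genesys_institute_url("NGA039"))
----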
=== Obtaining a WIEWS code
Institute codes can be generated online by national FAO WIEWS administrators (WIEWS@fao.org) or
can be requested by filling in the form at http://www.fao.org/wiews-archive/wiewspage.jsp?show=newuserdialogMCPD.jsp
=== Inactive WIEWS codes
The WIEWS code of an institute may change. In that case, the record is marked as inactive and it
will refer to the newly assigned code. Genesys will render a message that the institute record
is archived and provide a link to the new code:
.https://www.genesys-pgr.org/wiews/ALB017
image::wiews-archived.png[role="text-center"]