How to publish phenotypic data on the Genesys Catalog
Genesys is the global portal for plant germplasm accessions. It currently holds information on 3,611,454 accessions from 481 institutes. The Genesys Catalog [ADD LINK] links existing phenotypic information to germplasm accessions. To date, it contains information on XX phenotypic datasets for XX crops.
To publish phenotypic datasets in the Genesys Catalog, your institute must be registered in Genesys and must designate a person to publish data on its behalf [ADD LINK]. You can check whether your institute is already listed in Genesys at https://www.genesys-pgr.org/wiews/active. If you find discrepancies or outdated information in your institute’s description, please contact WIEWS, FAO’s information system on Plant Genetic Resources for Food and Agriculture.
Follow the steps below to publish phenotypic datasets in the Genesys Catalog:
1. Identify dataset
A phenotypic dataset is a good candidate for publishing when it meets the following considerations:
- Perceived usefulness of the dataset: Defined by the regional importance of the crop(s), the significance of the traits measured, the perceived interest of the dataset for scientists and policy makers, and whether the dataset is one of a series produced in ongoing characterization/evaluation trials.
- Completeness of the dataset: Determined by the number of accessions and traits originally measured and documented in the trial(s).
- Documentation status: Whether the data are already documented and digitized, whether descriptor lists have been developed for the traits measured, and whether the analysis of the characterization/evaluation trials has been completed to facilitate documentation.
- Availability of the dataset: The willingness of scientists to share the dataset and to contribute to cleaning and verifying the data.
- Effort required: Given the four considerations above, the estimated inputs and time needed to establish workflows and prepare pilot datasets for publication on Genesys and elsewhere.
2. Prepare the dataset file
Digitize your data (if necessary), store it in a table-like format (XLS, XLSX, CSV, TXT) that makes the information in the phenotypic dataset easy to use, and standardize the critical data fields. Phenotypic datasets registered in the Genesys Catalog are discoverable through the accessions and traits recorded in them. For this you will need to:
- Identify and standardize the unique identifier of the accessions listed in the phenotypic dataset (e.g., PUID, ACCENUMB, DOI; see MCPD V.2.1).
- Standardize the traits recorded in the dataset by using an existing controlled vocabulary or crop ontology.
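The two standardization tasks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TRAIT_MAP` entries, the standardized trait names, and the whitespace rule for ACCENUMB are hypothetical and not prescribed by Genesys. Use the identifiers exactly as registered in Genesys and the vocabulary or crop ontology adopted for your crop.

```python
import csv
import io

# Hypothetical mapping from local column headers to controlled-vocabulary
# trait names. Replace with terms from the vocabulary you actually use.
TRAIT_MAP = {
    "plant ht (cm)": "plant_height_cm",
    "days to flower": "days_to_flowering",
}


def standardize_accenumb(value):
    """Trim the identifier and collapse internal runs of whitespace."""
    return " ".join(value.split())


def standardize_rows(csv_text, id_column="ACCENUMB"):
    """Return rows with a cleaned identifier and mapped trait names.

    Columns without an entry in TRAIT_MAP keep their original header,
    so no data is dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        out = {}
        for header, value in row.items():
            value = value or ""
            if header == id_column:
                out[id_column] = standardize_accenumb(value)
            else:
                key = TRAIT_MAP.get(header.strip().lower(), header)
                out[key] = value.strip()
        cleaned.append(out)
    return cleaned


# Made-up example data, not taken from a real dataset.
raw = "ACCENUMB,Plant Ht (cm),Days to flower\n  TMB  123 ,85.2,61\nTMB 124,90.1,58\n"
rows = standardize_rows(raw)
print(rows[0])
```

Keeping unmapped columns under their original header makes it easy to spot traits that still need a controlled-vocabulary term.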
3. Compile complementary data
Plant phenotypes result from the interaction between genotypes and the environment they are exposed to. Complementary information on the agronomic management and environmental conditions of the experiment in which the phenotypic data were obtained is therefore highly valuable.
Below you will find a list of parameters that can be reported as complementary data, as described in Ćwiek-Kupczyńska et al., 2016. Make sure you have this information on hand when completing the dataset metadata:
- Greenhouse or growth chamber: Whether carbon dioxide is controlled or uncontrolled
- Average carbon dioxide during the light and dark period
- Air humidity (moisture)
- Daily photon flux (light intensity)
- Light quality
- Rooting medium: Aeroponics, hydroponics (water based, solid-media based), or soil type (sand, clay, peat, mixed, etc.)
- Greenhouse: Container type, volume, height, other dimensions, number of plants per container.
- Field: Plot size, sowing density
- Soil parameters: Soil penetration strength, water retention capacity, organic matter content, porosity, rooting medium temperature.
- For hydroponics: Composition, concentration.
- For soil: Extractable N content per unit ground area before fertiliser added; type and amount of fertiliser added per container or sq. m; concentration of P and other nutrients before start of the experiment; extractable N content per unit ground area at the end of the experiment.
- Irrigation type: From top, from bottom, or drip irrigation
- Volume (L) and frequency of water added per container or sq. m
- For soil: Range in water potential (MPa)
- Concentration of Na, Cl and Mg in the water used for irrigation
- For soils and hydroponics: Electrical conductivity (dS/m)
- If sample was submerged and emerged: Depth, time
- Water temperature
- Tidal phase
- Description of interacting organisms (pathogens, mutualists, herbivores, endophytes, etc.)
Treatments (if available, these can be provided as a separate file; this information is usually available)
- Examples of treatments include: Seasonal environment, air temperature regime, soil temperature regime, antibiotic regime, chemical administration, disease status, fertilizer regime, fungicide regime, gaseous regime, gravity, growth hormone regime, herbicide regime, mechanical treatment, mineral nutrient regime, humidity regime, non-mineral nutrient regime, radiation (light, UV-B, X-ray) regime, rainfall regime, salt regime, watering regime, water temperature regime, standing water regime, pesticide regime, pH regime, other perturbation.
- Blocking: Block ID, sub-block ID, sub-sub-block ID, superblock ID, row ID, column ID, other ID
- Replication: Biological replication, technical replication, experimental unit
For more information on documenting the experiment's conditions, see MIAPPE.
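One way to keep these parameters on hand is to collect them in a single structured record per experiment and save it as a machine-readable file. The sketch below is an assumption-laden illustration: the class name and field names are hypothetical, cover only a few of the parameters listed above, and are not MIAPPE or Genesys field names.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ExperimentConditions:
    """Hypothetical record for a subset of the complementary data."""
    growth_facility: str                      # e.g. "field", "greenhouse"
    rooting_medium: Optional[str] = None      # e.g. "soil: clay"
    plot_size_m2: Optional[float] = None
    irrigation_type: Optional[str] = None     # e.g. "drip"
    treatments: List[str] = field(default_factory=list)
    block_ids: List[str] = field(default_factory=list)
    biological_replicates: Optional[int] = None


def to_json(conditions):
    """Serialize the record, leaving out parameters that were not recorded."""
    recorded = {
        key: value
        for key, value in asdict(conditions).items()
        if value not in (None, [])
    }
    return json.dumps(recorded, indent=2, sort_keys=True)


# Made-up values for illustration only.
conditions = ExperimentConditions(
    growth_facility="field",
    plot_size_m2=4.5,
    irrigation_type="drip",
    treatments=["watering regime"],
    block_ids=["B1", "B2"],
    biological_replicates=3,
)
print(to_json(conditions))
```

Omitting unrecorded parameters, rather than writing empty values, keeps the file honest about what was actually measured.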
4. Register dataset and complete metadata
This step is performed online [ADD LINK]. You will be asked to complete the metadata of your dataset and to upload the dataset to register it. Find recommendations to complete metadata here.
Our system will register your dataset and assign it a unique identifier. The system will also perform a two-step validation to check:
- That the accessions reported in the dataset are registered in Genesys, and
- That the traits in your dataset follow a controlled vocabulary.
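Before uploading, you may want to run a similar check locally. The sketch below is not the Genesys implementation. It assumes you already hold a list of your accession identifiers as registered in Genesys and a list of accepted trait names; the function name, both sets, and the example rows are hypothetical.

```python
def precheck(rows, registered_accessions, controlled_traits, id_column="ACCENUMB"):
    """Return (unknown_accessions, unknown_traits), each sorted.

    rows is a list of dicts, one per accession, keyed by column header.
    """
    # Step 1: identifiers in the dataset that are not in the registered set.
    unknown_accessions = sorted(
        {row[id_column] for row in rows} - set(registered_accessions)
    )
    # Step 2: trait columns that are not in the controlled vocabulary.
    trait_columns = {col for row in rows for col in row if col != id_column}
    unknown_traits = sorted(trait_columns - set(controlled_traits))
    return unknown_accessions, unknown_traits


# Made-up example data.
rows = [
    {"ACCENUMB": "TMB 123", "plant_height_cm": "85.2", "leaf colour": "green"},
    {"ACCENUMB": "TMB 999", "plant_height_cm": "90.1", "leaf colour": "dark green"},
]
registered = {"TMB 123", "TMB 124"}
vocabulary = {"plant_height_cm", "days_to_flowering"}

missing_ids, missing_traits = precheck(rows, registered, vocabulary)
print(missing_ids)     # ['TMB 999']
print(missing_traits)  # ['leaf colour']
```

An empty result for both lists suggests the dataset is ready for the online validation, although the system's own checks remain authoritative.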
5. Publish dataset
Once you finish registering your dataset, you will receive a summary of all the metadata provided. You must verify this information and approve it before the dataset is published in the Genesys Catalog.
[Provide/link to an exemplary dataset published in the Genesys Catalog].
Before publishing your institute's datasets in the Genesys Catalog, make sure you meet the following requirements:
- Phenotypic dataset prepared in machine-readable format (XLS, XLSX, CSV).
- Accessions’ passport data registered in Genesys.
- Accessions’ unique identifiers and traits standardized.
- Complementary datasets compiled in electronic and machine-readable formats.
- Phenotypic dataset and complementary datasets uploaded to the repository. Dataset unique identifier obtained.
- Metadata completed, reviewed and approved.