Importing images from GG
GGCE's file repository does not expose the attachments folder directly. Access to attachment bytes is controlled by the API, which enforces permission checks. This is a significant difference from GG, where attachments sit in a publicly readable location.
support#302 (closed) requests migration of image and document data from legacy GG to GGCE:
We currently have images and PDF files loaded in our standard GG DBs. We have migrated our DBs to GGCE for testing but our attachments are not showing.
GG documentation about "Locations of Attachment Files"
1 Each organization running GRIN-Global will have its own unique file server where the image and document files are stored. The GG administrator can assist with deleting the files if necessary.
The following information is intended primarily for Administrators. The full directory path breaks down into several parts:
Example: C:\inetpub\wwwroot\gringlobal\uploads\images
2 further says that a Method attachment would be located at https://npgsweb.ars-grin.gov/gringlobal/uploads/images/ for method_attach/RYE.AGRON.ABERDEEN.16/img_jca.JPG.
NOTE: GG always puts the images in the uploads\images directory under the GG site install directory. (You couldn't change that without altering the MT code and rebuilding.)
Links
The ..._attach tables can also contain links (URLs). These have category_code = 'LINK', while uploaded files use DOCUMENT and IMAGE.
Migrating attachments
Two mechanisms for migrating attachments are possible: (a) providing a base URL pointing to the GG server http...uploads/images folder and (b) providing a copy of the C:\inetpub\wwwroot\gringlobal\uploads\images folder to GGCE.
In both cases, the GGCE Administrator must provide the base path to this location, either in the form of a URL https://..../uploads/images or a filesystem path /path/to/uploads/images.
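Because the base path can be either a URL or a filesystem path, an importer has to tell the two apart before fetching anything. The following is a minimal sketch of one way to do that. The function name `base_path_kind` is hypothetical and is not taken from GGCE.

```python
from urllib.parse import urlparse


def base_path_kind(base_path: str) -> str:
    """Classify a base path as 'url' or 'filesystem'.

    Assumption: only http and https schemes count as URLs. Anything else,
    including Windows paths such as C:\\..., is treated as a filesystem path.
    """
    scheme = urlparse(base_path).scheme.lower()
    return "url" if scheme in ("http", "https") else "filesystem"
```

For example, `base_path_kind("/migration/images")` yields `"filesystem"`, while the npgsweb base URL quoted earlier yields `"url"`.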
Downloading bytes from GG website
If the GG server is accessible, providing the URL is the more convenient option, as there is no need to back up, transfer and make the folder accessible to GGCE.
The Administrator specifies the base path https://npgsweb.ars-grin.gov/gringlobal/uploads/images/ where attachments are hosted.
GGCE scans the ..._attach tables and downloads attachment bytes from the remote server for import into the file repository.
Migrating bytes from a folder
When dealing with a large number of attachments, providing direct access to the uploads/images folder is the faster option, as there is no HTTP overhead and the bytes are directly accessible for importing.
The backup is extracted on the host server and made available to the GGCE API as a Docker volume, for example with -v ./ggimages:/migration/images. This makes the folder available to GGCE at /migration/images.
The Administrator then specifies /migration/images as the base path, and GGCE scans the ..._attach tables and looks for attachment bytes in this folder for import into the file repository.
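As an illustration of the folder-based lookup, the sketch below reads the bytes for one attachment from a mounted folder. It is not GGCE's implementation: the function name `read_attachment` is hypothetical, and it assumes that backslashes in virtual_path (GG runs on Windows) have to be converted to forward slashes first.

```python
from pathlib import Path
from typing import Optional


def read_attachment(base_dir: str, virtual_path: str) -> Optional[bytes]:
    """Return the attachment bytes, or None when the file is not found."""
    # Normalize Windows separators and drop any leading slash so the
    # virtual path is always resolved relative to base_dir.
    relative = virtual_path.replace("\\", "/").lstrip("/")
    candidate = Path(base_dir) / relative
    if not candidate.is_file():
        return None
    return candidate.read_bytes()
```

A production importer would probably also reject virtual paths that escape the base folder (for example via `..`). That check is left out here to keep the sketch short.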
Ingesting bytes into GGCE file repository
The GGCE server provides a form where the Administrator:
- Selects one of the ..._attach tables from a dropdown
- Provides the base path to attachment data (either a URL or a filesystem path)
- Specifies the maximum number of errors that GGCE will tolerate before stopping the process
  - Start with 0 to force GGCE to stop immediately
  - Increase to 100
  - Any value < 0 instructs GGCE to continue migration regardless of errors
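The error-limit setting can be read as a simple threshold check. The sketch below shows one interpretation, which is an assumption and not confirmed GGCE behaviour: a limit of 0 stops at the first error, a limit of N tolerates N errors and stops on the next one, and any negative limit never stops. The name `should_stop` is hypothetical.

```python
def should_stop(error_count: int, max_errors: int) -> bool:
    """Decide whether the migration should stop after error_count errors."""
    if max_errors < 0:
        # Negative limit: continue regardless of errors.
        return False
    # Non-negative limit: stop once the tolerated number is exceeded.
    return error_count > max_errors
```

Under this reading, `should_stop(1, 0)` is true, `should_stop(100, 100)` is false, and `should_stop(101, 100)` is true.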
The GGCE server then:
- Scans the selected ..._attach table where category_code != 'LINK' AND repository_file_id IS NULL
- If attach.virtual_path is a valid HTTP URL, skips the record; links stay links.
- Constructs a path to each attachment: basepath + attach.virtual_path
  - Replace all \ with / in virtual_path, replace // with /
- Gets the bytes (either with an HTTP request or by opening the file)
- If bytes are found, adds them to the file repository and populates file_repository_id of the attachment record in the database
- If bytes are not found, logs the problem and continues until the maximum number of errors is reached.
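The steps above can be sketched as a small loop over in-memory records. This is a hedged illustration, not GGCE's code: the `Attachment` class, `build_location`, `migrate`, and the `fetch` and `store` callables are all hypothetical stand-ins for the database, the HTTP or file access, and the file repository. The field name `repository_file_id` follows the scan condition. The error-limit handling assumes a non-negative limit stops the run once it is exceeded.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Attachment:
    id: int
    category_code: str
    virtual_path: str
    repository_file_id: Optional[int] = None


def build_location(base_path: str, virtual_path: str) -> str:
    """Join base path and virtual path after cleaning the virtual path."""
    cleaned = virtual_path.replace("\\", "/")   # replace all \ with /
    cleaned = re.sub(r"/{2,}", "/", cleaned)    # collapse // to /
    # Only the virtual path is cleaned, so the // in https:// survives.
    return base_path.rstrip("/") + "/" + cleaned.lstrip("/")


def migrate(
    records: Iterable[Attachment],
    base_path: str,
    fetch: Callable[[str], Optional[bytes]],
    store: Callable[[bytes], int],
    max_errors: int,
) -> List[str]:
    """Import attachment bytes and return the list of logged problems."""
    errors: List[str] = []
    for rec in records:
        # Scan condition: not a LINK and not yet in the file repository.
        if rec.category_code == "LINK" or rec.repository_file_id is not None:
            continue
        # A virtual_path that is itself an HTTP URL stays a link.
        if re.match(r"^https?://", rec.virtual_path, re.IGNORECASE):
            continue
        location = build_location(base_path, rec.virtual_path)
        data = fetch(location)
        if data is None:
            errors.append(f"{rec.id}: no bytes at {location}")
            if 0 <= max_errors < len(errors):
                break  # error limit exceeded
            continue
        rec.repository_file_id = store(data)
    return errors
```

Passing `fetch` and `store` as callables keeps the loop independent of whether bytes come from an HTTP request or from a mounted folder.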
accession_inv_attach
Sample data:
accession_inv_attach_id | inventory_id | virtual_path | thumbnail_virtual_path | GG URL |
---|---|---|---|---|
22155 | 43694 | AIA/Zea/14/6114/SH18-797_DIM.pdf | AIA/Zea/14/6114/SH18-797_DIM_thumbnail.pdf | https://mgb.cimmyt.org/gringlobal/uploads/images/AIA/Zea/14/6114/SH18-797_DIM.pdf |
12373 | 6114 | AIA/Zea/14/6114/CIMMYTMA 6118 - Reference Inventory.jpg | AIA/Zea/14/6114/CIMMYTMA 6118 - Reference Inventory_thumbnail.jpg | https://mgb.cimmyt.org/gringlobal/uploads/images/AIA/Zea/14/6114/CIMMYTMA%206118%20-%20Reference%20Inventory.jpg |
The base URL for accession_inv_attach on https://mgb.cimmyt.org is https://mgb.cimmyt.org/gringlobal/uploads/images.
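The second sample row shows that a virtual_path may contain spaces, which appear percent-encoded in the GG URL. A minimal sketch of building such a URL from the base URL and virtual_path follows. The helper name `attachment_url` is hypothetical; `urllib.parse.quote` is the Python standard library call that does the encoding and leaves `/` untouched by default.

```python
from urllib.parse import quote


def attachment_url(base_url: str, virtual_path: str) -> str:
    """Build a download URL, percent-encoding unsafe characters in the path."""
    relative = virtual_path.replace("\\", "/").lstrip("/")
    return base_url.rstrip("/") + "/" + quote(relative)


url = attachment_url(
    "https://mgb.cimmyt.org/gringlobal/uploads/images",
    "AIA/Zea/14/6114/CIMMYTMA 6118 - Reference Inventory.jpg",
)
print(url)
```

With these inputs the result matches the GG URL column of the sample row, with each space encoded as `%20`.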