Run the unit test AccessionControllerTest#getAccessionsTest and observe that AccessionRefAspect is horribly slow:
WARN AccessionRefAspect:112 - Re-referencing AccessionRefs for 2000 accessions took 25291ms
AccessionRef is now embedded, so the code must check every single Dataset and Subset individually. Every time an accession record is persisted, it loads all the data, then streams and processes it for that one accession.
AccessionRefAspect must execute in a few milliseconds. It is a very important aspect and it affects the speed of uploading data to Genesys.
Ideally, the aspect would run only a single statement:
update accession_ref set accession_id = ? where instCode = ? and ((accenumb = ? and genus = ?) or (doi = ?))
We need direct access to accession references so that we can query and update them directly. The code should support updating references both for a single accession and in batch for a List of accessions.
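As a rough illustration of the single and batch update described here, the sketch below runs the statement quoted above through plain JDBC batching. It is only a sketch under assumptions: the class AccessionRefUpdater, the holder AccessionKey and the method name rereference are hypothetical names, not existing Genesys code, and the real aspect would presumably go through the project's own persistence layer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Genesys.
class AccessionRefUpdater {

    /** Hypothetical holder for the accession fields the update needs. */
    static final class AccessionKey {
        final long accessionId;
        final String instCode;
        final String acceNumb;
        final String genus;
        final String doi; // may be null

        AccessionKey(long accessionId, String instCode, String acceNumb,
                     String genus, String doi) {
            this.accessionId = accessionId;
            this.instCode = instCode;
            this.acceNumb = acceNumb;
            this.genus = genus;
            this.doi = doi;
        }
    }

    // The statement from the issue text, executed once per accession.
    static final String UPDATE_SQL =
        "update accession_ref set accession_id = ? "
        + "where instCode = ? and ((accenumb = ? and genus = ?) or (doi = ?))";

    /** Single update: delegates to the batch variant and returns the
     *  driver-reported update count for that one accession. */
    static int rereference(Connection conn, AccessionKey accession) throws SQLException {
        return rereference(conn, Collections.singletonList(accession))[0];
    }

    /** Batch update: binds one parameter set per accession and sends
     *  them to the database as a single JDBC batch. */
    static int[] rereference(Connection conn, List<AccessionKey> accessions)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
            for (AccessionKey a : accessions) {
                ps.setLong(1, a.accessionId);
                ps.setString(2, a.instCode);
                ps.setString(3, a.acceNumb);
                ps.setString(4, a.genus);
                if (a.doi == null) {
                    ps.setNull(5, Types.VARCHAR); // "doi = NULL" never matches
                } else {
                    ps.setString(5, a.doi);
                }
                ps.addBatch();
            }
            return ps.executeBatch();
        }
    }
}
```

Routing the single case through the batch method keeps one code path to maintain. Whether the batch reaches the database in one round trip depends on the JDBC driver and its configuration.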
I see two options and am looking for ideas.
Option 1: AccessionRef and make it abstract and convert dataset_accessions to two separate SubsetAccessionRef that extend AccessionRef. This way the primary key can remain on instcode, genus, species and dataset/subset ID. No data migration is required, only code changes.
Option 2: private Subset subset and private Dataset dataset (only one can be set). Liquibase needs to move data to the new table.
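To make the "only one can be set" rule of the second option concrete, here is a minimal plain-Java sketch. It is an assumption about how the class could look, not the actual Genesys entity: the nested Dataset and Subset classes are stand-ins for the real entities, the factory methods forDataset and forSubset are hypothetical, and the JPA mapping and Liquibase changeset are left out.

```java
// Sketch only: a single AccessionRef that belongs to either a Dataset or a Subset.
class AccessionRef {

    /** Stand-in for the real Dataset entity. */
    static final class Dataset {
        final String name;
        Dataset(String name) { this.name = name; }
    }

    /** Stand-in for the real Subset entity. */
    static final class Subset {
        final String name;
        Subset(String name) { this.name = name; }
    }

    private final Dataset dataset; // set when the ref belongs to a dataset
    private final Subset subset;   // set when the ref belongs to a subset
    private final String instCode;
    private final String acceNumb;
    private final String genus;

    AccessionRef(Dataset dataset, Subset subset,
                 String instCode, String acceNumb, String genus) {
        // Exactly one owner must be present: reject both-null and both-set.
        if ((dataset == null) == (subset == null)) {
            throw new IllegalArgumentException(
                "Exactly one of dataset or subset must be set");
        }
        this.dataset = dataset;
        this.subset = subset;
        this.instCode = instCode;
        this.acceNumb = acceNumb;
        this.genus = genus;
    }

    /** Hypothetical factory for a reference owned by a dataset. */
    static AccessionRef forDataset(Dataset dataset, String instCode,
                                   String acceNumb, String genus) {
        return new AccessionRef(dataset, null, instCode, acceNumb, genus);
    }

    /** Hypothetical factory for a reference owned by a subset. */
    static AccessionRef forSubset(Subset subset, String instCode,
                                  String acceNumb, String genus) {
        return new AccessionRef(null, subset, instCode, acceNumb, genus);
    }

    Dataset getDataset() { return dataset; }
    Subset getSubset() { return subset; }
    String getInstCode() { return instCode; }
    String getAcceNumb() { return acceNumb; }
    String getGenus() { return genus; }
}
```

The same rule could also be enforced in the database with a check constraint on the two foreign key columns, so that a direct SQL update cannot leave a row with both or neither owner set.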
SubsetController currently accepts a list of accession UUIDs, but it should accept a list of AccessionRefs (same as Dataset).