ETT-1459: Record first ingest date #185
* add first ingest date column to feed_audit table
* record item in feed_audit at the end of collate
* remove record_audit functionality from LocalPairtree (now unused except in development); emit warning (could record to feed_storage for consistency if we want instead, but we aren't really using it..?)
* testing with storage classes in collate is a bit messy because of the distinction between depositing to the repo and reading back from the repo
* add additional logging options in Stage (need to DRY out though)
* additional logging for collate (should log duration; see ETT-824)
* add some notes towards ETT-1687
* Mock depositing item for collate tests with mocked storage
This addresses two issues:
* We are no longer using symlinks to deposit material into the repository.
* When we read from the repo, we just care about the root of the repo (just like e.g. babel apps reading from the repo), not about any symlinks, etc.

Specific changes:
* remove LinkedPairtree
* remove "repository" key in config & references to link_dir / obj_dir; replace with a "repository_root" key
* TempDirs keeps track of what it creates; callers can create additional temp dirs that will get cleaned up at the end of a test.
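The TempDirs behaviour described in the last bullet could look roughly like the following. This is a minimal sketch, not the actual test helper: the package `Sketch::TempDirs` and its `dir` and `cleanup` methods are hypothetical names chosen for illustration, and only core modules (`File::Temp`, `File::Path`) are assumed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the test helper; not the real TempDirs class.
package Sketch::TempDirs;

use File::Temp qw(tempdir);
use File::Path qw(remove_tree);

sub new {
    my $class = shift;
    # dirs: every directory this object has created so far
    return bless { dirs => [] }, $class;
}

# Create a temp dir under the system temp location and remember it,
# so that cleanup() can remove it later.
sub dir {
    my ($self, $label) = @_;
    my $dir = tempdir("sketch-$label-XXXXXX", TMPDIR => 1);
    push @{ $self->{dirs} }, $dir;
    return $dir;
}

# Remove everything this object created, then forget about it.
sub cleanup {
    my $self = shift;
    remove_tree(@{ $self->{dirs} });
    $self->{dirs} = [];
    return;
}

package main;

my $tmpdirs = Sketch::TempDirs->new;
my $ingest  = $tmpdirs->dir('ingest');
my $extra   = $tmpdirs->dir('extra');    # an additional dir requested by a caller

my $existed = ((-d $ingest) && (-d $extra)) ? 1 : 0;

$tmpdirs->cleanup;
my $removed = (!(-e $ingest) && !(-e $extra)) ? 1 : 0;

print "created: $existed, removed: $removed\n";
```

Because the object owns the list of directories, a test only has to call one cleanup method at the end, however many extra directories it asked for along the way.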
  `id` varchar(30) NOT NULL,
  `sdr_partition` tinyint(4) DEFAULT NULL,
  `zip_size` bigint(20) DEFAULT NULL,
+ `first_ingest_date` datetime NULL DEFAULT CURRENT_TIMESTAMP,
This is the change for recording first ingest date; the application side doesn't need to handle it directly at all beyond making sure that something is recorded in feed_audit
  my $self = shift;

- return $self->{volume}->get_zip_path(get_config('staging', 'zipfile')) . '.gpg';
+ return $self->{volume}->get_zip_path(get_config('staging', 'zipfile')) . "-$self->{name}.gpg";
This avoids collisions with encrypted zips left over from other storages. They should get cleaned up but don't always in practice.
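To illustrate the collision being avoided, here is a minimal sketch rather than the HTFeed code itself. The package `Sketch::Storage`, its constructor, the `zip_path` field, the method name `encrypted_zip_path`, and the storage names are all hypothetical; only the `-$self->{name}.gpg` suffix is taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a storage class; not the real HTFeed::Storage API.
package Sketch::Storage;

sub new {
    my ($class, %args) = @_;
    # name:     identifies this storage
    # zip_path: path of the staged, unencrypted zip (shared between storages)
    return bless { name => $args{name}, zip_path => $args{zip_path} }, $class;
}

# Suffix the encrypted zip with the storage name so that two storages
# staging the same volume never write to the same .gpg file.
sub encrypted_zip_path {
    my $self = shift;
    return $self->{zip_path} . "-$self->{name}.gpg";
}

package main;

my $zip    = '/tmp/staging/test.zip';
my $first  = Sketch::Storage->new(name => 'objectstore', zip_path => $zip);
my $second = Sketch::Storage->new(name => 'pairtree',    zip_path => $zip);

print $first->encrypted_zip_path,  "\n";
print $second->encrypted_zip_path, "\n";
```

With the old fixed `.gpg` suffix both objects would return the same path, so a leftover encrypted zip from one storage could be picked up by another.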
See inline suggestion re logging.
I'm pleased with all the cleanup happening here.
There still appears to be a brittle test but that's something that's happened sporadically to me for a long time, probably out of scope for this. I tend to suspect a race condition because it's a pretty simple test.
# Failed test 'HTFeed::Storage::ObjectStore with encryption enabled stores the mets and encrypted zip'
# at t/storage_object_store.t line 194.
Re: the brittle test, I haven't seen it locally or on GitHub. I can keep an eye out for it though.
8b17af6 to a7bd638
I will wait to merge & deploy; I'd like to get the schema updated in production and see about populating info using the existing audit stuff.
This change moves the functionality for recording items in feed_audit to the end of the Collate stage rather than a particular storage.

It also takes the opportunity to clean up related code, including the references to link_dir and obj_dir.

See the comments on each commit for more detail.
I had looked into options for rolling back failed deposits to S3, but the ideas I had didn't work out (see ETT-1483).