Comet Shoreline Pipeline MVP - Offcycle Summer 2023

The goal of these two offcycle sprints in summer 2023 was to get the pipeline working for ingesting into Comet and publishing to Shoreline. We can now successfully ingest into Comet and publish to Shoreline!

What was the goal of the sprint?
The goal of these two summer 2023 sprints (local UCSD and Surfliner offcycle) was to build the pipeline to ingest public shapefile data into Comet and publish it to Shoreline.

  • Merge the updated M3 that supports geospatial objects and update the code for the new fields
  • Add Superskunk support for non-M3 fields and for any concatenated or modified fields needed for ingest into Shoreline
  • Ingest multiple files from a local campus location via Bulkrax
  • Provide access to original datasets from Shoreline
  • Set up a single-step mediated workflow in Comet to publish files to Shoreline
  • Resolve bounding box display issues
  • Mint Arks for published geospatial objects and use Ark in ID field
  • Create a Surfliner honeycomb team for review apps
  • Review and work on collection creation in Comet to support display in Shoreline

What is the milestone that this sprint is supporting?
This was an offcycle sprint to finalize a few elements that had been worked on in Comet 24 / Lark Sprint 11.

Accomplishments of the sprint:

  • The updated M3 supporting geospatial elements was merged, and the code was updated to support the new field names.
  • Superskunk was updated to handle non-M3 fields (Access Rights, Title) and any concatenation or modification fields (Language, Bounding Box and Geometry coordinates, shortened campus name) for ingest into Shoreline
  • Multiple files (PreservationFile and ServiceFile) can be ingested from a local campus location via Bulkrax
  • Original datasets can be downloaded from Shoreline
  • A single-step mediated workflow is available in Comet that supports review of an object before its files are published to Shoreline. Arks are minted during publishing and pushed to the ID field, which Shoreline uses.
  • Bounding box display issues were resolved
  • Created a Surfliner honeycomb team for review apps
  • Work was done on collection creation in Comet to support display in Shoreline

Did we do everything we set out to accomplish?
Mostly! We can ingest public shapefiles into Comet, review them, publish them, and have them display in Shoreline!!! Can I repeat this again? WE CAN INGEST INTO COMET AND HAVE THE METADATA DISPLAY AND DATA FILES AVAILABLE IN SHORELINE!!!! There is still some work to do on Collections, but getting data into Comet and pushed to a discovery platform was hugely successful!

What’s next?
Some work remains around collections, along with some fine-tuning of Bulkrax ingest. There's also work to finish supporting Collection relationships.

GitLab link:

Offcycle Shoreline https://gitlab.com/surfliner/surfliner/-/milestones/102#tab-issues

UCSD Shoreline Pipeline MVP https://gitlab.com/surfliner/surfliner/-/milestones/101#tab-issues