How it works

ProcessArchiveReader turns a bucket and key into a typed object by combining an Avro schema-on-read strategy with S3 object metadata that the producer side (the jEAP Process Archive Service) writes when an artifact is archived.

The read flow

  1. The object is fetched from S3 together with its user metadata (GetObject for the bytes, HeadObject for the metadata). When a version is supplied, the S3 versionId is used for both calls.
  2. The writer schema is located through the object's schema-file-key metadata entry, fetched as a separate S3 object, and parsed with Avro's Schema.Parser.
  3. A SpecificDatumReader decodes the binary payload, resolving the writer schema against the reader schema embedded in the requested Avro-generated class. Avro schema resolution makes compatible evolution (added fields with defaults, etc.) transparent.
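
The resolution rule in step 3 can be illustrated with a toy model. The sketch below is not the Avro API and not the reader's actual code: ResolutionSketch, ReaderField and resolve are hypothetical names, records are modelled as plain maps, and a null default stands for "no default" (real Avro distinguishes the two). It only shows the three outcomes for record fields: writer-only fields are ignored, reader-only fields take their default, and a reader-only field without a default is an error.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of record resolution between a writer and a reader schema; not the Avro API. */
final class ResolutionSketch {

    /** A reader-side field: its name and an optional default (null means "no default" here). */
    static final class ReaderField {
        final String name;
        final Object defaultValue;

        ReaderField(String name, Object defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    /**
     * Builds the record the reader sees from the record the writer produced.
     * Writer-only fields are dropped, reader-only fields take their default,
     * and a reader-only field without a default is rejected.
     */
    static Map<String, Object> resolve(Map<String, Object> written, List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (ReaderField field : readerFields) {
            if (written.containsKey(field.name)) {
                result.put(field.name, written.get(field.name));
            } else if (field.defaultValue != null) {
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException("No value and no default for field: " + field.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> written = new LinkedHashMap<>();
        written.put("id", "42");
        written.put("legacyFlag", true); // unknown to the reader, so it is dropped

        List<ReaderField> readerFields = List.of(
                new ReaderField("id", null),
                new ReaderField("status", "UNKNOWN")); // added later, with a default

        System.out.println(resolve(written, readerFields)); // prints {id=42, status=UNKNOWN}
    }
}
```

In the real reader this work is done by Avro itself once the SpecificDatumReader has both schemas; the sketch only makes the "added fields with defaults" case concrete.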

S3 object layout

Each archived artifact relies on two S3 objects and on object metadata:

| Element | Source | Meaning |
| --- | --- | --- |
| Object payload | bucket + key | The binary Avro-encoded artifact |
| schema-file-key metadata | object user metadata | Key of the S3 object that holds the writer schema (.avsc) |
| Writer schema object | bucket + schema-file-key | The Avro schema the payload was written with |
| is_encrypted metadata | object user metadata | true when the payload is encrypted (see encrypted artifacts doc) |
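
To make the relationship between the two objects concrete, the following sketch models a bucket as an in-memory map and follows the schema-file-key entry from the payload to the writer schema object. It makes no AWS SDK calls; LayoutSketch, StoredObject, writerSchemaFor and the example keys are hypothetical, and treating a missing is_encrypted entry as "not encrypted" is an assumption of this sketch rather than documented behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the layout above; no AWS SDK calls are made. */
final class LayoutSketch {

    static final String SCHEMA_FILE_KEY = "schema-file-key";
    static final String IS_ENCRYPTED = "is_encrypted";

    /** One stored object: its bytes plus its user metadata. */
    static final class StoredObject {
        final byte[] bytes;
        final Map<String, String> metadata;

        StoredObject(byte[] bytes, Map<String, String> metadata) {
            this.bytes = bytes;
            this.metadata = metadata;
        }
    }

    /** Follows the payload's schema-file-key entry to the writer schema object. */
    static StoredObject writerSchemaFor(Map<String, StoredObject> bucket, String payloadKey) {
        StoredObject payload = bucket.get(payloadKey);
        if (payload == null) {
            throw new IllegalArgumentException("No object at key: " + payloadKey);
        }
        String schemaKey = payload.metadata.get(SCHEMA_FILE_KEY);
        if (schemaKey == null || !bucket.containsKey(schemaKey)) {
            throw new IllegalStateException("Writer schema not found for: " + payloadKey);
        }
        return bucket.get(schemaKey);
    }

    /** Assumption: a missing is_encrypted entry is treated as "not encrypted". */
    static boolean isEncrypted(StoredObject payload) {
        return Boolean.parseBoolean(payload.metadata.get(IS_ENCRYPTED));
    }

    public static void main(String[] args) {
        Map<String, StoredObject> bucket = new HashMap<>();
        // Hypothetical keys and contents, chosen only for this example.
        bucket.put("schemas/order-v1.avsc", new StoredObject(
                "{\"type\":\"string\"}".getBytes(StandardCharsets.UTF_8), Map.of()));
        bucket.put("archive/order-1", new StoredObject(
                new byte[] {0x02, 0x41},
                Map.of(SCHEMA_FILE_KEY, "schemas/order-v1.avsc", IS_ENCRYPTED, "false")));

        StoredObject schema = writerSchemaFor(bucket, "archive/order-1");
        System.out.println(new String(schema.bytes, StandardCharsets.UTF_8));
        System.out.println(isEncrypted(bucket.get("archive/order-1")));
    }
}
```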

Versions

readArtifact(type, bucket, key) reads the current version of the object. The overload readArtifact(type, bucket, key, version) passes the value through to S3 as the object versionId, so it returns a specific historical version from a version-enabled bucket.
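
The difference between the two overloads can be sketched with an in-memory versioned store. This is an illustration only: VersionSketch and its methods are hypothetical, the type and bucket parameters are left out for brevity, and the version identifiers "v1" and "v2" are made up (S3 generates its own version IDs). Having the short form delegate with a null version is an assumption of this sketch, not a statement about how ProcessArchiveReader is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory model of a version-enabled bucket; not the real reader or the AWS SDK. */
final class VersionSketch {

    // key -> (versionId -> content), with versions kept in upload order
    private final Map<String, LinkedHashMap<String, String>> objects = new LinkedHashMap<>();

    void put(String key, String versionId, String content) {
        objects.computeIfAbsent(key, k -> new LinkedHashMap<>()).put(versionId, content);
    }

    /** Mirrors the call without a version: the current version is returned. */
    String read(String key) {
        return read(key, null);
    }

    /** Mirrors the call with a version: that exact version is returned. */
    String read(String key, String versionId) {
        LinkedHashMap<String, String> versions = objects.get(key);
        if (versions == null) {
            throw new IllegalArgumentException("No object at key: " + key);
        }
        if (versionId == null) {
            String latest = null;
            for (String content : versions.values()) {
                latest = content; // the last uploaded version wins
            }
            return latest;
        }
        String content = versions.get(versionId);
        if (content == null) {
            throw new IllegalArgumentException("No version " + versionId + " for key: " + key);
        }
        return content;
    }

    public static void main(String[] args) {
        VersionSketch store = new VersionSketch();
        store.put("orders/1", "v1", "first");
        store.put("orders/1", "v2", "second");
        System.out.println(store.read("orders/1"));       // prints second
        System.out.println(store.read("orders/1", "v1")); // prints first
    }
}
```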

Error handling

All failures surface as an unchecked ProcessArchiveReaderException. Its static factory methods map the distinct failure modes:

| Factory method | Cause |
| --- | --- |
| readException | Avro AvroTypeException: writer and reader schemas are incompatible |
| ioException | IOException while decoding the payload |
| writerSchemaNotReadableException | S3 error while fetching the writer schema object |
| writerSchemaParseException | The writer schema could not be parsed |
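
The pattern of one unchecked exception type with a static factory per failure mode might look roughly like the sketch below. SketchReaderException is a hypothetical stand-in, not the real ProcessArchiveReaderException: the factory names are taken from the table, but their parameters and messages are assumptions, and only two of the four factories are shown.

```java
import java.io.IOException;

/** Hypothetical stand-in showing the "one unchecked type, many factories" pattern. */
final class SketchReaderException extends RuntimeException {

    private SketchReaderException(String message, Throwable cause) {
        super(message, cause);
    }

    // Parameter lists and messages are assumptions made for this sketch.
    static SketchReaderException ioException(String key, IOException cause) {
        return new SketchReaderException("I/O error while decoding " + key, cause);
    }

    static SketchReaderException writerSchemaParseException(String schemaKey, RuntimeException cause) {
        return new SketchReaderException("Could not parse writer schema " + schemaKey, cause);
    }
}

final class ErrorHandlingDemo {

    /** Runs a read and reports the failure mode; the single catch covers every factory. */
    static String describe(Runnable read) {
        try {
            read.run();
            return "ok";
        } catch (SketchReaderException e) {
            return e.getMessage() + " (cause: " + e.getCause().getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        String result = describe(() -> {
            throw SketchReaderException.ioException("archive/order-1", new IOException("unexpected end"));
        });
        System.out.println(result); // prints I/O error while decoding archive/order-1 (cause: IOException)
    }
}
```

Because the exception is unchecked, callers are not forced to catch it; a caller that wants to tell the failure modes apart can inspect the message or the cause, as describe does here.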