Identifiers in your crate
According to the JSON-LD spec, IRI's should be used to identify nodes. That said, the spec defines some other types of identifiers that are valid in JSON-LD but are not valid IRI's.
The RO Crate specification goes further in this regard.
For the sake of validation, this component conforms to both but the SHOULD's in the RO Crate spec are treated as MUST's.
What is valid
Here's the validateId function with comments explaining what the check is
export function validateId({ id, type }) {
if (!id) {
return { isValid: false };
}
// if it's the root descriptor - it's valid
if (id === "ro-crate-metadata.json") {
return { isValid: true };
}
if (type) {
// if type matches File or Dataset then whatever is provided is valid
type = isArray(type) ? type.join(", ") : type;
if (type.match(/file/i)) return { isValid: true };
if (type.match(/Dataset/i)) return { isValid: true };
}
// if there are spaces in the id - encode them
if (id.match(/\s+/)) {
id = id.replace(/\s+/g, "%20");
}
// @id is relative
if (id.match(/^\/.*/)) return { isValid: true };
// @id starting with . is valid
if (id.match(/^\..*/)) return { isValid: true };
// @id starting with # is valid
if (id.match(/^\#.*/)) return { isValid: true };
// @id with blank node is valid
if (id.match(/^\_:.*/)) return { isValid: true };
// arcp URI's are valid
if (id.match(/arcp:\/\/name,.*/)) return { isValid: true };
if (id.match(/arcp:\/\/uuid,.*/)) return { isValid: true };
if (id.match(/arcp:\/\/ni,sha-256;,.*/)) return { isValid: true };
// return { isValid: true };
// otherwise check that the id is a valid IRI
try {
let result = validateIriPkg.validateIri(id, validateIriPkg.IriValidationStrategy.Strict);
if (!result) {
// it's valid
return { isValid: true };
} else if (result?.message?.match(/Invalid IRI according to RFC 3987:/)) {
// otherwise
const message = `${result.message.replace(
/Invalid IRI according to RFC 3987:/,
"Invalid identifier"
)}. See https://github.com/describo/crate-builder-component/blob/master/README.identifiers.md for more information.`;
return { isValid: false, message };
}
} catch (error) {
return { isValid: false };
}
}
What happens when crates are loaded into the component
Every node is checked for a valid id and if they all are, then the crate is loaded.
As the component is ingesting each of the entities, it checks that it has an @type property and tries to set it accordingly. To do this it uses the @id of the entity and if it's a URL, then @type becomes URL
otherwise @type is set to Thing
.