@jeff-zucker
Member

As discussed, this PR includes a make-core.js script which removes the following from the data:

  • unWantedType = ['Ontology','Specification','ClassOfProduct'];
  • unWantedCategory = ['ResearchPaper','Primer','OtherTechResource'];
  • unWantedPredicate = ['conformsTo','hasDependencyOn','developer','platform','siloId','siloUsername','member'];
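
A minimal sketch of how such a filter could work; this is not the actual make-core.js. It assumes the data has already been parsed into plain `{ subject, predicate, object }` objects using local names, that removing a type or category means dropping the whole record, and that the type and category predicates are called `type` and `category`. The function name `makeCore` and the sample records are hypothetical.

```javascript
// Hypothetical sketch, not the real make-core.js.
const unWantedType = ['Ontology', 'Specification', 'ClassOfProduct'];
const unWantedCategory = ['ResearchPaper', 'Primer', 'OtherTechResource'];
const unWantedPredicate = ['conformsTo', 'hasDependencyOn', 'developer',
  'platform', 'siloId', 'siloUsername', 'member'];

// Assumption: triples are { subject, predicate, object } with local names only.
function makeCore(triples) {
  // Collect subjects whose type or category is on an unwanted list.
  const dropped = new Set(
    triples
      .filter(t =>
        (t.predicate === 'type' && unWantedType.includes(t.object)) ||
        (t.predicate === 'category' && unWantedCategory.includes(t.object)))
      .map(t => t.subject)
  );
  // Keep triples of surviving subjects, minus the unwanted predicates.
  return triples.filter(t =>
    !dropped.has(t.subject) && !unWantedPredicate.includes(t.predicate));
}

// Made-up sample data for illustration.
const sample = [
  { subject: 'app1', predicate: 'type', object: 'App' },
  { subject: 'app1', predicate: 'name', object: 'Demo' },
  { subject: 'app1', predicate: 'developer', object: 'someone' },
  { subject: 'spec1', predicate: 'type', object: 'Specification' },
  { subject: 'spec1', predicate: 'name', object: 'Some spec' },
  { subject: 'paper1', predicate: 'type', object: 'LearningResource' },
  { subject: 'paper1', predicate: 'category', object: 'Primer' },
];
const core = makeCore(sample);
console.log(core); // only the type and name triples of app1 remain
```

The sketch does not remove triples in other records that point at a dropped record; a real script would need to decide how to handle those.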

Those items have been removed from both catalog-data.ttl and catalog-shacl.shce.

I have left in some predicates we may want to remove later, including softwareStackIncludes, programmingLanguage, and clientID.

This PR also adds validateTypes.js to the test scripts. It ensures that:

  • all records have an rdf:type
  • all rdf:types match our shapes
  • all predicates match our shapes
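
The three checks could be sketched roughly as follows; this is not the actual validateTypes.js. It assumes triples are plain `{ subject, predicate, object }` objects and that the shapes have been reduced to a map from each known type to the predicates its shape declares. The names `validateTypes` and `shapes`, and the sample types and predicates, are hypothetical.

```javascript
// Hypothetical sketch, not the real validateTypes.js.
// Assumption: each known type maps to the predicates its shape declares.
const shapes = new Map([
  ['App', ['name', 'description', 'homepage']],
  ['Person', ['name', 'webid']],
]);

function validateTypes(triples, shapes) {
  const errors = [];
  // Group triples by subject so each record is checked as a whole.
  const bySubject = new Map();
  for (const t of triples) {
    if (!bySubject.has(t.subject)) bySubject.set(t.subject, []);
    bySubject.get(t.subject).push(t);
  }
  for (const [subject, ts] of bySubject) {
    const types = ts.filter(t => t.predicate === 'rdf:type').map(t => t.object);
    // Check 1: every record has an rdf:type.
    if (types.length === 0) {
      errors.push(`${subject}: no rdf:type`);
      continue;
    }
    // Check 2: every rdf:type matches a shape.
    const known = types.filter(type => shapes.has(type));
    for (const type of types) {
      if (!shapes.has(type)) errors.push(`${subject}: type ${type} matches no shape`);
    }
    if (known.length === 0) continue;
    // Check 3: every other predicate is declared by one of the record's shapes.
    const allowed = new Set(known.flatMap(type => shapes.get(type)));
    for (const t of ts) {
      if (t.predicate !== 'rdf:type' && !allowed.has(t.predicate)) {
        errors.push(`${subject}: predicate ${t.predicate} is not in any matching shape`);
      }
    }
  }
  return errors;
}

// Made-up sample data: one valid record and one failure per check.
const errors = validateTypes([
  { subject: 'ex:a', predicate: 'rdf:type', object: 'App' },
  { subject: 'ex:a', predicate: 'name', object: 'A' },
  { subject: 'ex:b', predicate: 'name', object: 'B' },
  { subject: 'ex:c', predicate: 'rdf:type', object: 'Widget' },
  { subject: 'ex:d', predicate: 'rdf:type', object: 'App' },
  { subject: 'ex:d', predicate: 'nmae', object: 'D' },
], shapes);
console.log(errors);
```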

@elf-pavlik - I presume you will want to keep the items I have removed from the core in your own data and shacl. A diff of this version and the previous version of those files should show the changes.

@elf-pavlik elf-pavlik removed their request for review September 5, 2025 02:58
@elf-pavlik
Member

> @elf-pavlik - I presume you will want to keep the items I have removed from the core in your own data and shacl. A diff of this version and the previous version of those files should show the changes.

Yes, I can get whatever I need from the git history.

@jeff-zucker
Member Author

I have modified the SKOS categories to a) edit out the removed predicates, b) break the software libraries into narrower categories, and c) add "Related Tools/Resources" to the "LearningResources" category to accommodate things like the Practitioners PeerTube page, LOV, etc.

If you'd like to see these changes in action, I've created https://jeff-zucker.solidcommunity.net/catalog-core/ with all of the changes in this PR.

Member

Can this validation be done using a SHACL engine rather than a custom script? Note that

"test:shacl": "rdf-ext-cli --shacl-url ./catalog-shacl.ttl ./catalog-data.ttl --pretty --output-prefix sh=http://www.w3.org/ns/shacl# --output-prefix skos=http://www.w3.org/2004/02/skos/core# --output-prefix ex=http://example.org/# --output-prefix xsd=http://www.w3.org/2001/XMLSchema# --shacl-details --shacl-error",
is already doing some SHACL validation.

Member Author

I would love to do this in SHACL but so far I have not discovered how. Can you tell me
a) how to check that all records have an rdf:type? The current SHACL does not complain if I test a record without a type. I have tried multiple ways to target all records, none of which has worked.
b) how to check that all predicates are ones defined in the shapes? Again, the current SHACL does not complain if an unknown predicate is introduced or if e.g. a predicate is misspelled.

Member Author

@jeff-zucker jeff-zucker Sep 8, 2025

[EDIT: never mind, this produces errors even on valid data. Upshot: still looking for something that requires all subjects to have an rdf:type.]

Finally! This solves the rdf:type issue.

:RequireTypeShape
  a sh:NodeShape ;
  sh:targetNode [
    a sh:SPARQLTarget ;
    sh:select "SELECT ?s WHERE { ?s ?p ?o . }" ;
  ] ;
  sh:property [
    sh:path rdf:type ;
    sh:minCount 1 ;
  ] .

Member Author

And I can forbid unknown predicates by adding sh:closed to each shape. @jeswr - Are there downsides or gotchas to doing that?

Member

@jeswr jeswr Sep 24, 2025

> And I can forbid unknown predicates by adding sh:closed to each shape. @jeswr - Are there downsides or gotchas to doing that?

Provided you do not expect to put unknown predicates in the files you are validating, there are no gotchas.
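
For illustration, the effect of a closed shape can be modelled in a few lines. This is a simplified sketch of the semantics, not how a SHACL engine is implemented: a predicate is accepted only if the shape declares it as a property path or lists it in sh:ignoredProperties. The function name `closedViolations` and the sample node are hypothetical. One detail worth checking when closing shapes is that rdf:type itself counts as a predicate, so it usually needs to be declared or ignored.

```javascript
// Hypothetical model of a closed-shape check; a SHACL engine does the real work.
// declaredPaths: predicates the shape declares via sh:property / sh:path.
// ignoredProperties: predicates listed in sh:ignoredProperties.
function closedViolations(nodeTriples, declaredPaths, ignoredProperties = []) {
  const allowed = new Set([...declaredPaths, ...ignoredProperties]);
  return nodeTriples.map(t => t.predicate).filter(p => !allowed.has(p));
}

// Made-up node with a type and one declared predicate.
const node = [
  { predicate: 'rdf:type', object: 'ex:App' },
  { predicate: 'ex:name', object: 'Demo' },
];

const withoutIgnore = closedViolations(node, ['ex:name']);
const withIgnore = closedViolations(node, ['ex:name'], ['rdf:type']);
console.log(withoutIgnore); // [ 'rdf:type' ]
console.log(withIgnore);    // []
```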

@jeff-zucker jeff-zucker merged commit e9fe58e into main Sep 24, 2025
3 checks passed
@jeff-zucker jeff-zucker deleted the core branch October 4, 2025 17:13