
Information on how to annotate datasets: https://developers.google.com/search/docs/data-types/dataset

> We can understand structured data in Web pages about datasets, using either schema.org Dataset markup, or equivalent structures represented in W3C's Data Catalog Vocabulary (DCAT) format. We also are exploring experimental support for structured data based on W3C CSVW, and expect to evolve and adapt our approach as best practices for dataset description emerge. For more information about our approach to dataset discovery, see Making it easier to discover datasets.

For more info on those:

- W3C's Data Catalog Vocabulary: https://www.w3.org/TR/vocab-dcat-3/

- Schema.org dataset: https://schema.org/Dataset

- CSVW Namespace Vocabulary Terms: https://www.w3.org/ns/csvw

- Generating RDF from Tabular Data on the Web (examples on how to use CSVW): https://www.w3.org/TR/csv2rdf/
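
A minimal sketch of what schema.org Dataset markup can look like when generated from Python. The dataset name, URLs, and helper names (`dataset_jsonld`, `as_script_tag`) are made up for illustration; only the schema.org types and properties (`Dataset`, `DataDownload`, `distribution`, `encodingFormat`, `contentUrl`) come from the vocabulary linked above.

```python
import json


def dataset_jsonld(name, description, url, download_url, encoding_format="text/csv"):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def as_script_tag(doc):
    """Wrap a JSON-LD dictionary in the script tag used to embed it in a page."""
    payload = json.dumps(doc, indent=2)
    # Escape "</" so a stray "</script>" inside a string cannot end the tag early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical example values, not a real dataset.
example = dataset_jsonld(
    name="Example rainfall measurements",
    description="Daily rainfall totals for a fictional weather station.",
    url="https://example.org/datasets/rainfall",
    download_url="https://example.org/datasets/rainfall.csv",
)

print(as_script_tag(example))
```

The printed `<script>` element would go in the HTML of the dataset's landing page.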



It’s funny because Google does not use these standards to validate.

I keep getting errors from Google saying that some of my datasets’ descriptions are over 5,000 characters, even though the description property that DCAT uses (dcterms:description) has no size limit.

Of course it’s impossible for me to report a bug in how they index.
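
One possible workaround is to check description lengths before publishing. This is only a sketch: the 5,000-character figure is taken from the error described above, and the function names and the truncation strategy are assumptions, not anything Google or DCAT prescribes.

```python
# Assumption: limit taken from the error message described in the comment above.
MAX_DESCRIPTION_CHARS = 5000


def description_problems(datasets, max_chars=MAX_DESCRIPTION_CHARS):
    """Return (name, length) pairs for descriptions longer than max_chars."""
    problems = []
    for d in datasets:
        length = len(d.get("description", ""))
        if length > max_chars:
            problems.append((d.get("name", "<unnamed>"), length))
    return problems


def shorten(text, max_chars=MAX_DESCRIPTION_CHARS, suffix="..."):
    """Truncate text to at most max_chars, cutting at a word boundary if possible."""
    if len(text) <= max_chars:
        return text
    cut = text[: max_chars - len(suffix)]
    head, sep, _ = cut.rpartition(" ")
    # Fall back to a hard cut when the text contains no spaces at all.
    return (head if sep else cut).rstrip() + suffix


# Hypothetical catalogue entries for illustration.
catalogue = [
    {"name": "long-one", "description": "word " * 2000},
    {"name": "short-one", "description": "A short description."},
]

for name, length in description_problems(catalogue):
    print(f"{name}: description is {length} characters")
```

The full text can stay in the DCAT record while only the embedded schema.org copy is passed through `shorten`.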


You could submit a dataset containing the bug report :-)


Use cases for such [LD: Linked Data] metadata:

1. #StructuredPremises:

> (How do I indicate that this is a https://schema.org/ScholarlyArticle predicated upon premises including this Dataset and these logical propositions?)

2. #LinkedMetaAnalyses; #LinkedResearch "#StudyGraph"

3. [CSVW (Tabular Data Model),] schema.org/Dataset(s) with per column (per-feature) physical quantity and unit URIs with e.g. QUDT and/or https://schema.org/StructuredValue metadata for maximum data reusability.

4. JupyterLab notebooks:

4a. JupyterLab Metadata Service extension: https://github.com/jupyterlab/jupyterlab-metadata-service :

> - displays linked data about the resources you are interacting with in JupyterLab.

> - enables other extensions to register as linked data providers to expose JSON LD about an entity given the entity's URL.

> - exposes linked data to the user as a Linked Data viewer in the Data Browser pane.

4b. JupyterLab Data Explorer: https://github.com/jupyterlab/jupyterlab-data-explorer :

> - Data changing on you? Use RxJS observables to represent data over time.

> - Have a new way to look at your data? Create React or lumino components to view a certain type.

> - Built-in data explorer UI to find and use available datasets.
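
For use case 3 above, a rough sketch of per-column unit metadata on a schema.org Dataset, using `variableMeasured` with `PropertyValue` entries whose `unitCode` points at a QUDT unit URI. The column names and the `column` helper are hypothetical, and the QUDT URIs are used illustratively; check them against the QUDT vocabulary before relying on them.

```python
import json

# Base namespace for QUDT unit URIs.
QUDT_UNIT = "http://qudt.org/vocab/unit/"


def column(name, unit_local_name, unit_text):
    """Describe one column (feature) with its unit as a schema.org PropertyValue."""
    return {
        "@type": "PropertyValue",
        "name": name,
        "unitCode": QUDT_UNIT + unit_local_name,
        "unitText": unit_text,
    }


# Hypothetical dataset with two measured columns.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example weather observations",
    "variableMeasured": [
        column("air_temperature", "DEG_C", "degree Celsius"),
        column("station_elevation", "M", "metre"),
    ],
}

print(json.dumps(dataset, indent=2))
```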



