Validation
Validation utilities for CSVW-EO metadata and generated datasets.
Metadata Validation
csvw_eo.validate_metadata
Validate metadata file format.
main() -> None
Command-line interface for SHACL validation of CSVW-EO metadata.
This function parses command-line arguments specifying the metadata JSON-LD file and the SHACL shapes file, then runs SHACL validation.
If validation succeeds, a success message is printed. If validation fails, the validation report is printed and the program exits with a non-zero status code.
Source code in csvw-eo-library/src/csvw_eo/validate_metadata.py
validate_metadata(metadata: dict[str, Any]) -> TableMetadata
Validate CSVW-EO metadata against the pydantic model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metadata | dict | CSVW-EO metadata structure. | required |
Source code in csvw-eo-library/src/csvw_eo/validate_metadata.py
SHACL Validation
csvw_eo.validate_metadata_shacl
SHACL validation for CSVW-EO metadata.
This module validates CSVW-EO metadata files against a SHACL schema using the pySHACL engine. The metadata is expected to be in JSON-LD format and the SHACL shapes in Turtle format.
Requires: pyshacl, rdflib
main() -> None
Command-line interface for SHACL validation of CSVW-EO metadata.
This function parses command-line arguments specifying the metadata JSON-LD file and the SHACL shapes file, then runs SHACL validation.
If validation succeeds, a success message is printed. If validation fails, the validation report is printed and the program exits with a non-zero status code.
Source code in csvw-eo-library/src/csvw_eo/validate_metadata_shacl.py
validate_metadata_shacl(metadata_file: Path, shacl_file: Path) -> tuple[bool, str]
Validate CSVW-EO metadata against a SHACL schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metadata_file | Path | Path to the metadata file in JSON-LD format. | required |
| shacl_file | Path | Path to the SHACL shapes file in Turtle format. | required |
Returns:
| Type | Description |
|---|---|
| tuple[bool, str] | A tuple containing a bool (whether the metadata conforms to the SHACL schema) and a str (the textual validation report produced by pySHACL). |
Source code in csvw-eo-library/src/csvw_eo/validate_metadata_shacl.py
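The command-line behaviour described for main() (parse two file arguments, run the validation, print a success message or the report, exit non-zero on failure) can be sketched roughly as follows. This is a minimal illustration, not the library's code: the argument names, the messages, `run_cli`, and the stand-in `fake_validate` are all assumptions. Only the `(bool, str)` return shape is taken from the documented signature of validate_metadata_shacl.

```python
import argparse
from pathlib import Path
from typing import Callable

# Any callable with the documented shape: (metadata_file, shacl_file) -> (conforms, report)
Validator = Callable[[Path, Path], tuple[bool, str]]


def run_cli(argv: list[str], validate: Validator) -> int:
    """Parse arguments, run the validator, and return a process exit code."""
    parser = argparse.ArgumentParser(description="SHACL validation sketch")
    # Hypothetical argument names; the real CLI may name or order them differently.
    parser.add_argument("metadata_file", type=Path)
    parser.add_argument("shacl_file", type=Path)
    args = parser.parse_args(argv)

    conforms, report = validate(args.metadata_file, args.shacl_file)
    if conforms:
        print("Metadata conforms to the SHACL shapes.")
        return 0
    print(report)
    return 1  # non-zero status signals failure to the calling shell


def fake_validate(metadata_file: Path, shacl_file: Path) -> tuple[bool, str]:
    """Stand-in for validate_metadata_shacl so the sketch runs without pySHACL."""
    ok = metadata_file.suffix == ".json"
    return ok, ("" if ok else f"{metadata_file} does not conform")


# A real entry point would typically call sys.exit(run_cli(sys.argv[1:], ...)).
exit_ok = run_cli(["metadata.json", "shapes.ttl"], fake_validate)
exit_fail = run_cli(["metadata.txt", "shapes.ttl"], fake_validate)
```

Returning the exit code from `run_cli` instead of calling `sys.exit` inside it keeps the function easy to test; swapping `fake_validate` for the real validator should not require other changes, provided the signature matches the one documented above.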
Structural Validation
csvw_eo.assert_same_structure
Utility script to verify that a generated dummy CSV preserves the structural properties of an original CSV dataset.
The script checks:
- column names and order
- inferred CSVW-EO datatypes
- nullability (required vs optional columns)
- categorical value compatibility (optional)

It does NOT check statistical similarity, only structural compatibility.
assert_same_structure(df1: pd.DataFrame, df2: pd.DataFrame, check_categories: bool = True) -> None
Verify that two dataframes (the original and the dummy) share the same structural schema.
The function checks column names/order, inferred datatypes, nullability constraints, and optionally categorical value sets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df1 | DataFrame | Original dataframe. | required |
| df2 | DataFrame | Dummy dataframe. | required |
| check_categories | bool | Whether to verify that categorical values in the dummy data are subsets of those in the original data. | True |
Raises:
| Type | Description |
|---|---|
| AssertionError | If any structural mismatch is detected. |
Source code in csvw-eo-library/src/csvw_eo/assert_same_structure.py
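To make the four structural checks concrete, here is a simplified, standard-library sketch of the same idea. It is not the implementation of assert_same_structure: the library function takes pandas DataFrames and infers CSVW-EO datatypes, whereas this sketch works on plain column lists and uses a hypothetical three-way type inference (integer, float, string). It also assumes that a column counts as required when the original has no empty cells, and that every string column is treated as categorical.

```python
import csv
import io


def load_columns(csv_text: str) -> dict[str, list[str]]:
    """Read CSV text into an ordered mapping of column name -> raw string values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns: dict[str, list[str]] = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append(row[name] or "")
    return columns


def infer_type(values: list[str]) -> str:
    """Hypothetical, simplified type inference (not the CSVW-EO rules)."""
    present = [v for v in values if v != ""]
    if not present:
        return "string"
    for label, cast in (("integer", int), ("float", float)):
        try:
            for v in present:
                cast(v)
            return label
        except ValueError:
            continue
    return "string"


def check_same_structure(original, dummy, check_categories=True) -> None:
    """Raise AssertionError on the first structural mismatch found."""
    assert list(original) == list(dummy), "column names or order differ"
    for name in original:
        orig_vals, dummy_vals = original[name], dummy[name]
        kind = infer_type(orig_vals)
        assert kind == infer_type(dummy_vals), f"datatype differs for {name!r}"
        # Assumption: no empty cells in the original means the column is required.
        if "" not in orig_vals:
            assert "" not in dummy_vals, f"{name!r} is required but has missing values"
        # Assumption: string columns are categorical; dummy values must be a subset.
        if check_categories and kind == "string":
            extra = set(dummy_vals) - set(orig_vals)
            assert not extra, f"unexpected categories in {name!r}: {sorted(extra)}"


original = load_columns("id,city,score\n1,Paris,3.5\n2,Lyon,\n")
good = load_columns("id,city,score\n7,Lyon,1.0\n8,Paris,\n")
bad = load_columns("id,city,score\n7,Berlin,1.0\n")

check_same_structure(original, good)  # passes silently
try:
    check_same_structure(original, bad)
except AssertionError as exc:
    print(f"mismatch: {exc}")
```

Note that the values in `good` differ from those in `original` and the check still passes, which mirrors the statement above that only structural compatibility is verified, not statistical similarity.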
main() -> None
Command-line entry point for the CSV structure validator.
Source code in csvw-eo-library/src/csvw_eo/assert_same_structure.py