Notes for Contributors
This page gives general information about developer workflows valid for the entire project. For more specific information about developing for the client or server part of the project, refer to:
Stable version of the client contributor notes (or in the GitHub repo)
Stable version of the server contributor notes (or in the GitHub repo)
Git Branches
master: This is the stable branch. Release tags are always on this branch and the latest release is always the head of this branch.
develop: This is the main development branch and can be ahead of the master branch. One should never directly merge and push to develop but perform a pull request on GitHub. The PR can only be merged if approved by another developer and all automatic tests pass.
wip_xx: Feature branches for feature number xx start with wip_xx (one can add a short name to the branch name). They always branch off develop, and as explained above, are merged to develop via GitHub pull requests.
release/vx.y.z: These are release branches (for version vx.y.z). They always branch off from develop. Once the release process is complete (see below), the release branch is merged to both master and develop via GitHub pull requests.
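As a rough illustration of the naming conventions above, the sketch below classifies a branch name. The regular expressions are assumptions made for this example (in particular, how the optional short name is appended to wip_xx) and are not enforced anywhere in the project.

```python
import re
from typing import Optional

# Patterns derived from the branch descriptions above.
# The exact expressions are an assumption of this sketch.
BRANCH_PATTERNS = {
    "stable": re.compile(r"^master$"),
    "development": re.compile(r"^develop$"),
    # wip_<feature number>, optionally followed by a short name
    "feature": re.compile(r"^wip_\d+([_-][A-Za-z0-9_-]+)?$"),
    # release/v<major>.<minor>.<patch>
    "release": re.compile(r"^release/v\d+\.\d+\.\d+$"),
}


def classify_branch(name: str) -> Optional[str]:
    """Return the kind of branch, or None if the name matches no convention."""
    for kind, pattern in BRANCH_PATTERNS.items():
        if pattern.match(name):
            return kind
    return None
```

For example, classify_branch("wip_42_new_querier") gives "feature", while a name such as "hotfix" gives None.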
Linting and Other Checks
To ensure code quality and consistency, we perform several checks using various tools. Below is a list of the checks that should be performed:
Code Formatting: Use black to automatically format the code. In lomas/server/lomas_server and lomas/client/lomas_client:
black .
Code Style and Static Analysis: Use flake8 to verify formatting and perform static code analysis. In lomas/server/lomas_server and lomas/client/lomas_client:
flake8 .
Static Type Checking: Use mypy for static type checking. Note that both the server and the client have their own mypy.ini files to ignore specific warnings. In lomas/server and lomas/client:
mypy .
Additional Static Analysis: Use pylint for further static analysis. Note that both the server and the client have their own .pylintrc files to ignore specific warnings. In lomas/server/lomas_server and lomas/client/lomas_client:
pylint .
To streamline the process, you can use the run_linter.sh script in lomas. The first time you run this script, use the following commands to make it executable and install dependencies:
chmod +x run_linter.sh
./run_linter.sh --install-deps
For subsequent runs, simply execute:
./run_linter.sh
There should be no errors or warnings; otherwise the linting GitHub action will fail. All configurations are in lomas/server/pyproject.toml and lomas/client/pyproject.toml.
As detailed below, we rely on GitHub workflows to automatically run these checks on pull requests, ensuring consistency and quality across all contributions.
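The four checks listed above can also be driven from Python. The following is only a sketch of that idea and is not how run_linter.sh is implemented: the helper names are hypothetical, and the directories are the ones given in the list above, relative to the repository root.

```python
import subprocess
from pathlib import Path

# Directories in which each tool is run, as listed above.
PACKAGE_DIRS = ["lomas/server/lomas_server", "lomas/client/lomas_client"]
PROJECT_DIRS = ["lomas/server", "lomas/client"]

CHECKS = [
    ("black", PACKAGE_DIRS),
    ("flake8", PACKAGE_DIRS),
    ("mypy", PROJECT_DIRS),
    ("pylint", PACKAGE_DIRS),
]


def lint_commands():
    """Return (command, working directory) pairs in the order given above."""
    return [([tool, "."], directory) for tool, dirs in CHECKS for directory in dirs]


def run_checks(repo_root="."):
    """Run every check; return the (command, directory) pairs that failed.

    Raises FileNotFoundError if a tool or a directory is missing.
    """
    failures = []
    for command, directory in lint_commands():
        result = subprocess.run(command, cwd=Path(repo_root) / directory)
        if result.returncode != 0:
            failures.append((command, directory))
    return failures
```

Calling run_checks() from the repository root returns an empty list when every tool exits with status 0.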
GitHub Workflows
This project uses a number of GitHub workflows to automate various CI/CD tasks. These tasks can also be run manually in a local environment during development. Please refer to the workflow files in .github/workflows/ for further details.
The table below gives an overview of which workflows are triggered by what events.
Workflow / Trigger | PR to develop | PR to master | Push to develop | Push to release/** | Push to master | GitHub release |
---|---|---|---|---|---|---|
Tests and Linters | Yes | Yes | No | No | No | No |
Docker build and push | No | No | Yes (tag = git sha) | No | Yes (tag = git sha) | Yes (tags = latest and semver (x.y.z)) |
Client library push | No | No | No | No | No | Yes (must manually adjust version) |
Helm charts push | No | No | No | Yes (must manually adjust version) | No | No |
Documentation push | No | No | Yes (for latest) | No | No | Yes (for stable, must manually add version) |
Security with CodeQL* | Yes | Yes | No | No | No | No |
Of these workflows, three need manual intervention to adjust the version number:
Client library push: The version must be set in client/setup.py.
Helm charts push: The chart version (version) and app version (appVersion) of the server and the client must be updated in server/deploy/helm/charts/lomas_server/Chart.yaml and client/deploy/helm/charts/lomas_client/Chart.yaml.
Documentation push: If a new version is released, it must be added to the docs/versions.yaml file. For more details on the generation of the documentation, please refer to docs and the docs/build_docs.py script.
*The Security with CodeQL workflow is also triggered every Monday at 9am.
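To check quickly which workflows a given event will start, the trigger table can be written down as data. This is only a transcription of the table above into a dictionary (the weekly CodeQL schedule is left out); the function name is hypothetical and nothing here reads the actual files in .github/workflows/.

```python
EVENTS = [
    "PR to develop",
    "PR to master",
    "Push to develop",
    "Push to release/**",
    "Push to master",
    "GitHub release",
]

# Workflow name -> events that trigger it, copied from the table above.
TRIGGERS = {
    "Tests and Linters": {"PR to develop", "PR to master"},
    "Docker build and push": {"Push to develop", "Push to master", "GitHub release"},
    "Client library push": {"GitHub release"},
    "Helm charts push": {"Push to release/**"},
    "Documentation push": {"Push to develop", "GitHub release"},
    "Security with CodeQL": {"PR to develop", "PR to master"},
}


def workflows_for(event):
    """Return the sorted names of the workflows triggered by the event."""
    if event not in EVENTS:
        raise ValueError(f"Unknown event: {event!r}")
    return sorted(name for name, events in TRIGGERS.items() if event in events)
```

For instance, workflows_for("Push to release/**") returns only the Helm charts push, which matches the note in the release section about charts being updated on release branches.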
Release Workflow
The following actions must take place in this order when preparing a new release:
1. Create a release/vx.y.z branch from develop.
2. Fix remaining issues.
3. Adjust versions for the client library, the helm charts, as well as for the documentation.
4. Create a GitHub PR from this branch to develop AND master (make sure you are up to date with develop by rebasing on it).
5. Once merged, manually create a release on GitHub with the tag vx.y.z.
The workflows listed in the previous section will take care of building and publishing the different items (docker images, pip packages, etc.).
Note: Helm charts are updated when there is a push on the release/vx.y.z branch. If you have a specific deployment that relies on the chart, you can test it before finishing the release. Then, do not forget to update the chart and app versions of your specific deployment.
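The names used during a release all derive from the same x.y.z version. The helper below is a hypothetical sketch that spells this out: the branch and tag formats come from this section, and the docker tags from the trigger table (latest and semver). It does not create anything on GitHub.

```python
import re
from typing import NamedTuple, Tuple


class ReleaseNames(NamedTuple):
    branch: str                  # release branch created from develop
    git_tag: str                 # tag used for the GitHub release
    docker_tags: Tuple[str, str]  # tags pushed by the docker workflow on release


def release_names(version: str) -> ReleaseNames:
    """Derive release names from a plain x.y.z version string."""
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"Expected a version of the form x.y.z, got {version!r}")
    return ReleaseNames(
        branch=f"release/v{version}",
        git_tag=f"v{version}",
        docker_tags=("latest", version),
    )
```

For version 1.4.0 this yields the branch release/v1.4.0 and the tag v1.4.0.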
Adding a DP Library
It is possible to add DP libraries quite seamlessly. Let’s say the new library is named ‘NewLibrary’.
Steps:
0. Add the necessary requirements in lomas/lomas_server/requirements.txt and lomas/lomas_client/requirements.txt.
1. Add the library to the DPLibraries StrEnum class in lomas/lomas_server/constants.py (DPLibraries.NEW_LIBRARY = "new_library") and add the NewLibraryQuerier option in the querier_factory (in lomas/lomas_server/dp_queries/dp_libraries/factory.py).
2. Create a file for your querier at lomas/lomas_server/dp_queries/dp_libraries/new_library.py. Inside, create a class NewLibraryQuerier that inherits from DPQuerier (lomas/lomas_server/dp_queries/dp_querier.py). Your class must contain a cost method that returns the cost of a query and a query method that returns the result of a DP query.
3. Add the three associated API endpoints.
a. Add the endpoint handlers in lomas/lomas_server/routes/routes_dp.py: /new_library_query (for queries on the real dataset), /dummy_new_library_query (for queries on the dummy dataset) and /estimate_new_library_cost (for estimating the privacy budget cost of a query).
b. The endpoints should have predefined pydantic BaseModel types. Add BaseModel classes for the expected inputs (NewLibraryModel, DummyNewLibraryModel, NewLibraryCostModel) in lomas/lomas_server/utils/query_models.py and add the link for the archives in the constant dict MODEL_INPUT_TO_LIB: {"NewLibraryModel": DPLibraries.NEW_LIBRARY}.
c. The endpoints should have predefined default values (example_new_library, example_dummy_new_library) in lomas/lomas_server/utils/query_examples.py.
4. Add tests in lomas/lomas_server/tests/test_new_library.py to test all functionalities and options of the new library.
5. Add the associated methods in the lomas-client library in lomas/client/lomas_client/client.py. In this case there should be new_library_query for queries on the private and on the dummy datasets and estimate_new_library_cost to estimate the cost of a query.
6. Add a notebook Demo_Client_Notebook_NewLibrary.ipynb in lomas/client/notebook/ to give an example of the use of the library.
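The server-side part of these steps (the enum entry, the querier class and the factory option) might look roughly like the sketch below. It is self-contained, so DPLibraries and DPQuerier are simplified stand-ins for the real classes, and the constructor argument, the query_json format and the (epsilon, delta) return type of cost are assumptions of this example. The query itself is a toy Laplace-noised row count built on the standard random module, which is not suitable for real privacy guarantees.

```python
import random
from abc import ABC, abstractmethod
from enum import Enum


class DPLibraries(str, Enum):
    """Stand-in for the StrEnum in constants.py (existing members omitted)."""

    NEW_LIBRARY = "new_library"


class DPQuerier(ABC):
    """Simplified stand-in for the base class in dp_querier.py."""

    def __init__(self, private_dataset):
        self.private_dataset = private_dataset

    @abstractmethod
    def cost(self, query_json):
        """Return the privacy cost of a query."""

    @abstractmethod
    def query(self, query_json):
        """Return the result of a DP query."""


class NewLibraryQuerier(DPQuerier):
    """Toy querier: a Laplace-noised count of the rows in the dataset."""

    def cost(self, query_json):
        epsilon = float(query_json["epsilon"])
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        return (epsilon, 0.0)  # assumed (epsilon, delta) convention

    def query(self, query_json):
        epsilon, _ = self.cost(query_json)
        scale = 1.0 / epsilon  # a count has sensitivity 1
        # The difference of two Exp(1) samples follows a Laplace(0, 1) law.
        noise = scale * (random.expovariate(1.0) - random.expovariate(1.0))
        return len(self.private_dataset) + noise


_QUERIERS = {DPLibraries.NEW_LIBRARY: NewLibraryQuerier}


def querier_factory(lib, private_dataset):
    """Return the querier for the library name; unknown names raise ValueError."""
    return _QUERIERS[DPLibraries(lib)](private_dataset)
```

The endpoint handlers would then obtain a querier through the factory and call cost or query with the validated request body.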
External Loggers
Some libraries have ‘custom object’ parameters which are not readily serialisable.
In those cases, a logger library can be made to serialise the objects in the client (before sending them to the server via FastAPI) and then deserialise them in their DPQuerier class in the server.
Some examples are available here:
opendp_logger for opendp pipelines: https://github.com/opendp/opendp-logger
diffprivlib_logger for diffprivlib pipelines: https://github.com/dscc-admin-ch/diffprivlib-logger
smartnoise_synth_logger for smartnoise_synth table transformer constraints: https://github.com/dscc-admin-ch/smartnoise-synth-logger
Do not forget to add these libraries in the requirements.txt files.
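The general idea behind such a logger can be sketched with the standard json module: the client turns the object into a tagged JSON string, and the server rebuilds it from a registry of known types. The Clamp class, the registry and the payload layout below are all hypothetical; the libraries listed above each define their own format and API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Clamp:
    """Hypothetical 'custom object' parameter of a DP library."""

    lower: float
    upper: float


# Types the server accepts; anything else is rejected.
_REGISTRY = {"Clamp": Clamp}


def serialise(obj) -> str:
    """Client side: turn a registered object into a JSON string."""
    name = type(obj).__name__
    if name not in _REGISTRY:
        raise TypeError(f"Cannot serialise objects of type {name}")
    return json.dumps({"type": name, "params": asdict(obj)})


def deserialise(payload: str):
    """Server side: rebuild the object from the JSON string."""
    data = json.loads(payload)
    cls = _REGISTRY[data["type"]]  # KeyError for unregistered types
    return cls(**data["params"])
```

Restricting deserialisation to a fixed registry means the server only ever instantiates types it already knows about.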
Adding a Dataset Store
Here is the explanation of how to add a new dataset store, named NewDatasetStore for the example.
1. Add the new dataset store to the DatasetStoreType StrEnum class in lomas/lomas_server/constants.py and add the NewDatasetStore option in the dataset_store_factory function (in lomas/lomas_server/dataset_store/factory.py).
2. Create a file for your dataset store at lomas/lomas_server/dataset_store/new_dataset_store.py. Inside, create a class NewDatasetStore that inherits from DatasetStore (lomas/lomas_server/dataset_store/dataset_store.py). Your class must contain an _add_dataset method that handles adding a dataset in memory and a get_querier method that returns the querier for the given dataset and library.
3. Add tests in lomas/lomas_server/tests/ to test all functionalities of the new dataset store.
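A minimal sketch of such a class is given below. DatasetStore here is a simplified stand-in for the real base class, and the two constructor arguments (a callable that loads a dataset by name and a callable that builds a querier) are hypothetical; the real base class may take different arguments and method signatures.

```python
from abc import ABC, abstractmethod


class DatasetStore(ABC):
    """Simplified stand-in for the base class in dataset_store.py."""

    @abstractmethod
    def _add_dataset(self, dataset_name):
        """Load a dataset and keep it in memory."""

    @abstractmethod
    def get_querier(self, dataset_name, library):
        """Return the querier for the given dataset and library."""


class NewDatasetStore(DatasetStore):
    """Toy store that keeps every dataset it has loaded."""

    def __init__(self, load_dataset, make_querier):
        self._load_dataset = load_dataset  # hypothetical: name -> dataset
        self._make_querier = make_querier  # hypothetical: (library, dataset) -> querier
        self._datasets = {}

    def _add_dataset(self, dataset_name):
        # Load the dataset only once; later calls reuse the copy in memory.
        if dataset_name not in self._datasets:
            self._datasets[dataset_name] = self._load_dataset(dataset_name)

    def get_querier(self, dataset_name, library):
        self._add_dataset(dataset_name)
        return self._make_querier(library, self._datasets[dataset_name])
```

A store with an eviction policy would differ only in _add_dataset, which is why the loading logic is kept separate from get_querier.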
Adding a Data Connector (for private dataset in various databases)
Here is the explanation of how to add a new data connector, named NewDataConnector for the example.
1. Add the new data connector (NewDataConnector) to the corresponding StrEnum class in lomas/lomas_server/constants.py.
2. Add the NewDataConnector option in the private_dataset_factory function (in lomas/lomas_server/private_dataset/factory.py).
3. Create a file for your data connector at lomas/lomas_server/private_dataset/new_data_connector.py. Inside, create a class NewDataConnector that inherits from PrivateDataset (lomas/lomas_server/private_dataset/private_dataset.py). Your class must contain a get_pandas_df method that returns a dataframe of the dataset.
4. Add tests in lomas/lomas_server/tests/ to test all functionalities of the new data connector.
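The steps above might translate into something like the following sketch. PrivateDataset is a simplified stand-in for the real base class, and ConnectorType, the source argument and the factory signature are hypothetical names chosen for this example. The connector simply reads a CSV file with pandas.read_csv; a real connector would query its database instead.

```python
from abc import ABC, abstractmethod
from enum import Enum


class ConnectorType(str, Enum):
    """Hypothetical stand-in for the StrEnum in constants.py."""

    NEW_DATA_CONNECTOR = "new_data_connector"


class PrivateDataset(ABC):
    """Simplified stand-in for the base class in private_dataset.py."""

    @abstractmethod
    def get_pandas_df(self):
        """Return the dataset as a pandas DataFrame."""


class NewDataConnector(PrivateDataset):
    """Toy connector that reads the dataset from a CSV file."""

    def __init__(self, source):
        self.source = source  # path or URL of the CSV file
        self._df = None

    def get_pandas_df(self):
        # Imported here so the module can be imported without pandas installed.
        import pandas as pd

        if self._df is None:  # read once, then reuse the cached frame
            self._df = pd.read_csv(self.source)
        return self._df


_CONNECTORS = {ConnectorType.NEW_DATA_CONNECTOR: NewDataConnector}


def private_dataset_factory(connector_type, source):
    """Return the connector for the given type; unknown types raise ValueError."""
    return _CONNECTORS[ConnectorType(connector_type)](source)
```

Caching the dataframe in the instance avoids reading the source again on every query.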