{ "cells": [ { "cell_type": "markdown", "id": "363c238d-5925-4b4b-8f68-8ad84ea4705b", "metadata": {}, "source": [ "# Lomas-server: CLI administration" ] }, { "cell_type": "markdown", "id": "2db1363b-e87e-4d0e-bb3f-9af1a1b72b8d", "metadata": {}, "source": [ "This notebook showcases how data owner could set up the server, add make their data available to certain users. It explains the different steps required." ] }, { "cell_type": "markdown", "id": "de384c88-559e-4384-a49b-1664ffdd6692", "metadata": {}, "source": [ "# Start the server" ] }, { "cell_type": "markdown", "id": "92f3237b-6f13-4c52-a9f2-82d94f0b7e66", "metadata": {}, "source": [ "## Create a docker volume\n", "The first step is to create a docker volume for mongodb, which will hold all the \"admin\" data of the server. Docker volumes are persistent storage spaces that are managed by docker and can be mounted in containers. To create the volume use `docker volume create mongodata`. This must be done only once, and we use bind mounts for the server, so no need to create volumes for that." ] }, { "cell_type": "markdown", "id": "87093f8e-68b1-4f1e-9e66-97c3885b3e48", "metadata": {}, "source": [ "In a terminal run: `docker volume create mongodata`. In output you should see `mongodata` written." ] }, { "cell_type": "markdown", "id": "f6829afb-d822-48e4-ba49-5daf0d79db7e", "metadata": {}, "source": [ "## Start server\n", "The second step is to start the server. Therefore the config file `configs/example_config.yaml` has to be adapted. The data owner must make sure to set the develop mode to False, specify the database type and ports. For this notebook, we will keep the default and use a mongodb on port 27017. Note: Keep in mind that if the configuration file is modified then the `docker-compose` has to be modified accordingly. This is out of scope for this notebook." ] }, { "cell_type": "markdown", "id": "2408425a-13c6-4b89-a1fe-88491850fe10", "metadata": {}, "source": [ "In a terminal run `docker compose up`. This will start the server and the mongodb, each running in its own Docker container. In addition, it will also start a client session container for demonstration purposes and a streamlit container, more on that later." ] }, { "cell_type": "markdown", "id": "244da0f8", "metadata": {}, "source": [ "To check that all containers are indeed running, run `docker ps`. You should be able to see a container for the server (`lomas_server_dev`), for the client (`lomas_client_dev`), for the streamlit (`lomas_streamlit_dev`) and one for the mongo database (`mongodb`)." ] }, { "cell_type": "markdown", "id": "8dbebd54-8deb-46e6-b811-73ac74028569", "metadata": {}, "source": [ "## Access the server to administrate the mongoDB" ] }, { "cell_type": "markdown", "id": "4a8c8115", "metadata": {}, "source": [ "To interact with the mongoDB, we first need to access the server Docker container from where we will run the commands. To do that from inside this Jupyter Notebook, we will need to use the Docker client library. Let's first install it." ] }, { "cell_type": "code", "execution_count": null, "id": "f6863a6d", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!pip install docker" ] }, { "cell_type": "markdown", "id": "b12b414a", "metadata": {}, "source": [ "We can now import the library, create the client allowing us to interact with Docker, and finally, access the server container." 
] }, { "cell_type": "code", "execution_count": 1, "id": "112e4156", "metadata": {}, "outputs": [], "source": [ "import docker\n", "client = docker.DockerClient()\n", "server_container = client.containers.get(\"lomas_server_dev\")" ] }, { "cell_type": "markdown", "id": "36b475e6", "metadata": {}, "source": [ "To execute commands inside that Docker container, you can use the `exec_run` method which will return an ExecResult object, from which you can retrieve the output of the command. Let's see in the following example:" ] }, { "cell_type": "code", "execution_count": 2, "id": "dc0349be", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "__init__.py\n", "__pycache__\n", "admin_database\n", "administration\n", "app.py\n", "constants.py\n", "dataset_store\n", "dp_queries\n", "mongodb_admin.py\n", "mongodb_admin_cli.py\n", "data_connector\n", "tests\n", "utils\n", "uvicorn_serve.py\n", "\n" ] } ], "source": [ "response = server_container.exec_run(\"ls\")\n", "print(response.output.decode('utf-8'))" ] }, { "cell_type": "markdown", "id": "9f35fd20-715c-483b-88e4-449c287ba61d", "metadata": {}, "source": [ "Now, you are ready to interact with the database and add users." ] }, { "cell_type": "markdown", "id": "d368d6a6-f1fe-4f65-9ce1-38c0b39584d1", "metadata": {}, "source": [ "# Prepare the database" ] }, { "cell_type": "markdown", "id": "b37c19b8-303d-4fe8-b515-33ed1099c581", "metadata": {}, "source": [ "## Visualise all options\n", "You can visualise all the options offered by the database by running the command `python mongodb_admin_cli.py --help`. We will go through through each of them in the rest of the notebook." ] }, { "cell_type": "markdown", "id": "e70abf6d", "metadata": {}, "source": [ "We prepare the function `run_command` to have a cleaner output of the commands in the notebook." 
] }, { "cell_type": "code", "execution_count": 3, "id": "f9277a43", "metadata": {}, "outputs": [], "source": [ "from ast import literal_eval\n", "\n", "def run(command, to_dict=False):\n", " response = server_container.exec_run(command)\n", " output = response.output.decode('utf-8').replace(\"'\", '\"')\n", " if \"] -\" in output:\n", " output = output.split(\"] -\")[1].strip()\n", " if to_dict:\n", " if len(output):\n", " output = literal_eval(output)\n", " return output\n", " return print(output)" ] }, { "cell_type": "code", "execution_count": 4, "id": "fafa4e34", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: mongodb_admin_cli.py [-h]\n", " {add_user,add_user_with_budget,del_user,add_dataset_to_user,del_dataset_to_user,set_budget_field,set_may_query,get_user,add_users_via_yaml,get_archives,get_users,get_user_datasets,add_dataset,add_datasets_via_yaml,del_dataset,get_dataset,get_metadata,get_datasets,drop_collection,get_collection}\n", " ...\n", "\n", "MongoDB administration script for the database\n", "\n", "options:\n", " -h, --help show this help message and exit\n", "\n", "subcommands:\n", " {add_user,add_user_with_budget,del_user,add_dataset_to_user,del_dataset_to_user,set_budget_field,set_may_query,get_user,add_users_via_yaml,get_archives,get_users,get_user_datasets,add_dataset,add_datasets_via_yaml,del_dataset,get_dataset,get_metadata,get_datasets,drop_collection,get_collection}\n", " user database administration operations\n", " add_user add user to users collection\n", " add_user_with_budget\n", " add user with budget to users collection\n", " del_user delete user from users collection\n", " add_dataset_to_user\n", " add dataset with initialized budget values for a user\n", " del_dataset_to_user\n", " delete dataset for user in users collection\n", " set_budget_field set budget field to given value for given user and\n", " dataset\n", " set_may_query set may query field to given value for given user\n", " get_user show all metadata of user\n", " add_users_via_yaml create users collection from yaml file\n", " get_archives show all previous queries from a user\n", " get_users get the list of all users in \"users\" collection\n", " get_user_datasets get the list of all datasets from a user\n", " add_dataset set in which database the dataset is stored\n", " add_datasets_via_yaml\n", " create dataset to database type collection\n", " del_dataset delete dataset and metadata from datasets and metadata\n", " collection\n", " get_dataset show a dataset from the dataset collection\n", " get_metadata show metadata from the metadata collection\n", " get_datasets get the list of all datasets in \"datasets\" collection\n", " drop_collection delete collection from database\n", " get_collection print a collection\n", "\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py --help\")" ] }, { "cell_type": "markdown", "id": "579b9571", "metadata": {}, "source": [ "And finally, let's delete all existing data from database to start clean:" ] }, { "cell_type": "code", "execution_count": 5, "id": "18a3681c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Deleted collection datasets.\n", "Deleted collection metadata.\n", "Deleted collection users.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py drop_collection --collection datasets\")\n", "run(\"python mongodb_admin_cli.py drop_collection --collection metadata\")\n", "run(\"python mongodb_admin_cli.py drop_collection --collection users\")" ] }, { "cell_type": 
"markdown", "id": "d7edd7d3-20f9-4546-afc8-25661f948d44", "metadata": {}, "source": [ "## Datasets (add and drop)" ] }, { "cell_type": "markdown", "id": "ed1597b3-767f-470c-a7d7-8fe41dd82da5", "metadata": {}, "source": [ "We first need to set the dataset meta-information. For each dataset, 2 informations are required:\n", "- the type of database in which the dataset is stored\n", "- a path to the metadata of the dataset (stored as a yaml file).\n", "\n", "To later perform query on the dataset, metadata are required. In this secure server the metadata information is expected to be in the same format as [SmartnoiseSQL dictionary format](https://docs.smartnoise.org/sql/metadata.html#dictionary-format), where among other, there is information about all the available columns, their type, bound values (see Smartnoise page for more details). It is also expected to be in a `yaml` file.\n", "\n", "These information (dataset name, dataset type and metadata path) are stored in the `datasets` collection. Then for each dataset, its metadata is fetched from its `yaml` file and stored in a collection named `metadata`." ] }, { "cell_type": "markdown", "id": "2678fb3f", "metadata": {}, "source": [ "We then check that there is indeed no data in the dataset and metadata collections yet:" ] }, { "cell_type": "code", "execution_count": 6, "id": "9b7a7fae", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection datasets\")" ] }, { "cell_type": "code", "execution_count": 7, "id": "d36e03ff", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection metadata\")" ] }, { "cell_type": "markdown", "id": "d1d331ea", "metadata": {}, "source": [ "We can add **one dataset** with its name, database type and path to medata file:" ] }, { "cell_type": "code", "execution_count": 8, "id": "53f5787d-e721-43d9-85ce-da842f173381", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added dataset IRIS with database PATH_DB and associated metadata.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_dataset -d IRIS -db PATH_DB -d_path https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv -m_db PATH_DB -mp ../data/collections/metadata/iris_metadata.yaml\")" ] }, { "cell_type": "markdown", "id": "398f8990", "metadata": {}, "source": [ "We can now see the dataset and metadata collection with the Iris dataset:" ] }, { "cell_type": "code", "execution_count": 9, "id": "3005eda2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'dataset_name': 'IRIS',\n", " 'database_type': 'PATH_DB',\n", " 'dataset_path': 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv',\n", " 'metadata': {'database_type': 'PATH_DB',\n", " 'metadata_path': '../data/collections/metadata/iris_metadata.yaml'}}]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection datasets\", to_dict=True)" ] }, { "cell_type": "code", "execution_count": 10, "id": "7527f3f4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'IRIS': {'max_ids': 1,\n", " 'row_privacy': True,\n", " 'columns': {'petal_length': {'type': 'float', 'lower': 0.5, 'upper': 10.0},\n", " 'petal_width': {'type': 'float', 'lower': 0.05, 'upper': 5.0},\n", " 'sepal_length': {'type': 
{ "cell_type": "markdown", "id": "a0a2076e", "metadata": {}, "source": [ "Instead of adding datasets one by one, you can provide a path to a yaml file which contains all this information to add **multiple datasets** in one command:" ] }, { "cell_type": "code", "execution_count": 11, "id": "0e42f9cb-3a02-45f5-baee-2e06edda739f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaning done. \n", "\n", "2024-06-05 09:59:46,703 - INFO - [mongodb_admin.py:710 - add_datasets_via_yaml()\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_datasets_via_yaml -yf ../data/collections/dataset_collection.yaml -c\")" ] }, { "cell_type": "markdown", "id": "19b86f6a", "metadata": {}, "source": [ "The argument *-c* or *--clean* allows you to clear the current dataset collection before adding your collection.\n", "\n", "By default, *add_datasets_via_yaml* will only add the new datasets found in the provided collection." ] }, { "cell_type": "code", "execution_count": 12, "id": "88bbdcf2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Metadata already exist. Use the command -om to overwrite with new values.\n", "2024-06-05 09:59:48,726 - INFO - [mongodb_admin.py:755 - add_datasets_via_yaml()\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_datasets_via_yaml -yf ../data/collections/dataset_collection.yaml\")" ] }, { "cell_type": "markdown", "id": "3a922c76", "metadata": {}, "source": [ "Arguments:\n", "\n", "*-od* / *--overwrite_datasets*: overwrite the values for **existing datasets** with the values provided in the yaml.\n", "\n", "*-om* / *--overwrite_metadata*: overwrite the values for **existing metadata** with the values provided in the yaml."
] }, { "cell_type": "code", "execution_count": 13, "id": "240928ab", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Existing datasets updated with new collection\n", "2024-06-05 09:59:50,917 - INFO - [mongodb_admin.py:755 - add_datasets_via_yaml()\n" ] } ], "source": [ "# Add new datasets/metadata, update existing datasets\n", "run(\"python mongodb_admin_cli.py add_datasets_via_yaml -yf ../data/collections/dataset_collection.yaml -od\")" ] }, { "cell_type": "code", "execution_count": 14, "id": "80de6b9c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Metadata updated for dataset : IRIS.\n", "2024-06-05 09:59:52,741 - INFO - [mongodb_admin.py:749 - add_datasets_via_yaml()\n" ] } ], "source": [ "# Add new datasets/metadata, update existing metadata\n", "run(\"python mongodb_admin_cli.py add_datasets_via_yaml -yf ../data/collections/dataset_collection.yaml -om\")" ] }, { "cell_type": "code", "execution_count": 15, "id": "b1a9f413", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Existing datasets updated with new collection\n", "2024-06-05 09:59:54,418 - INFO - [mongodb_admin.py:749 - add_datasets_via_yaml()\n" ] } ], "source": [ "# Add new datasets/metadata, update existing datasets & metadata\n", "run(\"python mongodb_admin_cli.py add_datasets_via_yaml -yf ../data/collections/dataset_collection.yaml -od -om\")" ] }, { "cell_type": "markdown", "id": "87d686ae", "metadata": {}, "source": [ "Let's see all the dataset collection:" ] }, { "cell_type": "code", "execution_count": 16, "id": "536b5b35", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'dataset_name': 'IRIS',\n", " 'database_type': 'PATH_DB',\n", " 'metadata': {'database_type': 'PATH_DB',\n", " 'metadata_path': '../data/collections/metadata/iris_metadata.yaml'},\n", " 'dataset_path': 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'},\n", " {'dataset_name': 'PENGUIN',\n", " 'database_type': 'PATH_DB',\n", " 'metadata': {'database_type': 'PATH_DB',\n", " 'metadata_path': '../data/collections/metadata/penguin_metadata.yaml'},\n", " 'dataset_path': 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv'},\n", " {'dataset_name': 'TITANIC',\n", " 'database_type': 'S3_DB',\n", " 'metadata': {'database_type': 'S3_DB',\n", " 'bucket': 'example',\n", " 'key': 'metadata/titanic_metadata.yaml',\n", " 'endpoint_url': 'https://api-lomas-minio.lab.sspcloud.fr',\n", " 'aws_access_key_id': 'admin',\n", " 'aws_secret_access_key': 'admin123'},\n", " 'bucket': 'example',\n", " 'key': 'data/titanic.csv',\n", " 'endpoint_url': 'https://api-lomas-minio.lab.sspcloud.fr',\n", " 'aws_access_key_id': 'admin',\n", " 'aws_secret_access_key': 'admin123'},\n", " {'dataset_name': 'FSO_INCOME_SYNTHETIC',\n", " 'database_type': 'PATH_DB',\n", " 'metadata': {'database_type': 'PATH_DB',\n", " 'metadata_path': '../data/collections/metadata/fso_income_synthetic_metadata.yaml'},\n", " 'dataset_path': '../data/datasets/income_synthetic_data.csv'}]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection datasets\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "0746b382-8692-445f-9ca9-0d2407a25259", "metadata": {}, "source": [ "Finally let's have a look at the stored metadata:" ] }, { "cell_type": "code", "execution_count": 17, "id": "c667dda0-5d0f-48c8-956c-8d8a756b7ff7", "metadata": {}, "outputs": [ 
{ "data": { "text/plain": [ "[{'IRIS': {'max_ids': 1,\n", " 'row_privacy': True,\n", " 'columns': {'petal_length': {'type': 'float', 'lower': 0.5, 'upper': 10.0},\n", " 'petal_width': {'type': 'float', 'lower': 0.05, 'upper': 5.0},\n", " 'sepal_length': {'type': 'float', 'lower': 2.0, 'upper': 10.0},\n", " 'sepal_width': {'type': 'float', 'lower': 1.0, 'upper': 6.0},\n", " 'species': {'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['setosa', 'versicolor', 'virginica']}}}},\n", " {'PENGUIN': {'max_ids': 1,\n", " 'row_privacy': True,\n", " 'censor_dims': False,\n", " 'columns': {'species': {'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['Adelie', 'Chinstrap', 'Gentoo']},\n", " 'island': {'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['Torgersen', 'Biscoe', 'Dream']},\n", " 'bill_length_mm': {'type': 'float', 'lower': 30.0, 'upper': 65.0},\n", " 'bill_depth_mm': {'type': 'float', 'lower': 13.0, 'upper': 23.0},\n", " 'flipper_length_mm': {'type': 'float', 'lower': 150.0, 'upper': 250.0},\n", " 'body_mass_g': {'type': 'float', 'lower': 2000.0, 'upper': 7000.0},\n", " 'sex': {'type': 'string',\n", " 'cardinality': 2,\n", " 'categories': ['MALE', 'FEMALE']}}}},\n", " {'TITANIC': {'': {'Schema': {'Table': {'max_ids': 1,\n", " 'PassengerId': {'type': 'int', 'lower': 1},\n", " 'Pclass': {'type': 'int', 'lower': 1, 'upper': 3},\n", " 'Name': {'type': 'string'},\n", " 'Sex': {'type': 'string',\n", " 'cardinality': 2,\n", " 'categories': ['male', 'female']},\n", " 'Age': {'type': 'float', 'lower': 0.1, 'upper': 100.0},\n", " 'SibSp': {'type': 'int', 'lower': 0},\n", " 'Parch': {'type': 'int', 'lower': 0},\n", " 'Ticket': {'type': 'string'},\n", " 'Fare': {'type': 'float', 'lower': 0.0},\n", " 'Cabin': {'type': 'string'},\n", " 'Embarked': {'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['C', 'Q', 'S']},\n", " 'Survived': {'type': 'boolean'},\n", " 'row_privacy': True}}},\n", " 'engine': 'csv'}},\n", " {'FSO_INCOME_SYNTHETIC': {'max_ids': 1,\n", " 'columns': {'region': {'type': 'int'},\n", " 'eco_branch': {'type': 'int'},\n", " 'profession': {'type': 'int'},\n", " 'education': {'type': 'int'},\n", " 'age': {'type': 'int'},\n", " 'sex': {'type': 'int'},\n", " 'income': {'type': 'float', 'lower': 1000, 'upper': 100000}}}}]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection metadata\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "509d0b26", "metadata": {}, "source": [ "If we are interested in a specific dataset, we can also show its collection:" ] }, { "cell_type": "code", "execution_count": 18, "id": "3db07639", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'dataset_name': 'IRIS',\n", " 'database_type': 'PATH_DB',\n", " 'metadata': {'database_type': 'PATH_DB',\n", " 'metadata_path': '../data/collections/metadata/iris_metadata.yaml'},\n", " 'dataset_path': 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_dataset --dataset IRIS\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "48550826", "metadata": {}, "source": [ "And its associated metadata:" ] }, { "cell_type": "code", "execution_count": 19, "id": "efd9931f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'max_ids': 1,\n", " 'row_privacy': True,\n", " 'columns': {'petal_length': {'type': 
'float', 'lower': 0.5, 'upper': 10.0},\n", " 'petal_width': {'type': 'float', 'lower': 0.05, 'upper': 5.0},\n", " 'sepal_length': {'type': 'float', 'lower': 2.0, 'upper': 10.0},\n", " 'sepal_width': {'type': 'float', 'lower': 1.0, 'upper': 6.0},\n", " 'species': {'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['setosa', 'versicolor', 'virginica']}}}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_metadata --dataset IRIS\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "594b83a9", "metadata": {}, "source": [ "We can also get the list of all datasets in the 'datasets' collection:" ] }, { "cell_type": "code", "execution_count": 20, "id": "a6e21f16", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['IRIS', 'PENGUIN', 'TITANIC', 'FSO_INCOME_SYNTHETIC']" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_datasets\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "0e0b85d5", "metadata": {}, "source": [ "## Users" ] }, { "cell_type": "markdown", "id": "14ab18db-4b6d-4663-bde0-b5d9d3d3d2ee", "metadata": {}, "source": [ "### Add user\n", "Let's see which users are already loaded:" ] }, { "cell_type": "code", "execution_count": 21, "id": "7f450145", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\")" ] }, { "cell_type": "markdown", "id": "2d2ae627", "metadata": {}, "source": [ "And now let's add a few users." ] }, { "cell_type": "code", "execution_count": 22, "id": "0f6aa33c-6bd1-4d62-ba06-3533b064340d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to user Mrs. Daisy with dataset IRIS, budget epsilon 10.0 and delta 0.001.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user_with_budget --user 'Mrs. Daisy' --dataset 'IRIS' --epsilon 10.0 --delta 0.001\")" ] }, { "cell_type": "code", "execution_count": 23, "id": "7858f019-8783-4fed-acd8-ff0107d33465", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to user Mr. Coldheart with dataset PENGUIN, budget epsilon 10.0 and delta 0.001.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user_with_budget --user 'Mr.
Coldheart' --dataset 'PENGUIN' --epsilon 10.0 --delta 0.001\")" ] }, { "cell_type": "code", "execution_count": 24, "id": "231e7d93-05ba-424a-8329-d96b0bfb4fb9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to user Lord McFreeze with dataset PENGUIN, budget epsilon 10.0 and delta 0.001.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user_with_budget --user 'Lord McFreeze' --dataset 'PENGUIN' --epsilon 10.0 --delta 0.001\")" ] }, { "cell_type": "markdown", "id": "51b0c274-880c-44f9-9182-6cb162a54c55", "metadata": {}, "source": [ "Users must all have different names; otherwise you will get an error and nothing will be done:" ] }, { "cell_type": "code", "execution_count": 25, "id": "6276730e-39c2-47f1-962f-342c1acb7944", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/code/mongodb_admin_cli.py\", line 461, in \n", " function_map[args.func.__name__](args)\n", " File \"/code/mongodb_admin_cli.py\", line 396, in \n", " \"add_user_with_budget\": lambda args: add_user_with_budget(\n", " ^^^^^^^^^^^^^^^^^^^^^\n", " File \"/code/mongodb_admin.py\", line 50, in wrapper_decorator\n", " raise ValueError(\n", "ValueError: User Lord McFreeze already exists in user collection\n", "\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user_with_budget --user 'Lord McFreeze' --dataset 'IRIS' --epsilon 10.0 --delta 0.001\")" ] }, { "cell_type": "markdown", "id": "49f81f7e-e086-412f-8467-89b665e5559a", "metadata": {}, "source": [ "If you want to give an existing user access to another dataset, just use the `add_dataset_to_user` command." ] }, { "cell_type": "code", "execution_count": 26, "id": "82a5f498-aed1-4779-9d73-b2b71dde4ce0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to dataset IRIS to user Lord McFreeze with budget epsilon 5.0 and delta 0.005.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_dataset_to_user --user 'Lord McFreeze' --dataset 'IRIS' --epsilon 5.0 --delta 0.005\")" ] },
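{ "cell_type": "markdown", "id": "check-user-datasets-note", "metadata": {}, "source": [ "As a quick check, the `get_user_datasets` subcommand (used again later in this notebook) lists the datasets a given user currently has access to. The cell is left unexecuted here:" ] }, { "cell_type": "code", "execution_count": null, "id": "check-user-datasets-code", "metadata": {}, "outputs": [], "source": [ "# List the datasets Lord McFreeze can now query (PENGUIN and IRIS are expected)\n", "run(\"python mongodb_admin_cli.py get_user_datasets --user 'Lord McFreeze'\")" ] },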
] }, { "cell_type": "code", "execution_count": 27, "id": "06839270-36cf-4de7-b93c-d143c4866bc8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added user Madame Frostina.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user --user 'Madame Frostina'\")" ] }, { "cell_type": "markdown", "id": "df41cea4-8219-41a1-9ce3-fad5409db299", "metadata": {}, "source": [ "Let's see the default parameters after the user creation:" ] }, { "cell_type": "code", "execution_count": 28, "id": "1dbe0b34-ef3f-49b9-9153-dd5d09b00e4e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'user_name': 'Madame Frostina', 'may_query': True, 'datasets_list': []}" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_user --user 'Madame Frostina'\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "f4c62a55-92cb-47af-90be-80c7d13db1e6", "metadata": {}, "source": [ "Let's give her access to a dataset with a budget:" ] }, { "cell_type": "code", "execution_count": 29, "id": "e83378fe", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to dataset IRIS to user Madame Frostina with budget epsilon 5.0 and delta 0.005.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_dataset_to_user --user 'Madame Frostina' --dataset 'IRIS' --epsilon 5.0 --delta 0.005\")" ] }, { "cell_type": "code", "execution_count": 30, "id": "919b2652", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to dataset PENGUIN to user Madame Frostina with budget epsilon 5.0 and delta 0.005.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_dataset_to_user --user 'Madame Frostina' --dataset 'PENGUIN' --epsilon 5.0 --delta 0.005\")" ] }, { "cell_type": "markdown", "id": "2dab150b-4ad0-410a-b1eb-e448f8f0d79e", "metadata": {}, "source": [ "Now let's see the user Madame Frostina details to check all is in order:" ] }, { "cell_type": "code", "execution_count": 31, "id": "8833f27e-a342-400a-b868-facf9a44dc6f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'user_name': 'Madame Frostina',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]}" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_user --user 'Madame Frostina'\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "0bed2714", "metadata": {}, "source": [ "And we can also modify existing the total budget of a user:" ] }, { "cell_type": "code", "execution_count": 32, "id": "e3b75cca", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added access to user Dr. Antartica with dataset PENGUIN, budget epsilon 10.0 and delta 0.001.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_user_with_budget --user 'Dr. Antartica' --dataset 'PENGUIN' --epsilon 10.0 --delta 0.001\")" ] }, { "cell_type": "code", "execution_count": 33, "id": "87eecb9c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Set budget of Dr. 
Antartica for dataset PENGUIN of initial_epsilon to 20.0.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py set_budget_field --user 'Dr. Antartica' --dataset 'PENGUIN' --field initial_epsilon --value 20.0\")" ] }, { "cell_type": "markdown", "id": "bbeb5dc2-e91e-4440-8df5-3e9506bf4ee1", "metadata": {}, "source": [ "Let's see the current state of the database:" ] }, { "cell_type": "code", "execution_count": 34, "id": "3b3f61c6-65dc-4b1e-a32e-47cdd2729ab6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'user_name': 'Mrs. Daisy',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Mr. Coldheart',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Lord McFreeze',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Madame Frostina',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Dr. Antartica',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 20.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]}]" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "4e0ae62f-ff80-4234-8102-4dccec0b284f", "metadata": {}, "source": [ "Do not hesitate to re-run this command after every other command to ensure that everything runs as expected." ] }, { "cell_type": "markdown", "id": "9ab1f5ba-68bd-4c96-bacd-b81dfa5d6302", "metadata": {}, "source": [ "### Remove user\n", "You have just heard that the penguin named Coldheart might have malicious intentions and decide to remove his access until an investigation has been carried out. To ensure that he is not allowed to do any more queries, run the following command:" ] }, { "cell_type": "code", "execution_count": 35, "id": "7f341b3d-5a88-4fd9-8c97-cc70145834f1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Set user Mr. Coldheart may query to False.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py set_may_query --user 'Mr. 
Coldheart' --value False\")" ] }, { "cell_type": "markdown", "id": "4cc56586-f9a9-4e88-abed-51ba36a6e4f1", "metadata": {}, "source": [ "Now, he won't be able to do any query (unless you re-run the query with --value True).\n", "\n", "A few days have passed and the investigation reveals that he was aiming to do unethical research, you can remove his dataset by doing:" ] }, { "cell_type": "code", "execution_count": 36, "id": "9153d9af-b4be-4496-9f80-d140870f60fe", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Remove access to dataset PENGUIN from user Mr. Coldheart.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py del_dataset_to_user --user 'Mr. Coldheart' --dataset 'PENGUIN'\")" ] }, { "cell_type": "markdown", "id": "18d411ae-a211-4997-8984-81281c6275eb", "metadata": {}, "source": [ "Or delete him completely from the codebase:" ] }, { "cell_type": "code", "execution_count": 37, "id": "a54e89eb-1ee1-48ad-9e00-bace8516a3ef", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Deleted user Mr. Coldheart.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py del_user --user 'Mr. Coldheart'\")" ] }, { "cell_type": "markdown", "id": "06a7c17f-da34-472a-ad7f-3ae73a1beb7b", "metadata": {}, "source": [ "Let's see the resulting users:" ] }, { "cell_type": "code", "execution_count": 38, "id": "79fa414a-f097-4207-a628-19fa434a1ad3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'user_name': 'Mrs. Daisy',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Lord McFreeze',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Madame Frostina',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Dr. Antartica',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 20.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]}]" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "90a46a59-70ed-4a26-88cd-6ca8f1d17318", "metadata": {}, "source": [ "### Change budget\n", "You also change your mind about the budget allowed to Lord McFreeze and give him a bit more on the penguin dataset." 
] }, { "cell_type": "code", "execution_count": 39, "id": "0909e6c4-141e-4d57-acd2-bdc0a2d92cea", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Set budget of Lord McFreeze for dataset PENGUIN of initial_epsilon to 15.0.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py set_budget_field --user 'Lord McFreeze' --dataset 'PENGUIN' --field initial_epsilon --value 15.0\")" ] }, { "cell_type": "code", "execution_count": 40, "id": "c0e110fe-4297-4559-9a95-bc0ebdfa402c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Set budget of Lord McFreeze for dataset PENGUIN of initial_delta to 0.005.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py set_budget_field --user 'Lord McFreeze' --dataset 'PENGUIN' --field initial_delta --value 0.005\")" ] }, { "cell_type": "markdown", "id": "952d7ed4-ce1d-4a87-9319-6b57968ef20e", "metadata": {}, "source": [ "Let's check all our changes by looking at the state of the database:" ] }, { "cell_type": "code", "execution_count": 41, "id": "2ab46c5d-1553-4925-bd25-61c9c205dc95", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'user_name': 'Mrs. Daisy',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Lord McFreeze',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 15.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Madame Frostina',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0},\n", " {'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Dr. 
Antartica',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 20.0,\n", " 'initial_delta': 0.001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]}]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "ba7cfa86", "metadata": {}, "source": [ "### Finally all can be loaded fom a file direcly" ] }, { "cell_type": "markdown", "id": "43340fc9", "metadata": {}, "source": [ "Let's delete the existing user collection first:" ] }, { "cell_type": "code", "execution_count": 42, "id": "597cb0b3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Deleted collection users.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py drop_collection --collection users\")" ] }, { "cell_type": "markdown", "id": "81661298", "metadata": {}, "source": [ "Is is now empty:" ] }, { "cell_type": "code", "execution_count": 43, "id": "e1638145", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\")" ] }, { "cell_type": "markdown", "id": "20b3cd2c", "metadata": {}, "source": [ "We add the data based on a yaml file:" ] }, { "cell_type": "code", "execution_count": 44, "id": "87b776f2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added user data from yaml.\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_users_via_yaml -yf ../data/collections/user_collection.yaml\")" ] }, { "cell_type": "markdown", "id": "76263ebd", "metadata": {}, "source": [ "By default, *add_users_via_yaml* will only add new users to the database." ] }, { "cell_type": "code", "execution_count": 45, "id": "7f597f68", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No new users added, they already exist in the server\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_users_via_yaml -yf ../data/collections/user_collection.yaml\")" ] }, { "cell_type": "markdown", "id": "3df278ef", "metadata": {}, "source": [ "If you want to clean the current users collection and replace it, you can use the argument *--clean*. " ] }, { "cell_type": "code", "execution_count": 46, "id": "5a610b9f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaning done. \n", "\n", "2024-06-05 10:00:45,678 - INFO - [mongodb_admin.py:464 - add_users_via_yaml()\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_users_via_yaml -yf ../data/collections/user_collection.yaml --clean\")" ] }, { "cell_type": "markdown", "id": "c933165a", "metadata": {}, "source": [ "If you want to add new users and update the existing ones in your collection, you can use the argument *--overwrite*. This will make sure to add new users if they do not exist and replace values from existing users with the collection provided." ] }, { "cell_type": "code", "execution_count": 47, "id": "fd621ac3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Existing users updated. 
\n", "2024-06-05 10:00:47,300 - INFO - [mongodb_admin.py:466 - add_users_via_yaml()\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py add_users_via_yaml -yf ../data/collections/user_collection.yaml --overwrite\")" ] }, { "cell_type": "markdown", "id": "63853e73", "metadata": {}, "source": [ "And let's see the resulting collection:" ] }, { "cell_type": "code", "execution_count": 48, "id": "77866f52", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'user_name': 'Alice',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.0001,\n", " 'total_spent_epsilon': 1.0,\n", " 'total_spent_delta': 1e-06},\n", " {'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 5.0,\n", " 'initial_delta': 0.0005,\n", " 'total_spent_epsilon': 0.2,\n", " 'total_spent_delta': 1e-07}]},\n", " {'user_name': 'Dr. Antartica',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'PENGUIN',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Dr. FSO',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'FSO_INCOME_SYNTHETIC',\n", " 'initial_epsilon': 45.0,\n", " 'initial_delta': 0.005,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Bob',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'IRIS',\n", " 'initial_epsilon': 10.0,\n", " 'initial_delta': 0.0001,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]},\n", " {'user_name': 'Jack',\n", " 'may_query': True,\n", " 'datasets_list': [{'dataset_name': 'TITANIC',\n", " 'initial_epsilon': 45.0,\n", " 'initial_delta': 0.2,\n", " 'total_spent_epsilon': 0.0,\n", " 'total_spent_delta': 0.0}]}]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run(\"python mongodb_admin_cli.py get_collection --collection users\", to_dict=True)" ] }, { "cell_type": "markdown", "id": "b9510647", "metadata": {}, "source": [ "To get a list of all users in the 'users' collection:" ] }, { "cell_type": "code", "execution_count": 49, "id": "7e70e971", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\"Alice\", \"Dr. Antartica\", \"Dr. FSO\", \"Bob\", \"Jack\"]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_users\")" ] }, { "cell_type": "markdown", "id": "e559bc1e", "metadata": {}, "source": [ "We can also get a list of all datasets allocated to an user:" ] }, { "cell_type": "code", "execution_count": 50, "id": "81b73bd6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\"IRIS\", \"PENGUIN\"]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_user_datasets --user Alice\")" ] }, { "cell_type": "markdown", "id": "1a946132", "metadata": {}, "source": [ "## Archives of queries" ] }, { "cell_type": "code", "execution_count": 51, "id": "8025ef4d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "run(\"python mongodb_admin_cli.py get_archives --user Alice\")" ] }, { "cell_type": "markdown", "id": "a27be3d3-77a2-43d3-9a7f-87c8466293fe", "metadata": {}, "source": [ "## Stop the server: do not do it now !\n", "To tear down the service, first do `ctrl+C` in the terminal where you had done `docker compose up`. Wait for the command to finish executing and then run `docker compose down`. 
This will also remove all the containers, but the volume will stay in place." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0rc1" } }, "nbformat": 4, "nbformat_minor": 5 }