{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3f18d338",
   "metadata": {},
   "source": [
    "# Lomas Client Side: Using DiffPrivlib"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1582a2ae",
   "metadata": {},
   "source": [
    "This notebook showcases how researcher could use lomas platform with DiffPrivLib. It explains the different functionnalities provided by the `lomas-client` client library to interact with lomas server.\n",
    "\n",
    "The secure data are never visible by researchers. They can only access to differentially private responses via queries to the server.\n",
    "\n",
    "Each user has access to one or multiple projects and for each dataset has a limited budget with $\\epsilon$ and $\\delta$ values."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b73135c",
   "metadata": {},
   "source": [
    "In this notebook the researcher is a penguin researcher named Dr. Antarctica. She aims to do a grounbdbreaking research on various penguins data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01ae30d2",
   "metadata": {},
   "source": [
    "## Step 1: Install the library\n",
    "To interact with the secure server on which the data is stored, Dr.Antartica first needs to install the library `lomas-client` on her local developping environment. \n",
    "\n",
    "It can be installed via the pip command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8b340f40-32c9-487b-bc0c-a76593d43980",
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install lomas_client"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46c4f70b-1491-4162-930c-e0a86406ba69",
   "metadata": {},
   "source": [
    "Or using a local version of the client"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "36d508bf-6cc3-4034-8e11-fffe858552f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "sys.path.append(os.path.abspath(os.path.join('..')))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9535e92e-620e-4df4-92dd-4ea2c653e4ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "from lomas_client import Client\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c63718b",
   "metadata": {},
   "source": [
    "## Step 2: Initialise the client\n",
    "\n",
    "Once the library is installed, a Client object must be created. It is responsible for sending sending requests to the server and processing responses in the local environment. It enables a seamless interaction with the server. \n",
    "\n",
    "To create the client, Dr. Antartica needs to give it a few parameters:\n",
    "- a url: the root application endpoint to the remote secure server.\n",
    "- user_name: her name as registered in the database (Dr. Alice Antartica)\n",
    "- dataset_name: the name of the dataset that she wants to query (PENGUIN)\n",
    "\n",
    "She will only be able to query on the real dataset if the queen Icergina has previously made her an account in the database, given her access to the PENGUIN dataset and has given her some epsilon and delta credit (as is done in the Admin Notebook for Users and Datasets management)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "941991f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "APP_URL = \"http://lomas_server\"\n",
    "USER_NAME = \"Dr. Antartica\"\n",
    "DATASET_NAME = \"PENGUIN\"\n",
    "client = Client(url=APP_URL, user_name = USER_NAME, dataset_name = DATASET_NAME)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ec400c8",
   "metadata": {},
   "source": [
    "And that's it for the preparation. She is now ready to use the various functionnalities offered by `lomas-client`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b9a5f13",
   "metadata": {},
   "source": [
    "## Step 3: Metadata and dummy dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7cb5531",
   "metadata": {},
   "source": [
    "### Getting dataset metadata\n",
    "\n",
    "Dr. Antartica has never seen the data and as a first step to understand what is available to her, she would like to check the metadata of the dataset. Therefore, she just needs to call the `get_dataset_metadata()` function of the client. As this is public information, this does not cost any budget."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "0fdebac9-57fc-4410-878b-5a77425af634",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'max_ids': 1,\n",
       " 'rows': 344,\n",
       " 'row_privacy': True,\n",
       " 'censor_dims': False,\n",
       " 'columns': {'species': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'string',\n",
       "   'cardinality': 3,\n",
       "   'categories': ['Adelie', 'Chinstrap', 'Gentoo']},\n",
       "  'island': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'string',\n",
       "   'cardinality': 3,\n",
       "   'categories': ['Torgersen', 'Biscoe', 'Dream']},\n",
       "  'bill_length_mm': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'float',\n",
       "   'precision': 64,\n",
       "   'lower': 30.0,\n",
       "   'upper': 65.0},\n",
       "  'bill_depth_mm': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'float',\n",
       "   'precision': 64,\n",
       "   'lower': 13.0,\n",
       "   'upper': 23.0},\n",
       "  'flipper_length_mm': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'float',\n",
       "   'precision': 64,\n",
       "   'lower': 150.0,\n",
       "   'upper': 250.0},\n",
       "  'body_mass_g': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'float',\n",
       "   'precision': 64,\n",
       "   'lower': 2000.0,\n",
       "   'upper': 7000.0},\n",
       "  'sex': {'private_id': False,\n",
       "   'nullable': False,\n",
       "   'max_partition_length': None,\n",
       "   'max_influenced_partitions': None,\n",
       "   'max_partition_contributions': None,\n",
       "   'type': 'string',\n",
       "   'cardinality': 2,\n",
       "   'categories': ['MALE', 'FEMALE']}}}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "penguin_metadata = client.get_dataset_metadata()\n",
    "penguin_metadata"
   ]
  },
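  {
   "cell_type": "markdown",
   "id": "b7e4d1a2-5c3f-4a68-9e10-2f6c8d7a1b34",
   "metadata": {},
   "source": [
    "The nested dictionary above can be hard to scan. As a minimal sketch (not part of `lomas-client`), the hypothetical helper `summarise_columns` below flattens a `columns` mapping of that shape into a pandas table. The two-column excerpt is copied from the output above so that the cell runs on its own; passing `penguin_metadata['columns']` instead should give the full table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2f8a6b0-7d41-4e93-b5a2-6e1d0f9c3a57",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "\n",
    "def summarise_columns(columns_metadata):\n",
    "    # Hypothetical helper: one row per column, with its type and bounds or categories\n",
    "    rows = []\n",
    "    for name, info in columns_metadata.items():\n",
    "        rows.append({\n",
    "            'column': name,\n",
    "            'type': info['type'],\n",
    "            'lower': info.get('lower'),  # present for the float columns in the output above\n",
    "            'upper': info.get('upper'),\n",
    "            'categories': info.get('categories'),  # present for the string columns\n",
    "        })\n",
    "    return pd.DataFrame(rows).set_index('column')\n",
    "\n",
    "\n",
    "# Excerpt copied from the metadata output above, so that this cell runs on its own\n",
    "columns_excerpt = {\n",
    "    'species': {'type': 'string', 'categories': ['Adelie', 'Chinstrap', 'Gentoo']},\n",
    "    'bill_length_mm': {'type': 'float', 'lower': 30.0, 'upper': 65.0},\n",
    "}\n",
    "summarise_columns(columns_excerpt)"
   ]
  },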
  {
   "cell_type": "markdown",
   "id": "9e7ca7ae-bf17-40c8-aa75-2d72fcdd3088",
   "metadata": {},
   "source": [
    "## Step 4: Train Logistic Regression model with DiffPrivLib"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2de1389c-53a7-4098-bc3c-397c12a4b869",
   "metadata": {},
   "source": [
    "We want to train an ML model to guess the species of penguins based on their bill length and depth, flipper length and body mass.\n",
    "\n",
    "Therefore, we use a DiffPrivLib pipeline which:\n",
    "- standard scales the dimensions between the metadata bounds\n",
    "- and then performs a logistic regression\n",
    "to predict the species of penguins."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "2864729f-2ce4-4d81-a446-8e3f2c1493b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.pipeline import Pipeline\n",
    "from diffprivlib import models\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a06365e9-4076-4592-871a-31af91d6a05d",
   "metadata": {},
   "source": [
    "### Classification: Logistic Regression"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea567662-6518-4c10-bd87-0fb6028db263",
   "metadata": {},
   "source": [
    "Dr. Antartica wants to do a logistic regression on the feature columns 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm' and'body_mass_g' to predict penguin species."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "bda4884f-bce2-43b3-875e-dbb135492e79",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']\n",
    "target_columns = ['species']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2fe9db5e-9c57-41f3-a444-c9f67100ba81",
   "metadata": {},
   "source": [
    "#### She starts to write the associated DiffPrivLib pipeline and tries it on the dummy."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eead3541-66e5-4c0f-aa5b-7b97821afe39",
   "metadata": {},
   "source": [
    "If the DiffprivlibCompatibilityWarning is raised by DiffPrivLib library, an warning will be raised the first time (as in DiffPrivLib) then the 'wrong' parameters will be ignored within the server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "804f31cd-f277-47d4-9648-a51872eccf29",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/usr/local/lib/python3.12/site-packages/diffprivlib/utils.py:71: DiffprivlibCompatibilityWarning: Parameter 'svd_solver' is not functional in diffprivlib.  Remove this parameter to suppress this warning.\n",
      "  warnings.warn(f\"Parameter '{arg}' is not functional in diffprivlib.  Remove this parameter to suppress this \"\n"
     ]
    }
   ],
   "source": [
    "# DiffprivlibCompatibilityWarning Error expected\n",
    "dpl_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon = 0.5)),\n",
    "    ('classifier', models.LogisticRegression(epsilon = 1.0, svd_solver='full'))\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15dd0c42-3e20-4b27-95b3-9b55622b4bfd",
   "metadata": {},
   "source": [
    "To resolve the DiffprivlibCompatibilityWarning issue, the svd_solver should not be set as it is incompatible with DiffPrivLib. If these warnings are ignore by the user, the default behaviour of DiffPrivLib will be applied."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c5d7fa4-cfe6-4a4f-88ff-8e88ca31dfed",
   "metadata": {},
   "source": [
    "If PrivacyLeakWarning are encountered, then the query will not be processed by the server and will return an error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "f1c9cffb-8327-400d-ab9d-35c5450fd4d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon = 0.5)),\n",
    "    ('classifier', models.LogisticRegression(epsilon = 1.0))\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "db1ddfbc-de3e-43fe-9958-49f9e6dad89f",
   "metadata": {},
   "outputs": [
    {
     "ename": "ExternalLibraryException",
     "evalue": "('diffprivlib', \"PrivacyLeakWarning: Bounds parameter hasn't been specified, so falling back to determining bounds from the data.\\n This will result in additional privacy leakage.  To ensure differential privacy with no additional privacy loss, specify `bounds` for each valued returned by np.mean().. Lomas server cannot fit pipeline on data, PrivacyLeakWarning is a blocker.\")",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mExternalLibraryException\u001b[0m                  Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[10], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# Expect PrivacyLeakWarning Error\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m dummy_response \u001b[38;5;241m=\u001b[39m \u001b[43mclient\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdiffprivlib\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mquery\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m      3\u001b[0m \u001b[43m    \u001b[49m\u001b[43mpipeline\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mdpl_pipeline\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      4\u001b[0m \u001b[43m    \u001b[49m\u001b[43mfeature_columns\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m  \u001b[49m\u001b[43mfeature_columns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      5\u001b[0m \u001b[43m    \u001b[49m\u001b[43mtarget_columns\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mtarget_columns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      6\u001b[0m \u001b[43m    \u001b[49m\u001b[43mdummy\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m      7\u001b[0m \u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m/code/lomas_client/libraries/diffprivlib.py:153\u001b[0m, in \u001b[0;36mDiffPrivLibClient.query\u001b[0;34m(self, pipeline, feature_columns, target_columns, test_size, test_train_split_seed, imputer_strategy, dummy, nb_rows, seed)\u001b[0m\n\u001b[1;32m    150\u001b[0m     r_model \u001b[38;5;241m=\u001b[39m QueryResponse\u001b[38;5;241m.\u001b[39mmodel_validate_json(data)\n\u001b[1;32m    151\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m r_model\n\u001b[0;32m--> 153\u001b[0m \u001b[43mraise_error\u001b[49m\u001b[43m(\u001b[49m\u001b[43mres\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    154\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n",
      "File \u001b[0;32m/code/lomas_client/utils.py:38\u001b[0m, in \u001b[0;36mraise_error\u001b[0;34m(response)\u001b[0m\n\u001b[1;32m     36\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m InvalidQueryException(error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInvalidQueryException\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m     37\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m response\u001b[38;5;241m.\u001b[39mstatus_code \u001b[38;5;241m==\u001b[39m status\u001b[38;5;241m.\u001b[39mHTTP_422_UNPROCESSABLE_ENTITY:\n\u001b[0;32m---> 38\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m ExternalLibraryException(\n\u001b[1;32m     39\u001b[0m         error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlibrary\u001b[39m\u001b[38;5;124m\"\u001b[39m], error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mExternalLibraryException\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m     40\u001b[0m     )\n\u001b[1;32m     41\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m response\u001b[38;5;241m.\u001b[39mstatus_code \u001b[38;5;241m==\u001b[39m status\u001b[38;5;241m.\u001b[39mHTTP_403_FORBIDDEN:\n\u001b[1;32m     42\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m UnauthorizedAccessException(error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mUnauthorizedAccessException\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n",
      "\u001b[0;31mExternalLibraryException\u001b[0m: ('diffprivlib', \"PrivacyLeakWarning: Bounds parameter hasn't been specified, so falling back to determining bounds from the data.\\n This will result in additional privacy leakage.  To ensure differential privacy with no additional privacy loss, specify `bounds` for each valued returned by np.mean().. Lomas server cannot fit pipeline on data, PrivacyLeakWarning is a blocker.\")"
     ]
    }
   ],
   "source": [
    "# Expect PrivacyLeakWarning Error\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns =  feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9d58006-89d5-4110-9657-641256bafaf9",
   "metadata": {},
   "source": [
    "Diffprivlib requests that have **PrivacyLeakWarning** will not be processed in the server. \n",
    "In lomas, the bounds must always be specified. For most model, it is best to use **the standard scaler must always be used as a first step** and fill it based on the metadata values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "87384eb3-6b6c-44f8-b653-af471b234a2d",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_bounds(cols_metadata, columns):\n",
    "    lower = [cols_metadata[col][\"lower\"] for col in columns]\n",
    "    upper = [cols_metadata[col][\"upper\"] for col in columns]\n",
    "    return (lower, upper)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "12d1faa1-f88a-49bb-911a-83f879ca10b6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([30.0, 13.0, 150.0, 2000.0], [65.0, 23.0, 250.0, 7000.0])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)\n",
    "bounds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "1ca11e06-0b1e-4238-a5c1-1e52a6569431",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon = 0.5, bounds=bounds)),\n",
    "    ('classifier', models.LogisticRegression(epsilon = 1.0))\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "235d7da6-f6bd-4e71-9b07-84cbe283b74e",
   "metadata": {},
   "outputs": [
    {
     "ename": "ExternalLibraryException",
     "evalue": "('diffprivlib', 'PrivacyLeakWarning: Data norm has not been specified and will be calculated on the data provided.  This will result in additional privacy leakage. To ensure differential privacy and no additional privacy leakage, specify `data_norm` at initialisation.. Lomas server cannot fit pipeline on data, PrivacyLeakWarning is a blocker.')",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mExternalLibraryException\u001b[0m                  Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[14], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# Expect PrivacyLeakWarning Error\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m dummy_response \u001b[38;5;241m=\u001b[39m \u001b[43mclient\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdiffprivlib\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mquery\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m      3\u001b[0m \u001b[43m    \u001b[49m\u001b[43mpipeline\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mdpl_pipeline\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      4\u001b[0m \u001b[43m    \u001b[49m\u001b[43mfeature_columns\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mfeature_columns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      5\u001b[0m \u001b[43m    \u001b[49m\u001b[43mtarget_columns\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mtarget_columns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      6\u001b[0m \u001b[43m    \u001b[49m\u001b[43mdummy\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m      7\u001b[0m \u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m/code/lomas_client/libraries/diffprivlib.py:153\u001b[0m, in \u001b[0;36mDiffPrivLibClient.query\u001b[0;34m(self, pipeline, feature_columns, target_columns, test_size, test_train_split_seed, imputer_strategy, dummy, nb_rows, seed)\u001b[0m\n\u001b[1;32m    150\u001b[0m     r_model \u001b[38;5;241m=\u001b[39m QueryResponse\u001b[38;5;241m.\u001b[39mmodel_validate_json(data)\n\u001b[1;32m    151\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m r_model\n\u001b[0;32m--> 153\u001b[0m \u001b[43mraise_error\u001b[49m\u001b[43m(\u001b[49m\u001b[43mres\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    154\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n",
      "File \u001b[0;32m/code/lomas_client/utils.py:38\u001b[0m, in \u001b[0;36mraise_error\u001b[0;34m(response)\u001b[0m\n\u001b[1;32m     36\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m InvalidQueryException(error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInvalidQueryException\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m     37\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m response\u001b[38;5;241m.\u001b[39mstatus_code \u001b[38;5;241m==\u001b[39m status\u001b[38;5;241m.\u001b[39mHTTP_422_UNPROCESSABLE_ENTITY:\n\u001b[0;32m---> 38\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m ExternalLibraryException(\n\u001b[1;32m     39\u001b[0m         error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlibrary\u001b[39m\u001b[38;5;124m\"\u001b[39m], error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mExternalLibraryException\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m     40\u001b[0m     )\n\u001b[1;32m     41\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m response\u001b[38;5;241m.\u001b[39mstatus_code \u001b[38;5;241m==\u001b[39m status\u001b[38;5;241m.\u001b[39mHTTP_403_FORBIDDEN:\n\u001b[1;32m     42\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m UnauthorizedAccessException(error_message[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mUnauthorizedAccessException\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n",
      "\u001b[0;31mExternalLibraryException\u001b[0m: ('diffprivlib', 'PrivacyLeakWarning: Data norm has not been specified and will be calculated on the data provided.  This will result in additional privacy leakage. To ensure differential privacy and no additional privacy leakage, specify `data_norm` at initialisation.. Lomas server cannot fit pipeline on data, PrivacyLeakWarning is a blocker.')"
     ]
    }
   ],
   "source": [
    "# Expect PrivacyLeakWarning Error\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c47aec3e-15ea-476d-9dcb-5b2f4888e355",
   "metadata": {},
   "source": [
    "Again, we have a Privacy Leak. For the same reason, the data_norm should be computed based on metadata and given as argument as explained in the error message."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "25a72ad9-d3f3-478a-8d33-c07bedbf4f66",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The max l2 norm of any row of the data. This defines the spread of data that will be protected by differential privacy.\n",
    "data_norm = np.sqrt(np.linalg.norm(bounds[1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "828a743e-960f-4713-a6ed-a0bd243a13e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon = 0.5, bounds=bounds)),\n",
    "    ('classifier', models.LogisticRegression(epsilon = 1.0, data_norm = data_norm))\n",
    "])"
   ]
  },
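  {
   "cell_type": "markdown",
   "id": "d9a3e5f1-2b6c-4c07-8a4d-7f0e1b2c3d68",
   "metadata": {},
   "source": [
    "Each step of the pipeline carries its own `epsilon`. As a rough sketch, assuming the steps compose sequentially so that their epsilons add up (how the lomas server actually accounts for the cost is not shown in this notebook), the total epsilon requested by a pipeline can be read off its steps. The name `sketch_pipeline` is only used for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4b1c7a9-0f52-4d36-9c8e-3a5d6f7b8c90",
   "metadata": {},
   "outputs": [],
   "source": [
    "from diffprivlib import models\n",
    "from sklearn.pipeline import Pipeline\n",
    "\n",
    "# Same epsilon values as the pipeline above; bounds and data_norm are left out\n",
    "# because nothing is fitted in this cell\n",
    "sketch_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon=0.5)),\n",
    "    ('classifier', models.LogisticRegression(epsilon=1.0)),\n",
    "])\n",
    "\n",
    "# Assumption: sequential composition, so the per-step epsilons simply add up\n",
    "total_epsilon = sum(step.epsilon for _, step in sketch_pipeline.steps)\n",
    "total_epsilon"
   ]
  },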
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "c9c13dfe-0266-4cd8-b126-2accee6c1136",
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ccdb7f5-5968-47e7-99d4-16067b8645a2",
   "metadata": {},
   "source": [
    "The pipeline worked, she can check that she has a dummy model and a dummy score associated. In the case of a Logistic Regression the score is a mean accuracy as specified [here](https://diffprivlib.readthedocs.io/en/latest/modules/models.html#diffprivlib.models.LogisticRegression.score).\n",
    "Each model return an associated score. The associated documentation is in the DiffPrivLib documentation in the `score` method of each model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "3c38f919-7ca1-455c-9b0f-bc2b56f60c0f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-1 {\n",
       "  /* Definition of color scheme common for light and dark mode */\n",
       "  --sklearn-color-text: black;\n",
       "  --sklearn-color-line: gray;\n",
       "  /* Definition of color scheme for unfitted estimators */\n",
       "  --sklearn-color-unfitted-level-0: #fff5e6;\n",
       "  --sklearn-color-unfitted-level-1: #f6e4d2;\n",
       "  --sklearn-color-unfitted-level-2: #ffe0b3;\n",
       "  --sklearn-color-unfitted-level-3: chocolate;\n",
       "  /* Definition of color scheme for fitted estimators */\n",
       "  --sklearn-color-fitted-level-0: #f0f8ff;\n",
       "  --sklearn-color-fitted-level-1: #d4ebff;\n",
       "  --sklearn-color-fitted-level-2: #b3dbfd;\n",
       "  --sklearn-color-fitted-level-3: cornflowerblue;\n",
       "\n",
       "  /* Specific color for light theme */\n",
       "  --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
       "  --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-icon: #696969;\n",
       "\n",
       "  @media (prefers-color-scheme: dark) {\n",
       "    /* Redefinition of color scheme for dark theme */\n",
       "    --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
       "    --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-icon: #878787;\n",
       "  }\n",
       "}\n",
       "\n",
       "#sk-container-id-1 {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 pre {\n",
       "  padding: 0;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-hidden--visually {\n",
       "  border: 0;\n",
       "  clip: rect(1px 1px 1px 1px);\n",
       "  clip: rect(1px, 1px, 1px, 1px);\n",
       "  height: 1px;\n",
       "  margin: -1px;\n",
       "  overflow: hidden;\n",
       "  padding: 0;\n",
       "  position: absolute;\n",
       "  width: 1px;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-dashed-wrapped {\n",
       "  border: 1px dashed var(--sklearn-color-line);\n",
       "  margin: 0 0.4em 0.5em 0.4em;\n",
       "  box-sizing: border-box;\n",
       "  padding-bottom: 0.4em;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-container {\n",
       "  /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
       "     but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
       "     so we also need the `!important` here to be able to override the\n",
       "     default hidden behavior on the sphinx rendered scikit-learn.org.\n",
       "     See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
       "  display: inline-block !important;\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-text-repr-fallback {\n",
       "  display: none;\n",
       "}\n",
       "\n",
       "div.sk-parallel-item,\n",
       "div.sk-serial,\n",
       "div.sk-item {\n",
       "  /* draw centered vertical line to link estimators */\n",
       "  background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
       "  background-size: 2px 100%;\n",
       "  background-repeat: no-repeat;\n",
       "  background-position: center center;\n",
       "}\n",
       "\n",
       "/* Parallel-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item::after {\n",
       "  content: \"\";\n",
       "  width: 100%;\n",
       "  border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
       "  flex-grow: 1;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel {\n",
       "  display: flex;\n",
       "  align-items: stretch;\n",
       "  justify-content: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:first-child::after {\n",
       "  align-self: flex-end;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:last-child::after {\n",
       "  align-self: flex-start;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:only-child::after {\n",
       "  width: 0;\n",
       "}\n",
       "\n",
       "/* Serial-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-serial {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "  align-items: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  padding-right: 1em;\n",
       "  padding-left: 1em;\n",
       "}\n",
       "\n",
       "\n",
       "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
       "clickable and can be expanded/collapsed.\n",
       "- Pipeline and ColumnTransformer use this feature and define the default style\n",
       "- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
       "*/\n",
       "\n",
       "/* Pipeline and ColumnTransformer style (default) */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable {\n",
       "  /* Default theme specific background. It is overwritten whether we have a\n",
       "  specific estimator or a Pipeline/ColumnTransformer */\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "/* Toggleable label */\n",
       "#sk-container-id-1 label.sk-toggleable__label {\n",
       "  cursor: pointer;\n",
       "  display: block;\n",
       "  width: 100%;\n",
       "  margin-bottom: 0;\n",
       "  padding: 0.5em;\n",
       "  box-sizing: border-box;\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n",
       "  /* Arrow on the left of the label */\n",
       "  content: \"▸\";\n",
       "  float: left;\n",
       "  margin-right: 0.25em;\n",
       "  color: var(--sklearn-color-icon);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "/* Toggleable content - dropdown */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content {\n",
       "  max-height: 0;\n",
       "  max-width: 0;\n",
       "  overflow: hidden;\n",
       "  text-align: left;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content pre {\n",
       "  margin: 0.2em;\n",
       "  border-radius: 0.25em;\n",
       "  color: var(--sklearn-color-text);\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
       "  /* Expand drop-down */\n",
       "  max-height: 200px;\n",
       "  max-width: 100%;\n",
       "  overflow: auto;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
       "  content: \"▾\";\n",
       "}\n",
       "\n",
       "/* Pipeline/ColumnTransformer-specific style */\n",
       "\n",
       "#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator-specific style */\n",
       "\n",
       "/* Colorize estimator box */\n",
       "#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  /* The background is the default theme color */\n",
       "  color: var(--sklearn-color-text-on-default-background);\n",
       "}\n",
       "\n",
       "/* On hover, darken the color of the background */\n",
       "#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "/* Label box, darken color on hover, fitted */\n",
       "#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator label */\n",
       "\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  font-family: monospace;\n",
       "  font-weight: bold;\n",
       "  display: inline-block;\n",
       "  line-height: 1.2em;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label-container {\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "/* Estimator-specific */\n",
       "#sk-container-id-1 div.sk-estimator {\n",
       "  font-family: monospace;\n",
       "  border: 1px dotted var(--sklearn-color-border-box);\n",
       "  border-radius: 0.25em;\n",
       "  box-sizing: border-box;\n",
       "  margin-bottom: 0.5em;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "/* on hover */\n",
       "#sk-container-id-1 div.sk-estimator:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
       "\n",
       "/* Common style for \"i\" and \"?\" */\n",
       "\n",
       ".sk-estimator-doc-link,\n",
       "a:link.sk-estimator-doc-link,\n",
       "a:visited.sk-estimator-doc-link {\n",
       "  float: right;\n",
       "  font-size: smaller;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1em;\n",
       "  height: 1em;\n",
       "  width: 1em;\n",
       "  text-decoration: none !important;\n",
       "  margin-left: 1ex;\n",
       "  /* unfitted */\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted,\n",
       "a:link.sk-estimator-doc-link.fitted,\n",
       "a:visited.sk-estimator-doc-link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "/* Span, style for the box shown on hovering the info icon */\n",
       ".sk-estimator-doc-link span {\n",
       "  display: none;\n",
       "  z-index: 9999;\n",
       "  position: relative;\n",
       "  font-weight: normal;\n",
       "  right: .2ex;\n",
       "  padding: .5ex;\n",
       "  margin: .5ex;\n",
       "  width: min-content;\n",
       "  min-width: 20ex;\n",
       "  max-width: 50ex;\n",
       "  color: var(--sklearn-color-text);\n",
       "  box-shadow: 2pt 2pt 4pt #999;\n",
       "  /* unfitted */\n",
       "  background: var(--sklearn-color-unfitted-level-0);\n",
       "  border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted span {\n",
       "  /* fitted */\n",
       "  background: var(--sklearn-color-fitted-level-0);\n",
       "  border: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link:hover span {\n",
       "  display: block;\n",
       "}\n",
       "\n",
       "/* \"?\"-specific style due to the `<a>` HTML tag */\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link {\n",
       "  float: right;\n",
       "  font-size: 1rem;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1rem;\n",
       "  height: 1rem;\n",
       "  width: 1rem;\n",
       "  text-decoration: none;\n",
       "  /* unfitted */\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "#sk-container-id-1 a.estimator_doc_link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>Pipeline(steps=[(&#x27;scaler&#x27;,\n",
       "                 StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "                                bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                        array([  65.,   23.,  250., 7000.])),\n",
       "                                epsilon=0.5)),\n",
       "                (&#x27;classifier&#x27;,\n",
       "                 LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                                    data_norm=83.69469642643347))])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" ><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;&nbsp;Pipeline<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.4/modules/generated/sklearn.pipeline.Pipeline.html\">?<span>Documentation for Pipeline</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></label><div class=\"sk-toggleable__content fitted\"><pre>Pipeline(steps=[(&#x27;scaler&#x27;,\n",
       "                 StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "                                bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                        array([  65.,   23.,  250., 7000.])),\n",
       "                                epsilon=0.5)),\n",
       "                (&#x27;classifier&#x27;,\n",
       "                 LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                                    data_norm=83.69469642643347))])</pre></div> </div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-2\" type=\"checkbox\" ><label for=\"sk-estimator-id-2\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">StandardScaler</label><div class=\"sk-toggleable__content fitted\"><pre>StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "               bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                       array([  65.,   23.,  250., 7000.])),\n",
       "               epsilon=0.5)</pre></div> </div></div><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-3\" type=\"checkbox\" ><label for=\"sk-estimator-id-3\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">LogisticRegression</label><div class=\"sk-toggleable__content fitted\"><pre>LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                   data_norm=83.69469642643347)</pre></div> </div></div></div></div></div></div>"
      ],
      "text/plain": [
       "Pipeline(steps=[('scaler',\n",
       "                 StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "                                bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                        array([  65.,   23.,  250., 7000.])),\n",
       "                                epsilon=0.5)),\n",
       "                ('classifier',\n",
       "                 LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                                    data_norm=83.69469642643347))])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dummy_response.result.model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a61629d-b96f-4114-8bc2-eccf34d69d2b",
   "metadata": {},
   "source": [
    "Now that the pipeline seems to work, she wants to try another data imputation method: by default the missing data are dropped, but she wants to replace them with the mean instead. To do so, she uses the `imputer_strategy` argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "547935ca-0932-4476-8ace-3644c7e0a08d",
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    imputer_strategy = \"mean\",\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4587385a-b76e-45f4-a1d9-af0a88a1181f",
   "metadata": {},
   "source": [
    "It also works. Besides `imputer_strategy = \"mean\"`, she could also replace missing values with the median (`imputer_strategy = \"median\"`) or with the most frequent value (`imputer_strategy = \"most_frequent\"`), which makes more sense for categorical columns."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "381cd096-bde3-474b-b24d-7fcb4c6ae49f",
   "metadata": {},
   "source": [
    "Finally, she wants to use as much data as possible to train the model, so she reduces `test_size` to 0.1 (meaning that 10% of the data is used as the test set and 90% as the training set). She also changes `test_train_split_seed`, the seed for the random split between training and testing data, simply to try a different split. By default, `test_size = 0.2` and `test_train_split_seed = 1`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "73e8b9ab-2d93-4333-ace3-c94c714410dc",
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.1,\n",
    "    test_train_split_seed = 4,\n",
    "    imputer_strategy = \"mean\",\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f9c2330-15f3-46f9-94c1-d62ea350e2e8",
   "metadata": {},
   "source": [
    "#### She can now estimate the cost of this pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "e036b55f-6c3d-4212-8d91-9ece8223cf69",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CostResponse(epsilon=1.5, delta=0.0)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res = client.diffprivlib.cost(\n",
    "    dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.1,\n",
    "    test_train_split_seed = 4,\n",
    "    imputer_strategy = \"mean\",\n",
    ")\n",
    "res"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "4398755f-348f-47e2-9329-80c34897e16b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'The cost will be 1.5 epsilon and 0.0 delta.'"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f\"The cost will be {res.epsilon} epsilon and {res.delta} delta.\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aad5f86-20f7-42fe-8efd-56c96f6beb41",
   "metadata": {},
   "source": [
    "Now we train the same pipeline on the real dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "b823db85-d993-4a36-9dea-bf5989c543f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "res = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.1,\n",
    "    test_train_split_seed = 4,\n",
    "    imputer_strategy = \"mean\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "6c071425-5364-4d79-8ccf-4f46018ac849",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'The accuracy score of the model trained on real data is 0.6.'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f\"The accuracy score of the model trained on real data is {res.result.score}.\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9cea3fc7-4331-4387-a19f-6afcaa21b6bc",
   "metadata": {},
   "source": [
    "The trained model, whose parameters were fitted on the real data, is also available:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "b646cd08-504b-4846-88ba-a3557ee4b2fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "model = res.result.model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a581f47d-f1c7-4121-9f8f-e496e9023d58",
   "metadata": {},
   "source": [
    "We predict the species of the smallest possible penguin in every dimension versus the biggest possible penguin in every dimension."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "65413ef0-317b-431d-9ed8-6ffabe85c1b2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>bill_depth_mm</th>\n",
       "      <th>flipper_length_mm</th>\n",
       "      <th>body_mass_g</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>150.0</td>\n",
       "      <td>2000.0</td>\n",
       "      <td>Adelie</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>23.0</td>\n",
       "      <td>250.0</td>\n",
       "      <td>7000.0</td>\n",
       "      <td>Gentoo</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g predictions\n",
       "0            30.0           13.0              150.0       2000.0      Adelie\n",
       "1            65.0           23.0              250.0       7000.0      Gentoo"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Row 0 holds the lower bound of each feature, row 1 the upper bound.\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bounds[0][0], bounds[1][0]],\n",
    "    'bill_depth_mm': [bounds[0][1], bounds[1][1]],\n",
    "    'flipper_length_mm': [bounds[0][2], bounds[1][2]],\n",
    "    'body_mass_g': [bounds[0][3], bounds[1][3]]\n",
    "})\n",
    "\n",
    "predictions = model.predict(x_to_predict)\n",
    "x_to_predict[\"predictions\"] = predictions\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "839ba199-3014-4e39-8f0a-aecf15241b11",
   "metadata": {},
   "source": [
    "## Step 5: Train other models with DiffPrivLib"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f161f96-e240-45b2-bb60-002870196158",
   "metadata": {},
   "source": [
    "The logic is the same for all models. The `pipeline` and `feature_columns` arguments must always be specified. `target_columns` must also be specified, except for clustering (K-Means) and dimensionality reduction (PCA).\n",
    "\n",
    "Here are examples of each on dummy dataframes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fef35343-cdf6-4e86-a54a-4bcf6568f2ac",
   "metadata": {},
   "source": [
    "### Classification: Gaussian Naive Bayes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "f8a5ddc0-61bb-44c0-a14b-5ac3278ff539",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm']\n",
    "target_columns = ['species']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "4f4b9789-7da1-4abe-9dd6-dd1472a4e217",
   "metadata": {},
   "outputs": [],
   "source": [
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "9dd37857-0ef2-44d5-b206-092262c9cea1",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    ('scaler', models.StandardScaler(epsilon = 0.5, bounds=bounds)),\n",
    "    ('gaussian', models.GaussianNB(epsilon = 1.0, bounds=bounds, priors = (0.3, 0.3, 0.4))),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "8e2c07c6-20f4-473c-80f6-b4b558191530",
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.15,\n",
    "    imputer_strategy = \"median\",\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "550868df-5476-496e-849d-8787da9eb27e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CostResponse(epsilon=1.5, delta=0.0)"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cost_res = client.diffprivlib.cost(\n",
    "    dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.15,\n",
    "    imputer_strategy = \"median\",\n",
    ")\n",
    "cost_res"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "934a1237-68d1-4796-838e-e8c370068f5d",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    imputer_strategy = \"median\",\n",
    "    test_size = 0.15,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "81708812-897a-4357-af06-139220538831",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Row 0 holds the lower bound of each feature, row 1 the upper bound.\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bounds[0][0], bounds[1][0]],\n",
    "    'bill_depth_mm': [bounds[0][1], bounds[1][1]],\n",
    "    'flipper_length_mm': [bounds[0][2], bounds[1][2]],\n",
    "})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "ec3c98d3-cb5d-4a96-9ada-ea9904326aec",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>bill_depth_mm</th>\n",
       "      <th>flipper_length_mm</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>150.0</td>\n",
       "      <td>Chinstrap</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>23.0</td>\n",
       "      <td>250.0</td>\n",
       "      <td>Chinstrap</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  bill_depth_mm  flipper_length_mm predictions\n",
       "0            30.0           13.0              150.0   Chinstrap\n",
       "1            65.0           23.0              250.0   Chinstrap"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "predictions = response.result.model.predict(x_to_predict)\n",
    "x_to_predict[\"predictions\"] = predictions\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1224c94-d995-40f1-967d-b7afb21bce48",
   "metadata": {},
   "source": [
    "### Random Forest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "980f4494-4b1f-4cb0-b288-ddc672231b1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'bill_depth_mm', 'body_mass_g']\n",
    "target_columns = ['island']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "e91f9a5c-14ea-4e48-8e42-fa50152767b7",
   "metadata": {},
   "outputs": [],
   "source": [
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "d4f036a7-0405-4f4b-a4a0-3fe1b0b8d4d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    (\n",
    "        'rf', \n",
    "        models.RandomForestClassifier(\n",
    "            n_estimators=10, \n",
    "            epsilon = 2.0, \n",
    "            bounds=bounds, \n",
    "            classes=penguin_metadata['columns']['island']['categories']\n",
    "        )\n",
    "    ),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "c969d488-e1a2-445a-a269-681d96343b9f",
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    imputer_strategy = \"drop\", #default\n",
    "    dummy = True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "de94f0a5-6e9d-4eac-aa77-b98892ed41fc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CostResponse(epsilon=2.0, delta=0.0)"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cost_res = client.diffprivlib.cost(\n",
    "    dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    imputer_strategy = \"drop\",  # default\n",
    ")\n",
    "cost_res"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "40f25e33-a644-4938-8adb-a7bbe5306ee2",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    imputer_strategy = \"drop\",  # default\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "4f5fad30-991e-45f3-a7e3-1ae04bedd8a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "model = response.result.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "96562928-73e8-48b1-b8a5-97245c7da8d2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>bill_depth_mm</th>\n",
       "      <th>body_mass_g</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>2000.0</td>\n",
       "      <td>Biscoe</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>23.0</td>\n",
       "      <td>7000.0</td>\n",
       "      <td>Torgersen</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  bill_depth_mm  body_mass_g predictions\n",
       "0            30.0           13.0       2000.0      Biscoe\n",
       "1            65.0           23.0       7000.0   Torgersen"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Row 0 holds the lower bound of each feature, row 1 the upper bound.\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bounds[0][0], bounds[1][0]],\n",
    "    'bill_depth_mm': [bounds[0][1], bounds[1][1]],\n",
    "    'body_mass_g': [bounds[0][2], bounds[1][2]]\n",
    "})\n",
    "predictions = model.predict(x_to_predict)\n",
    "x_to_predict[\"predictions\"] = predictions\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "020ceeb1-7a78-4036-8ece-0afa88e49342",
   "metadata": {},
   "source": [
    "### Classification: Decision Tree Classifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "c5400d29-8303-4714-8617-8a2b26b0aa2e",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'body_mass_g']\n",
    "target_columns = ['species']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "1760f475-6326-41e5-945a-66882183bc00",
   "metadata": {},
   "outputs": [],
   "source": [
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "22b5c5ae-b008-4980-8767-47e7a1ad3301",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    (\n",
    "        'dtc',\n",
    "        models.DecisionTreeClassifier(\n",
    "            epsilon=2.0,\n",
    "            bounds=bounds,\n",
    "            classes=penguin_metadata['columns']['species']['categories']\n",
    "        )\n",
    "    ),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "70b73ba0-67dc-4594-820c-0128916fafc8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dry run on the server's dummy dataset to validate the pipeline before the real query\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.2,\n",
    "    test_train_split_seed = 1,\n",
    "    dummy = True,\n",
    "    nb_rows = 100,\n",
    "    seed = 42\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "736e8283-604a-4e45-9e98-7da5cc03e1b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Same pipeline on the private dataset; this query consumes privacy budget\n",
    "response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    test_size = 0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "66bfe1a9-7982-4d6b-ac0b-32a4412f29bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "model = response.result.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "a3719359-58a6-4271-9a34-4b49e0d7188e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>body_mass_g</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>2000.0</td>\n",
       "      <td>Gentoo</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>7000.0</td>\n",
       "      <td>Chinstrap</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  body_mass_g predictions\n",
       "0            30.0       2000.0      Gentoo\n",
       "1            65.0       7000.0   Chinstrap"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Predict at the lower and upper bounds of each feature\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bounds[0][0], bounds[1][0]],\n",
    "    'body_mass_g': [bounds[0][1], bounds[1][1]],\n",
    "})\n",
    "x_to_predict[\"predictions\"] = model.predict(x_to_predict)\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e72e9fc-bb5b-49d9-b4e6-543bdaf93f69",
   "metadata": {},
   "source": [
    "### Regression: Linear Regression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "c893cbb0-26b7-4367-ab67-1ccc88f951ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm']\n",
    "target_columns = ['bill_depth_mm']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "c7c66418-74b3-4887-bb99-3d8b6ea7af3d",
   "metadata": {},
   "outputs": [],
   "source": [
    "bill_length_meta = penguin_metadata['columns']['bill_length_mm']\n",
    "bill_depth_meta = penguin_metadata['columns']['bill_depth_mm']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "907e6f6c-a3da-4f7c-8d85-b0a7b8c018d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    (\n",
    "        'lr',\n",
    "        models.LinearRegression(\n",
    "            epsilon=2.0,\n",
    "            bounds_X=(bill_length_meta['lower'], bill_length_meta['upper']),\n",
    "            bounds_y=(bill_depth_meta['lower'], bill_depth_meta['upper'])\n",
    "        )\n",
    "    ),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "58011a7d-7c03-4872-a032-5d93c55c1d5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dry run on the dummy dataset; the resulting model is fitted on dummy data only\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    target_columns = target_columns,\n",
    "    dummy = True\n",
    ")\n",
    "model = dummy_response.result.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "e28b1bd0-0e73-49db-a2d3-e7f06dc39d49",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>17.985419</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>17.489243</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  predictions\n",
       "0            30.0    17.985419\n",
       "1            65.0    17.489243"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Predictions of the dummy-trained model at the lower and upper bill-length bounds\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bill_length_meta['lower'], bill_length_meta['upper']],\n",
    "})\n",
    "x_to_predict[\"predictions\"] = model.predict(x_to_predict)\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec9e921d-1640-4379-84e5-65b93a1ff203",
   "metadata": {},
   "source": [
    "### Clustering: K-Means"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "2491d06f-3c40-4649-b419-ccd9b39f0764",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "10acc43d-ff2e-4d7a-9c52-361ffb80e0da",
   "metadata": {},
   "outputs": [],
   "source": [
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "1e830a01-7800-4701-aef2-11dbfc517f61",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    ('kmeans', models.KMeans(n_clusters=8, epsilon=2.0, bounds=bounds)),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "cb5e3e1f-2d48-4d5e-992a-c0e0cb6add3b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-2 {\n",
       "  /* Definition of color scheme common for light and dark mode */\n",
       "  --sklearn-color-text: black;\n",
       "  --sklearn-color-line: gray;\n",
       "  /* Definition of color scheme for unfitted estimators */\n",
       "  --sklearn-color-unfitted-level-0: #fff5e6;\n",
       "  --sklearn-color-unfitted-level-1: #f6e4d2;\n",
       "  --sklearn-color-unfitted-level-2: #ffe0b3;\n",
       "  --sklearn-color-unfitted-level-3: chocolate;\n",
       "  /* Definition of color scheme for fitted estimators */\n",
       "  --sklearn-color-fitted-level-0: #f0f8ff;\n",
       "  --sklearn-color-fitted-level-1: #d4ebff;\n",
       "  --sklearn-color-fitted-level-2: #b3dbfd;\n",
       "  --sklearn-color-fitted-level-3: cornflowerblue;\n",
       "\n",
       "  /* Specific color for light theme */\n",
       "  --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
       "  --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-icon: #696969;\n",
       "\n",
       "  @media (prefers-color-scheme: dark) {\n",
       "    /* Redefinition of color scheme for dark theme */\n",
       "    --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
       "    --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-icon: #878787;\n",
       "  }\n",
       "}\n",
       "\n",
       "#sk-container-id-2 {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 pre {\n",
       "  padding: 0;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 input.sk-hidden--visually {\n",
       "  border: 0;\n",
       "  clip: rect(1px 1px 1px 1px);\n",
       "  clip: rect(1px, 1px, 1px, 1px);\n",
       "  height: 1px;\n",
       "  margin: -1px;\n",
       "  overflow: hidden;\n",
       "  padding: 0;\n",
       "  position: absolute;\n",
       "  width: 1px;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-dashed-wrapped {\n",
       "  border: 1px dashed var(--sklearn-color-line);\n",
       "  margin: 0 0.4em 0.5em 0.4em;\n",
       "  box-sizing: border-box;\n",
       "  padding-bottom: 0.4em;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-container {\n",
       "  /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
       "     but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
       "     so we also need the `!important` here to be able to override the\n",
       "     default hidden behavior on the sphinx rendered scikit-learn.org.\n",
       "     See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
       "  display: inline-block !important;\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-text-repr-fallback {\n",
       "  display: none;\n",
       "}\n",
       "\n",
       "div.sk-parallel-item,\n",
       "div.sk-serial,\n",
       "div.sk-item {\n",
       "  /* draw centered vertical line to link estimators */\n",
       "  background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
       "  background-size: 2px 100%;\n",
       "  background-repeat: no-repeat;\n",
       "  background-position: center center;\n",
       "}\n",
       "\n",
       "/* Parallel-specific style estimator block */\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel-item::after {\n",
       "  content: \"\";\n",
       "  width: 100%;\n",
       "  border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
       "  flex-grow: 1;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel {\n",
       "  display: flex;\n",
       "  align-items: stretch;\n",
       "  justify-content: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel-item {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel-item:first-child::after {\n",
       "  align-self: flex-end;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel-item:last-child::after {\n",
       "  align-self: flex-start;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-parallel-item:only-child::after {\n",
       "  width: 0;\n",
       "}\n",
       "\n",
       "/* Serial-specific style estimator block */\n",
       "\n",
       "#sk-container-id-2 div.sk-serial {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "  align-items: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  padding-right: 1em;\n",
       "  padding-left: 1em;\n",
       "}\n",
       "\n",
       "\n",
       "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
       "clickable and can be expanded/collapsed.\n",
       "- Pipeline and ColumnTransformer use this feature and define the default style\n",
       "- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
       "*/\n",
       "\n",
       "/* Pipeline and ColumnTransformer style (default) */\n",
       "\n",
       "#sk-container-id-2 div.sk-toggleable {\n",
       "  /* Default theme specific background. It is overwritten whether we have a\n",
       "  specific estimator or a Pipeline/ColumnTransformer */\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "/* Toggleable label */\n",
       "#sk-container-id-2 label.sk-toggleable__label {\n",
       "  cursor: pointer;\n",
       "  display: block;\n",
       "  width: 100%;\n",
       "  margin-bottom: 0;\n",
       "  padding: 0.5em;\n",
       "  box-sizing: border-box;\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 label.sk-toggleable__label-arrow:before {\n",
       "  /* Arrow on the left of the label */\n",
       "  content: \"▸\";\n",
       "  float: left;\n",
       "  margin-right: 0.25em;\n",
       "  color: var(--sklearn-color-icon);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 label.sk-toggleable__label-arrow:hover:before {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "/* Toggleable content - dropdown */\n",
       "\n",
       "#sk-container-id-2 div.sk-toggleable__content {\n",
       "  max-height: 0;\n",
       "  max-width: 0;\n",
       "  overflow: hidden;\n",
       "  text-align: left;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-toggleable__content.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-toggleable__content pre {\n",
       "  margin: 0.2em;\n",
       "  border-radius: 0.25em;\n",
       "  color: var(--sklearn-color-text);\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-toggleable__content.fitted pre {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
       "  /* Expand drop-down */\n",
       "  max-height: 200px;\n",
       "  max-width: 100%;\n",
       "  overflow: auto;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
       "  content: \"▾\";\n",
       "}\n",
       "\n",
       "/* Pipeline/ColumnTransformer-specific style */\n",
       "\n",
       "#sk-container-id-2 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator-specific style */\n",
       "\n",
       "/* Colorize estimator box */\n",
       "#sk-container-id-2 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-label label.sk-toggleable__label,\n",
       "#sk-container-id-2 div.sk-label label {\n",
       "  /* The background is the default theme color */\n",
       "  color: var(--sklearn-color-text-on-default-background);\n",
       "}\n",
       "\n",
       "/* On hover, darken the color of the background */\n",
       "#sk-container-id-2 div.sk-label:hover label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "/* Label box, darken color on hover, fitted */\n",
       "#sk-container-id-2 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator label */\n",
       "\n",
       "#sk-container-id-2 div.sk-label label {\n",
       "  font-family: monospace;\n",
       "  font-weight: bold;\n",
       "  display: inline-block;\n",
       "  line-height: 1.2em;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-label-container {\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "/* Estimator-specific */\n",
       "#sk-container-id-2 div.sk-estimator {\n",
       "  font-family: monospace;\n",
       "  border: 1px dotted var(--sklearn-color-border-box);\n",
       "  border-radius: 0.25em;\n",
       "  box-sizing: border-box;\n",
       "  margin-bottom: 0.5em;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-estimator.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "/* on hover */\n",
       "#sk-container-id-2 div.sk-estimator:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-2 div.sk-estimator.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
       "\n",
       "/* Common style for \"i\" and \"?\" */\n",
       "\n",
       ".sk-estimator-doc-link,\n",
       "a:link.sk-estimator-doc-link,\n",
       "a:visited.sk-estimator-doc-link {\n",
       "  float: right;\n",
       "  font-size: smaller;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1em;\n",
       "  height: 1em;\n",
       "  width: 1em;\n",
       "  text-decoration: none !important;\n",
       "  margin-left: 1ex;\n",
       "  /* unfitted */\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted,\n",
       "a:link.sk-estimator-doc-link.fitted,\n",
       "a:visited.sk-estimator-doc-link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "/* Span, style for the box shown on hovering the info icon */\n",
       ".sk-estimator-doc-link span {\n",
       "  display: none;\n",
       "  z-index: 9999;\n",
       "  position: relative;\n",
       "  font-weight: normal;\n",
       "  right: .2ex;\n",
       "  padding: .5ex;\n",
       "  margin: .5ex;\n",
       "  width: min-content;\n",
       "  min-width: 20ex;\n",
       "  max-width: 50ex;\n",
       "  color: var(--sklearn-color-text);\n",
       "  box-shadow: 2pt 2pt 4pt #999;\n",
       "  /* unfitted */\n",
       "  background: var(--sklearn-color-unfitted-level-0);\n",
       "  border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted span {\n",
       "  /* fitted */\n",
       "  background: var(--sklearn-color-fitted-level-0);\n",
       "  border: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link:hover span {\n",
       "  display: block;\n",
       "}\n",
       "\n",
       "/* \"?\"-specific style due to the `<a>` HTML tag */\n",
       "\n",
       "#sk-container-id-2 a.estimator_doc_link {\n",
       "  float: right;\n",
       "  font-size: 1rem;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1rem;\n",
       "  height: 1rem;\n",
       "  width: 1rem;\n",
       "  text-decoration: none;\n",
       "  /* unfitted */\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 a.estimator_doc_link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "#sk-container-id-2 a.estimator_doc_link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "#sk-container-id-2 a.estimator_doc_link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "</style><div id=\"sk-container-id-2\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>Pipeline(steps=[(&#x27;kmeans&#x27;,\n",
       "                 KMeans(accountant=BudgetAccountant(spent_budget=[(2.0, 0)]),\n",
       "                        bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                array([  65.,   23.,  250., 7000.])),\n",
       "                        epsilon=2.0))])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-4\" type=\"checkbox\" ><label for=\"sk-estimator-id-4\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;&nbsp;Pipeline<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.4/modules/generated/sklearn.pipeline.Pipeline.html\">?<span>Documentation for Pipeline</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></label><div class=\"sk-toggleable__content fitted\"><pre>Pipeline(steps=[(&#x27;kmeans&#x27;,\n",
       "                 KMeans(accountant=BudgetAccountant(spent_budget=[(2.0, 0)]),\n",
       "                        bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                array([  65.,   23.,  250., 7000.])),\n",
       "                        epsilon=2.0))])</pre></div> </div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-5\" type=\"checkbox\" ><label for=\"sk-estimator-id-5\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">KMeans</label><div class=\"sk-toggleable__content fitted\"><pre>KMeans(accountant=BudgetAccountant(spent_budget=[(2.0, 0)]),\n",
       "       bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "               array([  65.,   23.,  250., 7000.])),\n",
       "       epsilon=2.0)</pre></div> </div></div></div></div></div></div>"
      ],
      "text/plain": [
       "Pipeline(steps=[('kmeans',\n",
       "                 KMeans(accountant=BudgetAccountant(spent_budget=[(2.0, 0)]),\n",
       "                        bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                array([  65.,   23.,  250., 7000.])),\n",
       "                        epsilon=2.0))])"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Dry run on the dummy dataset; no target columns are needed for clustering\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    dummy = True\n",
    ")\n",
    "model = dummy_response.result.model\n",
    "model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "id": "a0612b10-de7a-4355-8268-09a46c30056a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>bill_length_mm</th>\n",
       "      <th>bill_depth_mm</th>\n",
       "      <th>flipper_length_mm</th>\n",
       "      <th>body_mass_g</th>\n",
       "      <th>predictions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30.0</td>\n",
       "      <td>13.0</td>\n",
       "      <td>150.0</td>\n",
       "      <td>2000.0</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>23.0</td>\n",
       "      <td>250.0</td>\n",
       "      <td>7000.0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  predictions\n",
       "0            30.0           13.0              150.0       2000.0            3\n",
       "1            65.0           23.0              250.0       7000.0            2"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Cluster assignments of the dummy-trained model at the lower and upper feature bounds\n",
    "x_to_predict = pd.DataFrame({\n",
    "    'bill_length_mm': [bounds[0][0], bounds[1][0]],\n",
    "    'bill_depth_mm': [bounds[0][1], bounds[1][1]],\n",
    "    'flipper_length_mm': [bounds[0][2], bounds[1][2]],\n",
    "    'body_mass_g': [bounds[0][3], bounds[1][3]]\n",
    "})\n",
    "x_to_predict[\"predictions\"] = model.predict(x_to_predict)\n",
    "x_to_predict"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6021b034-b15d-4826-8de3-80b99afc838d",
   "metadata": {},
   "source": [
    "### Dimensionality Reduction: PCA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "fd8347be-5951-419d-820c-26655db5ea8c",
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_columns = ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']\n",
    "bounds = get_bounds(penguin_metadata['columns'], columns=feature_columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "f9a4cbaf-b994-41c0-a0e9-81674c6657fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "dpl_pipeline = Pipeline([\n",
    "    (\n",
    "        'pca',\n",
    "        models.PCA(\n",
    "            n_components=None,\n",
    "            epsilon=1.0,\n",
    "            bounds=bounds,\n",
    "            data_norm=100,\n",
    "            centered=False\n",
    "        )\n",
    "    ),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "id": "4681cc17-6285-4b9b-a26d-837d6a79e1c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Try the pipeline on the dummy dataset first\n",
    "dummy_response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    "    dummy = True\n",
    ")\n",
    "model = dummy_response.result.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "e2d6912e-f97c-4ac6-9085-c7379136628c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run the same pipeline on the real data (no dummy flag)\n",
    "response = client.diffprivlib.query(\n",
    "    pipeline = dpl_pipeline,\n",
    "    feature_columns = feature_columns,\n",
    ")\n",
    "model = response.result.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "e2c0cafa-49f6-48de-aa1a-4e84bbbf97cf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-3 {\n",
       "  /* Definition of color scheme common for light and dark mode */\n",
       "  --sklearn-color-text: black;\n",
       "  --sklearn-color-line: gray;\n",
       "  /* Definition of color scheme for unfitted estimators */\n",
       "  --sklearn-color-unfitted-level-0: #fff5e6;\n",
       "  --sklearn-color-unfitted-level-1: #f6e4d2;\n",
       "  --sklearn-color-unfitted-level-2: #ffe0b3;\n",
       "  --sklearn-color-unfitted-level-3: chocolate;\n",
       "  /* Definition of color scheme for fitted estimators */\n",
       "  --sklearn-color-fitted-level-0: #f0f8ff;\n",
       "  --sklearn-color-fitted-level-1: #d4ebff;\n",
       "  --sklearn-color-fitted-level-2: #b3dbfd;\n",
       "  --sklearn-color-fitted-level-3: cornflowerblue;\n",
       "\n",
       "  /* Specific color for light theme */\n",
       "  --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
       "  --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-icon: #696969;\n",
       "\n",
       "  @media (prefers-color-scheme: dark) {\n",
       "    /* Redefinition of color scheme for dark theme */\n",
       "    --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
       "    --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-icon: #878787;\n",
       "  }\n",
       "}\n",
       "\n",
       "#sk-container-id-3 {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 pre {\n",
       "  padding: 0;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 input.sk-hidden--visually {\n",
       "  border: 0;\n",
       "  clip: rect(1px 1px 1px 1px);\n",
       "  clip: rect(1px, 1px, 1px, 1px);\n",
       "  height: 1px;\n",
       "  margin: -1px;\n",
       "  overflow: hidden;\n",
       "  padding: 0;\n",
       "  position: absolute;\n",
       "  width: 1px;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-dashed-wrapped {\n",
       "  border: 1px dashed var(--sklearn-color-line);\n",
       "  margin: 0 0.4em 0.5em 0.4em;\n",
       "  box-sizing: border-box;\n",
       "  padding-bottom: 0.4em;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-container {\n",
       "  /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
       "     but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
       "     so we also need the `!important` here to be able to override the\n",
       "     default hidden behavior on the sphinx rendered scikit-learn.org.\n",
       "     See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
       "  display: inline-block !important;\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-text-repr-fallback {\n",
       "  display: none;\n",
       "}\n",
       "\n",
       "div.sk-parallel-item,\n",
       "div.sk-serial,\n",
       "div.sk-item {\n",
       "  /* draw centered vertical line to link estimators */\n",
       "  background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
       "  background-size: 2px 100%;\n",
       "  background-repeat: no-repeat;\n",
       "  background-position: center center;\n",
       "}\n",
       "\n",
       "/* Parallel-specific style estimator block */\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel-item::after {\n",
       "  content: \"\";\n",
       "  width: 100%;\n",
       "  border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
       "  flex-grow: 1;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel {\n",
       "  display: flex;\n",
       "  align-items: stretch;\n",
       "  justify-content: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel-item {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel-item:first-child::after {\n",
       "  align-self: flex-end;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel-item:last-child::after {\n",
       "  align-self: flex-start;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-parallel-item:only-child::after {\n",
       "  width: 0;\n",
       "}\n",
       "\n",
       "/* Serial-specific style estimator block */\n",
       "\n",
       "#sk-container-id-3 div.sk-serial {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "  align-items: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  padding-right: 1em;\n",
       "  padding-left: 1em;\n",
       "}\n",
       "\n",
       "\n",
       "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
       "clickable and can be expanded/collapsed.\n",
       "- Pipeline and ColumnTransformer use this feature and define the default style\n",
       "- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
       "*/\n",
       "\n",
       "/* Pipeline and ColumnTransformer style (default) */\n",
       "\n",
       "#sk-container-id-3 div.sk-toggleable {\n",
       "  /* Default theme specific background. It is overwritten whether we have a\n",
       "  specific estimator or a Pipeline/ColumnTransformer */\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "/* Toggleable label */\n",
       "#sk-container-id-3 label.sk-toggleable__label {\n",
       "  cursor: pointer;\n",
       "  display: block;\n",
       "  width: 100%;\n",
       "  margin-bottom: 0;\n",
       "  padding: 0.5em;\n",
       "  box-sizing: border-box;\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 label.sk-toggleable__label-arrow:before {\n",
       "  /* Arrow on the left of the label */\n",
       "  content: \"▸\";\n",
       "  float: left;\n",
       "  margin-right: 0.25em;\n",
       "  color: var(--sklearn-color-icon);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 label.sk-toggleable__label-arrow:hover:before {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "/* Toggleable content - dropdown */\n",
       "\n",
       "#sk-container-id-3 div.sk-toggleable__content {\n",
       "  max-height: 0;\n",
       "  max-width: 0;\n",
       "  overflow: hidden;\n",
       "  text-align: left;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-toggleable__content.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-toggleable__content pre {\n",
       "  margin: 0.2em;\n",
       "  border-radius: 0.25em;\n",
       "  color: var(--sklearn-color-text);\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-toggleable__content.fitted pre {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
       "  /* Expand drop-down */\n",
       "  max-height: 200px;\n",
       "  max-width: 100%;\n",
       "  overflow: auto;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
       "  content: \"▾\";\n",
       "}\n",
       "\n",
       "/* Pipeline/ColumnTransformer-specific style */\n",
       "\n",
       "#sk-container-id-3 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator-specific style */\n",
       "\n",
       "/* Colorize estimator box */\n",
       "#sk-container-id-3 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-label label.sk-toggleable__label,\n",
       "#sk-container-id-3 div.sk-label label {\n",
       "  /* The background is the default theme color */\n",
       "  color: var(--sklearn-color-text-on-default-background);\n",
       "}\n",
       "\n",
       "/* On hover, darken the color of the background */\n",
       "#sk-container-id-3 div.sk-label:hover label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "/* Label box, darken color on hover, fitted */\n",
       "#sk-container-id-3 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator label */\n",
       "\n",
       "#sk-container-id-3 div.sk-label label {\n",
       "  font-family: monospace;\n",
       "  font-weight: bold;\n",
       "  display: inline-block;\n",
       "  line-height: 1.2em;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-label-container {\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "/* Estimator-specific */\n",
       "#sk-container-id-3 div.sk-estimator {\n",
       "  font-family: monospace;\n",
       "  border: 1px dotted var(--sklearn-color-border-box);\n",
       "  border-radius: 0.25em;\n",
       "  box-sizing: border-box;\n",
       "  margin-bottom: 0.5em;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-estimator.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "/* on hover */\n",
       "#sk-container-id-3 div.sk-estimator:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-3 div.sk-estimator.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
       "\n",
       "/* Common style for \"i\" and \"?\" */\n",
       "\n",
       ".sk-estimator-doc-link,\n",
       "a:link.sk-estimator-doc-link,\n",
       "a:visited.sk-estimator-doc-link {\n",
       "  float: right;\n",
       "  font-size: smaller;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1em;\n",
       "  height: 1em;\n",
       "  width: 1em;\n",
       "  text-decoration: none !important;\n",
       "  margin-left: 1ex;\n",
       "  /* unfitted */\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted,\n",
       "a:link.sk-estimator-doc-link.fitted,\n",
       "a:visited.sk-estimator-doc-link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "/* Span, style for the box shown on hovering the info icon */\n",
       ".sk-estimator-doc-link span {\n",
       "  display: none;\n",
       "  z-index: 9999;\n",
       "  position: relative;\n",
       "  font-weight: normal;\n",
       "  right: .2ex;\n",
       "  padding: .5ex;\n",
       "  margin: .5ex;\n",
       "  width: min-content;\n",
       "  min-width: 20ex;\n",
       "  max-width: 50ex;\n",
       "  color: var(--sklearn-color-text);\n",
       "  box-shadow: 2pt 2pt 4pt #999;\n",
       "  /* unfitted */\n",
       "  background: var(--sklearn-color-unfitted-level-0);\n",
       "  border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted span {\n",
       "  /* fitted */\n",
       "  background: var(--sklearn-color-fitted-level-0);\n",
       "  border: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link:hover span {\n",
       "  display: block;\n",
       "}\n",
       "\n",
       "/* \"?\"-specific style due to the `<a>` HTML tag */\n",
       "\n",
       "#sk-container-id-3 a.estimator_doc_link {\n",
       "  float: right;\n",
       "  font-size: 1rem;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1rem;\n",
       "  height: 1rem;\n",
       "  width: 1rem;\n",
       "  text-decoration: none;\n",
       "  /* unfitted */\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 a.estimator_doc_link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "#sk-container-id-3 a.estimator_doc_link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "#sk-container-id-3 a.estimator_doc_link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "</style><div id=\"sk-container-id-3\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>PCA(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "    bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "            array([  65.,   23.,  250., 7000.])),\n",
       "    data_norm=100)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-6\" type=\"checkbox\" checked><label for=\"sk-estimator-id-6\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;PCA<span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></label><div class=\"sk-toggleable__content fitted\"><pre>PCA(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "    bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "            array([  65.,   23.,  250., 7000.])),\n",
       "    data_norm=100)</pre></div> </div></div></div></div>"
      ],
      "text/plain": [
       "PCA(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "    bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "            array([  65.,   23.,  250., 7000.])),\n",
       "    data_norm=100)"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model = model.steps[0][1]\n",
    "pca_model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "b1ac9784-bd0e-41c6-bb54-8201d9ab13ad",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.08124548, -0.11603131, -0.06907006,  0.98750455],\n",
       "       [-0.37988112,  0.74377432,  0.54189149,  0.09404104],\n",
       "       [-0.11526053, -0.62023805,  0.77538936, -0.0281267 ],\n",
       "       [ 0.91422345,  0.22054765,  0.31678744,  0.12328802]])"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.components_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "id": "d7cf2790-1938-48b1-83b3-3b17b06953fc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([8914.48046055, 1029.95283494,  241.10575291,   94.79455338])"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.explained_variance_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "91992588-2eb5-4b6b-9f30-ccaa272c593a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.86713922, 0.10018671, 0.02345311, 0.00922096])"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.explained_variance_ratio_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "id": "9e76a632-114b-428f-ad78-d83fdcd00d56",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1536.98969484,  252.77069553,  158.4946581 ,  522.43420759])"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.singular_values_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "id": "fc13910b-b892-4a46-bff9-c3df4cc6ed72",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  43.71211178,   16.5604082 ,  189.09819947, 4237.9468197 ])"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.mean_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "id": "bd11a051-1452-4b59-8c18-622737986125",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.n_components_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "id": "282d55e9-0968-4528-b2d4-4d87d64d4c1a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pca_model.noise_variance_"
   ]
  },
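  {
   "cell_type": "markdown",
   "id": "a7c3e1d2-5b4f-4c8a-9e6d-2f1b0c3d4e5f",
   "metadata": {},
   "source": [
    "The fitted attributes above are already differentially private outputs, so they can be post-processed locally without spending any further budget. As a minimal sketch, the snippet below hard-codes the `explained_variance_ratio_` values printed above and counts how many components are needed to reach a coverage threshold. The 95% threshold and the variable names are illustrative assumptions, not part of the Lomas or DiffPrivLib API.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Values copied from the `explained_variance_ratio_` output above.\n",
    "explained_variance_ratio = np.array([0.86713922, 0.10018671, 0.02345311, 0.00922096])\n",
    "\n",
    "# Running total of the variance explained by the first k components.\n",
    "cumulative = np.cumsum(explained_variance_ratio)\n",
    "\n",
    "# Hypothetical target: keep enough components to cover 95% of the variance.\n",
    "threshold = 0.95\n",
    "n_needed = int(np.searchsorted(cumulative, threshold) + 1)\n",
    "\n",
    "print(cumulative)\n",
    "print(n_needed)\n",
    "```\n",
    "\n",
    "With these values the first two components cover roughly 96.7% of the variance, which suggests that a later query could use `n_components=2`."
   ]
  },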
  {
   "cell_type": "markdown",
   "id": "94eaf59b-c108-424c-8978-b1c86e141ccb",
   "metadata": {},
   "source": [
    "## Step 6: Review the query archive"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64003c53-de56-4bdc-a3c2-0c3e40031919",
   "metadata": {},
   "source": [
    "She now wants to review every query she ran on the real data. This is possible because an archive of all queries is kept in a secure database. With a single function call she can retrieve her past queries, the budget each one consumed, and the associated responses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "id": "008fd230-cdfd-4e03-91ce-5a60b06c106d",
   "metadata": {},
   "outputs": [],
   "source": [
    "previous_queries = client.get_previous_queries()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "id": "1795a54b-d04e-4687-8649-93982c84ad30",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'user_name': 'Dr. Antartica',\n",
       " 'dataset_name': 'PENGUIN',\n",
       " 'dp_librairy': 'diffprivlib',\n",
       " 'client_input': {'dataset_name': 'PENGUIN',\n",
       "  'diffprivlib_json': '{\"module\": \"diffprivlib\", \"version\": \"0.6.4\", \"pipeline\": [{\"type\": \"_dpl_type:StandardScaler\", \"name\": \"scaler\", \"params\": {\"with_mean\": true, \"with_std\": true, \"copy\": true, \"epsilon\": 0.5, \"bounds\": {\"_tuple\": true, \"_items\": [[30.0, 13.0, 150.0, 2000.0], [65.0, 23.0, 250.0, 7000.0]]}, \"random_state\": null, \"accountant\": \"_dpl_instance:BudgetAccountant\"}}, {\"type\": \"_dpl_type:LogisticRegression\", \"name\": \"classifier\", \"params\": {\"tol\": 0.0001, \"C\": 1.0, \"fit_intercept\": true, \"random_state\": null, \"max_iter\": 100, \"verbose\": 0, \"warm_start\": false, \"n_jobs\": null, \"epsilon\": 1.0, \"data_norm\": 83.69469642643347, \"accountant\": \"_dpl_instance:BudgetAccountant\"}}]}',\n",
       "  'feature_columns': ['bill_length_mm',\n",
       "   'bill_depth_mm',\n",
       "   'flipper_length_mm',\n",
       "   'body_mass_g'],\n",
       "  'target_columns': ['species'],\n",
       "  'test_size': 0.1,\n",
       "  'test_train_split_seed': 4,\n",
       "  'imputer_strategy': 'mean'},\n",
       " 'response': {'epsilon': 1.5,\n",
       "  'delta': 0.0,\n",
       "  'requested_by': 'Dr. Antartica',\n",
       "  'result': {'res_type': 'diffprivlib',\n",
       "   'score': 0.6,\n",
       "   'model': Pipeline(steps=[('scaler',\n",
       "                    StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "                                   bounds=(array([  30.,   13.,  150., 2000.]),\n",
       "                                           array([  65.,   23.,  250., 7000.])),\n",
       "                                   epsilon=0.5)),\n",
       "                   ('classifier',\n",
       "                    LogisticRegression(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                                       data_norm=83.69469642643347))])}},\n",
       " 'timestamp': 1728464751.4143507}"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_1 = previous_queries[0]\n",
    "query_1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "id": "ef251e47-67d8-426b-9655-c16d32778579",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'user_name': 'Dr. Antartica',\n",
       " 'dataset_name': 'PENGUIN',\n",
       " 'dp_librairy': 'diffprivlib',\n",
       " 'client_input': {'dataset_name': 'PENGUIN',\n",
       "  'diffprivlib_json': '{\"module\": \"diffprivlib\", \"version\": \"0.6.4\", \"pipeline\": [{\"type\": \"_dpl_type:StandardScaler\", \"name\": \"scaler\", \"params\": {\"with_mean\": true, \"with_std\": true, \"copy\": true, \"epsilon\": 0.5, \"bounds\": {\"_tuple\": true, \"_items\": [[30.0, 13.0, 150.0], [65.0, 23.0, 250.0]]}, \"random_state\": null, \"accountant\": \"_dpl_instance:BudgetAccountant\"}}, {\"type\": \"_dpl_type:GaussianNB\", \"name\": \"gaussian\", \"params\": {\"priors\": {\"_tuple\": true, \"_items\": [0.3, 0.3, 0.4]}, \"var_smoothing\": 1e-09, \"epsilon\": 1.0, \"bounds\": {\"_tuple\": true, \"_items\": [[30.0, 13.0, 150.0], [65.0, 23.0, 250.0]]}, \"random_state\": null, \"accountant\": \"_dpl_instance:BudgetAccountant\"}}]}',\n",
       "  'feature_columns': ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm'],\n",
       "  'target_columns': ['species'],\n",
       "  'test_size': 0.15,\n",
       "  'test_train_split_seed': 1,\n",
       "  'imputer_strategy': 'median'},\n",
       " 'response': {'epsilon': 1.5,\n",
       "  'delta': 0.0,\n",
       "  'requested_by': 'Dr. Antartica',\n",
       "  'result': {'res_type': 'diffprivlib',\n",
       "   'score': 0.17307692307692307,\n",
       "   'model': Pipeline(steps=[('scaler',\n",
       "                    StandardScaler(accountant=BudgetAccountant(spent_budget=[(0.5, 0)]),\n",
       "                                   bounds=(array([ 30.,  13., 150.]),\n",
       "                                           array([ 65.,  23., 250.])),\n",
       "                                   epsilon=0.5)),\n",
       "                   ('gaussian',\n",
       "                    GaussianNB(accountant=BudgetAccountant(spent_budget=[(1.0, 0)]),\n",
       "                               bounds=(array([ 30.,  13., 150.]),\n",
       "                                       array([ 65.,  23., 250.])),\n",
       "                               priors=(0.3, 0.3, 0.4)))])}},\n",
       " 'timestamp': 1728464776.4250495}"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_2 = previous_queries[1]\n",
    "query_2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "id": "b2fa8943-f0e0-4902-9001-bd10fb22a653",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'user_name': 'Dr. Antartica',\n",
       " 'dataset_name': 'PENGUIN',\n",
       " 'dp_librairy': 'diffprivlib',\n",
       " 'client_input': {'dataset_name': 'PENGUIN',\n",
       "  'diffprivlib_json': '{\"module\": \"diffprivlib\", \"version\": \"0.6.4\", \"pipeline\": [{\"type\": \"_dpl_type:RandomForestClassifier\", \"name\": \"rf\", \"params\": {\"n_estimators\": 10, \"n_jobs\": 1, \"random_state\": null, \"verbose\": 0, \"warm_start\": false, \"max_depth\": 5, \"epsilon\": 2.0, \"bounds\": {\"_tuple\": true, \"_items\": [[30.0, 13.0, 2000.0], [65.0, 23.0, 7000.0]]}, \"classes\": [\"Torgersen\", \"Biscoe\", \"Dream\"], \"shuffle\": false, \"accountant\": \"_dpl_instance:BudgetAccountant\"}}]}',\n",
       "  'feature_columns': ['bill_length_mm', 'bill_depth_mm', 'body_mass_g'],\n",
       "  'target_columns': ['island'],\n",
       "  'test_size': 0.2,\n",
       "  'test_train_split_seed': 1,\n",
       "  'imputer_strategy': 'drop'},\n",
       " 'response': {'epsilon': 2.0,\n",
       "  'delta': 0.0,\n",
       "  'requested_by': 'Dr. Antartica',\n",
       "  'result': {'res_type': 'diffprivlib',\n",
       "   'score': 0.417910447761194,\n",
       "   'model': Pipeline(steps=[('rf',\n",
       "                    RandomForestClassifier(accountant=BudgetAccountant(spent_budget=[(2.0, 0)]),\n",
       "                                           bounds=(array([  30.,   13., 2000.]),\n",
       "                                                   array([  65.,   23., 7000.])),\n",
       "                                           classes=['Torgersen', 'Biscoe',\n",
       "                                                    'Dream'],\n",
       "                                           epsilon=2.0))])}},\n",
       " 'timestamp': 1728464804.2231035}"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_3 = previous_queries[2]\n",
    "query_3"
   ]
  },
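  {
   "cell_type": "markdown",
   "id": "c4d9f0b1-3e2a-4f7b-8a1c-6d5e4f3a2b10",
   "metadata": {},
   "source": [
    "The archive entries are plain dictionaries, so they can be summarised with ordinary Python. As a rough sketch, the snippet below uses trimmed-down copies of the three entries shown above (only the budget fields of `response` are kept) and adds up the `epsilon` and `delta` values they report. Summing the values is an assumption about how the budget composes; the server's own accounting remains the reference.\n",
    "\n",
    "```python\n",
    "# Trimmed-down copies of the three archive entries shown above.\n",
    "# In the notebook, the list returned by client.get_previous_queries()\n",
    "# has the same nested structure.\n",
    "previous_queries = [\n",
    "    {'response': {'epsilon': 1.5, 'delta': 0.0}},\n",
    "    {'response': {'epsilon': 1.5, 'delta': 0.0}},\n",
    "    {'response': {'epsilon': 2.0, 'delta': 0.0}},\n",
    "]\n",
    "\n",
    "# Add up the budget reported by each archived response.\n",
    "total_epsilon = sum(q['response']['epsilon'] for q in previous_queries)\n",
    "total_delta = sum(q['response']['delta'] for q in previous_queries)\n",
    "\n",
    "print(total_epsilon, total_delta)\n",
    "```\n",
    "\n",
    "The same two comprehensions can be applied directly to the real `previous_queries` list to check the totals against the remaining budget."
   ]
  },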
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2c8d40d-94b3-4d69-af99-0ec2936f233e",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}