{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# S3 example" ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Step 1: Install the library\n", "To interact with the secure server on which the data is stored, users first need to install the `lomas-client` library in their local development environment.\n", "\n", "It can be installed via the pip command:" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": {}, "outputs": [], "source": [ "#!pip install lomas-client" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "from lomas_client.client import Client\n", "import numpy as np" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## Step 2: Initialise the client\n", "\n", "Once the library is installed, a Client object must be created. It is responsible for sending requests to the server and processing the responses in the local environment, which enables seamless interaction with the server.\n", "\n", "The client needs a few parameters to be created. Usually, these would be set in the environment by the system administrator (queen Icebergina) and be transparent to lomas users. In this instance, the following code snippet sets a few of these parameters that are specific to this notebook." ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "# The following would usually be set in the environment by a system administrator\n", "# and be transparent to lomas users. We override them here because they are specific to this notebook.\n", "\n", "# Note that all client settings can also be passed as keyword arguments to the Client constructor.\n", "# e.g. 
client = Client(client_id = \"Dr.Antartica\") takes precedence over setting the \"LOMAS_CLIENT_CLIENT_ID\"\n", "# environment variable.\n", "\n", "import os\n", "\n", "USER_NAME = \"Jack\"\n", "os.environ[\"LOMAS_CLIENT_CLIENT_ID\"] = USER_NAME\n", "os.environ[\"LOMAS_CLIENT_CLIENT_SECRET\"] = USER_NAME.lower()\n", "os.environ[\"LOMAS_CLIENT_DATASET_NAME\"] = \"TITANIC\"" ] }, { "cell_type": "code", "execution_count": null, "id": "6", "metadata": {}, "outputs": [], "source": [ "client = Client()" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "## Step 3: Understand the functionalities of the library" ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "### Getting dataset metadata" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'max_ids': 1,\n", " 'rows': 887,\n", " 'row_privacy': True,\n", " 'censor_dims': False,\n", " 'columns': {'Pclass': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'int',\n", " 'precision': 32,\n", " 'lower': 1,\n", " 'upper': 3},\n", " 'Name': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'string'},\n", " 'Sex': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'string',\n", " 'cardinality': 2,\n", " 'categories': ['male', 'female']},\n", " 'Age': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'float',\n", " 'precision': 64,\n", " 'lower': 0.1,\n", " 'upper': 100.0},\n", " 'SibSp': {'private_id': 
False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'int',\n", " 'precision': 32,\n", " 'lower': 0,\n", " 'upper': 10},\n", " 'Parch': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'int',\n", " 'precision': 32,\n", " 'lower': 0,\n", " 'upper': 10},\n", " 'Ticket': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'string'},\n", " 'Fare': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'float',\n", " 'precision': 64,\n", " 'lower': 0.0,\n", " 'upper': 1000.0},\n", " 'Cabin': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'string'},\n", " 'Embarked': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'string',\n", " 'cardinality': 3,\n", " 'categories': ['C', 'Q', 'S']},\n", " 'Survived': {'private_id': False,\n", " 'nullable': False,\n", " 'max_partition_length': None,\n", " 'max_influenced_partitions': None,\n", " 'max_partition_contributions': None,\n", " 'type': 'boolean'}}}" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "titanic_metadata = client.get_dataset_metadata()\n", "titanic_metadata" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "### Get a dummy dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "11", 
"metadata": {}, "outputs": [], "source": [ "NB_ROWS = 200\n", "SEED = 0" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(200, 11)\n" ] }, { "data": { "text/html": [ "
\n",
"|   | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | Survived |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 3 | o | female | 89.690443 | 6 | 6 | 2 | 858.435326 | U | S | True |\n",
"| 1 | 2 | D | male | 58.373673 | 0 | 0 | Z | 620.908898 | a | C | True |\n",
"| 2 | 2 | u | female | 4.117800 | 2 | 4 | h | 193.917948 | G | S | True |\n",
"| 3 | 1 | o | male | 71.177534 | 9 | 7 | a | 687.914521 | Z | Q | True |\n",
"| 4 | 1 | 3 | male | 56.945683 | 4 | 10 | 1 | 758.999002 | W | S | True |\n",
"