lomas_server.dp_queries.dp_libraries package
Submodules
lomas_server.dp_queries.dp_libraries.diffprivlib module
- class lomas_server.dp_queries.dp_libraries.diffprivlib.DiffPrivLibQuerier(data_connector: DataConnector, admin_database: AdminDatabase)[source]
Bases:
DPQuerier[DiffPrivLibRequestModel,DiffPrivLibQueryModel,DiffPrivLibQueryResult]
Concrete implementation of the DPQuerier ABC for the DiffPrivLib library.
- cost(query_json: DiffPrivLibRequestModel) tuple[float, float][source]
Estimate cost of query.
- Parameters:
query_json (DiffPrivLibRequestModel) – The request model object.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
- Returns:
The tuple of costs: the first value is the epsilon cost, the second value is the delta cost.
- Return type:
tuple[float, float]
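The (epsilon, delta) tuple returned by cost can be compared against a remaining privacy budget before a query is run. The following is a minimal sketch of that check, not the package's implementation: the helpers total_cost and fits_budget are hypothetical names, and the sketch assumes basic sequential composition, under which epsilons and deltas simply add up.

```python
def total_cost(costs):
    """Sum (epsilon, delta) pairs, assuming basic sequential composition."""
    epsilon = sum(eps for eps, _ in costs)
    delta = sum(dlt for _, dlt in costs)
    return epsilon, delta


def fits_budget(cost, remaining):
    """Return True if a (epsilon, delta) cost fits in the remaining budget."""
    epsilon, delta = cost
    remaining_epsilon, remaining_delta = remaining
    return epsilon <= remaining_epsilon and delta <= remaining_delta


# Hypothetical costs for two queries, in the (epsilon, delta) order used by cost().
planned = total_cost([(0.5, 1e-5), (0.25, 0.0)])
can_run = fits_budget(planned, remaining=(1.0, 1e-4))
```

Keeping the tuple order identical to the one documented for cost (epsilon first, delta second) avoids mixing up the two values when the result is passed around.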
- fit_model_on_data(query_json: DiffPrivLibRequestModel) tuple[Pipeline, DataFrame, DataFrame][source]
Perform necessary steps to fit the model on the data.
- Parameters:
query_json (DiffPrivLibRequestModel) – The JSON request object for the query.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
- Returns:
dpl_pipeline (dpl model): the fitted model on the training data
x_test (pd.DataFrame): test data features
y_test (pd.DataFrame): test data target
- Return type:
tuple[Pipeline, DataFrame, DataFrame]
- query(query_json: DiffPrivLibQueryModel) DiffPrivLibQueryResult[source]
Perform the query and return the response.
- Parameters:
query_json (DiffPrivLibQueryModel) – The request model object.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
InvalidQueryException – If the budget values are too small to perform the query.
- Returns:
The dictionary encoding of the resulting pd.DataFrame.
- Return type:
dict
- lomas_server.dp_queries.dp_libraries.diffprivlib.split_train_test_data(df: DataFrame, query_json: DiffPrivLibRequestModel) tuple[DataFrame, DataFrame, DataFrame, DataFrame][source]
Split the data between train and test set.
- Parameters:
df (pd.DataFrame) – dataframe with the data
query_json (DiffPrivLibRequestModel) – user input query indicating:
feature_columns (list[str]): columns from data to use as features
target_columns (list[str]): columns from data to use as target (to predict)
test_size (float): proportion of data in the test set
test_train_split_seed (int): seed for the random train-test split
- Returns:
x_train (pd.DataFrame): training data features
x_test (pd.DataFrame): testing data features
y_train (pd.DataFrame): training data target
y_test (pd.DataFrame): testing data target
- Return type:
tuple[DataFrame, DataFrame, DataFrame, DataFrame]
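To illustrate how the four request fields interact, here is a plain-Python sketch of a seeded train-test split. It is an assumption-laden stand-in, not the package's code: split_rows is a hypothetical helper, and rows are modelled as a list of dicts rather than a pd.DataFrame so that the example runs with the standard library only.

```python
import random


def split_rows(rows, feature_columns, target_columns, test_size, seed):
    """Hypothetical seeded split returning x_train, x_test, y_train, y_test."""
    # Shuffle row indices with a dedicated, seeded generator for reproducibility.
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)

    # The first n_test shuffled rows form the test set, the rest the training set.
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx, columns):
        # Keep only the requested columns for the selected rows.
        return [{col: rows[i][col] for col in columns} for i in idx]

    return (
        pick(train_idx, feature_columns),
        pick(test_idx, feature_columns),
        pick(train_idx, target_columns),
        pick(test_idx, target_columns),
    )


rows = [{"a": i, "b": 2 * i, "y": i % 2} for i in range(10)]
x_train, x_test, y_train, y_test = split_rows(rows, ["a", "b"], ["y"], 0.2, seed=42)
```

With pandas data, scikit-learn's train_test_split returns its outputs in the same x_train, x_test, y_train, y_test order and takes the seed through random_state; whether this function relies on it is not stated here.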
lomas_server.dp_queries.dp_libraries.factory module
lomas_server.dp_queries.dp_libraries.opendp module
lomas_server.dp_queries.dp_libraries.smartnoise_sql module
- class lomas_server.dp_queries.dp_libraries.smartnoise_sql.SmartnoiseSQLQuerier(data_connector: DataConnector, admin_database: AdminDatabase)[source]
Bases:
DPQuerier[SmartnoiseSQLRequestModel,SmartnoiseSQLQueryModel,SmartnoiseSQLQueryResult]
Concrete implementation of the DPQuerier ABC for the SmartNoiseSQL library.
- cost(query_json: SmartnoiseSQLRequestModel) tuple[float, float][source]
Estimate cost of query.
- Parameters:
query_json (SmartnoiseSQLRequestModel) – JSON request object for the query.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
- Returns:
The tuple of costs: the first value is the epsilon cost, the second value is the delta cost.
- Return type:
tuple[float, float]
- query(query_json: SmartnoiseSQLQueryModel) SmartnoiseSQLQueryResult[source]
Performs the query and returns the response.
- Parameters:
query_json (SmartnoiseSQLQueryModel) – The request model object.
- Returns:
The dictionary encoding of the result pd.DataFrame.
- Return type:
dict
- query_with_iter(query_json: SmartnoiseSQLQueryModel, nb_iter: int = 0) SmartnoiseSQLQueryResult[source]
Perform the query and return the response.
- Parameters:
query_json (SmartnoiseSQLQueryModel) – Request object for the query.
nb_iter (int, optional) – Number of trials if the output is NaN. Defaults to 0.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
InvalidQueryException – If the budget values are too small to perform the query.
- Returns:
The dictionary encoding of the resulting pd.DataFrame.
- Return type:
dict
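The idea of repeating a query whose noisy output contains NaN can be sketched as a bounded retry loop. This is only an illustration under stated assumptions: query_with_retries, run_once and max_retries are hypothetical names, the exact semantics of nb_iter are not documented above, and how the budget of repeated runs is accounted for is not stated here.

```python
import math


def query_with_retries(run_once, max_retries=0):
    """Call run_once until its output has no NaN or retries are exhausted.

    run_once is a hypothetical zero-argument callable returning a list of
    floats. Returns the last output and the number of retries performed.
    """
    retries = 0
    while True:
        values = run_once()
        has_nan = any(isinstance(v, float) and math.isnan(v) for v in values)
        # Stop on a clean result, or when no retries are left.
        if not has_nan or retries >= max_retries:
            return values, retries
        retries += 1


# Simulated noisy outputs: two results containing NaN, then a clean one.
outputs = iter([[float("nan")], [float("nan"), 1.0], [3.0]])
values, retries = query_with_retries(lambda: next(outputs), max_retries=5)
```

Bounding the loop matters because a result that is NaN for structural reasons (for example, a budget that is too small) would otherwise be retried forever.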
- lomas_server.dp_queries.dp_libraries.smartnoise_sql.convert_to_smartnoise_metadata(metadata: Metadata) dict[source]
Convert Lomas metadata to smartnoise metadata format (for SQL).
- Parameters:
metadata (Metadata) – Dataset metadata from admin database
- Returns:
metadata of the dataset in smartnoise-sql format
- Return type:
dict
- lomas_server.dp_queries.dp_libraries.smartnoise_sql.set_mechanisms(privacy: Privacy, mechanisms: dict[str, str]) Privacy[source]
Set privacy mechanisms on the Privacy object.
For more information see: https://docs.smartnoise.org/sql/advanced.html#overriding-mechanisms
- Parameters:
privacy (Privacy) – Privacy object.
mechanisms (dict[str, str]) – Mechanisms to set.
- Returns:
The updated Privacy object.
- Return type:
Privacy
lomas_server.dp_queries.dp_libraries.smartnoise_synth module
- class lomas_server.dp_queries.dp_libraries.smartnoise_synth.SmartnoiseSynthQuerier(data_connector: DataConnector, admin_database: AdminDatabase)[source]
Bases:
DPQuerier[SmartnoiseSynthRequestModel,SmartnoiseSynthQueryModel,SmartnoiseSynthSamples|SmartnoiseSynthModel]
Concrete implementation of the DPQuerier ABC for the SmartNoiseSynth library.
- cost(query_json: SmartnoiseSynthRequestModel) tuple[float, float][source]
Return cost of query_json.
- Parameters:
query_json (SmartnoiseSynthRequestModel) – JSON request object for the query.
- Returns:
The tuple of costs: the first value is the epsilon cost, the second value is the delta cost.
- Return type:
tuple[float, float]
# TODO: verify and model.rho
- query(query_json: SmartnoiseSynthQueryModel) SmartnoiseSynthSamples | SmartnoiseSynthModel[source]
Perform the query and return the response.
- Parameters:
query_json (SmartnoiseSynthQueryModel) – The request object for the query.
- Raises:
ExternalLibraryException – For exceptions from libraries external to this package.
InvalidQueryException – If the budget values are too small to perform the query.
- Returns:
The resulting pd.DataFrame samples.
- Return type:
pd.DataFrame
lomas_server.dp_queries.dp_libraries.utils module
- lomas_server.dp_queries.dp_libraries.utils.handle_missing_data(df: DataFrame, imputer_strategy: str) DataFrame[source]
Impute missing data based on given imputation strategy for NaNs.
- Parameters:
df (pd.DataFrame) – dataframe with the data
imputer_strategy (str) – string indicating the imputation strategy for NaNs:
“drop”: drops all rows with missing values
“mean”: replaces missing values with the mean of the column values
“median”: replaces missing values with the median of the column values
“most_frequent”: replaces missing values with the most frequent value of the column
- Raises:
InvalidQueryException – If the “imputer_strategy” does not exist
- Returns:
dataframe with the imputed data
- Return type:
pd.DataFrame
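The four imputation strategies can be illustrated with a small standard-library sketch. It is not the package's implementation: impute and InvalidStrategyError are hypothetical stand-ins for handle_missing_data and InvalidQueryException, data is modelled as a dict of column lists instead of a pd.DataFrame, and both None and NaN are treated as missing.

```python
import math
import statistics


class InvalidStrategyError(ValueError):
    """Hypothetical stand-in for InvalidQueryException."""


def _is_missing(value):
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute(columns, strategy):
    """Impute missing values in a dict mapping column name to a list of values."""
    reducers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "most_frequent": statistics.mode,
    }
    if strategy == "drop":
        # Keep only the rows in which every column has a value.
        n_rows = len(next(iter(columns.values()), []))
        keep = [
            i for i in range(n_rows)
            if not any(_is_missing(col[i]) for col in columns.values())
        ]
        return {name: [col[i] for i in keep] for name, col in columns.items()}
    if strategy not in reducers:
        raise InvalidStrategyError(f"Unknown imputer strategy: {strategy!r}")

    result = {}
    for name, col in columns.items():
        # Assumes each column has at least one non-missing value.
        present = [v for v in col if not _is_missing(v)]
        fill = reducers[strategy](present)
        result[name] = [fill if _is_missing(v) else v for v in col]
    return result


data = {"x": [1.0, float("nan"), 3.0], "y": [2.0, 2.0, None]}
filled = impute(data, "mean")
dropped = impute(data, "drop")
```

Note that "drop" changes the number of rows while the other three strategies preserve it, which matters for anything computed from the row count afterwards.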