lomas_server.private_dataset package

Submodules

lomas_server.private_dataset.in_memory_dataset module

class lomas_server.private_dataset.in_memory_dataset.InMemoryDataset(metadata: Dict[str, int | bool | Dict[str, str | int]], dataset_df: DataFrame)

Bases: PrivateDataset

PrivateDataset for a dataset created from an in-memory pandas DataFrame.

get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Returns:

A copy of the dataset as a pandas DataFrame.

Return type:

pd.DataFrame
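
The copy semantics noted above can be illustrated with a minimal sketch. InMemoryDatasetSketch is a hypothetical stand-in written for this example, not the lomas_server class, and the metadata key is a placeholder; the point is only that mutating the returned DataFrame leaves the stored one untouched.

```python
import pandas as pd


class InMemoryDatasetSketch:
    """Hypothetical stand-in for InMemoryDataset (assumption, not library code)."""

    def __init__(self, metadata: dict, dataset_df: pd.DataFrame) -> None:
        self.metadata = metadata
        # Keep a private copy so later changes to the caller's frame do not leak in.
        self.df = dataset_df.copy()

    def get_pandas_df(self) -> pd.DataFrame:
        # Return a copy so callers cannot mutate the stored data.
        return self.df.copy()


source = pd.DataFrame({"age": [31, 45, 27]})
dataset = InMemoryDatasetSketch({"placeholder_key": 1}, source)

returned = dataset.get_pandas_df()
returned.loc[0, "age"] = 99  # modifies only the returned copy

stored_value = dataset.get_pandas_df().loc[0, "age"]
```

After the assignment, stored_value still holds the original first entry, because each call hands out an independent copy.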

lomas_server.private_dataset.path_dataset module

class lomas_server.private_dataset.path_dataset.PathDataset(metadata: Dict[str, int | bool | Dict[str, str | int]], dataset_path: str)

Bases: PrivateDataset

PrivateDataset for a dataset located at a constant path.

The path can be local or remote (HTTP).

get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the file format is not supported.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame
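
A minimal sketch of loading by file format, under stated assumptions: read_dataset_sketch and InternalServerExceptionSketch are hypothetical names, and supporting only CSV is an assumption made for the example (the formats the real class accepts are not listed here). pandas.read_csv accepts both local paths and HTTP URLs, which matches the "local or remote" note above.

```python
import os
import tempfile

import pandas as pd


class InternalServerExceptionSketch(Exception):
    """Hypothetical stand-in for the InternalServerException named above."""


def read_dataset_sketch(dataset_path: str) -> pd.DataFrame:
    """Load a dataset from a local path or URL, dispatching on the extension."""
    if dataset_path.endswith(".csv"):
        # read_csv handles local files and http(s) URLs alike.
        return pd.read_csv(dataset_path)
    # Assumption: any other extension is treated as unsupported.
    raise InternalServerExceptionSketch(
        f"File format not supported: {dataset_path}"
    )


# Usage: write a small CSV to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "example.csv")
    pd.DataFrame({"a": [1, 2]}).to_csv(csv_path, index=False)
    loaded = read_dataset_sketch(csv_path)
```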

lomas_server.private_dataset.private_dataset module

class lomas_server.private_dataset.private_dataset.PrivateDataset(metadata: dict)

Bases: ABC

Abstract base class providing access to a sensitive dataset.

df: DataFrame | None = None

get_memory_usage() → int

Returns the memory usage of this dataset, in MiB.

The number returned only takes into account the memory usage of the pandas DataFrame “cached” in the instance.

Returns:

The memory usage, in MiB.

Return type:

int
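
One way such a figure could be computed from a cached DataFrame is sketched below. This is an assumption about the approach, not the library's code: memory_usage_mib is a hypothetical helper, and both the use of deep=True and rounding down to a whole MiB are choices made for the example.

```python
import numpy as np
import pandas as pd


def memory_usage_mib(df: pd.DataFrame) -> int:
    """Return the DataFrame's memory footprint in whole MiB (rounded down)."""
    # DataFrame.memory_usage returns bytes per column (plus the index);
    # deep=True also counts the contents of object columns.
    total_bytes = int(df.memory_usage(index=True, deep=True).sum())
    return total_bytes // (1024 * 1024)


# One million int64 values occupy 8,000,000 bytes, plus a small index.
frame = pd.DataFrame({"x": np.zeros(1_000_000, dtype="int64")})
usage = memory_usage_mib(frame)
```

Here usage is 7, since 8,000,000 bytes is about 7.63 MiB and the sketch rounds down.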

get_metadata() → dict

Get the metadata for this dataset.

Returns:

The metadata dictionary.

Return type:

dict

abstract get_pandas_df(dataset_name: str) → DataFrame

Get the data as a pandas DataFrame.

Parameters:

dataset_name (str) – name of the private dataset

Returns:

The pandas dataframe for this dataset.

Return type:

pd.DataFrame

subscribe_for_memory_usage_updates(dataset_observer: PrivateDatasetObserver) → None

Add the PrivateDatasetObserver to the list of dataset_observers.

Parameters:

dataset_observer (PrivateDatasetObserver) – The observer of this dataset.
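
The subscription described above follows the observer pattern, sketched here in isolation. Everything except the subscribe_for_memory_usage_updates name and the dataset_observers list is hypothetical: the observer's update_memory_usage method, the notify_observers helper, and both sketch classes are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class ObserverSketch(ABC):
    """Hypothetical stand-in for PrivateDatasetObserver."""

    @abstractmethod
    def update_memory_usage(self) -> None:
        """Called when the observed dataset's memory usage changes."""


class DatasetSketch:
    """Hypothetical stand-in for the subscription side of PrivateDataset."""

    def __init__(self) -> None:
        self.dataset_observers: List[ObserverSketch] = []

    def subscribe_for_memory_usage_updates(
        self, dataset_observer: ObserverSketch
    ) -> None:
        # Register the observer so it receives later notifications.
        self.dataset_observers.append(dataset_observer)

    def notify_observers(self) -> None:
        # Assumed helper: tell every registered observer about a change.
        for observer in self.dataset_observers:
            observer.update_memory_usage()


class CountingObserver(ObserverSketch):
    """Observer that simply counts how many updates it has received."""

    def __init__(self) -> None:
        self.calls = 0

    def update_memory_usage(self) -> None:
        self.calls += 1


dataset = DatasetSketch()
observer = CountingObserver()
dataset.subscribe_for_memory_usage_updates(observer)
dataset.notify_observers()
dataset.notify_observers()
```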

lomas_server.private_dataset.s3_dataset module

class lomas_server.private_dataset.s3_dataset.S3Dataset(metadata: dict, s3_parameters: dict)

Bases: PrivateDataset

PrivateDataset for a dataset in S3 storage.

get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the dataset cannot be read.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame

lomas_server.private_dataset.utils module

lomas_server.private_dataset.utils.private_dataset_factory(dataset_name: str, admin_database: AdminDatabase) → PrivateDataset

Return the PrivateDataset instance appropriate for the dataset's storage location.

Parameters:
  • dataset_name (str) – The dataset name.

  • admin_database (AdminDatabase) – An initialized instance of AdminDatabase.

Raises:

InternalServerException – If the dataset type does not exist.

Returns:

The PrivateDataset instance for this dataset.

Return type:

PrivateDataset
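
The dispatch a factory like this performs can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the record layout, the "path" and "s3" type labels, the get_record method, and every class whose name ends in Sketch are hypothetical.

```python
from typing import Dict


class InternalServerExceptionSketch(Exception):
    """Hypothetical stand-in for the InternalServerException named above."""


class PathDatasetSketch:
    def __init__(self, metadata: dict, dataset_path: str) -> None:
        self.metadata = metadata
        self.dataset_path = dataset_path


class S3DatasetSketch:
    def __init__(self, metadata: dict, s3_parameters: dict) -> None:
        self.metadata = metadata
        self.s3_parameters = s3_parameters


class AdminDatabaseSketch:
    """Hypothetical admin database: maps a dataset name to a storage record."""

    def __init__(self, records: Dict[str, dict]) -> None:
        self._records = records

    def get_record(self, dataset_name: str) -> dict:
        return self._records[dataset_name]


def private_dataset_factory_sketch(
    dataset_name: str, admin_database: AdminDatabaseSketch
):
    """Build the dataset object matching the record's storage type."""
    record = admin_database.get_record(dataset_name)
    kind = record["kind"]  # assumed field holding the storage type
    if kind == "path":
        return PathDatasetSketch(record["metadata"], record["dataset_path"])
    if kind == "s3":
        return S3DatasetSketch(record["metadata"], record["s3_parameters"])
    raise InternalServerExceptionSketch(f"Unknown dataset type: {kind}")


admin_db = AdminDatabaseSketch(
    {
        "local_data": {"kind": "path", "metadata": {}, "dataset_path": "data.csv"},
        "bucket_data": {"kind": "s3", "metadata": {}, "s3_parameters": {}},
        "odd_data": {"kind": "tape", "metadata": {}},
    }
)
local_dataset = private_dataset_factory_sketch("local_data", admin_db)
bucket_dataset = private_dataset_factory_sketch("bucket_data", admin_db)
```

Requesting "odd_data" in this sketch raises the stand-in exception, mirroring the documented behavior for an unknown dataset type.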

Module contents