lomas_server.data_connector package

Submodules

lomas_server.data_connector.data_connector module

class lomas_server.data_connector.data_connector.DataConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None)

Bases: BaseModel, ABC

Abstract base class providing access to sensitive data.

property datetime_columns: list[str]
df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None
property dtypes: dict[str, str]
abstract get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Returns:

The pandas dataframe for this dataset.

Return type:

pd.DataFrame

get_polars_lf() → LazyFrame

Get the data as a polars LazyFrame.

Returns:

The polars lazyframe for this dataset.

Return type:

pl.LazyFrame

metadata: Metadata
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
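The class above combines one abstract loader (get_pandas_df) with concrete accessors. The following is a minimal standard-library sketch of that pattern only; ConnectorSketch, ListConnector, get_rows and get_column are hypothetical stand-ins, and rows are plain dictionaries rather than a pandas DataFrame.

```python
from abc import ABC, abstractmethod


class ConnectorSketch(ABC):
    """Hypothetical stand-in for the pattern; not the lomas_server class."""

    def __init__(self, dtypes: dict[str, str]) -> None:
        self.dtypes = dtypes

    @abstractmethod
    def get_rows(self) -> list[dict]:
        """Each subclass decides where the data comes from."""

    def get_column(self, name: str) -> list:
        # Concrete helper layered on top of the abstract loader.
        return [row[name] for row in self.get_rows()]


class ListConnector(ConnectorSketch):
    """Subclass that serves rows held in a Python list."""

    def __init__(self, rows: list[dict], dtypes: dict[str, str]) -> None:
        super().__init__(dtypes)
        self._rows = rows

    def get_rows(self) -> list[dict]:
        return list(self._rows)
```

Instantiating ConnectorSketch directly raises TypeError because get_rows is abstract, so only subclasses that implement the loader can be created.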

lomas_server.data_connector.data_connector.get_column_dtypes(metadata: Metadata) → tuple[dict[str, str], list[str]]

Extracts and returns the column types from the metadata.

Parameters:

metadata (Metadata) – The dataset metadata.

Returns:

dict: The dictionary of column types.

list: The list of columns of datetime type.

Return type:

Tuple[Dict[str, str], List[str]]
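As a rough illustration of the return shape, the sketch below splits a metadata-like mapping into a type dictionary and a list of datetime columns. The layout of the input (a "columns" mapping whose entries carry a "type" key) and the name column_dtypes_sketch are assumptions made for this example, not the actual Metadata model, and the real function may treat datetime columns differently.

```python
def column_dtypes_sketch(metadata: dict) -> tuple[dict[str, str], list[str]]:
    """Return (column types, datetime column names) from a metadata-like dict.

    Assumes a hypothetical layout: {"columns": {name: {"type": "<type>"}}}.
    """
    dtypes: dict[str, str] = {}
    datetime_columns: list[str] = []
    for name, spec in metadata["columns"].items():
        # Every column is recorded with its declared type.
        dtypes[name] = spec["type"]
        # Datetime columns are additionally listed by name.
        if spec["type"] == "datetime":
            datetime_columns.append(name)
    return dtypes, datetime_columns
```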

lomas_server.data_connector.factory module

lomas_server.data_connector.in_memory_connector module

class lomas_server.data_connector.in_memory_connector.InMemoryConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['InMemoryConnector'] = 'InMemoryConnector')

Bases: DataConnector

DataConnector for a dataset created from an in-memory pandas DataFrame.

get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Returns:

A copy of the dataset as a pandas DataFrame.

Return type:

pd.DataFrame

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['InMemoryConnector']
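Returning a copy means callers cannot alter the stored dataset through the object they receive. The sketch below shows that idea with standard-library types only; InMemorySketch and get_rows are hypothetical names, and the real connector holds a pandas DataFrame rather than a list of dictionaries.

```python
import copy


class InMemorySketch:
    """Hypothetical stand-in that keeps its rows in memory."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def get_rows(self) -> list[dict]:
        # Deep copy so mutations by the caller never reach the stored rows.
        return copy.deepcopy(self._rows)


connector = InMemorySketch([{"age": 31}, {"age": 47}])
rows = connector.get_rows()
rows[0]["age"] = -1  # changes the caller's copy only
```

A later call to connector.get_rows() still returns the original values, because each call hands out a fresh copy.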

lomas_server.data_connector.path_connector module

class lomas_server.data_connector.path_connector.PathConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['PathConnector'] = 'PathConnector', dataset_path: Annotated[Path, PathType(path_type=file)] | HttpUrl)

Bases: DataConnector

DataConnector for a dataset located at a constant path.

The path can be local or remote (HTTP).

dataset_path: Annotated[Path, PathType(path_type=file)] | HttpUrl
get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the file format is not supported.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['PathConnector']
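One plausible way to reject unsupported file formats is to branch on the file extension and raise when no reader matches. The sketch below does this for a local path with the standard library only. Which formats the real connector accepts is not stated here, so handling only CSV is an assumption, and UnsupportedFormatError and load_rows are hypothetical stand-ins (the former for InternalServerException).

```python
import csv
from pathlib import Path


class UnsupportedFormatError(Exception):
    """Hypothetical stand-in for InternalServerException."""


def load_rows(dataset_path: Path) -> list[dict[str, str]]:
    """Load a dataset by file extension; this sketch handles CSV only."""
    suffix = dataset_path.suffix.lower()
    if suffix == ".csv":
        with dataset_path.open(newline="") as handle:
            # Each row becomes a dict keyed by the header line.
            return list(csv.DictReader(handle))
    # The extension is checked before any file access, so an unsupported
    # format fails the same way whether or not the file exists.
    raise UnsupportedFormatError(f"File format {suffix!r} is not supported.")
```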

lomas_server.data_connector.s3_connector module

class lomas_server.data_connector.s3_connector.S3Connector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['S3Connector'] = 'S3Connector', credentials: DSS3Access)

Bases: DataConnector

DataConnector for a dataset in S3 storage.

property bucket: str
credentials: DSS3Access
get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the dataset cannot be read.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame

property key: str
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['S3Connector']
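Each connector class carries a literal type tag, which makes it possible to pick a class from a configuration entry. The sketch below shows one way such tag-based selection could look using only the standard library. It is not the package's factory module; PathSpec, S3Spec, SPECS and build_spec are hypothetical names, and the field sets are simplified.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class PathSpec:
    TAG: ClassVar[str] = "PathConnector"  # class-level tag, not an init field
    dataset_path: str


@dataclass
class S3Spec:
    TAG: ClassVar[str] = "S3Connector"
    bucket: str
    key: str


# Registry mapping each tag to the class it selects.
SPECS = {cls.TAG: cls for cls in (PathSpec, S3Spec)}


def build_spec(config: dict):
    """Pick a class from the "type" tag and pass the other fields to it."""
    fields = dict(config)  # copy so the caller's dict is left untouched
    tag = fields.pop("type")
    try:
        cls = SPECS[tag]
    except KeyError:
        raise ValueError(f"Unknown connector type: {tag!r}") from None
    return cls(**fields)
```

For example, build_spec({"type": "S3Connector", "bucket": "b", "key": "k"}) returns an S3Spec instance, while an unrecognised tag raises ValueError.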

Module contents

class lomas_server.data_connector.DataConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None)

Bases: BaseModel, ABC

Abstract base class providing access to sensitive data.

property datetime_columns: list[str]
df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None
property dtypes: dict[str, str]
abstract get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Returns:

The pandas dataframe for this dataset.

Return type:

pd.DataFrame

get_polars_lf() → LazyFrame

Get the data as a polars LazyFrame.

Returns:

The polars lazyframe for this dataset.

Return type:

pl.LazyFrame

metadata: Metadata
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class lomas_server.data_connector.InMemoryConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['InMemoryConnector'] = 'InMemoryConnector')

Bases: DataConnector

DataConnector for a dataset created from an in-memory pandas DataFrame.

df: Annotated[pd.DataFrame, PlainSerializer(dataframe_to_dict)] | None
get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Returns:

A copy of the dataset as a pandas DataFrame.

Return type:

pd.DataFrame

metadata: Metadata
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['InMemoryConnector']

class lomas_server.data_connector.PathConnector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['PathConnector'] = 'PathConnector', dataset_path: Annotated[Path, PathType(path_type=file)] | HttpUrl)

Bases: DataConnector

DataConnector for a dataset located at a constant path.

The path can be local or remote (HTTP).

dataset_path: Annotated[Path, PathType(path_type=file)] | HttpUrl
df: Annotated[pd.DataFrame, PlainSerializer(dataframe_to_dict)] | None
get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the file format is not supported.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame

metadata: Metadata
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['PathConnector']

class lomas_server.data_connector.S3Connector(*, metadata: Metadata, df: Annotated[DataFrame, PlainSerializer(func=dataframe_to_dict, return_type=PydanticUndefined, when_used=always)] | None = None, type: Literal['S3Connector'] = 'S3Connector', credentials: DSS3Access)

Bases: DataConnector

DataConnector for a dataset in S3 storage.

property bucket: str
credentials: DSS3Access
df: Annotated[pd.DataFrame, PlainSerializer(dataframe_to_dict)] | None
get_pandas_df() → DataFrame

Get the data as a pandas DataFrame.

Raises:

InternalServerException – If the dataset cannot be read.

Returns:

The dataset as a pandas DataFrame.

Return type:

pd.DataFrame

property key: str
metadata: Metadata
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

type: Literal['S3Connector']