dataset

class acore_df.dataset.BaseDataset(name: str, id_col: str, orm_model: Type[T_BASE_ORM_MODEL], orm_table: Table, data_class: Type[T_BASE_DATA_CLASS], engine: Engine)

Base class for a dataset that is backed by a SQL table.

Parameters:
  • name – the dataset name

  • id_col – the primary key column name

  • orm_model – the SQLAlchemy ORM model class

  • orm_table – the SQLAlchemy Table object

  • data_class – the dataclass type used for row instances

  • engine – the SQLAlchemy engine

property df: DataFrame

Read the entire dataset into a polars DataFrame. The result is cached, so later accesses reuse the same DataFrame.
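The caching behavior described above can be illustrated with a minimal sketch. It assumes a `functools.cached_property`-style cache; the class `CachedTableSketch`, its `load_count` counter, and the list standing in for a DataFrame are all hypothetical and are not taken from acore_df.

```python
from functools import cached_property


class CachedTableSketch:
    """Hypothetical stand-in for a dataset; not the acore_df implementation."""

    def __init__(self, rows):
        self._rows = rows
        self.load_count = 0  # counts how many times the "table" is read

    @cached_property
    def df(self):
        # Stand-in for an expensive full-table read into a DataFrame.
        self.load_count += 1
        return list(self._rows)


sketch = CachedTableSketch([{"id": 1, "name": "a"}])
first = sketch.df   # triggers the read
second = sketch.df  # served from the cache, no second read
```

With this kind of cache, both accesses return the same object, so changes made to the table after the first access would not be reflected.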

property row_map: Dict[Union[int, str], T_BASE_DATA_CLASS]

Create a dictionary mapping the primary key to the data class instance.
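As a rough sketch of what such a mapping looks like, the following builds a dictionary keyed by the primary key column from a list of row dictionaries. The `Item` dataclass, the `build_row_map` helper, and the sample rows are hypothetical illustrations, not part of acore_df.

```python
import dataclasses
from typing import Dict, List, Union


@dataclasses.dataclass
class Item:
    """Hypothetical row type used only for this illustration."""
    id: int
    name: str


def build_row_map(rows: List[dict], id_col: str) -> Dict[Union[int, str], Item]:
    # Key each data class instance by the value of its primary key column.
    return {row[id_col]: Item(**row) for row in rows}


rows = [{"id": 1, "name": "sword"}, {"id": 2, "name": "shield"}]
row_map = build_row_map(rows, id_col="id")
```

A lookup such as `row_map.get(3)` returns `None` for a key that is not present, which matches the `Optional` return type of `get`.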

get(id: Union[int, str]) → Optional[T_BASE_DATA_CLASS]

Get a data class instance by id value.

Note

If you are concerned about mutating the returned instance, you can build a copy first:

import dataclasses

dataset = BaseDataset(...)
data = dataset.data_class(**dataclasses.asdict(dataset.get(id=1)))

Parameters:

id – the primary key value.
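For flat data classes, `dataclasses.replace` from the standard library is another way to get an independent copy. The sketch below uses a hypothetical `Item` dataclass, and the `cached` variable merely stands in for the value returned by `dataset.get(id=1)`.

```python
import dataclasses


@dataclasses.dataclass
class Item:
    """Hypothetical row type used only for this illustration."""
    id: int
    name: str


cached = Item(id=1, name="sword")   # stands in for dataset.get(id=1)
copy = dataclasses.replace(cached)  # new instance with the same field values
copy.name = "axe"                   # mutating the copy leaves `cached` untouched
```

Note that `dataclasses.replace` makes a shallow copy, so mutable field values such as lists would still be shared between the two instances.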

get_by_kvs(kvs: Dict[str, Any]) → List[T_BASE_DATA_CLASS]

Get a list of data class instances by key-value pairs.

Parameters:

kvs – key-value pairs; each key is a column name and each value is the value to match.
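The matching can be sketched in plain Python as follows. This assumes that a row is returned only when every pair matches (AND semantics), which the description above does not state explicitly; the `Item` dataclass and the `filter_by_kvs` helper are hypothetical.

```python
import dataclasses
from typing import Any, Dict, List


@dataclasses.dataclass
class Item:
    """Hypothetical row type used only for this illustration."""
    id: int
    name: str
    kind: str


def filter_by_kvs(items: List[Item], kvs: Dict[str, Any]) -> List[Item]:
    # Assumption: an item matches only if ALL key-value pairs match.
    return [
        item
        for item in items
        if all(getattr(item, key) == value for key, value in kvs.items())
    ]


items = [
    Item(id=1, name="sword", kind="weapon"),
    Item(id=2, name="shield", kind="armor"),
    Item(id=3, name="axe", kind="weapon"),
]
weapons = filter_by_kvs(items, {"kind": "weapon"})
exact = filter_by_kvs(items, {"kind": "weapon", "name": "axe"})
```

Under these assumptions, a query with no matching rows returns an empty list rather than `None`, in line with the `List` return type.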

acore_df.dataset.download_sqlite(path_sqlite: Path = PosixPath('/home/docs/acore_df.sqlite'))

Download the sqlite database file from the GitHub release page.