Module leapyear.admin

Administrative objects for LeapYear.

Database class

class leapyear.admin.Database(name, *, description=None, privacy_profile=None, privacy_limit_window_days=None, db_id=None)

Database object.

classmethod all()

Get all databases.

Returns

Iterator over all of the databases available on the server.

Return type

all_databases
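
As a rough illustration of consuming this iterator, the sketch below builds a one-line summary per database. It relies only on the `name` and `description` properties documented for Database; the helper name `database_summaries` is hypothetical and not part of the library.

```python
from typing import Iterable, List


def database_summaries(databases: Iterable) -> List[str]:
    """Return one 'name: description' line per database, sorted by name.

    `databases` is any iterable of objects exposing `name` and
    `description`, for example the iterator returned by Database.all().
    """
    lines = []
    for db in sorted(databases, key=lambda d: d.name):
        # `description` is documented as Optional[str], so None is possible.
        description = db.description or "(no description)"
        lines.append(f"{db.name}: {description}")
    return lines
```

With a live connection, the argument would typically be `Database.all()`.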

property id

Get the ID.

Return type

int

property tables

Get the tables of the database.

Return type

Mapping[str, Table]

property views

Get the views of the database.

This property is visible only to admins.

Return type

Mapping[str, View]

property privacy_params

Get the privacy parameters of the database.

Return type

PrivacyProfileParams

property description

Get the database’s description.

Return type

Optional[str]

property privacy_profile

Get the database’s Privacy Profile.

Return type

str

set_privacy_profile(privacy_profile)

Set the database’s privacy profile asynchronously.

Parameters

privacy_profile (PrivacyProfile) – The new privacy profile.

Example

>>> db = c.databases["db1"]
>>> pp = c.privacy_profiles["Custom profile 1"]
>>> db.set_privacy_profile(pp)

Return type

None

get_privacy_limit()

Get the database’s privacy limit.

Return type

PrivacyLimit

set_privacy_limit_window(privacy_limit_window_days)

Set the database’s privacy limit window (provided in days).

load()

Load the database.

create(*, ignore_if_exists=False)

Create the database.

Return type

Database

drop(*, ignore_missing=False)

Drop the database.

get_access(subject=None)

Get the access level of the given subject.

Parameters

subject (Union[User, Group, None]) – A User or Group object. If None is provided, the currently logged-in user is used.

Return type

DatabaseAccessType

set_access(subject, access)

Grant the given access level to a subject.

Parameters
  • subject (Union[User, Group]) – A User or Group object.

  • access (DatabaseAccessType) – The access level to grant.

Return type

None
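
A minimal sketch of how `get_access` and `set_access` might be combined: it grants an access level only to subjects that do not already hold it. The helper name `ensure_database_access` is hypothetical; it assumes only the two methods documented above and that access levels can be compared with `!=`, as enum members can.

```python
from typing import Iterable, List


def ensure_database_access(db, subjects: Iterable, access) -> List:
    """Grant `access` to each subject that does not already hold it.

    Returns the subjects whose access level was changed.
    """
    changed = []
    for subject in subjects:
        # Skip the write when the subject already has the requested level.
        if db.get_access(subject) != access:
            db.set_access(subject, access)
            changed.append(subject)
    return changed
```

In real use, `access` would be a DatabaseAccessType member such as `DatabaseAccessType.SHOW_DATABASE`.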

property name

Get the name.

Return type

str

Table class

class leapyear.admin.Table(name, *, database, columns=None, credentials=None, description=None, public=None, table_id=None, watch_folder=None, **kwargs)

Table object.

__init__(name, *, database, columns=None, credentials=None, description=None, public=None, table_id=None, watch_folder=None, **kwargs)

Initialize a Table object.

Parameters
  • name – The table name.

  • columns – The columns to create the Table with. If no columns are provided, the schema will be auto-detected from the data.

  • credentials – The credentials to the first data slice to be added to the Table.

  • description – The table’s description.

  • database – The database this table belongs to.

  • public – Whether this table should be a public table.

  • watch_folder – When True, the ‘credentials’ parameter should point to a directory of parquet files that will be watched for automatic data slice uploads. Only applicable for table creation.

property id

Get the ID.

Return type

int

property status

Get the status of the table.

Return type

str

property status_with_error

Get the status of the table and potentially the error information.

Return type

TableStatus

property columns

Get the columns of the table.

Return type

Mapping[str, TableColumn]

property description

Get the table’s description.

property database

Get the database the table belongs to.

Return type

str

property public

Identify whether the table is a public table.

Return type

bool

get_privacy_limit()

Show the privacy limit associated with the table.

Returns a value of None when the table is public.

set_privacy_limit(privacy_limit)

Set the privacy limit associated with the table.

Throws an error when the table is public.

Return type

None

property privacy_spent

Show the privacy spent for the current user on this table as a percentage.

Returns the privacy spent (𝜀) associated with all the information disclosed so far by the LeapYear platform to the current user working with this table. The value is represented as a percentage of the privacy limit (0, 10, 20, …100) set by the administrator. The value can exceed 100% if the admin forcibly lowers the privacy limit below the current user’s privacy spent. No queries can be run on a table where the privacy spent is at or above 100%.

If the table is public, returns None instead.

Returns

Privacy exposure, expressed as a percentage of the limit.

Return type

float

Examples

  1. Review the current level of privacy spent.

>>> from leapyear.admin import Database, Table
>>> db = Database('db')
>>> t = Table('table', database=db)
>>> print(t.privacy_spent)
50
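
  2. Flag tables that are close to their limit. This is only a sketch: the helper name `tables_near_limit` and the default threshold of 80 are illustrative assumptions, and the code relies only on the `name` and `privacy_spent` properties described above.

```python
from typing import Iterable, List, Tuple


def tables_near_limit(tables: Iterable, threshold: float = 80.0) -> List[Tuple[str, float]]:
    """Return (name, privacy_spent) for tables at or above `threshold` percent."""
    flagged = []
    for table in tables:
        spent = table.privacy_spent
        if spent is None:
            # Public tables report None, so there is nothing to track.
            continue
        if spent >= threshold:
            flagged.append((table.name, spent))
    return flagged
```

Because the value can exceed 100 after an admin lowers the limit, the comparison uses `>=` rather than a fixed range check.
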

get_user_privacy_spent(user)

Show the privacy spent for a user on this table.

Returns the privacy spent (𝜀), as a float, associated with all the information disclosed so far by the LeapYear platform to a user working with this table, and the privacy limit as an (𝜀, 𝛿) pair in a PrivacyLimit object.

If the table is public, returns None instead.

This method is only available to authorized administrators, or to a user attempting to retrieve their own privacy spent.

set_user_privacy_limit(user, privacy_limit)

Allow the administrator to set the privacy limit for a user on this table.

Sets the privacy limit that the administrator considers acceptable for the user on this table, as an (𝜀, 𝛿) pair in a PrivacyLimit object. If this method is not called, the user is subject to the table’s privacy limit.

If this is called with a public table, nothing happens.

This method is only available to authorized users with system admin privileges.

Return type

None
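
The sketch below applies per-user limits in bulk and skips public tables up front, since the call has no effect on them. The helper name `apply_user_limits` and the `(user, privacy_limit)` pair format are assumptions made for illustration; only the `public` property and `set_user_privacy_limit` come from this page.

```python
from typing import Iterable, Tuple


def apply_user_limits(table, user_limits: Iterable[Tuple[object, object]]) -> int:
    """Set a privacy limit for each (user, privacy_limit) pair.

    Returns the number of limits applied, which is 0 for a public table.
    """
    if table.public:
        # Documented behaviour: nothing happens on a public table.
        return 0
    applied = 0
    for user, privacy_limit in user_limits:
        table.set_user_privacy_limit(user, privacy_limit)
        applied += 1
    return applied
```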

load()

Load the table.

create_async()

Create the table asynchronously.

Return type

AsyncJob

drop(*, ignore_missing=False)

Drop the table.

set_all_columns_access(subject, access)

Set the given access for all columns in the table.

If the table is public, the only legal access levels are full access and no access. Setting any other value will result in an error.

property slices

Show Data Slices for the table.

Return type

List[DataSlice]

add_data_slice(*args, **kwargs)

Add a data slice in the same way as add_data_slice_async, but run synchronously.

Return type

None

add_data_slice_async(file_credentials, *, update_column_bounds=False)

Add a file to the list of data slices of the table.

Return type

AsyncJob

create(*, ignore_if_exists=False)

Create the object synchronously.

Functionally equivalent to .create_async().wait(max_timeout_sec=None).

Return type

AsyncCreateable
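
The equivalence above can be spelled out as a small helper that also allows a bounded wait. This is a sketch: the name `create_with_timeout` is hypothetical, and what `wait` returns or does when the timeout expires is not stated on this page, so the helper simply passes the result through. It does not cover the `ignore_if_exists` option of `create`.

```python
def create_with_timeout(obj, max_timeout_sec=None):
    """Start creation asynchronously, then block until the job finishes.

    With max_timeout_sec=None this mirrors the documented behaviour of
    create(); a number bounds how long the caller is willing to wait.
    """
    job = obj.create_async()
    return job.wait(max_timeout_sec=max_timeout_sec)
```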

property name

Get the name.

Return type

str

ColumnDefinition class

class leapyear.admin.ColumnDefinition(name, *, type, bounds=None, nullable=False, description=None, infer_bounds=False)

The definition of a column for creating a Table with an explicit schema.

Example usage:

>>> table = Table(
...     columns=[ColumnDefinition("col1", type="INT", bounds=(0, 10))],
...     ...
... )
>>> table.create()

Changing values in a ColumnDefinition has no effect after a table is created. See the TableColumn documentation for functions to update column attributes after creating a table.

name

str

type

ColumnType

bounds

ColumnBounds

nullable

bool

description

str | None

infer_bounds

bool

__new__(**kwargs)

Create and return a new object. See help(type) for accurate signature.

TableColumn class

class leapyear.admin.TableColumn(*, database, table, id, name, type, bounds, nullable, description)

A column in a table.

property id

Get the id of the column.

Return type

int

property table

Get the table that the column belongs to.

Return type

str

property database

Get the database that the column belongs to.

Return type

str

property type

Get the type of the column.

Return type

ColumnType

property bounds

Get the bounds of the column.

Return type

Union[None, Tuple[int, int], Tuple[float, float], Tuple[date, date], Tuple[datetime, datetime], Set[str]]

property nullable

Get the nullability of the column.

Return type

bool

property description

Get the description of the column.

Return type

str

update(**kwargs)

Update the column’s type, bounds, or nullability.

All of the parameters are optional. If anything is not provided, it’s left unchanged.

Parameters
  • type (Union[ColumnType, str]) –

  • bounds (ColumnBounds) –

  • nullable (bool) –

  • infer_bounds (bool) –

Return type

None

set_description(description)

Set the description of the Column.

Return type

None

get_access(subject=None)

Get the access level of the given subject.

Parameters

subject – A User or Group object. If None is provided, the currently logged-in user is used.

set_access(subject, access)

Grant the given access level to a subject.

If this is a column of a public table, only Full Access and No Access are legal values. Setting any other value will result in an error.

Parameters
  • subject (Union[User, Group]) – A User or Group object.

  • access (ColumnAccessType) – The access level to grant.

Return type

None
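
To fail early instead of waiting for the server to reject the call, the public-table rule can be checked on the client side. The sketch below is illustrative: `set_column_access_checked` and its `table_is_public` flag are hypothetical, and the level names are the ColumnAccessType values listed under Permission objects.

```python
PUBLIC_TABLE_LEVELS = {"FULL_ACCESS", "NO_ACCESS"}


def set_column_access_checked(column, subject, access, *, table_is_public: bool) -> None:
    """Call column.set_access, rejecting levels that a public table forbids."""
    # ColumnAccessType members carry string values; plain strings work too.
    level = getattr(access, "value", access)
    if table_is_public and level not in PUBLIC_TABLE_LEVELS:
        raise ValueError(f"{level!r} is not allowed on a column of a public table")
    column.set_access(subject, access)
```

The flag would normally come from the `public` property of the owning Table.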

class leapyear.admin.ColumnType(value)

A column type.

BOOL = 'BOOL'

A BOOL column has no bounds.

INT = 'INT'

An INT column whose bounds should be an (int, int) pair.

REAL = 'REAL'

A REAL column whose bounds should be a (float, float) pair.

FACTOR = 'FACTOR'

A FACTOR column whose bounds should be a list of strings.

TEXT = 'TEXT'

A TEXT column has no bounds.

DATE = 'DATE'

A DATE column whose bounds should be a (date, date) pair, containing dates of the form 1970-01-31.

DATETIME = 'DATETIME'

A DATETIME column whose bounds should be a (datetime, datetime) pair, containing datetimes of the form 1970-01-31T00:00:00.

ID = 'ID'

An ID column has no bounds.

leapyear.admin.ColumnBounds

A type alias representing the union of all possible column bounds described in ColumnType.
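
The pairing of column types and bounds described in ColumnType can be sketched as a plain-Python check. This is not the library’s own validation: the function names are hypothetical, type names are passed as their string values, and the strictness (for example rejecting ints for REAL, or accepting a set as well as a list for FACTOR) is an assumption.

```python
from datetime import date, datetime

# Types documented as having no bounds.
NO_BOUNDS = {"BOOL", "TEXT", "ID"}


def _is_pair_of(bounds, kind, exclude=()):
    """True if `bounds` is a 2-tuple whose items are `kind` but not `exclude`."""
    return (
        isinstance(bounds, tuple)
        and len(bounds) == 2
        and all(isinstance(b, kind) and not isinstance(b, exclude) for b in bounds)
    )


def bounds_match_type(col_type: str, bounds) -> bool:
    """Check that `bounds` has the shape documented for `col_type`."""
    if col_type in NO_BOUNDS:
        return bounds is None
    if col_type == "INT":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return _is_pair_of(bounds, int, exclude=bool)
    if col_type == "REAL":
        return _is_pair_of(bounds, float)
    if col_type == "FACTOR":
        return isinstance(bounds, (list, set)) and all(isinstance(b, str) for b in bounds)
    if col_type == "DATE":
        # datetime is a subclass of date, so exclude it for DATE columns.
        return _is_pair_of(bounds, date, exclude=datetime)
    if col_type == "DATETIME":
        return _is_pair_of(bounds, datetime)
    raise ValueError(f"unknown column type: {col_type!r}")
```

For instance, `bounds_match_type("INT", (0, 10))` holds for the bounds used in the ColumnDefinition example.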

View class

class leapyear.admin.View(name, *, database, dataset, num_partitions=1, partitioning_columns=[], sort_within_partitions_by_columns=[], nominal_partitioning_columns=[], description=None, **kwargs)

View object.

A view is a dataset that can be persisted on disk (materialized), across restarts of the LeapYear application. Analysts referencing a materialized view will be using the dataset that is on disk, instead of re-calculating any transformations defined on the dataset.

A guide on how to use views can be found here.

Analysts should load views either from Database.views or using the database.view notation; for example:

>>> db = client.databases['db1']
>>> view1 = db.views['view1']
>>> ds1 = DataSet.from_view(view1)
>>> ds2 = DataSet.from_view('db1.view1')

Parameters
  • name (str) – The view’s name. View names must be unique, and de-materialized views still count toward uniqueness. A view name cannot contain any of the characters ,;{}()=" or a newline (\n) or tab (\t).

  • database (Union[str, Database]) – The database that the view belongs to. This should be the database that the tables referenced in the DataSet belong to.

  • dataset (DataSet) – The DataSet that will be stored as a view.

  • num_partitions – The number of partitions that the view will be split into. This will only be used if partitioning_columns is also set.

  • partitioning_columns – The columns by which to bucket (cluster) the view into partitions. This must be used with num_partitions. The view will have num_partitions number of partitions, and records with the same values for the partitioning_columns will be in the same partition.

  • sort_within_partitions_by_columns – The columns used to sort rows within each partition.

  • nominal_partitioning_columns – The columns by which to partition the view. This should be used by itself, without any other partition parameters.

  • description (Optional[str]) – Description of the view.
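
The naming rule for views can be checked before a View is constructed. The following is a minimal sketch; the helper name `invalid_view_name_chars` is hypothetical, and it covers only the forbidden characters, not the uniqueness requirement, which needs the server.

```python
# Characters that the `name` parameter description rules out.
FORBIDDEN_VIEW_NAME_CHARS = set(',;{}()="\n\t')


def invalid_view_name_chars(name: str) -> set:
    """Return the forbidden characters found in `name` (empty set if valid)."""
    return set(name) & FORBIDDEN_VIEW_NAME_CHARS
```

An empty result means the name passes this check, so `not invalid_view_name_chars(name)` can serve as a quick validity test.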

property database

Get the database associated with the view.

Return type

str

property description

Get the description associated with the view.

Return type

str

dematerialize()

Dematerialize the view.

This is the preferred method to free disk space used by a view.

load()

Load the view.

create_async()

Create the view asynchronously.

Return type

AsyncCreateJob

drop(*, ignore_missing=False)

Drop (and unregister) the view.

Admins should NOT drop a view unless they wish to also discard the entries in the analysis cache associated with that view. Instead, admins should use the dematerialize method.

create(*, ignore_if_exists=False)

Create the object synchronously.

Functionally equivalent to .create_async().wait(max_timeout_sec=None).

Return type

AsyncCreateable

property name

Get the name.

Return type

str

User class

class leapyear.admin.User(username, password=None, *, is_root=None, enabled=None, user_id=None, subj_id=None)

User object.

classmethod all()

All Users.

Returns

All users on the LeapYear server.

Return type

Iterator[User]

property id

Get the ID.

Return type

int

property subj_id

Get the subject ID.

Return type

int

property username

Get the username.

Return type

str

property is_root

Whether the user is a root user.

Return type

bool

property enabled

Whether the user is enabled.

Return type

bool

property groups

Get the groups of a user.

Returns

All groups of the user on the LeapYear server.

Return type

List[Group]

load()

Load the information for the user.

Return type

User

create(*, ignore_if_exists=False)

Create the user.

Return type

User

update(*, password=None, enabled=None)

Update the user.

Return type

User
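
As an illustration of `update`, the sketch below disables every enabled, non-root user except those on a keep list. The helper name `disable_users` and the keep-list policy are assumptions; the code uses only the `username`, `is_root`, and `enabled` properties and the `update(enabled=...)` call documented above.

```python
from typing import Iterable, List


def disable_users(users: Iterable, keep_usernames: Iterable[str] = ()) -> List[str]:
    """Disable enabled, non-root users not listed in `keep_usernames`.

    Returns the usernames that were disabled.
    """
    keep = set(keep_usernames)
    disabled = []
    for user in users:
        if user.is_root or not user.enabled or user.username in keep:
            continue
        user.update(enabled=False)
        disabled.append(user.username)
    return disabled
```

With a live connection, `users` would typically be `User.all()`.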

property name

Get the name.

Return type

str

Privacy Profile class

class leapyear.admin.PrivacyProfile(name, *, params=None, hidden=None, verified=None, description=None, profile_id=None)

PrivacyProfile object.

classmethod get_latest_verified()

Get the latest verified privacy profile.

Return type

PrivacyProfile

classmethod all()

Get all privacy profiles.

Return type

Iterator[PrivacyProfile]

property id

Get the ID.

Return type

int

property description

Get the privacy profile description.

Return type

str

property hidden

Get whether the profile is hidden in the Data Manager.

Return type

bool

property verified

Get whether the profile is verified.

Return type

bool

property params

Get the parameters of the privacy profile.

Return type

PrivacyProfileParams

load()

Load the privacy profile.

create(*, ignore_if_exists=False)

Create the privacy profile.

Return type

PrivacyProfile

update(params=None, hidden=None)

Update the privacy profile’s params.

Parameters
  • params – The parameters to be updated.

  • hidden – Whether or not the privacy profile should be hidden in Data Manager.
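
One possible use of the `hidden` flag is to keep unverified profiles out of Data Manager. The sketch below assumes that policy purely for illustration; `hide_unverified_profiles` is a hypothetical helper built on the `id`, `verified`, and `hidden` properties and on `update(hidden=...)`.

```python
from typing import Iterable, List


def hide_unverified_profiles(profiles: Iterable) -> List[int]:
    """Hide every profile that is unverified and currently visible.

    Returns the IDs of the profiles that were updated.
    """
    updated = []
    for profile in profiles:
        if not profile.verified and not profile.hidden:
            profile.update(hidden=True)
            updated.append(profile.id)
    return updated
```

With a live connection, `profiles` would typically be `PrivacyProfile.all()`.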

Permission objects

class leapyear.admin.DatabaseAccessType(value)

AccessType for Databases.

NO_ACCESS_TO_DB = 'NO_ACCESS_TO_DB'

Prevents user from accessing database

SHOW_DATABASE = 'SHOW_DATABASE'

Allows a user to see this database and the tables it contains, including their public metadata

ADMINISTER_DATABASE = 'ADMINISTER_DATABASE'

Allows a user to administer this database - e.g. add data sources, grant user access

class leapyear.admin.ColumnAccessType(value)

AccessType for Columns.

NO_ACCESS = 'NO_ACCESS'

Prevents user from accessing column

COMPUTE = 'COMPUTE'

Allows a user to run randomized computations

FULL_ACCESS = 'FULL_ACCESS'

Allows a user to run randomized computations and view and retrieve raw data

COMPARE = 'COMPARE'