Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZenML API Keys and Service Accounts #1840

Merged
merged 19 commits into from
Nov 10, 2023

Conversation

stefannica
Copy link
Contributor

@stefannica stefannica commented Sep 26, 2023

Describe changes

ZenML Service Accounts

Service accounts are similar to users, with the difference that they are used to authenticate automated workloads like CI/CD jobs where the user-based web login isn't possible.

Internally, service accounts and users are represented using the same legacy user models and database table. This is required for backwards compatibility reasons. At the same time, new models and REST API routes are implemented for use exclusively with service accounts.

ZenML API Keys

Add support to generate and rotate long-lived API keys associated with a service account that can be used to authenticate to the ZenML Server.

  • API keys can be created only through the CLI at the moment (dashboard support will follow)
  • the API keys are associated with a service account
  • API keys can be used to authenticate the CLI instead of a username/password.
  • when an API key is rotated, the user has the option to configure a grace period during which both keys continue to be valid
  • the implementation is done in a way that only exposes the API key value when generated or rotated, but otherwise the (hashed) API key value is kept hidden in the DB layer and never exposed through the REST API.

TODO

  • documentation
  • unit tests

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to docs, I have updated the documentation accordingly.
  • If I have added an integration, I have updated the integrations table and the corresponding website section.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@github-actions github-actions bot added internal To filter out internal PRs and issues enhancement New feature or request labels Sep 26, 2023
@stefannica stefannica force-pushed the feature/cloud-368-service-accounts branch 2 times, most recently from 90bd0ca to 356bce5 Compare November 3, 2023 21:45
@stefannica stefannica marked this pull request as ready for review November 3, 2023 21:45
Copy link
Contributor

@AlexejPenner AlexejPenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial thoughts, will dive deeper tomorrow :)

src/zenml/cli/service_accounts.py Show resolved Hide resolved
)
def create_service_account(
service_account_name: str,
skip_api_key: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_api_key instead of skip, seems more natural to me imo. Less mental gymnastics required to understand whats going on ( don’t skip being a sort of double negative )

@stefannica stefannica force-pushed the feature/cloud-368-service-accounts branch from 8d9fb6d to 0ea5033 Compare November 6, 2023 10:18
Copy link
Contributor

@schustmi schustmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing job @stefannica, this looks very nice already and I only have a few minor comments. Can't wait to integrate this in my RBAC branch 🎉

src/zenml/zen_stores/schemas/user_schemas.py Show resolved Hide resolved
@@ -726,6 +735,7 @@ def list_users(
email: Optional[str] = None,
active: Optional[bool] = None,
email_opted_in: Optional[bool] = None,
is_service_account: Optional[bool] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion this should not be exposed to users but should instead always be set to False in the internal SQLZenStore query internally, so that this method only returns actual users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I already removed it based on our conversations, even before this review was posted.

src/zenml/models/service_account_models.py Show resolved Hide resolved
src/zenml/zen_server/auth.py Show resolved Hide resolved
src/zenml/zen_stores/sql_zen_store.py Show resolved Hide resolved
@stefannica stefannica force-pushed the feature/cloud-368-service-accounts branch from 0ea5033 to 04198be Compare November 7, 2023 12:53
@stefannica stefannica force-pushed the feature/cloud-368-service-accounts branch from 6cc6295 to 3fe8b3a Compare November 8, 2023 15:25
Copy link
Contributor

@AlexejPenner AlexejPenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦭 from my side

src/zenml/zen_stores/sql_zen_store.py Outdated Show resolved Hide resolved
@@ -413,6 +413,7 @@ def sample_user_model() -> UserResponseModel:
name="axl",
created=datetime.now(),
updated=datetime.now(),
is_service_account=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be false?

Comment on lines +1900 to +1980
@classmethod
@lru_cache(maxsize=1)
def _get_resource_references(
cls,
) -> List[Tuple[Type[SQLModel], str]]:
"""Get a list of all other table columns that reference the user table.

Given that this list doesn't change at runtime, we cache it.

Returns:
A list of all other table columns that reference the user table
as a list of tuples of the form
(<sqlmodel-schema-class>, <attribute-name>).
"""
from zenml.zen_stores import schemas as zenml_schemas

# Get a list of attributes that represent relationships to other
# resources
resource_attrs = [
attr
for attr in UserSchema.__sqlmodel_relationships__.keys()
if not attr.startswith("_") and attr not in
# These are not resources owned by the user or are resources that
# are deleted automatically when the user is deleted. Secrets in
# particular are left out because they are automatically deleted
# even when stored in an external secret store.
["teams", "assigned_roles", "api_keys", "auth_devices", "secrets"]
]

# This next part is crucial in preserving scalability: we don't fetch
# the values of the relationship attributes, because this would
# potentially load a huge amount of data into memory through
# lazy-loading. Instead, we use a DB query to count resources
# associated with the user for each individual resource attribute.

# To create this query, we need a list of all tables and their foreign
# keys that point to the user table.
foreign_keys: List[Tuple[Type[SQLModel], str]] = []
for resource_attr in resource_attrs:
# Extract the target schema from the annotation
annotation = UserSchema.__annotations__[resource_attr]

# The annotation must be of the form
# `typing.List[ForwardRef('<schema-class>')]`
# We need to recover the schema class from the ForwardRef
assert annotation._name == "List"
assert annotation.__args__
schema_ref = annotation.__args__[0]
assert isinstance(schema_ref, ForwardRef)
# We pass the zenml_schemas module as the globals dict to
# _evaluate, because this is where the schema classes are
# defined
if sys.version_info < (3, 9):
# For Python versions <3.9, leave out the third parameter to
# _evaluate
target_schema = schema_ref._evaluate(vars(zenml_schemas), {})
else:
target_schema = schema_ref._evaluate(
vars(zenml_schemas), {}, frozenset()
)
assert target_schema is not None
assert issubclass(target_schema, SQLModel)

# Next, we need to identify the foreign key attribute in the
# target table
table = UserSchema.metadata.tables[target_schema.__tablename__]
foreign_key_attr = None
for fk in table.foreign_keys:
if fk.column.table.name != UserSchema.__tablename__:
continue
if fk.column.name != "id":
continue
assert fk.parent is not None
foreign_key_attr = fk.parent.name
break

assert foreign_key_attr is not None

foreign_keys.append((target_schema, foreign_key_attr))

return foreign_keys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😰

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added plenty of unit tests for this, no worries

stefannica and others added 2 commits November 10, 2023 14:36
Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com>
@stefannica stefannica merged commit 4076cee into develop Nov 10, 2023
14 of 21 checks passed
@stefannica stefannica deleted the feature/cloud-368-service-accounts branch November 10, 2023 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request internal To filter out internal PRs and issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants