Remote git server as source for plans and devices #363

Open
callumforrester opened this issue Feb 5, 2024 · 22 comments
Labels
dependencies Pull requests that update a dependency file enhancement New feature or request rest api Potential REST API changes worker Relates to worker code

Comments

@callumforrester
Collaborator

Remote Git Repository as Source for Plans and Devices

Background

We introduced a scratch area in #313 so that plan/device code could be loaded from external files on startup. Typically we check out these files on a shared filesystem for prototyping. We have found several issues with this approach:

  • Longer startup times as the code is loaded in
  • Blueapi is difficult to hot-reload (hopefully fixed with Run worker in subprocess #317)
  • The shared files may get conflicting edits by multiple people
    • Especially confusing if one person left uncommitted changes
    • May also lead to permissions issues if someone creates, and owns, a new file
  • It can be hard to restore blueapi to a known state without deleting WIP code in the scratch area

We are therefore looking for a more flexible, long-term solution that will enable more robust deployments at the facility level.

Proposed Solution

Outline

Add the option to configure blueapi to pull down and install git repositories into its environment on startup. Define a "workspace" i.e. a group of remote git repositories that blueapi uses. On startup/reload, it will pull these as-configured (latest main, latest tag, some specific branch or hash). It can be configured via REST endpoints. It will also cache what it pulls in case the remote goes down.
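To make the idea of a "workspace" concrete, here is a minimal sketch of how such a configuration might be modelled. The class names, field names and example URLs are all hypothetical and are not part of blueapi; the only real convention used is pip's `git+<url>@<ref>` syntax for installing from a repository at a given branch, tag or commit. "Latest tag" is not a fixed ref, so it would need to be resolved to a concrete tag before this step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RepoSource:
    """One remote repository and the ref to track (hypothetical schema)."""

    url: str
    ref: str = "main"  # branch name, tag, or commit hash


@dataclass
class Workspace:
    """A group of repositories to pull on startup/reload (hypothetical schema)."""

    repos: List[RepoSource] = field(default_factory=list)

    def pip_requirements(self) -> List[str]:
        # pip accepts VCS requirements of the form git+<url>@<ref>
        return [f"git+{repo.url}@{repo.ref}" for repo in self.repos]


# Example URLs are placeholders, not real servers.
workspace = Workspace(
    repos=[
        RepoSource("https://git.example.org/beamline/plans.git"),
        RepoSource("https://git.example.org/beamline/devices.git", ref="my-experiment"),
    ]
)
print(workspace.pip_requirements())
```

Keeping the ref as a plain string means that "point blueapi at a different branch" becomes a change to one field, however that change ends up being delivered.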

[Image: blueapi-code-upload-simple(1)]

Example Workflow for Prototyping Changes

An example workflow for prototyping changes to plans/devices is therefore:

  • Check out the relevant repo
  • Make a new branch
  • Make some experimental changes and push
  • Point blueapi at the new branch

This adds some extra overhead to getting experimental changes in; however, that overhead can be addressed outside of blueapi. For example, we have considered an auto-commit program for less advanced users. From their perspective, they click "save" and a few seconds later the changes are live.
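The first three steps of that workflow map onto ordinary git commands. The sketch below only builds the command lines and does not run them; the function name, the checkout directory `plans` and the example arguments are illustrative assumptions. The final step, pointing blueapi at the branch, is omitted because its mechanism is not defined in this proposal.

```python
import shlex
from typing import List


def prototype_commands(repo_url: str, branch: str, message: str) -> List[List[str]]:
    """Return the git invocations for the workflow, in order (nothing is executed)."""
    return [
        ["git", "clone", repo_url, "plans"],
        ["git", "-C", "plans", "switch", "-c", branch],
        # --all stages modified tracked files only; new files still need `git add`
        ["git", "-C", "plans", "commit", "--all", "--message", message],
        ["git", "-C", "plans", "push", "--set-upstream", "origin", branch],
    ]


commands = prototype_commands(
    "https://git.example.org/beamline/plans.git",  # placeholder URL
    "my-experiment",
    "Try a new scan plan",
)
for argv in commands:
    print(shlex.join(argv))
```

An auto-commit tool would effectively run the commit and push steps on every save.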

Benefits

  • We can track ongoing changes to deployments at the facility level
  • No shared checkouts are being edited
  • There is always the option of easily restoring to a working state (main branch)
  • Caching reduces startup times most of the time
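The caching behaviour could work roughly as follows: try to refresh from the remote, and reuse the last pulled copy if the remote is unreachable. This is a sketch under assumptions, not blueapi's implementation; `resolve_source` and the injected `fetch` callable are hypothetical, and a real version would call git rather than a stub.

```python
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def resolve_source(fetch: Callable[[Path], None], cache_dir: Path) -> Tuple[Path, bool]:
    """Refresh cache_dir via fetch; fall back to the cached copy on failure.

    Returns (path, fresh), where fresh is False if the cached copy was reused.
    """
    try:
        fetch(cache_dir)
        return cache_dir, True
    except OSError:
        if cache_dir.is_dir() and any(cache_dir.iterdir()):
            return cache_dir, False
        raise  # nothing cached, so there is no code to load


def remote_is_down(_target: Path) -> None:
    # Stand-in for a failed fetch; ConnectionError is a subclass of OSError
    raise ConnectionError("remote unreachable")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    (cache / "plans.py").write_text("# previously pulled plan code\n")
    _, fresh = resolve_source(remote_is_down, cache)
    print(fresh)
```

Reporting whether the copy is fresh lets the caller warn users that they may be running stale code.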

Outstanding Concerns

  • Is this too complex?
  • It's risky to depend on remote github/gitlab repositories for runtime operation (solvable by hosting an intermediate git server, but that adds yet more complexity)
  • Are we just using git as a database?
@callumforrester callumforrester added enhancement New feature or request dependencies Pull requests that update a dependency file rest api Potential REST API changes worker Relates to worker code labels Feb 5, 2024
@stan-dot stan-dot self-assigned this Mar 8, 2024
@stan-dot
Collaborator

re: 'point blueapi at the new branch'
how would this be done? with a config value? there is no 'config' area in the current CLI.

re: risk
an intermediate server could be on the local gitlab with bidirectional mirroring. an opinion from the cloud team would be necessary. Definitely relying on github only without redundancy would be bad.

re: Are we just using git as a database?
is there something wrong with that?

re: self hosted vscode

we should spin a container there with devcontainers with the auto commit extension
https://marketplace.visualstudio.com/items?itemName=vsls-contrib.gitdoc

re: making own editor from scratch

I had a brief discussion about this. Main conclusion - it's very difficult.
intellisense would be a huge block and there is no spare capacity to make it a good product. there are some libraries like

https://editorjs.io/
or
https://github.com/tinymce/tinymce

but those are not well-suited for code. Even still we could manage file updates - https://fastapi.tiangolo.com/tutorial/request-files/ .

@stan-dot
Collaborator

re: self-hosted but diff than vscode

I hope each of those could be pulled up with a simple helm chart.
The one for redhat doesn't seem the best as the beamline workstations have windows not redhat.

then the search for a self-hosted solution starts with awesome-selfhosted README.

there we find coder

eclipse Theia
https://theia-ide.org/

gitpod
https://www.gitpod.io/

code-server
https://docs.linuxserver.io/images/docker-code-server/#usage
https://code.visualstudio.com/docs/remote/vscode-server

and also some more niche projects for comparison, that are likely not mature enough for our use case:

@stan-dot
Collaborator

action recommendation:
outline a brief plan to get a devcontainer setup for a specific beamline - like i20-1 - define a set of extensions, and create a github repo with a gitlab mirror. Then take this to the cloud team to get the thing running and then add 'read from repo' endpoint refresh into blueapi config.

this proof of concept is not an MVP yet but every day that the scripts aren't version-controlled is a risk.

Expected outcome:
in 2 weeks the proof of concept is built and then the effort and cloud resources to create an MVP can be estimated, and subsequent migration can be propagated throughout the beamlines.

Once deployed presumably this would require little maintenance. One shared repository for all the plans across beamlines could be added as a nice feature.

@callumforrester
Collaborator Author

A few thoughts:

  • We do not want github or gitlab as an operational dependency, perhaps the internal mirror should be a more lightweight git server that we can deploy alongside blueapi?
  • Why don't we want to self-host vscode? Probably best if we have the same UI as the scientists for plan development
  • Why a devcontainer per beamline?
  • One shared repo of all plans is an interesting idea, would need other opinions.
  • What do we need the cloud team for?

A couple of other things to consider:

  • We want to automate git committing, which will lead to a messy git history. Do we also want to stage to a separate, tidy git history? E.g. via PRs
  • Do we want to track the history of each plan? If so, how do we do this if we're just committing them to a git repo? Git has okay-but-not-perfect tools for tracking the history of a function, and if a plan is renamed then it may well be a case-by-case decision as to whether it is the same "plan" afterwards.
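One of the existing git tools for function history is `git log -L :<funcname>:<file>`, which follows the line range of a function through the commits that touched it. The helper below is a hypothetical sketch that only builds that command, and the file path and plan name are made up. It also shows the limitation raised above: the function is located by name, so a renamed plan appears as a different function.

```python
from typing import List


def function_history_command(path: str, function_name: str) -> List[str]:
    """Build a `git log -L` invocation that follows one function's line range."""
    # The function name is treated by git as a regex matched against
    # function header lines in the given file.
    return ["git", "log", f"-L:{function_name}:{path}"]


# Hypothetical file and plan name, for illustration only.
command = function_history_command("src/plans/scans.py", "my_scan")
print(" ".join(command))
```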

@stan-dot
Collaborator

I mean our local gitlab instance maintained by the cloud team.
that seems easier than making another mirror from scratch
https://docs.gitlab.com/ee/user/project/repository/mirror/bidirectional.html

I mean for 1 beamline as an MVP, to try out there their specific plans. limiting the scope.

messy git history is better than no history.

why would we want to track history of each plan? that's a wild requirement. arguably we could cache a git ref or text snapshot in the plan metadata instead of just plan name. not sure what is the use case here
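As a rough sketch of the "git ref or text snapshot in the plan metadata" idea, the snippet below builds a small metadata dictionary from a plan's name, its source text and the commit it was loaded from. The function name and metadata keys are hypothetical, not an existing blueapi or bluesky convention, and the commit hash is a made-up placeholder.

```python
import hashlib
from typing import Dict


def snapshot_metadata(plan_name: str, source_text: str, git_ref: str) -> Dict[str, str]:
    """Describe exactly which version of a plan ran (hypothetical keys)."""
    return {
        "plan_name": plan_name,
        "plan_git_ref": git_ref,
        # A digest keeps the metadata small; the full text could be stored instead
        "plan_source_sha256": hashlib.sha256(source_text.encode("utf-8")).hexdigest(),
    }


metadata = snapshot_metadata(
    "my_scan",
    "def my_scan(detectors):\n    ...\n",
    "3f2a9c1",  # placeholder commit hash
)
print(metadata)
```

With this recorded alongside each run, a later reader can tell which code produced a dataset without relying on the plan name alone.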

@stan-dot
Collaborator

stan-dot commented Apr 2, 2024

testing the linuxserver /code-server image now https://hub.docker.com/r/linuxserver/code-server

@stan-dot
Collaborator

stan-dot commented Apr 2, 2024

let's consider for a moment what if the experimental plans lived at the head, people just pushing to production all the time. of course the more long-lived plans would be in a different directory

@stan-dot
Collaborator

stan-dot commented Apr 2, 2024

an intermediate server could be on the local gitlab with bidirectional mirroring. an opinion from the cloud team would be necessary. Definitely relying on github only without redundancy would be bad.

with that of course to not get stuck in the event of github going down

@stan-dot
Collaborator

stan-dot commented Apr 2, 2024

and the workspace dir could be mounted in /dls so that it's always available locally if needed

@stan-dot
Collaborator

stan-dot commented Apr 2, 2024

GitDoc seems to work fine. next I'll try to commit a new ophyd device from there

@stan-dot
Collaborator

stan-dot commented Apr 3, 2024

@stan-dot
Collaborator

stan-dot commented Apr 3, 2024

one issue seems to be about the 'writing to local filesystem' and also then for multi-user access.

perhaps the use of coder would be needed https://coder.com/docs/v2/latest/install/kubernetes

update: our current cloud config should be ok to try out coder

@stan-dot
Collaborator

stan-dot commented Apr 3, 2024

update: I have no idea how to deploy coder with helm in my namespace at the argus cluster

@stan-dot
Collaborator

stan-dot commented Apr 4, 2024

MVP idea - just use the provided and managed module load vscode with a loaded devcontainer with pre-loaded extension for autosave as well as a minimalistic vscode settings profile. Users that'd like more control could customize their profile further.

https://code.visualstudio.com/docs/editor/profiles#_python-profile-template

we could save the profiles as github gists
https://code.visualstudio.com/docs/editor/profiles#_save-as-a-github-gist

we could hard-code script like this:
module load vscode && code bluesky_plans --profile https://gist.github.com/diego3g/b1b189063d21b96d6144ca896755be64

the profile indicated here is the one from the github gist with the most stars: https://gist.github.com/search?l=JSON&o=desc&q=vscode+profile&s=stars

@callumforrester
Collaborator Author

Where does the local copy of the code live?

@stan-dot
Collaborator

stan-dot commented Apr 4, 2024

ah I forgot to address this key question. GitDoc is a Visual Studio Code extension that allows you to automatically commit/push/pull changes on save. The local copy could live on the workstation scratch dir, or somewhere else; I don't think it'd make much of a difference if the push and pull is frequent.

@callumforrester
Collaborator Author

From discussion: the core problem is to put git as the barrier between changes being written and read from blueapi. The target that blueapi is pointing at MUST BE beyond the local filesystem. Where the edits happen is secondary - that can be in a local vim or preconfigured vscode editor, or in the cloud with a tool like coder.

A lightweight git server could be tested to investigate this further.
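The requirement that blueapi's target sits beyond the local filesystem could be enforced with a simple check on the configured source. The sketch below is one possible rule, assuming sources are given as URLs; the function name and the accepted scheme list are illustrative choices, and scp-style addresses such as `git@host:path` are not handled.

```python
from urllib.parse import urlparse

# Schemes treated as "beyond the local filesystem" (an illustrative choice)
REMOTE_SCHEMES = {"https", "ssh", "git"}


def is_remote_source(url: str) -> bool:
    """Return True only for URLs that name a host reached over a network scheme."""
    parsed = urlparse(url)
    return parsed.scheme in REMOTE_SCHEMES and bool(parsed.hostname)


# Placeholder paths and URLs, for illustration only.
for candidate in (
    "https://git.example.org/beamline/plans.git",
    "file:///scratch/plans",
    "/scratch/plans",
):
    print(candidate, is_remote_source(candidate))
```

Rejecting local paths at configuration time keeps a shared checkout from quietly becoming the source of truth again.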

@stan-dot
Collaborator

@stan-dot stan-dot removed their assignment Apr 22, 2024
@stan-dot
Collaborator

stumbled across another solution for this
https://github.com/rgrove/synchrotron

@callumforrester
Collaborator Author

Continued in #509

@callumforrester callumforrester closed this as not planned Jun 19, 2024
@stan-dot
Collaborator

@callumforrester this could still be kept here as part of the epic afaik

@callumforrester
Collaborator Author

@stan-dot Good point!


2 participants