Add a linstor storage driver #564

Open · stgraber opened this issue Feb 29, 2024 · 27 comments · May be fixed by #1621
stgraber (Member) commented Feb 29, 2024

Adding this as its own feature request from discussions in #344.

There's been some interest recently in seeing Linstor added as a remote storage driver alongside Ceph and clustered LVM.

The initial things to look at are:

  • Simplest way to deploy a minimal Linstor setup over 3 machines
  • CLI & API way to create or delete what we call a storage pool
  • CLI & API way to create or delete a basic block volume of a defined size
  • CLI & API way to create or delete a snapshot on top of a volume
  • CLI & API way to create or delete a new volume from a snapshot of another volume
  • CLI & API way to handle moving a volume between systems (if any operation is needed for that, usually things like locking, primary host affinity, that kind of stuff)

Go client for Linstor: https://github.com/LINBIT/golinstor
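
For orientation, here's a minimal sketch of talking to a LINSTOR controller with that client. The controller address is made up, and the NewClient/BaseURL/Nodes calls reflect my reading of golinstor's v0.x API rather than anything settled in this issue:

```go
// Hedged sketch (assumed controller address and golinstor client API):
// connect to a LINSTOR controller and list the cluster nodes it knows.
package main

import (
	"context"
	"fmt"
	"log"
	"net/url"

	linstor "github.com/LINBIT/golinstor/client"
)

func main() {
	// 3370 is LINSTOR's default plain-HTTP REST port.
	u, err := url.Parse("http://10.0.0.1:3370")
	if err != nil {
		log.Fatal(err)
	}

	c, err := linstor.NewClient(linstor.BaseURL(u))
	if err != nil {
		log.Fatal(err)
	}

	nodes, err := c.Nodes.GetAll(context.Background())
	if err != nil {
		log.Fatal(err)
	}

	for _, n := range nodes {
		fmt.Println(n.Name, n.Type)
	}
}
```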

stgraber added the Documentation, Feature and API labels on Feb 29, 2024
stgraber added this to the soon milestone on Mar 8, 2024
sona78 (Contributor) commented Apr 2, 2024

Hi I'm interested in working on this issue. May I be assigned to this?

stgraber (Member, Author) commented Apr 2, 2024

Hey there,

I wouldn't recommend taking on this issue, as this is the kind of work that I'm currently estimating at more than a month of full-time dedicated work to get done right.

sona78 (Contributor) commented Apr 2, 2024

That makes sense, thank you for letting me know

sharathsivakumar (Contributor) commented:

@stgraber Can I take a stab at this? I wouldn't be able to work full-time on it, but if there's no urgent deadline for delivery, I can commit to working on this over the next 2-3 months and getting it done.

stgraber (Member, Author) commented Apr 4, 2024

Sure, you can start looking into this one.

The first stage doesn't actually involve Incus too much. You should basically get a few systems (I'd do 3), give them some extra disks for use with Linstor, and then deploy an Incus cluster across those systems, with no storage configured at this stage.

Then deploy Linstor on the same machines. After that, start figuring out what needs to be done through the Linstor Go client to create a pool, create volumes within that pool, and create snapshots of those volumes; what the main configuration options are that we'll want to expose on both pools and volumes; and what needs to be done when an instance is moved from one system to another.

That all should happen outside of Incus. Once you have some example Go code using Linstor's Go client which can handle those basic cases, then we can start putting all that into a new Incus storage driver, figure out what needs to be done to be able to do basic tests of Linstor in the Github Actions environment and then get it reviewed and merged.
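
As a rough illustration of that exploration phase (not code from this issue), something along these lines should exercise the pool, volume and snapshot basics through golinstor. The resource group, storage pool and resource names are made up, and the struct fields are my best guess at the client API:

```go
// Hedged sketch: create a resource group (the pool analog), spawn a
// replicated resource (volume) from it, then snapshot that resource.
package main

import (
	"context"
	"log"
	"net/url"

	linstor "github.com/LINBIT/golinstor/client"
)

func main() {
	ctx := context.Background()

	u, err := url.Parse("http://10.0.0.1:3370") // hypothetical controller
	if err != nil {
		log.Fatal(err)
	}
	c, err := linstor.NewClient(linstor.BaseURL(u))
	if err != nil {
		log.Fatal(err)
	}

	// A resource group holds the placement rules for future volumes.
	err = c.ResourceGroups.Create(ctx, linstor.ResourceGroup{
		Name: "incus",
		SelectFilter: linstor.AutoSelectFilter{
			PlaceCount:  3,       // replicate across 3 nodes
			StoragePool: "incus", // LINSTOR storage pool to draw from
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Spawning creates a resource definition plus placed replicas;
	// sizes are in KiB, so this is a 10 GiB volume.
	err = c.ResourceGroups.Spawn(ctx, "incus", linstor.ResourceGroupSpawn{
		ResourceDefinitionName: "vol1",
		VolumeSizes:            []int64{10 * 1024 * 1024},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Snapshot the new resource on the nodes it was placed on.
	err = c.Resources.CreateSnapshot(ctx, linstor.Snapshot{
		Name:         "snap1",
		ResourceName: "vol1",
	})
	if err != nil {
		log.Fatal(err)
	}
}
```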

stgraber (Member, Author) commented Apr 4, 2024

Frequent updates in this issue as you make progress will be key, so that anyone else who knows Linstor can assist with the best ways to set things up and run it.

sharathsivakumar (Contributor) commented Apr 5, 2024

@stgraber Sounds good. Thanks for the detailed explanation. I'll start setting up the required instances and post updates here as I go.
For the Incus and Linstor instances, VMs will be sufficient, right?

sharathsivakumar (Contributor) commented:

@stgraber Could you please assign this to me? I'll start working on it this week.

stgraber (Member, Author) commented Apr 8, 2024

Done

thomasdba commented:

Any updates? Thanks.

stgraber (Member, Author) commented Jul 7, 2024

@sharathsivakumar

vic-t commented Sep 10, 2024

@sharathsivakumar can we get an update?

sharathsivakumar (Contributor) commented:

Hi @vic-t, @stgraber,
I was away for a while due to personal commitments and health issues. I'm picking things up again this week and will post a progress update here within a week.

stgraber (Member, Author) commented:

Hey @sharathsivakumar, how's it going with this one?

winiciusallan (Contributor) commented:

Hi @sharathsivakumar! I'm very interested in this feature, so if you need any help with Linstor and it's something I know how to answer, let me know.

luissimas (Contributor) commented:

Hello folks! I'm also quite interested in this feature, so let me know if there's anything I can help with. For a minimal Linstor deployment for testing, I think the Ansible playbooks provided by LINBIT might be a good start. A few months ago I had to do a demonstration of Linstor, so I ended up writing a very simple playbook that could also serve as a starting point.

stgraber (Member, Author) commented:

Cleared assignee as he's clearly inactive.

luissimas (Contributor) commented:

Hey @stgraber! I'll be able to work on this feature part-time, and @winiciusallan is also interested in it, so we've agreed to collaborate on the development. As a first step, we want a basic automation setup to quickly deploy a Linstor cluster for development. We're thinking of doing something similar to incus-deploy, provisioning some VMs with Incus itself and then deploying Linstor with Ansible.

Once that's done, we'll probably start discussing the requirements for making Incus communicate with Linstor over its API, to settle on a minimal set of config options for the integration. We'll probably need a list of endpoints and optional client certificates for mTLS, much like how it's done for OVN today.
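
For illustration, the mTLS wiring could look roughly like this: build an http.Client carrying the client certificate and CA, then hand it to golinstor. The file paths and HTTPS port are assumptions, and HTTPClient/BaseURL reflect my understanding of the client's options, not a settled design:

```go
// Hedged sketch: connect to a LINSTOR controller over HTTPS with a
// client certificate (mTLS), similar in spirit to Incus's OVN setup.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net/http"
	"net/url"
	"os"

	linstor "github.com/LINBIT/golinstor/client"
)

func main() {
	// Hypothetical paths: a keypair issued for Incus plus the
	// controller's CA certificate.
	cert, err := tls.LoadX509KeyPair("client.crt", "client.key")
	if err != nil {
		log.Fatal(err)
	}

	caPEM, err := os.ReadFile("controller-ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	caPool := x509.NewCertPool()
	caPool.AppendCertsFromPEM(caPEM)

	httpClient := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				Certificates: []tls.Certificate{cert},
				RootCAs:      caPool,
			},
		},
	}

	// 3371 is LINSTOR's default HTTPS REST port.
	u, err := url.Parse("https://10.0.0.1:3371")
	if err != nil {
		log.Fatal(err)
	}

	c, err := linstor.NewClient(linstor.BaseURL(u), linstor.HTTPClient(httpClient))
	if err != nil {
		log.Fatal(err)
	}
	_ = c // use c.Nodes, c.ResourceGroups, ... as usual
}
```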

If it sounds good to you, could you assign the issue to me?

bensmrs (Contributor) commented Jan 17, 2025

Well, we figured out today that Linstor may be a good fit for a new cluster we're deploying at $job. If splitting (and coordinating) the workload between three people works for you two, you can count me in. I'll be able to work on it during office hours: the sooner we can get our cluster up and running, the better :)
What do you think?

stgraber (Member, Author) commented:

Great to see all the interest!

I think the first step is definitely to agree on minimal steps to get Linstor deployed and working on its own across 2-3 VMs. That way we can focus on the Incus integration with it.

We should try to aim for something pretty simple and lightweight that can be reviewed and merged pretty quickly; that way it's then easier for multiple people to contribute improvements and missing features on top of that afterwards.

luissimas (Contributor) commented:

@bensmrs that sounds great! @winiciusallan is working on automating a minimal Linstor deployment. We'll reuse this Ansible playbook for the deployment/configuration part; the only thing missing is the Terraform side to provision some VMs with Incus itself (like it's done with incus-deploy). We should get that done in the next few days.

I'll be working on the integration part-time at my job starting next week. Do you think we could discuss the next steps together at the start of next week?

> We should try to aim for something pretty simple and lightweight that can be reviewed and merged pretty quickly; that way it's then easier for multiple people to contribute improvements and missing features on top of that afterwards.

Agreed. We could discuss the basic shape of the integration together and then implement an MVP with the basic features. @stgraber, what set of features would be considered the bare minimum, so we can start parallelizing the work? I think the config options for connecting to the Linstor controller are an obvious candidate here, but I'm not sure how much of the Driver interface should be implemented in this first PR.

stgraber (Member, Author) commented:

So at the pool level, we obviously need to be able to create and delete pools.
Then for volumes, the bare minimum viable would be support for creating, renaming and deleting volumes of either block or filesystem type.

That would leave things like snapshots, volume resize, migration, backup, ... to be implemented through follow-up PRs. Though some of those (migration, backups, ...) have generic helpers for non-optimized code paths that should just work out of the box.
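
To make that scope concrete, here's a hypothetical skeleton of the MVP surface. It is deliberately not the actual Incus storage Driver interface (whose method set is larger and shaped differently); it's only a stand-in to show what the first PR would and wouldn't cover:

```go
// Hypothetical stand-in for the MVP driver surface; method names and
// signatures are simplified and do not match Incus's real interface.
package driver

// VolumeType distinguishes the two content types the MVP must support.
type VolumeType string

const (
	VolumeBlock      VolumeType = "block"      // raw block volume (VMs)
	VolumeFilesystem VolumeType = "filesystem" // mountable volume (containers)
)

// minimalLinstorDriver sketches the first-PR scope: pool create/delete
// plus volume create/rename/delete. Snapshots, resize, migration and
// backups would land in follow-up PRs (some via generic helpers).
type minimalLinstorDriver interface {
	// Pool lifecycle; a pool would map to a LINSTOR resource group.
	Create() error
	Delete() error

	// Volume lifecycle; a volume would map to a LINSTOR resource.
	CreateVolume(name string, sizeBytes int64, vtype VolumeType) error
	RenameVolume(oldName, newName string) error
	DeleteVolume(name string) error
}
```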

stgraber removed the Feature label on Jan 17, 2025
bensmrs (Contributor) commented Jan 17, 2025

> I'll be working on the integration part-time at my job starting next week. Do you think we could discuss the next steps together at the start of next week?

Sure! Feel free to ping me :)

luissimas (Contributor) commented:

While doing the Terraform side of things to provision the VMs for Linstor, @winiciusallan and I figured that it would make more sense to just use incus-deploy as a base. With lxc/incus-deploy#20 we now have a way to deploy a basic but functional Linstor cluster alongside Incus.

That should be enough for a development environment so we can start working on the integration itself.

winiciusallan (Contributor) commented Jan 18, 2025

Hi @bensmrs! I've sent you an invite on LinkedIn so we can discuss the teamwork and the decisions we've made so far.

bensmrs (Contributor) commented Jan 20, 2025

Oh sorry, that’s not a network I often check… See you there!

luissimas (Contributor) commented Jan 27, 2025

Hello folks! Just sharing a quick status update. I've been collaborating closely with @winiciusallan to get our MVP running as soon as possible. Here's a quick demo of creating a storage pool with the linstor driver. In this particular case, Incus creates a new resource group in Linstor. We can also specify a source parameter to make it use an existing resource group.

We'll now move on to the basics of volume creation. Once we get that working reasonably well, we'll open the initial PR with the basic implementation.

root@server01:~# incus config get storage.linstor.controller_connection
http://10.172.117.3:3370
root@server01:~# linstor rg l
╭──────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter  ┊ VlmNrs ┊ Description ┊
╞══════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2 ┊        ┊             ┊
╰──────────────────────────────────────────────────────╯
root@server01:~# incus storage create test linstor --target server01
Storage pool test pending on member server01
root@server01:~# incus storage create test linstor --target server02
Storage pool test pending on member server02
root@server01:~# incus storage create test linstor --target server03
Storage pool test pending on member server03
root@server01:~# incus storage create test linstor linstor.resource_group.storage_pool=incus linstor.resource_group.name=test-demo linstor.resource_group.place_count=3
Storage pool test created
root@server01:~# incus storage ls
+--------+------------+------------------------------------+---------+---------+
|  NAME  |   DRIVER   |            DESCRIPTION             | USED BY |  STATE  |
+--------+------------+------------------------------------+---------+---------+
| local  | zfs        | Local storage pool                 | 0       | CREATED |
+--------+------------+------------------------------------+---------+---------+
| shared | lvmcluster | Shared storage pool (cluster-wide) | 1       | CREATED |
+--------+------------+------------------------------------+---------+---------+
| test   | linstor    |                                    | 0       | CREATED |
+--------+------------+------------------------------------+---------+---------+
root@server01:~# linstor rg l
╭───────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter          ┊ VlmNrs ┊ Description                              ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2         ┊        ┊                                          ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ test-demo     ┊ PlaceCount: 3         ┊        ┊ Resource group used for Incus instances. ┊
┊               ┊ StoragePool(s): incus ┊        ┊                                          ┊
╰───────────────────────────────────────────────────────────────────────────────────────────╯
root@server01:~# incus storage show test
config:
  linstor.resource_group.name: test-demo
  linstor.resource_group.place_count: "3"
  linstor.resource_group.storage_pool: incus
  volatile.pool.pristine: "true"
description: ""
name: test
driver: linstor
used_by: []
status: Created
locations:
- server01
- server02
- server03
