
[docs-only] Dev docs about collaborative storage #9317

Merged
merged 11 commits into from
Jun 17, 2024

Conversation

dragotin
Contributor

@dragotin dragotin commented Jun 4, 2024

This documentation explains the technical background, design options and opportunities, and shows how to set things up at this stage.

Later, the setup section might be moved to another area of the docs.

@aduffeck @mmattel @hodyroff


update-docs bot commented Jun 4, 2024

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@dragotin dragotin changed the title Dev docs about collaborative storage [docs-only] Dev docs about collaborative storage Jun 4, 2024
docs/architecture/posixfs.md (outdated)

Posix FS is a backend component that manages files on the server utilizing a "real" file tree that represents the data with folders and files in the file system as users are used to it. That is the big difference compared to Decomposed FS which is the default storage driver in Infinite Scale.

This does not mean that Infinte Scale is trading any of it's benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.
Contributor

Suggested change
This does not mean that Infinte Scale is trading any of it's benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.
This does not mean that Infinite Scale is trading any of it's benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to caches and search indexes, and it also features the full spaces concept as before, just to name a few examples.



The architecture of Infinite Scale allows to configure different storage drivers for specific storage types and purposes on a space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.
Contributor

Suggested change
The architecture of Infinite Scale allows to configure different storage drivers for specific storage types and purposes on a space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.
The architecture of Infinite Scale allows configuring different storage drivers for specific storage types and purposes on a space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.



However, the clearance of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made aside of Infinite Scale are monitored and directly reflected in Infinite Scale.
Contributor

Suggested change
However, the clearance of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made aside of Infinite Scale are monitored and directly reflected in Infinite Scale.
However, the clarity of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made aside of Infinite Scale are monitored and directly reflected in Infinite Scale.

Maybe you mean something like "clarity"?



The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, it's API's or even directly in the underlying file system on the server.
Contributor

Suggested change
The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, it's API's or even directly in the underlying file system on the server.
For the first time ever with feature rich open source file sync & share systems, users can either choose to work with their data through the clients of the system, it's API's or even directly in the underlying file system on the server.


### Monitoring

To get information about changes such as new files added, files edited or removed, Infinte Sale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services as for example in Spectrum Scale (see [GPFS Specifics](#gpfs-specifics) for more details on GPFS file systems).
Contributor

@phil-davis phil-davis Jun 5, 2024

Suggested change
To get information about changes such as new files added, files edited or removed, Infinte Sale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services as for example in Spectrum Scale (see [GPFS Specifics](#gpfs-specifics) for more details on GPFS file systems).
To get information about changes such as new files added, files edited or removed, Infinite Scale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services as, for example, in Spectrum Scale (see [GPFS Specifics](#gpfs-specifics) for more details on GPFS file systems).


There's another typo here Infinite Sale


Based on the information transmitted by the watching service, Infinite Scale is able to "register" new or changed files into its own caches and internal management structures. This enables Infinite Scale to deliver resource changes through the "traditional" channels such as APIs and clients.

Since the most important metadata is the file tree structure itself, the "split brain" situation between data and metadata is impossible to cause trouble.
Contributor

Suggested change
Since the most important metadata is the file tree structure itself, the "split brain" situation between data and metadata is impossible to cause trouble.
Since the most important metadata is the file tree structure itself, it is impossible for the "split brain" situation between data and metadata to cause trouble.


### File Id Resolution

Infinite Scale uses an Id based approach to work with resources, rather than a file path based mechanism. The reason for that is that Id based lookups can be done way more efficient compared to tree traversals, just to name one reason.
Contributor

Suggested change
Infinite Scale uses an Id based approach to work with resources, rather than a file path based mechanism. The reason for that is that Id based lookups can be done way more efficient compared to tree traversals, just to name one reason.
Infinite Scale uses an Id based approach to work with resources, rather than a file path based mechanism. The reason for that is that Id based lookups can be done way more efficiently compared to tree traversals, just to name one reason.


The tech preview comes with the following limitations:

1. User Management: Manipulations in the file system have to be done by the same user that runs Infinte Scale
Contributor

Suggested change
1. User Management: Manipulations in the file system have to be done by the same user that runs Infinte Scale
1. User Management: Manipulations in the file system have to be done by the same user that runs Infinite Scale

1. There must be storage available to store meta data and blobs, available under a root path
2. When using inotify, the storage must be local on the same machine. Network mounts do not work with Inotify
3. The storage root path must be writeable and executable by the same user Infinite Scale is running under
4. An appropiate version of Infinte Scale is installed, version number 5.0.5 and later
Contributor

Suggested change
4. An appropiate version of Infinte Scale is installed, version number 5.0.5 and later
4. An appropiate version of Infinite Scale is installed, version number 5.0.5 and later
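The requirements above could translate into an environment sketch like the following. The variable names are the ones mentioned elsewhere in this PR discussion; the values (driver name, paths, watch type) are illustrative assumptions, not documented defaults.

```shell
# Hypothetical Posix FS tech-preview configuration (values are assumptions).
export STORAGE_USERS_DRIVER=posix                          # select the Posix FS driver
export STORAGE_USERS_POSIX_ROOT=/var/lib/ocis/storage      # writable root for data, metadata and blobs
export STORAGE_USERS_POSIX_WATCH_TYPE=inotifywait          # local file systems only; no network mounts
export STORAGE_USERS_POSIX_WATCH_PATH=/var/lib/ocis/storage
```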

Collaborator

@kobergj kobergj left a comment

Looks good 👍 Just some nagging as always...


Posix FS is a backend component that manages files on the server utilizing a "real" file tree that represents the data with folders and files in the file system as users are used to it. That is the big difference compared to Decomposed FS which is the default storage driver in Infinite Scale.

This does not mean that Infinte Scale is trading any of it's benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.
Collaborator

Suggested change
This does not mean that Infinte Scale is trading any of it's benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.
This does not mean that Infinte Scale is trading any of its benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.


However, the clearance of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made aside of Infinite Scale are monitored and directly reflected in Infinite Scale.

The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, it's API's or even directly in the underlying file system on the server.
Collaborator

Suggested change
The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, it's API's or even directly in the underlying file system on the server.
The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, its APIs or even directly in the underlying file system on the server.



That is another powerful vector for integration and enables a new spectrum of use cases across all domains. Just imagine how many software can write files, and can now directly make them accessible real time in a convenient, secure and efficient way.
Collaborator

Just imagine how many software can write files, and can now directly make them accessible real time in a convenient, secure and efficient way.

This I don't understand. Are you talking about other competitors who can't do this?


## Advanced Features

Depending on the capabilities of the underlying file system, the Infinite Scale PosixFS can benefit from more advanced funcitonality described here.
Collaborator

Suggested change
Depending on the capabilities of the underlying file system, the Infinite Scale PosixFS can benefit from more advanced funcitonality described here.
Depending on the capabilities of the underlying file system, the Infinite Scale PosixFS can benefit from more advanced functionality described here.

I knew I could find a typo if I only looked hard enough!

2. When using inotify, the storage must be local on the same machine. Network mounts do not work with Inotify
3. The storage root path must be writeable and executable by the same user Infinite Scale is running under
4. An appropiate version of Infinte Scale is installed, version number 5.0.5 and later
5. Either redis or nats-js-kv cache service
Collaborator

Please don't mention redis. It is not working properly. We want to use nats-js-kv as the only suggested distributed cache solution.

Collaborator

Why did you commit this?

Collaborator

Same question: Was this committed intentionally?


{{< toc >}}

Posix FS is the working name for the collaborative storage driver for Infinite Scale.
Contributor

@mmattel mmattel Jun 5, 2024

As a personal opinion, I am not happy with the name Posix FS.
This is imho misleading, as we also have POSIX for metadata, and this will cause confusion.
I know that we recently added a new service named collaboration, but thinking about it, we could use Collaborative FS as an alternative for Posix FS to make things more clear and distinctive.

Note that the envvar names should then also be corrected (hopefully I caught them all):
STORAGE_USERS_POSIX_ROOT --> STORAGE_USERS_COLLABORATIVE_FS_ROOT
STORAGE_USERS_DRIVER --> collaborative as value
STORAGE_USERS_POSIX_WATCH_TYPE --> STORAGE_USERS_COLLABORATIVE_FS_WATCH_TYPE
STORAGE_USERS_POSIX_WATCH_PATH --> STORAGE_USERS_COLLABORATIVE_FS_WATCH_PATH

Contributor

POSIX is IMHO the correct technical term. It makes the difference between Decomposed vs. POSIX


## Introduction

Posix FS is a backend component that manages files on the server utilizing a "real" file tree that represents the data with folders and files in the file system as users are used to it. That is the big difference compared to Decomposed FS which is the default storage driver in Infinite Scale.
Contributor

Looking into the dev docs and searching for decomposed, I only find Decomposed FS on NFS. We should add a new documentation about Decomposed FS, what it is about and link that document here for reference so one can compare. This is especially true as Decomposed FS uses POSIX (eg NFS) for metadata stored via messagepack and this new FS driver uses XATTRs as part of the files stored.

Contributor

To clarify: Decomposed FS is "decomposed".

A POSIX compatible filesystem has some requirements. https://www.quobyte.com/storage-explained/posix-filesystem/

POSIX is not NFS.


The architecture of Infinite Scale allows to configure different storage drivers for specific storage types and purposes on a space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.

However, the clearance of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made aside of Infinite Scale are monitored and directly reflected in Infinite Scale.
Contributor

@mmattel mmattel Jun 5, 2024

Maybe add an example use case to make things clear:
This new technology allows users to manipulate the data directly in the file system, e.g. when using a scanner that stores its output on the filesystem.
I know that there are many other use cases, but such an example makes the idea clear right at the beginning without counteracting the Decomposed FS.

Contributor Author

added a sentence with a real life example (scanner)


The Posix FS technology uses a few features of the underlying file system, which are mandatory and directly contributing to the performance of the system.

While the simplest form of Posix FS runs with default file systems of every modern Linux system, the full power of this unfolds with more capable file systems such as IBM Storage Scale or Ceph. These are recommended as reliable foundations for big installations of Infinite Scale.
Contributor

Do "default filesystems" include mounted shared file systems like NFS or SMB?
If yes, this should be added.
I remember that NFS needs a minimum version (v4.1 or 4.2?) to support xattrs, which some time ago was one important reason to move to messagepack. If a minimum version of a mounted FS is still required, we should add a note about that. Otherwise one might think NFSv3 is ok to use. No need to go into detail, but a note about minimum versions would be necessary - also for us...

Contributor Author

Yes, thanks. It only works with local mounted fs - made that more clear in the text.


### Automatic ETag Propagation

The ETag of a resource can be understood as a content fingerprint of any file- or folder resource in Infinite Scale. It is mainly used by clients to detect changes of resources. The rule is that if the content of a file changed the ETag has to change as well, as well as the ETag of all parent folders up to the root of the space.
Contributor

Suggested change
The ETag of a resource can be understood as a content fingerprint of any file- or folder resource in Infinite Scale. It is mainly used by clients to detect changes of resources. The rule is that if the content of a file changed the ETag has to change as well, as well as the ETag of all parent folders up to the root of the space.
The ETag of a resource can be understood as a content fingerprint of any file- or folder resource in Infinite Scale. It is mainly used by clients to detect changes of resources. The rule is, that if the content of a file changed, the ETag has to change as well, as well as the ETag of all parent folders up to the root of the space.


Infinite Scale uses a built in mechanism to maintain the ETag for each resource in the file meta data, and also propagates it automatically.

In the future a sophisticated underlying file system could provide an attribute that fulfills this requirement and changes whenever content or metadata of a resource changes, and - which is most important - also changes the attribute of the parent resource and the parent of the parent etc.
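The propagation rule described here (a changed file gets a new ETag, and so does every ancestor folder up to the space root) can be sketched in a few lines. This is an invented illustration, not the driver's metadata code; `propagate_etag` and its plain dict store are assumptions made for the sketch.

```python
import hashlib
import os

def propagate_etag(etags, path, root):
    """Give `path` a fresh ETag and propagate the change up to `root`.

    `etags` is a plain dict standing in for the per-resource metadata.
    """
    node = path
    while True:
        old = etags.get(node, "")
        # Derive the new fingerprint from the previous one, so any
        # change below guarantees a different ETag on every ancestor.
        etags[node] = hashlib.sha1((old + node).encode()).hexdigest()[:16]
        parent = os.path.dirname(node)
        if node == root or parent == node:
            break
        node = parent
    return etags
```

After a change to `/space/docs/a.txt`, the ETags of `/space/docs` and `/space` change as well, which is exactly what lets clients detect changes by checking the root ETag first.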
Contributor

Suggested change
In the future a sophisticated underlying file system could provide an attribute that fulfills this requirement and changes whenever content or metadata of a resource changes, and - which is most important - also changes the attribute of the parent resource and the parent of the parent etc.
In the future, a sophisticated underlying file system could provide an attribute that fulfills this requirement and changes whenever content or metadata of a resource changes, and - which is most important - also changes the attribute of the parent resource and the parent of the parent etc.


Similar to the ETag propagation described before, Infinite Scale also tracks the accumulated tree size in all nodes of the file tree. A change to any file requires a re-calculation of the size attribute in all parent folders.

In the future Infinite Scale could benefit from file systems with native tree size propagation.
Contributor

Suggested change
In the future Infinite Scale could benefit from file systems with native tree size propagation.
In the future, Infinite Scale could benefit from file systems with native tree size propagation.


Other systems store quota data in the metadata storage and implement propagation of used quota similar to the tree size propagation.
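Tree size propagation follows the same upward pattern as the ETag propagation: the size delta of a file change is added to every parent folder. A minimal sketch, with `propagate_size` and the dict store invented for illustration:

```python
import os

def propagate_size(sizes, path, new_size, root):
    """Update the accumulated tree size of all ancestors of `path`.

    `sizes` maps paths to accumulated byte counts; only the delta of
    the change is applied on the way up, no re-scan of the tree.
    """
    delta = new_size - sizes.get(path, 0)
    sizes[path] = new_size
    node = os.path.dirname(path)
    while True:
        sizes[node] = sizes.get(node, 0) + delta
        parent = os.path.dirname(node)
        if node == root or parent == node:
            break
        node = parent
    return sizes
```

Quota accounting, as mentioned above, could reuse the same delta propagation.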

### File Id Resolution
Contributor

@mmattel mmattel Jun 5, 2024

Not sure, but should we write File ID Resolution (ID in capital letters, consistently in all occurrences)?

Contributor Author

ok


### User Management

With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is an important question.
Contributor

Suggested change
With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is an important question.
With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is important.

Contributor Author

👍


There are a few possible ways for user management:
1. Changes can either be only accepted by the same user that Infinite Scale is running under, for example the user `ocis`. All manipulations in the filesystem have to be done by, and only by, this user.
Contributor

Suggested change
1. Changes can either be only accepted by the same user that Infinite Scale is running under, for example the user `ocis`. All manipulations in the filesystem have to be done by, and only by, this user.
1. Changes can either be only accepted by the same user that Infinite Scale is running under, for example the user `ocis`. All manipulations in the filesystem have to be done by, and only by this user.

Contributor Author

👍


2. Group based: All users who should be able to manipulate files have to be in a unix group. The Infinite Scale user has also to be in there. The default umask in the directory used has to allow group writing all over the place.
Contributor

Suggested change
2. Group based: All users who should be able to manipulate files have to be in a unix group. The Infinite Scale user has also to be in there. The default umask in the directory used has to allow group writing all over the place.
2. Group based: All users who should be able to manipulate files have to be in a unix group. The Infinite Scale user has also to be member of that group. The default umask in the directory used has to allow group writing all over the place.

Contributor Author

👍
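The group based approach discussed here could look roughly like the following on a typical Linux host. The group name, user names and storage path are made-up examples, not documented defaults.

```shell
# Hypothetical group-based setup: all users manipulating files, plus the
# user running Infinite Scale, share one unix group with write access.

sudo groupadd ocis-data                    # shared group for direct FS access
sudo usermod -aG ocis-data ocis            # the Infinite Scale service user
sudo usermod -aG ocis-data alice           # a human user working in the tree

sudo mkdir -p /var/lib/ocis/storage
sudo chgrp -R ocis-data /var/lib/ocis/storage
sudo chmod -R 2775 /var/lib/ocis/storage   # setgid bit: new files inherit the group
```

A matching umask (e.g. 002) for the involved users keeps newly created files group-writable throughout the tree.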

Comment on lines 118 to 120
If the underlying file system is able to create versions of single resources (imagine a git based file system) this functionality could directly be used by Infinite Scale.

In the current state of the PosixFS, versioning is not supported.
Contributor

Suggested change
If the underlying file system is able to create versions of single resources (imagine a git based file system) this functionality could directly be used by Infinite Scale.
In the current state of the PosixFS, versioning is not supported.
In the current state of the PosixFS, versioning is not supported.
If the underlying file system is able to create versions of single resources (imagine a git based file system) this functionality could directly be used by Infinite Scale provided in a later version.

Switching the order. imho no candy and then currently not possible...

Contributor Author

hm, I have the same sequence in other locations, so I'd like to stay with it.

Comment on lines 124 to 128
If the underlying file system handles deleted files in a trash bin that allows restoring of previously removed files, this functionality could directly be used by Infinite Scale.

If not available it will follow the [the Free Desktop Trash specificaton](https://specifications.freedesktop.org/trash-spec/trashspec-latest.html).

In the current state of the PosixFS, trash bin is not supported.
Contributor

Suggested change
If the underlying file system handles deleted files in a trash bin that allows restoring of previously removed files, this functionality could directly be used by Infinite Scale.
If not available it will follow the [the Free Desktop Trash specificaton](https://specifications.freedesktop.org/trash-spec/trashspec-latest.html).
In the current state of the PosixFS, trash bin is not supported.
In the current state of the PosixFS, trash bin is not supported.
Infinite Scale could, if provided by a later version:
- If the underlying file system handles deleted files in a trash bin that allows restoring of previously removed files, this functionality could directly be used by Infinite Scale.
- If not available it will follow the [Free Desktop Trash specificaton](https://specifications.freedesktop.org/trash-spec/trashspec-latest.html).

Contributor Author

same here


## Limitations

As of Q2/2024 the PosixFS is in technical preview state which means that it is not officially supported.
Contributor

Suggested change
As of Q2/2024 the PosixFS is in technical preview state which means that it is not officially supported.
As of Q2/2024 the PosixFS is not officially supported and in technical preview state.

@tbsbdr

tbsbdr commented Jun 12, 2024

Naming proposal for this feature:

  • Joint Access Storage Driver

@mmattel
Contributor

mmattel commented Jun 13, 2024

@dragotin pls exclude changes in env_vars.yaml and enxtended_vars.yaml.
I need to update the files in a separate PR which needs some care, also see our documentation for this process.

Note that the files always get updated when running make docs-generate manually...

@dragotin dragotin requested a review from kobergj June 13, 2024 15:08

While the simplest form of the Joint Access Storage Driver runs with the default file systems of every modern Linux system, which are directly mounted and thus support inotify, its full power unfolds with more capable file systems such as IBM Storage Scale or Ceph. These are recommended as reliable foundations for large installations of Infinite Scale.

This chapter describes some technical aspects of the storage driver.
Suggested change
This chapter describes some technical aspects of the storage driver.
This chapter describes some technical aspects of this storage driver.


### File ID Resolution

Infinite Scale uses an ID based approach to work with resources, rather than a file path based mechanism. The reason for that is that ID based lookups can be done way more efficiently compared to tree traversals, just to name one reason.
Suggested change
Infinite Scale uses an ID based approach to work with resources, rather than a file path based mechanism. The reason for that is that ID based lookups can be done way more efficiently compared to tree traversals, just to name one reason.
Infinite Scale uses an ID based approach to work with resources, rather than a file path based mechanism. The reason for that is, that ID based lookups can be done way more efficiently compared to tree traversals, just to name one reason.



The most important component of the ID is a unique file ID that identifies the resource within a space. IDeally the Inode of a file could be used here. However, some file systems re-use inodes which must be avoided. Infinite Scale thus does not use the file Inode, but generates a UUID instead.
Suggested change
The most important component of the ID is a unique file ID that identifies the resource within a space. IDeally the Inode of a file could be used here. However, some file systems re-use inodes which must be avoided. Infinite Scale thus does not use the file Inode, but generates a UUID instead.
The most important component of the ID is a unique file ID that identifies the resource within a space. Ideally, the Inode of a file could be used here. However, some file systems re-use inodes which must be avoided. Infinite Scale thus does not use the file Inode, but generates a UUID instead.
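The idea can be illustrated with a toy resolver (illustrative only, not the actual driver code): each resource gets a generated UUID the first time it is seen, and both directions of the mapping are kept so that an ID lookup needs no tree traversal.

```python
import uuid

class IdResolver:
    """Toy illustration: map each resource to a stable, generated UUID
    instead of relying on the file system inode, which some file
    systems re-use."""

    def __init__(self):
        self._by_path = {}  # path -> file ID
        self._by_id = {}    # file ID -> path

    def id_for(self, path):
        # Assign a UUID on first sight; later lookups return the same ID.
        if path not in self._by_path:
            fid = str(uuid.uuid4())
            self._by_path[path] = fid
            self._by_id[fid] = path
        return self._by_path[path]

    def path_for(self, fid):
        # ID based lookup: a single dictionary access, no tree traversal.
        return self._by_id[fid]
```

In the real driver the mapping naturally lives in persistent metadata and a cache rather than in-memory dictionaries; the point here is only the indirection from path to a non-reusable ID.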


### User Management

With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is important.
Suggested change
With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is important.
With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which UID the manipulation happens is important.


Above all, it seems reasonable to use LDAP to manage users, which is the basis for the Infinite Scale IDP as well as for the system login via PAM.
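One consequence of sharing the user base: numeric UIDs must resolve to the same accounts on both access paths. On a POSIX system that resolution goes through NSS, so users provided by LDAP are covered just like local ones. A small illustrative lookup (not part of Infinite Scale):

```python
import os
import pwd

def resolve_uid(uid):
    """Map a numeric UID to an account via pwd, which consults NSS --
    so users coming from LDAP are found the same way as local users."""
    entry = pwd.getpwuid(uid)
    return entry.pw_name, entry.pw_uid, entry.pw_gid

# e.g. the account the Infinite Scale process itself runs under:
# resolve_uid(os.getuid())
```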

### GID based space access
Suggested change
### GID based space access
### GID Based Space Access
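The heading suggests a simple model: each space is backed by a POSIX group, and a user can enter a space when the space's GID is among the user's group IDs. A toy sketch of that idea (the space names and GIDs below are made up for illustration):

```python
# Hypothetical space -> GID assignment; in a real setup these would be
# POSIX groups on the storage root, one group per space.
SPACE_GIDS = {"marketing": 2001, "engineering": 2002}

def accessible_spaces(user_gids):
    """Return the spaces whose GID is among the user's group IDs.

    In practice user_gids would come from os.getgroups() or from the
    user's LDAP entry.
    """
    return sorted(name for name, gid in SPACE_GIDS.items() if gid in user_gids)
```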

Comment on lines 156 to 160
1. There must be storage available to store meta data and blobs, available under a root path
1. When using inotify, the storage must be local on the same machine. Network mounts do not work with inotify. `inotifywait` needs to be installed.
1. The storage root path must be writeable and executable by the same user Infinite Scale is running under
1. An appropiate version of Infinite Scale is installed, version number 5.0.5 and later
1. Nats-js-kv as cache service
Suggested change
1. There must be storage available to store meta data and blobs, available under a root path
1. When using inotify, the storage must be local on the same machine. Network mounts do not work with inotify. `inotifywait` needs to be installed.
1. The storage root path must be writeable and executable by the same user Infinite Scale is running under
1. An appropiate version of Infinite Scale is installed, version number 5.0.5 and later
1. Nats-js-kv as cache service
1. There must be storage available to store meta data and blobs, available under a root path.
2. When using inotify, the storage must be local on the same machine. Network mounts do not work with inotify. `inotifywait` needs to be installed.
3. The storage root path must be writeable and executable by the same user Infinite Scale is running under.
4. An appropriate version of Infinite Scale is installed, version number 5.0.5 and later.
5. `nats-js-kv` as cache service.
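The prerequisites above can be sanity-checked with a few lines. This is only a convenience sketch mirroring the list, not an official checker:

```python
import os
import shutil

def check_posixfs_prereqs(root):
    """Return a list of human-readable problems; an empty list means the
    basic prerequisites from the list above look satisfied."""
    problems = []
    if not os.path.isdir(root):
        problems.append(f"storage root {root} does not exist")
    elif not os.access(root, os.W_OK | os.X_OK):
        problems.append(f"storage root {root} is not writable/executable")
    if shutil.which("inotifywait") is None:
        problems.append("inotifywait not found (needed for the inotify watcher)")
    return problems
```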

Comment on lines +168 to +172
```shell
export STORAGE_USERS_DRIVER="posix"
export STORAGE_USERS_POSIX_ROOT="/home/kf/tmp/posix-storage"
export STORAGE_USERS_POSIX_WATCH_TYPE="inotifywait"
export STORAGE_USERS_ID_CACHE_STORE="nats-js-kv"
export STORAGE_USERS_ID_CACHE_STORE_NODES="localhost:9233"
```
Wouldn't it be good, to be in line with the name of the driver, to have all envvars respectively key-values use the driver name? Like STORAGE_USERS_POSIX_ROOT --> STORAGE_USERS_JASD_ROOT. This would directly match its purpose and would match the name when documenting. Readers immediately have an identification. Joint Access Storage Driver and POSIX do not match when reading...


Yes, I agree. That requires code changes, and once these are done, we will bring the changes to this doc in the same PR.

mmattel commented Jun 14, 2024

@dragotin as discussed, I have updated doc relevant files that get possibly changed when running make docs-generate, see #9383 ([docs-only] [chore] Update helper generated envvar yamls).

When this one is merged, do the following:

  1. Rebase your PR
  2. Rerun make -C docs docs-generate from the ocis root
  3. Commit and push the changes (the two envvar files in this PR will go away by the steps taken)

mmattel commented Jun 14, 2024

@dragotin there is still an open suggestion (gpfswatchfolder) which most likely was missed when committing 🤣

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

@dragotin dragotin merged commit aaea16d into master Jun 17, 2024
2 checks passed
@phil-davis phil-davis deleted the doc-posixfs branch June 17, 2024 23:19