Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

control/discovery: add discovery controller #127

Merged
merged 10 commits into from
Sep 11, 2023

Conversation

CongMinYin
Copy link
Contributor

@CongMinYin CongMinYin commented May 9, 2023

The discovery contrller implement the basic function. Use command "python3 -m control.discovery" to start discovery controller. Client can use command "nvme discover -t tcp -a ip -s port" to get log pages.

The configuration is in ceph-nvmeof.conf [discovery] part.

Fixes: #108

@CongMinYin CongMinYin changed the title control/discovery: add discovery controller [RFC]control/discovery: add discovery controller May 9, 2023
@CongMinYin
Copy link
Contributor Author

Hi @optimistyzy, please help review. I have left some question that need to be discussed with you. I added 'Q: ' in comments, about version and IO command set.

cc @sdpeters @PepperJo

ceph-nvmeof.conf Outdated Show resolved Hide resolved
@CongMinYin
Copy link
Contributor Author

Is the review of the discovery controller still ongoing?

@sdpeters
Copy link
Contributor

sdpeters commented May 23, 2023 via email

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks overall good @CongMinYin, but I'm a bit concerned about maintainability/debuggability of this, since in my opinion this doesn't look not much more readable than a C implementation, as it's not fully leveraging Python syntactic sugar & libraries.

Additionally, networking code is usually the most vulnerable vector attack as it involves lots of parsing and branching that can potentially lead to exploits (e.g.: DDoS to the underlying Ceph cluster by performing mass queries).

In that sense, we should perhaps add some local TTL cache to the omap object in order to avoid that every discovery query results in a Ceph read.

control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
ceph-nvmeof.conf Outdated Show resolved Hide resolved
control/discovery.py Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from a862a99 to a18bb9c Compare June 1, 2023 06:52
@CongMinYin CongMinYin changed the title [RFC]control/discovery: add discovery controller control/discovery: add discovery controller Jun 1, 2023
@CongMinYin
Copy link
Contributor Author

CongMinYin commented Jun 1, 2023

Hi @epuertat , Thank you for your feedback and guidance. To be honest, I haven't written much Python code before. I made some simple modifications. Many places may still require more detailed discussions to arrive at a definitive decision. Perhaps we can incorporate some feature enhancements into subsequent patches, such as local TTL cache. Regarding the connection, there are some subsequent commitments to enhance it. I will discuss them under the corresponding comments.
We can write down TODO issues and enhanced discussions here #108

@idryomov
Copy link
Contributor

idryomov commented Jun 1, 2023

Perhaps we can incorporate some feature enhancements into subsequent patches, such as local TTL cache.

Caching the gateway group config state locally can definitely be deferred. It may turn out to be more trouble than it's worth even in the long run.

@PepperJo
Copy link

PepperJo commented Jun 1, 2023

Caching the gateway group config state locally can definitely be deferred. It may turn out to be more trouble than it's worth even in the long run.

Why don't we reuse the same mechanism we use for the GWs in a group to keep them up-to-date, i.e. watch/notify and polling. This way if there are no changes we don't read the OMAP over and over again. I believe all of the update code could be reused with little modification.

control/discovery.py Outdated Show resolved Hide resolved
@epuertat epuertat linked an issue Jun 23, 2023 that may be closed by this pull request
@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from a18bb9c to f37a278 Compare July 4, 2023 08:35
@CongMinYin
Copy link
Contributor Author

CongMinYin commented Jul 4, 2023

Hi all, sorry for the late reply. I have spent a lot of time to made significant changes recently.

  1. The basic data structure has been changed to ctypes.
  2. if elif elif are changed to function pointer. I will take a look at this comment later, thanks @kasserater.
  3. Fixed an bug caused by multiple network packets arriving together. In some cases, async network packets may stick together with other network packets.
  4. Introduce the selectors library and use the epoll model connection instead.
  5. Added support for persistent connections, command is nvme discover -t tcp -a 10.239.241.67 -s 4420 -p -k 15. In nvme-cli version 1.x, the - p parameter is required. In version 2.x, this parameter has been removed and only - k is required.
  6. For persistent connections, new features such as async, notify, keep alive, timeout have been added.

For some other things, I think it may be necessary to add some content for the discovery controller in README. It is better to change nqn in the listener to subsystem_nqn to avoid confusion

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from f37a278 to f440704 Compare July 5, 2023 08:36
@CongMinYin
Copy link
Contributor Author

Sorry, yesterday's push was not the latest version. Now I am re pushing, which includes local cache content.

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from f440704 to f274bbf Compare July 6, 2023 10:20
@CongMinYin
Copy link
Contributor Author

Added GET_FEATURE function, which is optional in spec(for persistent connection). I added after trying the discover command in ESXi 8.0. Currently, I have tried nvme-cli 1.x, nvme-cli 2.x, and ESXi8.0. Currently, the support for persistent connections is not very comprehensive, and more initiators for persistent connections need to be tried to improve

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from f274bbf to e596bde Compare July 10, 2023 09:45
@CongMinYin
Copy link
Contributor Author

Made a small change. When trying to discover persistent connections in ESXi 7.0, I found a timeout bug and reluctantly added a lock, which may have an impact on performance. Fortunately, selectors are not a multi threaded concurrent mode.

@CongMinYin
Copy link
Contributor Author

Hi, @epuertat , please take a look.

@caroav caroav added the squid label Jul 27, 2023
@CongMinYin
Copy link
Contributor Author

ping

@epuertat
Copy link
Member

epuertat commented Aug 1, 2023

ping

Sorry @CongMinYin , I completely missed this one 😅 . Will have a look now ;-)

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactoring @CongMinYin. As long as this works as expected, I'm fine with this! 👍🏼

My comments just focused on maintenance and debuggability of this code. ctypes might do a good job hiding byte-alignment complexities....

I still think that it'd be a good idea to use some higher-level Python TCP server off the shelf.

control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
control/discovery.py Outdated Show resolved Hide resolved
@CongMinYin
Copy link
Contributor Author

CongMinYin commented Aug 31, 2023

@baum Thanks for these changes. both 3e6a162 and 77d973b are OK. The use of tuples avoids type conversion errors, and the use of the enable field removes the flag, which is great.

As I explained earlier, the solution for this commit e60eb54 is incorrect. We must obtain a certain length and write the content of the response. Unknown variable lengths are dangerous.

Perhaps I didn't explain it clearly. Please obtain the nvme version in your error state. I tried 1.16, but its request will not cause the current discovery control to error. Please remove the commit (e60eb54) and print the version before making an error. I need to determine which version of the request is different from 1.9 and 2.3. After testing, 1.16 is the same and will not make any errors.

After solving this issue, I will make a check-pick. Thanks.

@baum
Copy link
Collaborator

baum commented Aug 31, 2023

As I explained earlier, the solution for this commit e60eb54 is incorrect. We must obtain a certain length and write the content of the response. Unknown variable lengths are dangerous.

Perhaps I didn't explain it clearly. Please obtain the nvme version in your error state. I tried 1.16, but its request will not cause the current discovery control to error. Please remove the commit (e60eb54) and print the version before making an error. I need to determine which version of the request is different from 1.9 and 2.3. After testing, 1.16 is the same and will not make any errors.

After solving this issue, I will make a check-pick. Thanks.

@CongMinYin 🖖

I could do that test. I am not sure though about Please obtain the nvme version in your error state. Could you elaborate?

Thank you!

@CongMinYin
Copy link
Contributor Author

I could do that test. I am not sure though about Please obtain the nvme version in your error state. Could you elaborate?

Thank you!

#127 (comment) You said you encountered a length error, and the requested log page length is not known as 16 or 1024. Please confirm the version of the nvme client that issued this request.

@baum
Copy link
Collaborator

baum commented Sep 3, 2023

#127 (comment) You said you encountered a length error, and the requested log page length is not known as 16 or 1024. Please confirm the version of the nvme client that issued this request.

@CongMinYin 🖖

The length error #127 (comment) is generated using SPDK/bdevperf SPDK_VERSION="23.01.1" as nvme client. SPDK is a cool project initiated in Intel. SPDK is implemented in user space and does not utilize Linux nvme cli or kernel module. You could reproduce this test, following discovery test steps of CI example run. Added network traffic capture of the test capture.zip. Let me know if you need any help there or additional information is required.

We must obtain a certain length and write the content of the response. Unknown variable lengths are dangerous.

Agree on that. Added length validation, according to spec. See the amended patch (branch)

SHA Description Comment
284cbb3 get log page: spdk/bdevperf compatibility Fixes get_log_page with SPDK/bdevperf 23.01.1

I brushed trough NVME specs, and checked Linux kernel discovery implementation. The data length validation should check that data length defined by SGL equals to NUMD, see Linux kernel pointers:

  • check transfer len See line 999 'len != req->transfer_len)`
  • len corresponds to nvme_data_len, aka NUMD calculated by Linux kernel here
  • transfer_len corresponds to nvme_sgl_len, calculated by Linux kernel here, line 398

Please share your thoughts.

Copy link
Collaborator

@baum baum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CongMinYin Great job 👍

@CongMinYin
Copy link
Contributor Author

CongMinYin commented Sep 5, 2023

@baum Sorry for the late reply. I originally attempted to replicate the bdevperf scene locally. But I encountered some difficulties when building the environment. It doesn't matter, I reviewed the code implementation of spdk. I think now your changes have no issues. The arrangement of its log page reply content is fixed, and a portion is truncated based on the length of the request to reply. Refer to:

struct spdk_nvmf_discovery_log_page {
	uint64_t	genctr;
	uint64_t	numrec;
	uint16_t	recfmt;
	uint8_t		reserved0[1006];
	struct spdk_nvmf_discovery_log_page_entry entries[0];
};
nvmf_get_discovery_log_page(...) 
nvmf_generate_discovery_log(...)

What I was originally worried about was that the content of the response to nvme-cli requests was different for different versions, and I was overly concerned. But you misunderstood that I wanted you to check nvme_data_len(from nvme_get_logpage_numd) and nvme_sgl_len. But being cautious doesn't matter.

I am preparing for cherry-pick, but I have noticed that your based commit(c17c0c3) is somewhat different from the current 3b6b2c6. And there are some conflicts. I am checking on the situation. Also, you previously changed property_data from str to tuple, and there are several other values that need to be changed to tuple. And I want to merge part of code in reply_get_log_page():"
elif nvme_data_len == 1024 and nvme_logpage_offset == 0 this code block can be merged toif nvme_data_len <= 1024

I believe these will be done soon.

@CongMinYin
Copy link
Contributor Author

Hi @baum, coul you please rebase your branch based on the latest devel. There are some conflicts.

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from 3b6b2c6 to 9d553f9 Compare September 5, 2023 09:00
@CongMinYin
Copy link
Contributor Author

I rebase to the latest version and add a very simple readme section. Do I need to add this in this PR and what improvements are needed? This command python3 -m control.discovery is only for my own testing and I don't know how to write it in the container environment,

@baum
Copy link
Collaborator

baum commented Sep 5, 2023

@CongMinYin 🖖

Hi @baum, coul you please rebase your branch based on the latest devel. There are some conflicts.

Rebased the branch

I rebase to the latest version and add a very simple readme section. Do I need to add this in this PR and what improvements are needed? This command python3 -m control.discovery is only for my own testing and I don't know how to write it in the container environment,

Added commit with a small description of running discovery as container 2ac7eaf

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from 9d553f9 to d986822 Compare September 6, 2023 08:56
CongMinYin and others added 7 commits September 6, 2023 16:59
The discovery contrller implement the basic function.
Use command "python3 -m control.discovery" to start
discovery controller. Client can use command
"nvme discover -t tcp -a ip -s port" to get log pages.

The configuration is in ceph-nvmeof.conf [discovery] part.

feature: ceph#108

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
add the method of starting discovery service

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
  - container
  - CI test

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
@baum baum force-pushed the wip-discovery-controller branch from d986822 to d0cd63b Compare September 6, 2023 14:01
@baum
Copy link
Collaborator

baum commented Sep 6, 2023

@CongMinYin 🖖 rebased the branch, tests are green, you could merge it. Thank you!

@CongMinYin CongMinYin force-pushed the wip-discovery-controller branch from d0cd63b to d65c6b2 Compare September 7, 2023 02:49
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
@CongMinYin
Copy link
Contributor Author

@baum Thank you! I made a minor change by modifying the README number.

@CongMinYin
Copy link
Contributor Author

Could we merge this PR?

@caroav
Copy link
Collaborator

caroav commented Sep 11, 2023

@CongMinYin yes from my perspective.

@baum baum merged commit 9412c87 into ceph:devel Sep 11, 2023
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in my environment the container name is "ceph-nvmeof-nvmeof-1" and not "ceph-nvmeof_nvmeof_1". Using dashes and not underscores. So, when running "make demo" I get an error from the create_listener command as the gateway name is blank.

Also, notice that the default container name uses the directory from which the "docker-compose up" was run. Shen I use my forked repo, which is in a different subdir than "ceph-nvmeof" I get a different name. This can be fixed by passing "--project-name ceph-nvmeof" to the "docker-compose up" line. Or, even better, by adding "COMPOSE_PROJECT_NAME="ceph-nvmeof" to the .env file. When this is done, no matter which directory was used the container name will be "ceph-nvmeof-nvmeof-1".

Last issue, notice that in case scale=2 is used and we start two instances of the gateway, the CLI commands of "make demo" work on the second gateway. So, when we get the id for "ceph-nvmeof-nvmeof-1" and pass it to the create_listener command we get an error about the wrong gateway being used. In such a case we should use a container name of "ceph-nvmeof-nvmeof-2".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

implement discovery service using python
10 participants