support large scale for ads mode #610

Merged 22 commits on Sep 27, 2024

432 changes: 388 additions & 44 deletions bpf/deserialization_to_bpf_map/deserialization_to_bpf_map.c

Large diffs are not rendered by default.

18 changes: 16 additions & 2 deletions bpf/deserialization_to_bpf_map/deserialization_to_bpf_map.h
@@ -4,8 +4,21 @@
 #ifndef __DESERIALIZATION_TO_BPF_MAP_H__
 #define __DESERIALIZATION_TO_BPF_MAP_H__

+#include <stdbool.h>
+
 /* equal MAP_SIZE_OF_OUTTER_MAP */
-#define MAX_OUTTER_MAP_ENTRIES (8192)
+#define MAX_OUTTER_MAP_ENTRIES (1 << 20)
+#define OUTTER_MAP_USAGE_HIGH_PERCENT (0.7)
+#define OUTTER_MAP_USAGE_LOW_PERCENT (0.3)
+#define TASK_SIZE (512)
+
+// 32,768
+#define OUTTER_MAP_SCALEUP_STEP (1 << 15)
+// 8,192
+#define OUTTER_MAP_SCALEIN_STEP (1 << 13)
+
+#define ELASTIC_SLOTS_NUM \
+    ((OUTTER_MAP_SCALEUP_STEP > OUTTER_MAP_SCALEIN_STEP) ? OUTTER_MAP_SCALEUP_STEP : OUTTER_MAP_SCALEIN_STEP)

 struct element_list_node {
     void *elem;
@@ -20,6 +33,7 @@ void deserial_free_elem_list(struct element_list_node *head);
 int deserial_delete_elem(void *key, const void *msg_desciptor);

 int deserial_init();
-void deserial_uninit();
+void deserial_uninit(bool persist);
+int inner_map_mng_persist();

 #endif /* __DESERIALIZATION_TO_BPF_MAP_H__ */
2 changes: 1 addition & 1 deletion bpf/include/bpf_common.h
@@ -17,7 +17,7 @@
 /* Ip(0.0.0.2 | ::2) used for control command, e.g. KmeshControl */
 #define CONTROL_CMD_IP 2

-#define MAP_SIZE_OF_OUTTER_MAP 8192
+#define MAP_SIZE_OF_OUTTER_MAP (1 << 20)

 #define BPF_DATA_MAX_LEN \
     192 /* this value should be \
17 changes: 8 additions & 9 deletions bpf/kmesh/ads/include/config.h
@@ -29,16 +29,15 @@
 #define MAP_SIZE_OF_PER_CLUSTER 32
 #define MAP_SIZE_OF_PER_ENDPOINT 64

-#define MAP_SIZE_OF_MAX 8192
-#define MAP_SIZE_OF_OUTTER_MAP 8192
+#define MAP_SIZE_OF_MAX 8192

-#define MAP_SIZE_OF_LISTENER BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_LISTENER)
-#define MAP_SIZE_OF_FILTER_CHAIN BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_FILTER_CHAIN *MAP_SIZE_OF_LISTENER)
-#define MAP_SIZE_OF_FILTER BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_FILTER *MAP_SIZE_OF_FILTER_CHAIN)
-#define MAP_SIZE_OF_VIRTUAL_HOST BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_VIRTUAL_HOST *MAP_SIZE_OF_FILTER)
-#define MAP_SIZE_OF_ROUTE BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_ROUTE *MAP_SIZE_OF_VIRTUAL_HOST)
-#define MAP_SIZE_OF_CLUSTER BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_CLUSTER *MAP_SIZE_OF_ROUTE)
-#define MAP_SIZE_OF_ENDPOINT BPF_MIN(MAP_SIZE_OF_MAX, MAP_SIZE_OF_PER_ENDPOINT *MAP_SIZE_OF_CLUSTER)
+#define MAP_SIZE_OF_LISTENER (1 << 10)
+#define MAP_SIZE_OF_FILTER_CHAIN (MAP_SIZE_OF_PER_FILTER_CHAIN * MAP_SIZE_OF_LISTENER)
+#define MAP_SIZE_OF_FILTER (MAP_SIZE_OF_PER_FILTER * MAP_SIZE_OF_FILTER_CHAIN)
+#define MAP_SIZE_OF_VIRTUAL_HOST (MAP_SIZE_OF_PER_VIRTUAL_HOST * MAP_SIZE_OF_FILTER)
+#define MAP_SIZE_OF_ROUTE (1 << 14)
+#define MAP_SIZE_OF_CLUSTER (1 << 14)
+#define MAP_SIZE_OF_ENDPOINT (1 << 17)

 // rename map to avoid truncation when name length exceeds BPF_OBJ_NAME_LEN = 16
 #define map_of_listener kmesh_listener
69 changes: 69 additions & 0 deletions docs/proposal/map-in-map_management_enhancement-en.md
@@ -0,0 +1,69 @@
---
title: map-in-map management enhancement
authors:
- "@nlgwcy"
reviewers:
- "@hzxuzhonghu"
- "@supercharge-xsy"
- "@bitcoffeeiux"
approvers:
- "@robot"
- TBD

creation-date: 2024-07-20

---

## map-in-map management enhancement

### Summary

In ads mode, Kmesh supports elastic scaling of map-in-map records to meet the traffic management requirements of large-scale clusters.

### Motivation

As mentioned in [optimizing_bpf_map_update_in_xDS_mode](https://github.com/kmesh-net/kmesh/blob/main/docs/proposal/optimizing_bpf_map_update_in_xDS_mode-en.md), Kmesh solves the slow update of map-in-map records by creating all records once at startup, trading space for time. The memory cost is not a problem in small-scale clusters. However, supporting a large-scale cluster (for example, 5,000 services and 100,000 pods) requires a very large map-in-map table, and because a map of type `BPF_MAP_TYPE_ARRAY_OF_MAPS` does not support `BPF_F_NO_PREALLOC`, every record is allocated up front, which wastes a large amount of memory. Elastic scaling of map-in-map records is therefore needed to meet the traffic management requirements of large-scale clusters.

#### Goals

- Support traffic management in large-scale clusters.
- Account for the configuration restoration scenario.

### Proposal

Kmesh manages map-in-map usage in user space. To support elastic scaling, the management structure is extended as follows:

```c
#include <linux/bpf.h> // struct bpf_map_info
#include <semaphore.h> // sem_t

// Per-slot state; defined first because inner_map_mng embeds an array of it.
struct inner_map_stat {
    int map_fd;
    unsigned int used : 1;
    unsigned int alloced : 1;
    unsigned int resv : 30;
};

struct inner_map_mng {
    int inner_fd;
    int outter_fd;
    struct bpf_map_info inner_info;
    struct bpf_map_info outter_info;
    struct inner_map_stat inner_maps[MAX_OUTTER_MAP_ENTRIES];
    int elastic_slots[OUTTER_MAP_ELASTIC_SIZE];
    int used_cnt;          // real used count
    int alloced_cnt;       // real alloced count
    int max_alloced_idx;   // max alloced index, there may be holes.
    int init;
    sem_t fin_tasks;
    int elastic_task_exit; // elastic scaling thread exit flag
};
```
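
The `used` and `alloced` flags, together with `used_cnt`, `alloced_cnt`, and `max_alloced_idx`, suggest simple per-slot bookkeeping. The following is a minimal sketch, assuming that a new record claims the first allocated-but-unused slot and that releasing a slot keeps its inner map for reuse. The `demo_*` names and the tiny `DEMO_ENTRIES` table are hypothetical and are not part of Kmesh.

```c
#define DEMO_ENTRIES 8 /* hypothetical; stands in for MAX_OUTTER_MAP_ENTRIES */

struct demo_slot {
    int map_fd;
    unsigned int used : 1;
    unsigned int alloced : 1;
    unsigned int resv : 30;
};

struct demo_mng {
    struct demo_slot slots[DEMO_ENTRIES];
    int used_cnt;
    int alloced_cnt;
    int max_alloced_idx; /* -1 when nothing is allocated */
};

/* Record that slot idx now owns an inner map with the given fd. */
static void demo_mark_alloced(struct demo_mng *mng, int idx, int fd)
{
    mng->slots[idx].map_fd = fd;
    mng->slots[idx].alloced = 1;
    mng->alloced_cnt++;
    if (idx > mng->max_alloced_idx)
        mng->max_alloced_idx = idx;
}

/* Claim the first allocated-but-unused slot; returns its index or -1. */
static int demo_claim_slot(struct demo_mng *mng)
{
    for (int i = 0; i <= mng->max_alloced_idx && i < DEMO_ENTRIES; i++) {
        struct demo_slot *s = &mng->slots[i];
        if (s->alloced && !s->used) {
            s->used = 1;
            mng->used_cnt++;
            return i;
        }
    }
    return -1; /* caller would have to wait for a scale-out */
}

/* Release a claimed slot; the inner map stays allocated for reuse. */
static int demo_release_slot(struct demo_mng *mng, int idx)
{
    if (idx < 0 || idx >= DEMO_ENTRIES || !mng->slots[idx].used)
        return -1;
    mng->slots[idx].used = 0;
    mng->used_cnt--;
    return 0;
}
```

Scanning only up to `max_alloced_idx` keeps the search short when a small part of the table is allocated, which is presumably why the structure tracks that index even though holes may exist below it.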

Map-in-map scaling process:

![map-in-map-elastic-process](pics/map-in-map-elastic-process.svg)
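
One plausible reading of the scaling decision can be sketched with the thresholds and step sizes defined in `deserialization_to_bpf_map.h` (0.7, 0.3, `1 << 15`, `1 << 13`, and a `1 << 20` entry limit). This is an assumption about how those values combine, not the actual Kmesh logic: usage is taken as `used_cnt / alloced_cnt`, and the floor of one scale-up step on scale-in is a hypothetical choice to avoid oscillation. All `DEMO_*` and `demo_*` names are illustrative.

```c
#define DEMO_USAGE_HIGH (0.7)        /* mirrors OUTTER_MAP_USAGE_HIGH_PERCENT */
#define DEMO_USAGE_LOW (0.3)         /* mirrors OUTTER_MAP_USAGE_LOW_PERCENT */
#define DEMO_SCALEUP_STEP (1 << 15)  /* mirrors OUTTER_MAP_SCALEUP_STEP */
#define DEMO_SCALEIN_STEP (1 << 13)  /* mirrors OUTTER_MAP_SCALEIN_STEP */
#define DEMO_MAX_ENTRIES (1 << 20)   /* mirrors MAX_OUTTER_MAP_ENTRIES */

static int demo_min(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Returns how many inner maps to add (positive), remove (negative),
 * or 0 when the current allocation should be left alone.
 */
static int demo_scale_delta(int used_cnt, int alloced_cnt)
{
    if (alloced_cnt <= 0)
        return DEMO_SCALEUP_STEP; /* nothing allocated yet */

    double usage = (double)used_cnt / (double)alloced_cnt;

    /* High watermark: grow by one step, capped by the outer map size. */
    if (usage >= DEMO_USAGE_HIGH)
        return demo_min(DEMO_SCALEUP_STEP, DEMO_MAX_ENTRIES - alloced_cnt);

    /* Low watermark: shrink by one step, never below one scale-up step
     * and never into slots that are in use (assumed floor). */
    if (usage <= DEMO_USAGE_LOW && alloced_cnt > DEMO_SCALEUP_STEP) {
        int spare = demo_min(alloced_cnt - used_cnt,
                             alloced_cnt - DEMO_SCALEUP_STEP);
        return -demo_min(DEMO_SCALEIN_STEP, spare);
    }

    return 0;
}
```

Because the scale-in step is a quarter of the scale-up step, a table sized this way would grow quickly under load and shrink gradually afterwards, which limits churn when usage hovers near a threshold.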

The following is an example of map-in-map scale-in and scale-out:

![map-in-map-elastic](pics/map-in-map-elastic.svg)
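
The comment on `max_alloced_idx` notes that the allocated range may contain holes. The sketch below shows one way scale-out and scale-in could produce and refill such holes. It is an illustration under assumptions, not the Kmesh implementation: scale-out is assumed to fill the lowest free indices, scale-in is assumed to free unused slots from the highest index down, and a counter stands in for creating real inner maps. All `demo_*` names are hypothetical.

```c
#define DEMO_TABLE_ENTRIES 16 /* hypothetical table size */

struct demo_stat {
    int map_fd;
    unsigned int used : 1;
    unsigned int alloced : 1;
    unsigned int resv : 30;
};

struct demo_table {
    struct demo_stat maps[DEMO_TABLE_ENTRIES];
    int used_cnt;
    int alloced_cnt;
    int max_alloced_idx; /* -1 when nothing is allocated */
};

static int demo_next_fd = 100; /* stand-in for creating a real inner map */

/* Allocate up to `step` slots at the lowest free indices. */
static int demo_scale_out(struct demo_table *t, int step)
{
    int done = 0;
    for (int i = 0; i < DEMO_TABLE_ENTRIES && done < step; i++) {
        if (t->maps[i].alloced)
            continue;
        t->maps[i].map_fd = demo_next_fd++;
        t->maps[i].alloced = 1;
        t->alloced_cnt++;
        if (i > t->max_alloced_idx)
            t->max_alloced_idx = i;
        done++;
    }
    return done;
}

/* Free up to `step` allocated-but-unused slots, highest index first. */
static int demo_scale_in(struct demo_table *t, int step)
{
    int done = 0;
    for (int i = t->max_alloced_idx; i >= 0 && done < step; i--) {
        if (!t->maps[i].alloced || t->maps[i].used)
            continue;
        /* A real implementation would also close the fd and remove
         * the entry from the outer map here. */
        t->maps[i].map_fd = -1;
        t->maps[i].alloced = 0;
        t->alloced_cnt--;
        done++;
    }
    /* Slots still in use may leave holes below the new maximum. */
    while (t->max_alloced_idx >= 0 && !t->maps[t->max_alloced_idx].alloced)
        t->max_alloced_idx--;
    return done;
}
```

For example, after allocating six slots and marking slots 1 and 4 as used, a scale-in of three frees slots 5, 3, and 2, leaving `max_alloced_idx` at 4 with holes at 2 and 3; a later scale-out of two refills those holes.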



4 changes: 4 additions & 0 deletions docs/proposal/pics/map-in-map-elastic-process.svg
4 changes: 4 additions & 0 deletions docs/proposal/pics/map-in-map-elastic.svg
11 changes: 9 additions & 2 deletions pkg/bpf/bpf.go
@@ -173,21 +173,23 @@

 func (l *BpfLoader) Stop() {
 	var err error
-	if GetExitType() == Restart {
+	if GetExitType() == Restart && l.config.WdsEnabled() {
+		C.deserial_uninit(true)
 		log.Infof("kmesh restart, not clean bpf map and prog")
 		return
 	}

 	closeMap(l.versionMap)

 	if l.config.AdsEnabled() {
-		C.deserial_uninit()
+		C.deserial_uninit(false)
 		if err = l.obj.Detach(); err != nil {
 			CleanupBpfMap()
 			log.Errorf("failed detach when stop kmesh, err:%s", err)
 			return
 		}
 	} else if l.config.WdsEnabled() {
+		C.deserial_uninit(false)
 		if err = l.workloadObj.Detach(); err != nil {
 			CleanupBpfMap()
 			log.Errorf("failed detach when stop kmesh, err:%s", err)
@@ -311,6 +313,11 @@

 func closeMap(m *ebpf.Map) {
 	var err error
+
+	if m == nil {
+		return
+	}
+
 	err = m.Unpin()
 	if err != nil {
 		log.Errorf("Failed to unpin kmesh_version: %v", err)