Easier way to do time partitioning for thanos store #326

Closed
VannTen opened this issue Nov 25, 2024 · 2 comments

Comments

VannTen commented Nov 25, 2024

I'm in the process of setting up our Thanos infra, using a pull-based model (query + sidecars in K8s clusters + stores).

I'd like to have two levels of partitioning: prod/pre-prod and time-based.

However, using thanos.store for that purpose results in jsonnet which does not look super readable to me:

WIP jsonnet (only stores)
local thanos = import 'kube-thanos/thanos.libsonnet';

local sharedConfig = {
  config+:: {
    local cfg = self,
    namespace: 'thanos',
    version: 'v0.36.0',
    image: 'quay.io/thanos/thanos:' + cfg.version,
    imagePullPolicy: 'IfNotPresent',
    serviceMonitor: true,
    replicas: 2,
    logFormat: 'json',
  },
};

local store = {
  name: 'store',
  objectStorageConfig: {
    name: 'thanos-s3-config',
    key: 'config.yml',
  },
};

local time_partitions = [
  {
    minTime: '-8d',
    name+: '-now-to-8d-ago',
  },
  {
    maxTime: '-6d',
    minTime: '-6w',
    name+: '-6d-ago-to-6w-ago',
  },
  {
    maxTime: '-5w4d',
    name+: '-after-5w4d',
  },
];

local prlvl_partitions = [
  {
    name+: '-hp',
    objectStorageConfig+: { name+: '-hp' },
  },
  {
    name+: '-prod',
    objectStorageConfig+: { name+: '-prod' },
  },
];

// Resource keys emitted once per prod level rather than once per time partition.
local shared_across_times = ['serviceAccount'];

// One thanos.store() per (prod level, time partition) pair; [{}] means "no partitioning".
local stores(prlvls=[{}], times=[{}]) = [
  thanos.store(sharedConfig.config + store + prlvl + time)
  for prlvl in prlvls
  for time in times
];

// Keys of the object returned by thanos.store(), as a set.
local store_keys = std.set(std.objectFields(stores()[0]));

// Per-partition resources; each StatefulSet is patched to a fixed serviceAccountName.
[
  if k == 'statefulSet' then std.mergePatch(s[k], { spec: { template: { spec: { serviceAccountName: 'thanos-store' } } } }) else s[k]
  for k in std.setDiff(store_keys, shared_across_times)
  for s in stores(prlvl_partitions, time_partitions)
]
+
// Shared resources, generated once per prod level.
[
  s[k]
  for k in std.setInter(store_keys, shared_across_times)
  for s in stores(prlvl_partitions)
]

I think the problem I'm having is that I can't find a simple way to decouple the objects produced by thanos.store(). Some things need to be per invocation (like the StatefulSet), while others would be best shared (for instance, there is no sense in having one ServiceAccount per time partition, but one for prod and one for pre-prod would make sense).

I've seen there is kube-thanos-store-shards, which implements this for sharding with hashmod. From what I read, the ServiceAccount and ServiceMonitor are shared between the shards there.

I'm not sure if something should be implemented for time partitioning (I could work on that); maybe a better way would be to make the existing implementation of thanos.store easier to compose? In particular, the overriding with std.mergePatch feels a bit hacky, and I'm not sure how to have two different ServiceAccounts here.
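
For reference, the std.mergePatch override can usually be written with Jsonnet's nested `+:` syntax instead. Here is a minimal sketch on a hand-written stand-in object; `fakeStore` is hypothetical and only mimics the shape of what thanos.store() returns, so treat it as an illustration rather than the library's behaviour:

```jsonnet
// Hypothetical stand-in for thanos.store(): an object keyed by resource name.
local fakeStore(name) = {
  statefulSet: {
    kind: 'StatefulSet',
    metadata: { name: name },
    spec: { template: { spec: { serviceAccountName: name, containers: [] } } },
  },
};

// Nested +: merges into the existing fields instead of replacing them.
local patched = fakeStore('store-hp') {
  statefulSet+: { spec+: { template+: { spec+: { serviceAccountName: 'thanos-store' } } } },
};

// The std.mergePatch form of the same override, for comparison.
local viaMergePatch = std.mergePatch(
  fakeStore('store-hp').statefulSet,
  { spec: { template: { spec: { serviceAccountName: 'thanos-store' } } } }
);

{
  same: patched.statefulSet == viaMergePatch,
  serviceAccountName: patched.statefulSet.spec.template.spec.serviceAccountName,
}
```

The `+:` form stays inside the object expression, so it can be applied directly to the result of a function call without a conditional on the key name.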

It's also very possible that I'm holding this wrong and there is an easy way to achieve my setup; I'm pretty new to jsonnet.

Thanks for reading :)

VannTen commented Dec 6, 2024

So, after some thinking, I ended up with a much more readable jsonnet file. It's not super simple, but it's a lot simpler than my previous version IMO.
Leaving this here in case anyone has the same needs:

local thanos = import 'kube-thanos/thanos.libsonnet';

local sharedConfig = {
  config+:: {
    local cfg = self,
    namespace: 'thanos',
    version: 'v0.36.0',
    image: 'quay.io/thanos/thanos:' + cfg.version,
    imagePullPolicy: 'IfNotPresent',
    serviceMonitor: true,
    replicas: 2,
    logFormat: 'json',
  },
};

local store = sharedConfig.config {
  local st = self,
  name: 'store-%s' % st.prlvl,
  objectStorageConfig: {
    name: 'thanos-s3-config-%s' % st.prlvl,
    key: 'config.yml',
  },
};

local time_partitions = [
  {
    minTime: '-8d',
    name+: '-now-to-8d-ago',
  },
  {
    maxTime: '-6d',
    minTime: '-6w',
    name+: '-6d-ago-to-6w-ago',
  },
  {
    maxTime: '-5w4d',
    name+: '-after-5w4d',
  },
];

local stores = [
  thanos.store(store { prlvl:: _prlvl } + time) {
    // Override the ServiceAccount so there is one per prlvl;
    // the resulting duplicate resources are filtered out below.
    serviceAccount+: {
      metadata+: {
        name: 'store-' + _prlvl,
        labels+: {
          'app.kubernetes.io/instance': 'store-' + _prlvl,
        },
      },
    },
  }
  for _prlvl in ['hp', 'prod']
  for time in time_partitions
];

// std.set with a key function lets us filter out duplicates
std.set([
  resources[k]
  for resources in stores
  for k in std.objectFields(resources)
], function(r) r.kind + '/' + r.metadata.name)
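
To show what the final std.set call does in isolation, here is a small sketch with hand-written resources (the resource list is made up for illustration; real objects come from thanos.store()):

```jsonnet
// Hypothetical resources: two time partitions that share one ServiceAccount.
local resources = [
  { kind: 'StatefulSet', metadata: { name: 'store-hp-now-to-8d-ago' } },
  { kind: 'ServiceAccount', metadata: { name: 'store-hp' } },
  { kind: 'StatefulSet', metadata: { name: 'store-hp-6d-ago-to-6w-ago' } },
  { kind: 'ServiceAccount', metadata: { name: 'store-hp' } },
];

// std.set sorts by the key and keeps one element per distinct key,
// so the second ServiceAccount/store-hp entry is dropped.
std.set(resources, function(r) r.kind + '/' + r.metadata.name)
```

Two things to keep in mind with this approach: the output is ordered by the `kind/name` key rather than by generation order, and two objects with the same key but different contents are collapsed silently. The key also ignores the namespace, which is fine here only because everything lives in the same one.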

VannTen closed this as completed Dec 6, 2024