Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the topic name configurable in storage paths #127

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

hariprasad-k
Copy link

@hariprasad-k hariprasad-k commented Mar 5, 2020

Improvements to provide ability to exclude topic name in partition by configuration

@ghost
Copy link

ghost commented Mar 5, 2020

@confluentinc It looks like @hariprasad-k just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@hariprasad-k
Copy link
Author

hariprasad-k commented Mar 5, 2020

Fixed #126

@OneCricketeer
Copy link

Question: What happens when you have topics as a list or topics.regex capture more than one topic?

Then all files in the directory will be of multiple topics?

@@ -101,6 +101,20 @@
public static final String TIMESTAMP_FIELD_NAME_DEFAULT = "timestamp";
public static final String TIMESTAMP_FIELD_NAME_DISPLAY = "Record Field for Timestamp Extractor";

public static final String TIMESTAMP_SCALING_FACTOR_CONFIG = "timestamp.scaling.factor";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest you break this out as a separate PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OneCricketeer Sure, I will break into separate PR.

@OneCricketeer
Copy link

The patch also fixes #121 too

So does #122

@hariprasad-k
Copy link
Author

hariprasad-k commented Sep 9, 2020

Question: What happens when you have topics as a list or topics.regex capture more than one topic?

Then all files in the directory will be of multiple topics?

@OneCricketeer Yes, all files of multiple topics will be in the same directory in this case.

But, this behavior will only happen if you use topics as a list or topics.regex in combination with path.include.topicname set as false. Further, as path.include.topicname is by default true, it will preserve the existing behavior by default.

@hariprasad-k
Copy link
Author

hariprasad-k commented Sep 9, 2020

@OneCricketeer This PR does not have any changes related to Timestamp scaling factor anymore, which is covered by #160

Can you please review this changes ?

@OneCricketeer
Copy link

I'm not a Confluent employee, so my reviews don't do anything

@dosvath dosvath changed the title Improvements to storage partitioning scheme to be flexible and configurable at runtime. Make the topic name configurable in storage paths Mar 2, 2021
@dosvath dosvath self-requested a review March 2, 2021 22:32
@dosvath
Copy link

dosvath commented Mar 2, 2021

Thanks @hariprasad-k for the contribution! This change will add flexibility to our connectors that use storage common to optionally exclude the topic name from the path (seems to be a popular request confluentinc/kafka-connect-hdfs#544). I'd like to get second view from @levzem before proceeding.

@dosvath dosvath requested a review from levzem March 2, 2021 23:17
@hariprasad-k
Copy link
Author

@dosvath @levzem Any thoughts, or plans to review this contribution ?

@awebneck
Copy link

Would be very much interested in having this feature integrated. The topic name being forcefully appended to the path limits us severely when sinking to already-established directory structures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants