Delete csv #47

Open
wants to merge 2 commits into
base: master
Conversation

tony-wave

Remove the CSV file that was created in the bucket.

Delete the CSV file created by df_to_s3().

rjrudin commented Jan 31, 2022

I was just about to create an issue for this; I think this is a near-must-have feature as I can't imagine users want these files to stay around. It's useful for debugging, so an arg that defaults to deleting the file with the option to disable that feature is the way to go, as this PR is doing.
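A minimal sketch of the behavior described in this comment, under stated assumptions: the names `load_with_cleanup` and `delete_file`, and the three callables passed in, are hypothetical and are not taken from pandas_redshift or from this PR. It only illustrates a flag that defaults to deleting the staged file and can be switched off for debugging.

```python
def load_with_cleanup(upload, load, delete, csv_name, delete_file=True):
    """Stage a CSV, load it, then optionally remove the staged copy.

    upload, load and delete are hypothetical callables that each take the
    CSV name; they stand in for the S3 upload, the Redshift COPY and the
    S3 delete steps.
    """
    upload(csv_name)
    # If load() raises, the staged file is left in place so it can be inspected.
    load(csv_name)
    if delete_file:
        delete(csv_name)
```

Passing `delete_file=False` keeps the file in the bucket after a successful load. Whether the file should also be removed when the load fails is a design choice this sketch does not settle.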

@@ -237,11 +237,17 @@ def s3_to_redshift(redshift_table_name, csv_name, delimiter=',', quotechar='"',
raise


def delete_csv(csv_name):
    print("Delete s3 bucket's csv file: {0}/{1}".format(s3_subdirectory_var, csv_name))
    s3.Bucket(s3_bucket_var).objects.filter(Prefix=s3_subdirectory_var+csv_name)
Don't think this actually does a delete - I did this instead:

s3.Object(s3_bucket_var, s3_subdirectory_var+csv_name).delete()

@agawronski Are you still performing updates to this project? I'd be happy to submit a new PR that provides this "optionally delete S3 file at end" feature. My team would definitely want a 2.1 release that includes this enhancement.
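The point raised in the review comment above can be sketched as follows. In boto3, `objects.filter(...)` only builds a lazy collection of matching objects and sends no delete request, while `Object(bucket, key).delete()` does. The function below is an illustration, not the project's code: passing the S3 resource and the bucket and subdirectory names as parameters (instead of the module-level `s3`, `s3_bucket_var` and `s3_subdirectory_var` seen in the diff) is an assumption made to keep the example self-contained.

```python
def delete_csv(s3_resource, bucket_name, subdirectory, csv_name):
    """Delete one staged CSV object from S3 and return the key that was removed.

    s3_resource is expected to behave like boto3.resource("s3").
    """
    key = subdirectory + csv_name
    # Object(...).delete() issues the actual delete request;
    # Bucket(...).objects.filter(Prefix=key) on its own would delete nothing.
    s3_resource.Object(bucket_name, key).delete()
    return key
```

With real credentials this would be called as `delete_csv(boto3.resource("s3"), "my-bucket", "tmp/", "data.csv")`, where the bucket, prefix and file names are placeholders.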

Owner


I won't be able to review and release until the weekend, but sure, we can add this feature. In the meantime I would suggest that you consider checking out https://github.com/jucyai/red-panda which is a very similar package created by @yaojiach who also has been helping maintain pandas_redshift. I believe that it already has this functionality and more, along with several improvements.


Thanks, we'll use red-panda, no need to address this PR then.

3 participants