SPARKNLP-823 Adding streaming functionality for seq2seq components #13899

danilojsl
Contributor

Description

This change adds a streamer argument to LightPipeline that will allow streaming data from both Scala and Python when using seq2seq features.

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Adds functionality that users of Seq2Seq components will find familiar.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@danilojsl added the "DON'T MERGE" (Do not merge this PR) and "enhancement" labels on Jul 20, 2023
@maziyarpanahi
Member

Thanks @danilojsl for making this possible. Just to make this similar to all the other streamers, can you please:

  • Ignore/skip streaming for .fullAnnotate(): there is no need to stream Annotation objects
  • Only support .annotate()
  • Only support string input, not a list, in .annotate()
  • Since the input is a string and not a list, there is no need to output the list and the name of the column

This

generation: [' My name is Leonardo. I am a man of letters. I have been a man for many years. I was born in the year 1776. I came to the United States in 1776, and I have lived in the United Kingdom since 1776']

can turn into

My name is Leonardo. I am a man of letters. I have been a man for many years. I was born in the year 1776. I came to the United States in 1776, and I have lived in the United Kingdom since 1776

This way it behaves like all the other streamers out there, which output only text (token after token).
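The requested behaviour can be illustrated with a minimal, library-independent sketch. Nothing here is Spark NLP API: `fake_generate` and `stream_annotate` are hypothetical names invented for illustration, and the hard-coded pieces stand in for whatever a real seq2seq model would produce. The sketch only shows the two points asked for above: accept a single string (not a list), and write plain text piece by piece without a column name or list wrapper.

```python
import sys
from typing import Callable, Iterator, TextIO


def fake_generate(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a seq2seq model: yields text one piece at a time."""
    for piece in [" My", " name", " is", " Leonardo", "."]:
        yield piece


def stream_annotate(
    text: str,
    generate: Callable[[str], Iterator[str]] = fake_generate,
    out: TextIO = sys.stdout,
) -> str:
    """Hypothetical helper: stream generated text for ONE string input.

    Lists are rejected, and only raw text is written to `out`
    (no 'generation: [...]' wrapper).
    """
    if not isinstance(text, str):
        raise TypeError("streaming is only supported for a single string input")

    pieces = []
    for piece in generate(text):
        out.write(piece)  # emit each piece as soon as it is available
        out.flush()
        pieces.append(piece)
    out.write("\n")
    return "".join(pieces)


if __name__ == "__main__":
    stream_annotate("My name is Leonardo.")
```

Passing a list such as `["a", "b"]` raises `TypeError` in this sketch, which mirrors the "string input only" request; how the actual implementation handles that case is not stated in this thread.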

@maziyarpanahi added the "new-feature" (Introducing a new feature) label and removed the "enhancement" label on Jul 20, 2023
@danilojsl
Contributor Author

Done

@github-actions github-actions bot added the Stale label Jan 17, 2024
@github-actions github-actions bot closed this Feb 1, 2024