feat(sq): implement missing methods #53
Merged
- change methods to references
irfanghat pushed a commit to irfanghat/spark-connect-rs that referenced this pull request Aug 27, 2024
…o8#53)
- Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
- Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `csv` method that accepts CsvOptions.
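The options-to-key-value conversion described in this commit can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `to_options` method name, the field types, and the builder-style setters are assumptions made for the example; only the `CsvOptions` and `ConfigOpts` names and the `header`, `delimiter`, and `nullValue` keys come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical shape of the trait: turn a typed options struct into
/// the string key-value pairs a reader would forward to Spark.
pub trait ConfigOpts {
    fn to_options(&self) -> HashMap<String, String>;
}

/// Illustrative subset of CSV read options (field types are assumed).
#[derive(Debug, Clone, Default)]
pub struct CsvOptions {
    pub header: Option<bool>,
    pub delimiter: Option<char>,
    pub null_value: Option<String>,
}

impl CsvOptions {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn header(mut self, value: bool) -> Self {
        self.header = Some(value);
        self
    }

    pub fn delimiter(mut self, value: char) -> Self {
        self.delimiter = Some(value);
        self
    }

    pub fn null_value(mut self, value: &str) -> Self {
        self.null_value = Some(value.to_string());
        self
    }
}

impl ConfigOpts for CsvOptions {
    fn to_options(&self) -> HashMap<String, String> {
        let mut options = HashMap::new();
        // Only options that were explicitly set are emitted.
        if let Some(header) = self.header {
            options.insert("header".to_string(), header.to_string());
        }
        if let Some(delimiter) = self.delimiter {
            options.insert("delimiter".to_string(), delimiter.to_string());
        }
        if let Some(null_value) = &self.null_value {
            options.insert("nullValue".to_string(), null_value.clone());
        }
        options
    }
}
```

Keeping every field optional means unset options are simply left out of the map, so the server-side defaults apply.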
sjrusso8 pushed a commit that referenced this pull request Oct 11, 2024
* feat: Implement CSV Options Configuration for DataFrameReader (#53)
  - Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
  - Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `csv` method that accepts CsvOptions.
* feat: Implement CSV Options Configuration for DataFrameReader (#54)
  - Added documentation for the CsvOptions struct.
* test(readwriter): Implement test_dataframe_read_csv_with_options (#54)
* refactor: Improve CSV method to handle multiple paths (#54)
  - Updated the csv method in DataFrameReader to support both single string slices and arrays of string slices as input paths.
* feat: Added implementations for JSON Options struct (#54)
* feat: Implement JSON Options Configuration for DataFrameReader (#54)
  - Added JsonOptions struct to support JSON read options like `schema`, `multi_line`, `encoding`, and more.
  - Implemented ConfigOpts trait for JsonOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `json` method that accepts JsonOptions.
  - Documented all available JSON options, including example usage for setting options when reading JSON files.
  - [TO DO] Write tests to validate JSON options functionality.
* feat: Implement ORC Options Configuration for DataFrameReader (#54)
  - Example usage provided for setting ORC options when reading files.
  - Write tests to validate ORC options functionality.
* feat: Implement Parquet Options Configuration for DataFrameReader (#54)
  - Added ParquetOptions struct to support Parquet read options like `mergeSchema`, `pathGlobFilter`, and `recursiveFileLookup`.
  - Implemented ConfigOpts trait for ParquetOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `parquet` method that accepts ParquetOptions.
  - Example usage provided for setting Parquet options when reading files.
  - Write tests to validate Parquet options functionality.
* feat: Implement Text Options Configuration for DataFrameReader (#54)
  - Added TextOptions struct to support text read options like `wholetext`, `lineSep`, and `pathGlobFilter`.
  - Implemented ConfigOpts trait for TextOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `text` method that accepts TextOptions.
  - Example usage provided for setting text options when reading files.
  - Write tests to validate text options functionality.
* feat: Implement Text and Parquet Options Configuration for DataFrameWriter (#54)
  - Added TextOptions struct to support text write options such as `whole_text` and `line_sep`.
  - Added ParquetOptions struct to support Parquet write options like `merge_schema`, `path_glob_filter`, and `datetime_rebase_mode`.
  - Implemented `write` method in DataFrameWriter to handle configuration for text and Parquet file formats.
  - Example usage provided for setting text and Parquet options when writing DataFrames.
  - Write tests to validate text and Parquet file writing functionality.
* Added rustdocs to method implementations.
* feat: Implement initial methods for file format reader and writer (#54)
  - Added support for reading and writing .csv, .json, .orc, .parquet, and .text file formats.
  - Created `ConfigOpts` trait for each file type to manage options in a structured way.
  - Added example method signatures for file reading using a configurable options object passed into methods.
* Add missing csv options to CsvOptions.
* feat: Implement Configuration Options for DataFrameReader and Writer (#54)
  - Implemented additional fields in ParquetOptions compression.
  - Updated test_dataframe_read_parquet_with_options to ensure valid compression codec usage.
  - Enhanced test_dataframe_read_text_with_options to properly read lines by setting line_sep and disabling whole_text.
  - Implemented the #[derive(Debug, Clone)] traits for all Option structs.
  - Updated expected path_glob_filter type to string.
  - Added the compression field to ParquetOptions, OrcOptions, and JsonOptions.
  - Updated documentation for all Options structs to include descriptions for new and existing fields.
* feat: Refactor file format options with shared CommonFileOptions (#54)
  - Introduced CommonFileOptions to handle common configuration fields such as:
    - path_glob_filter
    - recursive_file_lookup
    - ignore_corrupt_files
    - ignore_missing_files
    - modified_before
    - modified_after
  - Updated CsvOptions, JsonOptions, OrcOptions, ParquetOptions, and TextOptions to use CommonFileOptions for the shared fields.
  - Updated the new() constructors for each file format options struct to initialize CommonFileOptions.
  - Refactored tests for each file format (e.g., ORC, CSV) to utilize the new CommonFileOptions, ensuring that both format-specific and shared options are properly tested.
  - Updated and verified tests for DataFrame reading and writing operations with updated options.
* Updated rustdocs.
* Updated typo in rustdocs: /// - - Common file options...
* Updated README - DataFrameReader/Writer section.

---------

Co-authored-by: lexara-prime-ai <irfanghta@gmail.com>
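The CommonFileOptions refactor in the last feature commit amounts to composing one shared struct into each format-specific struct. A rough sketch under stated assumptions: the field subset, the `common` field name, and the `to_options` methods are hypothetical and chosen for illustration; only the struct names and the option keys (`pathGlobFilter`, `recursiveFileLookup`, `wholetext`, `lineSep`) appear in the commit message.

```rust
use std::collections::HashMap;

/// Shared fields, reduced to two for brevity (the commit lists six).
#[derive(Debug, Clone, Default)]
pub struct CommonFileOptions {
    pub path_glob_filter: Option<String>,
    pub recursive_file_lookup: Option<bool>,
}

impl CommonFileOptions {
    /// Hypothetical helper: emit only the shared options that are set.
    pub fn to_options(&self) -> HashMap<String, String> {
        let mut options = HashMap::new();
        if let Some(filter) = &self.path_glob_filter {
            options.insert("pathGlobFilter".to_string(), filter.clone());
        }
        if let Some(recursive) = self.recursive_file_lookup {
            options.insert("recursiveFileLookup".to_string(), recursive.to_string());
        }
        options
    }
}

/// Format-specific struct that embeds the shared options.
#[derive(Debug, Clone, Default)]
pub struct TextOptions {
    pub whole_text: Option<bool>,
    pub line_sep: Option<String>,
    pub common: CommonFileOptions, // assumed field name
}

impl TextOptions {
    pub fn to_options(&self) -> HashMap<String, String> {
        // Start from the shared options, then add the text-specific ones.
        let mut options = self.common.to_options();
        if let Some(whole_text) = self.whole_text {
            options.insert("wholetext".to_string(), whole_text.to_string());
        }
        if let Some(line_sep) = &self.line_sep {
            options.insert("lineSep".to_string(), line_sep.clone());
        }
        options
    }
}
```

With this layout, a new shared option is added in one place and every format picks it up through its embedded `common` field.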
Description
feat(sq): implement missing methods
Change `StreamingQuery` methods to take `&self`.
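To illustrate why borrowing the receiver matters, here is a small sketch using a hypothetical `QueryHandle` type as a stand-in; it is not the crate's `StreamingQuery`, and its fields and method names are assumptions. A method that takes `self` by value consumes the handle, while one that takes `&self` leaves it usable for further calls.

```rust
/// Hypothetical stand-in for a streaming query handle.
#[derive(Debug)]
pub struct QueryHandle {
    id: String,
    active: bool,
}

impl QueryHandle {
    pub fn new(id: &str) -> Self {
        Self {
            id: id.to_string(),
            active: true,
        }
    }

    /// Borrows the handle: it can be called any number of times.
    pub fn id(&self) -> &str {
        &self.id
    }

    /// Borrows the handle: status checks do not consume it.
    pub fn is_active(&self) -> bool {
        self.active
    }

    /// Takes ownership: the handle cannot be used after this call.
    pub fn into_id(self) -> String {
        self.id
    }
}
```

A caller can invoke `id()` and `is_active()` repeatedly on the same value, whereas any use after `into_id()` is rejected by the borrow checker at compile time.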