feat(sq): implement missing methods #53

Merged
merged 2 commits into from
Jun 14, 2024

sjrusso8
Owner

Description

feat(sq): implement missing methods

  • change StreamingQuery methods to &self
  • add missing methods
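
The first bullet concerns method receivers. As a minimal sketch of why that matters, the block below uses a hypothetical stand-in type: `QueryHandle`, its fields, and its methods are invented for illustration and are not the crate's actual `StreamingQuery` API. With `&self`, several methods can be called on the same handle, whereas a `self` receiver moves the handle on first use.

```rust
/// Hypothetical stand-in for a streaming query handle (not the crate's real type).
#[derive(Debug)]
struct QueryHandle {
    id: String,
    name: Option<String>,
}

impl QueryHandle {
    /// Borrowing receiver: the handle stays usable after the call.
    fn id(&self) -> &str {
        &self.id
    }

    /// Another borrowing receiver, so it can be called after `id()`.
    fn name(&self) -> Option<&str> {
        self.name.as_deref()
    }

    /// Consuming receiver, shown for contrast: the handle is moved
    /// and cannot be used afterwards.
    fn into_id(self) -> String {
        self.id
    }
}

/// Calls two `&self` methods on the same handle, then consumes it.
fn describe(query: QueryHandle) -> String {
    let summary = format!("{} ({})", query.id(), query.name().unwrap_or("unnamed"));
    let id = query.into_id(); // `query` is moved here
    format!("{summary}; consumed {id}")
}
```

If `id` and `name` took `self` instead, the second call in `describe` would not compile, because the first call would already have moved `query`.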

@sjrusso8 sjrusso8 merged commit 42c88e4 into main Jun 14, 2024
3 checks passed
irfanghat pushed a commit to irfanghat/spark-connect-rs that referenced this pull request Aug 27, 2024
…o8#53)

- Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
- Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `csv` method that accepts CsvOptions.
sjrusso8 pushed a commit that referenced this pull request Oct 11, 2024
* feat: Implement CSV Options Configuration for DataFrameReader (#53)

- Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
- Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `csv` method that accepts CsvOptions.

* feat: Implement CSV Options Configuration for DataFrameReader (#54)

- Added documentation for the CsvOptions struct.

* test(readwriter): Implement test_dataframe_read_csv_with_options (#54)

* refactor: Improve CSV method to handle multiple paths (#54)

    - Updated the csv method in DataFrameReader to support both single string slices and arrays of string slices as input paths.
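
One way to accept both a single string slice and an array of string slices is a small conversion trait. This is a sketch of the general technique only: `IntoPaths` and `csv_paths` are hypothetical names, and the commit does not show how the crate's `csv` method actually achieves this.

```rust
/// Hypothetical helper trait: anything that can become a list of paths.
trait IntoPaths {
    fn into_paths(self) -> Vec<String>;
}

/// A single path such as "data/a.csv".
impl IntoPaths for &str {
    fn into_paths(self) -> Vec<String> {
        vec![self.to_string()]
    }
}

/// A fixed-size array such as ["a.csv", "b.csv"].
impl<const N: usize> IntoPaths for [&str; N] {
    fn into_paths(self) -> Vec<String> {
        self.iter().map(|p| p.to_string()).collect()
    }
}

/// A borrowed slice of paths.
impl IntoPaths for &[&str] {
    fn into_paths(self) -> Vec<String> {
        self.iter().map(|p| p.to_string()).collect()
    }
}

/// Stand-in for a reader method: it only normalizes its input here.
fn csv_paths<P: IntoPaths>(paths: P) -> Vec<String> {
    paths.into_paths()
}
```

With this in place, `csv_paths("a.csv")` and `csv_paths(["a.csv", "b.csv"])` both type-check and yield a `Vec<String>`.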

* feat: Added implementations for JSON Options struct (#54)

* feat: Implement JSON Options Configuration for DataFrameReader (#54)

- Added JsonOptions struct to support JSON read options like `schema`, `multi_line`, `encoding`, and more.
- Implemented ConfigOpts trait for JsonOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `json` method that accepts JsonOptions.
- Documented all available JSON options, including example usage for setting options when reading JSON files. [TO DO]
- Write tests to validate JSON options functionality.

* feat: Implement ORC Options Configuration for DataFrameReader (#54)

- Example usage provided for setting ORC options when reading files.
- Write tests to validate ORC options functionality.

* feat: Implement Parquet Options Configuration for DataFrameReader (#54)

- Added ParquetOptions struct to support Parquet read options like `mergeSchema`, `pathGlobFilter`, and `recursiveFileLookup`.
- Implemented ConfigOpts trait for ParquetOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `parquet` method that accepts ParquetOptions.
- Example usage provided for setting Parquet options when reading files.
- Write tests to validate Parquet options functionality.

* feat: Implement Text Options Configuration for DataFrameReader (#54)

- Added TextOptions struct to support text read options like `wholetext`, `lineSep`, and `pathGlobFilter`.
- Implemented ConfigOpts trait for TextOptions to convert options into key-value pairs.
- Updated DataFrameReader to include `text` method that accepts TextOptions.
- Example usage provided for setting text options when reading files.
- Write tests to validate text options functionality.

* feat: Implement Text and Parquet Options Configuration for DataFrameWriter (#54)

- Added TextOptions struct to support text write options such as `whole_text` and `line_sep`.
- Added ParquetOptions struct to support Parquet write options like `merge_schema`, `path_glob_filter`, and `datetime_rebase_mode`.
- Implemented `write` method in DataFrameWriter to handle configuration for text and Parquet file formats.
- Example usage provided for setting text and Parquet options when writing DataFrames.
- Write tests to validate text and Parquet file writing functionality.

* Added rustdocs to method implementations.

* feat: Implement initial methods for file format reader and writer (#54)

- Added support for reading and writing .csv, .json, .orc, .parquet, and .text file formats.
- Created `ConfigOpts` trait for each file type to manage options in a structured way.
- Added example method signatures for file reading using a configurable options object passed into methods.

* Add missing csv options to CsvOptions.

* feat: Implement Configuration Options for DataFrameReader and Writer (#54)

    - Implemented additional fields in ParquetOptions compression.
    - Updated test_dataframe_read_parquet_with_options to ensure valid compression codec usage.
    - Enhanced test_dataframe_read_text_with_options to properly read lines by setting line_sep and disabling whole_text.
    - Derived `Debug` and `Clone` (`#[derive(Debug, Clone)]`) for all Options structs.
    - Updated expected path_glob_filter type to string.
    - Added the compression field to ParquetOptions, OrcOptions, and JsonOptions.
    - Updated documentation for all Options structs to include descriptions for new and existing fields.

* feat: Refactor file format options with shared CommonFileOptions (#54)

    - Introduced CommonFileOptions to handle common configuration fields such as:
        - path_glob_filter
        - recursive_file_lookup
        - ignore_corrupt_files
        - ignore_missing_files
        - modified_before
        - modified_after

    - Updated CsvOptions, JsonOptions, OrcOptions, ParquetOptions, and TextOptions
    to use CommonFileOptions for the shared fields.

    - Updated the new() constructors for each file format options struct to initialize
    CommonFileOptions.

    - Refactored tests for each file format (e.g., ORC, CSV) to utilize the new
    CommonFileOptions, ensuring that both format-specific and shared options
    are properly tested.

    - Updated and verified tests for DataFrame reading and writing operations with updated options.
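
The refactor described above can be pictured as composition: each format-specific struct embeds the shared fields and merges them when producing key-value pairs. The sketch below is an assumption-laden illustration. The field name `common`, the `to_options` methods, and the exact field types are hypothetical, and only two of the shared fields are shown. The keys `mergeSchema`, `pathGlobFilter`, and `recursiveFileLookup` are the ones named earlier in this commit message.

```rust
use std::collections::HashMap;

/// Shared fields, reduced to two for brevity.
#[derive(Debug, Clone, Default)]
struct CommonFileOptions {
    path_glob_filter: Option<String>,
    recursive_file_lookup: Option<bool>,
}

impl CommonFileOptions {
    /// Emits only the shared options that were set.
    fn to_options(&self) -> HashMap<String, String> {
        let mut opts = HashMap::new();
        if let Some(filter) = &self.path_glob_filter {
            opts.insert("pathGlobFilter".to_string(), filter.clone());
        }
        if let Some(recursive) = self.recursive_file_lookup {
            opts.insert("recursiveFileLookup".to_string(), recursive.to_string());
        }
        opts
    }
}

/// Format-specific options that embed the shared ones
/// (the field name `common` is an assumption).
#[derive(Debug, Clone, Default)]
struct ParquetOptions {
    merge_schema: Option<bool>,
    common: CommonFileOptions,
}

impl ParquetOptions {
    /// Starts from the shared options, then adds the format-specific ones.
    fn to_options(&self) -> HashMap<String, String> {
        let mut opts = self.common.to_options();
        if let Some(merge) = self.merge_schema {
            opts.insert("mergeSchema".to_string(), merge.to_string());
        }
        opts
    }
}
```

The shared conversion logic then lives in one place, and each format only adds its own keys.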

* Updated rustdocs.

* Fixed typo in rustdocs: `/// -  - Common file options...`

* Updated README - DataFrameReader/Writer section.

---------

Co-authored-by: lexara-prime-ai <irfanghta@gmail.com>
@sjrusso8 sjrusso8 deleted the feat/streaming_query branch October 30, 2024 18:53
1 participant