Spark DataFrame Writer for Cobol datafiles #415

@mark-weghorst

Background

I work for a credit card company in the retail sector. We currently use Cobrix to acquire data from our credit card transaction processor and publish business events to Kafka for our event-driven architecture and analytics platform.

Thanks to @yruslan and his work on #338, Cobrix is now fully functional for our data ingest use case. However, our electronic data interchange with this business partner is bidirectional.

For example, we receive mainframe data transmissions for things like customer purchases and account status. But we also have to transmit monetary data to our mainframe-based partner for things like credits and adjustments, and non-monetary data for account configuration changes, including but not limited to changes of address.

We also believe that such a feature could simplify the process of creating test data for our system.

Feature

Implement a Spark DataFrame writer for Cobol data. The feature should:

  • Derive a default copybook layout from the Spark Schema
  • Support configurable endianness
  • Support configurable code page output
  • Support writing Cobol output data files in the F, FB, V, and VB record formats described at https://www.ibm.com/docs/en/zos-basic-skills?topic=set-data-record-formats
  • Support the writing of a copybook file that matches the output schema as-written
  • Provide a declarative configuration option to override individual DataFrame Schema -> Copybook transformation decisions at a field level, including:
      • specify width for PIC X(n) fields
      • specify scale and precision for PIC 9 fields such as S9(11)V99
      • specify binary packing options for individual fields such as COMP-3
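
To make the field-level overrides more concrete, here is a minimal sketch in plain Python (no Spark or Cobrix involved) of how a declarative per-field spec could drive both the generated copybook and the bytes written. Every name in it (FieldSpec, pic_clause, copybook, pack_comp3, encode_record) is hypothetical and only for discussion; none of this is an existing Cobrix API. It assumes fixed-length (F) records, the cp037 EBCDIC code page that ships with Python, and COMP-3 for all numeric fields. Endianness is not shown because it only matters for binary COMP fields, which the sketch omits, as it does V/VB record headers.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class FieldSpec:
    """Hypothetical per-field override; not part of Cobrix."""
    name: str
    kind: str            # "string" or "decimal"
    width: int = 0       # n in PIC X(n), used for strings
    precision: int = 0   # total number of digits, used for decimals
    scale: int = 0       # digits after the implied decimal point (V)
    comp3: bool = False  # pack the decimal as COMP-3


def pic_clause(f):
    """Build the PIC clause for one field."""
    if f.kind == "string":
        return f"PIC X({f.width})"
    pic = f"PIC S9({f.precision - f.scale})"
    if f.scale:
        pic += f"V9({f.scale})"
    return pic + (" COMP-3" if f.comp3 else "")


def copybook(record_name, fields):
    """Render a flat, single-level copybook for the given fields."""
    lines = [f"       01  {record_name}."]
    lines += [f"           05  {f.name:<16}{pic_clause(f)}." for f in fields]
    return "\n".join(lines)


def pack_comp3(value, precision, scale):
    """Pack a decimal as COMP-3: one digit per nibble, sign in the last nibble."""
    unscaled = int(Decimal(value).scaleb(scale).to_integral_value())
    digits = str(abs(unscaled))
    if len(digits) > precision:
        raise ValueError(f"{value} does not fit in {precision} digits")
    nbytes = precision // 2 + 1
    sign = "D" if unscaled < 0 else "C"  # 0xD negative, 0xC positive
    return bytes.fromhex(digits.zfill(2 * nbytes - 1) + sign)


def encode_record(row, fields, codepage="cp037"):
    """Encode one row (a dict keyed by field name) as a fixed-length record."""
    out = []
    for f in fields:
        value = row[f.name]
        if f.kind == "string":
            if len(value) > f.width:
                raise ValueError(f"{f.name} exceeds {f.width} characters")
            # Pad with spaces, then translate to the target code page.
            out.append(value.ljust(f.width).encode(codepage))
        elif f.comp3:
            out.append(pack_comp3(value, f.precision, f.scale))
        else:
            raise NotImplementedError("zoned (DISPLAY) numerics are not sketched")
    return b"".join(out)


# Example overrides: a 6-character id and an S9(11)V99 COMP-3 amount.
FIELDS = [
    FieldSpec("ACCT-ID", "string", width=6),
    FieldSpec("ADJ-AMOUNT", "decimal", precision=13, scale=2, comp3=True),
]

if __name__ == "__main__":
    print(copybook("ADJUSTMENT-REC", FIELDS))
    print(encode_record({"ACCT-ID": "AB12", "ADJ-AMOUNT": "-12.50"}, FIELDS).hex())
```

Keeping the layout in one spec means the copybook written alongside the data cannot drift from the bytes actually produced, which is the point of the "matches the output schema as-written" requirement.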

Proposed Solution [Optional]

We could contribute development labor to the implementation of this feature. However, we would need assistance with the high-level design and architecture of the solution, should such a feature be accepted.

At this point I would like to open a discussion about how such a feature might be implemented.

Metadata

Labels: enhancement (New feature or request)