This project is a simplified demonstration of a real-world big data project, showcasing efficient handling of large-scale datasets using PySpark and Apache Iceberg. The repository includes 5 sample datasets that replicate real-world scenarios, with a focus on optimizing memory use and run time when writing Iceberg tables.

siam29/processor
