  • Parquet is closely tied to the Apache Software Foundation, because it was designed as a storage format for Hadoop.

    But many data processing libraries offer interfaces for handling Parquet files, so you can use it outside of the Hadoop ecosystem.

    It’s really good for archiving data, because the format can store a lot of data in relatively little disk space while still providing OK read performance. The files are organized by column, so you often won’t need to read the whole file to get at the values you want. CSV files, by contrast, are plain text, which takes up a lot more disk space.
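
To make the "you often won’t need to read the whole file" point a bit more concrete, here is a toy sketch using only the Python standard library. It is not Parquet's actual on-disk format, and the names `write_columnar` and `read_column` are made up for illustration. It only mimics the general idea: values are stored column by column, each column is compressed on its own, and a small footer records where each column lives.

```python
import csv
import io
import json
import struct
import zlib


def to_csv(rows, columns):
    """Row-oriented plain text, for size comparison."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(columns)
    writer.writerows(rows)
    return out.getvalue().encode("utf-8")


def write_columnar(rows, columns):
    """Toy layout: [compressed column chunks][JSON footer][4-byte footer length]."""
    buf = io.BytesIO()
    index = {}
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        chunk = zlib.compress(json.dumps(values).encode("utf-8"))
        index[name] = (buf.tell(), len(chunk))  # where this column starts, and its size
        buf.write(chunk)
    footer = json.dumps(index).encode("utf-8")
    buf.write(footer)
    buf.write(struct.pack("<I", len(footer)))
    return buf.getvalue()


def read_column(blob, name):
    """Read one column via the footer; returns (values, chunk bytes touched)."""
    (footer_len,) = struct.unpack("<I", blob[-4:])
    footer = json.loads(blob[-4 - footer_len:-4])
    offset, length = footer[name]
    values = json.loads(zlib.decompress(blob[offset:offset + length]))
    return values, length


# Made-up sample data: 1000 readings from three hypothetical sensors.
columns = ["id", "sensor", "temperature"]
rows = [(i, "sensor-%d" % (i % 3), 20.0 + (i % 10) * 0.5) for i in range(1000)]

csv_bytes = to_csv(rows, columns)
blob = write_columnar(rows, columns)
temps, touched = read_column(blob, "temperature")

print("csv size:     ", len(csv_bytes))
print("columnar size:", len(blob))
print("bytes touched to read one column:", touched)
```

Values within a single column tend to look alike, so they compress well, and reading one column only touches that column's chunk plus the footer. In practice you would not hand-roll this. You would use one of the existing Parquet libraries for your language.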