@xaloc Depending on the data structures you need to read/write, Parquet could be the best option, but you can also have a look at HDF5/netCDF and/or Zarr. Also, the work done by Francesc Alted is inspiring. Have a look at Caterva: https://github.com/Blosc/caterva-scipy21-lt/blob/main/caterva-lightning-talk.ipynb
-
Pybonacci (pybonacci@mastodon.social)'s status on Monday, 11-Oct-2021 21:31:04 CEST
-
xaloc (xaloc@fedi.xaloc.space)'s status on Monday, 11-Oct-2021 21:31:05 CEST
I have to do a fair bit of data analysis for my studies/work and I usually use #python and #pandas and store things in CSV. The other day I discovered #parquet and suddenly my 3.4GB file was only 688MB, and reading it with pandas took only 12% of the time it took to read the CSV 😱
-