Conversation

Pybonacci (pybonacci@mastodon.social)'s status on Monday, 11-Oct-2021 21:31:04 CEST
in reply to
- xaloc
@xaloc Depending on the data structures you need to read/write, Parquet could be the best option, but you can also have a look at HDF5/netCDF and/or Zarr. Also, the work done by Francesc Alted is inspiring. Have a look at Caterva: https://github.com/Blosc/caterva-scipy21-lt/blob/main/caterva-lightning-talk.ipynb
- xaloc (xaloc@fedi.xaloc.space)'s status on Monday, 11-Oct-2021 21:31:05 CEST
  
  I have to do a fair bit of data analysis for my studies/work and I usually use #python and #pandas, and store things in CSV. The other day I discovered #parquet and suddenly my 3.4 GB file was only a 688 MB file, and the reading time (with pandas) was only 12% of the time it took to read the CSV 😱
  
