What is a good way to store processed CSV data for training a model in Python?
I have about 100 MB of CSV data that has been cleaned and is held in a pandas DataFrame for training a Keras model. What is a good (simple) way to save it so that it can be read back quickly? I don't need to query it or load only part of it.
Some options appear to be:
- HDFS
- HDF5
- HDFS3
- PyArrow
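To make the use case concrete, here is a minimal sketch of the save-once, load-everything round trip I have in mind. It uses pandas' built-in pickle support purely as a baseline for comparison (it is not one of the options listed above). The helper names `save_frame` and `load_frame`, the file name `train.pkl`, and the toy columns are placeholders made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def save_frame(df: pd.DataFrame, path: str) -> None:
    """Write the whole DataFrame to disk in one call (pickle baseline)."""
    df.to_pickle(path)


def load_frame(path: str) -> pd.DataFrame:
    """Read the whole DataFrame back; no partial loads or queries."""
    return pd.read_pickle(path)


# Toy stand-in for the cleaned CSV data (hypothetical columns).
df = pd.DataFrame({
    "feature_a": np.arange(5, dtype="float32"),
    "feature_b": np.arange(5, dtype="float32") * 2,
    "label": [0, 1, 0, 1, 0],
})

# Save to a temporary directory and load it straight back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.pkl")
    save_frame(df, path)
    restored = load_frame(path)

# Split into the arrays that would be handed to Keras for training.
x = restored[["feature_a", "feature_b"]].to_numpy()
y = restored["label"].to_numpy()
```

As far as I can tell, the same pattern would apply with `df.to_parquet(path)` and `pd.read_parquet(path)` (which needs PyArrow installed) or with `df.to_hdf(path, key="train")` and `pd.read_hdf(path, key="train")` (which needs PyTables), so the question is mainly which on-disk format reads back fastest at this size.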
Topic serialisation keras csv dataset python
Category Data Science