I encountered a piece of PyTorch code where the trained model is saved with a .pk extension. I often see PyTorch models saved as .pth or .pt. What is the .pk format, and how does it differ from .pth or .pt? By the way, the following parameters and weights are saved in the .pk file:

```python
save_dict = {
    "encoderOpts": encoderOpts,
    "classifierOpts": classifierOpts,
    "dataOpts": dataOpts,
    "encoderState": encoder_state_dict,
    "classifierState": classifier_state_dict,
}
```

Many thanks in advance!
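For context, the extension is only a filename convention: `torch.save` serializes with Python's pickle protocol, so a `.pk`, `.pt`, or `.pth` file contains the same kind of byte stream. A minimal stdlib sketch (the dict contents here are made-up stand-ins for the opts/state entries above):

```python
import pickle

# Hypothetical stand-ins for the opts/state dicts in the question.
save_dict = {
    "encoderOpts": {"hidden_size": 128},
    "encoderState": {"weight": [0.1, 0.2]},
}

# The filename's extension is arbitrary: pickle (which torch.save wraps)
# writes the same bytes whether the file is called .pk, .pt, or .pth.
with open("model.pk", "wb") as f:
    pickle.dump(save_dict, f)

with open("model.pk", "rb") as f:
    restored = pickle.load(f)

print(restored == save_dict)  # -> True
```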
I built a fastText classification model to do sentiment analysis of Facebook comments (using PySpark 2.4.1 on Windows). When I use the model's predict function on a sentence, the result is a tuple of the form below:

```
[('__label__positif', '__label__négatif', '__label__neutre', 0.8947999477386475, 0.08174632489681244, 0.023483742028474808)]
```

But when I tried to apply it to the column "text", I did this:

```python
from pyspark.sql.types import *
from pyspark.sql.functions import udf, col
import fasttext

schema = StructType([
    StructField("pos", StringType(), …
```
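As an aside, the flat tuple above can be paired into labels and probabilities before building the UDF's return value; a plain-Python sketch using the sample prediction from the question:

```python
# fastText returns all labels followed by all probabilities in one flat
# tuple; splitting it in half and zipping gives a label -> score mapping
# that is easier to turn into a typed (StructType) UDF result.
pred = ('__label__positif', '__label__négatif', '__label__neutre',
        0.8947999477386475, 0.08174632489681244, 0.023483742028474808)

n = len(pred) // 2
labels, probs = pred[:n], pred[n:]
scores = dict(zip(labels, probs))

print(scores['__label__positif'])  # -> 0.8947999477386475
```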
I prototyped an ML model consisting of preprocessing plus multiple stacked regressors. I would like a colleague of mine to develop an API that queries the model. Is there any way to query the model (an sklearn Pipeline) without having to install all the dependencies (XGBoost, LGBM, CatBoost, ...)? I tried serializing it with joblib, but deserializing it on another machine still requires the dependencies to be installed. The goal is really to transform the sklearn pipeline to …
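For background on why the dependencies are needed: pickle (which joblib builds on) stores only a reference to each object's class, not the class's code. A small stdlib sketch with a made-up stand-in class:

```python
import pickle

class StackedRegressor:
    """Hypothetical stand-in for an XGBoost/LGBM estimator."""
    def predict(self, x):
        return x

blob = pickle.dumps(StackedRegressor())

# The stream records only "module.ClassName", not the implementation,
# so unpickling re-imports that module -- which is why a joblib/pickle
# artifact still needs XGBoost etc. installed on the loading machine.
print(b"StackedRegressor" in blob)  # -> True
```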
I am not entirely sure if this is on-topic here, so please let me know if it is not. I keep seeing the idea of YAML files pop up while reading machine learning literature. My question is, what exactly is a YAML file, and how does it relate to machine learning and data science projects?
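For illustration, YAML is a human-readable data-serialization format; in ML and data science projects it typically appears as a configuration file holding hyperparameters, paths, and pipeline settings. A hypothetical `params.yaml` might look like:

```yaml
# Hypothetical training configuration -- all names are made up.
model:
  type: random_forest
  n_estimators: 200
data:
  train_path: data/train.csv
  test_size: 0.2
training:
  seed: 42
```

Tools such as DVC and many training frameworks read files like this so that experiment settings live outside the code.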
I have about 100 MB of CSV data, cleaned and used for training in Keras, stored as a pandas DataFrame. What is a good (simple) way of saving it for fast reads? I don't need to query it or load only part of it. Some options appear to be:

- HDFS
- HDF5
- HDFS3
- PyArrow
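For a frame that is always read back whole, any binary format avoids re-parsing CSV on every run. A minimal sketch with a toy stand-in frame, using pandas' built-in pickle round-trip (Parquet/Feather via PyArrow are portable alternatives):

```python
import pandas as pd

# Toy stand-in for the ~100 MB training frame.
df = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "label": [0, 1, 0]})

# to_pickle is the simplest built-in binary option for whole-file
# reads; df.to_parquet / df.to_feather (PyArrow-backed) behave
# similarly and are readable from other languages too.
df.to_pickle("train.pkl")
restored = pd.read_pickle("train.pkl")

print(restored.equals(df))  # -> True
```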