Spark currently has a limitation: it throws an exception immediately when you try to encode an object whose class has a circular reference.

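It turns out the issue is only with the default bean encoder (Encoders.bean), which infers a schema by walking the class's properties and gives up when a type refers back to itself. Switching to the Kryo encoder takes care of the circular reference, because Kryo serializes the whole object into a single binary column instead of inferring a schema. Kryo is also generally faster than plain Java serialization: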

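Instead of using the bean encoder:

Encoder<Model> encoder = Encoders.bean(Model.class);
Dataset<Model> rowData = spark.createDataset(models, encoder);

create the encoder with Kryo, which worked:

Encoder<Model> encoder = Encoders.kryo(Model.class);
Dataset<Model> rowData = spark.createDataset(models, encoder);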
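
To get a feel for why schema inference on a bean class can fail here, below is a minimal sketch of a cycle check over getter return types, using only plain Java reflection. It is not Spark's actual implementation. The fields of Model, the Owner and Flat classes, and the hasCircularReference helper are all hypothetical and exist only for illustration. It also ignores collection element types and arrays, which a real encoder would need to handle.

```java
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.Set;

class CircularBeanCheck {

    // Hypothetical beans: Model -> Owner -> Model forms a cycle.
    public static class Model {
        private String name;
        private Owner owner;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Owner getOwner() { return owner; }
        public void setOwner(Owner owner) { this.owner = owner; }
    }

    public static class Owner {
        private Model model;
        public Model getModel() { return model; }
        public void setModel(Model model) { this.model = model; }
    }

    // A bean with no reference back to itself.
    public static class Flat {
        private String label;
        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }

    // Returns true if following getter return types from 'type' leads back
    // to a class that is already on the current path.
    static boolean hasCircularReference(Class<?> type) {
        return visit(type, new HashSet<>());
    }

    private static boolean visit(Class<?> type, Set<Class<?>> path) {
        if (isLeaf(type)) {
            return false;
        }
        if (!path.add(type)) {
            return true; // type is already on the current path: a cycle
        }
        for (Method m : type.getMethods()) {
            boolean getter = m.getName().startsWith("get")
                    && m.getParameterCount() == 0
                    && m.getDeclaringClass() != Object.class; // skips getClass()
            if (getter && visit(m.getReturnType(), path)) {
                return true;
            }
        }
        path.remove(type); // backtrack so sibling properties may reuse the type
        return false;
    }

    // Simplification: primitives (including void), enums, arrays and JDK
    // types are treated as leaves and not inspected further.
    private static boolean isLeaf(Class<?> type) {
        return type.isPrimitive() || type.isEnum() || type.isArray()
                || type.getName().startsWith("java.");
    }

    public static void main(String[] args) {
        System.out.println(hasCircularReference(Model.class)); // prints true
        System.out.println(hasCircularReference(Flat.class));  // prints false
    }
}
```

A class shaped like the Model above is the kind that would need Encoders.kryo rather than Encoders.bean. The trade-off is that a Kryo-encoded Dataset holds each object as one opaque binary value, so its individual fields are not available as columns.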
