PySpark: create an index column. I converted the resulting RDD back to a DataFrame.