• Notebook

  • In this project I trained a Hindi language model on the BBC Hindi News dataset and then built a Hindi news classifier on top of it.

Dataset:

Hindi Language Model:

LSTM cell structure


  • Model Summary :


  • DataBlock :


  • Used mixed precision training to reduce the training time.

Achieved a final accuracy of 30% for the Hindi language model.
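For a language model, accuracy is usually measured per token: the share of positions where the model's top-scoring next token equals the token that actually follows. The sketch below illustrates that definition in plain Python. It assumes this is the metric behind the 30% figure; the function name, scores, and targets are hypothetical and are not taken from the notebook.

```python
def next_token_accuracy(scores, targets):
    """Fraction of positions where the highest-scoring token id equals the target id."""
    correct = 0
    for row, target in zip(scores, targets):
        # Index of the largest score in this row, i.e. the predicted token id.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += predicted == target
    return correct / len(targets)


# Hypothetical scores over a 4-token vocabulary at 5 positions.
scores = [
    [0.1, 0.7, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
# Hypothetical ids of the tokens that actually came next.
targets = [1, 2, 2, 0, 0]

lm_accuracy = next_token_accuracy(scores, targets)
print(lm_accuracy)
```

Because the model has to pick one token out of the whole vocabulary at every position, accuracies well below those of a typical classifier are normal for this task.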


Completing sentences using the Hindi Language Model



Hindi News Classifier

  • Hindi News Dataset :

  • Classifier Data Block


  • News categories :
  • ‘business’, ‘china’, ‘entertainment’, ‘india’, ‘institutional’, ‘international’, ‘learningenglish’,
  • ‘multimedia’, ‘news’, ‘pakistan’, ‘science’, ‘social’, ‘southasia’, ‘sport’

  • The metric I chose for the news dataset was f1_score with average = macro, because the classes in the news dataset are imbalanced.
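To see why macro averaging suits an imbalanced dataset, here is a minimal sketch in plain Python. It is not the notebook's code: the `macro_f1` helper and the toy labels are hypothetical and only illustrate the definition, which is the unweighted mean of the per-class F1 scores.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced labels: 8 'india' articles and 2 'science' articles.
y_true = ["india"] * 8 + ["science"] * 2
# A classifier that always predicts the majority class.
y_pred = ["india"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, macro F1 = {score:.3f}")
```

In this toy case the majority-class predictor still gets 80% of the articles right, but its macro F1 is much lower because the 'science' class scores zero and every class counts equally in the average.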

The final f_beta score of the classifier was 0.789.


  • Top losses :


System Requirements

Python v3.6.x
fastai v1
PyTorch v1.2.x
NumPy
Pandas
tqdm (Progress bar)
Jupyter Notebook (Visualisations)

Shoutout to

  • Practical Deep Learning for Coders MOOC by team fast.ai