In this project I trained a Hindi language model on the BBC Hindi News dataset and then built a Hindi news classifier on top of it.
Dataset:
Hindi Language Model:
- Architecture : AWD-LSTM
- Learn more about AWD-LSTM here: ASGD Weight-Dropped LSTM
LSTM cell structure
- Model Summary :
- DataBlock :
- Used mixed precision training to reduce the training time.
Achieved a final accuracy of 30% for the Hindi Language Model
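
The README does not say how the 30% figure is measured. Assuming it is the usual next-token accuracy for a language model (the fraction of positions where the model's top prediction equals the actual next token), a minimal sketch of the computation looks like this. The function name and the token ids are hypothetical and only for illustration; they are not taken from the project.

```python
def next_token_accuracy(predicted_ids, target_ids):
    """Fraction of positions where the predicted token id equals the target id."""
    if len(predicted_ids) != len(target_ids):
        raise ValueError("predictions and targets must have the same length")
    if not target_ids:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return correct / len(target_ids)


# Made-up token ids: 3 of the 10 predictions match the true next token.
predicted = [5, 2, 9, 4, 4, 1, 3, 8, 6, 0]
targets = [5, 7, 9, 4, 0, 2, 2, 2, 7, 7]
print(next_token_accuracy(predicted, targets))
```

Under this reading, 30% means the model guesses the exact next token correctly about three times in ten, out of the whole vocabulary.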
Completing sentences using Hindi Language Model
Hindi News Classifier
Hindi News Dataset :
Classifier Data Block
- News categories
- 'business', 'china', 'entertainment', 'india', 'institutional', 'international', 'learningenglish', 'multimedia', 'news', 'pakistan', 'science', 'social', 'southasia', 'sport'
- The metric I chose for the news dataset was f1_score with average = macro, because the news dataset has a class imbalance.
Final f_beta score of the classifier was 0.789
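
To illustrate why a macro-averaged F1 suits an imbalanced dataset, here is a small pure-Python sketch. It is not the project's code (the project uses fastai's metric); the `macro_f1` helper and the toy labels below are assumptions made for the example. Macro averaging computes F1 per class and then takes the unweighted mean, so a rare class counts as much as a frequent one.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced data: a classifier that always predicts the majority class.
y_true = ["india"] * 8 + ["sport"] * 2
y_pred = ["india"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                  # high, despite ignoring 'sport' entirely
print(macro_f1(y_true, y_pred))  # much lower, because 'sport' gets F1 = 0
```

In this toy case plain accuracy rewards the always-majority classifier, while the macro F1 is pulled down by the class it never predicts.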
- Top losses :
System Requirements
Python v3.6.x
fastai v1
PyTorch v1.2.x
Numpy
Pandas
tqdm (Progress bar)
Jupyter Notebook (Visualisations)
Shoutout to
- Practical Deep Learning for Coders MOOC by team fast.ai