- This repo contains a natural language model trained to classify posts about natural disasters (earthquakes, cyclones, etc.) as real or fake.
Social media has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter.
But it’s not always clear whether a person’s words are actually announcing a disaster.
- Take this example:
- Tweet source: https://twitter.com/AnyOtherAnnaK/status/629195955506708480
The author explicitly uses the word “ABLAZE” but means it metaphorically. This is clear to a human right away, especially with the visual aid. But it’s less clear to a machine.
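To illustrate why the word alone is not enough, here is a minimal sketch of a naive keyword filter. The keyword list and the two example sentences are made up for illustration and are not taken from the dataset; the point is only that such a filter flags the metaphorical and the literal use alike.

```python
# Hypothetical keyword list, chosen only for this illustration.
DISASTER_KEYWORDS = {"ablaze", "earthquake", "cyclone", "flood", "wildfire"}


def keyword_flag(text):
    """Return True if any word in the text is a disaster keyword."""
    words = {w.strip(".,!?#").lower() for w in text.split()}
    return bool(words & DISASTER_KEYWORDS)


# Made-up examples: (text, is it about a real disaster?)
examples = [
    ("The sky last night was ABLAZE with colour", False),  # metaphor
    ("Warehouse ablaze downtown, crews evacuating the block", True),
]

for text, is_real in examples:
    print(f"flagged={keyword_flag(text)} real={is_real} :: {text}")
```

Both sentences are flagged, although only the second describes a real emergency. Telling them apart requires context, which is what a learned model is meant to capture.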
In this project I build a deep learning model that classifies which posts are about real disasters and which ones aren’t.
- Disclaimer: The dataset for this model contains text that may be considered profane, vulgar, or offensive.
Acknowledgments
- This dataset was created by the company Figure Eight and originally shared on their ‘Data For Everyone’ website.
Project Notebook
Project notebook here
The first step was building a language model that learns the vocabulary and style of tweets.
With the help of the learning rate finder, the language model reached an accuracy of 49%.
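For readers unfamiliar with the learning rate finder (fast.ai exposes it as `lr_find`), the idea can be sketched on a toy problem: take one gradient step per learning rate while the rate grows exponentially, stop once the loss blows up, and pick a rate where the loss was falling fastest. The code below is a simplified stand-in on a one-parameter quadratic, not fast.ai's implementation; all function names, the toy loss, and the stopping threshold are assumptions made for illustration.

```python
def lr_range_test(grad_fn, loss_fn, w0, lr_min=1e-3, lr_max=10.0, steps=100):
    """Take one gradient step per learning rate, increasing the rate exponentially."""
    w = w0
    best = loss_fn(w)
    history = []  # (learning rate, loss after the step taken at that rate)
    for i in range(steps):
        lr = lr_min * (lr_max / lr_min) ** (i / (steps - 1))
        w = w - lr * grad_fn(w)
        loss = loss_fn(w)
        history.append((lr, loss))
        if loss > 4 * best:  # loss is diverging: end the sweep
            break
        best = min(best, loss)
    return history


def suggest_lr(history, start_loss):
    """Return the rate whose step produced the largest drop in loss."""
    prev = start_loss
    best_lr, best_drop = None, 0.0
    for lr, loss in history:
        drop = prev - loss
        if drop > best_drop:
            best_lr, best_drop = lr, drop
        prev = loss
    return best_lr


# Toy problem (an assumption): minimise (w - 3)^2 starting from w = 0.
def loss_fn(w):
    return (w - 3.0) ** 2


def grad_fn(w):
    return 2.0 * (w - 3.0)


history = lr_range_test(grad_fn, loss_fn, w0=0.0)
suggested = suggest_lr(history, start_loss=loss_fn(0.0))
print(f"sweep stopped at lr={history[-1][0]:.3f}, suggested lr={suggested:.3f}")
```

On this toy loss, plain gradient descent diverges for rates above 1, so the sweep ends shortly after crossing that point and the suggested rate falls below it. A real finder works on mini-batch losses of the actual model and usually smooths the curve before choosing.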
- Then I built a classification model to classify posts as real or fake.
- It achieved an accuracy of 80.3681%.
- Confusion Matrix:
- Top losses:
- The final models have been saved here.
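As a reference for how the accuracy, confusion matrix, and top losses listed above are defined, here is a minimal pure-Python sketch. The labels and probabilities are made-up toy values, not results from this project, and the helper names are hypothetical; the notebook itself relies on the fast.ai libraries for these views.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def top_losses(y_true, probs, k=3):
    """Indices of the k examples with the highest cross-entropy loss.

    probs[i] is the predicted probability that example i is a real disaster.
    """
    losses = []
    for t, p in zip(y_true, probs):
        p_true = p if t == 1 else 1.0 - p  # probability given to the true class
        losses.append(-math.log(max(p_true, 1e-12)))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]


# Toy data (assumed): 1 = real disaster, 0 = not a disaster.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.2, 0.3, 0.8, 0.1, 0.85, 0.6, 0.4]
y_pred = [int(p >= 0.5) for p in probs]

print("accuracy:", accuracy(y_true, y_pred))
print("confusion matrix:", confusion_matrix(y_true, y_pred))
print("top losses (indices):", top_losses(y_true, probs, k=2))
```

The top-loss indices point at the examples the model got most confidently wrong, which is usually the quickest way to spot mislabeled or ambiguous tweets.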
For fun, let’s see how accurately the language model can complete sentences and posts.
- Many thanks to Jeremy Howard for the fast.ai deep learning libraries.