The existing data pipelines in Daggit, the Sunbird ML Workbench, were bundled into independent solutions for general use cases beyond Sunbird. This involved extending individual DAG nodes and operators into microservices and developing REST APIs for Daggit tasks.
Services developed:
- Profanity check for English and Hindi text
- Topic modeling for English and Hindi text
- Multimedia (YouTube, PDF) text extraction
- Keyword extraction for English text using DBpedia ontology
Tools: REST, Python, Kafka
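
As a rough illustration of what exposing a single pipeline task over REST can look like, here is a minimal sketch using only the Python standard library. It is not Daggit's actual code: the `/profanity` route, the `BLOCKLIST` word set, and the functions `profanity_check`, `handle_request`, and `serve` are all hypothetical names chosen for this example, and the toy word-list check merely stands in for a real pipeline node.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical placeholder word list; a real service would call the
# actual pipeline node instead of this toy check.
BLOCKLIST = {"darn", "heck"}


def profanity_check(text):
    """Toy stand-in for a pipeline task: report blocklisted tokens."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    flagged = sorted({t for t in tokens if t in BLOCKLIST})
    return {"profane": bool(flagged), "flagged": flagged}


def handle_request(path, body):
    """Route a POST body to the task; return (HTTP status, JSON-able dict)."""
    if path != "/profanity":  # assumed route name
        return 404, {"error": "unknown task"}
    try:
        text = json.loads(body)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'text' field"}
    if not isinstance(text, str):
        return 400, {"error": "'text' must be a string"}
    return 200, profanity_check(text)


class TaskHandler(BaseHTTPRequestHandler):
    """Thin HTTP layer: read the body, delegate, write JSON back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle_request(self.path, self.rfile.read(length))
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start the service on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), TaskHandler).serve_forever()
```

Keeping the routing logic in `handle_request`, separate from the HTTP handler class, means the task wrapper can be exercised without opening a socket. Calling `serve()` would start the service locally, after which a POST of `{"text": "..."}` to `/profanity` returns the JSON result.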