We analyse audio and video content using Deep Learning technologies to extract meaningful metadata, classify content, and build speech recognition systems, among other purposes.
- Metadata extraction relies on several Machine Learning (ML) models, each trained to extract meaningful information (i.e., metadata) from media content. Such metadata can improve the discoverability and accessibility of digital objects and provide detailed context to users.
- For example, we are developing novel Deep Learning models for image processing that recognize custom objects and summarize the content of a video or the actions depicted in it.
- We are building a high-quality Speech Recognition system to transcribe media content in low-resource Indian languages and to address pressing research problems such as Speaker Diarization, Noise Cancellation, and Speaker Recognition.
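
The idea of combining several models into one metadata record can be illustrated with a minimal Python sketch. Everything named here (`MetadataRecord`, `extract_metadata`, and the two placeholder extractors) is hypothetical and chosen only for illustration; it is not our actual pipeline, and the placeholder functions stand in for trained models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical interface: an extractor takes a media file path and
# returns a dictionary of metadata fields.
Extractor = Callable[[str], Dict[str, object]]


@dataclass
class MetadataRecord:
    """Metadata collected for one media file (illustrative structure)."""
    media_path: str
    fields: Dict[str, Dict[str, object]] = field(default_factory=dict)


def extract_metadata(media_path: str,
                     extractors: Dict[str, Extractor]) -> MetadataRecord:
    """Run every extractor on the file and gather the results."""
    record = MetadataRecord(media_path)
    for name, extractor in extractors.items():
        # Store each model's output under its own name so that
        # fields from different models cannot collide.
        record.fields[name] = extractor(media_path)
    return record


# Placeholder extractors standing in for trained models (assumptions).
def dummy_object_detector(path: str) -> Dict[str, object]:
    return {"objects": ["placeholder"]}


def dummy_transcriber(path: str) -> Dict[str, object]:
    return {"transcript": "placeholder"}
```

In such a design, a new model (for example, speaker diarization) could be added by registering one more entry in the `extractors` dictionary, without changing the record structure.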