Publications

|

Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection

Authors: Ayush Ghadiya, Purbayan Kar, Vishal Chudasama, Pankaj Wasnik
CVPR 2024 7th MULA Workshop | June 2024

|

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning

Authors: Shivam Ratnakant Mhaskar, Nirmesh Shah, Mohammadi Zaki, Ashishkumar Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah (IIIT Delhi)
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Findings | June 2024

|

Efficacy of Large Language Models in Predicting Hindi Movies' Attributes: A Comprehensive Survey and Content-Based Analysis

Authors: Prabir Mondal (IIT Patna), Siddharth Singh (IIT Patna), Kushum (IIT Patna), Sriparna Saha (IIT Patna), Jyoti Prakash Singh (IIT Patna), Brijraj Singh, Niranjan Pedanekar
WebConf 2024 (WWW 2024) | May 2024

|

Optimizing Movie Selections: A Multi-Task, Multi-Modal Framework with Strategies for Missing Modality Challenges

Authors: Subham Raj (IIT Patna), Pawan Agrawal (IIT Patna), Sriparna Saha (IIT Patna), Brijraj Singh, Niranjan Pedanekar
ACM Symposium on Applied Computing (SAC) | April 2024

|

Estimation of individual causal effects in network setup for multiple treatments

Authors: Abhinav Thorat, Ravi Kolla, Niranjan Pedanekar, Naoyuki Onoe
38th Annual Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence [Graphs and Complex Structure for Learning and Reasoning (GCLR) Workshop] | February 2024

|

Open-set Object Detection By Aligning Known Class Representations

Authors: Hiran Sarkar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian (IIT Hyderabad)
Winter Conference on Applications of Computer Vision (WACV) | January 2024

|

Efficient infusion of self-supervised representations in Automatic Speech Recognition

Authors: Darshan Prabhu, Saiganesh Mirishkar, Pankaj Wasnik
Poster presentation at the Neural Information Processing Systems (NeurIPS) 3rd Workshop | December 2023

|

Enhancing Social Recommendation with Multi-View BERT Network

Authors: Tushar Prakash, Raksha Jalan, Naoyuki Onoe
IEEE International Conference on Data Mining (IEEE ICDM) | December 2023

|

Fiducial Focus Augmentation for Facial Landmark Detection

Authors: Purbayan Kar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian (IIT Hyderabad)
British Machine Vision Conference (BMVC) | November 2023

|

Impulsion of Movie's Content-Based Factors in Multi-Modal Movie Recommendation System

Authors: Prabir Mondal (IIT Patna), Pulkit Kapoor (IIT Patna), Siddharth Singh (IIT Patna), Sriparna Saha (IIT Patna), Naoyuki Onoe, Brijraj Singh
International Conference on Neural Information Processing (ICONIP) | November 2023

|

LLM Based Generation of Item-Description for Recommendation System

Authors: Arkadeep Acharya, Brijraj Singh, Naoyuki Onoe
Recommender Systems Conference (RECSYS) | September 2023

|

CR-SoRec: BERT driven Consistency Regularization for Social Recommendation

Authors: Tushar Prakash, Raksha Jalan, Brijraj Singh, Naoyuki Onoe
Recommender Systems Conference (RECSYS) | September 2023

|

Iteratively Improving Speech Recognition and Voice Conversion

Authors: Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
INTERSPEECH | August 2023

|

Cd-HRNN: Content-Driven HRNN to Improve Session-Based Recommendation System

Authors: Sonal Dabral, Brijraj Singh, Naoyuki Onoe
International Joint Conference on Neural Networks (IJCNN Main Conference) | April 2023

|

A Multi-Modal Multi-Task Based Approach for Movie Recommendation

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Joint Conference on Neural Networks (IJCNN Main Conference) | April 2023

|

A Meta-Learning Based Generative Model with Graph Attention Network for Multi-Modal Recommender Systems

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Neural Network Society Workshop on Deep Learning Innovations and Applications (INNS DLIA)/International Joint Conference on Neural Networks (IJCNN) | April 2023

|

Task-Specific and Graph Convolutional Network Based Multi-Modal Movie Recommendation System in Indian Setting

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Neural Network Society Workshop on Deep Learning Innovations and Applications (INNS DLIA)/International Joint Conference on Neural Networks (IJCNN) | April 2023

|

Revisiting Class Imbalance for End-to-end Semi-Supervised Object Detection

Authors: Purbayan Kar, Vishal Chudasama, Pankaj Wasnik, Naoyuki Onoe
Efficient Deep Learning for Computer Vision (ECV) Workshop in Computer Vision and Pattern Recognition (CVPR) | April 2023

|

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

Authors: Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
The International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2023

|

Hierarchical disentangled representation learning for singing voice conversion

Authors: Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji
The International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2023

|

Graph Network based Approaches for Multi-modal Movie Recommendation System

Authors: Daipayan Chakder (IIT Patna), Prabir Mondal (IIT Patna), Subham Raj (IIT Patna), Sriparna Saha (IIT Patna), Angshuman Ghosh, Naoyuki Onoe
IEEE International Conference on Systems, Man, and Cybernetics (SMC) | November 2022

|

Semi-supervised Acoustic and Language Modeling for Hindi ASR

Authors: Tarun Sai Bandarupalli (IISc Bangalore), Shakti Rath (IISc Bangalore), Nirmesh Shah, Naoyuki Onoe, Sriram Ganapathy (IISc Bangalore)
INTERSPEECH | September 2022

|

Towards Developing a Multi-Modal Video Recommendation System

Authors: Sriram Pingali (IIT Patna), Prabir Mondal (IIT Patna), Daipayan Chakder (IIT Patna), Sriparna Saha (IIT Patna), Angshuman Ghosh
International Joint Conference on Neural Networks (IJCNN) | September 2022

|

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer

Authors: Shrutina Agarwal (IISc Bangalore), Sriram Ganapathy (IISc Bangalore), Naoya Takahashi
INTERSPEECH | September 2022

|

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

Authors: Vishal Chudasama, Purbayan Kar, Ashish Gudmalwar, Nirmesh Shah, Pankaj Wasnik, Naoyuki Onoe
Conference on Computer Vision and Pattern Recognition (CVPR) | June 2022

|

A Unified Model for Fingerprint Authentication and Presentation Attack Detection

Authors: Additya Popli (IIIT Hyderabad), Saraansh Tandon (IIIT Hyderabad), Joshua J. Engelsma (Michigan State University), Naoyuki Onoe, Atsushi Okubo, Anoop Namboodiri (IIIT Hyderabad)
International Joint Conference on Biometrics (IJCB) | April 2021

|

End-to-end lyrics Recognition with Voice to Singing Style Transfer

Authors: Sakya Basak (IISc Bangalore), Shrutina Agarwal (IISc Bangalore), Sriram Ganapathy (IISc Bangalore), Naoya Takahashi
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2021

Affiliations: IIIT Hyderabad: International Institute of Information Technology, Hyderabad; IIT Patna: Indian Institute of Technology Patna; IISc Bangalore: Indian Institute of Science, Bangalore; Michigan State University
