I love solving problems and building cool stuff with code. My interests include artificial intelligence, competitive programming, computer graphics programming, and game development!
Abstract: Non-parallel voice conversion aims to convert voice from a source domain to a target domain without paired training data. Cycle-Consistent Generative Adversarial Networks (CycleGAN) and Variational Autoencoders (VAE) have been used for this task, but these models suffer from difficult training and unsatisfactory results. Later, Contrastive Voice Conversion (CVC) was introduced, utilizing a contrastive learning-based approach to address these issues. However, these methods use CNN-based generators, which can capture local semantics but lack the ability to capture the long-range dependencies necessary for global semantics. In this paper, we propose VCTR, an efficient method for non-parallel voice conversion that leverages the Hybrid Perception Block (HPB) and Dual Pruned Self-Attention (DPSA) along with a contrastive learning-based adversarial approach.
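To illustrate the contrastive learning component the abstract refers to, here is a minimal sketch of a patch-wise InfoNCE-style loss, the general form used in CVC-like methods. The function name and the plain-list vector representation are my own simplifications, not the paper's actual implementation.

```python
import math

def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query feature patch (illustrative sketch).

    The query (a feature patch from the converted output) should match
    its positive (the same patch location in the source) and repel the
    negatives (features from other patch locations).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cos_sim(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

    # Positive similarity first, then all negative similarities
    logits = [cos_sim(query, positive) / temperature]
    logits += [cos_sim(query, n) / temperature for n in negatives]
    # Cross-entropy with the positive as the correct class (index 0),
    # computed with the usual max-subtraction for numerical stability
    max_l = max(logits)
    denom = sum(math.exp(l - max_l) for l in logits)
    return -(logits[0] - max_l) + math.log(denom)

# A query that aligns with its positive and not the negatives → low loss
loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

In the full method this loss is computed over many sampled patches and added to the adversarial objective, which is what lets the generator preserve content without paired data.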
SmolLlama3 is an 8B-parameter language model built as part of my experimentation with fine-tuning LLMs. It is based on Llama 3.1 8B and was fine-tuned using a custom dataset, smol-smoltalk-10k, which contains 10,000 conversational samples. The model is designed for simple conversational tasks; however, its responses may be less refined as Direct Preference Optimization (DPO) was not applied.
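Since the description notes that DPO was not applied, here is a small sketch of what the DPO objective computes on a single preference pair, written in plain Python for clarity. The function name and example log-probability values are illustrative, not taken from the actual training setup.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and under a frozen
    reference model. The loss pushes the policy to prefer the chosen
    response more strongly than the reference model does.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)), computed in a numerically stable way
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Policy already prefers the chosen response → small loss
small = dpo_loss(-10.0, -30.0, -20.0, -20.0)
# Policy prefers the rejected response → large loss
large = dpo_loss(-30.0, -10.0, -20.0, -20.0)
```

Applying this objective over a preference dataset is what a DPO stage would add on top of the supervised fine-tune described above.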
A fine-tuned version of Gemma 2 2B-it, developed for Kaggle's 'Unlock Global Communication with Gemma' competition. It was fine-tuned on a corpus of 7,640 Hindi instructions to handle language-specific tasks, with a primary focus on Hindi, enhancing its ability to understand and process the language for a variety of applications.
This project features a GPT (Generative Pre-trained Transformer) language model with 124 million parameters that has been fine-tuned for Python code generation. Unlike larger models such as full-scale GPT-2 or GPT-3, this is a smaller-scale model designed primarily for testing and experimental purposes. It was trained on a small corpus of 25,000 Python code samples.
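The 124-million-parameter figure matches the standard GPT-2-small configuration, and it can be checked with a short back-of-the-envelope calculation. The sketch below assumes the usual GPT-2-small hyperparameters (50,257-token vocabulary, 1,024-token context, 768-dim embeddings, 12 layers) and tied input/output embeddings; the function name is mine.

```python
def gpt2_small_params(vocab=50257, ctx=1024, d=768, layers=12):
    """Parameter count for a GPT-2-small-style transformer
    (tied embeddings, learned position embeddings, biases included)."""
    emb = vocab * d + ctx * d                  # token + position embeddings
    attn = d * 3 * d + 3 * d + d * d + d      # qkv projection + output projection
    mlp = d * 4 * d + 4 * d + 4 * d * d + d   # two feed-forward linear layers
    ln = 2 * d                                 # layernorm scale + shift
    block = attn + mlp + 2 * ln                # one transformer block
    return emb + layers * block + ln           # + final layernorm

total = gpt2_small_params()   # ≈ 124.4 million
```

Most of the budget sits in the token embeddings (~38.6M) and the twelve transformer blocks (~85M), which is why shrinking the layer count is the quickest way to get an even smaller experimental model.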
This model is a fine-tuned version of Flux.1-dev, optimized for generating collage-style images using LoRA (Low-Rank Adaptation).
A CSS code generator that produces the trendy glassmorphism UI design style. Glassmorphism is a design trend that combines transparent elements, vibrant colors, and blurred backgrounds to create a visually appealing and modern user interface.
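The core of such a generator can be sketched in a few lines: emit a rule combining a translucent background, a backdrop blur, and a light border. This is a minimal illustration with hypothetical parameter names, not the project's actual code.

```python
def glass_css(blur_px=10, opacity=0.2, radius_px=16):
    """Return a CSS rule for a glassmorphism card: translucent
    background, backdrop blur, rounded corners, subtle light border."""
    return (
        ".glass {\n"
        f"  background: rgba(255, 255, 255, {opacity});\n"
        f"  backdrop-filter: blur({blur_px}px);\n"
        f"  -webkit-backdrop-filter: blur({blur_px}px);\n"  # Safari prefix
        f"  border-radius: {radius_px}px;\n"
        "  border: 1px solid rgba(255, 255, 255, 0.3);\n"
        "}\n"
    )

css = glass_css()
```

The `backdrop-filter: blur(...)` line is what blurs whatever sits behind the element, and the low-opacity white background is what gives the frosted-glass look.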
This is a small platformer game created as a learning project. Play as a brave knight on a mission to collect four magical fruits from different worlds to save your king. Dodge enemies like slimes, collect coins, and explore vibrant levels.
I participated in the International College Jam as a developer, creating a game with a team of three: myself, one artist/writer, and another artist. It was a week-long game jam in which students from various colleges and universities took part. The game is a visual novel based on a cozy cyberpunk theme. We designed and developed the entire game within one week, collaborating closely to align gameplay, visuals, and narrative under tight deadlines.
This game was created in 11 hours for the Micro Game Jam using the Unity game engine. It’s a simple arcade-style game where you dodge shooting stars and collect fire to stay airborne.
I have spent two years learning about machine learning and AI independently. In 2025, I published my first research paper, on a voice conversion model.
Bachelor of Computer Applications at LCB College under Gauhati University.