About Me
I am a Graduate Research Assistant at the University of Wyoming, where I work with Dr. Sheshappanavar, and I hold the Microsoft Certified: Fabric Data Engineer Associate certification. My expertise lies in Vision-Language Models (VLMs), Large Language Models (LLMs), and building scalable data platforms. With a strong background in Azure Databricks, PySpark, and Airflow, I am passionate about bridging the gap between complex data engineering and advanced AI research.

Education
University of Wyoming, Laramie, United States (August 2025 - August 2027)
- Degree: Master’s degree, Computer Science
Kathmandu Engineering College, Kathmandu, Nepal (2019 - 2023)
- Degree: Bachelor’s degree, Computer Engineering

Certifications
- Microsoft Certified: Fabric Data Engineer Associate
- Astronomer Certification: Apache Airflow Fundamentals
- Coursera: Machine Learning Specialization
- Coursera: Supervised Machine Learning: Regression and Classification

Professional Experience
University of Wyoming
Graduate Research Assistant (August 2025 - Present)
- Conducting research on Vision-Language Models (VLMs), focusing on aligning visual and textual representations for improved multimodal understanding.
- Developing and evaluating deep learning architectures for computer vision tasks, including object recognition and image-text retrieval.
Fusemachines (2 Years)
Data Engineer Consultant at Wheels Up (April 2025 - July 2025)
- Migrated complex legacy Oracle SQL workflows to Azure Databricks, modernizing the data platform and reducing query processing time.
- Engineered scalable, highly available data ingestion pipelines in PySpark that integrate data from diverse sources.
- Automated infrastructure provisioning with Terraform, streamlining deployment and reducing manual effort.
- Implemented PySpark optimizations (partitioning, caching, broadcast joins) that reduced job execution time by ~35%.
Data Engineer - Internal Analytics (August 2024 - March 2025)
- Built a data ingestion framework using Airbyte that loads data from multiple sources into Amazon S3.
- Implemented a medallion architecture (bronze, silver, gold) to standardize and clean datasets.
- Automated end-to-end data workflows with Apache Airflow and deployed monitoring systems.
- Developed business intelligence dashboards in Apache Superset to visualize KPIs like project efficiency and revenue trends.
Data Engineer Consultant at Broadway Licensing Global (January 2024 - August 2024)
- Designed an end-to-end ETL pipeline for structured and unstructured data, including web scraping and API integration.
- Integrated Elasticsearch to power scalable search capabilities, improving retrieval speed by 50%.
- Reduced manual data collection effort by 80% through automated extraction from websites, PDFs, and APIs.
Data Engineer Associate Trainee (August 2023 - January 2024)
- Built and orchestrated ETL pipelines with Apache Spark and Apache Airflow.
- Explored AWS services including Lambda for serverless processing and Glue for ETL automation.
Yirifi.ai
Data Engineer (June 2024 - July 2025)
- Leveraged OpenAI’s GPT models via API to generate synthetic datasets for testing and analytical purposes.
- Developed and maintained scalable data workflows in n8n, handling extraction and transformation logic.
- Utilized MongoDB to store, structure, and retrieve raw and processed data.
- Worked in cloud and containerized environments using Docker and AWS.

Projects
Face Recognition-Based Attendance System
- Developed a desktop application for face-based attendance management.
- Integrated Haar cascade classifiers for face detection and the LBPH (Local Binary Patterns Histograms) algorithm for face recognition.
Yoga Asana Classification and Feedback System
- Designed a real-time system to classify and correct yoga poses, reducing risks from incorrect posture.
- Combined the MediaPipe Pose model with LSTM networks for accurate pose identification.
- Developed a feedback module to provide real-time suggestions for posture improvement.

Technical Skills
- Cloud & Infrastructure: Microsoft Fabric, Azure Databricks, AWS (S3, Glue, Lambda), Terraform, Docker
- Data Engineering: PySpark, Apache Airflow, Airbyte, Kafka, Elasticsearch, n8n
- AI & Machine Learning: Vision-Language Models (VLMs), LLMs, Unsupervised Learning, Recommenders
- Databases & Storage: MongoDB, PostgreSQL, Oracle SQL, Azure Blob Storage
- Visualization: Apache Superset, Power BI

I am always open to new opportunities and collaborations. Feel free to reach out via email or LinkedIn!