Background
Welcome to My World

Sahaj Gyawali

Aspiring Data Scientist
Focused on transforming data into meaningful insights
Kathmandu, Nepal.


About Statement

I am a CSIT student at Tribhuvan University with a strong interest in Data Science and Artificial Intelligence. I build solutions where mathematics, data, and algorithms come together to solve complex real-world problems.

Academic Path

B.Sc. CSIT // Ongoing

Bhaktapur Multiple Campus

Tribhuvan University

Higher Secondary

Kathmandu Model College

Major: Physics, Chemistry, Math, Bio

Beyond the Code

Cricket
Photography
Problem Solving
Continuous Learning

Core Philosophy

Applied Statistics & Linear Algebra
End-to-End Project Architecture
Deterministic Problem Solving
Scalable Industry Solutions

Experience

72-Hour Rapid Development Cycle

AI Developer

Ambition College HackFest 2025 // Kathmandu, NP

Selected as a core team member to architect and deploy a complete tech solution within a 3-day, 2-night intensive hackathon environment.

  • Collaborated in a cross-functional team of four, leading the integration between AI models and the frontend/backend architecture.
  • Facilitated real-time brainstorming and rapid prototyping to pivot ideas into functional features under strict deadlines.
  • Implemented robust version control and coordination using GitHub, ensuring seamless deployment during late-night coding sessions.
  • Engineered a production-ready prototype that demonstrated practical problem-solving and effective feature scaling.

Competencies

Teamwork / AI Integration / Rapid Prototyping / Frontend / Backend / GitHub / Communication
01

COVID-19 DATA ANALYSIS & VISUALIZATION

DATA ANALYSIS / EDA / VISUALIZATION

2024

Implementation Details

Performed exploratory data analysis (EDA) on COVID-19 datasets to identify global and country-level trends.

Visualized confirmed cases, deaths, recoveries, and active cases worldwide and for Nepal specifically, using line plots and bar charts.

Created heatmaps to compare countries and WHO regions, highlighting concentrations of confirmed cases, recovery rates, and active cases.

Implemented sorting functions to identify the top 5 countries by confirmed cases, recoveries, active cases, and deaths.
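The top-5 ranking described above can be sketched as follows. This is a minimal, standard-library illustration, not the project's actual code: the `records` list, its field names, and every figure in it are made-up placeholders, and `top_countries` is a hypothetical helper name.

```python
import heapq

# Hypothetical per-country totals; countries and numbers are placeholders.
records = [
    {"country": "A", "confirmed": 500, "deaths": 20, "recovered": 400, "active": 80},
    {"country": "B", "confirmed": 1200, "deaths": 60, "recovered": 900, "active": 240},
    {"country": "C", "confirmed": 300, "deaths": 5, "recovered": 250, "active": 45},
    {"country": "D", "confirmed": 950, "deaths": 40, "recovered": 700, "active": 210},
    {"country": "E", "confirmed": 100, "deaths": 2, "recovered": 90, "active": 8},
    {"country": "F", "confirmed": 780, "deaths": 30, "recovered": 600, "active": 150},
    {"country": "G", "confirmed": 640, "deaths": 25, "recovered": 500, "active": 115},
]


def top_countries(rows, metric, n=5):
    """Return the n (country, value) pairs with the largest value for `metric`."""
    best = heapq.nlargest(n, rows, key=lambda row: row[metric])
    return [(row["country"], row[metric]) for row in best]


for metric in ("confirmed", "recovered", "active", "deaths"):
    print(metric, top_countries(records, metric))
```

With the data in a Pandas DataFrame, `DataFrame.nlargest(5, "confirmed")` gives the same ranking in one call.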

Focused on clear, insightful visualizations using Matplotlib and Seaborn for actionable interpretation of pandemic trends.

Tech Stack

Python / Pandas / NumPy / Matplotlib / Seaborn
02

HOUSE PRICE PREDICTION

AI / ML

2025

Implementation Details

Built a prediction model using RandomForestRegressor to predict house prices with high accuracy.

Performed data preprocessing: handling missing values, encoding categorical features, and feature scaling.
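The three preprocessing steps named above (missing values, categorical encoding, feature scaling) can be sketched roughly as below. This is a dependency-free illustration under stated assumptions: the column names, the sample rows, and the choice of median imputation, one-hot encoding, and z-score scaling are all hypothetical, since the project does not state which strategies it used.

```python
from statistics import mean, median, pstdev

# Hypothetical listings; column names and values are placeholders.
rows = [
    {"area": 1200.0, "bedrooms": 3, "city": "Kathmandu"},
    {"area": None, "bedrooms": 2, "city": "Bhaktapur"},
    {"area": 1800.0, "bedrooms": 4, "city": "Kathmandu"},
    {"area": 1500.0, "bedrooms": 3, "city": "Lalitpur"},
]


def impute_median(rows, column):
    """Replace missing values in `column` with the median of the known values."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = median(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


def standardize(rows, column):
    """Scale `column` to zero mean and unit (population) standard deviation."""
    values = [r[column] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    return [{**r, column: (r[column] - mu) / sigma if sigma else 0.0} for r in rows]


clean = standardize(one_hot(impute_median(rows, "area"), "city"), "area")
```

In a real training pipeline the imputation and scaling statistics would be computed on the training split only and then reused on the test split.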

Used Pandas and NumPy for data manipulation, and Matplotlib and Seaborn for feature visualization.

Used Scikit-learn for model training, hyperparameter tuning, and performance evaluation.

Tech Stack

Random Forest / Scikit-learn / NumPy / Pandas / FastAPI / React
03

NEPALI MOVIE RECOMMENDATION

NLP / WEB

2025

Implementation Details

Developed a full-stack recommendation engine using content-based filtering based on cosine similarity.

Engineered a custom web scraper using BeautifulSoup to aggregate movie metadata and plot synopses.

Built an interactive UI using Streamlit to provide real-time recommendations and live movie details.

Implemented NLP techniques for text vectorization using Scikit-learn to measure feature similarities.
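Content-based filtering with cosine similarity, as described above, can be illustrated with a small standard-library sketch. The project itself used Scikit-learn for vectorization, and which vectorizer it used is not stated; here a plain bag-of-words count stands in for it. The `movies` dictionary, its titles, and its synopses are hypothetical placeholders.

```python
import math
import re
from collections import Counter

# Hypothetical catalog: title -> plot synopsis (placeholders, not scraped data).
movies = {
    "Movie A": "a young farmer fights for his village land",
    "Movie B": "a farmer and his family defend their village",
    "Movie C": "two students fall in love in the city",
}


def vectorize(text):
    """Turn a synopsis into a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(title, catalog, k=2):
    """Rank the other titles by synopsis similarity to `title`."""
    query = vectorize(catalog[title])
    scores = [
        (other, cosine(query, vectorize(text)))
        for other, text in catalog.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


print(recommend("Movie A", movies))
```

Raw counts give common words as much weight as distinctive ones, so a TF-IDF weighting is a common refinement of this approach.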

Tech Stack

Python / Scikit-learn / Cosine Similarity / BeautifulSoup / Streamlit / Pandas
04

LLM-POWERED LINKEDIN CONTENT GENERATOR

NLP / LLM / WEB

2025

Implementation Details

Developed a sophisticated LLM-powered content tool that bridges raw data analysis and creative LinkedIn post generation.

Implemented a 'Data-to-Draft' pipeline using preprocess.py for metadata extraction, hashtag unification, and style alignment.

Built a few-shot learning engine (few_shot.py) to retrieve contextually relevant examples from processed_posts.json for dynamic prompt engineering.
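The few-shot retrieval step might look roughly like the sketch below: filter stored posts by the requested tag, language, and length, then fold the matches into the prompt. The record schema (`text`, `tags`, `language`, `length`), the sample posts, and the function names are assumptions made for illustration; the actual contents of few_shot.py and processed_posts.json are not shown in this portfolio.

```python
# Hypothetical stand-in for records loaded from processed_posts.json.
posts = [
    {"text": "Post about landing a first data job.", "tags": ["Career"],
     "language": "English", "length": "Short"},
    {"text": "Long reflection on learning statistics.", "tags": ["Learning"],
     "language": "English", "length": "Long"},
    {"text": "Quick tip on interview prep.", "tags": ["Career", "Interview"],
     "language": "English", "length": "Short"},
]


def select_examples(posts, tag, language, length, limit=2):
    """Return up to `limit` posts matching the requested tag, language, and length."""
    matches = [
        p for p in posts
        if tag in p["tags"] and p["language"] == language and p["length"] == length
    ]
    return matches[:limit]


def build_prompt(topic, language, length, examples):
    """Assemble a few-shot prompt; falls back to a zero-shot prompt with no examples."""
    lines = [f"Write a {length.lower()} LinkedIn post in {language} about: {topic}."]
    if examples:
        lines.append("Match the writing style of these examples:")
        for i, example in enumerate(examples, start=1):
            lines.append(f"Example {i}: {example['text']}")
    return "\n".join(lines)


examples = select_examples(posts, "Career", "English", "Short")
prompt = build_prompt("Career", "English", "Short", examples)
print(prompt)
```

The resulting string would then be passed to the LLM call; that step is omitted here because it depends on API credentials.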

Integrated LangChain with Groq's llama-3.3-70b-versatile model for fast, style-aligned content generation.

Delivered an intuitive Streamlit dashboard allowing toggling of post length, language (English/Nepali), and specific tags.

Planned enhancements include Engagement Prediction (Viral Score), Semantic Style Matching with vector embeddings, and Automated Image/Graphic pairing for complete content automation.

Tech Stack

Python / LangChain / Groq Llama 3.3 / Streamlit / Pandas
05

TELCO CUSTOMER CHURN PREDICTION

ML / WEB / DATA SCIENCE

2026

Implementation Details

Developed a production-ready churn prediction system bridging research notebooks to an interactive UI.

Implemented a clean separation of concerns: Pydantic for data validation, Model Service for business logic, and FastAPI + Streamlit for delivery.

Loaded and served the trained ANN model (.keras) and scaler (.pkl) through a Singleton pattern, so both artifacts are loaded once and reused for every inference request.
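The Singleton idea can be sketched as below. This is an assumption-laden illustration, not the project's service: `ModelService`, `load_count`, and the lambda stand-ins for the model and scaler are hypothetical, used so the example runs without TensorFlow or Joblib installed.

```python
import threading


class ModelService:
    """Holds the model and scaler; only one instance is ever created."""

    _instance = None
    _lock = threading.Lock()
    load_count = 0  # counts how many times artifacts were actually loaded

    def __new__(cls):
        # Double-checked locking: cheap check first, then a guarded check.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._load_artifacts()
                    cls._instance = instance
        return cls._instance

    def _load_artifacts(self):
        # Placeholder loaders (assumption). A real service would read the
        # .keras model and .pkl scaler from disk at this point.
        type(self).load_count += 1
        self.scaler = lambda features: [x / 100.0 for x in features]
        self.model = lambda scaled: min(1.0, sum(scaled) / len(scaled))

    def predict(self, features):
        """Scale the features and return a churn probability in [0, 1]."""
        return self.model(self.scaler(features))


first = ModelService()
second = ModelService()
print(first is second, ModelService.load_count)
```

Every request handler that calls `ModelService()` gets the same object back, so the expensive load happens once per process instead of once per request.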

Enhanced UX with churn probability, risk levels, and retention recommendations displayed on a real-time Streamlit dashboard.
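Turning a churn probability into a risk level and a retention recommendation could be done along these lines. The thresholds (0.4 and 0.7), the level names, and the recommendation texts are hypothetical placeholders chosen for illustration; the dashboard's actual cut-offs and wording are not stated here.

```python
# Hypothetical recommendation text per risk level (placeholders).
RECOMMENDATIONS = {
    "High": "Offer a retention discount and schedule a support call.",
    "Medium": "Send a targeted engagement offer.",
    "Low": "No action needed; continue regular communication.",
}


def risk_level(probability):
    """Map a churn probability to a coarse risk level (assumed thresholds)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.7:
        return "High"
    if probability >= 0.4:
        return "Medium"
    return "Low"


def summarize(probability):
    """Bundle the values a dashboard card would display."""
    level = risk_level(probability)
    return {
        "churn_probability": round(probability, 2),
        "risk_level": level,
        "recommendation": RECOMMENDATIONS[level],
    }


print(summarize(0.82))
```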

Planned enhancements include SHAP/LIME integration for model explainability and GeoJSON-based province-level churn visualization.

Tech Stack

Python / TensorFlow / Scikit-learn / Pandas / FastAPI / Streamlit / Joblib / GeoJSON

Technical Arsenal

[Core Languages] Database management & logic
Python / MySQL / C++
[Data Science & Viz] End-to-end data pipelines & storytelling
Pandas / NumPy / Power BI / Matplotlib / Seaborn
[AI / Machine Learning] Predictive modeling & neural architectures
Scikit-Learn / TensorFlow / Keras / PyTorch
[Engineering & Web] Scalable deployment & interfaces
FastAPI / React / Next.js / Docker
[Dev Environment] Version control & optimized workflow
Git / GitHub / Linux / Jupyter / VS Code
Problem Solving
Critical Thinking
Software Optimization
Communication

Contact

Server Ready

Protocol: SMTP_SECURE

Direct Address

sahajgnawali@gmail.com