Open to Work · United States & Remote

Hi, I'm
Rishi.

Data Analyst & Data Scientist

ASU M.S. graduate with a 4.0 GPA, turning complex data into clear decisions using Python, SQL, AWS MLOps pipelines, and production-grade NLP systems.

📍 Based in Phoenix, AZ · Open to relocate anywhere in the US

# Rishi's stack
 
import pandas as pd
import torch, xgboost
from transformers import AutoModel  # loads BioBERT checkpoints
 
# AWS MLOps pipeline
S3 → EMR → SageMaker
→ Lambda → SNS
 
-- Analytical SQL
SELECT insight, ROW_NUMBER()
OVER (PARTITION BY domain) AS rank;
4.0 GPA
AWS MLOps
6+ Projects

Technical Expertise

Skills & Tools

A hands-on toolkit built across analytics, cloud engineering, AI/ML, and business intelligence — from wrangling raw data to deploying production models.

🐍
Python & Data Science
End-to-end ML workflows, data wrangling, model training, and deep learning applications.
Pandas · NumPy · Scikit-learn · PyTorch · XGBoost · Matplotlib · Seaborn
☁️
AWS & MLOps
Designing and deploying cloud-native data and ML pipelines for real-time inference and alerting.
SageMaker · Lambda · S3 · EMR · Athena · CloudWatch · SNS
🗃️
SQL & Databases
Advanced schema design, complex querying, stored procedures, triggers, and analytical reporting.
SQL · MongoDB · CTEs · Window Functions · Stored Procedures · 3NF
📊
Visualization & BI
Translating data into actionable dashboards and stakeholder-ready visual narratives.
Tableau · Power BI · Excel · Matplotlib · Seaborn
🧠
AI / ML & NLP
Domain-specific NLP with transfer learning, RAG pipelines, and production model deployment.
BioBERT · RAG · Transfer Learning · NLP · XGBoost · Feature Eng.
🌐
Web & Dev Tools
Full-stack development and team tooling to support analytics products and data applications.
JavaScript · Angular · Node.js · Git · Postman · R · C++

Portfolio

Featured Projects

Six end-to-end projects spanning biomedical NLP, AWS cloud pipelines, database engineering, and interactive dashboards.

🧬
NLP · BioBERT · PyTorch

PubMed NLP Hybrid BioBERT Pipeline

Fine-tuned BioBERT v1.1 on a Tesla T4 GPU and combined it with a Regex augmentation layer to extract sugar-sweetened beverage (SSB) obesity risk thresholds from 167 PubMed abstracts, lifting recall from ~0% to 77% and identifying 28 high-confidence dose-response thresholds. Delivered structured CSV output (DOSAGE, INTERVENTION, OUTCOME per PMID), cutting literature review time by weeks.

BioBERT · PyTorch · Regex NER · PubMed
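
A minimal sketch of how a regex augmentation layer can backstop a model's entity predictions. This is not the project's actual code: the pattern, the unit list, and the helper names `regex_dosages` and `hybrid_dosages` are hypothetical, and the model's predicted spans are passed in as plain strings.

```python
import re
from typing import List

# Hypothetical pattern: a number, a serving unit, then "per" or "/" and a period.
DOSAGE_PATTERN = re.compile(
    r"(?P<amount>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>servings?|cans?|ml|oz)\s*"
    r"(?:per|/)\s*"
    r"(?P<period>day|week)",
    re.IGNORECASE,
)


def regex_dosages(text: str) -> List[str]:
    """Return every dosage-like mention the pattern finds in one abstract."""
    return [m.group(0) for m in DOSAGE_PATTERN.finditer(text)]


def hybrid_dosages(text: str, model_spans: List[str]) -> List[str]:
    """Union of model spans and regex hits, de-duplicated case-insensitively."""
    seen = set()
    merged = []
    for span in list(model_spans) + regex_dosages(text):
        key = span.lower()
        if key not in seen:
            seen.add(key)
            merged.append(span)
    return merged


# Made-up sentence for illustration only.
sample = "Intake above 2 servings per day was linked to obesity; 500 ml/day raised risk."
merged = hybrid_dosages(sample, model_spans=["2 servings per day"])
```

Here the regex recovers a mention the model missed, which is one way a rule layer can raise recall when the fine-tuned model alone under-predicts.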
❤️
AWS · XGBoost · MLOps

Intelligent Heart Attack Prediction System

Built a 5-service AWS MLOps pipeline (S3 → EMR/PySpark → SageMaker → Lambda → SNS) ingesting simulated IoT wearable data for 20 patients, training an AUC-optimized XGBoost classifier, and delivering automated cardiac risk alerts. Engineered a schema-alignment safeguard preventing silent column-mismatch errors between training and inference stages.

SageMaker · EMR · Lambda · XGBoost · PySpark
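
One way a schema-alignment safeguard like the one described above might look, as a sketch only. The column names and the `align_features` helper are assumptions for illustration, not the project's real schema or code. The idea is to fail loudly when an inference payload does not match the training column list, and to reorder values into training order before scoring.

```python
from typing import Dict, List

# Hypothetical training-time feature order; the real pipeline's columns are not given.
TRAINING_COLUMNS = ["heart_rate", "systolic_bp", "spo2"]


def align_features(row: Dict[str, float], training_columns: List[str]) -> List[float]:
    """Validate an inference row against the training schema and reorder it.

    Raises ValueError rather than silently scoring a misaligned row.
    """
    missing = [c for c in training_columns if c not in row]
    unexpected = [c for c in row if c not in training_columns]
    if missing or unexpected:
        raise ValueError(
            f"Schema mismatch: missing={missing}, unexpected={unexpected}"
        )
    return [float(row[c]) for c in training_columns]


# Keys arrive in a different order than training; values are reordered, not trusted.
payload = {"spo2": 97, "heart_rate": 88, "systolic_bp": 130}
features = align_features(payload, TRAINING_COLUMNS)
```

A check of this kind would typically sit at the start of the inference handler, before the row is serialized for the model endpoint.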
🏥
SQL · Database Engineering

Healthcare Database Management System

Designed a fully normalized 3NF relational schema across 7 tables (patients, doctors, appointments, prescriptions, health readings, medications, alerts) with zero referential integrity violations across 100+ records. Built an automated audit trigger on Medications capturing 100% of DML operations with timestamps, plus 3 analytical views, stored procedures, a scalar UDF, and a cursor-based real-time glucose alert workflow.

SQL · 3NF Schema · Triggers · Stored Procs · UDF
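
The audit-trigger idea can be sketched with Python's built-in `sqlite3` module. SQLite is swapped in here purely so the example is self-contained; the project's actual database engine, table definitions, and trigger code are not shown in this page, and every table, column, and trigger name below is an assumption. SQLite needs one trigger per DML event, so three are defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medications (
    medication_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    dosage        TEXT
);

-- Every change to medications lands here with a timestamp.
CREATE TABLE medications_audit (
    audit_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    medication_id INTEGER,
    operation     TEXT NOT NULL,
    changed_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER trg_med_insert AFTER INSERT ON medications
BEGIN
    INSERT INTO medications_audit (medication_id, operation)
    VALUES (NEW.medication_id, 'INSERT');
END;

CREATE TRIGGER trg_med_update AFTER UPDATE ON medications
BEGIN
    INSERT INTO medications_audit (medication_id, operation)
    VALUES (NEW.medication_id, 'UPDATE');
END;

CREATE TRIGGER trg_med_delete AFTER DELETE ON medications
BEGIN
    INSERT INTO medications_audit (medication_id, operation)
    VALUES (OLD.medication_id, 'DELETE');
END;
""")

# Made-up sample row to exercise all three triggers.
conn.execute("INSERT INTO medications VALUES (1, 'Metformin', '500 mg')")
conn.execute("UPDATE medications SET dosage = '850 mg' WHERE medication_id = 1")
conn.execute("DELETE FROM medications WHERE medication_id = 1")

audit_ops = [
    row[0]
    for row in conn.execute(
        "SELECT operation FROM medications_audit ORDER BY audit_id"
    )
]
```

After the three statements run, `audit_ops` holds one entry per DML operation in the order they happened.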
📉
SQL · Tableau

End-to-End Sales Data Analysis

Queried relational e-commerce sales data using advanced SQL (joins, CTEs, window functions) to extract and transform raw transactions. Built an interactive Tableau dashboard tracking sales trends, regional performance, and product profitability, which surfaced a potential 10% increase in cross-selling opportunities.

SQL · Tableau · CTEs · Window Functions
🎬
AWS SageMaker · XGBoost

Netflix Ratings & Viewing Trends Prediction

Engineered an end-to-end SageMaker MLOps pipeline across 6,187 records with 62 features (multi-hot genre encoding, synthetic watch_ratio signals). Deployed dual XGBoost models to a live ml.m5.large endpoint. Used CloudWatch monitoring to diagnose a false-positive bias (Specificity: 22%) and documented scale_pos_weight and threshold tuning strategies to correct it.

SageMaker · CloudWatch · XGBoost · Feature Eng.
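
For readers unfamiliar with the two tuning levers named above, here is a small pure-Python sketch. It is illustrative only: the labels and probabilities are made up, the helper names are hypothetical, and none of it is the project's code. It shows how specificity is computed, how raising the decision threshold trades false positives for true negatives, and the negative-to-positive ratio that the XGBoost documentation suggests as a typical starting value for `scale_pos_weight`.

```python
from typing import List


def specificity(y_true: List[int], y_pred: List[int]) -> float:
    """True-negative rate: TN / (TN + FP)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp) if (tn + fp) else 0.0


def apply_threshold(probs: List[float], threshold: float) -> List[int]:
    """Turn predicted probabilities into 0/1 labels at a chosen cutoff."""
    return [1 if p >= threshold else 0 for p in probs]


def suggested_scale_pos_weight(y_true: List[int]) -> float:
    """Ratio of negative to positive examples, a common starting point."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return negatives / positives


# Made-up toy data: four negatives, two positives.
y_true = [0, 0, 0, 0, 1, 1]
probs = [0.60, 0.55, 0.70, 0.20, 0.90, 0.80]

spec_default = specificity(y_true, apply_threshold(probs, 0.50))
spec_tuned = specificity(y_true, apply_threshold(probs, 0.75))
weight = suggested_scale_pos_weight(y_true)
```

On this toy data the default cutoff flags three of four negatives as positive, while the higher cutoff clears them, which mirrors the kind of false-positive bias the project diagnosed.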
🏏
Python · EDA · Visualization

IPL Cricket Analytics Dashboard

Performed exploratory data analysis on 10+ seasons of IPL match data using Python (Pandas, Matplotlib, Seaborn) to uncover player performance metrics, team strategy patterns, and match outcome drivers through static visualizations.

Python · Pandas · Matplotlib · Seaborn · EDA

Background

Experience & Education

Aug 2024 – May 2026
Tempe, AZ
M.S. Information Technology · GPA 4.0/4.0
Arizona State University
  • Coursework: Big Data Analytics, Data Visualization & Reporting, Data-Driven Decision Making, Natural Language Processing for IT, Statistical Foundations for IT, Advanced Database Management Systems, Advanced System Architecture, IT Project Management, Advanced Information Systems Security
  • Built production-quality projects in AWS, BioBERT NLP, SQL database engineering, and XGBoost MLOps throughout the program
May 2023 – Jul 2024
Shiv Infotech
Data Analyst
  • Built Tableau and Power BI dashboards from scratch to centralize KPI tracking for 10+ stakeholders, which cut down on fragmented spreadsheet reporting and helped leadership spot project bottlenecks about 3x faster than before
  • Re-engineered ELT data extraction workflows in SQL across 3 to 5 concurrent client projects, which shaved 2 days off reporting turnaround and gave the team enough breathing room to take on extra accounts in the same sprint
  • Wrote Python scripts to automate data-cleaning on datasets between 50K and 100K rows, saving the team upwards of 15 hours a month that used to go toward repetitive cleanup work
  • Built and maintained ETL pipelines in Python and Snowflake to keep data accurate, consistent, and ready for both scheduled batch runs and real-time reporting needs
Jan 2023 – Apr 2023
IT Path Solutions
Software Developer Intern
  • Engineered and deployed a full-stack client management feature using Angular and Node.js; built Tableau dashboards to visualize user interaction data, contributing to a 15% improvement in task completion efficiency
  • Managed development lifecycle with Agile/JIRA, translated stakeholder requirements into technical specs, and designed Power BI reports to monitor sprint metrics — enabling on-time delivery of all sprint goals
Aug 2022 – Dec 2022
MasterKoder
Assistant Coder
  • Boosted student engagement by 50% by designing an interactive quiz module and using SQL-backed performance metrics to adaptively tune quiz difficulty
  • Collaborated with developers to debug code and deploy website features via Git, improving site stability and learner experience

About Me

Data engineering
meets applied AI.

I'm a recent Arizona State University M.S. graduate in Information Technology with a 4.0 GPA, based in Phoenix, AZ and open to relocating anywhere in the US. My work lives at the intersection of technical rigor and practical impact — I care about building systems that not only analyze data correctly, but deliver answers that drive real decisions.

From fine-tuning BioBERT on biomedical literature to orchestrating end-to-end AWS pipelines for cardiac risk prediction, I gravitate toward projects where data engineering, machine learning, and domain knowledge converge.

I'm actively seeking entry-level roles in data analytics, data science, or analytics engineering — in Phoenix, remotely, or by relocating anywhere in the US.

ASU M.S. · GPA 4.0/4.0
Information Technology graduate with deep project work in analytics, NLP, cloud systems, and database engineering.
AWS-Experienced MLOps
SageMaker, EMR, Lambda, Athena, CloudWatch, SNS — built and deployed live ML endpoints on AWS.
Open to Work · Phoenix / Remote
Actively applying for data analyst, data scientist, and analytics engineering roles starting now.
Applied NLP Practitioner
BioBERT fine-tuning, Regex NER, RAG-style pipelines — real NLP beyond tutorials.

Let's Connect

Open to
opportunities.

I'm actively looking for entry-level data analyst, data scientist, or analytics engineering roles in Phoenix, fully remote, or anywhere in the US with relocation. If you're working on data-driven products, cloud pipelines, or applied AI, let's talk.

👋
Ready to solve
hard data problems.

Whether it's an AWS MLOps pipeline, a BioBERT NLP feature, a Tableau dashboard, or a complex SQL system — I bring strong technical depth, fast learning, and a focus on measurable business impact.

Send an Email