California, USA

Hi, I'm Darshil. I build ML systems, data pipelines, cloud services, and things that scale.

I'm an SDE II on Amazon Ads, where I build large-scale ad-tech systems — brand-safety integrations, high-throughput data pipelines, and microservices that run at Prime Day scale. My background spans machine learning, data science, and cloud computing, and I like turning messy, large-scale data into systems that are fast, reliable, and genuinely useful.


About

Engineer at the intersection of data, ML & the cloud.

I'm an SDE II on Amazon Ads, building large-scale ad-tech systems — brand-safety integrations on Amazon DSP, Spark pipelines processing terabytes a day, and microservices that hold up at Prime Day scale. My background spans machine learning, data science, and cloud computing, and I care about building software that's fast, reliable, and actually used.

Outside of shipping code, I like learning new things and trading ideas with people who are just as curious. Based in California, USA.

Experience & Education

Where I've worked & studied.

Work

Jun 2022 — Present
Software Development Engineer II

Amazon Ads
- Integrated industry-leading brand-safety products (DoubleVerify, IAS) into Amazon DSP; designed cache-miss handling and offline cache population in the Ad Exchange server, sustaining 300K+ QPS during Prime Day.
- Architected Spark pipelines on Amazon EMR orchestrated by AWS Step Functions, processing 3TB+ of operational data daily and cutting complex-log query time ~30% to speed up debugging and incident response.
- Designed an automated publisher-onboarding system for Amazon DSP that reduced onboarding from 6+ weeks to under 2 hours — a 99%+ reduction in processing time.
- Led 2 engineers to build a search-query ad-blocking system with a pipeline automating 60K+ keyword annotations, blocking ~2% of ad traffic and increasing advertiser brand trust.
- Led 2 engineers to design a high-volume microservice collecting user-action events across Amazon Advertising (with PII compliance) plus its visualization UI, driving a year-over-year reduction of 5,000+ customer support contacts.
- Mentored 1 intern (converted to full-time) and 2 SDEs through technical ramp-up and growth.
Jul 2019 — Jul 2022
Senior Software Engineer

Oracle
- Designed and shipped features for Oracle's CX Cloud CRM platform, adopting modern conversational-UI patterns across a full-stack JavaScript and microservice codebase for enterprise customers.
- Architected a central navigation framework and event-based data-loading system for Oracle CX Cloud, simplifying complex workflows and improving client-side state management for 100K+ daily users.
- Built a reusable UI component library that let internal teams and external customers extend the platform with minimal custom development, reducing customization effort for enterprise implementations.
- Drove accessibility initiatives using assistive technologies (JAWS) to make the platform usable across a wide range of abilities.
Jul 2018 — Aug 2018
Software Engineer Intern

Cox Automotive Inc.
- Built ML models (Logistic Regression, XGBoost) to predict dealer contract probability at 94.5% accuracy, helping dealers prioritize their time, and used a correlation matrix to identify the most influential parameters for feature engineering.
- Refactored the core product application to React-based components, improving code modularity and maintainability.
Aug 2015 — Jul 2017
Systems Engineer

Tata Consultancy Services
- Delivered software quality and automation across client engagements for Dell EMC and the Kenya Revenue Authority. At Dell EMC, owned full test coverage for the SYMMETRIX (DMX/VMAX) remote-replication disaster-recovery product and was among the first to introduce Python automation alongside Linux/Windows scripts.
- For a Hibernate/Spring Java web app, built Selenium test automation and mapped scenarios to business requirements for complete coverage, reporting bi-weekly release status and risk plans to clients.

Education

2017 — 2019
M.S. in Computer Science

New York University

GPA 3.7/4.0 · Machine Learning, AI, Cloud Computing, Big Data, Algorithms
2011 — 2015
B.E. in Computer Engineering

Gujarat Technological University

GPA 8.33/10 · Ranked 3rd in class · Distributed & Parallel Computing, Compilers

Academic Projects

Earlier ML, data & cloud work from grad school.

Cloud · Computer Vision

Shopper Sentiment Analysis

Actionable retail insights derived from live in-store video streams.

Problem: Brick-and-mortar retailers have rich shopper behavior playing out in front of their cameras every day, but almost none of it is captured as data. Store owners can't answer basic questions: which aisles draw the most traffic, who their shoppers are, or how customers actually feel while browsing.
Approach: Built a pipeline that ingests live store video and applies managed ML vision services to detect foot-traffic hotspots, segment shoppers by demographic (approximate age, gender), and infer sentiment from facial expression — turning raw footage into a queryable stream of behavioral signals.
Impact:
- Surfaced high- vs. low-traffic zones to inform product placement
- Segmented store traffic by demographic in near real time

AWS Rekognition
AWS Lambda
S3
Python
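
A minimal sketch of the aggregation step, assuming each sampled frame is sent to Rekognition's DetectFaces call with all attributes requested and the response is then reduced to per-frame counts. The function name summarize_faces and the sample response are illustrative assumptions, not the project's actual code; only the response field names (FaceDetails, AgeRange, Gender, Emotions) follow the Rekognition response shape.

```python
from collections import Counter


def summarize_faces(response):
    """Reduce one DetectFaces-style response to simple per-frame counts."""
    faces = response.get("FaceDetails", [])
    emotions = Counter()
    genders = Counter()
    age_midpoints = []
    for face in faces:
        # Keep only the highest-confidence emotion for each face.
        if face.get("Emotions"):
            top = max(face["Emotions"], key=lambda e: e["Confidence"])
            emotions[top["Type"]] += 1
        if "Gender" in face:
            genders[face["Gender"]["Value"]] += 1
        if "AgeRange" in face:
            age = face["AgeRange"]
            age_midpoints.append((age["Low"] + age["High"]) / 2)
    avg_age = sum(age_midpoints) / len(age_midpoints) if age_midpoints else None
    return {
        "faces": len(faces),
        "emotions": dict(emotions),
        "genders": dict(genders),
        "avg_age": avg_age,
    }


# Hand-written sample in the DetectFaces response shape (not real data).
sample = {
    "FaceDetails": [
        {
            "AgeRange": {"Low": 20, "High": 30},
            "Gender": {"Value": "Female", "Confidence": 99.0},
            "Emotions": [
                {"Type": "HAPPY", "Confidence": 90.0},
                {"Type": "CALM", "Confidence": 8.0},
            ],
        },
        {
            "AgeRange": {"Low": 40, "High": 50},
            "Gender": {"Value": "Male", "Confidence": 97.0},
            "Emotions": [
                {"Type": "CALM", "Confidence": 70.0},
                {"Type": "SAD", "Confidence": 20.0},
            ],
        },
    ]
}
summary = summarize_faces(sample)
```

Summaries like this, keyed by camera and timestamp, are one way the raw footage could become a queryable stream of behavioral signals.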

Cloud · Serverless

Restaurant Chatbot & Recommendation System

A serverless conversational concierge that recommends restaurants by location, cuisine & rating.

Problem: Finding the right restaurant means juggling location, cuisine, date, and ratings across multiple apps. There was no single conversational interface that could take a natural-language request and return a tailored recommendation through the user's preferred channel.
Approach: Designed a fully serverless dining concierge. A web front end talks to a chatbot backed by Lambda; restaurant data pulled from the Yelp and Google APIs lands in DynamoDB; incoming requests queue through SQS, and recommendations are retrieved via Elasticsearch over an ML-trained dataset, then delivered by email or SMS.
Impact:
- End-to-end serverless architecture: no servers to manage, scales to zero
- Recommendations delivered via the user's channel of choice (email/SMS)

AWS Lambda
DynamoDB
Elasticsearch
SQS / SNS / SES
Cognito
Node.js
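
A rough sketch of the retrieval step, written in Python for consistency with the other examples here rather than in the project's Node.js. It assumes a queued request arrives as JSON text and is turned into an Elasticsearch query body that filters by cuisine and city and sorts by rating. The field names (cuisine, city, rating, channel), the lowercased keyword values, and both function names are hypothetical; the query itself uses the standard bool/filter/term and sort clauses of the Elasticsearch Query DSL.

```python
import json


def build_restaurant_query(cuisine, city, size=3):
    """Build a search body: exact-match filters, highest rating first."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    # Assumes both fields are indexed as lowercased keywords.
                    {"term": {"cuisine": cuisine.lower()}},
                    {"term": {"city": city.lower()}},
                ]
            }
        },
        "sort": [{"rating": {"order": "desc"}}],
    }


def handle_message(message_body):
    """Turn one queued request (JSON text) into a search body."""
    request = json.loads(message_body)
    return build_restaurant_query(request["cuisine"], request["city"])


# Illustrative message; the real queue payload format is not shown in the source.
queued = json.dumps({"cuisine": "Thai", "city": "Brooklyn", "channel": "sms"})
body = handle_message(queued)
```

The resulting body would be passed to the search client, and the channel field would decide whether the top hits go out by email or SMS.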

Cloud · NLP

Smart Photo Album

A photo album you can search by voice or text in plain language.

Problem: Photo libraries grow faster than anyone can organize them. Finding "the photo of a dog at the beach" usually means scrolling endlessly, because albums are searchable by filename and date — not by what's actually in the picture.
Approach: Built a photo album web app searchable through natural language via both text and voice. An intelligent search layer indexes photos by the people, objects, actions, and landmarks they contain, so a spoken or typed query maps directly to matching images.
Impact:
- Natural-language search over image content, not just metadata
- Voice and text input both supported

AWS Rekognition
Elasticsearch
Lex
Lambda
S3
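
The label-based search idea can be sketched with a small in-memory inverted index. This is a stand-in for illustration only: in the project the labels came from Rekognition, query understanding from Lex, and retrieval from Elasticsearch. The stopword list, the function names, and the sample photos are all assumptions.

```python
from collections import defaultdict

# Filler words dropped from a query before matching (illustrative list).
STOPWORDS = {"a", "an", "the", "of", "at", "in", "on", "with", "and",
             "show", "me", "photo", "photos"}


def build_index(photo_labels):
    """Map each lowercased label to the set of photo keys that carry it."""
    index = defaultdict(set)
    for photo, labels in photo_labels.items():
        for label in labels:
            index[label.lower()].add(photo)
    return index


def search(index, query):
    """Return photos that carry every query keyword found in the index.

    Keywords that match no label are ignored rather than failing the search.
    """
    keywords = [w for w in query.lower().split() if w not in STOPWORDS]
    matches = [index[w] for w in keywords if w in index]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Sample labels as a vision service might return them (hand-written, not real output).
photos = {
    "img1.jpg": ["Dog", "Beach"],
    "img2.jpg": ["Dog", "Park"],
    "img3.jpg": ["Cat"],
}
index = build_index(photos)
beach_dogs = search(index, "the photo of a dog at the beach")
all_dogs = search(index, "dog")
```

Here the spoken or typed phrase is reduced to the keywords "dog" and "beach", and only photos labeled with both are returned.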

Big Data

Yelp Dataset Analysis

Big-data analytics over millions of Yelp reviews to compare distributed programming paradigms.

Problem: The Yelp open dataset is large enough that single-machine tools choke on it — 4.1M reviews, 947K tips, 1M users, 144K businesses. The challenge was both analytical (what can this data tell us?) and architectural (which distributed paradigm handles it best?).
Approach: Ran the same analyses across Spark, Spark SQL, and Pig Latin on Hadoop to directly compare the programming paradigms, then visualized the results — distributions, graphs, and maps — in Zeppelin and Tableau.
Impact:
- Processed a multi-million-record dataset across three distributed engines
- Side-by-side comparison of Spark / Spark SQL / Pig for the same workloads
- Statistical analysis and geo-visualizations of business & review trends

Apache Spark
Spark SQL
Pig
Hadoop
Zeppelin
Tableau

Artificial Intelligence

Pac-Man AI Agent

A self-playing Pac-Man driven by six classical search & optimization algorithms.

Problem: Pac-Man is a deceptively rich planning problem: a large state space, adversarial ghosts, and the need to balance exploration against reward. It's an ideal testbed for comparing how different AI search strategies behave under the same constraints.
Approach: Implemented an AI agent that plays Pac-Man using six algorithms — DFS, BFS, A*, Hill Climbing, a Genetic Algorithm, and Monte Carlo Tree Search — so their behavior, optimality, and runtime could be compared on identical game states.
Impact:
- Six search/optimization strategies implemented and benchmarked head-to-head
- Clear illustration of trade-offs between optimality and compute cost

Python
Search Algorithms
Genetic Algorithms
Monte Carlo Tree Search
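
As an illustration of one of the six strategies, here is a minimal A* sketch on a small grid maze. It assumes 4-way movement with unit step cost and a Manhattan-distance heuristic, and it ignores ghosts entirely; the maze layout and function names are made up for the example and are not taken from the project.

```python
import heapq


def manhattan(a, b):
    """Manhattan distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def astar(grid, start, goal):
    """Return a shortest path from start to goal as a list of cells, or None."""
    frontier = [(manhattan(start, goal), 0, start)]  # (f = g + h, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            # Walk the parent links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue  # outside the maze
            if grid[nr][nc] == "#":
                continue  # wall
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(
                    frontier, (new_cost + manhattan(nxt, goal), new_cost, nxt)
                )
    return None


# '#' marks a wall, '.' an open cell.
maze = [
    "....",
    ".##.",
    "....",
]
path = astar(maze, (0, 0), (2, 3))
blocked = astar([".#."], (0, 0), (0, 2))
```

Setting the heuristic to zero turns the same loop into uniform-cost search, which on unit-cost grids expands cells in the same order of distance as BFS. That makes it a convenient baseline when comparing optimality against compute cost.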

Contact

Let's build something together.

I'm always up for a good conversation — about ML, distributed systems, or what you're working on. The fastest way to reach me is email.

darshil.patel.2810@gmail.com
