About
Software Engineer with 6.5 years of experience at Uber, Grab, and FourKites, specializing in designing and scaling high-performance distributed systems and event-driven backends. Expert in Go, Java, and Kafka, with deep experience in job execution, fintech, and logistics systems, delivering scalable solutions that reduce latency and improve reliability. Open to relocation for challenging roles in distributed systems.
Work
Bangalore, Karnataka, India
Summary
Led the Batch Compute Team, focusing on enhancing system reliability, performance, and observability for critical distributed job execution at scale.
Highlights
Designed and implemented automated fault isolation for Apache distributed job execution, utilizing circuit breakers and adaptive retries, which reduced incident Mean Time To Resolve (MTTR) to under 60 seconds and prevented cascading failures across downstream services.
Built a dynamic workload admission control service, backed by Kubernetes ConfigMaps, to protect high-priority jobs during system outages, blocking ~7% of a 400K+ concurrent job volume in real time.
Improved job scheduling correctness in a distributed execution engine by centralizing ID generation and enforcing submission-time ordering, eliminating allocation delays of up to 20 seconds for critical ML and data workloads.
Developed a ZooKeeper-backed federation controller for Apache YARN Router, enabling rapid cluster isolation and re-attachment to maintain Service Level Agreements (SLAs) during outages.
Enhanced Uber's internal Kubernetes batch job observability dashboard with workload-type and job-ID search capabilities and a resource-usage heatmap, significantly improving on-call debuggability and reducing time-to-insight during incidents.
Bangalore, Karnataka, India
Summary
Spearheaded critical fintech initiatives within the Lending Core Team, optimizing real-time processing and API performance for millions of users across Southeast Asia.
Highlights
Redesigned loan offer and credit limit generation from a monthly batch job to a real-time event-driven processing system using Kafka, reducing peak DB load by ~30% and decreasing loan offer waiting time for millions of drivers.
Reduced loan creation API p95 latency by almost 32% through MySQL batch inserts for installments, which improved throughput during high-concurrency database writes by minimizing network round-trips and lock acquisitions.
Developed a sub-second data aggregator API, integrating 5+ internal APIs for banking partners, complete with partial response fallback for high availability and robust data delivery.
Improved internal Kafka consumer framework reliability via graceful shutdown re-queuing, reducing message loss to near zero across critical financial transaction pipelines.
Bangalore, Karnataka, India
Summary
Contributed to the PayLater Team, designing and scaling critical fintech services for Indonesia, ensuring robust transaction processing and regulatory compliance.
Highlights
Designed and scaled the Refund API for Grab's Indonesia PayLater launch, handling 10% of the country's user base (3M+ users) by applying idempotency and state-machine design patterns, automating retries, and ensuring graceful error recovery.
Built a tiered service fee module for real-time charge APIs and batch billing systems, accurately implementing slab-rate logic in compliance with Indonesian regulatory requirements.
Designed a configurable credit risk assessment module through collaboration with product teams and external credit bureaus, integrating country-specific requirements across multiple Southeast Asian markets.
Developed a robust lending credit score API, integrating data from data-science team models and user metadata services, ensuring strict adherence to REST API design and security best practices.
Improved CI/CD build speeds by ~18% and increased unit test coverage by ~35% through a Go version upgrade, migration from Glide to Go Modules, and systematic refactoring, enhancing code quality and maintainability.
Chennai, Tamil Nadu, India
Summary
Developed and optimized high-scale distributed systems for the Multimodal Supply Chain Visibility Team, enhancing data integration, search capabilities, and operational efficiency.
Highlights
Architected a Redis-backed caching layer for direct integration with 10+ global maritime carriers, reducing reliance on costly third-party data providers and ensuring compliance with API rate limits; later adopted by rail and air teams.
Mentored an intern on feature implementation and code review for the maritime integration module, accelerating knowledge transfer and significantly boosting team productivity.
Improved global address search API availability by replacing a shared Elasticsearch dependency with a dedicated Port Autocomplete API, using composite SQL indexes and an in-memory cache to achieve sub-100ms latency across 80,000+ ports.
Designed an asynchronous Kafka-based maritime event enrichment pipeline for real-time ETA/ETD updates, integrating data from multiple microservices into a unified payload to ensure event persistence, configurability, and reliable delivery to customers.
Eliminated error-prone QA processes by replacing SSH-based Ruby script execution with an internal shipment event simulation tool featuring a dropdown-driven UI, event-specific data fields, and background worker integration.
Awards
The Grab Way Award
Awarded By
Grab
Recognized for leading engineering improvements in the Buy Now Pay Later (BNPL) project, significantly enhancing system reliability and supporting critical business growth.
Languages
English
Bengali
Skills
Languages
Go (Golang), Java, Python, SQL, C++.
Backend & API
REST, gRPC, Microservices, Event-Driven Architecture, Spring Boot, API Development, Distributed Systems.
Messaging & Infrastructure
Apache Kafka, Kubernetes, Docker, Apache YARN, Linux, System Reliability, High Availability.
Databases & Caching
MySQL, PostgreSQL, Redis, Aerospike, Caching Strategies.
Cloud & Observability
AWS, GCP, Datadog, Git, CI/CD, Observability Tools.
Fintech & Logistics Systems
Financial Transaction Pipelines, Payment Systems, Credit Risk Assessment, Regulatory Compliance, Supply Chain Visibility, Job Execution Systems.
Engineering Practices
Fault Isolation, Workload Management, Job Scheduling, Performance Optimization, Data Aggregation, QA Automation, Mentorship, Code Review, System Design.