Namaste! 🙏

I am Rutu.

Currently a Master's student in Computer Science at The George Washington University, I am researching multiple areas in systems, such as eBPF for database security, Network Function Virtualization, and energy-efficient machine learning. Before this, I worked in the Machine Learning Operations (MLOps) domain at Syncron, where I took on the roles of DevOps engineer, data scientist, software developer, machine learning engineer, and more. I am adept at learning abstract concepts and applying them across different domains, which helps me come up with creative solutions.

My current interests are Kubernetes, networking, eBPF, DPDK, benchmarking, and operating systems.

Education

Degree: Master of Science

Major: Computer Science

University: School of Engineering and Applied Science, George Washington University

Duration: January 2024 - December 2025

Degree: Bachelor of Engineering

Major: Computer Science

University: Bangalore University

Duration: August 2016 - September 2020

Professional Experience

Organization: George Washington University

Position: Graduate Teaching Assistant

Duration: Sep 2024 - Present

Role:
  • Spring '25: Computer Architecture and Organization
  • Fall '24: Systems Programming

Organization: Syncron

Position: Software Engineer

Duration: Jan 2022 - Dec 2023

Role:

  • Member of the Datalab and Syncron.AI teams
  • MLOps
  • Acting data scientist on the team
  • Understanding the math behind complex statistical and ML solutions, and building applications and end-to-end pipeline products
  • Creating and maintaining infrastructure and applications to support the ML lifecycle

Tech Stack: AWS, Terraform, Kubernetes, Docker, Kubeflow, Python

My job as a Software Engineer has evolved to encompass the infrastructure aspect of the MLOps Domain. Through this, I have become proficient in utilizing Kubernetes and other CNCF-approved resources to create tools and infrastructure that support the Data Science and ML Lifecycle. Working with Kubernetes and its associated tools has sparked my interest in exploring their potential application in serving ML models as products.

The research paper Hidden Technical Debt in Machine Learning Systems by Google researchers has been a source of inspiration for me. It highlights the challenges of implementing ML systems in practical applications: its well-known diagram portrays the "ML Code" component as a small box surrounded by supporting infrastructure, underscoring how much work beyond the model code is needed to deliver value. This principle resonates with my team's approach to providing value to our organization. My responsibilities also include designing and developing microservices that support the ML workflow and exploring various ML tools through experimentation.

Another fascinating aspect of my work involves constructing advanced analytics models for new price-related applications. This particular facet of my job demands a deep understanding of complex mathematical and ML concepts, which I then translate into programming and visualization. The models are implemented as end-to-end pipelines with analytical dashboards that visualize the results. One notable contribution of mine was the development of a solution called Top-Down Price Optimization. This solution combined enterprise pricing with multivariate calculus and business rules to provide a tool for strategically adjusting prices. The objective was to assist businesses in optimizing prices to achieve a desired increase in revenue and maximize profits while adhering to relevant business constraints.
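The core idea behind a top-down optimization like this can be illustrated with a toy model. The sketch below is not Syncron's actual solution: it assumes a single uniform price-adjustment factor, a constant-elasticity demand curve, and made-up numbers, then grid-searches for the most profitable factor that still meets a revenue floor.

```python
# Toy illustration of top-down price optimization (not the production model):
# scale every price by one factor, let demand respond via a constant-elasticity
# assumption, and pick the most profitable factor that keeps revenue above a floor.

def revenue_and_profit(prices, costs, demand, factor, elasticity=-1.5):
    """Apply `factor` to all prices; demand shifts as factor**elasticity."""
    revenue = profit = 0.0
    for p, c, d in zip(prices, costs, demand):
        new_p = p * factor
        new_d = d * factor ** elasticity  # constant-elasticity demand curve
        revenue += new_p * new_d
        profit += (new_p - c) * new_d
    return revenue, profit

def optimize(prices, costs, demand, revenue_target):
    """Grid-search the adjustment factor subject to the revenue constraint."""
    best = None
    for i in range(80, 141):          # factors 0.80 .. 1.40 in 0.01 steps
        f = i / 100
        rev, prof = revenue_and_profit(prices, costs, demand, f)
        if rev >= revenue_target and (best is None or prof > best[1]):
            best = (f, prof)
    return best

prices, costs, demand = [100.0, 250.0], [60.0, 150.0], [1000.0, 400.0]
factor, profit = optimize(prices, costs, demand, revenue_target=190_000)
print(f"best factor={factor:.2f}, profit={profit:,.0f}")
```

A production model would use per-item elasticities estimated from data and far richer business rules; the point here is only the shape of the constrained search.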

Organization: Syncron

Position: Associate Software Developer

Duration: Jan 2021 - Dec 2021

Role:

  • Part of the Datalab and Analytics BI team
  • Designing and maintaining end-to-end statistical and ML Pipelines
  • Developing and maintaining Analytical and BI products and tools around them

Tech Stack: AWS, Docker, Kubeflow, Python

After completing a year-long internship, I was promoted to the position of Associate Software Developer and my responsibilities expanded to include the MLOps domain. In addition to developing and maintaining BI tools and services, I worked on developing price-related use cases. This involved understanding business requirements, analyzing customer data, and developing end-to-end pipelines to handle the entire process, from reading raw data to generating analytical dashboards for customers.

As a member of the Datalab team at Syncron, I gained valuable knowledge about applying Object-Oriented Programming principles beyond programming. Concepts such as Abstraction and Polymorphism were used in designing and planning the architecture. Additionally, I became familiar with the concept of functional programming, which allowed me to write code in terms of functions that were more readable and easy to follow.

Working on planning and designing solutions also exposed me to the concept of Domain-Driven Design. I learned how to design a solution based on the problem's domain, and then code it using the principles of OOP and functional programming.

Organization: Syncron

Position: Intern

Duration: Jan 2020 - Dec 2020

Role:

  • Part of the Analytics BI team
  • Developing and maintaining Analytical and BI products and tools around them
  • Developing and maintaining microservices

Tech Stack: AWS, Serverless, Docker, Python

My internship provided my first professional exposure to the world of IT, where I gained valuable knowledge regarding software development and deployment as a service. I was able to hone my skills in writing readable and maintainable code while working extensively with Python, Serverless, and Docker. The software I developed primarily served to facilitate business intelligence work for customers, providing me with valuable insights into how data is visualized to meet business needs and draw insights.

Moreover, I had the opportunity to be involved in the design and development of microservices for internal use by various teams, which allowed me to expand my technical proficiency further.

My Research Interests

  • Kubernetes
  • Containers
  • Container Network Interface (CNI)
  • eBPF
  • Benchmarking
  • Performance Analysis and Optimization
  • Operating Systems
  • Networking
  • Machine Learning

Publications

eBPF-Based Intrusion Prevention System for Database Servers

IEEE CloudSummit 2024

eBPF (Extended Berkeley Packet Filter) is a powerful technology enabling the execution of sandboxed programs at the kernel level. This paper investigates its potential to implement security measures for critical database statements, safeguarding them from various threats. These threats can include malicious actors, compromised containers, or even insider attempts to tamper with production data.

View Publication
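The enforcement in the paper happens in the kernel via eBPF; as a rough userspace illustration of the kind of rule matching such a system performs on intercepted query payloads, here is a Python sketch. The table names and rules are hypothetical examples, not the paper's actual policy.

```python
import re

# Userspace sketch of the rule matching an eBPF-based IPS might apply to
# intercepted SQL payloads (the real system hooks traffic in-kernel).
# Protected tables and rules below are hypothetical examples.
PROTECTED_TABLES = {"orders", "payments"}
DANGEROUS = re.compile(r"^\s*(DROP|TRUNCATE|DELETE|ALTER)\b", re.IGNORECASE)
TABLE_REF = re.compile(r"\b(?:TABLE|FROM)\s+`?(\w+)`?", re.IGNORECASE)

def verdict(query: str) -> str:
    """Return 'DROP' (block) or 'PASS' for a single SQL statement."""
    if DANGEROUS.match(query):
        m = TABLE_REF.search(query)
        if m and m.group(1).lower() in PROTECTED_TABLES:
            return "DROP"
    return "PASS"

print(verdict("DROP TABLE orders;"))             # destructive, protected table
print(verdict("SELECT * FROM payments;"))        # read-only, allowed
print(verdict("DELETE FROM payments WHERE id = 1;"))
```

In the actual system, a verdict like this translates to an XDP/tc action on the packet carrying the statement, so the tampering attempt never reaches the database engine.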

High Performance HTTP Benchmarking

As a part of my research work at the Cloud Systems Lab at the George Washington University, I worked on the following projects to help set up an infrastructure for load testing web servers and gathering metrics. This work started with an evaluation of multiple HTTP benchmarking tools and led to a framework that supports load-testing experimentation in a research setting.


A qualitative and quantitative analysis of HTTP Benchmarking Tools



This work presents a thorough comparison of several widely used HTTP benchmarking tools.

Read the full report on how these tools compare against each other here.

Experiment Load Testing



A platform for load testing your web services with various configurations.

This is a result of the research work I did as part of my Research course during my Master's at GWU. It provides a repository that lets researchers load test their web servers with varying configurations and gather metrics for the entire duration of the load test.

  • Uses k6 for generating load
  • Allows reproducibility of experiments
  • Collects metrics for both the load generator and the target system
  • Generates timestamped metrics
  • Generates plots for analysis
  • Allows easy configuration and experiment management

Read the complete documentation on how to use this platform here.
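As an illustration of what timestamped latency samples make possible, the sketch below computes throughput and nearest-rank latency percentiles; the sample data is made up, and this is not the platform's actual reporting code.

```python
# Sketch of the kind of post-processing timestamped latency samples enable:
# compute request rate and latency percentiles over a run. Data is invented.

def percentile(sorted_vals, p):
    """Nearest-rank percentile on a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("no samples")
    k = max(0, min(len(sorted_vals) - 1, round(p / 100 * len(sorted_vals)) - 1))
    return sorted_vals[k]

# (timestamp_s, latency_ms) samples, as a load generator like k6 might emit
samples = [(0.1, 12.0), (0.4, 15.0), (0.9, 11.0), (1.2, 40.0),
           (1.6, 13.0), (1.9, 90.0), (2.3, 14.0), (2.7, 12.5)]

latencies = sorted(l for _, l in samples)
duration = samples[-1][0] - samples[0][0]
print(f"requests/s: {len(samples) / duration:.1f}")
print(f"p50={percentile(latencies, 50)} ms, p95={percentile(latencies, 95)} ms")
```

Because every sample carries its timestamp, the same data can also be sliced into windows to see how latency evolves over the course of the load test.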


CloudLab - Research Tools

As a part of my research work at The George Washington University, I use the CloudLab platform a lot. The platform provides researchers with the resources to repeatedly test their work and reproduce it as and when required. As someone who constantly reruns experiments with multiple configurations and breaks the system while playing around with different tools, I have found this platform to be of the utmost importance.

Though this work has been built and tested on CloudLab, it provides utilities and tools that work with any Linux machine accessible via SSH.


Cloudlab eBPF



Tools and an experiment setup for working with eBPF on CloudLab. I built this tool because I was developing eBPF programs on a Mac and had no way to compile them, or even run go generate style operations, without manually rsync-ing or scp-ing files to the remote machine. This tool takes care of that: it syncs your code to the remote machine, runs the relevant go generate commands, and copies the generated files back to your local machine, so you can work on eBPF from anywhere. This tool provides the following features:

  • Remote compilation
  • Copying of the generated code
  • Dependency installation
  • Experimentation
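The sync, generate, and copy-back loop described above can be sketched as command construction. The host and paths below are hypothetical placeholders, and the real tool's interface may differ; actual execution would hand these argv lists to subprocess.run from a machine with SSH access to CloudLab.

```python
# Sketch of the sync -> generate -> copy-back loop cloudlab-ebpf automates.
# Host and directory names are hypothetical; the real tool may differ.

def sync_cmd(local_dir, host, remote_dir):
    """rsync local sources to the remote build machine."""
    return ["rsync", "-az", "--delete", f"{local_dir}/", f"{host}:{remote_dir}/"]

def generate_cmd(host, remote_dir):
    """Run `go generate` (which compiles the eBPF C objects) remotely."""
    return ["ssh", host, f"cd {remote_dir} && go generate ./..."]

def copy_back_cmd(host, remote_dir, local_dir):
    """Pull the generated *_bpfel* artifacts back to the local machine."""
    return ["rsync", "-az", "--include=*_bpfel*", "--include=*/",
            "--exclude=*", f"{host}:{remote_dir}/", f"{local_dir}/"]

for cmd in (sync_cmd("./probe", "node0.cloudlab", "~/probe"),
            generate_cmd("node0.cloudlab", "~/probe"),
            copy_back_cmd("node0.cloudlab", "~/probe", "./probe")):
    print(" ".join(cmd))
```

The design point is that the laptop never needs clang, libbpf headers, or a Linux kernel; only the remote node does.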

Cloudlab Kubernetes



A repository with setup scripts for creating a 3-node Kubernetes cluster with minimum configuration and just a single make command. This tool provides the following features:

  • Creates a 3-node Kubernetes cluster
  • Copies the .kubeconfig to the local machine
  • Sets up the CNI
  • Can be customized to manage more or fewer nodes in the cluster

The motivation behind this tool is that the initial setup required for Kubernetes can be overwhelming for newcomers. It bridges that gap and lets them focus on their goals.

Cloudlab DPDK



This repository contains the required scripts for working with DPDK on a remote machine. It performs all the initial setup required for DPDK. This tool provides the following features:

  • Installs DPDK dependencies
  • Allocates Hugepages
  • Loads the Kernel Modules
  • Builds DPDK


Cloudlab Tools



This repository is the building block of every CloudLab project I have built. It contains the necessary scripts and commands to get working with CloudLab (or any remote machine). It can be used as a submodule in your experimentation repository and integrates easily while providing tools that ease development. I made it as lightweight as possible. It provides the following features:

  • Installation scripts for:
    1. eBPF
    2. DPDK
    3. Kubernetes
    4. Docker
    5. Go
  • Tools for remote sync, scp, running commands on the remote host, SSH-ing into it, and more.

Cloudlab Template



This is a template repository that comes with all the setup for starting a new project that works with any of the technologies listed above or that needs a development setup for talking to remote machines. It provides the following features:

  • Submodule capabilities
  • Lightweight installation scripts
  • Customizations for your own stack


Kubernetes

Kubernetes has been a topic of interest to me for a long time now, and I want to convey that enthusiasm to everyone who is even remotely looking into this area. Learning Kubernetes was one of the most difficult tasks I have taken on, and I want to make it easier for everyone by helping them appreciate the beauty of Kubernetes.


Opentelemetry Kubernetes Experiments



OpenTelemetry and Kubernetes are two of the most popular CNCF projects. This repository contains examples for setting up OpenTelemetry in a Kubernetes cluster.

Bootstrapping with Kubernetes



I am writing a book on Kubernetes. It is intended for getting started with Kubernetes with only a bare-minimum knowledge of containers. I plan to cover both high-level and in-depth topics. The book is open source and free to read.

The book is available here.

Bootstrapping with Kubernetes - Examples



This repository contains the associated examples to go with my Kubernetes book.


Procman

This is my own implementation of Kubernetes and Docker. I am building a series of projects to mimic the functionality of Docker and Kubernetes. This is ongoing work, currently focused on building a containerization platform from scratch. The intended result is a set of tools to orchestrate processes in a cluster. Though it started off as a way for me to learn the inner workings of Docker and Kubernetes, I am learning new technologies every day and hence plan to add my own customizations, such as:

  • Process migration with CRIU
  • CNI using eBPF
  • Native DPDK/SPDK/RDMA support


Procman CLI



This is the Go-based interface for the Procman tool. It helps set up the environment for containerization using user-provided inputs. It provides the following features:

  • Manages images using the Alpine Minirootfs
  • Creates the filesystem for containers

Procman Daemon



This will be the long-running daemon for Procman. It sets up the isolation for the container processes and starts their run. This is an important part of the Procman ecosystem, as it has complete control over the underlying container processes. It provides the following features:

  • Triggers the run for containers
  • Manages namespaces (e.g., network, mount, chroot)
  • Sets up networking for the containers (via veth)
  • Monitors the run of processes and restarts them if required
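One way a daemon like this could set up isolation is by delegating to the util-linux unshare and chroot utilities, as sketched below. This is an illustration, not Procman's actual implementation (which would more likely use clone(2) flags directly), and the rootfs path is a hypothetical example.

```python
# Sketch of how a container daemon might launch an isolated process by
# wrapping it in util-linux `unshare` plus `chroot`. Illustrative only:
# the real daemon would drive clone(2)/namespace syscalls itself.

def container_start_cmd(rootfs, entrypoint):
    """Build the argv that wraps `entrypoint` in fresh namespaces."""
    return [
        "unshare",
        "--pid", "--fork",   # new PID namespace; the entrypoint becomes PID 1
        "--mount",           # private mount table for the container
        "--uts",             # container gets its own hostname
        "--net",             # isolated network stack (veth pair added later)
        "chroot", rootfs,    # confine the process's view of the filesystem
        entrypoint,
    ]

cmd = container_start_cmd("/var/lib/procman/containers/c1/rootfs", "/bin/sh")
print(" ".join(cmd))
```

Running this argv requires root privileges on a Linux host; the daemon's monitoring loop would then watch the wrapped PID and restart it on failure.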

DPDK - Experiments and code walkthroughs


Installing DPDK and running the Helloworld application.

My first experiment with DPDK

Read Article
Breaking down the Helloworld DPDK Application

Understanding the parts of DPDK and the code for a basic DPDK application.

Read Article
Binding Network ports to DPDK

An experiment on binding network ports to DPDK and running the simple-lan experiment

Read Article
Running the basic forwarding Application with DPDK

This experiment covers running the Basic Port Forwarding Sample Application from DPDK. It binds and pairs the ports and forwards packets between them.

Read Article
Breaking down the Basic Port forwarding application with DPDK

This covers an explanation of the basic port forwarding application built with DPDK. It contains a detailed explanation of every component involved.

Read Article
Modifying the basic port forwarding application

This article shows how the basic port forwarding application from DPDK can be modified to print the queries sent to the MySQL database.

Read Article


eBPF - Experiments and Code walkthroughs


Trying out eBPF and XDP

My first experiment with eBPF and XDP.

Read Article
Breaking down the Basic 01 XDP application

This contains a detailed walkthrough of the Basic 01 XDP application, with a line-by-line explanation of the underlying C code.

Read Article
Trying out eBPF and Go

Learning how to use eBPF with Go. This experiment led me to build cloudlab-tools, cloudlab-ebpf, and all the other associated projects.

Read Article
Tracing syscalls with eBPF and Go

An example of how eBPF can be used to trace system calls. It contains a detailed explanation of my work, with some problems and resolutions. This experiment was done using cloudlab-tools.

Read Article