{{{#!comment
See (https://www.orbit-lab.org/wiki/WikiFormatting) for wiki formatting
}}}
[[TOC(Other/Summer/2025/R3/*, depth=1, heading=R3)]]

= Real-time, robust, and reliable (R^3^) machine learning over wireless networks =

**WINLAB Summer Internship 2025**

**Group Members:** Ayaan Qayyum (GR), Joshua Menezes (GR), Nihal Abdul Muneer (GR), Hasan Ali (UG), Madhav Subramaniyam (UG)

== Project Objective ==
=== Introduction: ===
Edge computing is an emerging paradigm for enabling ML/AI applications in mobile networks. This setting presents many challenges in terms of latency, accuracy, security/privacy, and adaptivity to changing environments. The goal of this project is to implement algorithms and approaches that work “on paper” and evaluate how well they work in practice. In particular, students will work on methods for mobile devices to strategically “offload” complex computing tasks to the cloud, approaches for fully decentralized model updating and adaptation that are robust against malicious attacks, and strategies that can enable real-time tracking and control.

=== Sub-Groups: ===
||= Learning to Help (L2H) =||= Feature Extraction for Distributed Systems (PCA) =||
|| The recently proposed Learning to Help (L2H) framework trains a server model given a fixed local (client) model. This differs from the Learning to Defer (L2D) framework, which trains the client for a fixed (expert) server. L2H is applicable in a number of practical scenarios in which access to the server may be limited by cost, availability, or policy. || Implement a distributed feature extraction method, specifically Principal Component Analysis (PCA), on the ORBIT testbed. Enable multiple nodes in the ORBIT network to collaboratively learn the eigenvectors used to reduce the dimensionality of new data samples. The compressed data will then be fed into a pre-trained machine learning model for inference. The central idea is that collaboration among nodes can speed up the learning of these eigenvectors, improving the efficiency of our learning or inference tasks. ||

== Progress ==
=== Week 1 ===
[https://docs.google.com/presentation/d/1vpTfC3btD_fo2PTWTjl_F7GIQ_yFmAVL1vh2xrC8-Us/edit?usp=drive_link Week 1 Slides]
 - Read the L2H paper [https://docs.google.com/document/d/1SbiWl773-a59fv6qETlRt_3xXUfRgAaZzjJVOTREiqU/edit?tab=t.8x9jjtvl79to info]
 - Read the PCA paper [https://docs.google.com/document/d/1SbiWl773-a59fv6qETlRt_3xXUfRgAaZzjJVOTREiqU/edit?tab=t.gd7hstpnqblj#heading=h.qmb8rahmqgyk info]

=== Week 2 ===
[https://docs.google.com/presentation/d/1HQLzoaZy9GAZdUuAKzWyq4eZfQxWejc9XjZ0i7Q0ZDQ/edit?usp=sharing Week 2 Slides]
 - Finished all given papers
 - Reviewed ML concepts
 - Practiced Cosmos (Linux, Vim, etc.)
 - Set up IDE
 - Set up the project GitLab

=== Week 3 ===
[https://docs.google.com/presentation/d/1CoeVqW_KkwaUJu5cP5dy_3ICegnhyhhBIlqUHjwtDFQ/edit?usp=sharing Week 3 Slides]
 - Officially split into sub-groups
   - L2H (Joshua & Madhav)
   - PCA (Ayaan, Nihal, & Hasan)

||= L2H =||= PCA =||
|| - Designed the model architecture based on the paper \\ - Trained all the models on MNIST (expert, rejector, client) \\ - Ran the L2H system on one node \\ - Ran multiclass L2H across nodes || - Created an example distributed system framework \\ - Implemented some fault-handling features for reliability/robustness \\ - Created a number guessing game with a resilient central node and subnode architecture \\ - Wrote a simulation of the ORBIT distributed environment using Docker ||

=== Week 4 ===
[https://docs.google.com/presentation/d/1m7BtaOoItuCyNeE5cEjFde8bFQstZb9ObNr1WhVC1B0/edit?usp=sharing Week 4 Slides]

||= L2H =||= PCA =||
|| - Ongoing || - Ongoing ||

== Useful Links: ==
**For Reserving Nodes (Usually Weeks):** https://www.orbit-lab.org/cPanel/controlPanel/start

**Documentation/Notes:** https://drive.google.com/drive/folders/15HJmhyTUdzQafzMtNqIhy9jgc2sllF35?usp=sharing

**GitLab:** https://gitlab.orbit-lab.org/r3-25

**Presentations:** https://drive.google.com/drive/folders/1aThnKsOUpBFiJtRgjq9AjoppeDdHGuQy?usp=sharing