Changes between Version 99 and Version 100 of Other/Summer/2023/Inference


Timestamp:
Aug 20, 2023, 11:52:14 PM
Author:
LakshyaG42
Comment:

InDepth

  • Other/Summer/2023/Inference

    v99 v100  
    303010. Limited CPU power in terminal to imitate mobile devices
    313111. Implemented different threshold values based on confidence for sending the data to the edge and server for inference
    32   * Generated graphs for threshold vs latency, accuracy vs latency, ____ (LINK TO GRAPHS)
    33 12. Retrained neural network to achieve 88% accuracy and collected new graphs (LINK)
     32  * Generated graphs for threshold vs latency, accuracy vs latency, etc.
     3312. Retrained neural network to achieve 88% accuracy and collected new graphs
    343413. Introduced a delay in the inference as well as data transfer to simulate a queue
    3535
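The confidence-based threshold in item 11 can be illustrated with a minimal sketch. It assumes that a prediction is kept on the device when the top softmax probability reaches the threshold and is offloaded otherwise; the function names `softmax` and `route` and the labels `"local"` and `"offload"` are hypothetical and not taken from the project code.

```python
import math


def softmax(logits):
    """Convert raw model outputs into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, threshold):
    """Decide where a prediction should be finalized.

    Assumption: a confident local prediction is kept, and a low-confidence
    one is sent on to the edge/server for inference.
    """
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    target = "local" if confidence >= threshold else "offload"
    return target, label, confidence
```

Sweeping `threshold` between 0 and 1 and recording latency and accuracy at each value is one way to produce threshold-versus-latency curves like those mentioned above.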
    36 == Week 1
     36== Networking Setup for Our Experiment
     37We have developed an experiment environment for exploring mobile-edge computing. It uses a client-server architecture in which devices exchange data for inference. This section gives an overview of the technologies and processes used in our networking setup.
     38= Socket Programming: Establishing Connectivity
     39Our networking framework relies on socket programming to connect client and server devices. These connections carry the real-time exchange of data that our performance experiments depend on. We used the **socket** library to implement them.
     40= Data Serialization and Deserialization: Ensuring Accuracy
     41To make sure data arrives intact, we serialize it before sending and deserialize it on receipt. The **struct** library packs and unpacks the data in a fixed binary layout, so both devices interpret the bytes the same way.
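As a minimal sketch of this kind of packing, the example below prefixes a payload with a fixed-size header. The header layout (a 4-byte payload length followed by an 8-byte send timestamp, in network byte order) and the names `HEADER`, `pack_message`, and `unpack_message` are assumptions made for illustration, not the project's actual wire format.

```python
import struct

# Hypothetical header: unsigned 32-bit payload length + 64-bit float timestamp.
# "!" selects network byte order with no padding, so the header is 12 bytes.
HEADER = struct.Struct("!Id")


def pack_message(payload: bytes, sent_at: float) -> bytes:
    """Prefix the payload with its length and the time it was sent."""
    return HEADER.pack(len(payload), sent_at) + payload


def unpack_message(message: bytes):
    """Split a packed message back into (payload, sent_at)."""
    length, sent_at = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return payload, sent_at
```

Sending the length first lets the receiver know exactly how many bytes to read from the socket before it tries to decode the image.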
     42= Tracking Metrics with Pandas
     43We use the **pandas** library to organize and log data in CSV format. These records track timing metrics and prediction results, which we use for in-depth analysis and iterative improvement.
     44= Unraveling the Networking Workflow
     451. **Image Transmission:** Images are prepared and transmitted from the client to the server, initiating the analysis process.
     46
     472. **Server Processing:** Upon receipt, the server unpacks the data, runs inference with the MobileNetV2 model, and records the time from image input to prediction output.
     48
     493. **Response to Client:** The server sends back the prediction results together with its timing data, so the whole exchange can be analyzed.
     50
     514. **Client's Analytical Role:** Using the prediction results and timing data, the client logs the data, evaluates accuracy, and summarizes the outcome of the entire process.
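The four steps above can be sketched end to end on a single machine. This is an illustrative sketch only: it uses `socket.socketpair()` in place of a real network connection, a placeholder `fake_predict` function instead of MobileNetV2, and a hypothetical message layout (length-prefixed image, then a reply holding a class id and the server-side inference time).

```python
import socket
import struct
import threading
import time

LENGTH = struct.Struct("!I")   # hypothetical 4-byte length prefix for the image
REPLY = struct.Struct("!Id")   # hypothetical reply: class id + inference seconds


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf.extend(chunk)
    return bytes(buf)


def fake_predict(image_bytes):
    """Placeholder for model inference; returns a class id in 0..9."""
    return sum(image_bytes) % 10


def serve_one(conn):
    """Steps 2 and 3: receive one image, time the inference, send the reply."""
    (length,) = LENGTH.unpack(recv_exact(conn, LENGTH.size))
    image = recv_exact(conn, length)
    start = time.perf_counter()
    label = fake_predict(image)
    elapsed = time.perf_counter() - start
    conn.sendall(REPLY.pack(label, elapsed))


def request_prediction(conn, image):
    """Steps 1 and 4: send the image, then collect the result and timings."""
    start = time.perf_counter()
    conn.sendall(LENGTH.pack(len(image)) + image)
    label, server_time = REPLY.unpack(recv_exact(conn, REPLY.size))
    round_trip = time.perf_counter() - start
    return {"label": label, "server_time": server_time, "round_trip": round_trip}


def demo():
    """Run one request/response exchange between a client and a server thread."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_one, args=(server,))
    worker.start()
    try:
        return request_prediction(client, bytes(range(32)))
    finally:
        client.close()
        worker.join()
        server.close()
```

The dictionary returned by `request_prediction` has one entry per logged metric, so a list of such dictionaries could be turned into a table and written to CSV for later analysis.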
     52
     53= Our Pursuit's Significance
     54Our networking setup matters for several reasons:
     55
     56* **Learning Through Experimentation:** Our framework serves as a testbed for mobile edge computing experiments. We use it to investigate optimization strategies and thresholds.
     57* **Resource Optimization:** The dynamic decision-making mechanism is intended to use resources efficiently. We look for the tipping point at which using the edge device's capabilities yields the greatest advantage.
     58* **Informed Decision-Making:** By logging timing metrics and prediction results, we get an end-to-end view of the system. This informs further optimization and strategies for achieving higher accuracy.
     59
     60
     61== Unveiling the Power of Training and Optimization
     62Besides setting up the networking for our experiment, we also worked on training and optimizing our models and on making the most of the resources at hand. This section describes our training methods and the tools and techniques we used to improve our models.
     63= Foundational Training
     64We began by learning the fundamentals of PyTorch and other essential libraries. Drawing on tutorials and material from previous classes, we built neural networks from the ground up. Along the way we used a range of libraries, including TensorFlow, Keras, Optuna, torchvision, torch, pandas, numpy, matplotlib, and time.
     65= Strategic Techniques for Optimization
     66To address overfitting and improve robustness, we used the following techniques:
     67* Image Cutouts: We used cutout, an augmentation technique that randomly masks out portions of input images during training. This makes the model more robust and helps reduce overfitting.
     68* Learning Rate Scheduling: We used step-decay learning rate scheduling, which lowers the learning rate by a fixed factor at regular epoch intervals. This can improve convergence and accuracy.
     69* Batch Normalization: We used batch normalization, which speeds up convergence and makes training more stable. It was originally proposed as a way to reduce internal covariate shift.
     70* Early Stopping: To shorten training, we implemented early stopping, which halts training when the loss has not decreased within a predefined patience window. This reduced training time; we experimented with patience levels from 5 to 15.
     71* Random Rotations: We applied random rotations to the training images, an augmentation intended to make the model robust to changes in orientation.
     72* Random Erasing: This augmentation erases a randomly selected rectangular region of a training image. It discourages the model from relying too heavily on any single part of the image and helps reduce overfitting.
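Two of the techniques above, step-decay scheduling and early stopping, are simple enough to sketch in plain Python. This is a framework-independent illustration rather than the project's training code; the names `step_decay` and `EarlyStopping` and the default values of `step_size`, `gamma`, and `min_delta` are assumptions. In PyTorch, `torch.optim.lr_scheduler.StepLR` provides the same kind of schedule.

```python
def step_decay(initial_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate after `epoch` epochs: multiplied by gamma every step_size epochs."""
    return initial_lr * gamma ** (epoch // step_size)


class EarlyStopping:
    """Signal a stop once the loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

A training loop would call `stopper.step(loss)` once per epoch and break out of the loop when it returns `True`; a larger `patience` tolerates longer plateaus at the cost of more training time.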
     73= Hyperparameter Tuning
     74To get the most out of our models, we tuned their hyperparameters with Optuna, a hyperparameter optimization library. We designed experiments with 20 training trials, tuning parameters such as normalization transforms, patience levels, and starting learning rates. This helped us find better-performing configurations.
     75= Navigating Model Limitations
     76Working under time pressure, we ran into limitations of our chosen models, which include Mobilenet_v2, Densenet121, and Resnet18. These models were originally designed for ImageNet, a large dataset of high-quality images spanning many classes. Within our 10-week timeline, training and testing on ImageNet was impractical because of the time it requires. Instead, we trained only the last layers of these models for CIFAR10, which let us obtain meaningful results in the time available.
     77= The Significance of Training and Optimization
     78Our training and optimization work matters for several reasons:
     79
     80* Holistic Skill Development: We learned PyTorch and a range of other libraries, building skills that carry over to other projects.
     81* Robust Model Foundation: The strategies we used helped protect our models against overfitting and made them more resilient in challenging scenarios.
     82* Efficient Resource Utilization: Techniques like hyperparameter tuning and early stopping streamline the training process, making the most of limited resources.
     83
     84== Weekly Updates
     85= Week 1
    3786**Summary**
    3887   * Understood the goal of the project and broke down its objectives
     
    4493   * Attempt to simulate the difference between their performances at inference
    4594 
    46 == Week 2
     95= Week 2
    4796**Summary**
    4897  * Practiced the basics of pattern recognition and machine learning (PPA - Patterns, Predictions, Actions) using PyTorch
     
    55104  * Attempt to simulate the difference between their performances at inference
    56105
    57 == Week 3
     106= Week 3
    58107**Summary**
    59108  * Debugged issues with work we had done in previous weeks
     
    69118  * Think about other implementations - Early Exiting, model compression, data compression, … Mixture?
    70119
    71 == Week 4
     120= Week 4
    72121**Summary**
    73122  * Compared performances of both neural networks on the CIFAR10 dataset
     
    76125       * Need to serialize by bytes instead of transferring as strings
    77126  * Read and discussed various research papers related to our project - created a brief presentation on each paper
    78        * James -
    79        * Shreya -  <a href="https://docs.google.com/presentation/d/1LvucmccciojbtaxkDGnkFUM0lYNFh7YuKeKjHB8zxjI/edit?usp=sharing">Mobile Edge Computing: A Survey</a>
    80        * Tanushree - <a href= "https://docs.google.com/presentation/d/121ELfzNZW6dZzfScij3yEqSZ_zurKLX8nQJ05IO3Qeg/edit?usp=sharing">Distributed Deep Neural Networks over the Cloud, the Edge and End Devices</a>
    81        * Lakshya -
    82        * Haider -
    83127
    84128**Next Steps**
     
    87131   * Track and add Age of Information Metrics
    88132
    89 == Week 5
    90 * ntp
     133= Week 5
     134* We learned how to use the Network Time Protocol (NTP) while we waited for the more accurate Precision Time Protocol (PTP) to be implemented in the technology we were using.
     135* Implemented NTP in our code so that we could measure time consistently across devices and evaluate the trade-offs we had identified.
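For context, the standard NTP calculation estimates the clock offset and round-trip delay from four timestamps. The sketch below shows that arithmetic only; it is not the project's implementation, and the function name `ntp_offset_and_delay` is hypothetical.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and network delay from one NTP-style exchange.

    t0: client clock when the request is sent
    t1: server clock when the request is received
    t2: server clock when the reply is sent
    t3: client clock when the reply is received
    """
    # Offset of the server clock relative to the client clock,
    # assuming the outbound and return network delays are equal.
    offset = ((t1 - t0) + (t2 - t3)) / 2
    # Time spent on the network, excluding server processing time.
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay
```

For example, if the server clock runs 5 seconds ahead, each direction takes 0.5 seconds, and the server spends 0.25 seconds processing, the timestamps `(100.0, 105.5, 105.75, 101.25)` give an offset of 5.0 and a delay of 1.0. Adding the estimated offset to client timestamps puts both devices on a common timeline, which is what makes latency measurements across devices comparable.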
    91136
    92 == Week 6
     137= Week 6
    93138**Summary**
    94139  * Figured out how to properly split a NN using split computing
     
    105150Transfer to Precision Time Protocol (PTP)
    106151
    107 == Week 7
     152= Week 7
    108153**Summary**
    109154  * Explored different research questions with the data collected