
Verdant Robotics: Machine Learning Internship

Skills: Python, Machine Learning, PyTorch, NVIDIA TLT/TAO

During the summer of 2022, I had the opportunity to work at Verdant Robotics as a Machine Learning Intern under the supervision of Dustin Webb. Verdant is an agricultural robotics company focused on the precision spraying of crops and weeds for multi-action farming, with the goal of improving farming efficiency.

My work at the company consisted not only of creating datasets and training models, but also of contributing significantly to the development of Verdant's data processing pipelines and tooling.

Project 1: Multicrop Processing for Labeling Pipeline

For this project, I worked on upgrading the Verdant labeling pipeline, specifically how files were transferred. The pipeline receives messages containing crop information from the field and from the labelers, and sends the relevant data to the appropriate locations. The earliest iteration of the pipeline could only handle a single crop type at a time, and I adapted the code to handle multiple crop types simultaneously. In addition, I established a file format to house the source and destination locations needed for each crop type.
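To make the idea concrete, here is a small hypothetical sketch, not Verdant's actual schema, of how such a per-crop routing file could be represented and loaded in Python:

# Hypothetical per-crop routing file; field names are illustrative only.
import json
from dataclasses import dataclass

@dataclass
class CropRoute:
    crop: str          # e.g. a crop type name
    source: str        # where incoming data for this crop arrives
    destination: str   # where processed data for this crop should be sent

def load_crop_routes(path: str) -> list[CropRoute]:
    # The file holds one entry per crop, so the pipeline can route many crops at once.
    with open(path) as f:
        return [CropRoute(**entry) for entry in json.load(f)]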

Project 2: Superresolution of Data through ESRGAN

I was tasked with superresolving data that Verdant uses for another project. After Verdant upgraded their camera system with new lenses that capture higher resolution data, the goal of superresolution was to keep the old data from becoming obsolete by upsampling, or superresolving, the older low resolution images to match the resolution of the new data. To approach this, an Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) was trained on the new lens data using a PyTorch framework. I adapted code from a GitHub repository by Lornatang and trained it on the high resolution data, while also implementing distributed data parallel training to make the training process more efficient. The resulting superresolved versions of the original low resolution images are visually almost indistinguishable from the originals, and with further tuning the ESRGAN may eventually be applied to all of Verdant's old data.
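As a rough illustration of the distributed data parallel setup, the sketch below shows the standard PyTorch DDP training pattern with a generic model and a dataset assumed to yield (low-res, high-res) image pairs; it is not the Lornatang ESRGAN code itself, and it omits the perceptual and adversarial losses a full ESRGAN uses.

# Minimal DistributedDataParallel training sketch (generic network and dataset, not the actual ESRGAN code).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train(rank, world_size, model, dataset, epochs=10):
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    model = DDP(model.cuda(rank), device_ids=[rank])
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=16, sampler=sampler)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.L1Loss()  # pixel loss only; a full ESRGAN adds perceptual and adversarial terms
    for epoch in range(epochs):
        sampler.set_epoch(epoch)  # reshuffle data across ranks each epoch
        for lr_img, hr_img in loader:
            lr_img, hr_img = lr_img.cuda(rank), hr_img.cuda(rank)
            optimizer.zero_grad()
            loss = loss_fn(model(lr_img), hr_img)
            loss.backward()
            optimizer.step()
    dist.destroy_process_group()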

Project 3: Finetuning Alignment with NVIDIA TLT/TAO

During my internship, NVIDIA TLT/TAO was used to train Verdant's models for deployment in the field. Whenever these models are trained, they always start from epoch 0 and step 0, so there is no way to directly compare a finetuned model against its pretrained counterpart. For this project, I worked on find_best_model.py, a module used to find the best model weights after training and report relevant statistics. My modifications took the pretrained model's epoch and step into account and aligned them with those of the finetuned model so that the reported metrics remain continuous. They also enabled the program to parse logs from standard TLT/TAO, finetuned, and distributed training runs, and I developed unit tests for find_best_model.py covering all of the helper functions used in the code.
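The alignment idea itself is simple; the hypothetical helper below illustrates it with made-up numbers (the real internals of find_best_model.py are Verdant's):

# Hypothetical illustration of the epoch/step alignment idea; not the actual find_best_model.py code.
def align_finetune_metrics(finetune_rows, pretrain_last_epoch, pretrain_last_step):
    """Offset a finetuned run's (epoch, step, metric) rows so they continue
    from where the pretrained run left off instead of restarting at 0."""
    return [
        (epoch + pretrain_last_epoch, step + pretrain_last_step, metric)
        for epoch, step, metric in finetune_rows
    ]

# Example with made-up numbers: if the pretrained model ended at epoch 80, step 40_000,
# the finetuned run's epoch 0 / step 0 is reported as epoch 80 / step 40_000.
rows = [(0, 0, 0.61), (1, 500, 0.64)]
print(align_finetune_metrics(rows, 80, 40_000))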

Project 4: Restructuring the Verdant Dataset Creation Process

I spearheaded efforts to restructure how Verdant creates datasets for model training. The existing pipeline involved first running a manifest generation program to produce the manifests indicating which images and annotations to include, then calling another function to tile the images and prepare the dataset for training with NVIDIA TLT/TAO. Between these two steps, there are several sets of arguments and configurations for filtering and setting up the dataset. The goal of this project was to simplify the process: define the parameters in a single file, then call one dataset creation function that filters and creates the manifests, tiles the images, and generates the training files in one go. Before the end of my internship, I produced a design document covering the full scope of the project, the code structure, and an initial implementation.
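A minimal sketch of the single-config idea follows, with illustrative field names and placeholder functions standing in for the real manifest generation and tiling steps:

# Sketch of a config-driven dataset creation entry point; field and function names are illustrative.
import yaml  # assumes PyYAML is available

def generate_manifests(crops, filters):
    # Placeholder for the manifest generation step (filtering images and annotations per crop).
    return [f"{crop}_manifest.json" for crop in crops]

def tile_and_export(manifests, tiling, output_dir):
    # Placeholder for tiling the images and writing TLT/TAO-ready training files.
    print(f"Tiling {manifests} with {tiling} into {output_dir}")

def create_dataset(config_path: str) -> None:
    # One config file drives both stages, replacing the separate sets of arguments.
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    manifests = generate_manifests(cfg["crops"], cfg["filters"])
    tile_and_export(manifests, cfg["tiling"], cfg["output_dir"])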

Project 5: Model Archiving with SFTP Clients, SSH, and Keys

Verdant employs several servers to train their models in parallel, after which all of the models are consolidated onto a single server for post-processing and deployment. I worked on converting the model archiving script from a Bash script to a Python script. At its core, the code takes a trained model and its files and copies them from the server they were trained on into the main local cache. For this project, I used paramiko, a Python module, and developed the code around an SSH Transport and SSH Client to transfer files across servers. In addition, I added support for passwords, SSH keys, and SSH agents so the program can establish connections across servers.
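Below is a simplified sketch of the paramiko connection and transfer pattern (the actual archiving script, host names, and paths are internal to Verdant):

# Simplified paramiko connection/transfer sketch; not the actual Verdant archiving script.
import paramiko

def fetch_model(host, username, remote_path, local_path,
                password=None, key_filename=None):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # paramiko tries, in order: an explicit key file, keys held by the SSH agent,
    # any discoverable local keys, and finally the password.
    client.connect(
        host,
        username=username,
        password=password,
        key_filename=key_filename,
        allow_agent=True,
        look_for_keys=True,
    )
    try:
        sftp = client.open_sftp()
        sftp.get(remote_path, local_path)  # copy the trained model file into the local cache
        sftp.close()
    finally:
        client.close()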

Project 6: Downsampling Data for Training and Processing

This project branched off functionality I developed for the ESRGAN work, specifically downsampling. The downsampling code generates low resolution images from the new higher resolution ones, so that models trained on each resolution can be compared. To this end, I had the code downsample not only the images but also the corresponding manifests, the files that house the labeled image data. Along the way, I updated the existing copy function of an object-oriented class in the manifest processing code and created a new copy function for another class, to which I also added class filtering functionality.
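An illustrative sketch of the downsampling step, assuming Pillow for the image resizing and pixel-space bounding boxes in the manifests (the actual manifest format is Verdant's own):

# Illustrative downsampling sketch; assumes Pillow and pixel-space boxes, not Verdant's manifest format.
from PIL import Image

def downsample_image(in_path: str, out_path: str, factor: int = 2) -> None:
    img = Image.open(in_path)
    small = img.resize((img.width // factor, img.height // factor), Image.BICUBIC)
    small.save(out_path)

def downsample_boxes(boxes, factor: int = 2):
    # Scale each (x_min, y_min, x_max, y_max) box so the labels still line up with the smaller image.
    return [tuple(coord / factor for coord in box) for box in boxes]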

Project 7: Training Models

Alongside my other projects, I also worked on model training, archiving, and evaluation. Some of the models I trained included a four-crop multicrop model and a one-class plant model. The purpose of the multicrop model was to determine whether training on more than one crop improves the mean average precision for the individual crops, while the one-class plant model served as a base for finetuning with additional crop and weed classes, to see whether pretraining on plants in general helps the model better differentiate crops. I also developed a manifest-combining script that merges different crop manifests into a single multicrop manifest from which tiles and TFRecords can be generated for training.
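A rough sketch of what manifest combining can look like, assuming JSON-lines manifests with one record per image (the real manifest format and script are Verdant's own):

# Rough manifest-combining sketch; assumes JSON-lines manifests, one record per image.
import json

def combine_manifests(manifest_paths, out_path):
    # Concatenate per-crop manifests into one multicrop manifest for tiling and TFRecord generation.
    with open(out_path, "w") as out:
        for path in manifest_paths:
            with open(path) as f:
                for line in f:
                    record = json.loads(line)
                    out.write(json.dumps(record) + "\n")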
