Deep Learning for Disease Resistance Classification using Drone Imagery of Corn
Project Overview
Corn breeders currently rely on slow, subjective visual scoring to assess foliage disease resistance across
thousands of plots. This project reframes disease assessment as a supervised regression task: training a deep
neural network to predict Southern Leaf Blight (SLB) severity directly from drone-collected RGB images. The
challenge lies in designing a system that generalizes across seasons and field conditions while managing large,
noisy, and imbalanced real-world data.
Data
The dataset comprises approximately 10,000 drone-collected images of corn plants, each paired with a foliage
health score on a scale from 1 (highly diseased) to 9 (no disease) in increments of 0.5. Disease severity
scores were assigned by PhD students and researchers in plant pathology over two years.
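Because the severity labels live on a bounded 1-9 scale with 0.5-point steps, a natural preprocessing step for a regression model is to normalize them to [0, 1] and snap predictions back onto the half-point grid. The helper names and the normalization choice below are illustrative assumptions, not part of the project's published pipeline:

```python
def score_to_target(score: float) -> float:
    """Map a 1-9 severity score (0.5 increments) to a [0, 1] regression target.

    Assumed convention: 1 (highly diseased) -> 0.0, 9 (no disease) -> 1.0.
    """
    if not (1.0 <= score <= 9.0) or (score * 2) != int(score * 2):
        raise ValueError(f"invalid severity score: {score}")
    return (score - 1.0) / 8.0


def target_to_score(target: float) -> float:
    """Snap a model prediction back onto the 0.5-increment 1-9 scale."""
    clipped = max(0.0, min(1.0, target))
    return round(clipped * 16) / 2 + 1.0
```

Normalizing the target keeps the loss well-scaled regardless of which range the raw scores use.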
Numerous CNN models were trained and tested, covering both custom architectures and off-the-shelf
implementations including EfficientNetV2 and ResNet. Custom models were implemented in PyTorch; using Optuna
for hyperparameter search and MLflow for experiment tracking, we trained and evaluated many variations of
convolutional and fully connected layers, adjusting both the depth and width of the models. Initial training
struggled to learn the data effectively, with accuracy remaining below 40%. Closer inspection of the training
images led us to suspect that the models were having difficulty identifying and focusing on the relevant corn
foliage and disease features, and were unable to filter out non-relevant features in the images such as
non-corn vegetation, soil, and shadows.
To augment our training images, we segmented them with Facebook's Segment Anything Model 2 (SAM 2) to filter
out the non-relevant features, allowing the models to train on images masked and filtered to the corn foliage.
The images below show an example of the transition from original image to image mask to filtered image used to
augment the training set.
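The masking step itself reduces to applying a binary foliage mask to each RGB image. The sketch below assumes the mask has already been produced (e.g. by SAM 2, whose API is not shown here); the function name and fill convention are illustrative assumptions:

```python
import numpy as np


def apply_foliage_mask(image: np.ndarray, mask: np.ndarray,
                       fill_value: int = 0) -> np.ndarray:
    """Keep only pixels inside a binary foliage mask.

    image: H x W x 3 RGB array; mask: H x W array, nonzero = foliage.
    Non-foliage pixels are replaced with fill_value (black by default).
    """
    if mask.shape != image.shape[:2]:
        raise ValueError("mask must match image height/width")
    keep = mask.astype(bool)
    out = np.full_like(image, fill_value)
    out[keep] = image[keep]
    return out
```

Blanking the background to a constant value removes soil, shadows, and non-corn vegetation before the images reach the CNN.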
Model Implementation
Using masked and filtered images for training and testing significantly improved the ability of our models to
learn the data, raising early accuracy scores above 80%. Model training and evaluation are still underway,
with a focus on optimizing our custom model implementation to find an optimal blend of convolutional and fully
connected layer depths and widths.
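A custom architecture with tunable depth and width, as described above, can be sketched as a parameterized PyTorch module. This is an assumed illustration of the general pattern, not the project's actual model; the builder's name, layer choices, and defaults are mine:

```python
import torch
import torch.nn as nn


def build_cnn(n_conv: int, base_channels: int, n_fc: int, fc_width: int,
              image_size: int = 128) -> nn.Sequential:
    """Build a CNN whose depth/width come from hyperparameter search.

    Outputs a single value, matching a severity-regression head.
    """
    layers: list[nn.Module] = []
    in_ch = 3  # RGB input
    for i in range(n_conv):
        out_ch = base_channels * (2 ** i)  # double channels each block
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                   nn.BatchNorm2d(out_ch), nn.ReLU(), nn.MaxPool2d(2)]
        in_ch = out_ch
    layers.append(nn.Flatten())
    feat = in_ch * (image_size // 2 ** n_conv) ** 2
    for _ in range(n_fc):
        layers += [nn.Linear(feat, fc_width), nn.ReLU()]
        feat = fc_width
    layers.append(nn.Linear(feat, 1))  # single severity output
    return nn.Sequential(*layers)
```

Keeping depth (`n_conv`, `n_fc`) and width (`base_channels`, `fc_width`) as constructor arguments is what lets a tuner like Optuna propose and evaluate many variants automatically.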
Model performance metrics will be added here in early September 2025...
Tech Stack
Details to be added: MLflow experiment tracking, Optuna hyperparameter search, and the custom CNN implemented in Python/PyTorch.