Accuracy

Each computer vision model was trained to perform DD object detection until it converged or the maximum number of batches was reached. All models performed well compared to the ground truth. Since every image contained a single hoof, each annotation contained a single bounding box. Therefore, image classification was performed using ResNet-18 as a baseline. All R-CNN and YOLO models performed well compared to the baseline, which achieved an accuracy of 72.20%, precision of 75.57%, and recall of 79.84%. All R-CNN and YOLO models achieved an mAP between 96.4% and 99.8%. Cascade R-CNN outperformed Faster R-CNN on all three performance measures. This matches expectations, since Cascade R-CNN is an extension of Faster R-CNN. YOLOv4 and Tiny YOLOv4 outperformed YOLOv3 and Tiny YOLOv3, respectively, on all three performance measures. This also matches expectations, since YOLOv4 and Tiny YOLOv4 are extensions of YOLOv3 and Tiny YOLOv3, respectively. Overall, YOLOv4 and Tiny YOLOv4 outperformed all other models with an accuracy of 97.50%, precision of 98.00%, and recall of 100.00%.
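For reference, the three performance measures reported above can be computed from per-image predictions as in the minimal sketch below. It assumes binary labels (1 for a positive image, 0 otherwise); the function name `classification_metrics` and the example labels are hypothetical and are not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels.

    Assumption: y_true and y_pred are equal-length, non-empty sequences,
    and `positive` marks the class of interest.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Illustrative labels only, not data from the experiments.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

A recall of 100.00%, as reported for YOLOv4 and Tiny YOLOv4, corresponds to zero false negatives in this formulation.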

Speed

The various computer vision models were used to perform DD object detection on a set of images. Tiny YOLOv4 outperformed all other models in inference speed, processing images at 333 FPS. The next fastest models were SSD and SSD Lite at approximately 100 FPS, followed by YOLOv4 at 65 FPS. This makes Tiny YOLOv4 an extremely fast and accurate model for real-time object detection on streaming video from a camera.
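Inference speed in frames per second can be estimated by timing a model over a set of images, as in the rough sketch below. The `infer` callable stands in for any detector's forward pass and is a hypothetical placeholder; the warm-up count is likewise an assumption, and this is not the benchmarking procedure used in the study.

```python
import time


def measure_fps(infer, images, warmup=5):
    """Estimate throughput (images per second) of `infer` over `images`.

    Assumption: `infer` is a callable that takes one image and runs
    synchronously, so wall-clock time reflects the full inference cost.
    """
    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for image in images[:warmup]:
        infer(image)

    start = time.perf_counter()
    for image in images:
        infer(image)
    elapsed = time.perf_counter() - start

    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return len(images) / elapsed


def dummy_infer(image):
    # Placeholder detector: sleeps about 1 ms instead of running a real model.
    time.sleep(0.001)
    return []


fps = measure_fps(dummy_infer, list(range(20)))
```

At 333 FPS, the average time per image is about 3 ms (1000 ms / 333), well under the roughly 33 ms per frame available for a 30 FPS video stream.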

Object Detection on Images

Object Detection on Videos