Yolov8-源码解析-十六-

Yolov8 源码解析(十六)


comments: true
description: Discover YOLOv10, the latest in real-time object detection, eliminating NMS and boosting efficiency. Achieve top performance with a low computational cost.
keywords: YOLOv10, real-time object detection, NMS-free, deep learning, Tsinghua University, Ultralytics, machine learning, neural networks, performance optimization

YOLOv10: Real-Time End-to-End Object Detection

YOLOv10, built on the Ultralytics Python package by researchers at Tsinghua University, introduces a new approach to real-time object detection, addressing both the post-processing and model architecture deficiencies found in previous YOLO versions. By eliminating non-maximum suppression (NMS) and optimizing various model components, YOLOv10 achieves state-of-the-art performance with significantly reduced computational overhead. Extensive experiments demonstrate its superior accuracy-latency trade-offs across multiple model scales.

YOLOv10 consistent dual assignment for NMS-free training

Overview

Real-time object detection aims to accurately predict object categories and positions in images with low latency. The YOLO series has been at the forefront of this research due to its balance between performance and efficiency. However, reliance on NMS and architectural inefficiencies have hindered optimal performance. YOLOv10 addresses these issues by introducing consistent dual assignments for NMS-free training and a holistic efficiency-accuracy driven model design strategy.

Architecture

The architecture of YOLOv10 builds upon the strengths of previous YOLO models while introducing several key innovations. The model architecture consists of the following components:

  1. Backbone: Responsible for feature extraction, the backbone in YOLOv10 uses an enhanced version of CSPNet (Cross Stage Partial Network) to improve gradient flow and reduce computational redundancy.
  2. Neck: The neck is designed to aggregate features from different scales and passes them to the head. It includes PAN (Path Aggregation Network) layers for effective multiscale feature fusion.
  3. One-to-Many Head: Generates multiple predictions per object during training to provide rich supervisory signals and improve learning accuracy.
  4. One-to-One Head: Generates a single best prediction per object during inference to eliminate the need for NMS, thereby reducing latency and improving efficiency.

Key Features

  1. NMS-Free Training: Utilizes consistent dual assignments to eliminate the need for NMS, reducing inference latency.
  2. Holistic Model Design: Comprehensive optimization of various components from both efficiency and accuracy perspectives, including lightweight classification heads, spatial-channel decoupled down sampling, and rank-guided block design.
  3. Enhanced Model Capabilities: Incorporates large-kernel convolutions and partial self-attention modules to improve performance without significant computational cost.

Model Variants

YOLOv10 comes in various model scales to cater to different application needs:

  • YOLOv10-N: Nano version for extremely resource-constrained environments.
  • YOLOv10-S: Small version balancing speed and accuracy.
  • YOLOv10-M: Medium version for general-purpose use.
  • YOLOv10-B: Balanced version with increased width for higher accuracy.
  • YOLOv10-L: Large version for higher accuracy at the cost of increased computational resources.
  • YOLOv10-X: Extra-large version for maximum accuracy and performance.

Performance

YOLOv10 outperforms previous YOLO versions and other state-of-the-art models in terms of accuracy and efficiency. For example, YOLOv10-S is 1.8x faster than RT-DETR-R18 with similar AP on the COCO dataset, and YOLOv10-B has 46% less latency and 25% fewer parameters than YOLOv9-C with the same performance.

Model Input Size APval FLOPs (G) Latency (ms)
YOLOv10-N 640 38.5 6.7 1.84
YOLOv10-S 640 46.3 21.6 2.49
YOLOv10-M 640 51.1 59.1 4.74
YOLOv10-B 640 52.5 92.0 5.74
YOLOv10-L 640 53.2 120.3 7.28
YOLOv10-X 640 54.4 160.4 10.70

Latency measured with TensorRT FP16 on T4 GPU.

Methodology

Consistent Dual Assignments for NMS-Free Training

YOLOv10 employs dual label assignments, combining one-to-many and one-to-one strategies during training to ensure rich supervision and efficient end-to-end deployment. The consistent matching metric aligns the supervision between both strategies, enhancing the quality of predictions during inference.

Holistic Efficiency-Accuracy Driven Model Design

Efficiency Enhancements

  1. Lightweight Classification Head: Reduces the computational overhead of the classification head by using depth-wise separable convolutions.
  2. Spatial-Channel Decoupled Down sampling: Decouples spatial reduction and channel modulation to minimize information loss and computational cost.
  3. Rank-Guided Block Design: Adapts block design based on intrinsic stage redundancy, ensuring optimal parameter utilization.

Accuracy Enhancements

  1. Large-Kernel Convolution: Enlarges the receptive field to enhance feature extraction capability.
  2. Partial Self-Attention (PSA): Incorporates self-attention modules to improve global representation learning with minimal overhead.

Experiments and Results

YOLOv10 has been extensively tested on standard benchmarks like COCO, demonstrating superior performance and efficiency. The model achieves state-of-the-art results across different variants, showcasing significant improvements in latency and accuracy compared to previous versions and other contemporary detectors.

Comparisons

YOLOv10 comparison with SOTA object detectors

Compared to other state-of-the-art detectors:

  • YOLOv10-S / X are 1.8× / 1.3× faster than RT-DETR-R18 / R101 with similar accuracy
  • YOLOv10-B has 25% fewer parameters and 46% lower latency than YOLOv9-C at same accuracy
  • YOLOv10-L / X outperform YOLOv8-L / X by 0.3 AP / 0.5 AP with 1.8× / 2.3× fewer parameters

Here is a detailed comparison of YOLOv10 variants with other state-of-the-art models:

Model Params
(M)
FLOPs
(G)
mAPval
50-95
Latency
(ms)
Latency-forward
(ms)
YOLOv6-3.0-N 4.7 11.4 37.0 2.69 1.76
Gold-YOLO-N 5.6 12.1 39.6 2.92 1.82
YOLOv8-N 3.2 8.7 37.3 6.16 1.77
YOLOv10-N 2.3 6.7 39.5 1.84 1.79
YOLOv6-3.0-S 18.5 45.3 44.3 3.42 2.35
Gold-YOLO-S 21.5 46.0 45.4 3.82 2.73
YOLOv8-S 11.2 28.6 44.9 7.07 2.33
YOLOv10-S 7.2 21.6 46.8 2.49 2.39
RT-DETR-R18 20.0 60.0 46.5 4.58 4.49
YOLOv6-3.0-M 34.9 85.8 49.1 5.63 4.56
Gold-YOLO-M 41.3 87.5 49.8 6.38 5.45
YOLOv8-M 25.9 78.9 50.6 9.50 5.09
YOLOv10-M 15.4 59.1 51.3 4.74 4.63
YOLOv6-3.0-L 59.6 150.7 51.8 9.02 7.90
Gold-YOLO-L 75.1 151.7 51.8 10.65 9.78
YOLOv8-L 43.7 165.2 52.9 12.39 8.06
RT-DETR-R50 42.0 136.0 53.1 9.20 9.07
YOLOv10-L 24.4 120.3 53.4 7.28 7.21
YOLOv8-X 68.2 257.8 53.9 16.86 12.83
RT-DETR-R101 76.0 259.0 54.3 13.71 13.58
YOLOv10-X 29.5 160.4 54.4 10.70 10.60

Usage Examples

For predicting new images with YOLOv10:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a pre-trained YOLOv10n model
    model = YOLO("yolov10n.pt")

    # Perform object detection on an image
    results = model("image.jpg")

    # Display the results
    results[0].show()
    ```

=== "CLI"

    ```py
    # Load a COCO-pretrained YOLOv10n model and run inference on the 'bus.jpg' image
    yolo detect predict model=yolov10n.pt source=path/to/bus.jpg
    ```

For training YOLOv10 on a custom dataset:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load YOLOv10n model from scratch
    model = YOLO("yolov10n.yaml")

    # Train the model
    model.train(data="coco8.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    # Build a YOLOv10n model from scratch and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov10n.yaml data=coco8.yaml epochs=100 imgsz=640

    # Build a YOLOv10n model from scratch and run inference on the 'bus.jpg' image
    yolo predict model=yolov10n.yaml source=path/to/bus.jpg
    ```

Supported Tasks and Modes

The YOLOv10 models series offers a range of models, each optimized for high-performance Object Detection. These models cater to varying computational needs and accuracy requirements, making them versatile for a wide array of applications.

Model Filenames Tasks Inference Validation Training Export
YOLOv10 yolov10n.pt yolov10s.pt yolov10m.pt yolov10l.pt yolov10x.pt Object Detection

Exporting YOLOv10

Due to the new operations introduced with YOLOv10, not all export formats provided by Ultralytics are currently supported. The following table outlines which formats have been successfully converted using Ultralytics for YOLOv10. Feel free to open a pull request if you're able to provide a contribution change for adding export support of additional formats for YOLOv10.

Export Format Supported
TorchScript
ONNX
OpenVINO
TensorRT
CoreML
TF SavedModel
TF GraphDef
TF Lite
TF Edge TPU
TF.js
PaddlePaddle
NCNN

Conclusion

YOLOv10 sets a new standard in real-time object detection by addressing the shortcomings of previous YOLO versions and incorporating innovative design strategies. Its ability to deliver high accuracy with low computational cost makes it an ideal choice for a wide range of real-world applications.

Citations and Acknowledgements

We would like to acknowledge the YOLOv10 authors from Tsinghua University for their extensive research and significant contributions to the Ultralytics framework:

!!! Quote ""

=== "BibTeX"

    ```py
    @article{THU-MIGyolov10,
      title={YOLOv10: Real-Time End-to-End Object Detection},
      author={Ao Wang, Hui Chen, Lihao Liu, et al.},
      journal={arXiv preprint arXiv:2405.14458},
      year={2024},
      institution={Tsinghua University},
      license = {AGPL-3.0}
    }
    ```

For detailed implementation, architectural innovations, and experimental results, please refer to the YOLOv10 research paper and GitHub repository by the Tsinghua University team.

FAQ

What is YOLOv10 and how does it differ from previous YOLO versions?

YOLOv10, developed by researchers at Tsinghua University, introduces several key innovations to real-time object detection. It eliminates the need for non-maximum suppression (NMS) by employing consistent dual assignments during training and optimized model components for superior performance with reduced computational overhead. For more details on its architecture and key features, check out the YOLOv10 overview section.

How can I get started with running inference using YOLOv10?

For easy inference, you can use the Ultralytics YOLO Python library or the command line interface (CLI). Below are examples of predicting new images using YOLOv10:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load the pre-trained YOLOv10-N model
    model = YOLO("yolov10n.pt")
    results = model("image.jpg")
    results[0].show()
    ```

=== "CLI"

    ```py
    yolo detect predict model=yolov10n.pt source=path/to/image.jpg
    ```

For more usage examples, visit our Usage Examples section.

Which model variants does YOLOv10 offer and what are their use cases?

YOLOv10 offers several model variants to cater to different use cases:

  • YOLOv10-N: Suitable for extremely resource-constrained environments
  • YOLOv10-S: Balances speed and accuracy
  • YOLOv10-M: General-purpose use
  • YOLOv10-B: Higher accuracy with increased width
  • YOLOv10-L: High accuracy at the cost of computational resources
  • YOLOv10-X: Maximum accuracy and performance

Each variant is designed for different computational needs and accuracy requirements, making them versatile for a variety of applications. Explore the Model Variants section for more information.

How does the NMS-free approach in YOLOv10 improve performance?

YOLOv10 eliminates the need for non-maximum suppression (NMS) during inference by employing consistent dual assignments for training. This approach reduces inference latency and enhances prediction efficiency. The architecture also includes a one-to-one head for inference, ensuring that each object gets a single best prediction. For a detailed explanation, see the Consistent Dual Assignments for NMS-Free Training section.

Where can I find the export options for YOLOv10 models?

YOLOv10 supports several export formats, including TorchScript, ONNX, OpenVINO, and TensorRT. However, not all export formats provided by Ultralytics are currently supported for YOLOv10 due to its new operations. For details on the supported formats and instructions on exporting, visit the Exporting YOLOv10 section.

What are the performance benchmarks for YOLOv10 models?

YOLOv10 outperforms previous YOLO versions and other state-of-the-art models in both accuracy and efficiency. For example, YOLOv10-S is 1.8x faster than RT-DETR-R18 with a similar AP on the COCO dataset. YOLOv10-B shows 46% less latency and 25% fewer parameters than YOLOv9-C with the same performance. Detailed benchmarks can be found in the Comparisons section.


comments: true
description: Discover YOLOv3 and its variants YOLOv3-Ultralytics and YOLOv3u. Learn about their features, implementations, and support for object detection tasks.
keywords: YOLOv3, YOLOv3-Ultralytics, YOLOv3u, object detection, Ultralytics, computer vision, AI models, deep learning

YOLOv3, YOLOv3-Ultralytics, and YOLOv3u

Overview

This document presents an overview of three closely related object detection models, namely YOLOv3, YOLOv3-Ultralytics, and YOLOv3u.

  1. YOLOv3: This is the third version of the You Only Look Once (YOLO) object detection algorithm. Originally developed by Joseph Redmon, YOLOv3 improved on its predecessors by introducing features such as multiscale predictions and three different sizes of detection kernels.

  2. YOLOv3-Ultralytics: This is Ultralytics' implementation of the YOLOv3 model. It reproduces the original YOLOv3 architecture and offers additional functionalities, such as support for more pre-trained models and easier customization options.

  3. YOLOv3u: This is an updated version of YOLOv3-Ultralytics that incorporates the anchor-free, objectness-free split head used in YOLOv8 models. YOLOv3u maintains the same backbone and neck architecture as YOLOv3 but with the updated detection head from YOLOv8.

Ultralytics YOLOv3

Key Features

  • YOLOv3: Introduced the use of three different scales for detection, leveraging three different sizes of detection kernels: 13x13, 26x26, and 52x52. This significantly improved detection accuracy for objects of different sizes. Additionally, YOLOv3 added features such as multi-label predictions for each bounding box and a better feature extractor network.

  • YOLOv3-Ultralytics: Ultralytics' implementation of YOLOv3 provides the same performance as the original model but comes with added support for more pre-trained models, additional training methods, and easier customization options. This makes it more versatile and user-friendly for practical applications.

  • YOLOv3u: This updated model incorporates the anchor-free, objectness-free split head from YOLOv8. By eliminating the need for pre-defined anchor boxes and objectness scores, this detection head design can improve the model's ability to detect objects of varying sizes and shapes. This makes YOLOv3u more robust and accurate for object detection tasks.

Supported Tasks and Modes

The YOLOv3 series, including YOLOv3, YOLOv3-Ultralytics, and YOLOv3u, are designed specifically for object detection tasks. These models are renowned for their effectiveness in various real-world scenarios, balancing accuracy and speed. Each variant offers unique features and optimizations, making them suitable for a range of applications.

All three models support a comprehensive set of modes, ensuring versatility in various stages of model deployment and development. These modes include Inference, Validation, Training, and Export, providing users with a complete toolkit for effective object detection.

Model Type Tasks Supported Inference Validation Training Export
YOLOv3 Object Detection
YOLOv3-Ultralytics Object Detection
YOLOv3u Object Detection

This table provides an at-a-glance view of the capabilities of each YOLOv3 variant, highlighting their versatility and suitability for various tasks and operational modes in object detection workflows.

Usage Examples

This example provides simple YOLOv3 training and inference examples. For full documentation on these and other modes see the Predict, Train, Val and Export docs pages.

!!! Example

=== "Python"

    PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv3n model
    model = YOLO("yolov3n.pt")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

    # Run inference with the YOLOv3n model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    CLI commands are available to directly run the models:

    ```py
    # Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640

    # Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image
    yolo predict model=yolov3n.pt source=path/to/bus.jpg
    ```

Citations and Acknowledgements

If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository:

!!! Quote ""

=== "BibTeX"

    ```py
    @article{redmon2018yolov3,
      title={YOLOv3: An Incremental Improvement},
      author={Redmon, Joseph and Farhadi, Ali},
      journal={arXiv preprint arXiv:1804.02767},
      year={2018}
    }
    ```

Thank you to Joseph Redmon and Ali Farhadi for developing the original YOLOv3.

FAQ

What are the differences between YOLOv3, YOLOv3-Ultralytics, and YOLOv3u?

YOLOv3 is the third iteration of the YOLO (You Only Look Once) object detection algorithm developed by Joseph Redmon, known for its balance of accuracy and speed, utilizing three different scales (13x13, 26x26, and 52x52) for detections. YOLOv3-Ultralytics is Ultralytics' adaptation of YOLOv3 that adds support for more pre-trained models and facilitates easier model customization. YOLOv3u is an upgraded variant of YOLOv3-Ultralytics, integrating the anchor-free, objectness-free split head from YOLOv8, improving detection robustness and accuracy for various object sizes. For more details on the variants, refer to the YOLOv3 series.

How can I train a YOLOv3 model using Ultralytics?

Training a YOLOv3 model with Ultralytics is straightforward. You can train the model using either Python or CLI:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv3n model
    model = YOLO("yolov3n.pt")

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    # Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640
    ```

For more comprehensive training options and guidelines, visit our Train mode documentation.

What makes YOLOv3u more accurate for object detection tasks?

YOLOv3u improves upon YOLOv3 and YOLOv3-Ultralytics by incorporating the anchor-free, objectness-free split head used in YOLOv8 models. This upgrade eliminates the need for pre-defined anchor boxes and objectness scores, enhancing its capability to detect objects of varying sizes and shapes more precisely. This makes YOLOv3u a better choice for complex and diverse object detection tasks. For more information, refer to the Why YOLOv3u section.

How can I use YOLOv3 models for inference?

You can perform inference using YOLOv3 models by either Python scripts or CLI commands:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv3n model
    model = YOLO("yolov3n.pt")

    # Run inference with the YOLOv3n model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    ```py
    # Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image
    yolo predict model=yolov3n.pt source=path/to/bus.jpg
    ```

Refer to the Inference mode documentation for more details on running YOLO models.

What tasks are supported by YOLOv3 and its variants?

YOLOv3, YOLOv3-Ultralytics, and YOLOv3u primarily support object detection tasks. These models can be used for various stages of model deployment and development, such as Inference, Validation, Training, and Export. For a comprehensive set of tasks supported and more in-depth details, visit our Object Detection tasks documentation.

Where can I find resources to cite YOLOv3 in my research?

If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository. Example BibTeX citation:

!!! Quote ""

=== "BibTeX"

    ```py
    @article{redmon2018yolov3,
      title={YOLOv3: An Incremental Improvement},
      author={Redmon, Joseph and Farhadi, Ali},
      journal={arXiv preprint arXiv:1804.02767},
      year={2018}
    }
    ```

For more citation details, refer to the Citations and Acknowledgements section.


comments: true
description: Explore YOLOv4, a state-of-the-art real-time object detection model by Alexey Bochkovskiy. Discover its architecture, features, and performance.
keywords: YOLOv4, object detection, real-time detection, Alexey Bochkovskiy, neural networks, machine learning, computer vision

YOLOv4: High-Speed and Precise Object Detection

Welcome to the Ultralytics documentation page for YOLOv4, a state-of-the-art, real-time object detector launched in 2020 by Alexey Bochkovskiy at https://github.com/AlexeyAB/darknet. YOLOv4 is designed to provide the optimal balance between speed and accuracy, making it an excellent choice for many applications.

YOLOv4 architecture diagram YOLOv4 architecture diagram. Showcasing the intricate network design of YOLOv4, including the backbone, neck, and head components, and their interconnected layers for optimal real-time object detection.

Introduction

YOLOv4 stands for You Only Look Once version 4. It is a real-time object detection model developed to address the limitations of previous YOLO versions like YOLOv3 and other object detection models. Unlike other convolutional neural network (CNN) based object detectors, YOLOv4 is not only applicable for recommendation systems but also for standalone process management and human input reduction. Its operation on conventional graphics processing units (GPUs) allows for mass usage at an affordable price, and it is designed to work in real-time on a conventional GPU while requiring only one such GPU for training.

Architecture

YOLOv4 makes use of several innovative features that work together to optimize its performance. These include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT), Mish-activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. These features are combined to achieve state-of-the-art results.

A typical object detector is composed of several parts including the input, the backbone, the neck, and the head. The backbone of YOLOv4 is pre-trained on ImageNet and is used to predict classes and bounding boxes of objects. The backbone could be from several models including VGG, ResNet, ResNeXt, or DenseNet. The neck part of the detector is used to collect feature maps from different stages and usually includes several bottom-up paths and several top-down paths. The head part is what is used to make the final object detections and classifications.

Bag of Freebies

YOLOv4 also makes use of methods known as "bag of freebies," which are techniques that improve the accuracy of the model during training without increasing the cost of inference. Data augmentation is a common bag of freebies technique used in object detection, which increases the variability of the input images to improve the robustness of the model. Some examples of data augmentation include photometric distortions (adjusting the brightness, contrast, hue, saturation, and noise of an image) and geometric distortions (adding random scaling, cropping, flipping, and rotating). These techniques help the model to generalize better to different types of images.

Features and Performance

YOLOv4 is designed for optimal speed and accuracy in object detection. The architecture of YOLOv4 includes CSPDarknet53 as the backbone, PANet as the neck, and YOLOv3 as the detection head. This design allows YOLOv4 to perform object detection at an impressive speed, making it suitable for real-time applications. YOLOv4 also excels in accuracy, achieving state-of-the-art results in object detection benchmarks.

Usage Examples

As of the time of writing, Ultralytics does not currently support YOLOv4 models. Therefore, any users interested in using YOLOv4 will need to refer directly to the YOLOv4 GitHub repository for installation and usage instructions.

Here is a brief overview of the typical steps you might take to use YOLOv4:

  1. Visit the YOLOv4 GitHub repository: https://github.com/AlexeyAB/darknet.

  2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.

  3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.

Please note that the specific steps may vary depending on your specific use case and the current state of the YOLOv4 repository. Therefore, it is strongly recommended to refer directly to the instructions provided in the YOLOv4 GitHub repository.

We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv4 is implemented.

Conclusion

YOLOv4 is a powerful and efficient object detection model that strikes a balance between speed and accuracy. Its use of unique features and bag of freebies techniques during training allows it to perform excellently in real-time object detection tasks. YOLOv4 can be trained and used by anyone with a conventional GPU, making it accessible and practical for a wide range of applications.

Citations and Acknowledgements

We would like to acknowledge the YOLOv4 authors for their significant contributions in the field of real-time object detection:

!!! Quote ""

=== "BibTeX"

    ```py
    @misc{bochkovskiy2020yolov4,
          title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
          author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
          year={2020},
          eprint={2004.10934},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
    ```

The original YOLOv4 paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.

FAQ

What is YOLOv4 and why should I use it for object detection?

YOLOv4, which stands for "You Only Look Once version 4," is a state-of-the-art real-time object detection model developed by Alexey Bochkovskiy in 2020. It achieves an optimal balance between speed and accuracy, making it highly suitable for real-time applications. YOLOv4's architecture incorporates several innovative features like Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), and Self-adversarial-training (SAT), among others, to achieve state-of-the-art results. If you're looking for a high-performance model that operates efficiently on conventional GPUs, YOLOv4 is an excellent choice.

How does the architecture of YOLOv4 enhance its performance?

The architecture of YOLOv4 includes several key components: the backbone, the neck, and the head. The backbone, which can be models like VGG, ResNet, or CSPDarknet53, is pre-trained to predict classes and bounding boxes. The neck, utilizing PANet, connects feature maps from different stages for comprehensive data extraction. Finally, the head, which uses configurations from YOLOv3, makes the final object detections. YOLOv4 also employs "bag of freebies" techniques like mosaic data augmentation and DropBlock regularization, further optimizing its speed and accuracy.

What are "bag of freebies" in the context of YOLOv4?

"Bag of freebies" refers to methods that improve the training accuracy of YOLOv4 without increasing the cost of inference. These techniques include various forms of data augmentation like photometric distortions (adjusting brightness, contrast, etc.) and geometric distortions (scaling, cropping, flipping, rotating). By increasing the variability of the input images, these augmentations help YOLOv4 generalize better to different types of images, thereby improving its robustness and accuracy without compromising its real-time performance.

Why is YOLOv4 considered suitable for real-time object detection on conventional GPUs?

YOLOv4 is designed to optimize both speed and accuracy, making it ideal for real-time object detection tasks that require quick and reliable performance. It operates efficiently on conventional GPUs, needing only one for both training and inference. This makes it accessible and practical for various applications ranging from recommendation systems to standalone process management, thereby reducing the need for extensive hardware setups and making it a cost-effective solution for real-time object detection.

How can I get started with YOLOv4 if Ultralytics does not currently support it?

To get started with YOLOv4, you should visit the official YOLOv4 GitHub repository. Follow the installation instructions provided in the README file, which typically include cloning the repository, installing dependencies, and setting up environment variables. Once installed, you can train the model by preparing your dataset, configuring the model parameters, and following the usage instructions provided. Since Ultralytics does not currently support YOLOv4, it is recommended to refer directly to the YOLOv4 GitHub for the most up-to-date and detailed guidance.


comments: true
description: Explore YOLOv5u, an advanced object detection model with optimized accuracy-speed tradeoff, featuring anchor-free Ultralytics head and various pre-trained models.
keywords: YOLOv5, YOLOv5u, object detection, Ultralytics, anchor-free, pre-trained models, accuracy, speed, real-time detection

YOLOv5

Overview

YOLOv5u represents an advancement in object detection methodologies. Originating from the foundational architecture of the YOLOv5 model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the YOLOv8 models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications.

Ultralytics YOLOv5

Key Features

  • Anchor-free Split Ultralytics Head: Traditional object detection models rely on predefined anchor boxes to predict object locations. However, YOLOv5u modernizes this approach. By adopting an anchor-free split Ultralytics head, it ensures a more flexible and adaptive detection mechanism, consequently enhancing the performance in diverse scenarios.

  • Optimized Accuracy-Speed Tradeoff: Speed and accuracy often pull in opposite directions. But YOLOv5u challenges this tradeoff. It offers a calibrated balance, ensuring real-time detections without compromising on accuracy. This feature is particularly invaluable for applications that demand swift responses, such as autonomous vehicles, robotics, and real-time video analytics.

  • Variety of Pre-trained Models: Understanding that different tasks require different toolsets, YOLOv5u provides a plethora of pre-trained models. Whether you're focusing on Inference, Validation, or Training, there's a tailor-made model awaiting you. This variety ensures you're not just using a one-size-fits-all solution, but a model specifically fine-tuned for your unique challenge.

Supported Tasks and Modes

The YOLOv5u models, with various pre-trained weights, excel in Object Detection tasks. They support a comprehensive range of modes, making them suitable for diverse applications, from development to deployment.

Model Type Pre-trained Weights Task Inference Validation Training Export
YOLOv5u yolov5nu, yolov5su, yolov5mu, yolov5lu, yolov5xu, yolov5n6u, yolov5s6u, yolov5m6u, yolov5l6u, yolov5x6u Object Detection

This table provides a detailed overview of the YOLOv5u model variants, highlighting their applicability in object detection tasks and support for various operational modes such as Inference, Validation, Training, and Export. This comprehensive support ensures that users can fully leverage the capabilities of YOLOv5u models in a wide range of object detection scenarios.

Performance Metrics

!!! Performance

=== "Detection"

See [Detection Docs](../tasks/detect.md) for usage examples with these models trained on [COCO](../datasets/detect/coco.md), which include 80 pre-trained classes.

| Model                                                                                       | YAML                                                                                                           | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|---------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-----------------------|----------------------|--------------------------------|-------------------------------------|--------------------|-------------------|
| [yolov5nu.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5nu.pt)   | [yolov5n.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml)     | 640                   | 34.3                 | 73.6                           | 1.06                                | 2.6                | 7.7               |
| [yolov5su.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5su.pt)   | [yolov5s.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml)     | 640                   | 43.0                 | 120.7                          | 1.27                                | 9.1                | 24.0              |
| [yolov5mu.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5mu.pt)   | [yolov5m.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml)     | 640                   | 49.0                 | 233.9                          | 1.86                                | 25.1               | 64.2              |
| [yolov5lu.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5lu.pt)   | [yolov5l.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml)     | 640                   | 52.2                 | 408.4                          | 2.50                                | 53.2               | 135.0             |
| [yolov5xu.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5xu.pt)   | [yolov5x.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml)     | 640                   | 53.2                 | 763.2                          | 3.81                                | 97.2               | 246.4             |
|                                                                                             |                                                                                                                |                       |                      |                                |                                     |                    |                   |
| [yolov5n6u.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5n6u.pt) | [yolov5n6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280                  | 42.1                 | 211.0                          | 1.83                                | 4.3                | 7.8               |
| [yolov5s6u.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5s6u.pt) | [yolov5s6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280                  | 48.6                 | 422.6                          | 2.34                                | 15.3               | 24.6              |
| [yolov5m6u.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5m6u.pt) | [yolov5m6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280                  | 53.6                 | 810.9                          | 4.36                                | 41.2               | 65.7              |
| [yolov5l6u.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5l6u.pt) | [yolov5l6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280                  | 55.7                 | 1470.9                         | 5.47                                | 86.1               | 137.4             |
| [yolov5x6u.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov5x6u.pt) | [yolov5x6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280                  | 56.8                 | 2436.5                         | 8.98                                | 155.4              | 250.7             |

Usage Examples

This example provides simple YOLOv5 training and inference examples. For full documentation on these and other modes see the Predict, Train, Val and Export docs pages.

!!! Example

=== "Python"

    PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv5n model
    model = YOLO("yolov5n.pt")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

    # Run inference with the YOLOv5n model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    CLI commands are available to directly run the models:

    ```py
    # Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640

    # Load a COCO-pretrained YOLOv5n model and run inference on the 'bus.jpg' image
    yolo predict model=yolov5n.pt source=path/to/bus.jpg
    ```

Citations and Acknowledgements

If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows:

!!! Quote ""

=== "BibTeX"

    ```py
    @software{yolov5,
      title = {Ultralytics YOLOv5},
      author = {Glenn Jocher},
      year = {2020},
      version = {7.0},
      license = {AGPL-3.0},
      url = {https://github.com/ultralytics/yolov5},
      doi = {10.5281/zenodo.3908559},
      orcid = {0000-0001-5950-6979}
    }
    ```

Please note that YOLOv5 models are provided under AGPL-3.0 and Enterprise licenses.

FAQ

What is Ultralytics YOLOv5u and how does it differ from YOLOv5?

Ultralytics YOLOv5u is an advanced version of YOLOv5, integrating the anchor-free, objectness-free split head that enhances the accuracy-speed tradeoff for real-time object detection tasks. Unlike the traditional YOLOv5, YOLOv5u adopts an anchor-free detection mechanism, making it more flexible and adaptive in diverse scenarios. For more detailed information on its features, you can refer to the YOLOv5 Overview.

How does the anchor-free Ultralytics head improve object detection performance in YOLOv5u?

The anchor-free Ultralytics head in YOLOv5u improves object detection performance by eliminating the dependency on predefined anchor boxes. This results in a more flexible and adaptive detection mechanism that can handle various object sizes and shapes with greater efficiency. This enhancement directly contributes to a balanced tradeoff between accuracy and speed, making YOLOv5u suitable for real-time applications. Learn more about its architecture in the Key Features section.

Can I use pre-trained YOLOv5u models for different tasks and modes?

Yes, you can use pre-trained YOLOv5u models for various tasks such as Object Detection. These models support multiple modes, including Inference, Validation, Training, and Export. This flexibility allows users to leverage the capabilities of YOLOv5u models across different operational requirements. For a detailed overview, check the Supported Tasks and Modes section.

How do the performance metrics of YOLOv5u models compare on different platforms?

The performance metrics of YOLOv5u models vary depending on the platform and hardware used. For example, the YOLOv5nu model achieves a 34.3 mAP on COCO dataset with a speed of 73.6 ms on CPU (ONNX) and 1.06 ms on A100 TensorRT. Detailed performance metrics for different YOLOv5u models can be found in the Performance Metrics section, which provides a comprehensive comparison across various devices.

How can I train a YOLOv5u model using the Ultralytics Python API?

You can train a YOLOv5u model by loading a pre-trained model and running the training command with your dataset. Here's a quick example:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv5n model
    model = YOLO("yolov5n.pt")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    # Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640
    ```

For more detailed instructions, visit the Usage Examples section.


comments: true
description: Explore Meituan YOLOv6, a top-tier object detector balancing speed and accuracy. Learn about its unique features and performance metrics on Ultralytics Docs.
keywords: Meituan YOLOv6, object detection, real-time applications, BiC module, Anchor-Aided Training, COCO dataset, high-performance models, Ultralytics Docs

Meituan YOLOv6

Overview

Meituan YOLOv6 is a cutting-edge object detector that offers remarkable balance between speed and accuracy, making it a popular choice for real-time applications. This model introduces several notable enhancements on its architecture and training scheme, including the implementation of a Bi-directional Concatenation (BiC) module, an anchor-aided training (AAT) strategy, and an improved backbone and neck design for state-of-the-art accuracy on the COCO dataset.

Meituan YOLOv6
Model example image Overview of YOLOv6. Model architecture diagram showing the redesigned network components and training strategies that have led to significant performance improvements. (a) The neck of YOLOv6 (N and S are shown). Note for M/L, RepBlocks is replaced with CSPStackRep. (b) The structure of a BiC module. (c) A SimCSPSPPF block. (source).

Key Features

  • Bidirectional Concatenation (BiC) Module: YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation.
  • Anchor-Aided Training (AAT) Strategy: This model proposes AAT to enjoy the benefits of both anchor-based and anchor-free paradigms without compromising inference efficiency.
  • Enhanced Backbone and Neck Design: By deepening YOLOv6 to include another stage in the backbone and neck, this model achieves state-of-the-art performance on the COCO dataset at high-resolution input.
  • Self-Distillation Strategy: A new self-distillation strategy is implemented to boost the performance of smaller models of YOLOv6, enhancing the auxiliary regression branch during training and removing it at inference to avoid a marked speed decline.

Performance Metrics

YOLOv6 provides various pre-trained models with different scales:

  • YOLOv6-N: 37.5% AP on COCO val2017 at 1187 FPS with NVIDIA Tesla T4 GPU.
  • YOLOv6-S: 45.0% AP at 484 FPS.
  • YOLOv6-M: 50.0% AP at 226 FPS.
  • YOLOv6-L: 52.8% AP at 116 FPS.
  • YOLOv6-L6: State-of-the-art accuracy in real-time.

YOLOv6 also provides quantized models for different precisions and models optimized for mobile platforms.

Usage Examples

This example provides simple YOLOv6 training and inference examples. For full documentation on these and other modes see the Predict, Train, Val and Export docs pages.

!!! Example

=== "Python"

    PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:

    ```py
    from ultralytics import YOLO

    # Build a YOLOv6n model from scratch
    model = YOLO("yolov6n.yaml")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

    # Run inference with the YOLOv6n model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    CLI commands are available to directly run the models:

    ```py
    # Build a YOLOv6n model from scratch and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640

    # Build a YOLOv6n model from scratch and run inference on the 'bus.jpg' image
    yolo predict model=yolov6n.yaml source=path/to/bus.jpg
    ```

Supported Tasks and Modes

The YOLOv6 series offers a range of models, each optimized for high-performance Object Detection. These models cater to varying computational needs and accuracy requirements, making them versatile for a wide array of applications.

Model Type Pre-trained Weights Tasks Supported Inference Validation Training Export
YOLOv6-N yolov6-n.pt Object Detection
YOLOv6-S yolov6-s.pt Object Detection
YOLOv6-M yolov6-m.pt Object Detection
YOLOv6-L yolov6-l.pt Object Detection
YOLOv6-L6 yolov6-l6.pt Object Detection

This table provides a detailed overview of the YOLOv6 model variants, highlighting their capabilities in object detection tasks and their compatibility with various operational modes such as Inference, Validation, Training, and Export. This comprehensive support ensures that users can fully leverage the capabilities of YOLOv6 models in a broad range of object detection scenarios.

Citations and Acknowledgements

We would like to acknowledge the authors for their significant contributions in the field of real-time object detection:

!!! Quote ""

=== "BibTeX"

    ```py
    @misc{li2023yolov6,
          title={YOLOv6 v3.0: A Full-Scale Reloading},
          author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu},
          year={2023},
          eprint={2301.05586},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
    ```

The original YOLOv6 paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.

FAQ

What is Meituan YOLOv6 and what makes it unique?

Meituan YOLOv6 is a state-of-the-art object detector that balances speed and accuracy, ideal for real-time applications. It features notable architectural enhancements like the Bi-directional Concatenation (BiC) module and an Anchor-Aided Training (AAT) strategy. These innovations provide substantial performance gains with minimal speed degradation, making YOLOv6 a competitive choice for object detection tasks.

How does the Bi-directional Concatenation (BiC) Module in YOLOv6 improve performance?

The Bi-directional Concatenation (BiC) module in YOLOv6 enhances localization signals in the detector's neck, delivering performance improvements with negligible speed impact. This module effectively combines different feature maps, increasing the model's ability to detect objects accurately. For more details on YOLOv6's features, refer to the Key Features section.

How can I train a YOLOv6 model using Ultralytics?

You can train a YOLOv6 model using Ultralytics with simple Python or CLI commands. For instance:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Build a YOLOv6n model from scratch
    model = YOLO("yolov6n.yaml")

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640
    ```

For more information, visit the Train page.

What are the different versions of YOLOv6 and their performance metrics?

YOLOv6 offers multiple versions, each optimized for different performance requirements:

  • YOLOv6-N: 37.5% AP at 1187 FPS
  • YOLOv6-S: 45.0% AP at 484 FPS
  • YOLOv6-M: 50.0% AP at 226 FPS
  • YOLOv6-L: 52.8% AP at 116 FPS
  • YOLOv6-L6: State-of-the-art accuracy in real-time scenarios

These models are evaluated on the COCO dataset using an NVIDIA Tesla T4 GPU. For more on performance metrics, see the Performance Metrics section.

How does the Anchor-Aided Training (AAT) strategy benefit YOLOv6?

Anchor-Aided Training (AAT) in YOLOv6 combines elements of anchor-based and anchor-free approaches, enhancing the model's detection capabilities without compromising inference efficiency. This strategy leverages anchors during training to improve bounding box predictions, making YOLOv6 effective in diverse object detection tasks.

Which operational modes are supported by YOLOv6 models in Ultralytics?

YOLOv6 supports various operational modes including Inference, Validation, Training, and Export. This flexibility allows users to fully exploit the model's capabilities in different scenarios. Check out the Supported Tasks and Modes section for a detailed overview of each mode.


comments: true
description: Discover YOLOv7, the breakthrough real-time object detector with top speed and accuracy. Learn about key features, usage, and performance metrics.
keywords: YOLOv7, real-time object detection, Ultralytics, AI, computer vision, model training, object detector

YOLOv7: Trainable Bag-of-Freebies

YOLOv7 is a state-of-the-art real-time object detector that surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS. It has the highest accuracy (56.8% AP) among all known real-time object detectors with 30 FPS or higher on GPU V100. Moreover, YOLOv7 outperforms other object detectors such as YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, and many others in speed and accuracy. The model is trained on the MS COCO dataset from scratch without using any other datasets or pre-trained weights. Source code for YOLOv7 is available on GitHub.

YOLOv7 comparison with SOTA object detectors

Comparison of SOTA object detectors

From the results in the YOLO comparison table we know that the proposed method has the best speed-accuracy trade-off comprehensively. If we compare YOLOv7-tiny-SiLU with YOLOv5-N (r6.1), our method is 127 fps faster and 10.7% more accurate on AP. In addition, YOLOv7 has 51.4% AP at frame rate of 161 fps, while PPYOLOE-L with the same AP has only 78 fps frame rate. In terms of parameter usage, YOLOv7 is 41% less than PPYOLOE-L. If we compare YOLOv7-X with 114 fps inference speed to YOLOv5-L (r6.1) with 99 fps inference speed, YOLOv7-X can improve AP by 3.9%. If YOLOv7-X is compared with YOLOv5-X (r6.1) of similar scale, the inference speed of YOLOv7-X is 31 fps faster. In addition, in terms the amount of parameters and computation, YOLOv7-X reduces 22% of parameters and 8% of computation compared to YOLOv5-X (r6.1), but improves AP by 2.2% (Source).

Model Params
(M)
FLOPs
(G)
Size
(pixels)
FPS APtest / val
50-95
APtest
50
APtest
75
APtest
S
APtest
M
APtest
L
YOLOX-S 9.0M 26.8G 640 102 40.5% / 40.5% - - - - -
YOLOX-M 25.3M 73.8G 640 81 47.2% / 46.9% - - - - -
YOLOX-L 54.2M 155.6G 640 69 50.1% / 49.7% - - - - -
YOLOX-X 99.1M 281.9G 640 58 51.5% / 51.1% - - - - -
PPYOLOE-S 7.9M 17.4G 640 208 43.1% / 42.7% 60.5% 46.6% 23.2% 46.4% 56.9%
PPYOLOE-M 23.4M 49.9G 640 123 48.9% / 48.6% 66.5% 53.0% 28.6% 52.9% 63.8%
PPYOLOE-L 52.2M 110.1G 640 78 51.4% / 50.9% 68.9% 55.6% 31.4% 55.3% 66.1%
PPYOLOE-X 98.4M 206.6G 640 45 52.2% / 51.9% 69.9% 56.5% 33.3% 56.3% 66.4%
YOLOv5-N (r6.1) 1.9M 4.5G 640 159 - / 28.0% - - - - -
YOLOv5-S (r6.1) 7.2M 16.5G 640 156 - / 37.4% - - - - -
YOLOv5-M (r6.1) 21.2M 49.0G 640 122 - / 45.4% - - - - -
YOLOv5-L (r6.1) 46.5M 109.1G 640 99 - / 49.0% - - - - -
YOLOv5-X (r6.1) 86.7M 205.7G 640 83 - / 50.7% - - - - -
YOLOR-CSP 52.9M 120.4G 640 106 51.1% / 50.8% 69.6% 55.7% 31.7% 55.3% 64.7%
YOLOR-CSP-X 96.9M 226.8G 640 87 53.0% / 52.7% 71.4% 57.9% 33.7% 57.1% 66.8%
YOLOv7-tiny-SiLU 6.2M 13.8G 640 286 38.7% / 38.7% 56.7% 41.7% 18.8% 42.4% 51.9%
YOLOv7 36.9M 104.7G 640 161 51.4% / 51.2% 69.7% 55.9% 31.8% 55.5% 65.0%
YOLOv7-X 71.3M 189.9G 640 114 53.1% / 52.9% 71.2% 57.8% 33.8% 57.1% 67.4%
YOLOv5-N6 (r6.1) 3.2M 18.4G 1280 123 - / 36.0% - - - - -
YOLOv5-S6 (r6.1) 12.6M 67.2G 1280 122 - / 44.8% - - - - -
YOLOv5-M6 (r6.1) 35.7M 200.0G 1280 90 - / 51.3% - - - - -
YOLOv5-L6 (r6.1) 76.8M 445.6G 1280 63 - / 53.7% - - - - -
YOLOv5-X6 (r6.1) 140.7M 839.2G 1280 38 - / 55.0% - - - - -
YOLOR-P6 37.2M 325.6G 1280 76 53.9% / 53.5% 71.4% 58.9% 36.1% 57.7% 65.6%
YOLOR-W6 79.8G 453.2G 1280 66 55.2% / 54.8% 72.7% 60.5% 37.7% 59.1% 67.1%
YOLOR-E6 115.8M 683.2G 1280 45 55.8% / 55.7% 73.4% 61.1% 38.4% 59.7% 67.7%
YOLOR-D6 151.7M 935.6G 1280 34 56.5% / 56.1% 74.1% 61.9% 38.9% 60.4% 68.7%
YOLOv7-W6 70.4M 360.0G 1280 84 54.9% / 54.6% 72.6% 60.1% 37.3% 58.7% 67.1%
YOLOv7-E6 97.2M 515.2G 1280 56 56.0% / 55.9% 73.5% 61.2% 38.0% 59.9% 68.4%
YOLOv7-D6 154.7M 806.8G 1280 44 56.6% / 56.3% 74.0% 61.8% 38.8% 60.1% 69.5%
YOLOv7-E6E 151.7M 843.2G 1280 36 56.8% / 56.8% 74.4% 62.1% 39.3% 60.5% 69.0%

Overview

Real-time object detection is an important component in many computer vision systems, including multi-object tracking, autonomous driving, robotics, and medical image analysis. In recent years, real-time object detection development has focused on designing efficient architectures and improving the inference speed of various CPUs, GPUs, and neural processing units (NPUs). YOLOv7 supports both mobile GPU and GPU devices, from the edge to the cloud.

Unlike traditional real-time object detectors that focus on architecture optimization, YOLOv7 introduces a focus on the optimization of the training process. This includes modules and optimization methods designed to improve the accuracy of object detection without increasing the inference cost, a concept known as the "trainable bag-of-freebies".

Key Features

YOLOv7 introduces several key features:

  1. Model Re-parameterization: YOLOv7 proposes a planned re-parameterized model, which is a strategy applicable to layers in different networks with the concept of gradient propagation path.

  2. Dynamic Label Assignment: The training of the model with multiple output layers presents a new issue: "How to assign dynamic targets for the outputs of different branches?" To solve this problem, YOLOv7 introduces a new label assignment method called coarse-to-fine lead guided label assignment.

  3. Extended and Compound Scaling: YOLOv7 proposes "extend" and "compound scaling" methods for the real-time object detector that can effectively utilize parameters and computation.

  4. Efficiency: The method proposed by YOLOv7 can effectively reduce about 40% parameters and 50% computation of state-of-the-art real-time object detector, and has faster inference speed and higher detection accuracy.

Usage Examples

As of the time of writing, Ultralytics does not currently support YOLOv7 models. Therefore, any users interested in using YOLOv7 will need to refer directly to the YOLOv7 GitHub repository for installation and usage instructions.

Here is a brief overview of the typical steps you might take to use YOLOv7:

  1. Visit the YOLOv7 GitHub repository: https://github.com/WongKinYiu/yolov7.

  2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.

  3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.

Please note that the specific steps may vary depending on your specific use case and the current state of the YOLOv7 repository. Therefore, it is strongly recommended to refer directly to the instructions provided in the YOLOv7 GitHub repository.

We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv7 is implemented.

Citations and Acknowledgements

We would like to acknowledge the YOLOv7 authors for their significant contributions in the field of real-time object detection:

!!! Quote ""

=== "BibTeX"

    ```py
    @article{wang2022yolov7,
      title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
      author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
      journal={arXiv preprint arXiv:2207.02696},
      year={2022}
    }
    ```

The original YOLOv7 paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.

FAQ

What is YOLOv7 and why is it considered a breakthrough in real-time object detection?

YOLOv7 is a cutting-edge real-time object detection model that achieves unparalleled speed and accuracy. It surpasses other models, such as YOLOX, YOLOv5, and PPYOLOE, in both parameters usage and inference speed. YOLOv7's distinguishing features include its model re-parameterization and dynamic label assignment, which optimize its performance without increasing inference costs. For more technical details about its architecture and comparison metrics with other state-of-the-art object detectors, refer to the YOLOv7 paper.

How does YOLOv7 improve on previous YOLO models like YOLOv4 and YOLOv5?

YOLOv7 introduces several innovations, including model re-parameterization and dynamic label assignment, which enhance the training process and improve inference accuracy. Compared to YOLOv5, YOLOv7 significantly boosts speed and accuracy. For instance, YOLOv7-X improves accuracy by 2.2% and reduces parameters by 22% compared to YOLOv5-X. Detailed comparisons can be found in the performance table YOLOv7 comparison with SOTA object detectors.

Can I use YOLOv7 with Ultralytics tools and platforms?

As of now, Ultralytics does not directly support YOLOv7 in its tools and platforms. Users interested in using YOLOv7 need to follow the installation and usage instructions provided in the YOLOv7 GitHub repository. For other state-of-the-art models, you can explore and train using Ultralytics tools like Ultralytics HUB.

How do I install and run YOLOv7 for a custom object detection project?

To install and run YOLOv7, follow these steps:

  1. Clone the YOLOv7 repository:
    git clone https://github.com/WongKinYiu/yolov7
    
  2. Navigate to the cloned directory and install dependencies:
    cd yolov7
    pip install -r requirements.txt
    
  3. Prepare your dataset and configure the model parameters according to the usage instructions provided in the repository.
    For further guidance, visit the YOLOv7 GitHub repository for the latest information and updates.

What are the key features and optimizations introduced in YOLOv7?

YOLOv7 offers several key features that revolutionize real-time object detection:

  • Model Re-parameterization: Enhances the model's performance by optimizing gradient propagation paths.
  • Dynamic Label Assignment: Uses a coarse-to-fine lead guided method to assign dynamic targets for outputs across different branches, improving accuracy.
  • Extended and Compound Scaling: Efficiently utilizes parameters and computation to scale the model for various real-time applications.
  • Efficiency: Reduces parameter count by 40% and computation by 50% compared to other state-of-the-art models while achieving faster inference speeds.
    For further details on these features, see the YOLOv7 Overview section.

comments: true
description: Discover YOLOv8, the latest advancement in real-time object detection, optimizing performance with an array of pre-trained models for diverse tasks.
keywords: YOLOv8, real-time object detection, YOLO series, Ultralytics, computer vision, advanced object detection, AI, machine learning, deep learning

YOLOv8

Overview

YOLOv8 is the latest iteration in the YOLO series of real-time object detectors, offering cutting-edge performance in terms of accuracy and speed. Building upon the advancements of previous YOLO versions, YOLOv8 introduces new features and optimizations that make it an ideal choice for various object detection tasks in a wide range of applications.

Ultralytics YOLOv8



Watch: Ultralytics YOLOv8 Model Overview

Key Features

  • Advanced Backbone and Neck Architectures: YOLOv8 employs state-of-the-art backbone and neck architectures, resulting in improved feature extraction and object detection performance.
  • Anchor-free Split Ultralytics Head: YOLOv8 adopts an anchor-free split Ultralytics head, which contributes to better accuracy and a more efficient detection process compared to anchor-based approaches.
  • Optimized Accuracy-Speed Tradeoff: With a focus on maintaining an optimal balance between accuracy and speed, YOLOv8 is suitable for real-time object detection tasks in diverse application areas.
  • Variety of Pre-trained Models: YOLOv8 offers a range of pre-trained models to cater to various tasks and performance requirements, making it easier to find the right model for your specific use case.

Supported Tasks and Modes

The YOLOv8 series offers a diverse range of models, each specialized for specific tasks in computer vision. These models are designed to cater to various requirements, from object detection to more complex tasks like instance segmentation, pose/keypoints detection, oriented object detection, and classification.

Each variant of the YOLOv8 series is optimized for its respective task, ensuring high performance and accuracy. Additionally, these models are compatible with various operational modes including Inference, Validation, Training, and Export, facilitating their use in different stages of deployment and development.

Model Filenames Task Inference Validation Training Export
YOLOv8 yolov8n.pt yolov8s.pt yolov8m.pt yolov8l.pt yolov8x.pt Detection
YOLOv8-seg yolov8n-seg.pt yolov8s-seg.pt yolov8m-seg.pt yolov8l-seg.pt yolov8x-seg.pt Instance Segmentation
YOLOv8-pose yolov8n-pose.pt yolov8s-pose.pt yolov8m-pose.pt yolov8l-pose.pt yolov8x-pose.pt yolov8x-pose-p6.pt Pose/Keypoints
YOLOv8-obb yolov8n-obb.pt yolov8s-obb.pt yolov8m-obb.pt yolov8l-obb.pt yolov8x-obb.pt Oriented Detection
YOLOv8-cls yolov8n-cls.pt yolov8s-cls.pt yolov8m-cls.pt yolov8l-cls.pt yolov8x-cls.pt Classification

This table provides an overview of the YOLOv8 model variants, highlighting their applicability in specific tasks and their compatibility with various operational modes such as Inference, Validation, Training, and Export. It showcases the versatility and robustness of the YOLOv8 series, making them suitable for a variety of applications in computer vision.

Performance Metrics

!!! Performance

=== "Detection (COCO)"

    See [Detection Docs](../tasks/detect.md) for usage examples with these models trained on [COCO](../datasets/detect/coco.md), which include 80 pre-trained classes.

    | Model                                                                                | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
    | ------------------------------------------------------------------------------------ | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
    | [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt) | 640                   | 37.3                 | 80.4                           | 0.99                                | 3.2                | 8.7               |
    | [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt) | 640                   | 44.9                 | 128.4                          | 1.20                                | 11.2               | 28.6              |
    | [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m.pt) | 640                   | 50.2                 | 234.7                          | 1.83                                | 25.9               | 78.9              |
    | [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l.pt) | 640                   | 52.9                 | 375.2                          | 2.39                                | 43.7               | 165.2             |
    | [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x.pt) | 640                   | 53.9                 | 479.1                          | 3.53                                | 68.2               | 257.8             |

=== "Detection (Open Images V7)"

    See [Detection Docs](../tasks/detect.md) for usage examples with these models trained on [Open Image V7](../datasets/detect/open-images-v7.md), which include 600 pre-trained classes.

    | Model                                                                                     | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
    | ----------------------------------------------------------------------------------------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
    | [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-oiv7.pt) | 640                   | 18.4                 | 142.4                          | 1.21                                | 3.5                | 10.5              |
    | [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-oiv7.pt) | 640                   | 27.7                 | 183.1                          | 1.40                                | 11.4               | 29.7              |
    | [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-oiv7.pt) | 640                   | 33.6                 | 408.5                          | 2.26                                | 26.2               | 80.6              |
    | [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-oiv7.pt) | 640                   | 34.9                 | 596.9                          | 2.43                                | 44.1               | 167.4             |
    | [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-oiv7.pt) | 640                   | 36.3                 | 860.6                          | 3.56                                | 68.7               | 260.6             |

=== "Segmentation (COCO)"

    See [Segmentation Docs](../tasks/segment.md) for usage examples with these models trained on [COCO](../datasets/segment/coco.md), which include 80 pre-trained classes.

    | Model                                                                                        | size<br><sup>(pixels) | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
    | -------------------------------------------------------------------------------------------- | --------------------- | -------------------- | --------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
    | [YOLOv8n-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-seg.pt) | 640                   | 36.7                 | 30.5                  | 96.1                           | 1.21                                | 3.4                | 12.6              |
    | [YOLOv8s-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-seg.pt) | 640                   | 44.6                 | 36.8                  | 155.7                          | 1.47                                | 11.8               | 42.6              |
    | [YOLOv8m-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-seg.pt) | 640                   | 49.9                 | 40.8                  | 317.0                          | 2.18                                | 27.3               | 110.2             |
    | [YOLOv8l-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-seg.pt) | 640                   | 52.3                 | 42.6                  | 572.4                          | 2.79                                | 46.0               | 220.5             |
    | [YOLOv8x-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-seg.pt) | 640                   | 53.4                 | 43.4                  | 712.1                          | 4.02                                | 71.8               | 344.1             |

=== "Classification (ImageNet)"

    See [Classification Docs](../tasks/classify.md) for usage examples with these models trained on [ImageNet](../datasets/classify/imagenet.md), which include 1000 pre-trained classes.

    | Model                                                                                        | size<br><sup>(pixels) | acc<br><sup>top1 | acc<br><sup>top5 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
    | -------------------------------------------------------------------------------------------- | --------------------- | ---------------- | ---------------- | ------------------------------ | ----------------------------------- | ------------------ | ------------------------ |
    | [YOLOv8n-cls](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-cls.pt) | 224                   | 69.0             | 88.3             | 12.9                           | 0.31                                | 2.7                | 4.3                      |
    | [YOLOv8s-cls](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-cls.pt) | 224                   | 73.8             | 91.7             | 23.4                           | 0.35                                | 6.4                | 13.5                     |
    | [YOLOv8m-cls](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-cls.pt) | 224                   | 76.8             | 93.5             | 85.4                           | 0.62                                | 17.0               | 42.7                     |
    | [YOLOv8l-cls](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-cls.pt) | 224                   | 76.8             | 93.5             | 163.0                          | 0.87                                | 37.5               | 99.7                     |
    | [YOLOv8x-cls](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-cls.pt) | 224                   | 79.0             | 94.6             | 232.0                          | 1.01                                | 57.4               | 154.8                    |

=== "Pose (COCO)"

    See [Pose Estimation Docs](../tasks/pose.md) for usage examples with these models trained on [COCO](../datasets/pose/coco.md), which include 1 pre-trained class, 'person'.

    | Model                                                                                                | size<br><sup>(pixels) | mAP<sup>pose<br>50-95 | mAP<sup>pose<br>50 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
    | ---------------------------------------------------------------------------------------------------- | --------------------- | --------------------- | ------------------ | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
    | [YOLOv8n-pose](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-pose.pt)       | 640                   | 50.4                  | 80.1               | 131.8                          | 1.18                                | 3.3                | 9.2               |
    | [YOLOv8s-pose](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-pose.pt)       | 640                   | 60.0                  | 86.2               | 233.2                          | 1.42                                | 11.6               | 30.2              |
    | [YOLOv8m-pose](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-pose.pt)       | 640                   | 65.0                  | 88.8               | 456.3                          | 2.00                                | 26.4               | 81.0              |
    | [YOLOv8l-pose](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-pose.pt)       | 640                   | 67.6                  | 90.0               | 784.5                          | 2.59                                | 44.4               | 168.6             |
    | [YOLOv8x-pose](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-pose.pt)       | 640                   | 69.2                  | 90.2               | 1607.1                         | 3.73                                | 69.4               | 263.2             |
    | [YOLOv8x-pose-p6](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-pose-p6.pt) | 1280                  | 71.6                  | 91.2               | 4088.7                         | 10.04                               | 99.1               | 1066.4            |

=== "OBB (DOTAv1)"

    See [Oriented Detection Docs](../tasks/obb.md) for usage examples with these models trained on [DOTAv1](../datasets/obb/dota-v2.md#dota-v10), which include 15 pre-trained classes.

    | Model                                                                                        | size<br><sup>(pixels) | mAP<sup>test<br>50   | Speed<br><sup>CPU ONNX<br>(ms)   | Speed<br><sup>A100 TensorRT<br>(ms)   | params<br><sup>(M)   | FLOPs<br><sup>(B) |
    |----------------------------------------------------------------------------------------------|-----------------------| -------------------- | -------------------------------- | ------------------------------------- | -------------------- | ----------------- |
    | [YOLOv8n-obb](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-obb.pt) | 1024                  | 78.0                 | 204.77                           | 3.57                                  | 3.1                  | 23.3              |
    | [YOLOv8s-obb](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-obb.pt) | 1024                  | 79.5                 | 424.88                           | 4.07                                  | 11.4                 | 76.3              |
    | [YOLOv8m-obb](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-obb.pt) | 1024                  | 80.5                 | 763.48                           | 7.61                                  | 26.4                 | 208.6             |
    | [YOLOv8l-obb](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-obb.pt) | 1024                  | 80.7                 | 1278.42                          | 11.83                                 | 44.5                 | 433.8             |
    | [YOLOv8x-obb](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-obb.pt) | 1024                  | 81.36                | 1759.10                          | 13.23                                 | 69.5                 | 676.7             |

Usage Examples

This example provides simple YOLOv8 training and inference examples. For full documentation on these and other modes see the Predict, Train, Val and Export docs pages.

Note the below example is for YOLOv8 Detect models for object detection. For additional supported tasks see the Segment, Classify, OBB docs and Pose docs.

!!! Example

=== "Python"

    PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv8n model
    model = YOLO("yolov8n.pt")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

    # Run inference with the YOLOv8n model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    CLI commands are available to directly run the models:

    ```py
    # Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640

    # Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
    yolo predict model=yolov8n.pt source=path/to/bus.jpg
    ```

Citations and Acknowledgements

If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format:

!!! Quote ""

=== "BibTeX"

    ```py
    @software{yolov8_ultralytics,
      author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu},
      title = {Ultralytics YOLOv8},
      version = {8.0.0},
      year = {2023},
      url = {https://github.com/ultralytics/ultralytics},
      orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069},
      license = {AGPL-3.0}
    }
    ```

Please note that the DOI is pending and will be added to the citation once it is available. YOLOv8 models are provided under AGPL-3.0 and Enterprise licenses.

FAQ

What is YOLOv8 and how does it differ from previous YOLO versions?

YOLOv8 is the latest iteration in the Ultralytics YOLO series, designed to improve real-time object detection performance with advanced features. Unlike earlier versions, YOLOv8 incorporates an anchor-free split Ultralytics head, state-of-the-art backbone and neck architectures, and offers optimized accuracy-speed tradeoff, making it ideal for diverse applications. For more details, check the Overview and Key Features sections.

How can I use YOLOv8 for different computer vision tasks?

YOLOv8 supports a wide range of computer vision tasks, including object detection, instance segmentation, pose/keypoints detection, oriented object detection, and classification. Each model variant is optimized for its specific task and compatible with various operational modes like Inference, Validation, Training, and Export. Refer to the Supported Tasks and Modes section for more information.

What are the performance metrics for YOLOv8 models?

YOLOv8 models achieve state-of-the-art performance across various benchmarking datasets. For instance, the YOLOv8n model achieves a mAP (mean Average Precision) of 37.3 on the COCO dataset and a speed of 0.99 ms on A100 TensorRT. Detailed performance metrics for each model variant across different tasks and datasets can be found in the Performance Metrics section.

How do I train a YOLOv8 model?

Training a YOLOv8 model can be done using either Python or CLI. Below are examples for training a model using a COCO-pretrained YOLOv8 model on the COCO8 dataset for 100 epochs:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a COCO-pretrained YOLOv8n model
    model = YOLO("yolov8n.pt")

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640
    ```

For further details, visit the Training documentation.

Can I benchmark YOLOv8 models for performance?

Yes, YOLOv8 models can be benchmarked for performance in terms of speed and accuracy across various export formats. You can use PyTorch, ONNX, TensorRT, and more for benchmarking. Below are example commands for benchmarking using Python and CLI:

!!! Example

=== "Python"

    ```py
    from ultralytics.utils.benchmarks import benchmark

    # Benchmark on GPU
    benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
    ```

=== "CLI"

    ```py
    yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
    ```

For additional information, check the Performance Metrics section.


comments: true
description: Explore YOLOv9, the latest leap in real-time object detection, featuring innovations like PGI and GELAN, and achieving new benchmarks in efficiency and accuracy.
keywords: YOLOv9, object detection, real-time, PGI, GELAN, deep learning, MS COCO, AI, neural networks, model efficiency, accuracy, Ultralytics

YOLOv9: A Leap Forward in Object Detection Technology

YOLOv9 marks a significant advancement in real-time object detection, introducing groundbreaking techniques such as Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). This model demonstrates remarkable improvements in efficiency, accuracy, and adaptability, setting new benchmarks on the MS COCO dataset. The YOLOv9 project, while developed by a separate open-source team, builds upon the robust codebase provided by Ultralytics YOLOv5, showcasing the collaborative spirit of the AI research community.



Watch: YOLOv9 Training on Custom Data using Ultralytics | Industrial Package Dataset

YOLOv9 performance comparison

Introduction to YOLOv9

In the quest for optimal real-time object detection, YOLOv9 stands out with its innovative approach to overcoming information loss challenges inherent in deep neural networks. By integrating PGI and the versatile GELAN architecture, YOLOv9 not only enhances the model's learning capacity but also ensures the retention of crucial information throughout the detection process, thereby achieving exceptional accuracy and performance.

Core Innovations of YOLOv9

YOLOv9's advancements are deeply rooted in addressing the challenges posed by information loss in deep neural networks. The Information Bottleneck Principle and the innovative use of Reversible Functions are central to its design, ensuring YOLOv9 maintains high efficiency and accuracy.

Information Bottleneck Principle

The Information Bottleneck Principle reveals a fundamental challenge in deep learning: as data passes through successive layers of a network, the potential for information loss increases. This phenomenon is mathematically represented as:

I(X, X) >= I(X, f_theta(X)) >= I(X, g_phi(f_theta(X)))

where I denotes mutual information, and f and g represent transformation functions with parameters theta and phi, respectively. YOLOv9 counters this challenge by implementing Programmable Gradient Information (PGI), which aids in preserving essential data across the network's depth, ensuring more reliable gradient generation and, consequently, better model convergence and performance.

Reversible Functions

The concept of Reversible Functions is another cornerstone of YOLOv9's design. A function is deemed reversible if it can be inverted without any loss of information, as expressed by:

X = v_zeta(r_psi(X))

with psi and zeta as parameters for the reversible and its inverse function, respectively. This property is crucial for deep learning architectures, as it allows the network to retain a complete information flow, thereby enabling more accurate updates to the model's parameters. YOLOv9 incorporates reversible functions within its architecture to mitigate the risk of information degradation, especially in deeper layers, ensuring the preservation of critical data for object detection tasks.

Impact on Lightweight Models

Addressing information loss is particularly vital for lightweight models, which are often under-parameterized and prone to losing significant information during the feedforward process. YOLOv9's architecture, through the use of PGI and reversible functions, ensures that even with a streamlined model, the essential information required for accurate object detection is retained and effectively utilized.

Programmable Gradient Information (PGI)

PGI is a novel concept introduced in YOLOv9 to combat the information bottleneck problem, ensuring the preservation of essential data across deep network layers. This allows for the generation of reliable gradients, facilitating accurate model updates and improving the overall detection performance.

Generalized Efficient Layer Aggregation Network (GELAN)

GELAN represents a strategic architectural advancement, enabling YOLOv9 to achieve superior parameter utilization and computational efficiency. Its design allows for flexible integration of various computational blocks, making YOLOv9 adaptable to a wide range of applications without sacrificing speed or accuracy.

YOLOv9 architecture comparison

YOLOv9 Benchmarks

Benchmarking in YOLOv9 using Ultralytics involves evaluating the performance of your trained and validated model in real-world scenarios. This process includes:

  • Performance Evaluation: Assessing the model's speed and accuracy.
  • Export Formats: Testing the model across different export formats to ensure it meets the necessary standards and performs well in various environments.
  • Framework Support: Providing a comprehensive framework within Ultralytics YOLOv8 to facilitate these assessments and ensure consistent and reliable results.

By benchmarking, you can ensure that your model not only performs well in controlled testing environments but also maintains high performance in practical, real-world applications.



Watch: How to Benchmark the YOLOv9 Model Using the Ultralytics Python Package

Performance on MS COCO Dataset

The performance of YOLOv9 on the COCO dataset exemplifies its significant advancements in real-time object detection, setting new benchmarks across various model sizes. Table 1 presents a comprehensive comparison of state-of-the-art real-time object detectors, illustrating YOLOv9's superior efficiency and accuracy.

Table 1. Comparison of State-of-the-Art Real-Time Object Detectors

!!! tip "Performance"

=== "Detection (COCO)"

    | Model                                                                                 | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | mAP<sup>val<br>50 | params<br><sup>(M) | FLOPs<br><sup>(B) |
    |---------------------------------------------------------------------------------------|-----------------------|----------------------|-------------------|--------------------|-------------------|
    | [YOLOv9t](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9t.pt)  | 640                   | 38.3                 | 53.1              | 2.0                | 7.7               |
    | [YOLOv9s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9s.pt)  | 640                   | 46.8                 | 63.4              | 7.2                | 26.7              |
    | [YOLOv9m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9m.pt)  | 640                   | 51.4                 | 68.1              | 20.1               | 76.8              |
    | [YOLOv9c](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9c.pt)  | 640                   | 53.0                 | 70.2              | 25.5               | 102.8             |
    | [YOLOv9e](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9e.pt)  | 640                   | 55.6                 | 72.8              | 58.1               | 192.5             |

=== "Segmentation (COCO)"

    | Model                                                                                         | size<br><sup>(pixels) | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | params<br><sup>(M) | FLOPs<br><sup>(B) |
    |-----------------------------------------------------------------------------------------------|-----------------------|----------------------|-----------------------|--------------------|-------------------|
    | [YOLOv9c-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9c-seg.pt)  | 640                   | 52.4                 | 42.2                  | 27.9               | 159.4             |
    | [YOLOv9e-seg](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9e-seg.pt)  | 640                   | 55.1                 | 44.3                  | 60.5               | 248.4             |

YOLOv9's iterations, ranging from the tiny t variant to the extensive e model, demonstrate improvements not only in accuracy (mAP metrics) but also in efficiency with a reduced number of parameters and computational needs (FLOPs). This table underscores YOLOv9's ability to deliver high precision while maintaining or reducing the computational overhead compared to prior versions and competing models.

Comparatively, YOLOv9 exhibits remarkable gains:

  • Lightweight Models: YOLOv9s surpasses the YOLO MS-S in parameter efficiency and computational load while achieving an improvement of 0.4∼0.6% in AP.
  • Medium to Large Models: YOLOv9m and YOLOv9e show notable advancements in balancing the trade-off between model complexity and detection performance, offering significant reductions in parameters and computations against the backdrop of improved accuracy.

The YOLOv9c model, in particular, highlights the effectiveness of the architecture's optimizations. It operates with 42% fewer parameters and 21% less computational demand than YOLOv7 AF, yet it achieves comparable accuracy, demonstrating YOLOv9's significant efficiency improvements. Furthermore, the YOLOv9e model sets a new standard for large models, with 15% fewer parameters and 25% less computational need than YOLOv8x, alongside an incremental 1.7% improvement in AP.

These results showcase YOLOv9's strategic advancements in model design, emphasizing its enhanced efficiency without compromising on the precision essential for real-time object detection tasks. The model not only pushes the boundaries of performance metrics but also emphasizes the importance of computational efficiency, making it a pivotal development in the field of computer vision.

Conclusion

YOLOv9 represents a pivotal development in real-time object detection, offering significant improvements in terms of efficiency, accuracy, and adaptability. By addressing critical challenges through innovative solutions like PGI and GELAN, YOLOv9 sets a new precedent for future research and application in the field. As the AI community continues to evolve, YOLOv9 stands as a testament to the power of collaboration and innovation in driving technological progress.

Usage Examples

This example provides simple YOLOv9 training and inference examples. For full documentation on these and other modes see the Predict, Train, Val and Export docs pages.

!!! Example

=== "Python"

    PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:

    ```py
    from ultralytics import YOLO

    # Build a YOLOv9c model from scratch
    model = YOLO("yolov9c.yaml")

    # Build a YOLOv9c model from pretrained weight
    model = YOLO("yolov9c.pt")

    # Display model information (optional)
    model.info()

    # Train the model on the COCO8 example dataset for 100 epochs
    results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

    # Run inference with the YOLOv9c model on the 'bus.jpg' image
    results = model("path/to/bus.jpg")
    ```

=== "CLI"

    CLI commands are available to directly run the models:

    ```py
    # Build a YOLOv9c model from scratch and train it on the COCO8 example dataset for 100 epochs
    yolo train model=yolov9c.yaml data=coco8.yaml epochs=100 imgsz=640

    # Build a YOLOv9c model from scratch and run inference on the 'bus.jpg' image
    yolo predict model=yolov9c.yaml source=path/to/bus.jpg
    ```

Supported Tasks and Modes

The YOLOv9 series offers a range of models, each optimized for high-performance Object Detection. These models cater to varying computational needs and accuracy requirements, making them versatile for a wide array of applications.

Model Filenames Tasks Inference Validation Training Export
YOLOv9 yolov9t yolov9s yolov9m yolov9c.pt yolov9e.pt Object Detection
YOLOv9-seg yolov9c-seg.pt yolov9e-seg.pt Instance Segmentation

This table provides a detailed overview of the YOLOv9 model variants, highlighting their capabilities in object detection tasks and their compatibility with various operational modes such as Inference, Validation, Training, and Export. This comprehensive support ensures that users can fully leverage the capabilities of YOLOv9 models in a broad range of object detection scenarios.

!!! note

Training YOLOv9 models will require _more_ resources **and** take longer than the equivalent sized [YOLOv8 model](yolov8.md).

Citations and Acknowledgements

We would like to acknowledge the YOLOv9 authors for their significant contributions in the field of real-time object detection:

!!! Quote ""

=== "BibTeX"

    ```py
    @article{wang2024yolov9,
      title={{YOLOv9}: Learning What You Want to Learn Using Programmable Gradient Information},
      author={Wang, Chien-Yao  and Liao, Hong-Yuan Mark},
      booktitle={arXiv preprint arXiv:2402.13616},
      year={2024}
    }
    ```

The original YOLOv9 paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.

FAQ

What innovations does YOLOv9 introduce for real-time object detection?

YOLOv9 introduces groundbreaking techniques such as Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). These innovations address information loss challenges in deep neural networks, ensuring high efficiency, accuracy, and adaptability. PGI preserves essential data across network layers, while GELAN optimizes parameter utilization and computational efficiency. Learn more about YOLOv9's core innovations that set new benchmarks on the MS COCO dataset.

How does YOLOv9 perform on the MS COCO dataset compared to other models?

YOLOv9 outperforms state-of-the-art real-time object detectors by achieving higher accuracy and efficiency. On the COCO dataset, YOLOv9 models exhibit superior mAP scores across various sizes while maintaining or reducing computational overhead. For instance, YOLOv9c achieves comparable accuracy with 42% fewer parameters and 21% less computational demand than YOLOv7 AF. Explore performance comparisons for detailed metrics.

How can I train a YOLOv9 model using Python and CLI?

You can train a YOLOv9 model using both Python and CLI commands. For Python, instantiate a model using the YOLO class and call the train method:

from ultralytics import YOLO

# Build a YOLOv9c model from pretrained weights and train
model = YOLO("yolov9c.pt")
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

For CLI training, execute:

yolo train model=yolov9c.yaml data=coco8.yaml epochs=100 imgsz=640

Learn more about usage examples for training and inference.

What are the advantages of using Ultralytics YOLOv9 for lightweight models?

YOLOv9 is designed to mitigate information loss, which is particularly important for lightweight models often prone to losing significant information. By integrating Programmable Gradient Information (PGI) and reversible functions, YOLOv9 ensures essential data retention, enhancing the model's accuracy and efficiency. This makes it highly suitable for applications requiring compact models with high performance. For more details, explore the section on YOLOv9's impact on lightweight models.

What tasks and modes does YOLOv9 support?

YOLOv9 supports various tasks including object detection and instance segmentation. It is compatible with multiple operational modes such as inference, validation, training, and export. This versatility makes YOLOv9 adaptable to diverse real-time computer vision applications. Refer to the supported tasks and modes section for more information.


comments: true
description: Learn how to evaluate your YOLOv8 model's performance in real-world scenarios using benchmark mode. Optimize speed, accuracy, and resource allocation across export formats.
keywords: model benchmarking, YOLOv8, Ultralytics, performance evaluation, export formats, ONNX, TensorRT, OpenVINO, CoreML, TensorFlow, optimization, mAP50-95, inference time

Model Benchmarking with Ultralytics YOLO

Ultralytics YOLO ecosystem and integrations

Introduction

Once your model is trained and validated, the next logical step is to evaluate its performance in various real-world scenarios. Benchmark mode in Ultralytics YOLOv8 serves this purpose by providing a robust framework for assessing the speed and accuracy of your model across a range of export formats.



Watch: Ultralytics Modes Tutorial: Benchmark

Why Is Benchmarking Crucial?

  • Informed Decisions: Gain insights into the trade-offs between speed and accuracy.
  • Resource Allocation: Understand how different export formats perform on different hardware.
  • Optimization: Learn which export format offers the best performance for your specific use case.
  • Cost Efficiency: Make more efficient use of hardware resources based on benchmark results.

Key Metrics in Benchmark Mode

  • mAP50-95: For object detection, segmentation, and pose estimation.
  • accuracy_top5: For image classification.
  • Inference Time: Time taken for each image in milliseconds.

Supported Export Formats

  • ONNX: For optimal CPU performance
  • TensorRT: For maximal GPU efficiency
  • OpenVINO: For Intel hardware optimization
  • CoreML, TensorFlow SavedModel, and More: For diverse deployment needs.

!!! Tip "Tip"

* Export to ONNX or OpenVINO for up to 3x CPU speedup.
* Export to TensorRT for up to 5x GPU speedup.

Usage Examples

Run YOLOv8n benchmarks on all supported export formats including ONNX, TensorRT etc. See Arguments section below for a full list of export arguments.

!!! Example

=== "Python"

    ```py
    from ultralytics.utils.benchmarks import benchmark

    # Benchmark on GPU
    benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
    ```

=== "CLI"

    ```py
    yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
    ```

Arguments

Arguments such as model, data, imgsz, half, device, and verbose provide users with the flexibility to fine-tune the benchmarks to their specific needs and compare the performance of different export formats with ease.

Key Default Value Description
model None Specifies the path to the model file. Accepts both .pt and .yaml formats, e.g., "yolov8n.pt" for pre-trained models or configuration files.
data None Path to a YAML file defining the dataset for benchmarking, typically including paths and settings for validation data. Example: "coco8.yaml".
imgsz 640 The input image size for the model. Can be a single integer for square images or a tuple (width, height) for non-square, e.g., (640, 480).
half False Enables FP16 (half-precision) inference, reducing memory usage and possibly increasing speed on compatible hardware. Use half=True to enable.
int8 False Activates INT8 quantization for further optimized performance on supported devices, especially useful for edge devices. Set int8=True to use.
device None Defines the computation device(s) for benchmarking, such as "cpu", "cuda:0", or a list of devices like "cuda:0,1" for multi-GPU setups.
verbose False Controls the level of detail in logging output. A boolean value; set verbose=True for detailed logs or a float for thresholding errors.

Export Formats

Benchmarks will attempt to run automatically on all possible export formats below.

Format format Argument Model Metadata Arguments
PyTorch - yolov8n.pt -
TorchScript torchscript yolov8n.torchscript imgsz, optimize, batch
ONNX onnx yolov8n.onnx imgsz, half, dynamic, simplify, opset, batch
OpenVINO openvino yolov8n_openvino_model/ imgsz, half, int8, batch
TensorRT engine yolov8n.engine imgsz, half, dynamic, simplify, workspace, int8, batch
CoreML coreml yolov8n.mlpackage imgsz, half, int8, nms, batch
TF SavedModel saved_model yolov8n_saved_model/ imgsz, keras, int8, batch
TF GraphDef pb yolov8n.pb imgsz, batch
TF Lite tflite yolov8n.tflite imgsz, half, int8, batch
TF Edge TPU edgetpu yolov8n_edgetpu.tflite imgsz
TF.js tfjs yolov8n_web_model/ imgsz, half, int8, batch
PaddlePaddle paddle yolov8n_paddle_model/ imgsz, batch
NCNN ncnn yolov8n_ncnn_model/ imgsz, half, batch

See full export details in the Export page.

FAQ

How do I benchmark my YOLOv8 model's performance using Ultralytics?

Ultralytics YOLOv8 offers a Benchmark mode to assess your model's performance across different export formats. This mode provides insights into key metrics such as mean Average Precision (mAP50-95), accuracy, and inference time in milliseconds. To run benchmarks, you can use either Python or CLI commands. For example, to benchmark on a GPU:

!!! Example

=== "Python"

    ```py
    from ultralytics.utils.benchmarks import benchmark

    # Benchmark on GPU
    benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
    ```

=== "CLI"

    ```py
    yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
    ```

For more details on benchmark arguments, visit the Arguments section.

What are the benefits of exporting YOLOv8 models to different formats?

Exporting YOLOv8 models to different formats such as ONNX, TensorRT, and OpenVINO allows you to optimize performance based on your deployment environment. For instance:

  • ONNX: Provides up to 3x CPU speedup.
  • TensorRT: Offers up to 5x GPU speedup.
  • OpenVINO: Specifically optimized for Intel hardware.
    These formats enhance both the speed and accuracy of your models, making them more efficient for various real-world applications. Visit the Export page for complete details.

Why is benchmarking crucial in evaluating YOLOv8 models?

Benchmarking your YOLOv8 models is essential for several reasons:

  • Informed Decisions: Understand the trade-offs between speed and accuracy.
  • Resource Allocation: Gauge the performance across different hardware options.
  • Optimization: Determine which export format offers the best performance for specific use cases.
  • Cost Efficiency: Optimize hardware usage based on benchmark results.
    Key metrics such as mAP50-95, Top-5 accuracy, and inference time help in making these evaluations. Refer to the Key Metrics section for more information.

Which export formats are supported by YOLOv8, and what are their advantages?

YOLOv8 supports a variety of export formats, each tailored for specific hardware and use cases:

  • ONNX: Best for CPU performance.
  • TensorRT: Ideal for GPU efficiency.
  • OpenVINO: Optimized for Intel hardware.
  • CoreML & TensorFlow: Useful for iOS and general ML applications.
    For a complete list of supported formats and their respective advantages, check out the Supported Export Formats section.

What arguments can I use to fine-tune my YOLOv8 benchmarks?

When running benchmarks, several arguments can be customized to suit specific needs:

  • model: Path to the model file (e.g., "yolov8n.pt").
  • data: Path to a YAML file defining the dataset (e.g., "coco8.yaml").
  • imgsz: The input image size, either as a single integer or a tuple.
  • half: Enable FP16 inference for better performance.
  • int8: Activate INT8 quantization for edge devices.
  • device: Specify the computation device (e.g., "cpu", "cuda:0").
  • verbose: Control the level of logging detail.
    For a full list of arguments, refer to the Arguments section.

comments: true
description: Learn how to export your YOLOv8 model to various formats like ONNX, TensorRT, and CoreML. Achieve maximum compatibility and performance.
keywords: YOLOv8, Model Export, ONNX, TensorRT, CoreML, Ultralytics, AI, Machine Learning, Inference, Deployment

Model Export with Ultralytics YOLO

Ultralytics YOLO ecosystem and integrations

Introduction

The ultimate goal of training a model is to deploy it for real-world applications. Export mode in Ultralytics YOLOv8 offers a versatile range of options for exporting your trained model to different formats, making it deployable across various platforms and devices. This comprehensive guide aims to walk you through the nuances of model exporting, showcasing how to achieve maximum compatibility and performance.



Watch: How To Export Custom Trained Ultralytics YOLOv8 Model and Run Live Inference on Webcam.

Why Choose YOLOv8's Export Mode?

  • Versatility: Export to multiple formats including ONNX, TensorRT, CoreML, and more.
  • Performance: Gain up to 5x GPU speedup with TensorRT and 3x CPU speedup with ONNX or OpenVINO.
  • Compatibility: Make your model universally deployable across numerous hardware and software environments.
  • Ease of Use: Simple CLI and Python API for quick and straightforward model exporting.

Key Features of Export Mode

Here are some of the standout functionalities:

  • One-Click Export: Simple commands for exporting to different formats.
  • Batch Export: Export batched-inference capable models.
  • Optimized Inference: Exported models are optimized for quicker inference times.
  • Tutorial Videos: In-depth guides and tutorials for a smooth exporting experience.

!!! Tip "Tip"

* Export to [ONNX](../integrations/onnx.md) or [OpenVINO](../integrations/openvino.md) for up to 3x CPU speedup.
* Export to [TensorRT](../integrations/tensorrt.md) for up to 5x GPU speedup.

Usage Examples

Export a YOLOv8n model to a different format like ONNX or TensorRT. See Arguments section below for a full list of export arguments.

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a model
    model = YOLO("yolov8n.pt")  # load an official model
    model = YOLO("path/to/best.pt")  # load a custom trained model

    # Export the model
    model.export(format="onnx")
    ```

=== "CLI"

    ```py
    yolo export model=yolov8n.pt format=onnx  # export official model
    yolo export model=path/to/best.pt format=onnx  # export custom trained model
    ```

Arguments

This table details the configurations and options available for exporting YOLO models to different formats. These settings are critical for optimizing the exported model's performance, size, and compatibility across various platforms and environments. Proper configuration ensures that the model is ready for deployment in the intended application with optimal efficiency.

Argument Type Default Description
format str 'torchscript' Target format for the exported model, such as 'onnx', 'torchscript', 'tensorflow', or others, defining compatibility with various deployment environments.
imgsz int or tuple 640 Desired image size for the model input. Can be an integer for square images or a tuple (height, width) for specific dimensions.
keras bool False Enables export to Keras format for TensorFlow SavedModel, providing compatibility with TensorFlow serving and APIs.
optimize bool False Applies optimization for mobile devices when exporting to TorchScript, potentially reducing model size and improving performance.
half bool False Enables FP16 (half-precision) quantization, reducing model size and potentially speeding up inference on supported hardware.
int8 bool False Activates INT8 quantization, further compressing the model and speeding up inference with minimal accuracy loss, primarily for edge devices.
dynamic bool False Allows dynamic input sizes for ONNX and TensorRT exports, enhancing flexibility in handling varying image dimensions.
simplify bool False Simplifies the model graph for ONNX exports with onnxslim, potentially improving performance and compatibility.
opset int None Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version.
workspace float 4.0 Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance.
nms bool False Adds Non-Maximum Suppression (NMS) to the CoreML export, essential for accurate and efficient detection post-processing.
batch int 1 Specifies export model batch inference size or the max number of images the exported model will process concurrently in predict mode.

Adjusting these parameters allows for customization of the export process to fit specific requirements, such as deployment environment, hardware constraints, and performance targets. Selecting the appropriate format and settings is essential for achieving the best balance between model size, speed, and accuracy.

Export Formats

Available YOLOv8 export formats are in the table below. You can export to any format using the format argument, i.e. format='onnx' or format='engine'. You can predict or validate directly on exported models, i.e. yolo predict model=yolov8n.onnx. Usage examples are shown for your model after export completes.

Format format Argument Model Metadata Arguments
PyTorch - yolov8n.pt -
TorchScript torchscript yolov8n.torchscript imgsz, optimize, batch
ONNX onnx yolov8n.onnx imgsz, half, dynamic, simplify, opset, batch
OpenVINO openvino yolov8n_openvino_model/ imgsz, half, int8, batch
TensorRT engine yolov8n.engine imgsz, half, dynamic, simplify, workspace, int8, batch
CoreML coreml yolov8n.mlpackage imgsz, half, int8, nms, batch
TF SavedModel saved_model yolov8n_saved_model/ imgsz, keras, int8, batch
TF GraphDef pb yolov8n.pb imgsz, batch
TF Lite tflite yolov8n.tflite imgsz, half, int8, batch
TF Edge TPU edgetpu yolov8n_edgetpu.tflite imgsz
TF.js tfjs yolov8n_web_model/ imgsz, half, int8, batch
PaddlePaddle paddle yolov8n_paddle_model/ imgsz, batch
NCNN ncnn yolov8n_ncnn_model/ imgsz, half, batch

FAQ

How do I export a YOLOv8 model to ONNX format?

Exporting a YOLOv8 model to ONNX format is straightforward with Ultralytics. It provides both Python and CLI methods for exporting models.

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Load a model
    model = YOLO("yolov8n.pt")  # load an official model
    model = YOLO("path/to/best.pt")  # load a custom trained model

    # Export the model
    model.export(format="onnx")
    ```

=== "CLI"

    ```py
    yolo export model=yolov8n.pt format=onnx  # export official model
    yolo export model=path/to/best.pt format=onnx  # export custom trained model
    ```

For more details on the process, including advanced options like handling different input sizes, refer to the ONNX section.

What are the benefits of using TensorRT for model export?

Using TensorRT for model export offers significant performance improvements. YOLOv8 models exported to TensorRT can achieve up to a 5x GPU speedup, making it ideal for real-time inference applications.

  • Versatility: Optimize models for a specific hardware setup.
  • Speed: Achieve faster inference through advanced optimizations.
  • Compatibility: Integrate smoothly with NVIDIA hardware.

To learn more about integrating TensorRT, see the TensorRT integration guide.

How do I enable INT8 quantization when exporting my YOLOv8 model?

INT8 quantization is an excellent way to compress the model and speed up inference, especially on edge devices. Here's how you can enable INT8 quantization:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # Load a model
    model.export(format="onnx", int8=True)
    ```

=== "CLI"

    ```py
    yolo export model=yolov8n.pt format=onnx int8=True   # export model with INT8 quantization
    ```

INT8 quantization can be applied to various formats, such as TensorRT and CoreML. More details can be found in the Export section.

Why is dynamic input size important when exporting models?

Dynamic input size allows the exported model to handle varying image dimensions, providing flexibility and optimizing processing efficiency for different use cases. When exporting to formats like ONNX or TensorRT, enabling dynamic input size ensures that the model can adapt to different input shapes seamlessly.

To enable this feature, use the dynamic=True flag during export:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")
    model.export(format="onnx", dynamic=True)
    ```

=== "CLI"

    ```py
    yolo export model=yolov8n.pt format=onnx dynamic=True
    ```

For additional context, refer to the dynamic input size configuration.

What are the key export arguments to consider for optimizing model performance?

Understanding and configuring export arguments is crucial for optimizing model performance:

  • format: The target format for the exported model (e.g., onnx, torchscript, tensorflow).
  • imgsz: Desired image size for the model input (e.g., 640 or (height, width)).
  • half: Enables FP16 quantization, reducing model size and potentially speeding up inference.
  • optimize: Applies specific optimizations for mobile or constrained environments.
  • int8: Enables INT8 quantization, highly beneficial for edge deployments.

For a detailed list and explanations of all the export arguments, visit the Export Arguments section.


comments: true
description: Discover the diverse modes of Ultralytics YOLOv8, including training, validation, prediction, export, tracking, and benchmarking. Maximize model performance and efficiency.
keywords: Ultralytics, YOLOv8, machine learning, model training, validation, prediction, export, tracking, benchmarking, object detection

Ultralytics YOLOv8 Modes

Ultralytics YOLO ecosystem and integrations

Introduction

Ultralytics YOLOv8 is not just another object detection model; it's a versatile framework designed to cover the entire lifecycle of machine learning models—from data ingestion and model training to validation, deployment, and real-world tracking. Each mode serves a specific purpose and is engineered to offer you the flexibility and efficiency required for different tasks and use-cases.



Watch: Ultralytics Modes Tutorial: Train, Validate, Predict, Export & Benchmark.

Modes at a Glance

Understanding the different modes that Ultralytics YOLOv8 supports is critical to getting the most out of your models:

  • Train mode: Fine-tune your model on custom or preloaded datasets.
  • Val mode: A post-training checkpoint to validate model performance.
  • Predict mode: Unleash the predictive power of your model on real-world data.
  • Export mode: Make your model deployment-ready in various formats.
  • Track mode: Extend your object detection model into real-time tracking applications.
  • Benchmark mode: Analyze the speed and accuracy of your model in diverse deployment environments.

This comprehensive guide aims to give you an overview and practical insights into each mode, helping you harness the full potential of YOLOv8.

Train

Train mode is used for training a YOLOv8 model on a custom dataset. In this mode, the model is trained using the specified dataset and hyperparameters. The training process involves optimizing the model's parameters so that it can accurately predict the classes and locations of objects in an image.

Train Examples

Val

Val mode is used for validating a YOLOv8 model after it has been trained. In this mode, the model is evaluated on a validation set to measure its accuracy and generalization performance. This mode can be used to tune the hyperparameters of the model to improve its performance.

Val Examples

Predict

Predict mode is used for making predictions using a trained YOLOv8 model on new images or videos. In this mode, the model is loaded from a checkpoint file, and the user can provide images or videos to perform inference. The model predicts the classes and locations of objects in the input images or videos.

Predict Examples

Export

Export mode is used for exporting a YOLOv8 model to a format that can be used for deployment. In this mode, the model is converted to a format that can be used by other software applications or hardware devices. This mode is useful when deploying the model to production environments.

Export Examples

Track

Track mode is used for tracking objects in real-time using a YOLOv8 model. In this mode, the model is loaded from a checkpoint file, and the user can provide a live video stream to perform real-time object tracking. This mode is useful for applications such as surveillance systems or self-driving cars.

Track Examples

Benchmark

Benchmark mode is used to profile the speed and accuracy of various export formats for YOLOv8. The benchmarks provide information on the size of the exported format, its mAP50-95 metrics (for object detection, segmentation and pose) or accuracy_top5 metrics (for classification), and the inference time in milliseconds per image across various export formats like ONNX, OpenVINO, TensorRT and others. This information can help users choose the optimal export format for their specific use case based on their requirements for speed and accuracy.

Benchmark Examples

FAQ

How do I train a custom object detection model with Ultralytics YOLOv8?

Training a custom object detection model with Ultralytics YOLOv8 involves using the train mode. You need a dataset formatted in YOLO format, containing images and corresponding annotation files. Use the following command to start the training process:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Train a custom model
    model = YOLO("yolov8n.pt")
    model.train(data="path/to/dataset.yaml", epochs=100, imgsz=640)
    ```

=== "CLI"

    ```py
    yolo train data=path/to/dataset.yaml epochs=100 imgsz=640
    ```

For more detailed instructions, you can refer to the Ultralytics Train Guide.

What metrics does Ultralytics YOLOv8 use to validate the model's performance?

Ultralytics YOLOv8 uses various metrics during the validation process to assess model performance. These include:

  • mAP (mean Average Precision): This evaluates the accuracy of object detection.
  • IOU (Intersection over Union): Measures the overlap between predicted and ground truth bounding boxes.
  • Precision and Recall: Precision measures the ratio of true positive detections to the total detected positives, while recall measures the ratio of true positive detections to the total actual positives.

You can run the following command to start the validation:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Validate the model
    model = YOLO("yolov8n.pt")
    model.val(data="path/to/validation.yaml")
    ```

=== "CLI"

    ```py
    yolo val data=path/to/validation.yaml
    ```

Refer to the Validation Guide for further details.

How can I export my YOLOv8 model for deployment?

Ultralytics YOLOv8 offers export functionality to convert your trained model into various deployment formats such as ONNX, TensorRT, CoreML, and more. Use the following example to export your model:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Export the model
    model = YOLO("yolov8n.pt")
    model.export(format="onnx")
    ```

=== "CLI"

    ```py
    yolo export model=yolov8n.pt format=onnx
    ```

Detailed steps for each export format can be found in the Export Guide.

What is the purpose of the benchmark mode in Ultralytics YOLOv8?

Benchmark mode in Ultralytics YOLOv8 is used to analyze the speed and accuracy of various export formats such as ONNX, TensorRT, and OpenVINO. It provides metrics like model size, mAP50-95 for object detection, and inference time across different hardware setups, helping you choose the most suitable format for your deployment needs.

!!! Example

=== "Python"

    ```py
    from ultralytics.utils.benchmarks import benchmark

    # Benchmark on GPU
    benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
    ```

=== "CLI"

    ```py
    yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
    ```

For more details, refer to the Benchmark Guide.

How can I perform real-time object tracking using Ultralytics YOLOv8?

Real-time object tracking can be achieved using the track mode in Ultralytics YOLOv8. This mode extends object detection capabilities to track objects across video frames or live feeds. Use the following example to enable tracking:

!!! Example

=== "Python"

    ```py
    from ultralytics import YOLO

    # Track objects in a video
    model = YOLO("yolov8n.pt")
    model.track(source="path/to/video.mp4")
    ```

=== "CLI"

    ```py
    yolo track source=path/to/video.mp4
    ```

For in-depth instructions, visit the Track Guide.

posted @ 2024-09-05 12:02  绝不原创的飞龙  阅读(152)  评论(0)    收藏  举报