What is image segmentation, and how is it different from object detection?

Updated Feb 20, 2026

Short answer

Image segmentation assigns a label to every pixel, so each object or region gets a mask that follows its actual shape. Object detection only localizes objects with rectangular bounding boxes and one class label per box.

Deep explanation

Image segmentation gives a fine-grained, pixel-level understanding of an image. Unlike object detection, which describes each object with a rectangle, segmentation classifies every pixel in the image.
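To make the contrast concrete, here is a minimal sketch that derives a bounding box from a toy binary mask and compares how many pixels each representation covers. The mask values and the helper name mask_to_box are illustrative assumptions, not part of any particular library.

```python
# Toy 5x6 binary mask: 1 = object pixel, 0 = background (made-up values).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def mask_to_box(mask):
    """Return the tightest (x_min, y_min, x_max, y_max) box around nonzero pixels."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None  # empty mask: there is no object to box
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


box = mask_to_box(mask)
x_min, y_min, x_max, y_max = box
box_area = (x_max - x_min + 1) * (y_max - y_min + 1)  # pixels inside the rectangle
mask_area = sum(sum(row) for row in mask)             # pixels that belong to the object

print(box, box_area, mask_area)  # (1, 1, 4, 3) 12 8
```

The box covers 12 pixels but only 8 of them belong to the object. The other 4 are background that a detector's rectangle cannot exclude, which is exactly the detail a segmentation mask keeps.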

There are two main types:

  • Semantic segmentation: assigns each pixel a class label (e.g., road, sky, person). All objects of the same class share one label.
  • Instance segmentation: additionally distinguishes between different objects of the same class (e.g., two separate cars get two separate masks).
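The difference between the two outputs can be sketched with a toy example: a semantic map marks every car pixel with the same class id, while an instance map gives each car its own id. The sketch below splits a semantic map into instances with a simple flood fill. This is only an illustration under the assumption that the objects do not touch; real instance segmentation models learn to separate instances, including touching or overlapping ones. The grid values and the name split_instances are hypothetical.

```python
from collections import deque

CAR = 1  # hypothetical class id for "car"; 0 is background

# Semantic map: both cars carry the same label.
semantic = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]


def split_instances(semantic, class_id):
    """Give each 4-connected region of class_id its own instance id (1, 2, ...)."""
    h, w = len(semantic), len(semantic[0])
    instance = [[0] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if semantic[y][x] != class_id or instance[y][x]:
                continue  # wrong class, or already assigned to an instance
            next_id += 1
            instance[y][x] = next_id
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill over neighboring pixels
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and semantic[ny][nx] == class_id
                            and not instance[ny][nx]):
                        instance[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance, next_id


instance_map, num_cars = split_instances(semantic, CAR)
print(num_cars)  # 2
for row in instance_map:
    print(row)
```

The semantic map can only say "these pixels are car". The instance map can also say "these pixels are car 1, those are car 2".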

Segmentation models often use encoder-decoder architectures: the encoder downsamples the image while extracting features, and the decoder upsamples those features back into pixel-level predictions.
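The resolution flow of an encoder-decoder can be sketched without any deep learning library. The toy below is not a trained model: it stands in for the encoder with 2x2 max pooling and for the decoder with nearest-neighbor upsampling, purely to show that the output returns to the input's height and width. In a real network both stages would be learned layers; the function names and the 8x8 input are assumptions made for illustration.

```python
def downsample(feature_map):
    """Stand-in for an encoder stage: 2x2 max pooling halves height and width.

    Assumes both dimensions are even.
    """
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1])
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def upsample(feature_map):
    """Stand-in for a decoder stage: nearest-neighbor upsampling doubles height and width."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def shape(grid):
    return len(grid), len(grid[0])


image = [[(x + y) % 4 for x in range(8)] for y in range(8)]  # toy 8x8 "image"
encoded = downsample(downsample(image))  # 8x8 -> 4x4 -> 2x2
decoded = upsample(upsample(encoded))    # 2x2 -> 4x4 -> 8x8

print(shape(image), shape(encoded), shape(decoded))  # (8, 8) (2, 2) (8, 8)
```

The key property is that the decoder output has one value per input pixel, which is what allows a per-pixel class prediction.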

Segmentation is the right tool when an application needs precise object boundaries rather than approximate locations.

Real-world example

In medical imaging, segmentation is used to highlight tumors in MRI scans by labeling each pixel as tumor or healthy tissue.
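As a rough sketch of what "labeling each pixel" produces, the snippet below turns a per-pixel tumor probability map into a binary mask and measures the tumor area. Everything here is a made-up assumption for illustration: the probability values, the 0.5 threshold, and the 0.5 mm pixel spacing. A real pipeline would take the probabilities from a trained model and the spacing from the scan metadata.

```python
# Hypothetical per-pixel tumor probabilities, as a model might output them.
probs = [
    [0.02, 0.10, 0.08, 0.01],
    [0.05, 0.91, 0.87, 0.04],
    [0.03, 0.76, 0.95, 0.12],
    [0.01, 0.07, 0.30, 0.02],
]

THRESHOLD = 0.5              # assumed decision threshold
PIXEL_AREA_MM2 = 0.5 * 0.5   # assumed pixel spacing of 0.5 mm in each direction

# 1 = tumor, 0 = healthy tissue
mask = [[1 if p >= THRESHOLD else 0 for p in row] for row in probs]

tumor_pixels = sum(sum(row) for row in mask)
tumor_area_mm2 = tumor_pixels * PIXEL_AREA_MM2

print(tumor_pixels, tumor_area_mm2)  # 4 1.0
```

A bounding box could only give the rectangle around the tumor. The mask supports measurements such as area, which is why pixel-level output matters in this setting.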

Common mistakes

  • Thinking segmentation is the same as detection (it is more detailed).
  • Assuming it is only useful in medical applications.
  • Ignoring computational cost (pixel-level prediction is expensive).
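The cost point can be made concrete with some back-of-the-envelope arithmetic on output size alone. The numbers below are assumptions chosen for illustration (a 512x512 image, 21 classes, 100 candidate boxes), and output size is only one part of total compute, but it shows the difference in scale.

```python
def segmentation_outputs(height, width, num_classes):
    """One class score per pixel per class."""
    return height * width * num_classes


def detection_outputs(num_boxes, num_classes):
    """Four box coordinates plus one score per class, for each box."""
    return num_boxes * (4 + num_classes)


seg = segmentation_outputs(512, 512, 21)  # assumed image size and class count
det = detection_outputs(100, 21)          # assumed number of candidate boxes

print(seg)         # 5505024
print(det)         # 2500
print(seg // det)  # 2202
```

Under these assumptions the segmentation head produces roughly 2,200 times as many values as the detection head, and the decoder must keep high-resolution feature maps in memory to produce them.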

Follow-up questions

  • What is the difference between semantic and instance segmentation?
  • What is the U-Net architecture?
  • Why is segmentation more computationally expensive?
