The Near-Duplicate Detection model is used to find duplicate or near-duplicate segments in videos.
Segments count as duplicates when they are identical or nearly identical, even if one of them has been modified in some way (with overlays, cropping, split-screen layouts, mixing, etc.).
Common use cases include:
Video blacklists and disallow lists: Blacklist videos and prevent them from (re)appearing on your site or app, for instance copyrighted videos, illegal videos, or previously removed videos.
Copyright detection: Detect and manage copyrighted content in user uploads.
Video set deduplication: Identify and manage duplicate videos within a set.
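As an illustration of the deduplication use case, here is a minimal sketch of how pairwise near-duplicate matches could be grouped into clusters. It assumes you have already obtained a list of matching video pairs from the detection step. The function name `group_duplicates` and the input format (a list of ID pairs) are hypothetical and are not part of the model's API.

```python
from collections import defaultdict


def group_duplicates(video_ids, matches):
    """Group videos into duplicate clusters with a union-find structure.

    video_ids: iterable of hashable video IDs (hypothetical format).
    matches:   iterable of (id_a, id_b) pairs that were flagged as
               near-duplicates by the detection step (hypothetical format).
    Returns a sorted list of sorted clusters, each with 2+ members.
    """
    parent = {v: v for v in video_ids}

    def find(v):
        # Register IDs that only appear in `matches`, then walk to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for v in list(parent):
        clusters[find(v)].append(v)

    # Videos without any match form singleton clusters and are dropped.
    return sorted(sorted(group) for group in clusters.values() if len(group) > 1)


videos = ["a.mp4", "b.mp4", "c.mp4", "d.mp4", "e.mp4", "f.mp4"]
pairs = [("a.mp4", "b.mp4"), ("b.mp4", "c.mp4"), ("d.mp4", "e.mp4")]
print(group_duplicates(videos, pairs))
```

Note that this grouping is transitive: if A matches B and B matches C, all three land in the same cluster even when A and C were never matched directly. Whether that is desirable depends on how you plan to manage the duplicates.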
Duplicate detection works across all types of videos: long or short, realistic or animated.
Duplicates are detected across a wide range of transformations and modifications, many of which are typically used to try to evade duplicate detection. Examples:
Original clip
Resolution, size and format changes: Downscaling and upscaling; DPI/resolution changes; re-encoding or format conversion (e.g. JPEG, PNG, WEBP...).
Text overlays: Text overlays and added captions; stickers, logos, watermarks.
Clip overlays: Large clip overlays obscuring parts of the original clip; emojis, shapes and other graphical overlays.
Cropping and reframing: Tight crops, letterboxing, added borders/frames; partial views of the original.
Collage: When the source clip appears inside a multi-clip layout; split-screen layouts.
Blur: Strong Gaussian blur, motion blur, defocus...; pixelation.
Clip mixing: When the source clip is blended with other clips or backgrounds.