Glossary
Face-tracking crop
Face-tracking crop is the algorithm that keeps a speaker centered in a vertical clip by detecting their face in the source video and adjusting the crop window per frame. It's the difference between a clip where the speaker is always in frame and one where they slide off the edge whenever they move.
Part of the AI Video Clipping topic cluster.
What face-tracking does
On every frame of the source video, a face-detection model locates the dominant face and emits a bounding box. The crop window follows the box with smoothing applied so it doesn't jump frame-to-frame. When the face leaves the frame entirely (cutaway, B-roll), the crop falls back to a center crop or to the last known position depending on configuration.
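The per-frame loop described above can be sketched as follows. This is a minimal illustration, not Clipperz's implementation: the `CropTracker` class, its parameter names, the 9:16 target aspect ratio, and the choice of simple exponential smoothing are all assumptions made for the example. The detector itself is left out; `update` takes whatever bounding box it produced for the frame, or `None` when no face was found.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height in source pixels


@dataclass
class CropTracker:
    """Hypothetical helper: follows a face horizontally with a vertical crop."""
    frame_w: int
    frame_h: int
    aspect: float = 9 / 16      # crop width / crop height (assumed target)
    alpha: float = 0.2          # smoothing factor: lower values move more slowly
    fallback: str = "center"    # "center" or "last" when no face is detected
    _cx: Optional[float] = field(default=None, init=False, repr=False)

    @property
    def crop_w(self) -> float:
        # Full-height crop, as wide as the aspect ratio allows.
        return min(self.frame_w, self.frame_h * self.aspect)

    def update(self, face: Optional[Box]) -> Tuple[int, int, int, int]:
        """Return the crop window (x, y, w, h) for one frame."""
        if face is not None:
            target = face[0] + face[2] / 2          # horizontal centre of the face
        elif self.fallback == "last" and self._cx is not None:
            target = self._cx                       # hold the last known position
        else:
            target = self.frame_w / 2               # drift back to a centre crop

        if self._cx is None:
            self._cx = target                       # first frame: no history yet
        else:
            self._cx += self.alpha * (target - self._cx)

        # Keep the window inside the source frame.
        half = self.crop_w / 2
        cx = min(max(self._cx, half), self.frame_w - half)
        return (int(cx - half), 0, int(self.crop_w), self.frame_h)
```

Calling `CropTracker(1920, 1080).update(box)` once per frame yields a window that can be handed to whatever does the actual cropping. Because the fallback target passes through the same smoothing step, a cutaway eases the window back toward the center instead of snapping to it.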
Good face-tracking handles two common edge cases: multiple faces (the algorithm either picks the most confident detection or splits the screen) and fast motion (Kalman-filter-style smoothing keeps the crop from chasing every twitch).
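Both edge cases can be illustrated in a few lines. The sketch below is an assumption-laden example rather than a description of any particular product: `Detection`, `pick_dominant`, and `ScalarKalman` are hypothetical names, the filter is the simplest one-dimensional Kalman filter (a random-walk model applied to the face's horizontal center), and the noise values `q` and `r` are placeholders that would need tuning on real footage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    x: float
    y: float
    w: float
    h: float
    confidence: float


def pick_dominant(detections: List[Detection]) -> Optional[Detection]:
    """Choose the most confident face, or None if nothing was detected."""
    return max(detections, key=lambda d: d.confidence, default=None)


class ScalarKalman:
    """1-D Kalman filter with a random-walk motion model."""

    def __init__(self, q: float = 1.0, r: float = 100.0) -> None:
        self.q = q                        # process noise: how far the face may really move
        self.r = r                        # measurement noise: how jittery the detector is
        self.x: Optional[float] = None    # current estimate
        self.p = 1.0                      # variance of the estimate

    def update(self, z: float) -> float:
        if self.x is None:
            # First measurement: adopt it, with the detector's uncertainty.
            self.x, self.p = z, self.r
            return self.x
        self.p += self.q                  # predict: uncertainty grows between frames
        k = self.p / (self.p + self.r)    # Kalman gain
        self.x += k * (z - self.x)        # correct toward the new measurement
        self.p *= 1 - k
        return self.x
```

A larger `r` relative to `q` makes the crop calmer but slower to follow a speaker who genuinely walks across the frame; a smaller `r` does the opposite. The split-screen option mentioned above is not covered by this sketch.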
When face-tracking fails
- Heavy graphics overlays — the algorithm can confuse a face on a thumbnail in the corner with the actual speaker.
- Profile shots — frontal faces detect more reliably than side profiles.
- Dim or backlit footage — low-contrast frames produce low-confidence detections.
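One common way to soften these failure modes is to discard detections the tracker should not trust, so that the crop falls back instead of following a bad box. The following sketch assumes two heuristics that are not stated in the text above: a minimum confidence score, and a minimum face size relative to the frame (so a small face on a corner thumbnail is ignored). The function name and both thresholds are hypothetical and would need tuning per detector.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x: float
    y: float
    w: float
    h: float
    confidence: float


def usable_detections(
    detections: List[Detection],
    frame_w: int,
    frame_h: int,
    min_confidence: float = 0.6,   # assumed threshold for dim or backlit frames
    min_area_frac: float = 0.01,   # assumed threshold for thumbnail-sized faces
) -> List[Detection]:
    """Keep only detections that are confident enough and large enough."""
    frame_area = frame_w * frame_h
    return [
        d
        for d in detections
        if d.confidence >= min_confidence
        and (d.w * d.h) / frame_area >= min_area_frac
    ]
```

If the returned list is empty, the frame is treated as having no face, and the crop uses its configured fallback. Size filtering is a blunt tool: it also rejects a real speaker filmed from far away, which is one reason a manual override remains useful.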
How to recover from a bad track
When a clip's tracking is wrong, manual override is faster than re-running the model with different parameters. Clipperz exposes a per-clip crop override in the editor for the few cases where automation needs help.
See how Clipperz handles this in product: vertical reframe →
