Clipperz

Face-tracking crop

Face-tracking crop is the algorithm that keeps a speaker centered in a vertical clip by detecting their face in the source video and adjusting the crop window per frame. It's the difference between a clip where the speaker is always in frame and one where they slide off the edge whenever they move.

Part of the AI Video Clipping topic cluster.

What face-tracking does

On every frame of the source video, a face-detection model locates the dominant face and emits a bounding box. The crop window follows the box with smoothing applied so it doesn't jump frame-to-frame. When the face leaves the frame entirely (cutaway, B-roll), the crop falls back to a center crop or to the last known position depending on configuration.
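The follow-and-fall-back behavior described above can be sketched in a few lines. This is a minimal illustration, not Clipperz's implementation: the class name `CropTracker`, the exponential smoothing factor `alpha`, and the `fallback` values `"center"` and `"hold"` are all assumptions made for the example, and only the horizontal position of the crop is tracked.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# Hypothetical face box: (x, y, width, height) in source-frame pixels.
Box = Tuple[float, float, float, float]


@dataclass
class CropTracker:
    """Illustrative horizontal crop follower (all names are assumptions)."""
    frame_width: int
    crop_width: int
    alpha: float = 0.2        # smoothing factor: lower = smoother, slower
    fallback: str = "center"  # "center" or "hold" (last known position)
    _x: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, face_box: Optional[Box]) -> int:
        """Return the left edge of the crop window for one frame."""
        if face_box is not None:
            x, _, w, _ = face_box
            target = x + w / 2            # follow the center of the face
        elif self.fallback == "hold" and self._x is not None:
            target = self._x              # keep the last known position
        else:
            target = self.frame_width / 2  # drift back to a center crop

        if self._x is None:
            self._x = target              # first frame: no history to smooth
        else:
            # Exponential smoothing: move a fraction of the way to the target.
            self._x += self.alpha * (target - self._x)

        # Clamp so the crop window never extends past the frame edges.
        half = self.crop_width / 2
        center = min(max(self._x, half), self.frame_width - half)
        return int(round(center - half))


tracker = CropTracker(frame_width=1920, crop_width=608)
print(tracker.update((900, 200, 120, 120)))   # 656
print(tracker.update((1500, 200, 120, 120)))  # 776 (moves 20% of the way)
print(tracker.update(None))                   # 752 (easing back to center)
```

A lower `alpha` gives a steadier crop at the cost of lagging behind a speaker who moves quickly, so the value is a trade-off rather than a fixed constant.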

A good face-tracking implementation handles two common edge cases: multiple faces (the algorithm picks the most confident detection or splits the screen) and fast motion (Kalman-filter-style smoothing keeps the crop from chasing every twitch).
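Both edge cases can be illustrated with a short sketch. The names here (`Detection`, `pick_dominant`, `ScalarKalman`) and the default variances are hypothetical, chosen only for the example. The filter is the simplest scalar Kalman filter, using a random-walk model on the face's horizontal center, which is one possible reading of "Kalman-filter-style"; a production tracker might also model velocity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """Hypothetical detector output: a box plus a confidence in [0, 1]."""
    x: float
    y: float
    w: float
    h: float
    confidence: float


def pick_dominant(detections: List[Detection],
                  min_confidence: float = 0.5) -> Optional[Detection]:
    """Pick the most confident face, ignoring weak detections."""
    candidates = [d for d in detections if d.confidence >= min_confidence]
    return max(candidates, key=lambda d: d.confidence, default=None)


class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk motion model."""

    def __init__(self, process_var: float = 1.0,
                 measurement_var: float = 100.0) -> None:
        self.q = process_var        # how much the true position may drift
        self.r = measurement_var    # how noisy the detector is assumed to be
        self.x: Optional[float] = None  # current position estimate
        self.p = 1.0                    # variance of that estimate

    def update(self, z: float) -> float:
        if self.x is None:
            # First measurement: adopt it, with the measurement's variance.
            self.x, self.p = z, self.r
            return self.x
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct toward the measurement
        self.p *= (1 - k)               # uncertainty shrinks after update
        return self.x


faces = [Detection(20, 20, 60, 60, 0.62), Detection(800, 300, 220, 220, 0.97)]
speaker = pick_dominant(faces)
kf = ScalarKalman()
if speaker is not None:
    smoothed_center = kf.update(speaker.x + speaker.w / 2)
```

Raising `measurement_var` relative to `process_var` makes the filter trust its own estimate more than each new detection, which damps jitter but slows the response to real movement.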

When face-tracking fails

  • Heavy graphics overlays — the algorithm can confuse a face on a thumbnail in the corner with the actual speaker.
  • Profile shots — frontal faces detect more reliably than side profiles.
  • Dim or backlit footage — low-contrast frames produce low-confidence detections.
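One way to spot these failures before watching the whole clip is to look for stretches where detection confidence stays low. The sketch below is an assumption-laden illustration, not a Clipperz feature: the function name `low_confidence_spans`, the threshold of 0.5, and the minimum run length are hypothetical, and a frame with no detection is represented as confidence 0.0.

```python
from typing import List, Tuple


def low_confidence_spans(confidences: List[float],
                         threshold: float = 0.5,
                         min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) runs where confidence < threshold.

    Runs shorter than min_length frames are ignored, on the assumption
    that smoothing already absorbs brief dropouts.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, c in enumerate(confidences):
        weak = c < threshold
        if weak and start is None:
            start = i                      # a weak run begins
        elif not weak and start is not None:
            if i - start >= min_length:
                spans.append((start, i - 1))
            start = None                   # the weak run has ended
    # Close a run that extends to the final frame.
    if start is not None and len(confidences) - start >= min_length:
        spans.append((start, len(confidences) - 1))
    return spans


per_frame = [0.9, 0.4, 0.3, 0.0, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1]
print(low_confidence_spans(per_frame))  # [(1, 3), (7, 9)]
```

Spans flagged this way are candidates for a manual check; a low score does not by itself prove the crop is wrong.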

How to recover from a bad track

When a clip's tracking is wrong, manual override is faster than re-running the model with different parameters. Clipperz exposes a per-clip crop override in the editor for the few cases where automation needs help.

See how Clipperz handles this in product: vertical reframe

See it in action

Paste a video URL into Clipperz and watch the concept play out on your own content.