As shown in Fig. 4, $P_{center}(x,y)$ and $P_{center}^{t}(x,y)$ denote the center of the initial template and the center of the object's bounding box in the $t$th frame, respectively. $P_{i}(x,y)$ and $P_{i}^{t}(x,y)$ denote the credible keypoints in the initial template and in the $t$th frame. $\theta_{n}$ and $\theta_{n}^{t}$ denote the angle between the $i$th and $(i+1)$th keypoints in the initial template and in the $t$th frame, and $d_{n}$ and $d_{n}^{t}$ denote the Euclidean distance between the keypoints in the initial template and in the $t$th frame, respectively. The relative change in position, scale, and rotation angle can then be calculated with the following equations:
$d_{center}^{t}(x,y)=\mathrm{median}\left(\Vert P_{i}^{t}(x,y)-P_{i}(x,y)\Vert\right),\quad i\in[1,N],$ (5)
$s_{center}^{t}=\mathrm{median}\left(d_{n}^{t}/d_{n}\right),\quad n\in[1,(N-1)!],$ (6)
$\theta_{center}^{t}=\mathrm{median}\left(\theta_{n}^{t}-\theta_{n}\right),\quad n\in[1,(N-1)!],$ (7)
where $\mathrm{median}(\cdot)$ returns the median of its arguments. Let the coordinates of the four vertices of the initial tracking box be $P_{ri}(x,y)$, $i\in[1,4]$, and let their offsets relative to the center of the initial tracking box be $P_{di}(x,y)$, $i\in[1,4]$. The vertex coordinates of the tracking box in the $t$th frame can then be obtained from the following equations:
$P_{center}^{t}(x,y)=P_{center}(x,y)+d_{center}^{t}(x,y),$ (8)
$x_{rotate}^{t}=\cos\theta_{center}^{t}\cdot x_{P_{di}}-\sin\theta_{center}^{t}\cdot y_{P_{di}},$ (9)
$y_{rotate}^{t}=\cos\theta_{center}^{t}\cdot y_{P_{di}}+\sin\theta_{center}^{t}\cdot x_{P_{di}},$ (10)
$P_{ri}^{t}(x,y)=P_{center}^{t}(x,y)+s_{center}^{t}\cdot P_{rotate}^{t}(x_{rotate}^{t},y_{rotate}^{t}),\quad i\in[1,4],$ (11)
where $x_{rotate}^{t}$ and $y_{rotate}^{t}$ denote the $x$- and $y$-coordinates after rotation, and $P_{ri}^{t}(x,y)$ are the coordinates of the four vertices of the tracking box in the $t$th frame. The tracking box $B=(b_{1},b_{2},\ldots,b_{n})$ of each frame can be obtained through the calculation above.
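The update in Eqs. (5)-(11) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `estimate_motion` and `update_box` are hypothetical, and two details are assumptions made so that the example is concrete. First, the displacement in Eq. (5) is taken as the per-coordinate median of the keypoint displacements, so that it can be added to the center in Eq. (8). Second, the scale and angle medians in Eqs. (6) and (7) are taken over all unordered keypoint pairs, and each angle difference is wrapped into $(-\pi,\pi]$.

```python
import math
from itertools import combinations
from statistics import median


def estimate_motion(template_pts, frame_pts):
    """Estimate displacement, scale, and rotation from matched keypoints.

    template_pts[i] and frame_pts[i] are (x, y) tuples for the same keypoint.
    At least two distinct keypoints are required.
    """
    # Eq. (5), assumption: per-coordinate median of the displacements.
    dx = median(q[0] - p[0] for p, q in zip(template_pts, frame_pts))
    dy = median(q[1] - p[1] for p, q in zip(template_pts, frame_pts))

    scales, angles = [], []
    # Assumption: every unordered keypoint pair contributes one sample.
    for i, j in combinations(range(len(template_pts)), 2):
        ux = template_pts[j][0] - template_pts[i][0]
        uy = template_pts[j][1] - template_pts[i][1]
        vx = frame_pts[j][0] - frame_pts[i][0]
        vy = frame_pts[j][1] - frame_pts[i][1]
        d0 = math.hypot(ux, uy)
        if d0 == 0.0:
            continue  # coincident template keypoints carry no information
        scales.append(math.hypot(vx, vy) / d0)           # Eq. (6) sample
        diff = math.atan2(vy, vx) - math.atan2(uy, ux)   # Eq. (7) sample
        # Wrap the difference into (-pi, pi] before taking the median.
        angles.append(math.atan2(math.sin(diff), math.cos(diff)))

    return (dx, dy), median(scales), median(angles)


def update_box(center, vertices, shift, scale, angle):
    """Apply Eqs. (8)-(11) to the box center and its four vertices."""
    cx_t = center[0] + shift[0]                          # Eq. (8)
    cy_t = center[1] + shift[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    new_vertices = []
    for vx, vy in vertices:
        ox, oy = vx - center[0], vy - center[1]          # offset P_di
        x_rot = cos_a * ox - sin_a * oy                  # Eq. (9)
        y_rot = cos_a * oy + sin_a * ox                  # Eq. (10)
        new_vertices.append((cx_t + scale * x_rot,       # Eq. (11)
                             cy_t + scale * y_rot))
    return (cx_t, cy_t), new_vertices
```

Calling `estimate_motion` on the matched keypoints and passing its three results to `update_box` together with the initial center and vertices yields the tracking box for the current frame. The medians make the estimate tolerant of a minority of mismatched keypoints, at the cost of being only approximate when the keypoints are distributed unevenly around the center.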