Abstract: Temporal sentence grounding in videos (TSGV), also known as natural language video localization (NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that semantically ...