A robust method for estimating synchronization and delay of audio and video for communication services
  • Authors: Andreas Rossholm; Benny Lövström
  • Keywords: Lip sync; Synchronization; Delay; QoE; Video streaming; Video conferencing
  • Journal: Multimedia Tools and Applications
  • Publication year: 2016
  • Publication date: January 2016
  • Volume: 75
  • Issue: 1
  • Pages: 527-545
  • Full-text size: 1,852 KB
  • Author affiliations: Andreas Rossholm (1)
    Benny Lövström (1)

    1. Blekinge Institute of Technology, 371 79, Karlskrona, Sweden
  • Journal category: Computer Science
  • Journal subjects: Multimedia Information Systems
    Computer Communication Networks
    Data Structures, Cryptology and Information Theory
    Special Purpose and Application-Based Systems
  • Publisher: Springer Netherlands
  • ISSN:1573-7721
Abstract
Synchronization is one of the main contributors to the quality of experience in audio and video streaming services and in two-way communication applications. Several studies and experiments have shown this, but methods for measuring synchronization are less common, especially methods that work without internal access to the application and independently of platform and device. In this paper we present a method for measuring synchronization skew as well as delay for audio and video. The solution uses audio and video reference streams in which the audio and video frames are marked with frame numbers; these are decoded on the receiver side to enable calculation of synchronization and delay. The method has been verified in a two-way communication application in a transparent network, with and without inserted known delays, as well as in a network with 5% and 10% packet loss. The method can be used for both streaming and two-way communication services, with or without access to the internal structures, and enables measurements of applications running on, e.g., smartphones, tablets, and laptops under various conditions.
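
The calculation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DecodedFrame structure, the function names, the evenly spaced frame intervals, and the shared sender/receiver clock are all assumptions made for illustration. The sketch only shows how frame numbers decoded at the receiver could yield a delay per stream and a skew between the streams.

```python
from dataclasses import dataclass


@dataclass
class DecodedFrame:
    """One frame observed at the receiver (hypothetical structure)."""
    frame_number: int       # number decoded from the mark carried by the frame
    receive_time_ms: float  # time at which the receiver observed the frame


def media_time_ms(frame_number: int, frame_interval_ms: float) -> float:
    """Time at which the reference stream emitted this frame.

    Assumes frame 0 was emitted at t = 0 and frames are evenly spaced.
    """
    return frame_number * frame_interval_ms


def delay_ms(frame: DecodedFrame, frame_interval_ms: float) -> float:
    """End-to-end delay, assuming sender and receiver share one clock."""
    return frame.receive_time_ms - media_time_ms(frame.frame_number, frame_interval_ms)


def skew_ms(audio: DecodedFrame, video: DecodedFrame,
            audio_interval_ms: float, video_interval_ms: float) -> float:
    """Audio delay minus video delay.

    Negative values mean the audio arrives ahead of the video.
    """
    return delay_ms(audio, audio_interval_ms) - delay_ms(video, video_interval_ms)


# Illustrative values only: 20 ms audio frames, 40 ms video frames (25 fps).
audio = DecodedFrame(frame_number=50, receive_time_ms=1120.0)
video = DecodedFrame(frame_number=25, receive_time_ms=1200.0)

audio_delay = delay_ms(audio, 20.0)
video_delay = delay_ms(video, 40.0)
skew = skew_ms(audio, video, 20.0, 40.0)
print(audio_delay, video_delay, skew)
```

Because the skew is a difference of two delays measured on the same receiver clock, any constant offset between the sender and receiver clocks cancels out of it; the absolute delay values do depend on the shared-clock assumption.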
