
Opto-Electronic Engineering, Article, 2018, Vol. 45, No. 6

An improved method to render the sound of VR scene

Chen Tianshi, Tie Yun*, Qi Lin, Chen Enqing
School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China

Abstract: For virtual scenes containing hundreds or thousands of movable sound sources, traditional spatial sound rendering schemes often take up too many computing resources because of the high computational cost of the clustering stage. This has become a bottleneck in the development of VR audio rendering technology. In this paper, we use the fractional Fourier transform (FRFT) as a tool in sound sampling to reduce the quantization noise of the analog-to-digital conversion (ADC) stage. Moreover, we improve the processing speed of sound rendering and the operating efficiency of the entire system by adding an average angle deviation threshold to the clustering step. In addition, we design and carry out a perceptual user experiment, which validates the notion that people are more sensitive to the spatial errors produced by clustering sound sources of different types when the sources are visible. Based on this conclusion, this paper proposes an improved spatial sound clustering method that, when the sources are visible, reduces the possibility of clustering sound sources of different types into one group.

Keywords: sound rendering; clustering; perceptual user experiment; average angle error
CLC number: O436.3  Document code: A
Citation: Chen T S, Tie Y, Qi L, et al. An improved method to render the sound of VR scene[J]. Opto-Electronic Engineering, 2018, 45(6): 170742

Received: 2017-12-30; revised manuscript received: 2018-03-15
About the author: (1994-), E-mail: 1243325667@qq.com
Corresponding author: (1973-), E-mail: ieyie@zzu.edu.cn

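1 Introduction

[Body text missing from this copy. Surviving fragments: VR; citations of Tsingos N [1], Moeck T [2], Schissler C [3], and [4].]

2 Classical rendering process and improvements

2.1 Classical spatial sound rendering scheme

[The scheme is listed as steps 1) to 4), whose text is missing from this copy. Surviving fragments: a citation of [1], and the head related transfer function (HRTF) in step 4).]

2.2 Improvements

[The improvements are listed as items 1) to 3), whose text is missing from this copy. Surviving fragments: a citation of [4] in item 1), and a reference to Fig. 1.]

Fig. 1 The process of the sound rendering program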

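3 Optimization of sound signal acquisition and clustering

3.1 Optimization of sound signal acquisition

[Body text missing from this copy. The surviving fragments are three citations of [4].]

3.2 Optimization of the clustering method

[Body text largely missing from this copy. Surviving fragments: a reference to Tsingos; Eq. (1), which defines an angular quantity θ from the terms θ_i with a sum running over i = 1, …, k (its exact form cannot be recovered here); a threshold value of 30 for θ, with the case θ > 30 treated separately; a reference to Table 1; CPU usage.]

4 Clustering algorithm based on sound source labelling

4.1 Subjective experiment analysis

[Body text largely missing from this copy. Surviving fragments: the numbers 23 and 45 in parentheses; HTC Vive; HRTF.]

Table 1 Comparison of the traditional method and the newly proposed method
(Of the header row only the units survive, in this order: /%, /(°), CPU /%.)

200    64    20    17.3    27.8    18    12
400    56    23    21.1    25.4    36    20
600    60    19    16.5    28.6    68    33
800    52    29    22.4    27.9    87    46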
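The thresholding idea of Section 3.2 can be illustrated with a minimal Python sketch. It rests on assumptions that the surviving text does not confirm: that θ_i is the angle, seen from the listener, between source i and the representative of the cluster it currently belongs to, that the average is an arithmetic mean, and that clustering is redone only when the mean exceeds 30 degrees. The function names and the default threshold argument are hypothetical.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two non-zero 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding errors cannot break acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def mean_angular_deviation(source_dirs, rep_dirs):
    """Mean angle between each source and its cluster representative.

    Assumption of this sketch: both lists hold listener-relative
    direction vectors, with rep_dirs[i] the representative of source i.
    """
    deviations = [angle_deg(s, r) for s, r in zip(source_dirs, rep_dirs)]
    if not deviations:
        return 0.0
    return sum(deviations) / len(deviations)


def needs_reclustering(source_dirs, rep_dirs, threshold_deg=30.0):
    """True when the mean deviation exceeds the threshold (30 assumed to be degrees)."""
    return mean_angular_deviation(source_dirs, rep_dirs) > threshold_deg
```

Under these assumptions a renderer would call `needs_reclustering` once per frame and keep the previous clusters while it returns `False`, which is one way the cost of frequent clustering could be avoided.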

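The six test scenes are: 1 (all ambient), 2 (all music), 3 (all speech), 4 (ambient + music), 5 (ambient + speech), 6 (music + speech).

[Body text largely missing from this copy. Surviving fragments: the number 20; a citation of [5]; the stimulus labels A, B, R and AB, R; the angle ranges (90°~120°), (60°~90°), (30°~60°), (0~30°); a reference to Fig. 2; the numbers 20 and 120.]

4.2 Improvement of the clustering algorithm

4.2.1 [Heading text missing from this copy.]

The clustering metric of Tsingos [1] is given as Eq. (2), which survives only in fragments. The form below restores the S_k factors and the norm bars on the assumption that the metric follows [1]:

$$d(C_n, S_k) = L\,\beta \log_{10}\frac{\lVert C_n \rVert}{\lVert S_k \rVert} + \gamma\,\frac{1}{2}\,(1 - C_n \cdot S_k) \qquad (2)$$

The symbols named in the accompanying text are L, S_k and C_n; their definitions are missing from this copy. The passage closes with a citation of [6].

4.2.2 [Heading text missing from this copy.]

1) [Text missing from this copy. Surviving fragments: the symbol e and the values 1, 2, 3.]

Fig. 2 The influence of two conditions on sound rendering
[Bar chart over scenes 1 to 6 with a vertical axis from 0 to 90. Surviving data labels, in extraction order: 75, 80, 72, 53, 47, 43.]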

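$$x(S_n, S_k) = \begin{cases} a, & \text{if } e(S_n) = e(S_k) \\ b, & \text{else} \end{cases} \qquad (3)$$

The symbols named after Eq. (3) are x(S_n, S_k), the indices n and k, the constants a and b, and the labels e(S_n) and e(S_k) of sources n and k; their definitions are missing from this copy.

2) [Text missing from this copy.] The metric extended by the label term is Eq. (4). The S_k factors and the norm bars are restored on the assumption that the first two terms follow the metric of [1]:

$$d(C_n, S_k) = L\,\beta \log_{10}\frac{\lVert C_n \rVert}{\lVert S_k \rVert} + \gamma\,\frac{1}{2}\,(1 - C_n \cdot S_k) + \mu\,\bigl(1 - x(S_n, S_k)\bigr) \qquad (4)$$

[Surviving fragments of the text after Eq. (4): the symbols μ, C_n and S_n; references to Fig. 3, Fig. 3(a) and Fig. 3(b).]

5 Experiments and analysis

[Body text largely missing from this copy. Surviving fragments: items 1) and 2); Nvidia GTX 970; i7; 44.1 kHz; 1200; a reference to Table 2; the numbers 769, 893 and 578; 30% and 40%; CPU.]

Fig. 3 The results of the two clustering algorithms. (a) Traditional algorithm; (b) Improved algorithm

Table 2 The performance of the algorithm in three scenes
(Of the header row only the units survive, in this order: /, /%, /ms, /(f·s⁻¹).)

769    56    25    0.78    30
578    64    25    0.62    34
893    43    25    0.83    27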
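The label-aware metric of Eq. (3) and Eq. (4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: positions are listener-relative vectors, L is taken as a per-source loudness weight on the range term, the absolute value of the log ratio is used so that the range term stays non-negative, and the constants a = 1, b = 0 and the weights β, γ, μ are placeholder values. The greedy farthest-first selection of representatives is the classical k-center heuristic of [6]; whether the paper uses exactly this selection is also an assumption. All function and field names are hypothetical.

```python
import math


def _norm(v):
    return math.sqrt(sum(c * c for c in v))


def label_match(label_n, label_k, a=1.0, b=0.0):
    """x(S_n, S_k) of Eq. (3): a when both labels agree, b otherwise."""
    return a if label_n == label_k else b


def labelled_distance(rep, src, beta=2.0, gamma=1.0, mu=1.0):
    """Eq. (4)-style cost of assigning source `src` to representative `rep`."""
    r_rep, r_src = _norm(rep["pos"]), _norm(src["pos"])
    dot = sum(p * q for p, q in zip(rep["pos"], src["pos"]))
    cos_angle = dot / (r_rep * r_src)
    # abs() is a choice of this sketch, see the lead-in.
    range_term = src["loudness"] * beta * abs(math.log10(r_rep / r_src))
    angle_term = gamma * 0.5 * (1.0 - cos_angle)
    label_term = mu * (1.0 - label_match(rep["label"], src["label"]))
    return range_term + angle_term + label_term


def cluster(sources, n_clusters, **weights):
    """Farthest-first k-center heuristic.

    Returns (indices of representatives, representative index per source).
    """
    indices = range(len(sources))
    # Start from the loudest source (an assumption of this sketch).
    reps = [max(indices, key=lambda i: sources[i]["loudness"])]
    while len(reps) < min(n_clusters, len(sources)):
        candidates = [i for i in indices if i not in reps]
        # Next representative: the source worst served by the current ones.
        reps.append(max(
            candidates,
            key=lambda i: min(
                labelled_distance(sources[r], sources[i], **weights)
                for r in reps),
        ))
    assignment = [
        min(reps, key=lambda r: labelled_distance(sources[r], sources[i], **weights))
        for i in indices
    ]
    return reps, assignment
```

With μ = 0 the cost reduces to the purely spatial terms, so nearby sources of different types share a cluster. Raising μ makes a label mismatch more expensive than a moderate spatial error, which pushes the heuristic toward clusters of a single type.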

[Body text missing from this copy. Surviving fragments: frames per second (FPS); VR.]

6 Conclusion

[Body text missing from this copy. The surviving fragment is a citation of [4].]

References

[1] Tsingos N, Gallo E, Drettakis G. Perceptual audio rendering of complex virtual environments[J]. ACM Transactions on Graphics, 2003, 23(3): 249-258.
[2] Moeck T, Bonneel N, Tsingos N, et al. Progressive perceptual audio rendering of complex scenes[C]//Proceedings of the 2007 Symposium on Interactive 3D Graphics and Games, 2007: 189-196.
[3] Schissler C, Manocha D. Interactive sound propagation and rendering for large multi-source scenes[J]. ACM Transactions on Graphics, 2016, 36(4): 121-139.
[4] Lu M F, Ni G Q, Bai T Z, et al. A novel method for suppressing the quantization noise based on fractional Fourier transform[J]. Transactions of Beijing Institute of Technology, 2015, 35(12): 1285-1290.
[5] ITU. Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems: ITU-R BS.1116-1[R]. Geneva: ITU, 1994: 1128-1136.
[6] Hochbaum D S, Shmoys D B. A best possible heuristic for the k-center problem[J]. Mathematics of Operations Research, 1985, 10(2): 180-184.
[7] Schissler C, Nicholls A, Mehra R. Efficient HRTF-based spatial audio for area and volumetric sources[J]. IEEE Transactions on Visualization and Computer Graphics, 2016, 22(4): 1356-1366.
[8] Takala T, Hahn J. Sound rendering[J]. ACM SIGGRAPH Computer Graphics, 1992, 26(2): 211-220.
[9] Schissler C, Loftin C, Manocha D. Acoustic classification and optimization for multi-modal rendering of real-world scenes[J]. IEEE Transactions on Visualization and Computer Graphics, 2017, 24(3): 1246-1259.
[10] Taylor M T, Chandak A, Antani L, et al. RESound: interactive sound rendering for dynamic virtual environments[C]//Proceedings of the 17th ACM International Conference on Multimedia, 2009: 271-280.
[11] Raghuvanshi N, Snyder J, Mehra R, et al. Precomputed wave simulation for real-time sound propagation of dynamic sources in complex scenes[J]. ACM Transactions on Graphics (TOG), 2010, 29(4): 142-149.
[12] Grelaud D, Bonneel N, Wimmer M, et al. Efficient and practical audio-visual rendering for games using crossmodal perception[C]//Proceedings of the 2009 Symposium on Interactive 3D Graphics and Games, 2009: 177-182.
[13] Innami S, Kasai H. On-demand soundscape generation using spatial audio mixing[C]//Proceedings of 2011 IEEE International Conference on Consumer Electronics, 2011: 29-30.

[Flow chart: The process of the sound rendering program. Box labels: audio data and the positions of the sound sources and listener; sound source culling; perceptual clustering; spatial sound rendering; output; average angular error; sound source label parameters.]

Overview: In the field of VR, spatial sound rendering technology plays an increasingly important role, because the spatial sense of sound greatly enhances the user's sense of immersion in a VR scene. At present, advanced spatial sound rendering schemes fall mainly into two types: sound waveform algorithms and ray-based tracking algorithms. Important recent research includes the following. Tsingos proposed a dynamic spatial sound rendering method based on culling and clustering, which made it possible to process large numbers of sound sources in real time. Moeck considered visual factors during clustering, which reduced the clustering cost. The new algorithm proposed by Schissler eliminated the impact of obstacles on the clustering results. Tao et al. proposed a quantization noise suppression method based on the fractional Fourier transform to improve the quality of audio signal sampling.

For virtual scenes containing hundreds of movable sound sources, the traditional spatial sound rendering scheme often takes up too many computing resources because of the high computational cost of the clustering stage, which has become a bottleneck in the development of VR audio rendering technology. In this paper, we improve the processing speed of sound rendering and the operating efficiency of the entire system by adding an average angle deviation threshold to the clustering step. In addition, we design and carry out a perceptual user experiment that validates the notion that people are more sensitive to the spatial errors produced by clustering sound sources of different types, especially when the sources are visible. Based on this conclusion, this paper proposes an improved sound clustering method that reduces the possibility of clustering sound sources of different types together.

In summary: focusing on the rendering of complex virtual auditory scenes comprising hundreds of moving sound sources with spatial audio mixing, we propose a new clustering algorithm that takes the average angle error into account. We show that, under specific conditions, the algorithm is effective in reducing the computational cost caused by frequent clustering. In addition, the results of the subjective experiments show that clustering sound sources of different types causes more spatial information errors. Using this result, this paper proposes a method based on sound source label parameters, which addresses the problem of clustering different kinds of sound sources together. Finally, experiments in three scenes verify the feasibility of the new method.