\documentclass[12pt,a4paper,oneside,titlepage]{paper}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{textcomp}
\usepackage{listings}
\lstdefinelanguage{Ini}{basicstyle=\ttfamily\tiny,
columns=fullflexible,
tag=[s]{[]},
tagstyle=\color{blue}\bfseries,
usekeywordsintag=true
}[html]
\lstdefinelanguage{bash}{basicstyle=\ttfamily\tiny}
\usepackage{ulem}
\usepackage{lmodern}
\usepackage{multirow}
\usepackage{url}
\usepackage{graphicx}
\usepackage{pdfpages}
\usepackage{float}
\floatstyle{boxed}
\restylefloat{figure}
\usepackage[usenames,dvipsnames,svgnames,table]{xcolor}
\definecolor{osc-out}{RGB}{150,0,255}
\definecolor{osc-in}{RGB}{0,0,255}
\definecolor{audio-in}{RGB}{255,0,0}
\definecolor{audio-out}{RGB}{0,206,0}
%\usepackage{color}
\usepackage{hyperref}
\hypersetup{hidelinks, colorlinks = false}
\usepackage[font=scriptsize]{caption}
\usepackage[authoryear]{natbib}
% glossary
\usepackage[acronym,nonumberlist,toc]{glossaries}
\newacronym{bs}{BS}{Binaural Synthesis}
\newacronym{hoa}{HOA}{Higher Order Ambisonics}
\newacronym{ip}{IP}{Internet Protocol}
\newacronym{jack}{JACK}{JACK Audio Connection Kit}
\newacronym{oop}{OOP}{Object-oriented Programming}
\newacronym{osc}{OSC}{Open Sound Control}
\newacronym{pubsub}{PubSub}{Publish-subscribe message pattern}
\newacronym{pd}{Pd}{PureData}
\newacronym{ssr}{SSR}{SoundScape Renderer}
\newacronym{tcp}{TCP}{Transmission Control Protocol}
\newacronym{vbap}{VBAP}{Vector Based Amplitude Panning}
\newacronym{wfs}{WFS}{Wave Field Synthesis}
\newacronym{xml}{XML}{Extensible Markup Language}
\makeindex
\makeglossaries
\graphicspath{{../images//}}
\begin{document}
\begin{titlepage}
\centering
\includegraphics[width=0.3\textwidth]{tu-berlin-logo.pdf}\par\vspace{1cm}
{\scshape\LARGE Technische Universität Berlin\par}
\vspace{1cm}
{\scshape\Large Master Thesis\par}
\vspace{1.5cm}
{\huge\bfseries A Networking Extension for the SoundScape Renderer\par}
\vspace{2cm}
{\Large\itshape David Runge\par}
\href{mailto:dave@sleepmap.de}{dave@sleepmap.de}
\vfill
supervised by\par
Henrik von Coler and Stefan Weinzierl
\vfill
{\large \today\par}
\end{titlepage}
\pagestyle{empty}
\section*{Statutory Declaration}
\vspace{1cm}
I hereby declare that I have produced the present work independently and
by my own hand, without unauthorized outside help and exclusively using
the listed sources and aids.\\
Berlin, \today\par
\vspace{2cm}
\noindent\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\\
David Runge
\begin{abstract}
\gls{wfs} as a technological concept has been around for many years, and
institutions all over the world run small and, in some cases, large scale
setups, ranging from single loudspeaker lines to systems driving a couple
of hundred loudspeakers.\\
The still evolving implementations are driven by several rendering
engines, of which two free and open-source ones, namely sWONDER and
SoundScape Renderer, have (partially) been developed at TU Berlin.\\
Due to its current design, the latter is not yet able to render for large
scale setups, i.e.\ those that need several computers to render audio on
one loudspeaker setup because of the high number of channels.\\
Its solid codebase, which additionally offers a framework for many more
rendering types, and its ongoing development make further work on this
application a good investment.\\
This work is about the extension of the SoundScape Renderer's functionality
to turn it into a networking application for large scale \gls{wfs} setups.
\end{abstract}
\tableofcontents
\cleardoublepage
\pagestyle{headings}
\setcounter{page}{1}
\section{Introduction}
\label{sec:introduction}
\cleardoublepage
\section{Free and open-source spatial audio renderers}
\label{sec:freespatialaudiorenderers}
To date, three free and open-source spatial audio renderers are known, all
of which are \href{http://jackaudio.org/}{\gls{jack}}
\citep{website:jackaudio2016} clients:
\begin{itemize}
\item \href{https://sourceforge.net/projects/swonder/}{sWONDER}
\citep{website:swonder2016}, developed by Technische Universität
Berlin, Germany
\item \href{https://github.com/GameOfLife/WFSCollider}{WFSCollider}
\citep{website:wfscollider2016}, developed by
\href{http://gameoflife.nl/en}{Game Of Life Foundation}
\citep{website:gameoflife2016}, The Hague, Netherlands
\item \href{http://spatialaudio.net/ssr/}{\gls{ssr}}
\citep{website:ssr2016}, developed by Quality \& Usability Lab,
Deutsche Telekom Laboratories and TU Berlin and Institut für
Nachrichtentechnik, Universität Rostock
\end{itemize}
Currently only WFSCollider and the \gls{ssr} are actively maintained and
developed. sWONDER, although still used in some setups, is therefore losing
significance.
The three renderers follow different concepts, which are explained briefly
in the following sections.
\subsection{Spatial audio renderers and their application}
\label{subsec:spatialaudiorenderersandtheirappliance}
\subsubsection{Wave Field Synthesis}
\label{subsubsec:wavefieldsynthesis}
\gls{wfs} describes a spatial audio rendering technique. It aims at
synthesizing a sound field with desired acoustic properties in a given
listening area, assuming a planar reproduction to be suitable for most
applications.\\
\gls{wfs} is typically implemented using a curved or linear loudspeaker array
surrounding the listening area.\\
Several free and open-source renderer applications exist for \gls{wfs}
environments, in varying stages of feature richness.\\
The proposed work will focus on one of them and its extension towards \gls{wfs}
on large scale systems.
\subsubsection{\gls{hoa} and \gls{vbap}}
\label{subsubsec:hoaandvbap}
\subsubsection{\gls{bs}}
\label{subsubsec:binaural}
\subsection{sWONDER}
\label{subsec:WONDER}
sWONDER \citep{baalman2007} consists of a set of C++ applications that
provide \gls{bs} and \gls{wfs} rendering. In 2007 it was specifically
redesigned \citep{baalmanetal2007} to cope with large scale \gls{wfs} setups in
which several (computer) nodes, each driving several loudspeakers, run a
system together.\\
In these setups each node redundantly receives all available audio streams
(each of which represents one virtual audio source) and a master
application signals which node is responsible for rendering which source
on which loudspeaker.\\
It uses \gls{osc} for messaging between its parts and for
setting its controls. Apart from that, it can be controlled through a
Graphical User Interface (GUI) that was specifically designed for it.
Unfortunately, sWONDER has not been actively maintained for several years,
has a complex setup chain and has many bugs that are unlikely to be fixed
any time soon.
\subsection{HOA-Pd}
\label{subsec:hoapd}
\subsection{WFSCollider}
\label{subsec:wfscollider}
WFSCollider was built on top of
\href{https://supercollider.github.io}{SuperCollider} 3.5
\citep{website:supercollider2016} and is also capable of driving large
scale systems. It uses a different approach in
doing so, though: Whereas with sWONDER all audio streams are distributed
to each node, WFSCollider usually plays back the audio files on
all machines simultaneously and synchronizes between them.\\
It has a feature-rich GUI in the ``many window'' style, which provides
timelines and movement of sources by making use of what sclang
(the SuperCollider programming language) has to offer.\\
As WFSCollider basically is SuperCollider plus extra features, it is also
an \gls{osc} enabled application and can thus also be used for mere
multi-channel playback of audio.\\
Although it has many useful features, it requires Mac OS X to run (the
Linux version is still untested), is built upon a quite old version of
\href{https://supercollider.github.io}{SuperCollider} and is unlikely ever
to be merged into it, due to many core changes to it.
\subsection{SoundScape Renderer}
\label{subsec:soundscaperenderer}
The \gls{ssr}, also a C++ application, runs on Linux and Mac OS X. It is a
multi-purpose spatial audio renderer, as it is not only
capable of \gls{bs} and \gls{wfs}, but also of \gls{hoa}
and \gls{vbap}.\\
It can be used headless or with a GUI, which depicts the
virtual sources, their volumes and positions, alongside the loudspeakers
currently used for rendering a selected source.
\gls{ssr} uses TCP/IP sockets for communication and is therefore not directly
\gls{osc} enabled. This functionality can, however, be achieved by combining
it with other applications such as \gls{pd}
\citep{website:puredata2016}.\\
Unlike the two renderers above, the \gls{ssr} is not able to run large-scale
\gls{wfs} setups, as it lacks the means to communicate between instances of
itself that run on several computers and each serve a subset of the
available loudspeakers.
\cleardoublepage
\section{Methods}
\label{sec:methods}
The \gls{ssr}, due to its diverse set of rendering engines, which are made
available through an extensible framework, and its relatively clean
codebase, is a good candidate for future large scale \gls{wfs} setups. This
type of feature is not yet implemented, though, and will need testing.\\
Therefore I propose the implementation and testing of said feature, making
the \gls{ssr} capable of rendering on large scale \gls{wfs} setups with many nodes,
controlled by a master instance.\\
The sought implementation is inspired by the architecture of sWONDER, but
instead of creating many single purpose applications, the master/node
feature will be made available through flags passed to the \gls{ssr}
executable on startup. This mechanism is already in use, e.g.\ for selecting
one of the several rendering engines.
\begin{figure}[!htb]
\centering
\includegraphics[scale=0.9, trim = 31mm 190mm 24mm 8mm, clip]
{ssr-networking.pdf}
\caption{A diagram displaying the \gls{ssr} master/node setup with TCP/IP
socket connections over network (green lines), audio channels (red
dashed lines) and \gls{osc} connection (blue dashed line). Machines are
indicated as red dashed rectangles and connections to audio hardware
as outputs of \gls{ssr} nodes as black lines below them.}
\label{fig:ssr-networking}
\end{figure}
While the \gls{ssr} already has an internal logic to know which loudspeaker will
be used for which virtual audio source, this will have to be extended to
know which renderer node has to render which source on which
loudspeaker (see Figure~\ref{fig:ssr-networking}).
To achieve the above features, the \gls{ssr}'s messaging (and thus also settings)
capabilities have to be extended alongside its internal logic concerning
the selection of output channels (and the master to node notification
thereof). To introduce as little redundant code as possible, a
``the client knows all'' setup is most likely desirable, in which each node
knows about the whole setup, but is set to only serve its own subset of
loudspeakers in it. This will make sure that the rendering engine remains
functional in a small scale \gls{wfs} setup as well.\\
The lack of direct \gls{osc} functionality, as provided by the two other
renderers, will not be problematic, as master and nodes can communicate
directly through their built-in TCP/IP sockets and the master can, if
needed, be controlled via \gls{osc}.
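\par
The ``the client knows all'' layout can be illustrated with a minimal C++
sketch. It is not taken from the \gls{ssr} codebase: the type
\texttt{Loudspeaker}, the function \texttt{channels\_served\_by} and the idea
of tagging every loudspeaker with the name of its node are assumptions made
purely for illustration. Each node would hold the description of the whole
setup and derive from it the subset of output channels it has to serve.
\begin{lstlisting}[language=C++,basicstyle=\ttfamily\footnotesize]
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one loudspeaker in the complete setup.
struct Loudspeaker {
  std::size_t channel;  // global output channel index
  std::string node;     // name of the node driving this loudspeaker
};

// Return the global channel indices that the given node has to serve.
// Every node calls this on the same, complete setup description.
std::vector<std::size_t> channels_served_by(
    const std::vector<Loudspeaker>& setup, const std::string& node)
{
  std::vector<std::size_t> served;
  for (const Loudspeaker& ls : setup) {
    if (ls.node == node) {
      served.push_back(ls.channel);
    }
  }
  return served;
}
\end{lstlisting}
Under these assumptions a small scale setup is just the special case in
which all loudspeakers carry the same node name, so no separate code path
would be needed for it.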
\subsection{Preliminaries}
\label{subsec:preliminaries}
In preparation for this work, I set out to implement a side-by-side
installation using Arch Linux on a medium scale setup, namely the
\gls{wfs} system of the Electronic Studio at TU Berlin. Unfortunately the
proprietary Dante driver that is used in that system is very complex to
build as well as underdeveloped. It thus keeps the system from being
easily updated, which is needed for testing purposes (finding a suitable
real-time, low-latency Linux kernel), trying out new software features,
building new software and keeping the system safe. The driver will most
likely require changes to the hardware, due to the vendor's implementation
of hardware branding, and thorough testing before usage.\\
Although using a proper \gls{wfs} setup for testing will eventually be necessary,
it is luckily not needed for implementing the features, as they can already
be worked out using two machines running Linux, \gls{jack} and the development
version of \gls{ssr}.\\
The hardware of the large scale setup at TU Berlin in H0104 is currently
about to be updated and is therefore a valuable candidate for testing
the sought-after \gls{ssr} features.
\subsection{Outline}
\label{subsec:outline}
Initially extending the \gls{ssr}'s features was aimed at
\subsubsection{Remote controlling a server}
\label{subsubsec:remote_controlling_a_server}
\subsubsection{Remote controlling clients}
\label{subsubsec:remote_controlling_a_client}
\subsubsection{Rendering on dedicated speakers}
\label{subsubsec:rendering_on_dedicated_speakers}
\subsection{Publisher/Subscriber interface}
\label{subsec:publisher_subscriber_interface}
The \gls{ssr} internally uses a \gls{pubsub}, which is a design pattern,
to implement control across its components.\\
In \gls{oop}, \gls{pubsub} (also called observer or listener messaging)
usually comprises a publisher class, which handles the messages without
explicitly implementing how they will be used, and a subscriber class,
whose implementations can subscribe to the messages
provided. Filtering takes place to enable subscribers to only receive a
certain subset of the messages.\\
The \gls{ssr} implements a content-based filtering system, in which each
subscriber evaluates the messages received and decides, depending on its
own constraints, which further actions to take.\\
The abstract class Publisher defines the messages that can be sent and
provides means to subscribe to them. The global Controller class is its
only implementation within the \gls{ssr}.\\
The abstract class Subscriber in turn defines the messages understood,
while its implementations in RenderSubscriber, Scene, OscSender and
NetworkSubscriber take care of how they are used.\\
This system enables a versatile messaging layout, in which components can
call the publisher functionality in Controller, which in turn will send
out messages to all of its subscribers.
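\par
The interplay of Publisher, Subscriber and Controller can be sketched in a
few lines of C++. The sketch is heavily reduced and only meant as an
illustration: the single message \texttt{set\_source\_position}, the
signatures of \texttt{subscribe} and \texttt{unsubscribe} and the class
\texttt{SingleSourceSubscriber} are assumptions and do not reproduce the
actual \gls{ssr} interfaces.
\begin{lstlisting}[language=C++,basicstyle=\ttfamily\footnotesize]
#include <algorithm>
#include <vector>

// Messages understood by subscribers (reduced to one for brevity).
class Subscriber {
public:
  virtual ~Subscriber() = default;
  virtual void set_source_position(int id, double x, double y) = 0;
};

// Messages that can be sent, plus the means to subscribe to them.
class Publisher {
public:
  virtual ~Publisher() = default;
  virtual void subscribe(Subscriber* s) = 0;
  virtual void unsubscribe(Subscriber* s) = 0;
  virtual void set_source_position(int id, double x, double y) = 0;
};

// The only publisher: forwards every call to all of its subscribers.
class Controller : public Publisher {
public:
  void subscribe(Subscriber* s) override { subscribers_.push_back(s); }
  void unsubscribe(Subscriber* s) override {
    subscribers_.erase(
        std::remove(subscribers_.begin(), subscribers_.end(), s),
        subscribers_.end());
  }
  void set_source_position(int id, double x, double y) override {
    for (Subscriber* s : subscribers_) {
      s->set_source_position(id, x, y);
    }
  }
private:
  std::vector<Subscriber*> subscribers_;
};

// Content-based filtering: the subscriber itself evaluates each
// message and only acts on the source it is interested in.
class SingleSourceSubscriber : public Subscriber {
public:
  explicit SingleSourceSubscriber(int id) : id_(id) {}
  void set_source_position(int id, double x, double y) override {
    if (id != id_) return;  // not our source, ignore the message
    ++received;
    last_x = x;
    last_y = y;
  }
  int received = 0;
  double last_x = 0.0;
  double last_y = 0.0;
private:
  int id_;
};
\end{lstlisting}
In this sketch the publisher does not need to know what its subscribers do
with a message, which is what allows new subscriber types to be added
without touching the publishing side.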
\subsection{\gls{ip} interface}
\label{subsec:ip-interface}
From early on, the \gls{ssr} has incorporated a network interface, called
``\gls{ip} interface'', that accepts specially terminated
\gls{xml}-formatted strings over a \gls{tcp} port. This has the benefit of
reusing the same \gls{xml} parser code that is in use for scene and
reproduction description.\\
A downside, however, is that it is complicated to use from the perspective
of other software, as a message has to be converted to \gls{xml}
before sending it to the \gls{ssr}. Additionally, the message has
to be linted (error checked) before sending, and the answer received from
the application has to be parsed again.
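\par
To give an impression of the extra work a client has to do, the following
C++ sketch shows how terminated messages could be framed before sending and
split again after receiving. It is an illustration only: the terminator is
assumed here to be a single null byte, and the function names
\texttt{frame\_message} and \texttt{extract\_messages} are hypothetical.
Creating and parsing the \gls{xml} payload itself is left out.
\begin{lstlisting}[language=C++,basicstyle=\ttfamily\footnotesize]
#include <string>
#include <vector>

// Assumption: every message ends with a single null byte.
constexpr char terminator = '\0';

// Append the terminator to an XML string before it is written
// to the TCP socket.
std::string frame_message(const std::string& xml)
{
  return xml + terminator;
}

// Split received bytes into complete messages. An unterminated
// remainder stays in the buffer until more data has arrived.
std::vector<std::string> extract_messages(std::string& buffer)
{
  std::vector<std::string> messages;
  std::string::size_type end;
  while ((end = buffer.find(terminator)) != std::string::npos) {
    messages.push_back(buffer.substr(0, end));
    buffer.erase(0, end + 1);
  }
  return messages;
}
\end{lstlisting}
Since \gls{tcp} delivers a byte stream without message boundaries, a
receiver along these lines has to buffer incoming data and may obtain none,
one or several complete messages per read.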
\paragraph{OSC through PureData}
\label{par:osc_through_puredata}
To allow \gls{osc} communication, the \gls{ssr} incorporates a Lua-based
\gls{pd} external. It relies on two further externals (iemnet and pdlua)
alongside a Lua library for parsing and creating \gls{xml} (SLAXML).
\paragraph{Sending and receiving}
\label{par:sending_and_receiving}
As mentioned in
section~\nameref{subsec:publisher_subscriber_interface}, the
NetworkSubscriber class (part of the \gls{ip} interface) implements the
subscriber interface. This means that the network interface subscribes to
the messages the publisher (the Controller instance) has to offer.
Every time a function that the \gls{ssr}'s Controller instance
inherited from Publisher is called, the Controller will issue the call on
all of its subscribers, too.
\cleardoublepage
\section{Results}
\label{sec:results}
\subsection{\gls{osc} interface}
\label{subsec:osc-interface}
\subsubsection{liblo}
\label{subsubsec:liblo}
\subsubsection{Client-Server setup}
\label{subsubsec:client_server_setup}
\begin{figure}[!htb]
\centering
\includegraphics[scale=1.0, trim = 20mm 204mm 10mm 10mm, clip]
{ssr-client-server-shared-output.pdf}
\caption{A diagram displaying an \gls{ssr} client/server setup, in
which the server and the clients render audio collectively (e.g.\
\gls{wfs}). The server instance is not controlled via \gls{osc},
but controls its clients through it.\\
{\color{osc-in}\textbf{--}} \gls{osc} input
{\color{osc-out}\textbf{--}} \gls{osc} output
{\color{audio-in}\textbf{--}} Audio input
{\color{audio-out}\textbf{--}} Audio output
}
\label{fig:ssr-client-server-shared-output}
\end{figure}
\subsubsection{Layered clients}
\label{subsubsec:layered_clients}
\subsubsection{Message interface}
\label{subsubsec:message_interface}
\cleardoublepage
\section{Discussion}
\label{sec:discussion}
\paragraph{Stress testing the \gls{osc} interface}
\label{par:stress_testing_the_osc_interface}
\paragraph{Implementing a NullRenderer}
\label{par:implementing_a_nullrenderer}
\paragraph{Implementing AlienLoudspeaker}
\label{par:implementing_alienloudspeaker}
\paragraph{Interpolation of moving sources}
\label{par:interpolation_of_moving_sources}
\pagestyle{empty}
\cleardoublepage
\addcontentsline{toc}{section}{\listfigurename}
\listoffigures
\cleardoublepage
\addcontentsline{toc}{section}{\listtablename}
\listoftables
\cleardoublepage
\printindex
\glsaddall
\printglossaries
\cleardoublepage
\bibliographystyle{plainnat}
\bibliography{../bib/ssr-networking}
\end{document}