\documentclass[12pt,a4paper,oneside,titlepage]{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{textcomp} % special characters, e.g. €
%\usepackage{citep}
\usepackage{listings}
\lstdefinelanguage{Ini}{basicstyle=\ttfamily\tiny,
  columns=fullflexible,
  tag=[s]{[]},
  tagstyle=\color{blue}\bfseries,
  usekeywordsintag=true
}[html]
\lstdefinelanguage{bash}{basicstyle=\ttfamily\tiny}
\usepackage{ulem}
\usepackage{lmodern}
\usepackage{multirow}
\usepackage{url}
\usepackage{graphicx} % for PDF scaling
\usepackage{pdfpages}
\usepackage{float}
\floatstyle{boxed}
\restylefloat{figure}
\usepackage{color}
\usepackage{bbding}
\usepackage{hyperref}
\usepackage[font=scriptsize]{caption}
\usepackage[numbers]{natbib}
\graphicspath{{images//}}

\begin{document}
  \title{Exposé: SoundScape Renderer Networking}
  \author{David Runge\\
    Audiokommunikation und -technologie\\
    Fachgebiet Audiokommunikation\\
    Technische Universität Berlin\\
    \href{mailto:dave@sleepmap.de}{dave@sleepmap.de}
  }
  \date{\today}
  \maketitle
  \begin{abstract}
    Wave Field Synthesis (WFS) as a technological concept has been around for
    many years, and institutions all over the world run setups ranging from
    single loudspeaker lines to large scale systems comprising several hundred
    loudspeakers.\\
    The still evolving implementations are driven by several rendering
    engines, of which two free and open-source ones, namely sWONDER and the
    SoundScape Renderer, have (partially) been developed at TU Berlin.\\
    The latter, due to its current design, is not yet able to render for large
    scale setups, i.e.\ those in which several computers drive a loudspeaker
    system together because of the high number of channels.\\
    Its solid codebase, which additionally offers a framework for many more
    renderers, and its ongoing development, however, make further work on this
    application a good investment.\\
    The proposed work seeks to extend the SoundScape Renderer's functionality
    to turn it into a networking application for large scale WFS setups.
\end{abstract}
  \section{Introduction}
    Wave Field Synthesis (WFS) is a spatial audio rendering technique. It
    aims at synthesizing a sound field with the desired acoustic properties
    in a given listening area, assuming planar reproduction to be suitable
    for most applications.\\
    WFS is typically implemented using a curved or linear loudspeaker array
    surrounding the listening area.\\
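    The underlying principle can be sketched as a single-layer potential:
    the loudspeaker array acts as a distribution of secondary sources on the
    boundary $\partial\Omega$ of the listening area, each fed with a driving
    signal, so that the synthesized sound field reads
    \[
      P(\mathbf{x},\omega) = \oint_{\partial\Omega}
        D(\mathbf{x}_0,\omega)\, G(\mathbf{x}|\mathbf{x}_0,\omega)\,
        \mathrm{d}A(\mathbf{x}_0),
    \]
    where $D(\mathbf{x}_0,\omega)$ is the driving function of the secondary
    source at position $\mathbf{x}_0$ and $G(\mathbf{x}|\mathbf{x}_0,\omega)$
    its transfer function, typically a free-field Green's function; WFS
    provides an explicit solution for $D$.\\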
    Several free and open-source renderer applications exist for WFS
    environments, at varying degrees of feature richness.\\
    The proposed work will focus on one of them and its extension towards
    WFS on large scale systems.

  \section{Free and open-source wave field synthesis renderers}
    To date there exist three known free and open-source Wave Field Synthesis
    renderers, all of which are \href{http://jackaudio.org/}{JACK Audio
    Connection Kit (JACK)} \citep{website:jackaudio2016} clients:
    \begin{itemize}
      \item \href{https://sourceforge.net/projects/swonder/}{sWONDER} \citep{website:swonder2016},
      developed by Technische Universität Berlin, Germany 
      \item \href{https://github.com/GameOfLife/WFSCollider}{WFSCollider} \citep{website:wfscollider2016},
        developed by \href{http://gameoflife.nl/en}{Game Of Life Foundation} \citep{website:gameoflife2016},
        The Hague, Netherlands
      \item \href{http://spatialaudio.net/ssr/}{SoundScape Renderer (SSR)} \citep{website:ssr2016},
        developed by Quality \& Usability Lab, Deutsche Telekom Laboratories
        and TU Berlin and Institut für Nachrichtentechnik, Universität Rostock 
    \end{itemize}
    Currently only WFSCollider and the SSR are actively maintained and
    developed; sWONDER, although still used in some setups, is thus losing
    significance. The three renderers follow different concepts, which are
    briefly explained in the following sections.

    \subsection{sWONDER}
      sWONDER \citep{baalman2007} consists of a set of C++ applications that provide binaural and
      WFS rendering. In 2007 it was specifically redesigned
      \citep{baalmanetal2007} to cope with large scale WFS setups in which
      several (computer) nodes, providing several speakers each, drive a system
      together.\\
      In these setups, each node redundantly receives all available audio
      streams (each representing one virtual audio source), and a master
      application signals which node is responsible for rendering which
      source on which speaker.\\
      sWONDER uses Open Sound Control (OSC) for messaging between its parts
      and for setting its controls. Apart from that, it can be controlled
      through a Graphical User Interface (GUI) that was specifically designed
      for it. Unfortunately, sWONDER has not been actively maintained for
      several years, has a complex setup chain, and suffers from many bugs
      that are unlikely to be fixed any time soon.

    \subsection{WFSCollider}
      WFSCollider was built on top of
      \href{https://supercollider.github.io}{SuperCollider} 3.5
      \citep{website:supercollider2016} and is also capable of driving large
      scale systems. It uses a different approach in doing so, though:
      whereas sWONDER distributes all audio streams to each node, WFSCollider
      keeps the audio files to be played on all machines and synchronizes
      playback between them.\\
      It has a feature-rich GUI in the ``many window'' style, offering
      timelines and movement of sources by harnessing what sclang (the
      SuperCollider programming language) has to offer.\\
      As WFSCollider basically is SuperCollider plus extra features, it is
      also an OSC-enabled application and can thus also be used for mere
      multi-channel playback of audio.\\
      Although it has many useful features, it requires Mac OS X to run (a
      Linux version is still untested), is built upon a rather old version of
      \href{https://supercollider.github.io}{SuperCollider} and, due to its
      many core changes, is unlikely ever to be merged back into it.

    \subsection{SoundScape Renderer}
      The SoundScape Renderer (SSR), also a C++ application, running on Linux
      and Mac OS X, is a multi-purpose spatial audio renderer: it is not only
      capable of binaural synthesis and WFS, but also of Higher-Order
      Ambisonics and Vector Base Amplitude Panning.\\
      It can be used headless or with a GUI, which depicts the virtual
      sources, their volumes and positions, along with the speakers currently
      used for rendering a selected source.\\
      The SSR uses TCP/IP sockets for communication and thus is not directly
      OSC-enabled. This functionality can however be achieved by combining it
      with other applications such as \href{http://puredata.info}{PureData}
      \citep{website:puredata2016}.\\
      Unlike the two renderers above, the SSR is not able to drive large
      scale WFS setups, as it lacks the features to communicate between
      instances of itself running on several computers, each serving a subset
      of the available loudspeakers.

  \section{Extending SoundScape Renderer functionality}
    The SSR, due to its diverse set of rendering engines, which are made
    available through an extensible framework, and its clean codebase, is a
    good candidate for future large scale WFS setups. This type of feature is
    not yet implemented, though, and will need testing.\\
    I therefore propose the implementation and testing of said feature,
    making the SSR capable of rendering on large scale WFS setups with many
    nodes controlled by a master instance.\\
    The sought implementation is inspired by the architecture of sWONDER, but
    instead of creating many single purpose applications, the master/node
    feature will be made available through command line flags to the ssr
    executable. This mechanism is already actively used, e.g.\ for selecting
    one of the several rendering engines.
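    As a sketch, starting a master and a node could then look as follows; the
    \texttt{--master}, \texttt{--node} and related flags shown here are
    hypothetical and merely illustrate the intended interface:
    \begin{lstlisting}[language=bash]
# hypothetical: start a master instance that controls two renderer nodes
ssr --wfs --master --nodes=node1.local:4711,node2.local:4711 scene.asd
# hypothetical: start a node instance that serves loudspeakers 1-32 only
ssr --wfs --node --listen=4711 --outputs=1-32 scene.asd
    \end{lstlisting}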
    \begin{figure}[!htb]
      \centering
      \includegraphics[scale=0.9, trim = 31mm 190mm 24mm 8mm, clip]{ssr-networking.pdf}
      \caption{A diagram displaying the SSR master/node setup with TCP/IP
        socket connections over network (green lines), audio channels (red dashed
        lines) and OSC connection (blue dashed line). Machines are indicated as red
        dashed rectangles and connections to audio hardware as outputs of SSR
        nodes as black lines below them.} 
      \label{fig:ssr-networking}
    \end{figure}
    While the SSR already has internal logic to determine which loudspeaker
    will be used for which virtual audio source, this will have to be
    extended so that it also knows which renderer node has to render which
    source on which loudspeaker (see Figure~\ref{fig:ssr-networking}).
    To achieve the above features, the SSR's messaging (and thus also
    settings) capabilities have to be extended alongside its internal logic
    concerning the selection of output channels (and the master to node
    notification thereof). To introduce as little redundant code as possible,
    a ``the client knows all'' setup is most likely desirable, in which each
    node knows about the whole setup, but is also set to only serve its own
    subset of loudspeakers in it. This will make sure that the rendering
    engine remains functional in a small scale WFS setup as well.\\
    The lack of direct OSC functionality, as provided by the two other
    renderers, will not be problematic, as master and nodes can communicate
    directly through their built-in TCP/IP sockets and the master can, if
    needed, be controlled via OSC.
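    To illustrate the ``client knows all'' idea, a per-node configuration
    could be sketched as follows; all section and key names are hypothetical
    and do not correspond to existing SSR options:
    \begin{lstlisting}[language=Ini]
[setup]
; the full loudspeaker setup is known to the master and to every node
total_loudspeakers = 192
nodes = node1, node2, node3

[node1]
; each node nevertheless only serves its own subset of the loudspeakers
outputs = 1-64
listen_port = 4711
    \end{lstlisting}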

  \section{Preliminaries}
    In preparation for this exposé I tried to implement a side-by-side
    installation, using Arch Linux on a medium scale setup, namely the WFS
    system of the Electronic Studio at TU Berlin. Unfortunately the
    proprietary Dante driver used in that system is very complex to build, as
    well as underdeveloped, and thus keeps the system from being easily
    updated. Easy updates are needed for testing purposes (finding a suitable
    real-time, low-latency Linux kernel), trying out new software features,
    building new software and keeping a system secure. The driver will most
    likely require changes to the hardware, due to hardware branding
    implemented by the vendor, and thorough testing before usage.\\
    Although eventually using a proper WFS setup for testing will be necessary,
    it is luckily not needed for implementing the features, as they can already be
    worked out using two machines running Linux, JACK and the development version
    of SSR.\\
    The hardware of the large scale setup at TU Berlin in room H0104 is
    currently about to be updated, which makes it a valuable candidate for
    testing the sought SSR features.

  \section{Schedule}
    I propose a six month schedule for implementing and testing the changes
    to the source code and for writing an accompanying thesis. The following
    rough schedule should serve as a guideline for the realization of the
    work:\\
    \begin{tabular}{|l|l|l|l|}
      \hline
      \multicolumn{4}{|c|}{\textbf{Schedule}}\\
      \hline
      \textbf{Week} & \textbf{Implementation} & \textbf{Tests} & \textbf{Thesis} \\
      \hline
      1 & Reading into codebase & & \\
      \hline
      2 & Reading into codebase & & \\
      \hline
      3 & Reading into codebase & & \\
      \hline
      4 & Reading into codebase & & \\
      \hline
      5 & Assessing changes & & Documentation \\
      \hline
      6 & Assessing changes & & Documentation \\
      \hline
      7 & Implementing changes & & \\
      \hline
      8 & Implementing changes & & \\
      \hline
      9 & Implementing changes & & \\
      \hline
      10 & Implementing changes & & \\
      \hline
      11 & Implementing changes & & \\
      \hline
      12 & Implementing changes & & \\
      \hline
      13 & Implementing changes & & Preparation\\
      \hline
      14 & Implementing changes & & Preparation\\
      \hline
      15 & & Small scale setup & Writing\\
      \hline
      16 & & Large scale setup & Writing\\
      \hline
      17 & & Large scale setup & Writing\\
      \hline
      18 & & Large scale setup & Writing\\
      \hline
      19 & Large scale setup (scripts) & & Writing\\
      \hline
      20 & Large scale setup (scripts) & & Writing\\
      \hline
      21 & Large scale setup (scripts) & & Writing\\
      \hline
      22 & & & Writing\\
      \hline
      23 & & & Writing\\
      \hline
      24 & & & Writing\\
      \hline
    \end{tabular}
  \pagebreak
  \bibliographystyle{plainnat}
  \bibliography{bib/ssr-networking}
\end{document}