From f05ef473df532d5095f227bed5304a754b90f339 Mon Sep 17 00:00:00 2001 From: David Runge Date: Fri, 30 Jun 2017 18:29:38 +0200 Subject: thesis/thesis.tex: Expanding and fixing subsections about the various spatial audio renderers. --- thesis/thesis.tex | 187 ++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 133 insertions(+), 54 deletions(-) diff --git a/thesis/thesis.tex b/thesis/thesis.tex index ba23607..6516265 100644 --- a/thesis/thesis.tex +++ b/thesis/thesis.tex @@ -56,6 +56,7 @@ parskip=never]{paper} \newacronym{fdl}{FDL}{GNU Free Documentation License} \newacronym{gpl}{GPL}{GNU General Public License} \newacronym{gui}{GUI}{Graphical user interface} +\newacronym{ide}{IDE}{Integrated Development Environment} \newacronym{lgpl}{LGPL}{GNU Lesser General Public License} \newacronym{lts}{LTS}{Long Term Support} \newacronym{hoa}{HOA}{Higher Order Ambisonics} @@ -89,7 +90,29 @@ parskip=never]{paper} \newglossaryentry{stdout}{ name={stdout}, description={The standard output is a stream where a program writes its - output data to. This can be a log file or a terminal}, + output data to. 
This can be a log file or a terminal} +} +\newglossaryentry{faust}{ + name={FAUST}, + description={Functional Audio Stream is a functional programming language + specifically designed for realtime signal processing and synthesis} +} +\newglossaryentry{quark}{ + name={Quark}, + description={Name for Classes extending the SuperCollider programming + language, usually developed in a separate version controlled code + repository}, + plural=Quarks +} +\newglossaryentry{supercollider}{ + name={SuperCollider}, + description={A programming language, \gls{ide} and synthesis server for + realtime audio processing and synthesis} +} +\newglossaryentry{qt4}{ + name={Qt4}, + description={Version 4 (legacy) of the cross-platform application framework + for creating desktop applications.} } @@ -218,13 +241,12 @@ parskip=never]{paper} \begin{itemize} \item sWONDER \citep{website:swonder2016}, developed by Technische Universität Berlin, Germany - \item WFSCollider \citep{website:wfscollider2016}, developed by - \href{http://gameoflife.nl/en}{Game Of Life Foundation} - \citep{website:gameoflife2016}, The Hague, Netherlands - \item HoaLibrary for \gls{pd} \citep{website:hoalibraryforpd} developed - at \gls{cicm}, Paris, France - \item 3Dj for SuperCollider \citep{thesis:perezlopez3dj2014}, developed - at Universitat Pompeu Fabra, Barcelona + \item WFSCollider \citep{website:wfscollider}, developed by the Game Of + Life Foundation \citep{website:gameoflife}, The Hague, Netherlands + \item HoaLibrary for \gls{pd} \citep{github:hoalibraryforpd} developed at + \gls{cicm}, Paris, France + \item 3Dj for \gls{supercollider} \citep{thesis:perezlopez3dj2014}, + developed at Universitat Pompeu Fabra, Barcelona \item \gls{ssr} \citep{website:ssr2016}, developed by Quality \& Usability Lab, Telekom Innovation Laboratories, TU Berlin and Institut für Nachrichtentechnik, Universität Rostock and Division of Applied @@ -237,20 +259,21 @@ parskip=never]{paper} \subsection{Spatial audio renderers and 
their appliance} \label{subsec:spatialaudiorenderersandtheirappliance} + \subsubsection{Wave Field Synthesis} \label{subsubsec:wavefieldsynthesis} - \gls{wfs} describes a spatial technique for rendering - audio. As such it aims at synthesizing a sound field of desired acoustic - preference in a given listening area, assuming a planar reproduction to be - most suitable for most applications.\\ - \gls{wfs} is typically implemented using a curved or linear loudspeaker array - surrounding the listening area.\\ - Several free and open-source renderer applications exist for \gls{wfs} - environments, with varying stages of feature richness.\\ - The proposed work will focus on one of them and its extension towards \gls{wfs} - on large scale systems. + \gls{wfs} describes a spatial technique for rendering + audio. As such it aims at synthesizing a sound field of desired acoustic + preference in a given listening area, assuming a planar reproduction to be + most suitable for most applications.\\ + \gls{wfs} is typically implemented using a curved or linear loudspeaker array + surrounding the listening area.\\ + Several free and open-source renderer applications exist for \gls{wfs} + environments, with varying stages of feature richness.\\ + \subsubsection{Higher order ambisonics and vector based amplitude panning} \label{subsubsec:hoaandvbap} + \subsubsection{Binaural (Room) Synthesis} \label{subsubsec:binaural} @@ -265,54 +288,107 @@ parskip=never]{paper} represent one virtual audio source respectively) redundantly and a master application signals which node is responsible for rendering what source on which speaker.\\ - It uses \gls{osc} for messaging between its parts and for - setting its controls. Apart from that, it can be controlled through a - \gls{gui}, that was specifically designed for it. - Unfortunately sWONDER has not been actively maintained for several years, - has a complex setup chain and many bugs, that are not likely to get fixed - any time soon. 
+ It uses \gls{osc} for messaging between its components and for setting + its controls. Additionally, it can be controlled through a \gls{gui} + that was specifically designed for it.\\ + Sound sources can be moved dynamically, or according to an \gls{xml} based + score.\\ + sWONDER has been in use for the medium and large scale \gls{wfs} systems + in the Electronic Music Studio \citep{website:tu-electronic_studio} and + lecture hall H0103 \citep{website:tu-wfs} at the Technical University of + Berlin and a medium scale system at the Wave Field Synthesis Lab at HAW + in Hamburg \citep{Fohl2013}.\\ + The included convolution engine fWonder has found application in + “Assessing the Authenticity of Individual Dynamic Binaural Synthesis” + \citep[pp. 223--246]{lindau2014}.\\ + Unfortunately, the spatial audio renderer has not been actively + maintained for several years, is limited to its two rendering algorithms + and has many bugs that are not likely to get fixed in the future.\\ \subsection{HoaLibrary (PureData extension)} \label{subsec:hoalibrary_puredata_extension} + The HoaLibrary is “a collection of C++ and \gls{faust} classes and + objects for Max, PureData and VST destined to high order ambisonics sound + reproduction” \citep{website:hoalibrary}. 
By using its \gls{pd} + extension, it enables \gls{hoa} reproduction, while the + rich feature set of the audio programming language still allows for + implementing other forms of spatial rendering alongside the HoaLibrary.\\ + \gls{pd} is \gls{osc} capable with the help of extensions, such as + \textit{mrpeach}\footnote{ \href{https://puredata.info/downloads/mrpeach} + {https://puredata.info/downloads/mrpeach}} or \textit{IEMnet}\footnote{ + \href{https://puredata.info/downloads/iemnet} + {https://puredata.info/downloads/iemnet}}.\\ + \subsection{3Dj (SuperCollider Quark)} \label{subsec:3dj_supercollider_quark} + 3Dj is a \gls{supercollider} \gls{quark} conceived in the course of a master's + thesis at Universitat Pompeu Fabra, Barcelona + \citep{thesis:perezlopez3dj2014} for live spatialization in interactive + performances. It implements \gls{hoa} and \gls{vbap} rendering + \citep[p. 45]{thesis:perezlopez3dj2014} and uses a specific scene format + \citep[pp. 45--46]{thesis:perezlopez3dj2014} to allow sound sources to + have static, linear, random, Brownian, simple harmonic and orbital + motion.\\ + Due to being a language extension to \gls{sclang}, 3Dj can be used in + conjunction with other spatial rendering algorithms provided by + \gls{supercollider} or any of its \glspl{quark}.\\ + \gls{supercollider} is \gls{osc} enabled by default, which renders 3Dj a + dynamically accessible solution. \subsection{WFSCollider} \label{subsec:wfscollider} WFSCollider was built on top of \href{https://supercollider.github.io}{SuperCollider} 3.5 - \citep{website:supercollider} and is also capable of driving large - scale systems. 
It uses a different approach in - doing so, though: Whereas with sWONDER all audio streams are distributed - to each node, WFSCollider usually uses the audio files to be played on - all machines simultaneously and synchronizes between them.\\ - It has a feature-rich \gls{gui} in the \textit{many window} style, making available - time lines and movement of sources through facilitating what the - \gls{sclang} has to offer.\\ - As WFSCollider basically is SuperCollider plus extra features, it is also - an \gls{osc} enabled application and can thus also be used for mere - multi-channel playback of audio.\\ - Although it has many useful features, it requires MacOSX (Linux version - still untested) to run, is built upon a quite old version of - \href{https://supercollider.github.io}{SuperCollider} and is likely never - to be merged into it, due to many core changes to it. + \citep{website:supercollider} and, as its name suggests, is an application + for \gls{wfs} reproduction. It “allows soundfiles, live input and + synthesis processes to be placed in a score editor where start times, and + durations can be set and trajectories or positions assigned to each + event. It also allows realtime changement of parameters and on the fly + starting and stopping of events via \gls{gui} or \gls{osc} control. Each + event can be composed of varous objects (“units”) in a processing chain” + \citep{website:wfscollider}. According to its current manual, it is + also capable of using a \gls{vbap} renderer for other multi-speaker + setups \citep[p. 8]{manual:wfscollider}.\\ + “WFSCollider is the driving software of the Wave Field Synthesis system + of the Game Of Life Foundation” \citep{website:gameoflife}. In + multi-computer setups, it can synchronize the involved processes and a + dynamic latency can be introduced to account for high network throughput + \citep[p. 22]{manual:wfscollider}. WFSCollider by nature is \gls{osc} + capable and extensible with what \gls{sclang} has to offer. 
Its scores are + saved as \gls{supercollider} code, as well.\\ + It is currently only tested on MacOSX and is based upon a several-year-old + version of \href{https://supercollider.github.io}{SuperCollider}. \subsection{SoundScape Renderer} \label{subsec:soundscaperenderer} - \gls{ssr}, also a C++ application, running on Linux and - MacOSX, is a multi-purpose spatial audio renderer, as it is not only - capable of \gls{bs} and \gls{wfs}, but also \gls{hoa} - and \gls{vbap}.\\ - It can be used with a \gls{gui} or headless (without one), depicting the - virtual sources, their volumes and positions, alongside which speakers - are currently used for rendering a selected source. - \gls{ssr} uses TCP/IP sockets for communication and is therefore not directly - \gls{osc} enabled. This functionality can be achieved using the capapilities of + The \gls{ssr}, written in C++, is a multi-purpose spatial audio renderer + that runs on Linux and MacOSX. Based on its underlying \gls{apf} + \citep{MatthiasGeierTorbenHohn1890}, it is able to use \gls{bs}, + \gls{brs}, \gls{aap}, \gls{wfs}, \gls{hoa} and \gls{vbap}.\\ + It can be used headless (without a \gls{gui}) or with a \gls{qt4} based + \gls{gui} that depicts the virtual sources, their volumes and positions. If a + loudspeaker based renderer is chosen, the \gls{gui} also illustrates + which speakers are currently used for rendering a selected source.\\ + The \gls{bs} and \gls{brs} renderers are frequently used in scientific + research, such as in \citep{DavidAckermann1895} or + \citep{DmitryGrigoriev1896}. 
The \gls{wfs} renderer has been improved by + the work of several research papers dealing with enhancements of spatial + aliasing, active listening room and loudspeaker compensation and active + noise control \citep{SaschaSporsRudolfRabensteinJensAhrens1822} and + analyzing and pre-equalizing in 2.5-dimensional \gls{wfs} + \citep{SaschaSporsJensAhrens1821}.\\ + The \gls{ssr} uses \gls{xml} based configuration files for reproduction + (i.e.\ how something is played back) and scene (i.e.\ what is played + back). The \gls{asdf}, however, is not (yet) able to represent dynamic + setups.\\ + The application can be controlled through a \gls{tcp}/\gls{ip} socket. + \gls{osc} functionality can only be achieved using the capabilities of other applications such as \gls{pd} \citep{website:puredata2016} in - combination with it though.\\ - Unlike the two renderers above, the \gls{ssr} is not able to run large-scale - \gls{wfs} setups, as it lacks the features to communicate between instances of - itself on several computers, while these instances serve a subset of the + combination with it.\\ + Unlike \nameref{subsec:swonder} or \nameref{subsec:wfscollider}, the + \gls{ssr} is not able to run medium or large-scale \gls{wfs} setups, as + it lacks the features to communicate between instances of itself on + several computers, while these instances serve a subset of the available loudspeakers. \subsection{Why free software matters and what its pitfalls are} @@ -522,7 +598,10 @@ parskip=never]{paper} \subsubsection{OSC through PureData} \label{subsubsec:osc_through_puredata} To allow \gls{osc} communication, the \gls{ssr} incorporates a Lua - based \gls{pd} external. It uses two externals (iemnet and pdlua) + based \gls{pd} external. It uses two externals + (\textit{IEMnet}\footnote{ + \href{https://puredata.info/downloads/iemnet} + {https://puredata.info/downloads/iemnet}} and pdlua) alongside a Lua library for parsing and creating \gls{xml} (SLAXML). 
\subsubsection{Sending and receiving} @@ -616,7 +695,7 @@ parskip=never]{paper} through that many use-cases in free and closed audio and video related applications (e.g. Ardour \citep{website:ardour}, Cubase \citep{website:steinberg}, Max/MSP \citep{website:cycling74}, - SuperCollider \citep{website:supercollider}) since then.\\ + \gls{supercollider} \citep{website:supercollider}) since then.\\ \gls{osc}'s syntax is defined by several parts, which are discussed briefly in this section.\\ -- cgit v1.2.3-70-g09d2