Programming Digital Audio Server backend with Raku
FOSDEM2021, 6th February 2021
Audio web services

When we talk about sound processing on a remote server or in the cloud, we mean a set of various web audio services: AI composers, recognizers (stylistic classifiers, plagiarism scanners, audio content reviewers) or co-creativity tools.

The core features, like sound processing or synthesis, can be provided on the server or the client side.

Today we have more or less rich client-side (browser) tools such as the Web Audio API and libraries like ToneJS and WavesJS.

Client or Server side

Processing on the client is a good fit for P2P web audio services. Examples: audio file editing, mixing audio streams, adding effects or visualizing sound.

Processing on the server is perfect for: AI music composition, mixing audio streams in a many-to-one model, and sound processing with specific algorithms (not supported by client-side libraries).

In short, client-side processing is a good choice for decentralized tasks, while server-side processing is perfect for cases where we need to apply centralized control.

Server side specifics

The fundamental differences between client and server side processing are:

  1. The server runs on a Linux platform;
  2. We have no GUI;
  3. The TCP/IP stack is the only data transport.

Both the client- and server-side cases require a specific software layer with the required audio processing features. But the server-side case is much more flexible, because we can use almost anything we want.

👉  We can define this layer on the server side as a headless audio backend.

What is an audio backend

In general, the audio backend handles requests from clients, routes them to the audio engine and returns the responses. The backend has at least two layers:

  1. Client service;
  2. Audio engine.

The Client service speaks with browsers and performs full-stack client management, e.g. auth, configuration, setup, billing, audio streaming, etc.

The Audio engine is the heart of server-side processing and actually performs all the DSP work.

ABC schema

ABC (Audio Engine - Balancer - Client service) is the extended model of the audio processing backend.

The Balancer is basically used to reduce the load on the Audio Engine.

The Balancer should also be used in front of a distributed network of Audio Engines: you can spread your sound processing engines across a dozen servers, and the Balancer acts as a kind of task manager in this case.

Audio engine

In general, the Audio Engine is a shared library which provides generic audio processing features.

E.g. if the web audio service is used for frequency analysis, the Audio Engine should have a set of APIs: FFT and IDFT algorithms, a normalizer, a resampler, a transfer function, a spectrogram, etc.

In the case of hardware acceleration, where audio processing is performed on dedicated devices, Audio Engines should have an additional low-level layer with driver syscalls or inline assembler.

👉  Audio Engines are traditionally written in fast languages: C/C++.

Balancer

The Balancer is basically used to reduce the load on the Audio Engine. Cases:

  1. Single Audio Engine: actually no Balancer is needed. But if the Audio Engine APIs are single-threaded, the Balancer can run them simultaneously in parallel threads, which makes sense on multicore platforms;
  2. Multiple Audio Engines: the Balancer is used to dispatch a task to an idle engine (or an idle processor core on one of the engines);
  3. Hardware acceleration: the Balancer sorts the tasks and selects which ones to process on the CPU and which on the DSP.

All these cases require an explicit implementation of task management in the Balancer (policies, scheduling, task fetching/cancellation); case 1 is sketched below.
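
For illustration, here is a minimal Raku sketch of case 1: one engine, one worker thread per CPU core. run-task() is a hypothetical wrapper, not a real Audio Engine API; it stands in for a single-threaded engine call that is safe to invoke from parallel threads.

use NativeCall;

# a shared queue of incoming tasks
my Channel $tasks .= new;

# hypothetical wrapper around a single-threaded Audio Engine API call
sub run-task($task) { say "processing task $task" }

# one worker per CPU core fetches tasks from the shared channel
my @workers = do for ^$*KERNEL.cpu-cores {
    start {
        for $tasks.list -> $task {
            run-task($task);
        }
    }
}

$tasks.send($_) for 1..8;   # enqueue eight tasks
$tasks.close;               # no more tasks: workers drain the queue and exit
await @workers;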

Client controller

As the front-end client controller, we assume something like a regular Web content management system (CMS).

A CMS allows non-technical users to make changes to an existing website with little or no training; it is essentially a website maintenance tool for non-technical administrators. A few specific CMS features:

  1. Presentation and administration layers;
  2. Web page generation via templates;
  3. Scalability and expandability via modules;
  4. WYSIWYG editing tools;
  5. Workflow management (access levels, roles).

What's JRP

Considering ABC model, we are talking about backend software components.

Considering JRP, we are talking about programming tools: languages, frameworks and libraries. We define the audio processing backend as a JRP pipeline: JUCE + Raku + Pheix.

  1. JUCE framework: a fast, well-documented, noob-friendly audio processing framework written in C/C++;
  2. Raku: a highly capable, feature-rich programming language made for at least the next hundred years;
  3. Pheix: a content management system with data stored on a blockchain.

Why JUCE

The JUCE framework has a large set of tools and features required for audio processing. It is one of the most well-documented, actively evolving and powerful audio frameworks with Linux support.

JUCE provides tools to create headless (non-GUI, console) audio processing applications, which can be used as standalone instances or shared libraries.

The JUCE framework has a lot of components, and not only audio-related tools: JUCE includes JSON, cryptography, data structures, GUI and many other handy classes. The idea is that you can use JUCE as the base and only framework for your applications.

Why Raku

Once more: a feature-rich programming language made for at least the next hundred years ❤️ ❤️ ❤️.

But seriously, Raku has a very intuitive and clear NativeCall layer for integration with third-party libraries or applications. I have experience with SWIG and JNI, and Raku's is the simplest one, I think.

The C language is quite nicely supported by Raku's NativeCall. C++ support is marked as experimental, "not as tested and developed as C support". Nevertheless, we can use it successfully. Here's a superior HOW-TO on doing C++ calls from Raku by Andrew Shitov.
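
For the C case, a minimal illustration that binds getpid(2) from the standard C library, so no extra library is needed (libc is already loaded into the process):

use NativeCall;

# no library name: the symbol is looked up in the current process
sub getpid(--> int32) is native {*}

say 'running as PID ', getpid();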

Why Pheix

The Pheix content management system is used as the Client Service. It is currently the only CMS with native Ethereum blockchain support. For the audio industry, in terms of copyright protection, this is definitely a must-have solution, since all metadata passing through the Digital Audio Server can be stored in a distributed ledger (both private and public, for example the Görli network) and used for copyright disputes in the future.

👉  The Pheix public β-release was announced on 25 Jan, 2021.

Why Pheix? Why not!

How to integrate

In the JRP concept, Raku is the glue for the audio processing backend components on the one hand, and the high-level adapter for the JUCE shared library on the other.

Pheix is written in the Raku language, and the Audio Engine adapter connects to Pheix as an addon (external module): all we need is to implement it according to the addon development guidelines.

Pheix addons/modules are installed as regular Raku modules and need to be set up as dependencies in the Pheix global configuration file.

Sample JRP service

The sample JRP service is a passive one: we have no persistent process or any active entity. This example uses a shared library with a set of APIs for our purposes. If we need a batch or loop of sequential JUCE shared library API calls, we can:

  1. Extend the JUCE shared library with an «accumulator» function, which performs the additional logic;
  2. Do the batch or loop on the next level: in the backend's Client Service or in the client browser (multiple calls to the audio processing service).

👉  The choice depends on performance, bandwidth, etc.
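
For instance, option 2 at the Client Service level might look like this sketch; process_chunk() is a hypothetical API of the JUCE shared library, not one of the real exports:

use NativeCall;

# hypothetical per-chunk API exported by the JUCE shared library
sub process_chunk(int32 --> int32)
    is native('/usr/local/lib/libbuilt-in-shared.so') {*}

# the batch is just a loop of sequential API calls on our side
process_chunk($_) for ^10;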

JUCE app as a shared Linux library

The JUCE framework ships with Projucer, a rich JUCE project manager. It contains cross-platform demo examples and generic tools for project development and contribution.

Projucer generates makefiles and provides build options for Linux platforms.

Unfortunately, there is no out-of-the-box option to build a shared library in JUCE 4 or lower. In this case you need to patch the auto-generated Makefile. The repo with details: https://gitlab.com/pheix-juce/simpleconsole-shared

In JUCE 5 we have a shared library build option, so all you need is to implement your APIs and run make.

[Image: Projucer]

Simple shared library

As an extremely simple example, we consider a library with a single call, juce_shared():

#include <cstdio>
#include <juce_core/system/juce_StandardHeader.h>
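
// note: this file is compiled as C++, so the exported symbol name of
// juce_shared() is mangled (see the nm output in the NativeCall section)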

int juce_shared (void)
{
    printf(
        "I am JUCE %d.%d.%d shared library\n",
        JUCE_MAJOR_VERSION,
        JUCE_MINOR_VERSION,
        JUCE_BUILDNUMBER
    );
    return 0;
}

By default, Projucer generates a Makefile with -fvisibility=hidden, so take care of this (I simply commented out this line for my tests).

Raku's NativeCall

As mentioned above, there is Andrew Shitov's superior article about calling C++/Fortran from Raku.

In the case of C++, we should grep the shared library for the correct (mangled) symbol name of our function with nm libbuilt-in-shared.so | grep juce_shared.

It will output something like:

000000000029e5a9 T _Z11juce_sharedv

The _Z prefix (an underscore plus a capital letter) is reserved to avoid conflicts with user-defined names, 11 is the length of the function name, and v stands for void in place of the function parameters.
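
For example, the do_fft() call from the Audio Engine APIs shown later, double * do_fft(double * buffer), would be exported under the same mangling scheme as _Z6do_fftPd: a 6-character name followed by Pd, a pointer to double (the return type is not encoded in the mangled name).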

Sample Raku test script

We should use this symbol name in our Raku script calljuce.raku:

#!/usr/bin/env raku

use NativeCall;

sub juce_shared()
    is native('/usr/local/lib/libbuilt-in-shared.so')
    is symbol('_Z11juce_sharedv') {*}

juce_shared();

Running calljuce.raku will output:

[kostas@webtech-macbook rekucall]$ ./calljuce.raku
JUCE v5.4.5
I am JUCE 5.4.5 shared library

Implement frequency visualizer

The frequency visualizer provides the following basic functions:

  1. Resampling to a 32 kHz sampling rate (it's also handy to use ffmpeg for this);
  2. Writing out the spectrogram;
  3. Mixing the audio channels in the stereo case;
  4. FFT;
  5. Analysis;
  6. Saving the data to JSON.

👉  The Matlab/Octave model of the analysis algorithm is presented here: https://gitlab.com/neuroix/audio-to-neuro/-/blob/devel/audio2neuro.m

[Image: Spectrogram & frequency analysis]

Implement Raku adapter module

The generic features and requirements for the Raku adapter module:

  1. Installation via zef module manager;
  2. Full JUCE audio backend APIs coverage.

Currently the JUCE Audio Engine APIs are:

int save_spectrogram(char * filename);
double * do_resampling(unsigned int sample_rate, double * buffer);
double * do_mixing(double * buffer);
double * do_fft(double * buffer);
char * do_analysis(double * fft_buffer);
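
A sketch of the matching NativeCall declarations for the adapter module, assuming the APIs are exported with C linkage (extern "C", so the symbol names are not mangled), the library lives at the install path used earlier, and buffer lengths are managed inside the library:

use NativeCall;

constant LIB = '/usr/local/lib/libbuilt-in-shared.so';

# char * maps to Str, double * maps to CArray[num64]
sub save_spectrogram(Str --> int32)                        is native(LIB) {*}
sub do_resampling(uint32, CArray[num64] --> CArray[num64]) is native(LIB) {*}
sub do_mixing(CArray[num64] --> CArray[num64])             is native(LIB) {*}
sub do_fft(CArray[num64] --> CArray[num64])                is native(LIB) {*}
sub do_analysis(CArray[num64] --> Str)                     is native(LIB) {*}

# a possible pipeline for an already loaded sample buffer
my CArray[num64] $buffer .= new(0e0 xx 1024);
my $json = do_analysis(do_fft(do_mixing(do_resampling(32_000, $buffer))));
say $json;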

Integrate to Pheix and make it live

As mentioned above, Pheix addons/modules are installed as regular Raku modules and need to be set up as dependencies in the Pheix global configuration file.

Pheix is based on the concept of a hybrid CMS. On the one hand, it works as a legacy CMS when rendering global templates; on the other, the content for each page is fetched via async API requests, which is the headless CMS case.

Sample addon module: https://gitlab.com/pheix-pool/core-perl6/-/blob/develop/lib/Pheix/Addons/Embedded/User.rakumod.

Module configuration

Our module should be configured. We need to specify the routes:

  1. Route for file upload form, e.g. /das/upload;
  2. Route for uploaded files index, e.g. /das;
  3. Route for file details, e.g. /das/{fileid:<\d>+}.

Pheix uses the Router::Right module as its routing engine.

Router::Right is well-documented: you can find docs, examples, best practices and demo routes in the wiki: https://gitlab.com/pheix/router-right-perl6/wikis/home

Routes configuration
{
  "routes":{
    "das":{
      "route":{
        "path":"/das",
        "hdlr":{ "default":"das_index", "/api":"das_index_api" }
      }
    },
    "upload":{
      "route":{
        "path":"/das/upload",
        "hdlr":{ "default":"das_upload", "/api":"das_upload_api" }
      }
    },
  "browsefile":{
      "route":{
        "path":"/das/{fileid:<\d>+}",
        "hdlr":{ "default":"das_browsefile", "/api":"das_browsefile_api" }
      }
    }
  }
}

Module structure

In the JRP concept, Raku is the glue for the Pheix CMS on the one hand, and the high-level adapter for the JUCE application on the other.

Well, this means that our module should:

  1. Cover JUCE audio backend APIs;
  2. Follow Pheix CMS integration guidelines.

Let's use the terms private/public methods. Private methods should basically be focused on the JUCE audio backend APIs, while public methods will provide soft integration with Pheix.

Global config and prototypes

Our module should be added to Pheix global configuration file:

{
  "addons":{
    "group":{
      "installed":{
        "freqviz":"Pheix::Addons::Frequency::Visualizer"
      }
    }
  }
}

The prototypes for handlers (see routes config):

method das_index(UInt :$tick, :%match!);
# the same for: das_upload() and das_browsefile()

method das_index_api(Str :$route, UInt :$tick, Hash :$sharedobj!, :%match!);
# the same for: das_upload_api() and das_browsefile_api()
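
Putting it together, a structural sketch of the addon: the method bodies are illustrative stubs, and the NativeCall binding repeats the C-linkage and install-path assumptions from the Audio Engine section.

unit class Pheix::Addons::Frequency::Visualizer;

use NativeCall;

# binding into the JUCE Audio Engine shared library
sub save_spectrogram(Str --> int32)
    is native('/usr/local/lib/libbuilt-in-shared.so') {*}

# private method: wraps a JUCE audio backend API call
method !process(Str $filename) {
    save_spectrogram($filename);
}

# public methods: soft integration with Pheix, following the
# handler prototypes above
method das_index(UInt :$tick, :%match!) {
    # render the uploaded files index (legacy CMS mode)
}

method das_index_api(Str :$route, UInt :$tick, Hash :$sharedobj!, :%match!) {
    # headless mode: return content for the async API request
}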

It works!

Frequency visualizer web service prototype: https://neuroix.narkhov.pro
Matlab/Octave models: https://gitlab.com/neuroix/audio-to-neuro
JUCE workaround: https://gitlab.com/pheix-juce
Pheix integration: https://gitlab.com/pheix-pool/core-perl6/-/issues/106

Perspectives

There was a great workshop, «Writing applications with JUCE audio backend and JavaScript frontend (React Native/Electron)», at the Audio Developer Conference 2019.

The basic idea was to use a static headless JUCE audio backend with a GUI written in JavaScript using the React Native/Electron libraries. This kind of GUI is platform-independent: it looks the same on iOS, Android, Windows, macOS and Linux.

The JUCE backend and the JavaScript GUI were running on the same workstation, and now I'll try to deploy them to a client-server infrastructure. The big deal is to route the sound stream from the server to the client and to put Pheix in the middle.

WIP 😇

Open call and donate

I would like to invite you to join the JUCE, Raku and Pheix development process: code reviews, forks and merge requests are very welcome:

https://github.com/juce-framework/JUCE

https://github.com/Raku/roast

https://gitlab.com/pheix-pool/core-perl6

If you like any of the ideas or concepts presented in this talk, let's get in touch and discuss, or just donate at: https://pheix.org/#donate

The End