[Interview] Daniel Iglesia, San Francisco (Audio Engineer, iOS/Android Developer)

Daniel Iglesia is the author of the popular music software MobMuPlat, MiniMash, and SpaceLab. He is also a musician and composer whose works have been showcased in prestigious venues and festivals worldwide. Daniel holds a doctorate in Music Composition from Columbia University and has taught at renowned institutions such as Columbia, Pratt, and Princeton. He is also a member of various ensembles and has developed innovative tools. He currently works at YouTube in the San Francisco Bay Area.

Everybody who knows PureData knows MobMuPlat, but we are not so educated on SpaceLab and MiniMash – can you give a brief explanation of those?

I was out of grad school in New York City around 2010 and needed a way to support myself. iOS development was still a pretty fresh field at the time, and I did some freelance work for artists and musicians who wanted to incorporate mobile phones into their practice.

MiniMash was my first venture into releasing my own app; I already had experience with time stretching and beat detection, so I ported some existing analysis and playback code into the app. While building it, I also learned that engineers should not design UI: my initial UI looked horrible, and an external designer remade it (including the cool dot-matrix font).

There's a video of it at vimeo.com/25455527. It was a moderate success at the time, by the standards of solo app developers, and it led to more mobile-development freelance work (and an eventual switch to being a full-time mobile developer).

SpaceLab (vimeo.com/38200058) was a follow-up, and was the first time I used libpd to put PureData in a mobile app. I wanted to experiment with various synthesis interfaces in a mobile form factor; while there's a traditional-looking keyboard, you could also play it with a big X/Y touch pad, or by blowing into the mic and using it like a wind controller. The synth engine is all in Pd, and that patch continued to evolve for use in other projects (e.g. pieces for ensembles).

Sadly, neither of those apps is currently available! In the case of MiniMash, other apps caught up with similar features. Plus, the app depends on having a local library of music files, which became much less common as people switched from storing music locally to using streaming services.

In which IDE and language did you program MobMuPlat? How hard was it to include libpd?

MobMuPlat on iOS was written in Objective-C; the macOS UI editor was also written in Objective-C, but against the desktop frameworks.

On Android, it's in Java; the cross-platform UI editor is also in Java, on the Swing UI framework (which is deprecated and probably won't be officially supported for much longer).

So that’s 4 different codebases, which made feature development slow.

In the case of iOS, the libpd library (and its dependency on pure-data as a submodule) is straightforward to include as a compilation dependency. Getting that library dependency to work on Android is a bit harder, so I've depended on Tal Kirshboim to create and publish precompiled libpd-for-android binaries in repositories, and I build against those. Not ideal in terms of keeping up with cutting-edge libpd/pd changes, though.
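As a rough illustration (a sketch, not MobMuPlat's actual code), this is roughly what bringing up libpd looks like on Android once the pd-for-android library is on the classpath; the patch file and receive name here are hypothetical:

```java
import java.io.File;
import java.io.IOException;

import android.content.Context;

import org.puredata.android.io.AudioParameters;
import org.puredata.android.io.PdAudio;
import org.puredata.core.PdBase;

public class PdHost {
    // Open a patch and start audio. 'patchFile' is a .pd file already unpacked to local storage.
    public static void start(Context context, File patchFile) throws IOException {
        AudioParameters.init(context);
        int sampleRate = AudioParameters.suggestSampleRate();
        // no input channels, stereo output, 8 Pd ticks per buffer, restart audio if already running
        PdAudio.initAudio(sampleRate, 0, 2, 8, true);
        PdBase.openPatch(patchFile);        // load the patch into the Pd instance
        PdAudio.startAudio(context);        // start the audio thread
        PdBase.sendFloat("volume", 0.8f);   // hypothetical receive name inside the patch
    }

    public static void stop() {
        PdAudio.stopAudio();
        PdBase.release();
    }
}
```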

For the last few years, I've been doing most of my work in Flutter (a cross-platform app development framework); I'm hoping that one day I can rewrite MobMuPlat _and_ the UI editor in Flutter, which would combine a ton of model objects, business logic, and UI logic into a single definition.

Do you think smartphones running Android are a good hardware platform for real-time audio software? Will Raspberry Pi eventually be a better target?

Complicated question! The fragmentation of the Android ecosystem means different chipsets and firmware have had wildly different latencies for real-time audio I/O. Over the last several years, firmware, OS, and library support has, for the most part, greatly improved this on average. 

If you're a Linux hacker, though, you're probably going to have more control, and a better chance at the lowest possible latency, on a more open Linux system like the RPi. I haven't done much with real-time audio I/O on either platform, nor have I measured audio latency recently.

Both platforms have been satisfactory for my use cases, and I've developed ensemble pieces for both. However, I tend to lean towards using mobile phones (either iOS or Android) for my ensemble work; if I'm making a piece for 8 players, I'd rather they run the piece on their phones instead of my having to build 8 custom hardware systems (and mail them out!). Plus, commercial phone hardware has everything built into one package (battery, Wi-Fi, touch screen, audio I/O); sourcing all of that for a custom RPi system is probably more expensive than a used Android phone.

Some exceptions arise when using a heavyweight computer vision algorithm in a piece; I've used the NVIDIA kits (e.g. a Jetson TX2 or a Jetson Nano), and they are usually more powerful (and yield a higher framerate) than a mobile phone's camera and hardware. But getting those systems running, and working with custom camera input, is usually very time-consuming, if it works at all. And even in that case, I've fallen back on using a quartet of iPhones instead of building a quartet of custom systems; it was cheaper and far more portable.

I’m also sensitive to the learning curve involved in custom hardware/software; the Pd community is very invested in Linux systems. I love open systems and open-source software, but we need to be honest about the learning curve involved. Someone shouldn’t need to be a Linux hacker in order to make music with Pd. 

When I was first working on MobMuPlat, I told a prominent electronic-ensemble leader (who insisted on using only open-source OSes and software for his work) about my plan to help people run Pd on phones. He scoffed and said, 'Who would want to do that?' For him, technological gatekeeping was more important than letting new audiences work with audio software.

Are you incorporating any smartphone hardware sensors into your instrument designs?

My most common interface, beyond the capabilities of the hardware itself (e.g. touch and tilt), is the tether controller; it's a USB 6-axis continuous controller, originally made for a golf video game, which has become widely used in the experimental ensemble world. Here's a video of it in action in a piece from a few years ago, where it controls a range of synthesis values (which are visualized for 3D glasses): vimeo.com/353486861

I've recently been experimenting with video tracking, and made a piece (vimeo.com/859531145) using a combination of several computer vision models (hand, body/pose, face, etc.) to control a patch. This was with MediaPipe (developers.google.com/mediapipe, which I also occasionally interact with in my day job) running on phones.
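As a rough, hypothetical sketch of the general idea (not the actual code from that piece): a vision model reports normalized landmark coordinates once per frame, and those can be mapped onto synthesis parameters in a Pd patch via libpd. The receive names and ranges below are made up:

```java
import org.puredata.core.PdBase;

public class VisionToPd {
    // Exponentially map a normalized value in [0, 1] to [min, max]; useful for frequencies.
    static float expMap(float norm, float min, float max) {
        return (float) (min * Math.pow(max / min, norm));
    }

    // Called once per analyzed video frame with a chosen landmark's normalized position.
    public static void onLandmark(float x, float y) {
        PdBase.sendFloat("pitch", expMap(x, 110f, 1760f));        // hypothetical: hand x -> Hz
        PdBase.sendFloat("cutoff", expMap(1f - y, 200f, 8000f));  // hand height -> filter cutoff
    }
}
```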

A long-running project of mine, with a few of my favorite collaborators, is The Gaits, a sound walk on the High Line (elevated walkway) in New York City, which has been running (almost) every winter solstice since 2011: icareifyoulisten.com/2011/12/the-gaits-a-soundtrack-for-the-high-line/

It hasn't run during Covid, but we are working on a new version for this winter. It uses the phone's accelerometer to detect steps, and uses location to determine where on the walkway you are.
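As a rough sketch of how step detection might work on Android (an assumption, not the actual code behind The Gaits): count crossings of an acceleration-magnitude threshold, with a short refractory period to avoid double counting:

```java
import android.content.Context;
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;

public class StepDetector implements SensorEventListener {
    private static final float THRESHOLD_MS2 = 12f;    // m/s^2, above the ~9.8 of resting gravity
    private static final long REFRACTORY_MS = 300;     // minimum time between counted steps
    private long lastStepMs = 0;
    private int stepCount = 0;

    public void start(Context context) {
        SensorManager sm = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
        Sensor accel = sm.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
        sm.registerListener(this, accel, SensorManager.SENSOR_DELAY_GAME);
    }

    @Override
    public void onSensorChanged(SensorEvent event) {
        float x = event.values[0], y = event.values[1], z = event.values[2];
        double magnitude = Math.sqrt(x * x + y * y + z * z);
        long now = System.currentTimeMillis();
        if (magnitude > THRESHOLD_MS2 && now - lastStepMs > REFRACTORY_MS) {
            lastStepMs = now;
            stepCount++;    // e.g. advance the soundtrack, or send a bang to a Pd patch
        }
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) {}
}
```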

Today, when mobile devices are getting cheaper and are easy to interconnect, do you think MIDI over Wi-Fi, OSC over Wi-Fi, or something else is the best choice for communication in a distributed musical system?

Sideband (sidebandband.com, of which I am a member) and its parent ensemble, the Princeton Laptop Orchestra, have been dealing with this for a long time now. 

Typically, both groups have used OSC over Wi-Fi (on a local router); OSC usually runs over UDP, which means packets can get dropped. That is a problem! Initially, most pieces used multicasting (blasting a message to all devices on a local network), but that is very likely to lose UDP packets. One early strategy was to emphasize direct connections (i.e. targeting a message to a set of explicit IP addresses) instead of multicasting. This (typing your ensemble members' IP addresses into a patch right before starting to play a piece) is cumbersome and error-prone (and wastes concert time).
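For illustration, here's a minimal sketch of that "direct connection" approach: hand-encode a one-float OSC message and send it over UDP to explicit peer addresses. (A real piece would more likely use an OSC library or Pd's own networking objects; the address pattern, IPs, and port below are hypothetical.)

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OscSender {
    // Pad an OSC string with NULs to a multiple of 4 bytes, as the OSC spec requires.
    private static byte[] oscString(String s) {
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        int padded = (raw.length / 4 + 1) * 4;
        return ByteBuffer.allocate(padded).put(raw).array();
    }

    // Encode an OSC message with a single float argument (big-endian, per the spec).
    public static byte[] encode(String address, float value) {
        byte[] addr = oscString(address);
        byte[] tags = oscString(",f");
        return ByteBuffer.allocate(addr.length + tags.length + 4)
                .put(addr).put(tags).putFloat(value).array();
    }

    public static void main(String[] args) throws Exception {
        byte[] msg = encode("/player/amp", 0.5f);            // hypothetical address pattern
        String[] peers = {"192.168.1.11", "192.168.1.12"};   // ensemble members' IPs, typed in by hand
        try (DatagramSocket socket = new DatagramSocket()) {
            for (String ip : peers) {
                socket.send(new DatagramPacket(msg, msg.length,
                        InetAddress.getByName(ip), 9000));   // hypothetical port
            }
        }
    }
}
```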

So ensemble member Jascha Narveson devised a simple network protocol, 'Landini', to automate this, and to keep track of dropped packets and resend them. This gives the speed of UDP with the increased reliability of direct connections, plus resending of dropped messages. The downside is that network congestion quickly increases as more ensemble members join the network.

MobMuPlat has a few options for how to send network messages, reflecting the different ways people (including myself!) use it. Many people use it as a simple controller, and just want a direct connection to or from their laptop. There's also 'Ping and Connect', a subset of Landini functionality that just handles discovery and direct connection to other clients on the network. And the full Landini client is available as well.
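As a rough sketch of what discovery can look like in general (an assumption, not MobMuPlat's actual protocol): broadcast a "ping" on a known port and collect the addresses of whoever answers "pong", yielding the peer list for later direct messages:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

public class PingDiscovery {
    public static Set<InetAddress> discoverPeers() throws Exception {
        Set<InetAddress> peers = new HashSet<>();
        try (DatagramSocket socket = new DatagramSocket(9001)) {   // hypothetical discovery port
            socket.setBroadcast(true);
            socket.setSoTimeout(2000);                             // stop collecting after 2 seconds
            byte[] ping = "ping".getBytes(StandardCharsets.US_ASCII);
            socket.send(new DatagramPacket(ping, ping.length,
                    InetAddress.getByName("255.255.255.255"), 9001));
            byte[] buf = new byte[64];
            try {
                while (true) {
                    DatagramPacket reply = new DatagramPacket(buf, buf.length);
                    socket.receive(reply);
                    String payload = new String(buf, 0, reply.getLength(), StandardCharsets.US_ASCII);
                    if (payload.equals("pong")) {                  // ignore our own echoed "ping"
                        peers.add(reply.getAddress());
                    }
                }
            } catch (SocketTimeoutException done) {
                // timeout reached; we have everyone who replied in time
            }
        }
        return peers;
    }
}
```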

Then there's the problem of routers themselves; we discovered that routers can be temperamental in different environments, in terms of both connectivity and lag. For our most recent tour (earlier this year), Sideband switched to using a wired network (with an Ethernet hub), which was ultimately more reliable. My own work tends to value the mobility and self-containedness of mobile devices, so I still often prefer to use wireless when feasible.

Sometimes we get performance issues (lag, broken sync, dropouts) in the audio software we build. Is there a profiler (and in which IDE?) that you have found useful? Have you ever recorded traces on a target device and analyzed them in Perfetto or similar software?

I have only occasionally used profilers, and usually just for RAM or CPU usage, not for examining audio bottlenecks.

Do you think Flutter is capable of building real-time audio software for Android, iOS, and the web? Is the audio driver integration good enough? Are other sensors, like the gyroscope, proximity sensor, and camera, supported?

Flutter is an app development framework, but, for maximum flexibility and portability across platforms, it does _not_ interact directly with some system resources like audio I/O. Nearly all interaction with platform-specific sensors or drivers is done through a 'plugin' architecture, where a developer writes platform-specific code which communicates with Flutter. (There's also a method to write native C code directly within your app, and have it compile and run across platforms.)
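To make the plugin pattern concrete, here's a minimal, hypothetical sketch of the Android (Java) half of such a plugin: Dart invokes a named method over a MethodChannel, and this class runs the platform-specific work (here, just a stub for audio playback). The channel and method names are made up.

```java
import androidx.annotation.NonNull;

import io.flutter.embedding.engine.plugins.FlutterPlugin;
import io.flutter.plugin.common.MethodCall;
import io.flutter.plugin.common.MethodChannel;

public class AudioPlugin implements FlutterPlugin, MethodChannel.MethodCallHandler {
    private MethodChannel channel;

    @Override
    public void onAttachedToEngine(@NonNull FlutterPlugin.FlutterPluginBinding binding) {
        channel = new MethodChannel(binding.getBinaryMessenger(), "example/audio"); // hypothetical channel name
        channel.setMethodCallHandler(this);
    }

    @Override
    public void onMethodCall(@NonNull MethodCall call, @NonNull MethodChannel.Result result) {
        if ("play".equals(call.method)) {
            String path = call.argument("path");  // argument passed from the Dart side
            // ...hand off to MediaPlayer / AudioTrack / native code here...
            result.success(null);
        } else {
            result.notImplemented();
        }
    }

    @Override
    public void onDetachedFromEngine(@NonNull FlutterPlugin.FlutterPluginBinding binding) {
        channel.setMethodCallHandler(null);
    }
}
```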

For basic audio playback (and some other platform functionality, like displaying media metadata on the screen), I wrote one a while ago (pub.dev/packages/audiofileplayer). There are tons of existing plugins for all kinds of platform-specific interaction.

Flutter is not a framework for media processing, so real-time audio or video processing is not going to happen within Flutter itself. But it's easy to wrap your existing native code for use within a Flutter app.

What would you guess today is the best IDE or development setup for building real-time audio software on Android and iOS?

I tend to stick with the recommended IDEs for each mobile platform: Xcode for iOS, Android Studio for Android. I am going to try out VSCode as well, though. As for the real-time audio software itself, my previous apps did the work in the native language (e.g. Objective-C with Core Audio or AVFoundation) or in PureData + libpd.

While I've used MediaPipe for video analysis (and some basic visual painting), I'm hoping to look at it further for real-time audio; all the building blocks are there for defining modular real-time audio graphs. I know there are a bunch of other languages and frameworks that I haven't gotten a chance to look at yet.

Where can we follow more of your work, both technical and artistic? And do you have any social links where you are happy to connect with readers?

Most of my creative work over the last few years is with Sideband (mentioned above); here's a recent tour highlight reel: vimeo.com/manage/videos/867553308, which includes my most recent work (using computer vision input).

Now that it is publicly released, I can also finally mention my day job, which is working on the YouTube Create app: www.youtube.com/watch?v=dhLQS_XCG0g 

A lot of my existing experience is put to good use there, across a range of mobile languages and frameworks: Java on Android, Objective-C on iOS, Flutter, and occasional bits of native C/C++ for media processing.

I’ve mostly stopped social media (or, at best, am just an occasional lurker)! Maybe one day I’ll be more active again.