Physically Based Sound Synthesis and Control of Footsteps Sounds
We describe a system that synthesizes footstep sounds in real time. The sound engine is based on physical and physically inspired models that reproduce the act of walking on several surfaces. Three solutions are proposed for controlling the real-time engine: the first two are based on floor microphones, while the third is based on shoes enhanced with sensors. The paper discusses each of the proposed solutions.
Do You Hear a Bump or a Hole? An Experiment on Temporal Aspects in Footsteps Recognition
In this paper, we present a preliminary experiment whose goal is to assess the role of temporal aspects in sonically simulating the act of walking on a bump or a hole. In particular, we investigate whether the timing between heel and toe and the timing between footsteps affect the perception of walking on uneven surfaces. Results show that it is possible to simulate a bump or a hole using only temporal information in the auditory modality.
A Preliminary Study on Sound Delivery Methods for Footstep Sounds
In this paper, we describe a sound delivery method for footstep sounds and investigate whether subjects prefer static or dynamic rendering. Here, dynamic means that the sound delivery method simulates footsteps following the subject. An experiment was run to assess subjects' preferences regarding the sound delivery methods. Results show that static rendering is not significantly preferred to dynamic rendering, but subjects disliked rendering in which the footstep sounds followed a trajectory different from the one they were walking along.
Hard real-time onset detection of percussive instruments
To date, the most successful onset detectors are those based on a frequency representation of the signal. However, for such methods the time between the physical onset and the reported one is unpredictable and may vary widely according to the type of sound being analyzed. This variability and unpredictability of spectrum-based onset detectors may be unsuitable for some real-time applications. This paper proposes a real-time method to improve the temporal accuracy of state-of-the-art onset detectors. The method is grounded in the theory of hard real-time operating systems, where the result of a task must be reported by a certain deadline. It combines a time-based technique (which detects the physical onset time with high accuracy but is more prone to false positives and false negatives) with a spectrum-based technique (which has high detection accuracy but low temporal accuracy). The developed hard real-time onset detector was tested on a dataset of single non-pitched percussive sounds, using the high frequency content detector as the spectral technique. Experimental validation showed that the proposed approach better retrieved the physical onset time for about 50% of the hits detected by the spectral technique, with an average improvement of about 3 ms and a maximum improvement of about 12 ms. The results also revealed that a longer deadline may better capture the variability of the spectral technique, but at the cost of higher latency.
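The combination of the two detector types can be illustrated with a small offline sketch. It is not the paper's implementation: the function name `fuse_onsets`, the window rule (look back at most `deadline_ms` from each spectral detection) and the choice of the earliest time-based candidate in that window are all assumptions made for illustration. Time-based candidates that no spectral detection confirms are treated as false positives and dropped.

```python
from bisect import bisect_left, bisect_right


def fuse_onsets(time_based_ms, spectral_ms, deadline_ms):
    """Refine each spectral onset with a time-based candidate, if one exists.

    Hypothetical rule: a time-based candidate counts as the same event when
    it lies at most `deadline_ms` before the spectral detection.
    """
    candidates = sorted(time_based_ms)
    fused = []
    for t_spec in sorted(spectral_ms):
        lo = bisect_left(candidates, t_spec - deadline_ms)
        hi = bisect_right(candidates, t_spec)
        if lo < hi:
            # Earliest candidate inside the window: taken as the physical onset.
            fused.append(candidates[lo])
        else:
            # No time-based evidence: fall back to the spectral time.
            fused.append(t_spec)
    return fused


# Made-up timestamps in milliseconds, for illustration only.
time_based = [100.0, 250.0, 498.0]   # 250.0 has no spectral confirmation
spectral = [104.0, 505.0, 900.0]
fused = fuse_onsets(time_based, spectral, deadline_ms=10.0)
```

With these made-up timestamps, the first two spectral detections are moved back to their time-based candidates and the third is kept unchanged. A larger `deadline_ms` widens the search window, which mirrors the trade-off between deadline length and latency mentioned in the abstract.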
Fast Approximation of the Lambert W Function for Virtual Analog Modelling
When modelling circuits, one often has to deal with equations containing both a linear and an exponential part. If only a single exponential term is present or predominant, exact or approximate closed-form solutions can be found in terms of the Lambert W function. In this paper, we propose reformulating such expressions in terms of the Wright Omega function when specific conditions are met that are customary in practical cases of interest. This eliminates the need to compute an exponential term at audio rate. Moreover, we propose simple approximations of the Omega function that are suitable for real-time use. We apply our approach to a static and a dynamic nonlinear system, obtaining digital models that have high accuracy, low computational cost, and are stable in all conditions, making the proposed method suitable for virtual analog modelling of circuits containing semiconductor devices.
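For real arguments, the Wright Omega function ω(x) is the solution of ω + ln ω = x, which is equivalent to ω(x) = W(e^x). The sketch below is a plain Newton iteration on that defining equation, intended only as a reference solver for checking values. It is not one of the fast approximations proposed in the paper, and the function name, starting guesses and tolerances are assumptions.

```python
import math


def wright_omega(x, tol=1e-14, max_iter=50):
    """Reference solver for w + ln(w) = x on the real line (Newton iteration)."""
    if x < -50.0:
        # For very negative x, omega(x) is approximately e^x.
        return math.exp(x)
    # Starting guesses (assumed): asymptotic form for large x, e^x otherwise.
    w = x - math.log(x) if x > 1.0 else math.exp(x)
    for _ in range(max_iter):
        # Newton step for f(w) = w + ln(w) - x, rearranged to avoid a division by w.
        w_next = w * (1.0 + x - math.log(w)) / (1.0 + w)
        if abs(w_next - w) <= tol * max(1.0, abs(w_next)):
            return w_next
        w = w_next
    return w


omega_at_one = wright_omega(1.0)    # exact value is 1, since 1 + ln(1) = 1
omega_at_zero = wright_omega(0.0)   # equals W(1), the omega constant
```

Note that this reference solver still evaluates a logarithm in every iteration, so it says nothing about the computational savings claimed in the abstract. Its use is to provide ground-truth values against which a cheaper approximation could be compared.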
Analysis and Emulation of Early Digitally-Controlled Oscillators Based on the Walsh-Hadamard Transform
Early analog synthesizer designs are very popular nowadays, and the discrete-time emulation of voltage-controlled oscillator (VCO) circuits is covered by a large number of virtual analog (VA) textbooks, papers and tutorials. One issue with well-known VCOs is their tuning instability and sensitivity to environmental conditions. For this reason, digitally-controlled oscillators were later introduced to provide stable tuning. Up to now, such designs have received much less attention in the music processing literature. In this paper, we examine one such design, which is based on the Walsh-Hadamard transform. The concept was employed in the ARP Pro Soloist and in the Welson Syntex, among others. Some historical background is provided, along with a discussion of the principle, the actual implementation and a band-limited virtual analog derivation.
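As background for the transform named in the title, here is a minimal sketch of the fast Walsh-Hadamard transform in natural (Hadamard) ordering. It illustrates the transform only and does not model the oscillator circuits discussed in the paper; the function name `fwht` is an assumption.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


signal = [1, 0, 1, 0, 0, 1, 1, 0]
coeffs = fwht(signal)
# The transform is its own inverse up to a factor of len(signal).
restored = [c // len(signal) for c in fwht(coeffs)]
```

Each coefficient weights one two-valued (+1/-1) Walsh function, so a waveform can be rebuilt as a weighted sum of square-like waves, which are easy to generate with digital logic.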
Bio-Inspired Optimization of Parametric Onset Detectors
Onset detectors are used to recognize the beginning of musical events in audio signals. Manual parameter tuning for onset detectors is a time-consuming task, while existing automated approaches often maximize only a single performance metric. These automated approaches cannot be used to optimize detector algorithms for complex scenarios, such as real-time onset detection, where an optimization process must consider both detection accuracy and latency. For this reason, a flexible optimization algorithm should account for more than one performance metric in a multi-objective manner. This paper presents a generalized procedure for automated optimization of parametric onset detectors. Our procedure employs a bio-inspired evolutionary computation algorithm to replace manual parameter tuning, followed by the computation of the Pareto frontier for multi-objective optimization. The proposed approach was evaluated on all the onset detection methods of the Aubio library, using a dataset of monophonic acoustic guitar recordings. Results show that the proposed solution is effective in reducing the human effort required in the optimization process: it replaced more than two days of manual parameter tuning with 13 hours and 34 minutes of automated computation. Moreover, the resulting performance was comparable to that obtained by manual optimization.
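The Pareto frontier step can be sketched as follows for two objectives, detection accuracy (higher is better) and latency (lower is better). The configuration names and scores are made-up values, not results from the paper, and the function name `pareto_front` is an assumption.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (name, f_measure, latency_ms). One candidate dominates
    another when it is at least as good on both objectives and strictly
    better on at least one.
    """
    front = []
    for c in candidates:
        dominated = any(
            o[1] >= c[1] and o[2] <= c[2] and (o[1] > c[1] or o[2] < c[2])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Hypothetical detector configurations: (name, F-measure, latency in ms).
configs = [
    ("a", 0.90, 12.0),
    ("b", 0.85, 6.0),
    ("c", 0.80, 8.0),   # worse than "b" on both objectives
    ("d", 0.92, 20.0),
]
front_names = [name for name, _, _ in pareto_front(configs)]
```

Here configuration "c" is discarded because "b" is both more accurate and faster, while "a", "b" and "d" remain as alternative trade-offs. The pairwise comparison is quadratic in the number of candidates, which is acceptable for the small populations typical of a single optimization run.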
A Structural Similarity Index Based Method to Detect Symbolic Monophonic Patterns in Real-Time
Automatic detection of musical patterns is an important task in the field of Music Information Retrieval due to its usage in multiple applications such as automatic music transcription, genre or instrument identification, music classification, and music recommendation. A significant sub-task is real-time pattern detection, owing to its relevance in application domains such as the Internet of Musical Things. In this study, we present a method to identify the occurrence of known patterns in symbolic monophonic music streams in real-time. We introduce a matrix-based representation that describes musical notes by their pitch, pitch-bend, amplitude, and duration. We propose an algorithm based on an independent similarity index for each note attribute. We also introduce the Match Measure, which is a numerical value signifying the degree of the match between a pattern and a sequence of notes. We have tested the proposed algorithm against three datasets: a human-recorded dataset, a synthetically designed dataset, and the JKUPDD dataset. Overall, a detection rate of 95% was achieved. The low computational load and minimal running time demonstrate the suitability of the method for real-world, real-time implementations on embedded systems.
On the Challenges of Embedded Real-Time Music Information Retrieval
Real-time applications of Music Information Retrieval (MIR) have recently been gaining interest. However, as deep learning becomes more and more ubiquitous for music analysis tasks, several challenges and limitations need to be overcome to deliver accurate and fast real-time MIR systems. In addition, modern embedded computers offer great potential for compact systems that use MIR algorithms, such as digital musical instruments. However, embedded computing hardware is generally resource constrained, posing additional limitations. In this paper, we identify and discuss the challenges and limitations of embedded real-time MIR. Furthermore, we discuss potential solutions to these challenges, and demonstrate their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques. The classifier achieved 99.2% accuracy in distinguishing pitched and percussive techniques and a 99.1% average accuracy in distinguishing four distinct percussive techniques with a fifth class for pitched sounds. The full classification task is a considerably more complex learning problem, with our preliminary results reaching only 56.5% accuracy. The results were produced with an average latency of 30.7 ms.
A Comparison of Deep Learning Inference Engines for Embedded Real-Time Audio Classification
Recent advancements in deep learning have shown great potential for audio applications, improving the accuracy of previous solutions for tasks such as music transcription, beat detection, and real-time audio processing. In addition, the availability of increasingly powerful embedded computers has led many deep learning framework developers to devise software optimized to run pretrained models in resource-constrained contexts. As a result, the use of deep learning on embedded devices and audio plugins has become more widespread. However, confusion has been rising around deep learning inference engines, regarding which of these can run in real-time and which are less resource-hungry. In this paper, we present a comparison of four available deep learning inference engines for real-time audio classification on the CPU of an embedded single-board computer: TensorFlow Lite, TorchScript, ONNX Runtime, and RTNeural. Results show that all inference engines can execute neural network models in real-time with appropriate code practices, but execution time varies between engines and models. Most importantly, we found that most of the less-specialized engines offer great flexibility and can be used effectively for real-time audio classification, with slightly better results than a real-time-specific approach. In contrast, more specialized solutions can offer a lightweight and minimalist alternative where less flexibility is needed.