Transition-Aware: A More Robust Approach for Piano Transcription
Piano transcription is a classic problem in music information retrieval, and a growing number of transcription methods based on deep learning have been proposed in recent years. In 2019, Google Brain published MAESTRO, a large-scale piano transcription dataset on which the Onsets and Frames approach proposed by Hawthorne et al. achieved a stunning onset F1 score of 94.73%. Unlike the annotation scheme of Onsets and Frames, the Transition-aware model presented in this paper annotates the attack process of piano signals, called the attack transition, across multiple frames instead of marking only the onset frame. In this way, the piano signal around the onset time is taken into account, making onset detection more stable and robust. Transition-aware achieves a higher transcription F1 score than Onsets and Frames on both the MAESTRO and MAPS datasets, eliminating many extra-note detection errors. This indicates that the Transition-aware approach generalizes better across datasets.
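To make the labeling difference concrete, the following is a minimal sketch contrasting a conventional single-frame onset target with a multi-frame attack-transition target of the kind the abstract describes. The frame rate, transition width, and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

FRAME_RATE = 31.25        # frames per second; assumed value for illustration
TRANSITION_FRAMES = 5     # width of the attack-transition label; assumed value


def onset_frame_labels(onset_times, num_frames):
    """Onsets-and-Frames style target: mark only the single onset frame."""
    labels = np.zeros(num_frames, dtype=np.float32)
    for t in onset_times:
        frame = int(round(t * FRAME_RATE))
        if 0 <= frame < num_frames:
            labels[frame] = 1.0
    return labels


def attack_transition_labels(onset_times, num_frames):
    """Transition-aware style target: mark several frames around each onset,
    covering the attack process of the piano signal."""
    labels = np.zeros(num_frames, dtype=np.float32)
    half = TRANSITION_FRAMES // 2
    for t in onset_times:
        center = int(round(t * FRAME_RATE))
        lo = max(0, center - half)
        hi = min(num_frames, center + half + 1)
        labels[lo:hi] = 1.0  # a ramped/soft target is another plausible choice
    return labels


# Example: two notes at 0.50 s and 1.00 s in a 2-second clip.
onsets = [0.50, 1.00]
n = int(2.0 * FRAME_RATE)
print(onset_frame_labels(onsets, n).sum())        # 2 labeled frames
print(attack_transition_labels(onsets, n).sum())  # ~2 * TRANSITION_FRAMES frames
```

Because several frames around each onset carry the positive label, a detector trained on such targets is less sensitive to a one-frame misalignment between the annotation and the acoustic attack, which is the stability the abstract claims.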
An Audio-Visual Fusion Piano Transcription Approach Based on Strategy Fusion
Piano transcription is a fundamental problem in the field of music information retrieval. Most existing transcription studies are based on audio or video alone, while audio-visual fusion has received comparatively little attention. In this paper, a piano transcription model based on strategy fusion is proposed, in which the transcription results of a video model are used to assist audio transcription. Because suitable datasets for audio-visual fusion are currently lacking, the OMAPS dataset is also introduced in this paper; our strategy-fusion model achieves a 92.07% F1 score on OMAPS. A transcription model based on feature fusion is compared against the strategy-fusion model, and the experimental results show that the strategy-fusion model achieves better results than the feature-fusion one.
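The abstract does not spell out the fusion rule, so the sketch below shows one plausible form of strategy-level (late) fusion, in which video detections rescue low-confidence audio notes while high-confidence audio notes pass through. All names, thresholds, and the matching window are hypothetical, chosen only to illustrate fusing transcription results rather than features.

```python
from dataclasses import dataclass

ONSET_TOLERANCE = 0.05  # seconds; hypothetical audio/video matching window
CONF_THRESHOLD = 0.5    # hypothetical audio confidence threshold


@dataclass
class Note:
    pitch: int         # MIDI pitch number
    onset: float       # onset time in seconds
    confidence: float  # model confidence in [0, 1]


def video_confirms(note, video_notes):
    """True if the video model detected the same key press near this onset."""
    return any(
        v.pitch == note.pitch and abs(v.onset - note.onset) <= ONSET_TOLERANCE
        for v in video_notes
    )


def strategy_fusion(audio_notes, video_notes):
    """Illustrative late fusion: the audio transcription is primary, and video
    detections corroborate borderline audio notes instead of being merged at
    the feature level."""
    fused = []
    for note in audio_notes:
        if note.confidence >= CONF_THRESHOLD or video_confirms(note, video_notes):
            fused.append(note)
    return fused


# Example: a confident audio note plus a weak one confirmed by video.
audio = [Note(60, 0.50, 0.9), Note(64, 0.52, 0.3)]
video = [Note(64, 0.50, 1.0)]
print([n.pitch for n in strategy_fusion(audio, video)])  # [60, 64]
```

A design point worth noting: because fusion happens on discrete note events, the audio and video models can be trained and improved independently, which is harder to do when their features are fused inside a single network.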