
A bridge between creativity, technology, and participatory science

Marco Raoul Marini
April 9, 2025, 8:45 a.m.

Author: Marco Raoul Marini, University of Rome "La Sapienza", Computer Science Department, VisionLab, UXLab DigiLab


DSM is an innovative example of citizen science, in which Artificial Intelligence merges with geometric drawing, smartphone technology, and music to engage the public in co-creating knowledge. Developed in the VisionLab of Sapienza University of Rome’s Computer Science Department, the project is supported by the Anna Maria Catalano Foundation and was selected for Fondazione TIM’s 2023 “Call for Ideas.” Beyond technological innovation, DSM stands out for its democratic and inclusive approach, transforming users into active contributors to research.


How it works

The app converts hand-drawn sketches (e.g., a paper keyboard) into interactive musical instruments. Using the smartphone’s camera, computer vision algorithms and AI detect keys and hand movements to generate realistic sounds. The final version, free and ad-free, will launch by May 2025 on www.musikeyrtual.it and the Google Play Store.


Technical deep dive

  1. Keyboard detection module

    • Optimised workflow: Calibration now takes one step instead of two, because background subtraction has been eliminated. This also shortens processing time.

    • Contour-based detection: A redesigned algorithm identifies keys via area and contour analysis (instead of error-prone horizontal/vertical line detection), improving accuracy. Results are overlaid as a virtual keyboard for real-time user feedback.

    • Preprocessing: Enhances contrast between black keys and white paper to minimize false positives.
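
To make the detection steps above more concrete, here is a minimal sketch in plain Python. It is not the app's code: it assumes a small grayscale image given as nested lists, uses a simple contrast stretch as the preprocessing step, and stands in for contour analysis with a flood fill that keeps dark regions above a minimum area. The function names, the threshold, and `min_area` are all hypothetical.

```python
from collections import deque


def stretch_contrast(gray):
    """Linearly rescale pixel values so the darkest is 0 and the brightest 255."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in gray]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in gray]


def find_key_regions(gray, threshold=128, min_area=4):
    """Return bounding boxes (x0, y0, x1, y1) of dark regions, sorted left to right.

    Assumption: keys appear as dark blobs on light paper. Regions smaller
    than min_area pixels are treated as noise and dropped.
    """
    h, w = len(gray), len(gray[0])
    dark = [[p < threshold for p in row] for row in stretch_contrast(gray)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not dark[y][x] or seen[y][x]:
                continue
            # Flood-fill one connected dark region (4-connectivity),
            # tracking its area and bounding box as we go.
            queue = deque([(x, y)])
            seen[y][x] = True
            area, x0, y0, x1, y1 = 0, x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and dark[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:
                boxes.append((x0, y0, x1, y1))
    return sorted(boxes)
```

Filtering by area is what lets an approach like this ignore stray pen marks or shadows. A production pipeline would more likely rely on a computer vision library's contour routines than on a hand-written flood fill.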

  2. Hand tracking with MediaPipeUnityPlugin

    • Leverages Google’s MediaPipe machine learning library for real-time hand tracking. Integrated via the open-source MediaPipeUnityPlugin (MUP) for Unity/C# compatibility.

    • Challenges included limited MUP documentation and low-level MediaPipe integration, requiring extensive testing across Android devices of varying computational power.

  3. Depth estimation and stabilisation

    • MediaPipe’s depth data (used for the hand-to-camera distance) is unstable, so it is smoothed with a Simple Moving Average (SMA) and an Exponential Moving Average (EMA).

    • Experiments showed that neither method is sufficient on its own, prompting ongoing research into combining depth data with scene-mapping techniques.
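
For readers unfamiliar with the two filters named above, the following sketch shows the standard definitions applied to a stream of depth readings. It is illustrative only: the class names, the window size, and the smoothing factor `alpha` are assumptions, not values taken from the app.

```python
from collections import deque


class SimpleMovingAverage:
    """Average of the last `window` samples."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque discards the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, value):
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


class ExponentialMovingAverage:
    """Weighted average in which recent samples count more (0 < alpha <= 1)."""

    def __init__(self, alpha):
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None

    def update(self, value):
        if self.value is None:
            self.value = value  # the first sample seeds the filter
        else:
            self.value = self.alpha * value + (1 - self.alpha) * self.value
        return self.value
```

Both filters trade responsiveness for stability: a larger window or a smaller `alpha` gives a steadier signal but reacts later to a real key press, which is one plausible reason smoothing alone might fall short.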

  4. Beta features

    • Multi-finger support: Users can activate any combination of fingers (from single to all) for polyphonic play. Currently in testing.

    • Dynamic perspective: Detects keyboards at 90° (top-down), then switches to 0° (flat) for optimal hand tracking.

    • MIDI integration: Supports 128 base MIDI instruments, with options for preset selections or random assignments.
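
As an illustration of how multi-finger play could work, the sketch below maps fingertip positions to keys. It relies on one fact about MediaPipe's hand model: each hand has 21 landmarks, with the fingertips at indices 4, 8, 12, 16, and 20. Everything else is hypothetical, including the `pressed_keys` function, the key bounding boxes, and the `is_touching` depth test supplied by the caller.

```python
# Fingertip indices in MediaPipe's 21-point hand landmark model.
FINGERTIP_INDEX = {"thumb": 4, "index": 8, "middle": 12, "ring": 16, "pinky": 20}


def pressed_keys(landmarks, key_boxes, active_fingers, is_touching):
    """Return the sorted indices of keys currently pressed.

    landmarks      -- 21 (x, y, z) tuples for one hand
    key_boxes      -- list of (x0, y0, x1, y1) boxes, one per key
    active_fingers -- finger names the user has enabled
    is_touching    -- hypothetical predicate deciding, from the depth value,
                      whether a fingertip is down on the paper
    """
    pressed = set()
    for finger in active_fingers:
        x, y, z = landmarks[FINGERTIP_INDEX[finger]]
        if not is_touching(z):
            continue
        for key, (x0, y0, x1, y1) in enumerate(key_boxes):
            if x0 <= x <= x1 and y0 <= y <= y1:
                pressed.add(key)  # several fingers may press several keys
                break
    return sorted(pressed)
```

Returning a set of keys rather than a single key is what makes polyphonic play possible: each enabled finger is tested independently on every frame.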


Citizen science in action

DSM integrates participatory research in two ways:

  1. Collaborative learning
    Users improve AI algorithms by sharing drawings and usage data. The system adapts to diverse graphic styles and gestures, evolving through community input.

  2. Co-designed education
    Teachers, therapists, and students propose new modules (e.g., motor rehabilitation exercises or STEM lessons), ensuring the app remains flexible and needs-driven.


Target audience and roadmap

Beyond educators and therapists, DSM engages all ages—especially youth—in collective learning. Key milestones include:

  • Public workshop (April 2025): Gather user feedback for UI/UX refinements.

  • Scientific conference (May 5, 2025): Present results highlighting community-generated data’s role in AI training.

  • “Là Fuori” Festival (June 2025): Interactive installations for collaborative “sound maps,” where participants compose shared melodies.


Technical UI features

A new settings menu allows users to:

  • Select active fingers/hands for play.

  • Switch between preset instruments or randomize via MIDI.

  • Adjust the piano’s starting octave.
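
The instrument and octave settings above map naturally onto standard MIDI messages. The sketch below shows one way this could look, assuming the drawn keyboard starts on C, keys are counted chromatically, and middle C is note 60 (the C4 convention). The function names are hypothetical; the status bytes 0xC0 (program change) and 0x90 (note on) come from the MIDI standard.

```python
import random


def choose_program(preset=None, rng=random):
    """Return the preset program if given, otherwise a random one (0-127)."""
    return preset if preset is not None else rng.randrange(128)


def program_change(program, channel=0):
    """Build a MIDI program-change message selecting one of 128 instruments."""
    if not 0 <= program <= 127:
        raise ValueError("program must be in 0..127")
    return bytes([0xC0 | channel, program])


def key_to_note(key_index, start_octave):
    """Convert a key position to a MIDI note number.

    Assumption: key 0 is the C of start_octave and each further key is one
    semitone higher, using the convention that C4 is note 60.
    """
    note = 12 * (start_octave + 1) + key_index
    if not 0 <= note <= 127:
        raise ValueError("note outside the MIDI range 0..127")
    return note


def note_on(note, velocity=100, channel=0):
    """Build a MIDI note-on message."""
    return bytes([0x90 | channel, note, velocity])
```

With these helpers, changing the starting octave shifts every key by 12 semitones per octave, so `key_to_note(0, 4)` is middle C and `key_to_note(0, 5)` is the C one octave above.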


Social impact and sustainability

DSM democratises access to music and scientific research by replacing costly hardware with paper and smartphones. Applications include:

  • Affordable motor/cognitive rehabilitation tools.

  • Inclusive music creation for individuals with disabilities or limited resources.

  • Creative STEM education to spark interest in coding, AI, and acoustics.


Call to action

This project is more than an app—it’s an open science experiment. Every drawing, feedback note, and shared melody builds a future where technology and creativity are co-designed.


Download, experiment, contribute: Tomorrow’s music is in your hands.



Technical development is led by Sapienza University’s VisionLab, with open-source contributions to MediaPipeUnityPlugin. The project is funded by Fondazione TIM and the Anna Maria Catalano Foundation.

