mirror of
https://github.com/Ponce/slackbuilds
synced 2024-11-20 19:41:34 +01:00
audio/SongRec: Simplify README.
Signed-off-by: Willy Sudiarto Raharjo <willysr@slackbuilds.org>
parent 31321eeede
commit e37c190f6c
1 changed files with 0 additions and 194 deletions
@@ -17,197 +17,3 @@ thinking that it is the concerned song.
A (command-line only) Python version, which I made before rewriting in
Rust for performance, is also available for demonstration purposes. It
supports file recognition only.

## How it works

For useful information about how audio fingerprinting works, you may
want to read [this article](http://coding-geek.com/how-shazam-works/).
Put simply, Shazam generates a spectrogram (a time/frequency 2D
graph of the sound, with amplitude at intersections), and
maps out the frequency peaks from it (which should match key points of
the harmonics of voices or of certain instruments).

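To make the idea of "frequency peaks" concrete, here is a minimal sketch in pure Python. It is not SongRec's or Shazam's actual code: the frame size, the naive DFT and the function names are illustrative assumptions, and a real implementation would use an FFT over many overlapping frames.

```python
import cmath
import math

SAMPLE_RATE = 16_000  # Hz, the rate mentioned in this README
FRAME_SIZE = 256      # samples per frame (illustrative assumption)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            sample * cmath.exp(-2j * math.pi * k * i / n)
            for i, sample in enumerate(frame)
        )
        spectrum.append(abs(acc))
    return spectrum


def strongest_peak_hz(frame, sample_rate=SAMPLE_RATE):
    """Return the frequency (in Hz) of the bin with the largest magnitude."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)


# A pure 1000 Hz tone, one frame long.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(FRAME_SIZE)]
print(strongest_peak_hz(tone))  # prints 1000.0
```

One column of a spectrogram corresponds to one such frame; repeating this over successive frames gives the time axis.
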
Shazam also downsamples the sound to 16 kHz before processing, and cuts
the sound into four bands of 250-520 Hz, 520-1450 Hz, 1450-3500 Hz and
3500-5500 Hz (so that if one band is too scrambled by noise,
recognition from the other bands may still apply). The frequency peaks are then
sent to the servers, which subsequently look up the strongest peaks in
a database, in order to look for the simultaneous presence of neighboring
peaks both in the associated reference fingerprints and in the
fingerprint we sent.

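The band split described above can be sketched as follows. The band edges come from this README; everything else (the function names, the half-open interval convention at the edges, and keeping only one peak per band) is an assumption made for illustration, not a description of the real client.

```python
# Band edges in Hz, as listed above.
BANDS_HZ = [(250, 520), (520, 1450), (1450, 3500), (3500, 5500)]


def band_index(frequency_hz):
    """Return the index (0-3) of the band containing the frequency, or None."""
    for index, (low, high) in enumerate(BANDS_HZ):
        # Assumption: each band includes its lower edge and excludes its upper one.
        if low <= frequency_hz < high:
            return index
    return None


def strongest_peak_per_band(peaks):
    """Keep the highest-amplitude (frequency_hz, amplitude) peak in each band."""
    best = {}
    for frequency_hz, amplitude in peaks:
        index = band_index(frequency_hz)
        if index is None:
            continue  # outside 250-5500 Hz: ignored in this sketch
        if index not in best or amplitude > best[index][1]:
            best[index] = (frequency_hz, amplitude)
    return best
```

Because each band is handled separately, noise that swamps one band leaves the peaks found in the other bands untouched.
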
Hence, the Shazam fingerprinting algorithm, as implemented by the
client, is fairly simple, as much of the processing is done
server-side. The general functioning of Shazam has been documented in
public [research
papers](https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf) and
patents.

Note: It is not mandatory, but if you want to be able to recognize more
formats than WAV, OGG, FLAC and MP3, you should ensure that you have
the `ffmpeg` package installed.

## Compilation

(**WARNING**: Remember to compile the code in `--release` mode for
correct performance.)

### Installing Rust

First, you need to [install the Rust compiler and package
manager](https://www.rust-lang.org/tools/install). It has been observed
to work with `rustc` versions 1.43.0 through 1.47.0 (current at the time of writing).

Install Rust and add it to your `PATH` (works on all distributions):

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh # Type "1"
# Log out and back in to add Rust to the $PATH, or run:
source "$HOME/.cargo/env"

# If you already installed Rust, then update it:
rustup update
```

### Install dependent libraries (nothing exotic)

Debian:

```bash
sudo apt install build-essential libasound2-dev libgtk-3-dev libssl-dev
```

Void Linux (libressl):

```shell
sudo xbps-install base-devel alsa-lib-devel gtk+3-devel libressl-devel
```

Void Linux (openssl):

```shell
sudo xbps-install base-devel alsa-lib-devel gtk+3-devel openssl-devel
```

### Compiling the project

This will compile and run the project:

```bash
# For the stable release:
cargo install songrec
songrec

# For the GitHub tree:
git clone git@github.com:marin-m/songrec.git
cd songrec
cargo run --release
```

For the latter, you will then find the project's binary (which you can
move or execute directly) at `target/release/songrec`.

## Sample usage

Passing no arguments or using the `gui` subcommand will launch the GUI
and try to recognize audio in real time as soon as the application is
launched:

```
./songrec
./songrec gui
```

Using the `gui-norecording` subcommand will launch the GUI without
recognizing audio as soon as the software is started (you will need to
click the "Turn on microphone recognition" button to do so):

```
./songrec gui-norecording
```

The GUI allows you to recognize songs from your microphone, from your
speakers (on compatible PulseAudio setups), or from an audio file. The
MP3, FLAC, WAV and OGG formats should be accepted for audio files if
FFmpeg is not installed, and any audio or video format supported by
FFmpeg should be accepted if FFmpeg is installed.

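If you script around SongRec, you may want to check up front whether a file is likely to be accepted. The sketch below is a hypothetical helper, not part of SongRec: it only looks for an `ffmpeg` executable on the `PATH` and at the file extension, and it assumes that FFmpeg, when present, can handle the file.

```python
import shutil
from pathlib import Path

# Formats listed above as accepted without FFmpeg.
BUILTIN_EXTENSIONS = {".mp3", ".flac", ".wav", ".ogg"}


def ffmpeg_available():
    """True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


def likely_accepted(path, has_ffmpeg):
    """Guess whether a file should be accepted, judging only by its extension."""
    if has_ffmpeg:
        return True  # assumption: FFmpeg can decode the file
    return Path(path).suffix.lower() in BUILTIN_EXTENSIONS
```

For example, `likely_accepted("clip.mkv", ffmpeg_available())` is only true on a system where FFmpeg is installed.
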
The following commands let you recognize sound from your microphone or
from a file using the command line (`listen` runs for as long as the microphone
is usable, while `recognize` recognizes only one song). Use the `-h`
flag to see all the available options:

```
./songrec listen -h
./songrec recognize -h
```

By default, only the artist and track name of the recognized song are
written to standard output, and other information may be
written to standard error. The `--csv` and `--json` options
print more programmatically usable information to standard
output.

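As a rough sketch of how the `--json` output could be consumed from a script: the snippet below assumes that `recognize` takes the audio file path as a positional argument after `--json` and that standard output then contains a single JSON document. Check `songrec recognize -h` for the actual syntax; the helper names are hypothetical and the structure of the JSON is not assumed here.

```python
import json
import subprocess


def build_command(audio_path, binary="songrec"):
    """Build the argument list for a one-shot, JSON-producing recognition."""
    # Assumption: the file path follows the --json flag.
    return [binary, "recognize", "--json", audio_path]


def recognize(audio_path, binary="songrec"):
    """Run SongRec and return the parsed JSON from standard output."""
    result = subprocess.run(
        build_command(audio_path, binary),
        capture_output=True,  # keep stdout (data) apart from stderr (messages)
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(result.stdout)
```

Capturing the two streams separately matters here, since only standard output is meant to be machine-readable.
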
The above describes the newer CLI of SongRec, but an older
interface, operating only on audio files or raw audio fingerprints, is
also available and described below.

The following subcommand will try to recognize audio from the middle of
an audio file, and print the JSON response from Shazam servers:

```
./songrec audio-file-to-recognized-song sound_file.mp3
```

The following subcommands will do the same with an intermediary step,
manipulating data-URI audio fingerprints as used by Shazam internally:

```
./songrec audio-file-to-fingerprint sound_file.mp3
./songrec fingerprint-to-recognized-song 'data:audio/vnd.shazam.sig;base64,...'
```

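The fingerprints shown above are ordinary base64 data URIs, so their payload can be extracted with a few lines of Python. This sketch only unwraps and rewraps the bytes; the binary layout of the signature is not described in this README and is not interpreted here. The function names are hypothetical.

```python
import base64

PREFIX = "data:audio/vnd.shazam.sig;base64,"


def decode_fingerprint_uri(uri):
    """Return the raw bytes carried by a fingerprint data URI."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a Shazam fingerprint data URI")
    return base64.b64decode(uri[len(PREFIX):], validate=True)


def encode_fingerprint_uri(raw):
    """Wrap raw bytes back into a data URI with the same prefix."""
    return PREFIX + base64.b64encode(raw).decode("ascii")
```

Remember to quote the URI on the command line, as in the examples above, so that the shell does not interpret the `;` character.
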
The following will produce audible tones from a given
fingerprint, which should be able to fool Shazam into thinking that this
is the original song (played either on the default audio output device, or written to
a .WAV file):

```
./songrec fingerprint-to-lure 'data:audio/vnd.shazam.sig;base64,...'
./songrec fingerprint-to-lure 'data:audio/vnd.shazam.sig;base64,...' /tmp/output.wav
```

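To give an intuition of what turning a fingerprint back into tones might involve, here is a minimal sketch that renders a list of peaks as short sine bursts. It is not SongRec's implementation: the peaks are assumed to be already decoded into (frequency in Hz, amplitude, time in seconds) tuples, and the burst length and normalization are arbitrary illustrative choices.

```python
import math

SAMPLE_RATE = 16_000  # Hz


def synthesize(peaks, duration_s, burst_s=0.1, sample_rate=SAMPLE_RATE):
    """Render (frequency_hz, amplitude, time_s) peaks as short sine bursts."""
    samples = [0.0] * int(duration_s * sample_rate)
    burst_len = int(burst_s * sample_rate)
    for frequency_hz, amplitude, time_s in peaks:
        start = int(time_s * sample_rate)
        for i in range(burst_len):
            if start + i >= len(samples):
                break  # the burst runs past the end of the buffer
            samples[start + i] += amplitude * math.sin(
                2 * math.pi * frequency_hz * i / sample_rate
            )
    # Normalize so that the loudest sample has magnitude 1.0 (avoids clipping).
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest > 0:
        samples = [s / loudest for s in samples]
    return samples
```

The resulting list of floats could then be scaled to 16-bit integers and written out with the standard `wave` module.
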
When using the application, you may notice that certain information
is saved to `~/.local/share/SongRec` (or an equivalent directory
depending on your operating system), including the CSV-format list of
the last recognized songs and the last selected microphone input device
(so that it is selected again when restarting the app). You may want to
delete this directory in case of persistent issues.

## Privacy

SongRec collects no data and contacts no servers other than Shazam's.
SongRec does not upload raw audio data anywhere: only fingerprints of
the audio are uploaded, that is, sequences of frequency peaks
encoded in the form of "(frequency, amplitude, time)" tuples.

This does not suffice to represent anything audible on its own (use the
"Play a Shazam lure" button to hear how different this is from the full
sound). This means that no actually audible sound (e.g. voice
fragments) is sent to servers, only metadata derived from the
characteristics of the sound, which may only suffice to recognize a song
already known by Shazam.

## Legal

This software is released under the [GNU GPL
v3](https://www.gnu.org/licenses/gpl-3.0.html) license. It was created
with the intent of providing interoperability between the remote Shazam
services and Linux-based desktop systems.

Please note that in certain countries located outside of the European
Union, especially the United States, software patents may apply.