Experiments with SoX


SOund eXchange: the "Swiss Army knife" of audio tools.
The software is used through three commands: sox, play and rec.
sox: reads and writes audio files in the most common formats.
play: reads audio files and sends them to the speaker.
rec: records audio to files.

Online documentation:

http://doc.ubuntu-fr.org/sox - in French

http://billposer.org/Linguistics/Computation/SoxTutorial.html - in English

Local documentation:
tagadan@aeiouyz:~$ apropos sox
play (1) - Sound eXchange, the Swiss Army knife of audio ...
soxeffect (7)- Sound eXchange, the Swiss Army knife of audio manipula...
soxi (1) - Sound eXchange Information, display sound file metadata
sox (1) - Sound eXchange, the Swiss Army knife of audio manipula...
soxformat (7) - Sound eXchange, the Swiss Army knife of audio.

tagadan@aeiouyz:~$ man sox

Press q to quit the manual.
Page written in February 2011.


Conversions with sox.

SoX, as built here, CANNOT encode MP3 directly. For example:
tagadan@aeiouyz:~$ sox un.ogg un.mp3 gives
sox FAIL formats: can't open output file `un.mp3': SoX was compiled without MP3 encoding support

Proceed in two steps instead:

1. Convert un.ogg to un.wav: tagadan@aeiouyz:~$ sox un.ogg un.wav
2. Convert un.wav to un.mp3 with the lame program: tagadan@aeiouyz:~$ lame un.wav un.mp3

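Example script that converts every .ogg file in a directory.
#!/bin/bash
# Convert each .ogg file in the current directory to MP3 via a temporary WAV file.
for i in *.ogg; do
    [ -e "$i" ] || continue   # skip the unexpanded pattern when there are no .ogg files
    j="${i%.ogg}"
    sox "$j.ogg" "$j.wav" && lame "$j.wav" "$j.mp3" && rm -f "$j.wav" && echo "$i re-encoded as MP3."
done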
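As a variation on the script above, the temporary WAV file can be avoided by piping sox into lame. This is only a sketch: it assumes both sox and lame are installed, and the function names mp3_name and ogg_to_mp3 are made up for this illustration.

```shell
# Hypothetical helper: derive the .mp3 name from an .ogg name.
mp3_name() {
    printf '%s\n' "${1%.ogg}.mp3"
}

# Hypothetical helper: convert one .ogg file to MP3 without a temporary file.
# "-t wav -" makes sox write WAV data to standard output,
# and "lame - out.mp3" makes lame read its input from standard input.
ogg_to_mp3() {
    sox "$1" -t wav - | lame - "$(mp3_name "$1")"
}

# Usage (uncomment to run in a directory containing .ogg files):
# for i in *.ogg; do [ -e "$i" ] && ogg_to_mp3 "$i"; done
```

sox may warn that it cannot fix the length in the WAV header when writing to a pipe; the two-step version remains the safer choice if that causes trouble.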


Synthesizing sounds with sox or play: the synth effect
Note: sox writes the result to a file, whereas play sends it to the speaker.

sox followed by synth
    synthesizes a sound
sox some-file synth
    works from some-file, whose length, channel count and sample rate are reused
sox -n synth
    works without an input file
synth 5 sine 500
    a 5-second sine wave at 500 Hz, sampled at 48 kHz (the default rate with -n)
synth 5 sine 300-4000
    same, but with the frequency sweeping from 300 to 4000 Hz
-r 48000 or -r 48k or rate 48k
    sets the sample rate (rate = sampling frequency)

Examples:

tagadan@aeiouyz:~$ sox -n output.wav synth 3 sine 300-3000
tagadan@aeiouyz:~$ sox -r 8000 -n output.wav synth 3 sine 3000-400
tagadan@aeiouyz:~$ play -n synth 1 sine 900-300

Example script that generates a series of notes and saves each one to its own file

#!/bin/bash
sox -t null /dev/null do.wav synth 1.0 sine 261.63
sox -t null /dev/null re.wav synth 1.0 sine 293.66
sox -t null /dev/null mi.wav synth 1.0 sine 329.63
sox -t null /dev/null fa.wav synth 1.0 sine 349.23
sox -t null /dev/null sol.wav synth 1.0 sine 392
sox -t null /dev/null la.wav synth 1.0 sine 440
sox -t null /dev/null si.wav synth 1.0 sine 493.88
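The frequencies used in this script follow from equal temperament, where each semitone multiplies the frequency by 2^(1/12), starting from la (A) at 440 Hz. The sketch below recomputes them with awk; the function name note_freq and the list of semitone offsets are my own illustration, not part of SoX.

```shell
# Hypothetical helper: frequency in Hz of the note lying n semitones
# away from A (440 Hz) in equal temperament: f = 440 * 2^(n/12).
note_freq() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.2f\n", 440 * 2 ^ (n / 12) }'
}

# Semitone offsets of do, re, mi, fa, sol, la, si relative to la (A).
for n in -9 -7 -5 -4 -2 0 2; do
    note_freq "$n"
done
```

According to the synth documentation quoted further down, the same offsets can be passed to synth directly with a % prefix, so something like play -n synth 1 sine %-9 should produce the same do.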

Command that concatenates the notes into a scale and writes the result to a file

sox do.wav re.wav mi.wav fa.wav sol.wav la.wav si.wav gamme.wav

Command that mixes three notes into a chord and writes it to a file

sox -m do.wav mi.wav sol.wav accord.wav

To put sounds one after another:

sox sine1.wav sine2.wav sine3.wav troissin.wav

Linux guitar?

for i in E A7 E2 B7 A E3 G A B C7 D; do play -n synth 1 pluck "$i" repeat 2; done


Man page. Reminder: man sox displays the manual in English.

synth [-j KEY] [-n] [len [off [ph [p1 [p2 [p3]]]]]] {[type] [combine]
    [[%]freq[k][:|+|/|-[%]freq2[k]]] [off [ph [p1 [p2 [p3]]]]]}

This effect can be used to generate tones at a fixed frequency or swept between two frequencies, with various wave shapes, or to generate wide-band noise of various colours.
Several synth effects can be cascaded to produce more complex signals.
At each stage, it is possible to choose whether the generated waveform is mixed with, or modulated onto, the output of the previous stage.
Each channel of a multi-channel audio file can be synthesised independently.

Though this effect is used to generate audio, an input file must
still be given, the characteristics of which will be used to set
the synthesised audio length, the number of channels, and the
sampling rate; however, since the input file's audio is not
normally needed, a `null file' (with the special name -n) is often
given instead (and the length specified as a parameter to synth
or by another given effect that can have an associated length).

For example, the following produces a 3 second, 48kHz, audio
file containing a sine-wave swept from 300 to 3300 Hz:
   sox -n output.wav synth 3 sine 300-3300
and this produces an 8 kHz version:
   sox -r 8000 -n output.wav synth 3 sine 300-3300
Multiple channels can be synthesised by specifying the set of
parameters shown between braces multiple times; the following
puts the swept tone in the left channel and adds `brown' noise
in the right:
   sox -n output.wav synth 3 sine 300-3300 brownnoise
The following example shows how two synth effects can be
cascaded to create a more complex waveform:
   play -n synth 0.5 sine 200-500 synth 0.5 sine fmod 700-100
Frequencies can also be given in `scientific' note notation, or,
by prefixing a `%' character, as a number of semitones relative
to `middle A' (440 Hz). For example, the following could be
used to help tune a guitar's low `E' string:
   play -n synth 4 pluck %-29
or with a (Bourne shell) loop, the whole guitar:
   for n in E2 A2 D3 G3 B3 E4; do
     play -n synth 4 pluck $n repeat 2; done
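The two notations in the guitar example are related: E2 and %-29 name the same pitch. The sketch below shows one way to convert between them, assuming octaves change at C and that A4 is the 440 Hz reference, as the %-29 example suggests. The function name note_to_semitones is hypothetical, and writing sharps with # is a convention of this sketch only.

```shell
# Hypothetical helper: convert a note in scientific notation (e.g. E2, C#4)
# to a number of semitones relative to A4 (440 Hz).
# Runs in a subshell so its variables stay local.
note_to_semitones() (
    name="${1%?}"            # everything but the last character, e.g. "E" or "C#"
    octave="${1#"$name"}"    # the last character, e.g. "2"
    case "$name" in
        C) offset=0 ;;  'C#') offset=1 ;;  D) offset=2 ;;  'D#') offset=3 ;;
        E) offset=4 ;;  F) offset=5 ;;  'F#') offset=6 ;;  G) offset=7 ;;
        'G#') offset=8 ;;  A) offset=9 ;;  'A#') offset=10 ;;  B) offset=11 ;;
        *) echo "unknown note: $1" >&2; exit 1 ;;
    esac
    echo $(( (octave - 4) * 12 + offset - 9 ))
)

# The six guitar strings from the loop above, in % notation.
for n in E2 A2 D3 G3 B3 E4; do
    printf '%s = %%%s\n' "$n" "$(note_to_semitones "$n")"
done
```

The first line printed is E2 = %-29, which matches the value used for the low E string.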

See the delay effect (above) and the reference to `SoX scripting
examples' (below) for more synth examples.
N.B. This effect generates audio at maximum volume (0dBFS),
which means that there is a high chance of clipping when using
the audio subsequently, so in many cases, you will want to
follow this effect with the gain effect to prevent this from
happening. (See also Clipping above.) Note that, by default, the
synth effect incorporates the functionality of gain -h (see the
gain effect for details); synth's -n option may be given to
disable this behaviour.
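To leave some headroom, as the note above advises, a gain effect can follow synth. The sketch below converts a gain in dB into the corresponding amplitude ratio, 10^(dB/20), to give a feel for the numbers; the function name db_to_ratio and the file name tone.wav are illustrative only.

```shell
# Hypothetical helper: amplitude ratio for a gain given in dB,
# using ratio = 10^(dB/20).
db_to_ratio() {
    LC_ALL=C awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(db / 20 * log(10)) }'
}

db_to_ratio -3    # a 3 dB cut scales the amplitude to roughly 0.708
db_to_ratio -6    # a 6 dB cut roughly halves it (0.501)

# Possible use with sox (not run here): a 440 Hz tone attenuated by 3 dB.
# sox -n tone.wav synth 3 sine 440 gain -3
```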

A detailed description of each synth parameter follows:

len is the length of audio to synthesise expressed as a time or
as a number of samples; 0=inputlength, default=0.

The format for specifying lengths in time is hh:mm:ss.frac. The
format for specifying sample counts is the number of samples
with the letter `s' appended to it.
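The two ways of writing a length are linked by the sample rate: samples = seconds x rate. A small sketch, with a made-up function name (samples_for) and restricted to whole seconds:

```shell
# Hypothetical helper: number of samples for a duration in seconds
# at a given sample rate in Hz (whole seconds only, integer arithmetic).
samples_for() {
    echo $(( $1 * $2 ))
}

samples_for 5 48000   # 5 s at 48 kHz -> 240000
samples_for 3 8000    # 3 s at 8 kHz  -> 24000

# So, at 48 kHz, these two lengths should describe the same duration:
# play -n synth 5 sine 500
# play -n synth "$(samples_for 5 48000)s" sine 500
```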

type is one of sine, square, triangle, sawtooth, trapezium, exp,
[white]noise, tpdfnoise, pinknoise, brownnoise, pluck;
default=sine.

combine is one of create, mix, amod (amplitude modulation), fmod
(frequency modulation); default=create.

freq/freq2 are the frequencies at the beginning/end of synthesis
in Hz or, if preceded with `%', semitones relative to A
(440 Hz); alternatively, `scientific' note notation (e.g. E2)
may be used. The default frequency is 440Hz. By default, the
tuning used with the note notations is `equal temperament'; the
-j KEY option selects `just intonation', where KEY is an integer
number of semitones relative to A (so for example, -9 or 3
selects the key of C), or a note in scientific notation.

If freq2 is given, then len must also have been given and the
generated tone will be swept between the given frequencies. The
two given frequencies must be separated by one of the characters
`:', `+', `/', or `-'. This character is used to specify the
sweep function as follows:

: Linear: the tone will change by a fixed number of hertz
per second.

+ Square: a second-order function is used to change the
tone.

/ Exponential: the tone will change by a fixed number of
semitones per second.

- Exponential: as `/', but initial phase always zero, and
stepped (less smooth) frequency changes.

Not used for noise.
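To make "a fixed number of hertz per second" and "a fixed number of semitones per second" concrete, the sketch below computes both rates for the 300 to 3300 Hz, 3-second sweep used earlier. The function names are hypothetical; the only facts assumed are that a linear sweep covers f2 - f1 hertz over its length and that an octave (a doubling of frequency) is 12 semitones.

```shell
# Hypothetical helpers for a sweep from f1 to f2 Hz lasting len seconds.

# Linear sweep (':'): constant change in hertz per second.
hz_per_second() {
    LC_ALL=C awk -v f1="$1" -v f2="$2" -v len="$3" \
        'BEGIN { printf "%.2f\n", (f2 - f1) / len }'
}

# Exponential sweep ('/'): constant change in semitones per second,
# where one octave (a doubling of frequency) is 12 semitones.
semitones_per_second() {
    LC_ALL=C awk -v f1="$1" -v f2="$2" -v len="$3" \
        'BEGIN { printf "%.2f\n", 12 * log(f2 / f1) / log(2) / len }'
}

hz_per_second 300 3300 3          # 1000.00
semitones_per_second 300 3300 3   # 13.84
```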

off is the bias (DC-offset) of the signal in percent; default=0.

ph is the phase shift in percentage of 1 cycle; default=0. Not
used for noise.

p1 is the percentage of each cycle that is `on' (square), or
`rising' (triangle, exp, trapezium); default=50 (square,
triangle, exp), default=10 (trapezium), or sustain (pluck);
default=40.

p2 (trapezium): the percentage through each cycle at which
`falling' begins; default=50. exp: the amplitude in multiples of
2dB; default=50, or tone-1 (pluck); default=20.

p3 (trapezium): the percentage through each cycle at which
`falling' ends; default=60, or tone-2 (pluck); default=90.

sample rate
The sample rate in samples per second (`Hertz' or `Hz'). For
example, digital telephony traditionally uses a sample rate of
8000 Hz (8 kHz), though these days, 16 and even 32 kHz are
becoming more common; audio Compact Discs use 44100 Hz
(44.1 kHz); Digital Audio Tape and many computer systems use
48 kHz; professional audio systems often use 96 kHz.

-n, --null
This can be used in place of an input or output filename to
specify that a `null file' is to be used. Note that here, `null
file' refers to a SoX-specific mechanism and is not related to
any operating-system mechanism with a similar name.

Using a null file to input audio is equivalent to using a normal
audio file that contains an infinite amount of silence, and as
such is not generally useful unless used with an effect that
specifies a finite time length (such as trim or synth).

Using a null file to output audio amounts to discarding the
audio and is useful mainly with effects that produce information
about the audio instead of affecting it (such as noiseprof or
stat).

The sampling rate associated with a null file is by default
48 kHz, but, as with a normal file, this can be overridden if
desired using command-line format options (see below).

-r, --rate RATE[k]
Gives the sample rate in Hz (or kHz if appended with `k') of the
file.
For an input file, the most common use for this option is to
inform SoX of the sample rate of a `raw' (`headerless') audio
file (see the examples in -b and -c above). Occasionally, it
may be useful to use this option with a `headered' file, in
order to override the (presumably incorrect) value in the header
- note that this is only supported with certain file types. For
example, if audio was recorded with a sample-rate of say 48k
from a source that played back a little, say 1.5%, too slowly,
then
   sox -r 48720 input.wav output.wav
effectively corrects the speed by changing only the file header
(but see also the speed effect for the more usual solution to
this problem).
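The value 48720 in that example is simply 48000 increased by 1.5%. A sketch of the arithmetic, with a hypothetical function name (corrected_rate):

```shell
# Hypothetical helper: sample rate to declare in the header when the
# source played back "percent" % too slowly, rounded to the nearest Hz.
corrected_rate() {
    LC_ALL=C awk -v rate="$1" -v percent="$2" \
        'BEGIN { printf "%.0f\n", rate * (1 + percent / 100) }'
}

corrected_rate 48000 1.5   # 48720, the value used in the man page example
```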

For an output file, this option provides a shorthand for
specifying that the rate effect should be invoked in order to
change (if necessary) the sample rate of the audio signal to the
given value. For example, the following two commands are
equivalent:
   sox input.wav -r 48k output.wav bass -3
   sox input.wav output.wav bass -3 rate 48k
though the second form is more flexible as it allows rate
options to be given, and allows the effects to be ordered
arbitrarily.

----------------------------------------