smst documentation¶
SMS tools (smst) is a Python library for spectral audio modelling, synthesis
and transformations. SMS stands for Spectral Modeling Synthesis.
Contents:
smst package¶
Subpackages¶
smst.models package¶
Submodules¶
smst.models.dft module¶
Functions that implement analysis and synthesis of sounds using the Discrete Fourier Transform.
For example usage check the smst.ui.models.dftModel_function module.
-
smst.models.dft.
apply_normalized_window
(samples, window)¶
-
smst.models.dft.
apply_zero_phase_window
(samples, window, fft_size)¶
-
smst.models.dft.
from_audio
(samples, window, fft_size)¶ Analyzes time-domain samples of a real signal using the Discrete Fourier Transform (DFT) into magnitude and phase spectrum of positive frequencies.
Parameters: - samples – samples of the input signal
- window – samples of the analysis window
- fft_size – size of the spectrum (power of two)
Returns: - magnitude_db_spectrum: magnitude spectrum (in decibels) of positive frequencies
- phase_spectrum: unwrapped phase spectrum of positive frequencies
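The analysis step described above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the library's implementation: the name `dft_analysis_sketch` and the `eps` floor are made up for the example, and the zero-phase windowing step is left out.

```python
import numpy as np

def dft_analysis_sketch(samples, window, fft_size):
    """Hypothetical helper: dB magnitude and unwrapped phase of positive frequencies."""
    # Normalize the window so magnitudes do not depend on the window length.
    windowed = samples * (window / np.sum(window))
    # rfft pads (or crops) the frame to fft_size and returns only the
    # fft_size // 2 + 1 bins of the non-negative frequencies.
    spectrum = np.fft.rfft(windowed, n=fft_size)
    eps = np.finfo(float).eps  # assumed floor, avoids log10(0)
    magnitude_db = 20 * np.log10(np.maximum(np.abs(spectrum), eps))
    phase = np.unwrap(np.angle(spectrum))
    return magnitude_db, phase

fs = 8000
n = np.arange(64)
samples = np.cos(2 * np.pi * 1000 * n / fs)  # 1 kHz tone, centered on bin 8
mag_db, phase = dft_analysis_sketch(samples, np.hanning(64), 64)
peak_bin = int(np.argmax(mag_db))
```

With a 64-point FFT at 8 kHz each bin is 125 Hz wide, so the strongest bin for the 1 kHz tone is bin 8.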
-
smst.models.dft.
half_window_sizes
(window_size)¶
-
smst.models.dft.
select_magnitude_db_spectrum
(spectrum)¶ Computes magnitude spectrum in decibels from complex-valued spectrum.
-
smst.models.dft.
select_phase_spectrum
(spectrum, phase_eps=1e-14)¶ Computes unwrapped phase spectrum out of complex spectrum.
Parameters: - spectrum – complex-valued spectrum
- phase_eps – threshold used to compute phase
-
smst.models.dft.
select_positive_spectrum
(spectrum)¶ Selects positive frequencies from a full spectrum.
-
smst.models.dft.
spectrum_from_phase_and_magnitude
(pos_magnitude_db_spectrum, pos_phase_spectrum, fft_size)¶
-
smst.models.dft.
to_audio
(magnitude_db_spectrum, phase_spectrum, window_size)¶ Synthesizes samples of windowed time-domain signal from the positive magnitude and phase spectrum using the Inverse Discrete Fourier Transform (IDFT).
Parameters: - magnitude_db_spectrum – positive magnitude spectrum in decibels
- phase_spectrum – positive phase spectrum
- window_size – window size (also size of the output signal)
Returns: samples: reconstructed samples of the windowed signal
-
smst.models.dft.
unapply_zero_phase_window
(fft_buffer, window_size)¶
smst.models.harmonic module¶
Functions that implement analysis and synthesis of sounds using the Harmonic Model.
-
smst.models.harmonic.
find_fundamental_freq
(x, fs, w, N, H, t, minf0, maxf0, f0et)¶ Finds fundamental frequencies of a sound using the TWM (Two-Way Mismatch) algorithm.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- t – threshold in negative dB
- minf0 – minimum f0 frequency in Hz
- maxf0 – maximum f0 frequency in Hz
- f0et – error threshold in the f0 detection (ex: 5)
Returns: f0: fundamental frequency
-
smst.models.harmonic.
find_harmonics
(pfreq, pmag, pphase, f0, nH, hfreqp, fs, harmDevSlope=0.01)¶ Finds the harmonics of a frame by matching a set of spectral peaks against the ideal harmonic series built on the fundamental frequency f0.
Parameters: - pfreq – peak frequencies
- pmag – peak magnitudes
- pphase – peak phases
- f0 – fundamental frequency
- nH – number of harmonics
- hfreqp – harmonic frequencies of previous frame
- fs – sampling rate
- harmDevSlope – slope of the deviation allowed from the ideal harmonic frequencies
Returns: hfreq, hmag, hphase: harmonic frequencies, magnitudes, phases
-
smst.models.harmonic.
find_peaks
(N, fs, t, w, x_frame)¶
-
smst.models.harmonic.
from_audio
(x, fs, w, N, H, t, nH, minf0, maxf0, f0et, harmDevSlope=0.01, minSineDur=0.02)¶ Analyzes a sound using the sinusoidal harmonic model.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size (minimum 512)
- H – hop size
- t – threshold in negative dB
- nH – maximum number of harmonics
- minf0 – minimum f0 frequency in Hz
- maxf0 – maximum f0 frequency in Hz
- f0et – error threshold in the f0 detection (ex: 5)
- harmDevSlope – slope of harmonic deviation
- minSineDur – minimum length of harmonics
Returns: xhfreq, xhmag, xhphase: harmonic frequencies, magnitudes and phases
-
smst.models.harmonic.
is_f0_stable
(f0, f0_prev)¶ Indicates whether a fundamental frequency in this frame is stable (if it does not deviate much from the previous one).
Parameters: - f0 – fundamental frequency in this frame (0 if not stable)
- f0_prev – fundamental frequency in previous frame (0 if not stable)
-
smst.models.harmonic.
scale_frequencies
(hfreq, hmag, freqScaling, freqStretching, timbrePreservation, fs)¶ Scales the frequencies of the harmonics of a sound.
Parameters: - hfreq – frequencies of input harmonics
- hmag – magnitudes of input harmonics
- freqScaling – scaling factors, in time-value pairs (value of 1 no scaling)
- freqStretching – stretching factors, in time-value pairs (value of 1 no stretching)
- timbrePreservation – 0 no timbre preservation, 1 timbre preservation
- fs – sampling rate of input sound
Returns: yhfreq, yhmag: frequencies and magnitudes of output harmonics
smst.models.hpr module¶
Functions that implement analysis and synthesis of sounds using the Harmonic plus Residual Model.
-
smst.models.hpr.
from_audio
(x, fs, w, N, H, t, minSineDur, nH, minf0, maxf0, f0et, harmDevSlope)¶ Analyzes a sound using the harmonic plus residual model.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- t – threshold in negative dB
- minSineDur – minimum duration of sinusoidal tracks
- nH – maximum number of harmonics
- minf0 – minimum fundamental frequency in sound
- maxf0 – maximum fundamental frequency in sound
- f0et – maximum error accepted in f0 detection algorithm
- harmDevSlope – allowed deviation of harmonic tracks, higher harmonics have higher allowed deviation
Returns: - hfreq, hmag, hphase: harmonic frequencies, magnitude and phases
- xr: residual signal
-
smst.models.hpr.
to_audio
(hfreq, hmag, hphase, xr, N, H, fs)¶ Synthesizes a sound using the harmonic plus residual model.
Parameters: - hfreq – harmonic frequencies
- hmag – harmonic amplitudes
- hphase – harmonic phases
- xr – residual signal
- N – synthesis FFT size
- H – hop size
- fs – sampling rate
Returns: y: output sound, yh: harmonic component
smst.models.hps module¶
Functions that implement analysis and synthesis of sounds using the Harmonic plus Stochastic Model.
In this model the signal is first modeled using the harmonic model. Then the residual is modeled using the stochastic model.
-
smst.models.hps.
from_audio
(x, fs, w, N, H, t, nH, minf0, maxf0, f0et, harmDevSlope, minSineDur, Ns, stocf)¶ Analyzes a sound using the harmonic plus stochastic model.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- t – threshold in negative dB
- nH – maximum number of harmonics
- minf0 – minimum f0 frequency in Hz
- maxf0 – maximum f0 frequency in Hz
- f0et – error threshold in the f0 detection (ex: 5)
- harmDevSlope – slope of harmonic deviation
- minSineDur – minimum length of harmonics
- Ns – synthesis FFT size
- stocf – decimation factor used for the stochastic approximation
Returns: - hfreq, hmag, hphase: harmonic frequencies, magnitude and phases
- stocEnv: stochastic residual
-
smst.models.hps.
morph
(hfreq1, hmag1, stocEnv1, hfreq2, hmag2, stocEnv2, hfreqIntp, hmagIntp, stocIntp)¶ Morphs between two sounds using the harmonic plus stochastic model.
Parameters: - hfreq1, hmag1, stocEnv1 – hps representation of sound 1
- hfreq2, hmag2, stocEnv2 – hps representation of sound 2
- hfreqIntp – interpolation factor between the harmonic frequencies of the two sounds, 0 is sound 1 and 1 is sound 2 (time,value pairs)
- hmagIntp – interpolation factor between the harmonic magnitudes of the two sounds, 0 is sound 1 and 1 is sound 2 (time,value pairs)
- stocIntp – interpolation factor between the stochastic representation of the two sounds, 0 is sound 1 and 1 is sound 2 (time,value pairs)
Returns: yhfreq, yhmag, ystocEnv: hps output representation
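The interpolation factors can be read as weights of a linear blend. The sketch below shows that blend for one frame of harmonic frequencies. It is an illustration only: `morph_frame_sketch` is a hypothetical helper, it uses a single constant factor rather than time-value pairs, and it ignores details the real function has to handle, such as harmonics missing in one of the sounds.

```python
def morph_frame_sketch(values1, values2, factor):
    """Hypothetical helper: blend two equally long lists, 0 -> sound 1, 1 -> sound 2."""
    return [(1.0 - factor) * v1 + factor * v2 for v1, v2 in zip(values1, values2)]

hfreq1 = [440.0, 880.0, 1320.0]   # harmonics of sound 1 in one frame
hfreq2 = [660.0, 1320.0, 1980.0]  # harmonics of sound 2 in the same frame
morphed = morph_frame_sketch(hfreq1, hfreq2, 0.25)  # a quarter of the way to sound 2
```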
-
smst.models.hps.
scale_time
(hfreq, hmag, stocEnv, timeScaling)¶ Scales the harmonic plus stochastic model of a sound in time.
Parameters: - hfreq – harmonic frequencies
- hmag – harmonic magnitudes
- stocEnv – residual envelope
- timeScaling – scaling factors, in time-value pairs
Returns: yhfreq, yhmag, ystocEnv: hps output representation
-
smst.models.hps.
to_audio
(hfreq, hmag, hphase, stocEnv, N, H, fs)¶ Synthesizes a sound using the harmonic plus stochastic model.
Parameters: - hfreq – harmonic frequencies
- hmag – harmonic amplitudes
- hphase – harmonic phases
- stocEnv – stochastic envelope
- N – synthesis FFT size
- H – hop size
- fs – sampling rate
Returns: - y: output sound
- yh: harmonic component
- yst: stochastic component
smst.models.sine module¶
Functions that implement analysis and synthesis of sounds using the Sinusoidal Model.
-
smst.models.sine.
clean_sinusoid_tracks
(track_freqs, min_frames=3)¶ Deletes short fragments of a collection of sinusoidal tracks.
Parameters: - track_freqs – frequencies of sinusoidal tracks
- min_frames – minimum duration of a track (in number of frames)
Returns: cleaned frequencies of tracks
-
smst.models.sine.
create_synth_window
(N, H)¶
-
smst.models.sine.
from_audio
(x, fs, w, N, H, t, maxnSines=100, minSineDur=0.01, freqDevOffset=20, freqDevSlope=0.01)¶ Analyzes a sound using the sinusoidal model with sine tracking.
Parameters: - x – input array sound
- fs – sampling rate
- w – analysis window
- N – size of complex spectrum
- H – hop-size
- t – threshold in negative dB
- maxnSines – maximum number of sines per frame
- minSineDur – minimum duration of sines in seconds
- freqDevOffset – minimum frequency deviation at 0Hz
- freqDevSlope – slope increase of minimum frequency deviation
Returns: xtfreq, xtmag, xtphase: frequencies, magnitudes and phases of sinusoidal tracks
-
smst.models.sine.
scale_frequencies
(sfreq, freqScaling)¶ Scales sinusoidal tracks in frequency.
Parameters: - sfreq – frequencies of input sinusoidal tracks
- freqScaling – scaling factors, in time-value pairs (value of 1 is no scaling)
Returns: ysfreq: frequencies of output sinusoidal tracks
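One way to read "scaling factors, in time-value pairs" is as a breakpoint envelope that is interpolated to one factor per frame. The sketch below assumes a flat sequence `[t0, v0, t1, v1, ...]` whose times are rescaled to the number of frames. Both that layout and the name `scale_track_frequencies_sketch` are assumptions, not confirmed behaviour of the library.

```python
import numpy as np

def scale_track_frequencies_sketch(track_freqs, freq_scaling):
    """Hypothetical helper: track_freqs has shape (frames, tracks)."""
    # Assumed layout: even positions are times, odd positions are factors.
    times = np.asarray(freq_scaling[0::2], dtype=float)
    values = np.asarray(freq_scaling[1::2], dtype=float)
    n_frames = track_freqs.shape[0]
    # Map breakpoint times onto frame indices, then get one factor per frame.
    frame_times = times / times[-1] * (n_frames - 1)
    factors = np.interp(np.arange(n_frames), frame_times, values)
    return track_freqs * factors[:, np.newaxis]

track_freqs = np.tile([440.0, 880.0], (5, 1))  # 5 frames, 2 steady tracks
# Glide from no scaling at the start to one octave up at the end.
scaled = scale_track_frequencies_sketch(track_freqs, [0, 1.0, 1, 2.0])
```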
-
smst.models.sine.
scale_time
(sfreq, smag, timeScaling)¶ Scales sinusoidal tracks in time.
Parameters: - sfreq – frequencies of input sinusoidal tracks
- smag – magnitudes of input sinusoidal tracks
- timeScaling – scaling factors, in time-value pairs
Returns: ysfreq, ysmag: frequencies and magnitudes of output sinusoidal tracks
-
smst.models.sine.
to_audio
(tfreq, tmag, tphase, N, H, fs)¶ Synthesizes a sound using the sinusoidal model.
Parameters: - tfreq – frequencies of sinusoids
- tmag – magnitudes of sinusoids
- tphase – phases of sinusoids
- N – synthesis FFT size
- H – hop size
- fs – sampling rate
Returns: y: output array sound
-
smst.models.sine.
track_sinusoids
(pfreq, pmag, pphase, tfreq, freqDevOffset=20, freqDevSlope=0.01)¶ Tracks sinusoids from one frame to the next.
Parameters: - pfreq – frequencies of current frame
- pmag – magnitude of current frame
- pphase – phases of current frame
- tfreq – frequencies of incoming tracks from previous frame
- freqDevOffset – minimum frequency deviation at 0Hz
- freqDevSlope – slope increase of minimum frequency deviation
Returns: tfreqn, tmagn, tphasen: frequency, magnitude and phase of tracks
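The two deviation parameters suggest a tolerance that grows linearly with frequency. The following sketch shows that idea for a single peak and a single incoming track. The helper name and the exact comparison are assumptions for illustration; the real function also has to resolve competing peaks and start new tracks.

```python
def continues_track_sketch(peak_freq, track_freq, freq_dev_offset=20, freq_dev_slope=0.01):
    """Hypothetical helper: may this peak continue the given track?"""
    # Allowed deviation in Hz: an offset at 0 Hz plus a slope times the frequency.
    allowed = freq_dev_offset + freq_dev_slope * peak_freq
    return abs(peak_freq - track_freq) < allowed

low = continues_track_sketch(110.0, 100.0)          # 10 Hz off, about 21 Hz allowed
high = continues_track_sketch(5030.0, 5000.0)       # 30 Hz off, about 70 Hz allowed
strict = continues_track_sketch(5030.0, 5000.0, freq_dev_slope=0.0)  # only 20 Hz allowed
```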
smst.models.spr module¶
Functions that implement analysis and synthesis of sounds using the Sinusoidal plus Residual Model.
-
smst.models.spr.
from_audio
(x, fs, w, N, H, t, minSineDur, maxnSines, freqDevOffset, freqDevSlope)¶ Analyzes a sound using the sinusoidal plus residual model.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- t – threshold in negative dB
- minSineDur – minimum duration of sinusoidal tracks
- maxnSines – maximum number of parallel sinusoids
- freqDevOffset – frequency deviation allowed in the sinusoids from frame to frame at frequency 0
- freqDevSlope – slope of the frequency deviation, higher frequencies have bigger deviation
Returns: - tfreq, tmag, tphase: sinusoidal frequencies, magnitudes and phases
- xr: residual signal
-
smst.models.spr.
to_audio
(tfreq, tmag, tphase, xr, N, H, fs)¶ Synthesizes a sound using the sinusoidal plus residual model.
Parameters: - tfreq – sinusoidal frequencies
- tmag – sinusoidal amplitudes
- tphase – sinusoidal phases
- xr – residual signal
- N – synthesis FFT size
- H – hop size
- fs – sampling rate
Returns: - y: output sound
- ys: sinusoidal component
smst.models.sps module¶
Functions that implement analysis and synthesis of sounds using the Sinusoidal plus Stochastic Model.
In this model the signal is first modeled using the sinusoidal model. Then the residual is modeled using the stochastic model.
-
smst.models.sps.
from_audio
(x, fs, w, N, H, t, minSineDur, maxnSines, freqDevOffset, freqDevSlope, stocf)¶ Analyzes a sound using the sinusoidal plus stochastic model.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- t – threshold in negative dB
- minSineDur – minimum duration of sinusoidal tracks
- maxnSines – maximum number of parallel sinusoids
- freqDevOffset – frequency deviation allowed in the sinusoids from frame to frame at frequency 0
- freqDevSlope – slope of the frequency deviation, higher frequencies have bigger deviation
- stocf – decimation factor used for the stochastic approximation
Returns: - tfreq, tmag, tphase: sinusoidal frequencies, magnitudes and phases
- stocEnv: stochastic residual
-
smst.models.sps.
to_audio
(tfreq, tmag, tphase, stocEnv, N, H, fs)¶ Synthesizes a sound using the sinusoidal plus stochastic model.
Parameters: - tfreq – sinusoidal frequencies
- tmag – sinusoidal amplitudes
- tphase – sinusoidal phases
- stocEnv – stochastic envelope
- N – synthesis FFT size
- H – hop size
- fs – sampling rate
Returns: - y: output sound
- ys: sinusoidal component
- yst: stochastic component
smst.models.stft module¶
Functions that implement analysis and synthesis of sounds using the Short-Time Fourier Transform.
For example usage check the stftModel_function module.
-
smst.models.stft.
filter
(x, fs, w, N, H, filter)¶ Applies a spectral filter to a sound by using the STFT.
Parameters: - x – input sound
- fs – sampling rate
- w – analysis window
- N – FFT size
- H – hop size
- filter – magnitude response of filter with frequency-magnitude pairs (in dB)
Returns: y - output sound
-
smst.models.stft.
from_audio
(x, w, N, H)¶ Analyzes an input signal using the short-time Fourier transform into a spectrogram.
Parameters: - x – input signal
- w – analysis window
- N – FFT size
- H – hop size
Returns: mag_spectrogram, phase_spectrogram - magnitude and phase spectrograms
-
smst.models.stft.
iterate_analysis_frames
(x, H, hM1, hM2)¶ Iterate over frames of input signal for analysis.
Parameters: - x – input signal
- H – hop size
- hM1 – half analysis window size by rounding
- hM2 – half analysis window size by floor
Returns: generator over frames of input signal
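A frame generator of this kind can be sketched as follows. The example derives the two half window sizes from the window size (one rounded up, one rounded down) and steps a frame center through the signal by the hop size. The name `iterate_frames_sketch` and the stopping rule are assumptions, and any zero padding of the signal is left out.

```python
def iterate_frames_sketch(x, hop, window_size):
    """Hypothetical helper: yield overlapping frames of length window_size."""
    half_round = (window_size + 1) // 2  # half window size, rounded up
    half_floor = window_size // 2        # half window size, rounded down
    center = half_round
    last_center = len(x) - half_floor
    while center <= last_center:
        # Each frame spans half_round samples before and half_floor after the center.
        yield x[center - half_round:center + half_floor]
        center += hop

frames = list(iterate_frames_sketch(list(range(10)), hop=2, window_size=5))
```

For this 10-sample input the generator yields three frames of five samples each, every one shifted by the hop size of 2.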
-
smst.models.stft.
morph
(x1, x2, fs, w1, N1, w2, N2, H1, smoothf, balancef)¶ Morphs two sounds using the STFT.
Parameters: - x1, x2 – input sounds
- fs – sampling rate
- w1, w2 – analysis windows
- N1, N2 – FFT sizes
- H1 – hop size
- smoothf – smoothing factor of sound 2, greater than 0 and at most 1, where 1 is no smoothing
- balancef – balance between the 2 sounds, from 0 to 1, where 0 is sound 1 and 1 is sound 2
Returns: y: output sound
-
smst.models.stft.
pad_signal
(x, hM2)¶
-
smst.models.stft.
to_audio
(mY, pY, M, H)¶ Synthesizes an output signal from a spectrogram using the inverse short-time Fourier transform.
Parameters: - mY – magnitude spectrogram
- pY – phase spectrogram
- M – window size
- H – hop-size
Returns: y - output signal
smst.models.stochastic module¶
Functions that implement analysis and synthesis of sounds using the Stochastic Model.
In this model the sound is modeled using a magnitude envelope of the signal’s spectrum. The magnitude envelope is decimated. On reconstruction the phases are generated randomly.
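For a single frame, the idea can be sketched like this: take the dB magnitude spectrum, decimate it into a short envelope, then rebuild a spectrum from the stretched envelope with random phases. The sketch uses linear interpolation for both decimation and stretching, which is an assumption made to keep it NumPy-only. The helper name and the `1e-10` floor are also illustrative.

```python
import numpy as np

def stochastic_frame_sketch(frame, stocf, rng):
    """Hypothetical helper: envelope analysis and random-phase resynthesis of one frame."""
    n_bins = frame.size // 2 + 1
    mag_db = 20 * np.log10(np.maximum(np.abs(np.fft.rfft(frame)), 1e-10))
    # Analysis: keep roughly stocf * n_bins points of the dB envelope.
    n_env = max(2, int(stocf * n_bins))
    env_positions = np.linspace(0, n_bins - 1, n_env)
    env = np.interp(env_positions, np.arange(n_bins), mag_db)
    # Synthesis: stretch the envelope back to n_bins and attach random phases.
    mag_rec = 10 ** (np.interp(np.arange(n_bins), env_positions, env) / 20)
    phases = rng.uniform(0, 2 * np.pi, n_bins)
    phases[0] = 0.0   # keep the DC bin real
    phases[-1] = 0.0  # keep the Nyquist bin real
    return env, np.fft.irfft(mag_rec * np.exp(1j * phases), n=frame.size)

rng = np.random.default_rng(0)
frame = rng.standard_normal(256) * np.hanning(256)
env, resynth = stochastic_frame_sketch(frame, stocf=0.25, rng=rng)
```

The resynthesized frame has the same length and a similar spectral shape as the input, but a different waveform, because the original phases are discarded.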
-
smst.models.stochastic.
from_audio
(x, H, N, stocf)¶ Analyzes a sound using the stochastic model.
Parameters: - x – input array sound
- H – hop size
- N – FFT size
- stocf – decimation factor of mag spectrum for stochastic analysis, bigger than 0, maximum of 1
Returns: stocEnv: stochastic envelope
-
smst.models.stochastic.
scale_time
(stocEnv, timeScaling)¶ Scales the stochastic model of a sound in time.
Parameters: - stocEnv – stochastic envelope
- timeScaling – scaling factors, in time-value pairs
Returns: ystocEnv: stochastic envelope
-
smst.models.stochastic.
to_audio
(stocEnv, H, N)¶ Synthesizes sound from a stochastic model.
Parameters: - stocEnv – stochastic envelope
- H – hop size
- N – FFT size
Returns: y: output sound
Module contents¶
smst.utils package¶
Submodules¶
smst.utils.audio module¶
-
smst.utils.audio.
play_wav
(filename)¶ Plays a wav audio file from system using OS calls.
Parameters: filename – name of file to read
-
smst.utils.audio.
read_wav
(filename)¶ Reads a sound file and converts it to a normalized floating point array.
Parameters: filename – name of file to read
Returns: - fs: sampling rate of file
- x: floating point array
-
smst.utils.audio.
write_wav
(y, fs, filename)¶ Writes a sound file from an array with the sound and the sampling rate. Creates the directory for the file if it does not exist.
Parameters: - y – floating point array of one dimension
- fs – sampling rate
- filename – name of file to create (can be a path)
smst.utils.files module¶
-
smst.utils.files.
ensure_directory
(dir)¶
-
smst.utils.files.
strip_file
(filePath)¶ Extracts file name without directories and extension.
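The described behaviour can be reproduced with the standard library. This is a sketch of the idea, not necessarily how strip_file is written; `strip_file_sketch` is a hypothetical name.

```python
import os.path

def strip_file_sketch(file_path):
    """Hypothetical helper: file name without directories and extension."""
    base_name = os.path.basename(file_path)   # drop the directories
    return os.path.splitext(base_name)[0]     # drop the last extension

name = strip_file_sketch("sounds/piano.wav")
```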
smst.utils.math module¶
-
smst.utils.math.
from_db_magnitudes
(magnitudes_db)¶
-
smst.utils.math.
is_power_of_two
(num)¶ Checks whether num is a power of two.
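Such a check is useful for validating an FFT size before analysis. A common way to write it for integers uses a bit trick, sketched below. Whether the library uses this exact approach is not stated here, and `is_power_of_two_sketch` is a hypothetical name.

```python
def is_power_of_two_sketch(num):
    """Hypothetical helper for integer inputs."""
    # A power of two has exactly one bit set, so clearing the lowest set bit
    # (num & (num - 1)) leaves zero. Zero and negative numbers are rejected.
    return num > 0 and (num & (num - 1)) == 0

valid_fft_size = is_power_of_two_sketch(512)
invalid_fft_size = is_power_of_two_sketch(768)
```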
-
smst.utils.math.
rmse
(x, y)¶ Root mean square error.
Parameters: - x – numpy array
- y – numpy array
Returns: RMSE(x,y)
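The root mean square error is the square root of the mean squared difference between the two arrays. A minimal NumPy sketch, with the hypothetical name `rmse_sketch`, looks like this:

```python
import numpy as np

def rmse_sketch(x, y):
    """Hypothetical helper: root mean square error of two equally shaped arrays."""
    difference = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.sqrt(np.mean(difference ** 2))

# Only the last sample differs, by 4: sqrt(16 / 4) = 2.
error = rmse_sketch([1, 2, 3, 4], [1, 2, 3, 8])
```

A typical use is comparing an original signal with its resynthesis.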
-
smst.utils.math.
to_db_magnitudes
(amplitudes)¶
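from_db_magnitudes and to_db_magnitudes are listed without descriptions. Assuming they follow the usual amplitude convention of 20·log10, the conversion pair can be sketched as below. The function names, the scalar interface and the `floor` parameter are assumptions for illustration.

```python
import math

def to_db_sketch(amplitude, floor=1e-12):
    """Hypothetical helper: linear amplitude to decibels."""
    # Clamp to a small floor so that silence does not produce log10(0).
    return 20 * math.log10(max(amplitude, floor))

def from_db_sketch(magnitude_db):
    """Hypothetical helper: decibels back to linear amplitude."""
    return 10 ** (magnitude_db / 20)

half_db = to_db_sketch(0.5)       # halving the amplitude is about -6 dB
tenth = from_db_sketch(-20.0)     # -20 dB is a tenth of the amplitude
```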
smst.utils.peaks module¶
-
smst.utils.peaks.
clean_sinusoid_track
(track, minTrackLength=3)¶ Deletes fragments of one single track smaller than minTrackLength.
Parameters: - track – array of values
- minTrackLength – minimum duration of tracks in number of frames
Returns: cleanTrack: array of clean values
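The cleaning step can be sketched on a plain list, assuming that a value of 0 marks a frame where the track is inactive. That convention and the name `clean_track_sketch` are assumptions for this example.

```python
def clean_track_sketch(track, min_track_length=3):
    """Hypothetical helper: zero out active fragments shorter than min_track_length."""
    cleaned = list(track)
    start = None
    # A trailing 0 sentinel closes a fragment that runs to the end of the track.
    for i, value in enumerate(list(track) + [0]):
        if value > 0 and start is None:
            start = i                      # a fragment begins
        elif value <= 0 and start is not None:
            if i - start < min_track_length:
                cleaned[start:i] = [0] * (i - start)  # fragment too short, delete it
            start = None
    return cleaned

track = [0, 440, 441, 0, 0, 442, 443, 444, 445, 0, 446]
cleaned = clean_track_sketch(track)
```

Here the two-frame fragment at the start and the one-frame fragment at the end are removed, while the four-frame fragment in the middle survives.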
-
smst.utils.peaks.
find_fundamental_twm
(pfreq, pmag, ef0max, minf0, maxf0, f0t=0)¶ Wraps the TWM f0 detection function: selects the possible f0 candidates and calls TWM with them.
Parameters: - pfreq – peak frequencies
- pmag – peak magnitudes
- ef0max – maximum error allowed
- minf0 – minimum allowed f0
- maxf0 – maximum allowed f0
- f0t – f0 of previous frame if stable
Returns: f0: fundamental frequency in Hz
-
smst.utils.peaks.
find_fundamental_twm_py
(pfreq, pmag, f0c)¶ Two-way mismatch algorithm for f0 detection (by Beauchamp & Maher). The C version of this function, native_twm(), is preferable.
Parameters: - pfreq – peak frequencies in Hz
- pmag – peak magnitudes
- f0c – frequencies of f0 candidates
Returns: f0, f0Error: fundamental frequency detected and its error
-
smst.utils.peaks.
find_peaks
(mX, t)¶ Detects spectral peak locations.
Parameters: - mX – magnitude spectrum
- t – threshold
Returns: ploc: peak locations
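Peak detection on a magnitude spectrum usually means finding bins that are above the threshold and larger than both neighbours. The sketch below follows that reading on a plain list; the name `find_peaks_sketch` and the strict comparisons are assumptions.

```python
def find_peaks_sketch(mag_db, threshold_db):
    """Hypothetical helper: indices of local maxima above threshold_db."""
    return [
        k for k in range(1, len(mag_db) - 1)
        if mag_db[k] > threshold_db
        and mag_db[k] > mag_db[k - 1]
        and mag_db[k] > mag_db[k + 1]
    ]

mag_db = [-80, -60, -20, -55, -70, -40, -10, -45, -90]
peaks = find_peaks_sketch(mag_db, threshold_db=-30)
```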
-
smst.utils.peaks.
interpolate_peaks
(mX, pX, ploc)¶ Interpolates peak values using parabolic interpolation.
Parameters: - mX – magnitude spectrum
- pX – phase spectrum
- ploc – locations of peaks
Returns: iploc, ipmag, ipphase: interpolated peak location, magnitude and phase values
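Parabolic interpolation fits a parabola through a peak bin and its two neighbours and takes the vertex as the refined peak. The sketch below shows the standard formula for location and magnitude of a single peak. It leaves out the phase, and `interpolate_peak_sketch` is a hypothetical name.

```python
def interpolate_peak_sketch(mag_db, k):
    """Hypothetical helper: refined (location, magnitude) of the peak at bin k."""
    left, mid, right = mag_db[k - 1], mag_db[k], mag_db[k + 1]
    # Offset of the parabola's vertex from bin k, between -0.5 and 0.5.
    offset = 0.5 * (left - right) / (left - 2 * mid + right)
    location = k + offset
    magnitude = mid - 0.25 * (left - right) * offset
    return location, magnitude

# Samples of a parabola whose true maximum is 0 at position 5.3.
mag_db = [-(x - 5.3) ** 2 for x in range(10)]
location, magnitude = interpolate_peak_sketch(mag_db, 5)
```

Because the test spectrum is itself a parabola, the interpolation recovers the true peak position and height up to rounding error.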
smst.utils.residual module¶
-
smst.utils.residual.
subtract_sinusoids
(x, N, H, sfreq, smag, sphase, fs)¶ Subtracts sinusoids from a sound.
Parameters: - x – input sound
- N – FFT size
- H – hop size
- sfreq – sinusoidal frequencies
- smag – sinusoidal magnitudes
- sphase – sinusoidal phases
- fs – sampling rate
Returns: xr: residual sound
-
smst.utils.residual.
subtract_sinusoids_with_stochastic_residual
(x, N, H, sfreq, smag, sphase, fs, stocf)¶ Subtracts sinusoids from a sound and approximates the residual with an envelope.
Parameters: - x – input sound
- N – FFT size
- H – hop size
- sfreq – sinusoidal frequencies
- smag – sinusoidal magnitudes
- sphase – sinusoidal phases
- fs – sampling rate
- stocf – stochastic factor, used in the approximation
Returns: stocEnv: stochastic approximation of residual
smst.utils.synth module¶
-
smst.utils.synth.
spectrum_for_sinusoids
(ipfreq, ipmag, ipphase, N, fs)¶ Generates a spectrum from a series of sine values, calling a C function.
Parameters: - ipfreq – sine peaks frequencies
- ipmag – sine peaks magnitudes
- ipphase – sine peaks phases
- N – size of the complex spectrum to generate
- fs – sampling frequency
Returns: Y: generated complex spectrum of sines
-
smst.utils.synth.
spectrum_for_sinusoids_py
(ipfreq, ipmag, ipphase, N, fs)¶ Generates a spectrum from a series of sine values. Python implementation.
Parameters: - ipfreq – sine peaks frequencies
- ipmag – sine peaks magnitudes
- ipphase – sine peaks phases
- N – size of the complex spectrum to generate
- fs – sampling frequency
Returns: Y: generated complex spectrum of sines
-
smst.utils.synth.
synthesize_sinusoid
(freqs, amp, H, fs)¶ Synthesizes one sinusoid with time-varying frequency.
Parameters: - freqs – frequencies of sinusoids
- amp – amplitude of the sinusoid
- H – hop size
- fs – sampling rate
Returns: y: output sound
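A sinusoid with time-varying frequency is typically generated by accumulating phase, so that frequency changes do not cause jumps in the waveform. The sketch below holds each frame's frequency for one hop. That choice, like the name `synthesize_sinusoid_sketch`, is an assumption; the library may interpolate between frames instead.

```python
import numpy as np

def synthesize_sinusoid_sketch(freqs, amp, hop, fs):
    """Hypothetical helper: one sinusoid, one frequency value per frame."""
    # Hold each frame's frequency for `hop` samples.
    inst_freq = np.repeat(np.asarray(freqs, dtype=float), hop)
    # Integrate frequency into phase so the waveform stays continuous.
    phase = 2 * np.pi * np.cumsum(inst_freq) / fs
    return amp * np.cos(phase)

y = synthesize_sinusoid_sketch([440, 450, 460], amp=0.5, hop=128, fs=44100)
```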
smst.utils.window module¶
-
smst.utils.window.
blackman_harris_lobe
(x)¶ Generates the main lobe of a Blackman-Harris window.
Parameters: x – bin positions to compute (real values)
Returns: y: main lobe of the spectrum of a Blackman-Harris window
-
smst.utils.window.
sinc
(x, N)¶ Generates the main lobe of a sinc function (Dirichlet kernel).
Parameters: - x – array of indexes to compute
- N – size of FFT to simulate
Returns: y: samples of the main lobe of a sinc function