pynaviz.audiovideo.AudioHandler
- class pynaviz.audiovideo.AudioHandler(audio_path, stream_index=0, time=None)
Bases: BaseAudioVideoHandler
Handler for reading and decoding audio frames from a file.
This class uses PyAV to access audio frames from a given file. It allows querying time-aligned audio samples between two timepoints, and provides audio shape and total length information.
- Parameters:
audio_path (str | pathlib.Path) – Path to the audio file.
stream_index (int) – Index of the audio stream to decode (default is 0).
time (Optional[NDArray]) – Optional 1D time axis to associate with the samples. Must match the number of sample points in the audio file.
- Raises:
ValueError – If the provided time axis is not 1D or does not match the number of sample points in the audio file.
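The constraint on time can be illustrated with a minimal sketch in plain Python. The helpers make_time_axis and check_time_axis are hypothetical names used only for illustration and are not part of pynaviz; the check mirrors the documented rule (1D, one timestamp per sample) rather than the library's actual implementation.

```python
from numbers import Real


def make_time_axis(num_samples, sample_rate, t0=0.0):
    """Return one timestamp (in seconds) per sample, starting at t0.

    Hypothetical helper: assumes a constant sample rate.
    """
    return [t0 + i / sample_rate for i in range(num_samples)]


def check_time_axis(time, num_samples):
    """Raise ValueError unless `time` is flat (1D) and has num_samples entries.

    Hypothetical helper mirroring the documented constraint on `time`.
    """
    # A nested sequence (e.g. a list of lists) is not a 1D time axis.
    if not all(isinstance(t, Real) for t in time):
        raise ValueError("time axis must be 1D")
    if len(time) != num_samples:
        raise ValueError(
            f"time axis has {len(time)} points, expected {num_samples}"
        )
```

A time axis built this way for the file's sample count and sample rate would satisfy the documented requirement; an axis of any other length, or a nested one, would be rejected.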
Examples
>>> from pynaviz.audiovideo import AudioHandler
>>> ah = AudioHandler("example.mp3")
>>> # Get audio samples between 1.5 and 2.5 seconds.
>>> audio_trace = ah.get(1.5, 2.5)
>>> # Shape: (n_samples, n_channels)
>>> audio_trace.shape
(44100, 2)
Methods
__init__(audio_path[, stream_index, time])
close() – Close the audiovideo stream.
get(start, end) – Extract decoded audio samples between two timestamps.
Attributes
index – Time axis corresponding to the audio samples.
shape – Shape of the audio data.
t – Time axis corresponding to the audio samples.
Total duration of the audio in seconds.
- close()
Close the audiovideo stream.
- get(start, end)
Extract decoded audio samples between two timestamps.
This method decodes and returns the raw audio samples corresponding to the time interval [start, end] in seconds. Decoding begins from the nearest keyframe at or before start, and proceeds sequentially until the end timestamp is reached or exceeded. If the last decoded frame extends beyond end, trailing samples are truncated so that the returned array aligns with the requested time range.
- Parameters:
start – Start of the requested interval, in seconds.
end – End of the requested interval, in seconds.
- Returns:
A 2D NumPy array containing the decoded audio for the requested interval. The first dimension corresponds to time (samples) and the second to audio channels, giving a shape of (num_samples, num_channels).
- Return type:
numpy.typing.NDArray
Notes
The returned frames are decoded in sequence and concatenated before being transposed so that time is the first dimension.
If end falls between two frames, the last frame is partially trimmed to match the requested duration.
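The decode, concatenate, and trim behaviour described in these notes can be sketched without PyAV. The function trim_frames and its frame representation (a list of (first_sample_index, samples) pairs) are hypothetical and stand in for decoded audio frames; the half-open [start, end) convention at sample resolution is an assumption, chosen to be consistent with the example above, where one second of 44.1 kHz audio yields 44100 samples.

```python
def trim_frames(frames, sample_rate, start, end):
    """Concatenate samples from consecutive frames, keeping only [start, end).

    frames: ordered list of (first_sample_index, samples) pairs, a
        hypothetical stand-in for decoded audio frames.
    sample_rate: samples per second.
    start, end: requested interval in seconds.
    """
    lo = round(start * sample_rate)  # first sample index to keep
    hi = round(end * sample_rate)    # one past the last sample index to keep
    out = []
    for first_index, samples in frames:
        if first_index + len(samples) <= lo:
            continue  # frame ends before the requested interval
        if first_index >= hi:
            break     # frame starts after the interval; stop decoding
        # Trim the leading part of the first frame and the trailing
        # part of the last frame; middle frames are kept whole.
        a = max(lo - first_index, 0)
        b = min(hi - first_index, len(samples))
        out.extend(samples[a:b])
    return out
```

With three 4-sample frames at 4 Hz, requesting 0.5 to 2.25 seconds keeps the second half of the first frame, all of the second frame, and one sample of the third.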
See also
- av.AudioFrame <https://pyav.org/docs/stable/api/audio.html#module-av.audio.frame>
The PyAV frame object.
- property index: numpy.typing.NDArray
Time axis corresponding to the audio samples.
- Returns:
Array of timestamps with shape (num_samples,).
- property shape: Tuple[int, int]
Shape of the audio data.
- Returns:
Tuple (num_samples, num_channels) describing the audio shape.
- property t: numpy.typing.NDArray
Time axis corresponding to the audio samples.
- Returns:
Array of timestamps with shape (num_samples,).
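Because the time axis holds one timestamp per sample, it can be used to locate the sample indices covering an interval. The sketch below assumes the timestamps are sorted in increasing order and uses a plain Python list in place of the NumPy array; index_range is a hypothetical helper, not a pynaviz function.

```python
from bisect import bisect_left


def index_range(t, start, end):
    """Return (i0, i1) such that t[i0:i1] covers start <= t < end.

    Assumes `t` is sorted in increasing order (hypothetical helper).
    """
    return bisect_left(t, start), bisect_left(t, end)


t = [0.0, 0.25, 0.5, 0.75, 1.0]  # 4 Hz time axis with 5 samples
i0, i1 = index_range(t, 0.25, 0.75)
```

Here i0 and i1 select the samples at 0.25 and 0.5 seconds; the same indices would slice the first dimension of an array of shape (num_samples, num_channels).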