The SA file format specification

SA
Filename extension: .sa
Shebangsat line (optional): #!/@ -tsa
Magic: (see shebangsat)
Encoding: text (UTF-8)
Type of format: audio signal
Application of: DA

Introduction

SA is a file format for the representation of sound and other sampled signals. SA can be used both for persistent storage and for transmitting a data stream of indefinite length. SA is an application of DA.

Why?

The computing world is riddled with sound file formats, most of which offer no technical advantage over their predecessors. Their raison d'être seems to rest on marketing and politics rather than technical innovation. In contrast, SA offers a set of features that is unique in the industry.

Benefits of SA over other sound file formats

Infinite dynamics. Samples can represent any amplitude. As far as the file format is concerned, no amount of summing will cause clipping. This removes the need for repetitive normalization of the signal. (You will of course still need to normalize before D/A conversion and before printing to a distribution medium.)

Infinite resolution. Samples can represent any resolution, no matter what the gain level. As far as the file format is concerned, there will never be rounding errors. This means that SA does not suffer, for example, from the distortion inherent in binary floating point representation. If your program performs fixed point calculation, 256-bit (or better) floating point calculation, or implements a true real number calculation engine, SA can store the results with no loss of accuracy.

Any sampling rate. Any sampling rate is possible. While it is usually advisable to stick to standard rates, signal processing research sometimes calls for working at odd sampling rates. SA will not get in the way of this.

Note that the first three features listed above mean that SA is able to represent sampled sound at a theoretically infinite quality. No other sound file format in common use has this capability.
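The effect of storing samples as decimal text rather than binary floating point can be illustrated with a small sketch. It assumes samples are written as plain decimal literals, as in this document's example file, and uses Python's fractions module as a stand-in for an exact calculation engine. The function name mix is hypothetical and not part of SA.

```python
from fractions import Fraction

def mix(samples):
    """Sum decimal sample strings exactly; nothing is clipped or rounded."""
    return sum((Fraction(s) for s in samples), Fraction(0))

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)                               # False
# The same values taken as decimal text add up exactly.
print(mix(["0.1", "0.2"]) == Fraction("0.3"))         # True
# Summing three near-full-scale samples simply yields a value above 1.
print(mix(["0.999878"] * 3) == Fraction("2.999634"))  # True
```

The summed value exceeds 1.0 without any loss, which is the sense in which the format itself never clips. Normalization is only needed at the point where the signal leaves SA.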

Unlimited length. The length of an SA file is limited only by the storage medium. In the case of a stream, it can be infinite.

Unlimited channels. Channel count is unlimited.

Adaptive sample size. The storage size of an individual sample adapts to resolution. A single SA file can mix samples of different resolution and hence different storage size. A zero sample (silence) occupies two bytes.

Unix savvy. With SA files, many edit operations can be implemented using standard Unix tools (awk, sed, dc, ...) and take just a couple of lines of program code.

Self documenting. The SA file is text-based, making the file header easily viewable and editable.
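As a rough illustration of how short such edit operations can be, the sketch below applies a gain change and a channel swap to a single frame line. It is written in Python rather than awk or sed, and it assumes one frame per line with whitespace-separated decimal samples, as in this document's example file. The function names are hypothetical. The arithmetic uses Python's decimal module, which is exact only while results fit within the context precision (28 significant digits by default).

```python
from decimal import Decimal

def apply_gain(frame_line, gain):
    """Scale every sample on one frame line by a decimal gain string."""
    scaled = (Decimal(s) * Decimal(gain) for s in frame_line.split())
    # format(..., "f") forces fixed-point text and avoids exponent notation.
    return " ".join(format(x, "f") for x in scaled)

def swap_channels(frame_line):
    """Reverse the channel order on one frame line (pure text edit)."""
    return " ".join(reversed(frame_line.split()))

print(apply_gain("0.141357 -0.121582", "0.5"))  # 0.0706785 -0.0607910
print(swap_channels("0.1 0.2"))                 # 0.2 0.1
```

Because the halved samples gain one extra digit, the output lines are one character longer per sample, which also shows the adaptive sample size at work.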

Costs of SA over other sound file formats

Some parsing needed. In SA, samples are stored in an open, text-based, variable length format. A program reading an SA file needs to parse each sample and convert it to the program's internal representation. This consumes extra CPU cycles compared with reading samples that are already stored in, say, IEEE 754 binary floating point (for a program that uses IEEE 754 for calculation).

Takes more space. An SA file takes approximately 2.5 times as much space as a file format that stores samples in a processor specific floating point representation.

Given that disk space, I/O bandwidth and even CPU power are no longer considered primary issues in sound processing, the interchangeability of the open format and the benefits listed above should more than make up for these costs. And where these costs, for whatever reason, cannot be borne, SA can still serve as an unparalleled interchange format between otherwise incompatible processing environments.
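The space estimate can be checked roughly against sample lines of the kind shown in this document's example file. The sketch below makes two assumptions that are not stated in the specification: the binary format being compared against uses 4-byte (32-bit) floats, and a silent sample is written as a single "0" followed by one separator, which is one reading of the two-byte figure given for a zero sample. The function name is hypothetical.

```python
import struct

def text_bytes_per_sample(frame_lines):
    """Average stored bytes per sample, counting separators and newlines."""
    total_bytes = sum(len(line.encode("utf-8")) + 1 for line in frame_lines)
    total_samples = sum(len(line.split()) for line in frame_lines)
    return total_bytes / total_samples

FLOAT32_BYTES = struct.calcsize("<f")  # 4 bytes, assumed comparison format

positive = text_bytes_per_sample(["0.141357 0.141357"])    # 9.0
negative = text_bytes_per_sample(["-0.121582 -0.121582"])  # 10.0
silence = text_bytes_per_sample(["0 0"])                   # 2.0
print(positive / FLOAT32_BYTES, negative / FLOAT32_BYTES)  # 2.25 2.5
```

With six decimal places per sample, the ratio falls between 2.25 and 2.5, in line with the approximate figure quoted above. Samples written with more digits would push it higher.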

Basic structure

If the first byte of the file is a number sign (#), all bytes up to and including the first newline are ignored.
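A minimal sketch of this rule, operating on the raw bytes of a file. The function name is hypothetical, and the handling of a file that starts with a number sign but contains no newline is an assumption, since the rule above does not cover that case.

```python
def strip_shebangsat(data: bytes) -> bytes:
    """Drop the first line if the first byte is a number sign (#)."""
    if data[:1] != b"#":
        return data
    newline = data.find(b"\n")
    if newline == -1:
        # Assumption: with no newline at all, the whole input is ignored.
        return b""
    # Everything up to and including the first newline is ignored.
    return data[newline + 1:]

print(strip_shebangsat(b"#!/@ -tsa\ntitle: x\n"))  # b'title: x\n'
print(strip_shebangsat(b"title: x\n"))             # b'title: x\n'
```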

T.B.S. (This document is under construction.)

All fields are optional with the exception of sampling-data.
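Since the structure is not yet fully specified, the following is only a sketch of how a reader might split a file into header fields and frames, based on the layout visible in this document's example file. It assumes "name: value" header lines, a sampling-data field introduced by a "<<" marker, one frame per line with whitespace-separated samples, and a closing line consisting of the marker alone. The closing line in particular is an assumption, as are the function and variable names.

```python
def parse_sa(text):
    """Split SA-like text into a header dict and a list of frames."""
    lines = text.splitlines()
    if lines and lines[0].startswith("#"):
        lines = lines[1:]  # skip the optional shebangsat line
    header, frames = {}, []
    it = iter(lines)
    for line in it:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "sampling-data" and value.startswith("<<"):
            marker = value[2:].strip()
            for data_line in it:
                if data_line.strip() == marker:
                    break  # assumed terminator line
                if data_line.strip():
                    # Keep samples as text so no precision is lost here.
                    frames.append(data_line.split())
        else:
            header[key] = value
    return header, frames
```

Samples are deliberately left as strings. The caller can then convert them to whatever internal representation it uses, exact or otherwise.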

Example file

#!/@ -tsa
title: Sine Wave (1Khz)
date: 2004-10-22T06:13
author: Timo Lehtinen
copyright:
version: 1
generator: awk
channel-count: 2
channels/1/name: Sine L
channels/1/function: stereo-left
channels/1/encoding: pcm-linear
channels/1/value-min: -1.0
channels/1/value-max: 1.0
channels/2/name: Sine R
channels/2/function: stereo-right
channels/2/encoding: pcm-linear
channels/2/value-min: -1.0
channels/2/value-max: 1.0
frame-count: 44100
sampling-rate: 44100
sampling-data: <<EOF
0.000000 0.000000
0.141357 0.141357
0.279968 0.279968
0.412903 0.412903
0.537537 0.537537
0.651367 0.651367
0.752136 0.752136
0.837769 0.837769
0.906555 0.906555
0.957092 0.957092
0.988403 0.988403
0.999878 0.999878
0.991272 0.991272
0.962708 0.962708
0.914795 0.914795
0.848450 0.848450
0.765137 0.765137
0.666382 0.666382
0.554260 0.554260
0.430969 0.430969
0.299072 0.299072
0.161072 0.161072
0.019897 0.019897
-0.121582 -0.121582
-0.260742 -0.260742
-0.394653 -0.394653
-0.520630 -0.520630
-0.636108 -0.636108
-0.738831 -0.738831
-0.826721 -0.826721
-0.897949 -0.897949
-0.951111 -0.951111
-0.985229 -0.985229
-0.999512 -0.999512
-0.993652 -0.993652
-0.967896 -0.967896
-0.922668 -0.922668
-0.858826 -0.858826
-0.777832 -0.777832
-0.681091 -0.681091
-0.570740 -0.570740
-0.448914 -0.448914
-0.317993 -0.317993
-0.180725 -0.180725
-0.039856 -0.039856

... removed 44015 lines ...

0.045959 0.045959
0.186768 0.186768
0.323792 0.323792
0.454346 0.454346
0.575745 0.575745
0.685608 0.685608
0.781616 0.781616
0.862000 0.862000
0.924988 0.924988
0.969421 0.969421
0.994324 0.994324
0.999268 0.999268
0.984131 0.984131
0.949219 0.949219
0.895203 0.895203
0.823242 0.823242
0.734680 0.734680
0.631348 0.631348
0.515381 0.515381
0.389038 0.389038
0.254822 0.254822
0.115540 0.115540
-0.026062 -0.026062
-0.167114 -0.167114
-0.304871 -0.304871
-0.436523 -0.436523
-0.559326 -0.559326
-0.670959 -0.670959
-0.769043 -0.769043
-0.851685 -0.851685
-0.917236 -0.917236
-0.964355 -0.964355
-0.992065 -0.992065
-0.999817 -0.999817
-0.987488 -0.987488
-0.955322 -0.955322
-0.903931 -0.903931
-0.834412 -0.834412
-0.748047 -0.748047
-0.646729 -0.646729