Project 1: Specification of Required Functions

This section specifies the functions you are required to implement. Do not change the name or parameters of any function. The specifications may look long, but the functions themselves are quite short. Be sure to understand and follow the conventions given next.


The conventions below are used to define the functions in the specification.

For functions that modify a single array of sample data, the data parameter is always passed first. Next come any parameters that indicate quantities such as amplitude, duration, or frequency. If a sampler function is required, it is always the last parameter and is optional (the default is a sine wave).

Parameter Names

  • data - Corresponds to input data and should be an array of samples.
  • amp - Corresponds to amplitude for waves. Amplitude should always be between 0 and 1, inclusive.
  • dur - Duration in seconds.
  • sampler - Function to be used to generate input samples.

In most cases, parameters that indicate a quantity are taken to be fractions of a maximum (so a value of 1 means 100%), unless specifically noted otherwise or the parameter refers to a duration. For example, the amplitude is the fraction of the maximum allowable amplitude.

Constants


  • WAVE_MAX - The maximum allowable value of a single sample.
  • WAVE_MIN - The minimum allowable value of a single sample.
  • SAMPLE_FREQUENCY - The sampling frequency, in Hertz.

Function Specifications

scale_volume(data, factor)

This function takes digital audio in array data and scales it by the value factor. To accomplish this scaling, every element of the array data is multiplied by factor. This multiplication increases or decreases the volume of the wave (corresponding to factors greater than 1 or less than 1, respectively) and may result in values that are out of range. Values out of range need to be clipped:

  • If a scaled sample is greater than WAVE_MAX, set it to WAVE_MAX
  • If a scaled sample is less than WAVE_MIN, set it to WAVE_MIN

Note that factor may be a floating point value (e.g. 0.5), so the scaled value may not be an integer, although samples must be integers. Use the built-in int function, which converts a float to an int by truncating the decimal places. For illustration:

x = 3.5
print x,int(x)


3.5 3
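
As a rough sketch of the scaling and clipping steps described above: the code below uses a plain Python list in place of the course's array type, returns a new list rather than changing data in place, and assumes 16-bit limits for WAVE_MAX and WAVE_MIN. All three choices are assumptions, not part of the specification.

```python
# Assumed 16-bit limits; the course library supplies the real constants.
WAVE_MAX = 32767
WAVE_MIN = -32768


def scale_volume(data, factor):
    """Return a new list with every sample multiplied by factor and clipped."""
    scaled = []
    for sample in data:
        value = int(sample * factor)  # int() truncates toward zero
        if value > WAVE_MAX:
            value = WAVE_MAX          # clip samples that are too large
        elif value < WAVE_MIN:
            value = WAVE_MIN          # clip samples that are too small
        scaled.append(value)
    return scaled
```

Note that truncation happens before clipping here; since the limits are integers, the order does not change the result.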


This function maximizes the volume of the sound given in array data. The operation is similar to scale_volume(), except that the value of factor is now chosen so that the volume is as large as possible and no clipping is needed after scaling. You need to address the special case where all samples are 0.

The following functions may be helpful:

 max(a) # returns the largest element of a
 min(a) # returns the smallest element of a
 abs(x) # returns the absolute value of x
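
One possible way to put these helpers together is sketched below. The function name maximize_volume is hypothetical (use the name given in your assignment), the WAVE_MAX value is an assumed 16-bit limit, a plain list stands in for the array type, and data is assumed to be non-empty.

```python
WAVE_MAX = 32767  # assumed 16-bit limit; the course library supplies the real value


def maximize_volume(data):
    """Return a new list scaled so that the loudest sample reaches WAVE_MAX."""
    # The loudest sample may be the largest positive or the most negative one.
    peak = max(abs(max(data)), abs(min(data)))
    if peak == 0:
        # Special case: all samples are 0, so there is nothing to scale.
        return list(data)
    # Multiply before dividing so that the peak maps exactly onto WAVE_MAX.
    return [int(sample * WAVE_MAX / float(peak)) for sample in data]
```

The effective factor is WAVE_MAX / peak, so no sample can leave the allowed range and no clipping step is needed.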

echo(data, delay, level)

This function takes as parameters a sound array data, a delay in seconds, and an echo level, and it returns a new sound array. The generated array will be longer than array data by the number of samples in the delay.

Remember that the delay given in seconds is to be translated into SAMPLE_FREQUENCY*delay samples (i.e., array locations). The level is a value between 0 and 1 that indicates the intensity of the echo in proportion to the original sound; i.e., the echo is scaled by a factor of level and the original wave by a factor of (1 - level).

To create an echo effect, consider the original wave and a copy of the wave. Now, shift the second wave forward by the number of seconds specified in variable delay. The two waves are then combined using the following weighting:

result = (1-level)*orig_sample + level*echo_sample

The pieces of the waves that do not overlap are matched by silence (0-valued samples). This technique produces a wave that contains both the original sound and the same sound shifted by some amount of time: an echo. Generally, the echo is quieter than the original sound (level < .5), but that is not required.

Useful functions:

zeros(n) # return an array of n zeros.
append(a,b) # create a new array by appending b to a.

NOTE: Remember that delay may be a floating point number, so make sure that the number of samples you calculate for the delay is an integer.
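
The padding and weighting described above might be sketched as follows. Plain lists and list concatenation stand in for the array type and for zeros() and append(), and the SAMPLE_FREQUENCY value is an assumption.

```python
SAMPLE_FREQUENCY = 44100  # assumed sampling rate in Hertz


def echo(data, delay, level):
    """Return a new list holding data mixed with a delayed copy of itself."""
    shift = int(SAMPLE_FREQUENCY * delay)  # delay as a whole number of samples
    padding = [0] * shift
    original = list(data) + padding        # original sound, then silence
    delayed = padding + list(data)         # silence, then the shifted copy
    # Both lists now have the same length, so they can be combined pairwise.
    return [int((1 - level) * o + level * e)
            for o, e in zip(original, delayed)]
```

For example, echo(data, 0.5, 0.25) returns a list that is 22050 samples longer than data under the assumed sampling rate.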

sin_sample(freq, amp, dur)

This function returns a sine wave with frequency freq, amplitude amp, and duration dur (in seconds).

The amplitude should be specified in the range 0 to 1, where 1 represents the maximum amplitude. To obtain the actual amplitude of the samples, multiply amp by WAVE_MAX.
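
A minimal sketch of such a sampler is shown below. It assumes that sample n is taken at time n / SAMPLE_FREQUENCY seconds, that samples are truncated to integers, and that the constants have the values shown; none of these details are fixed by the specification.

```python
import math

WAVE_MAX = 32767          # assumed 16-bit limit
SAMPLE_FREQUENCY = 44100  # assumed sampling rate in Hertz


def sin_sample(freq, amp, dur):
    """Return dur seconds of a sine wave with frequency freq and amplitude amp."""
    count = int(SAMPLE_FREQUENCY * dur)  # number of samples to generate
    peak = amp * WAVE_MAX                # amp is a fraction of the maximum
    return [int(peak * math.sin(2 * math.pi * freq * n / SAMPLE_FREQUENCY))
            for n in range(count)]
```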


This function takes a list of sounds (i.e., a list of arrays) and combines them into a single sound. All sounds must have the same length. The function returns the created sound in a new array.

To calculate the combined sound, at each sample position take the sum of the samples from all waves and divide by the number of waves (i.e., compute the mean).
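
The per-position mean can be sketched as below. The function name mix_waves is hypothetical (use the name given in your assignment), plain lists stand in for arrays, and truncating each mean to an integer is an assumption.

```python
def mix_waves(waves):
    """Return a new list holding the sample-by-sample mean of all waves."""
    count = len(waves)
    length = len(waves[0])
    for wave in waves:
        if len(wave) != length:
            raise ValueError("all waves must have the same length")
    # For each position, sum across the waves and divide by how many there are.
    return [int(sum(wave[i] for wave in waves) / float(count))
            for i in range(length)]
```

Because each result is a mean of in-range samples, it stays in range and no clipping is needed.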


Return an array consisting of dur seconds of silence.

equal_scale(freq, amp, dur, sampler=sin_sample)

Return an ascending chromatic scale starting on freq, where each tone is dur seconds long. A short silence should be inserted between notes. The function sampler may optionally be passed in to determine which waveform is used (if none is passed, the sin_sample function you wrote earlier is used).

A chromatic scale is a 12-tone scale. In equal temperament, each note is equally spaced up the octave. An octave is reached when the frequency doubles. It follows, then, that the frequency of the i-th note of the scale (counting from i = 0) is

f(i) = freq*2^(i/12)

Note that, despite being a 12-tone scale, the last note is a repeat of the first, but an octave higher. So, in total there will be 13 tones.

Useful functions:

a = append(a,b) # append array b to a and assign the result to a
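
Putting the formula and the 13-tone count together, one possible sketch follows. To keep it self-contained it includes a small sin_sample and a hypothetical make_silence helper. The gap length GAP_SECONDS is an assumed value, since the specification only asks for a short silence, and the constants are assumed as well.

```python
import math

WAVE_MAX = 32767          # assumed 16-bit limit
SAMPLE_FREQUENCY = 44100  # assumed sampling rate in Hertz
GAP_SECONDS = 0.05        # hypothetical length of the silence between notes


def sin_sample(freq, amp, dur):
    """Return dur seconds of a sine wave (minimal stand-in for your own)."""
    count = int(SAMPLE_FREQUENCY * dur)
    return [int(amp * WAVE_MAX * math.sin(2 * math.pi * freq * n / SAMPLE_FREQUENCY))
            for n in range(count)]


def make_silence(dur):
    """Return dur seconds of 0-valued samples (hypothetical helper)."""
    return [0] * int(SAMPLE_FREQUENCY * dur)


def equal_scale(freq, amp, dur, sampler=sin_sample):
    """Return 13 ascending equal-tempered tones separated by short silences."""
    scale = []
    for i in range(13):                     # 12 tones plus the octave
        note_freq = freq * 2 ** (i / 12.0)  # f(i) = freq*2^(i/12)
        scale.extend(sampler(note_freq, amp, dur))
        if i < 12:                          # silence between notes only
            scale.extend(make_silence(GAP_SECONDS))
    return scale
```

Any function with the same three parameters can be passed as sampler, for example equal_scale(440, 0.5, 0.3, my_square_wave), where my_square_wave is a function of your own.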

split_on_silences(data, len_thresh=.25, vol_thresh=.1)

This function splits the sound samples stored in array data into a list of arrays, each containing a segment of sound from the original array. The segments are divided on their silences. A silence is a stretch of the sound, at least len_thresh seconds long, in which the absolute value of every sample is less than or equal to vol_thresh multiplied by the maximum sample value. Silences should not be included in the results.

Depending on the wave chosen, len_thresh and vol_thresh may require some tweaking. However, the default values used should work in the majority of situations.
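
One way to approach the scan is sketched below. It assumes that the "maximum sample value" means the largest absolute value found in data (it could also be read as WAVE_MAX), that quiet stretches shorter than len_thresh remain part of the surrounding segment, and that SAMPLE_FREQUENCY has the value shown. Plain lists stand in for arrays.

```python
SAMPLE_FREQUENCY = 44100  # assumed sampling rate in Hertz


def split_on_silences(data, len_thresh=.25, vol_thresh=.1):
    """Return a list of sound segments, dropping long quiet stretches."""
    if len(data) == 0:
        return []
    limit = vol_thresh * max(abs(s) for s in data)  # loudest "quiet" value
    min_run = int(len_thresh * SAMPLE_FREQUENCY)    # shortest silence, in samples
    segments = []
    seg_start = 0  # index where the current sound segment begins
    i = 0
    n = len(data)
    while i < n:
        if abs(data[i]) <= limit:
            # Find the end of this run of quiet samples.
            j = i
            while j < n and abs(data[j]) <= limit:
                j += 1
            if j - i >= min_run:
                # Long enough to count as a silence: close the segment before it.
                if i > seg_start:
                    segments.append(data[seg_start:i])
                seg_start = j
            i = j
        else:
            i += 1
    if seg_start < n:
        segments.append(data[seg_start:])  # sound left after the last silence
    return segments
```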

cs190c/more_detail.txt · Last modified: 2008/01/31 17:47 (external edit)