cs190c:more_detail [2008/01/31 17:47] (current)
 +===== Project 1: Specification of Required Functions =====
  
 +This section gives the specifications for the functions you are required to implement. The name and parameters of each function should not be changed. Although the specifications are long and may seem to call for a lot of code, the functions themselves are quite short. Be sure to understand and follow the conventions given next.
 +
 +
 +
 +===== Conventions =====
 +
 +The conventions below are used to define the functions in the specification. ​
 +
 +For functions that modify a single array of sample data, the data parameter is always passed first. Parameters that indicate quantities such as amplitude, duration, or frequency come next. If a sampler function is required, it is always the last parameter and is optional (the default is to use a sine wave).
 +
 +==== Parameter Names ====
 +  * **data** - Corresponds to input data and should be an array of samples.
 +  * **amp** - Corresponds to amplitude for waves. Amplitude should always be between 0 and 1, inclusive.
 +  * **dur** - Duration in seconds
 +  * **sampler** - Function to be used to generate input samples.
 +
 +In most cases, parameters that indicate a quantity are taken to be fractions of a maximum (percentages written as values between 0 and 1), unless specifically noted otherwise or the parameter refers to a duration. For example, amp is the fraction of the maximum allowable amplitude.
 +
 +
 +==== Constants ====
 +  * **WAVE_MAX** - The maximum allowable value of a single sample.
 +  * **WAVE_MIN** - The minimum allowable value of a single sample.
 +  * **SAMPLE_FREQUENCY** - The sampling frequency, in Hertz.
 +
 +==== Function Specifications ==== 
 +
 +**scale_volume(data,​ factor)**
 +
 +This function takes digital audio in array data and scales it 
 +by the value factor. To accomplish this scaling, every element of the array data
 +is multiplied by factor. This multiplication increases or decreases the volume of the wave (corresponding to factors
 +greater than 1 or less than 1, respectively) and may result in values that are out of range. Values out of range need to be clipped: ​
 +  * If a scaled sample is greater than WAVE_MAX, set it to WAVE_MAX
 +  * If a scaled sample is less than WAVE_MIN, set it to WAVE_MIN
 +
 +Note that factor may be a floating-point value (e.g. 0.5), so the scaled value may not be an integer, as a sample must be.
 +Use the built-in ''int'' function, which converts a float to an int by discarding the fractional part. For illustration:
 +
 +  x = 3.5
 +  print x,int(x)
 +
 +prints:
 +
 +  3.5 3
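
The scaling, truncation, and clipping steps described above can be put together roughly as follows. This is only a sketch under stated assumptions: plain Python lists stand in for the sample arrays, the function returns a new list rather than changing data in place, and the WAVE_MAX and WAVE_MIN values shown are assumed 16-bit limits, not values taken from the project.

```python
# Assumed 16-bit limits, for illustration only; use the project's constants.
WAVE_MAX = 32767
WAVE_MIN = -32768


def scale_volume(data, factor):
    """Return a new list with every sample scaled by factor and clipped."""
    result = []
    for sample in data:
        scaled = int(sample * factor)  # int() discards the fractional part
        # Clip values that fall outside the allowable range.
        if scaled > WAVE_MAX:
            scaled = WAVE_MAX
        elif scaled < WAVE_MIN:
            scaled = WAVE_MIN
        result.append(scaled)
    return result
```

For example, with these assumed limits, scale_volume([20000, -20000, 3], 2) clips the first two samples to WAVE_MAX and WAVE_MIN and doubles the third.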
 +
 +----
 +
 +**normalize(data)**
 +
 +This function maximizes the possible volume of the sound given in array data. This operation is similar
 +to scale_volume(), except that the scaling factor is not passed in: it must be determined so that the volume is maximized and no clipping is needed after scaling. You need to address the special case where all samples are 0.
 +
 +The following functions may be helpful:
 +
 +  max(a) # returns the largest element of a
 +  min(a) # returns the smallest element of a
 +  abs(x) # returns the absolute value of x
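
One possible way to determine the factor is to find the largest absolute sample value (the peak) and scale so that the peak lands exactly on WAVE_MAX. The sketch below assumes plain Python lists, an assumed 16-bit WAVE_MAX, and a new list as the return value; none of these details come from the project itself.

```python
# Assumed 16-bit limit, for illustration only; use the project's constant.
WAVE_MAX = 32767


def normalize(data):
    """Return a copy of data scaled so that its loudest sample reaches WAVE_MAX."""
    if len(data) == 0:
        return []
    peak = max(abs(max(data)), abs(min(data)))
    if peak == 0:
        # Special case: all samples are 0, so there is nothing to scale.
        return list(data)
    # The factor is WAVE_MAX / peak. Multiplying before dividing keeps the
    # loudest sample at exactly WAVE_MAX despite floating-point rounding.
    return [int(sample * WAVE_MAX / float(peak)) for sample in data]
```

Because no sample exceeds the peak in absolute value, no scaled sample can exceed WAVE_MAX, so no clipping step is needed.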
 +
 +----
 +
 +**echo(data,​ delay, level)**
 +
 +This function takes as parameters a sound array data, a delay in seconds,
 +and an echo level and it returns a new sound array. ​ The generated array will be longer than array data. 
 +
 +Remember that the delay given in seconds is to be translated into SAMPLE_FREQUENCY*delay samples (i.e., array locations).  ​
 +The level is a value between 0 and 1. The level indicates the intensity
 +of the echo in proportion to the original sound; i.e., the
 +original wave is scaled by a factor of (1 - level).
 +
 +To create an echo effect, consider the original wave and a copy of the wave. Now, shift the second wave forward by the number of seconds specified in variable delay. The two waves are then combined using the following weighting:
 +
 +  result = (1-level)*orig_sample + level*echo_sample
 +
 +The pieces of the waves that do not overlap are matched by
 +silence (0-valued samples). The result contains both the original sound and a time-shifted copy of it, which is heard as an echo. Generally, the echo is quieter than the original sound (level < 0.5), but that's not required.
 +
 +Useful functions:
 +
 +  zeros(n) # return an array of n zeros.
 +  append(a,b) # create a new array by appending b to a.
 +
 +**NOTE:** Remember that delay may be a floating point number, so make sure that the number of samples in the delay is converted to an integer.
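
A minimal sketch of the echo construction described in this section, assuming plain Python lists in place of arrays (so list concatenation plays the role of zeros and append) and an assumed SAMPLE_FREQUENCY of 44100 Hz:

```python
# Assumed sampling rate, for illustration only; use the project's constant.
SAMPLE_FREQUENCY = 44100


def echo(data, delay, level):
    """Return a new list mixing data with a copy of itself delayed by delay seconds."""
    shift = int(SAMPLE_FREQUENCY * delay)  # delay expressed as whole samples
    padding = [0] * shift
    original = list(data) + padding  # silence where only the echo is heard
    delayed = padding + list(data)   # silence before the echo starts
    # Weighted sum of the two equally long waves, sample by sample.
    return [int((1 - level) * o + level * e)
            for o, e in zip(original, delayed)]
```

The result is longer than data by the number of samples in the delay, which matches the requirement that the generated array be longer than the input.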
 +
 +----
 +
 +**sin_sample(freq,​ amp, dur)**
 +
 +This function returns a sine wave with frequency freq, amplitude amp, and duration dur (in seconds). ​
 +
 +The amplitude is specified in the range 0 to 1, where 1 represents the maximum amplitude.
 +To obtain the actual amplitude of the samples, multiply amp by WAVE_MAX.
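
As a rough illustration, sample i of a sine wave can be computed from the time i / SAMPLE_FREQUENCY. The sketch below assumes plain Python lists, freq given in Hertz, and assumed values for SAMPLE_FREQUENCY and WAVE_MAX; the project's own constants should be used instead.

```python
import math

# Assumed values, for illustration only; use the project's constants.
SAMPLE_FREQUENCY = 44100
WAVE_MAX = 32767


def sin_sample(freq, amp, dur):
    """Return dur seconds of a sine wave at freq Hz with relative amplitude amp."""
    count = int(SAMPLE_FREQUENCY * dur)  # number of samples to generate
    peak = amp * WAVE_MAX                # actual amplitude of the samples
    return [int(peak * math.sin(2 * math.pi * freq * i / SAMPLE_FREQUENCY))
            for i in range(count)]
```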
 +
 +
 +----
 +
 +**combine(list_of_sounds)**
 +
 +This function takes a list consisting of sound files (i.e., a list of arrays) and combines them into a 
 +single sound. All sound files must have the same
 +length. The function returns the created sound in a new array.
 +
 +To calculate the combined sound, add up the samples at each position across all of the waves
 +and divide each sum by the number of waves (i.e., compute the mean at each position).
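
The position-by-position mean could be sketched like this, assuming each sound is a plain Python list and that the result should hold integer samples (obtained here with int):

```python
def combine(list_of_sounds):
    """Return a new list holding the mean of the sounds at each sample position."""
    count = len(list_of_sounds)
    # zip(*list_of_sounds) groups together the samples that share a position.
    return [int(sum(samples) / float(count))
            for samples in zip(*list_of_sounds)]
```

For instance, combine([[10, 20], [30, -20]]) averages 10 with 30 and 20 with -20.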
 +
 +----
 +
 +**silence(dur)**
 +
 +Return an array consisting of dur seconds of silence.
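
A short sketch, assuming a plain Python list as the array and an assumed SAMPLE_FREQUENCY of 44100 Hz; the only real work is converting seconds to a whole number of samples:

```python
# Assumed sampling rate, for illustration only; use the project's constant.
SAMPLE_FREQUENCY = 44100


def silence(dur):
    """Return dur seconds of 0-valued samples."""
    return [0] * int(SAMPLE_FREQUENCY * dur)
```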
 +
 +----
 +
 +**equal_scale(freq,​ amp, dur, sampler=sin_sample)**
 +
 +Return an ascending chromatic scale starting on freq where
 +each tone is dur seconds long. A short silence should be inserted between
 +notes. The function sampler may optionally be passed in to
 +determine which waveform should be used (if none is passed, the sin_sample function you wrote earlier is used).
 +
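 +A chromatic scale is a 12-tone scale. In equal temperament, the
 +notes are equally spaced up the octave: each note's frequency is the previous note's multiplied by the same ratio. An octave is reached when
 +the frequency doubles. It follows, then, that the i-th note of the
 +scale (counting the starting note as i = 0) is given by the following, where ^ denotes exponentiation:
 +
 +  f(i) = freq * 2^(i/12)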
 +
 +Note that, despite being a 12-tone scale, the last note is a repeat
 +of the first, but an octave higher. So, in total there will be 13 tones.
 +
 +Useful functions:
 +
 +  a = append(a,b) # append array b to a and assign the result to a
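
Putting the pieces together, the scale could be built along these lines. This is a sketch under several assumptions: plain Python lists replace arrays, the constants are assumed values, the sin_sample and silence helpers are simplified stand-ins for the functions specified earlier, and GAP_SECONDS is a hypothetical name for the length of the "short silence", which the specification does not fix.

```python
import math

# Assumed values, for illustration only; use the project's constants.
SAMPLE_FREQUENCY = 44100
WAVE_MAX = 32767
GAP_SECONDS = 0.05  # hypothetical length of the silence between notes


def sin_sample(freq, amp, dur):
    """Simplified stand-in for the sin_sample function specified earlier."""
    count = int(SAMPLE_FREQUENCY * dur)
    return [int(amp * WAVE_MAX * math.sin(2 * math.pi * freq * i / SAMPLE_FREQUENCY))
            for i in range(count)]


def silence(dur):
    """Simplified stand-in for the silence function specified earlier."""
    return [0] * int(SAMPLE_FREQUENCY * dur)


def equal_scale(freq, amp, dur, sampler=sin_sample):
    """Return 13 ascending equal-tempered tones separated by short silences."""
    result = []
    for i in range(13):  # 12 steps plus the repeat of the first note an octave up
        if i > 0:
            result += silence(GAP_SECONDS)  # gap only between notes
        note_freq = freq * 2 ** (i / 12.0)  # f(i) = freq * 2^(i/12)
        result += sampler(note_freq, amp, dur)
    return result
```

Note that Python writes exponentiation as **, and that dividing by 12.0 rather than 12 avoids integer division in older Python versions.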
 +
 +----
 +
 +**split_on_silences(data,​ len_thresh=.25,​ vol_thresh=.1)**
 +
 +This function splits the sound samples stored in array data into a list of arrays, each containing a segment of sound from the original array. The segments are divided on their silences. A silence is a stretch of at least len_thresh seconds in which the absolute value of every sample is less than or equal to vol_thresh multiplied by the maximum sample value. Silences should not be included in the results.
 +
 +Depending on the wave chosen, len_thresh and vol_thresh may require some tweaking. However, the default values used should work in the majority of situations.
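
One way to approach the splitting is to walk through the samples while tracking the current run of quiet samples, and to decide whether that run counts as a silence only once a loud sample (or the end of the data) is reached. The sketch below is an illustration under assumptions, not the required solution: it uses plain Python lists, assumed constant values, and reads "the maximum sample value" as WAVE_MAX; quiet runs shorter than len_thresh are kept as part of the surrounding sound.

```python
# Assumed values, for illustration only; use the project's constants.
SAMPLE_FREQUENCY = 44100
WAVE_MAX = 32767


def split_on_silences(data, len_thresh=0.25, vol_thresh=0.1):
    """Return a list of sound segments from data, with silences removed."""
    min_run = max(1, int(SAMPLE_FREQUENCY * len_thresh))  # samples per silence
    limit = vol_thresh * WAVE_MAX  # assumption: threshold relative to WAVE_MAX
    segments = []
    current = []  # segment being built
    quiet = []    # quiet samples seen since the last loud sample
    for sample in data:
        if abs(sample) <= limit:
            quiet.append(sample)
            continue
        if len(quiet) >= min_run:
            # The quiet run was a silence: close the segment and drop the run.
            if current:
                segments.append(current)
            current = []
        else:
            # Too short to be a silence: it belongs to the sound.
            current.extend(quiet)
        quiet = []
        current.append(sample)
    if len(quiet) < min_run:
        current.extend(quiet)  # trailing quiet run too short to be a silence
    if current:
        segments.append(current)
    return segments
```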
 
cs190c/more_detail.txt · Last modified: 2008/01/31 17:47 (external edit)
 