cs190c:more_detail · created 2008/01/31 17:47 by seh

===== Project 1: Specification of Required Functions =====

This section gives the specifications for the functions you are required to implement. The name and parameters of each function should not be changed. Although the specifications are long and may seem to call for a lot of code, the functions themselves are quite short. Be sure to understand and follow the conventions given next.

===== Conventions =====

The conventions below are used to define the functions in the specification.

For functions that modify a single array of sample data, the data parameter is always passed first. Next, any parameters that indicate quantities such as amplitude, duration, or frequency are listed. If a sampler function is required, it is always the last parameter and is optional (the default is to use a sine wave).

==== Parameter Names ====

  * **data** - Corresponds to input data and should be an array of samples.
  * **amp** - Corresponds to amplitude for waves. Amplitude should always be between 0 and 1, inclusive.
  * **dur** - Duration in seconds.
  * **sampler** - Function to be used to generate input samples.

In most cases, parameters that indicate a quantity are taken to be percentages, unless specifically noted otherwise or the parameter refers to a duration. For example, the amplitude refers to the percent of the maximum allowable amplitude.

==== Constants ====

  * **WAVE_MAX** - The maximum allowable value of a single sample.
  * **WAVE_MIN** - The minimum allowable value of a single sample.
  * **SAMPLE_FREQUENCY** - The sampling frequency, in Hertz.

==== Function Specifications ====

**scale_volume(data, factor)**

This function takes digital audio in array data and scales it by the value factor.
To accomplish this scaling, every element of the array data is multiplied by factor. This multiplication increases or decreases the volume of the wave (for factors greater than 1 or less than 1, respectively) and may produce values that are out of range. Out-of-range values need to be clipped:

  * If a scaled sample is greater than WAVE_MAX, set it to WAVE_MAX
  * If a scaled sample is less than WAVE_MIN, set it to WAVE_MIN

Note that factor may have a floating point value (e.g. 0.5), so the scaled value may not be an integer, as it needs to be. Use the built-in ''int'' function, which converts a float to an int by truncating the decimal places (rounding toward zero). For illustration:

  x = 3.5
  print(x, int(x))

prints:

  3.5 3

----

**normalize(data)**

This function maximizes the possible volume of the sound given in array data. This operation is similar to scale_volume(), except that the value of factor must now be determined so that the volume is maximized and no clipping is needed after scaling. You need to address the special case where all samples are 0.

The following functions may be helpful:

  max(a) # returns the largest element of a
  min(a) # returns the smallest element of a
  abs(x) # returns the absolute value of x

----

**echo(data, delay, level)**

This function takes as parameters a sound array data, a delay in seconds, and an echo level, and it returns a new sound array. The generated array will be longer than array data.

Remember that the delay given in seconds is to be translated into SAMPLE_FREQUENCY*delay samples (i.e., array locations). The level is a value between 0 and 1. The level indicates the intensity of the echo in proportion to the original sound; i.e., the original wave is scaled by a factor of (1 - level).

To create an echo effect, consider the original wave and a copy of the wave.
Now, shift the second wave forward by the number of seconds specified in delay. The two waves are then combined using the following weighting:

  result = (1-level)*orig_sample + level*echo_sample

The pieces of the waves that do not overlap are matched by silence (0-valued samples). This technique produces a wave that contains both the original sound and a copy of it shifted by some amount of time: an echo. Generally, the echo is quieter than the original sound (level < .5), but that's not required.

Useful functions:

  zeros(n)    # return an array of n zeros.
  append(a,b) # create a new array by appending b to a.

**NOTE:** Remember that delay may be a floating point number, so make sure that the number of samples in the delay is computed as an integer.

----

**sin_sample(freq, amp, dur)**

This function returns a sine wave with frequency freq, amplitude amp, and duration dur (in seconds).

The amplitude should be specified in the range 0 to 1, where 1 represents the maximum amplitude. To obtain the actual amplitude of the samples, scale amp by the WAVE_MAX value.

----

**combine(list_of_sounds)**

This function takes a list of sounds (i.e., a list of arrays) and combines them into a single sound. All sounds must have the same length. The function returns the created sound in a new array.

To calculate the combined sound, at each sample position take the sum of the samples from all of the sounds and divide it by the number of sounds (i.e., compute the mean).

----

**silence(dur)**

Return an array consisting of dur seconds of silence.

----

**equal_scale(freq, amp, dur, sampler=sin_sample)**

Return an ascending chromatic scale starting on freq where each tone is dur seconds long. A short silence should be inserted between notes.
The function sampler may optionally be passed into the function to determine which waveform should be used (if none is passed, the sin_sample function you wrote earlier is used).

A chromatic scale is a 12-tone scale. In equal temperament, each note is equally spaced up the octave. An octave is reached when the frequency doubles. It follows, then, that the i-th note of the scale is

  f(i) = freq*2^(i/12)

Note that, despite being a 12-tone scale, the last note is a repeat of the first, but an octave higher. So, in total there will be 13 tones (i = 0 through 12).

Useful functions:

  a = append(a,b) # append array b to a and assign the result to a

----

**split_on_silences(data, len_thresh=.25, vol_thresh=.1)**

This function splits the sound samples stored in array data into a list of arrays, each containing a segment of sound from the original array. The segments are divided on their silences. A silence is defined as a segment of the sound, lasting at least len_thresh seconds, in which the absolute value of every sample is less than or equal to vol_thresh multiplied by the maximum sample value. Silences should not be included in the results.

Depending on the wave chosen, len_thresh and vol_thresh may require some tweaking. However, the default values should work in the majority of situations.
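
The silence rule above can be sketched as a single pass over the samples. This is only one possible approach, not the required solution, and it rests on several assumptions that are not in the specification: samples are held in a plain Python list rather than an array, "the maximum sample value" is read as WAVE_MAX, and the values 44100 and 32767 are placeholders for the course-provided constants.

```python
# Placeholder values (assumptions); use the constants supplied with the project.
SAMPLE_FREQUENCY = 44100
WAVE_MAX = 32767


def split_on_silences(data, len_thresh=.25, vol_thresh=.1):
    """Return the non-silent segments of data as a list of lists."""
    quiet_level = vol_thresh * WAVE_MAX            # samples at or below this are quiet
    min_run = int(len_thresh * SAMPLE_FREQUENCY)   # silence length, in samples

    segments = []
    seg_start = 0      # index where the current sound segment began
    run_start = None   # index where the current quiet run began, if any

    for i, sample in enumerate(data):
        if abs(sample) <= quiet_level:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                # The quiet run was long enough to count as a silence:
                # close the segment before it and start a new one here.
                if run_start > seg_start:
                    segments.append(data[seg_start:run_start])
                seg_start = i
            run_start = None

    # Drop a trailing silence, then keep whatever sound remains.
    end = len(data)
    if run_start is not None and end - run_start >= min_run:
        end = run_start
    if end > seg_start:
        segments.append(data[seg_start:end])
    return segments
```

In this sketch, quiet runs shorter than len_thresh seconds stay inside the surrounding segment, so brief dips in volume do not split a sound. Leading and trailing silences are dropped, and a sound that is silent throughout yields an empty list.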

cs190c/more_detail.txt · Last modified: 2008/01/31 17:47 (external edit)        