Friday, 19 October 2012

Week 4: 19/10/2012


In today’s lecture we discussed the human ear and how it processes sound.

To start off we discussed the structure of the human ear. Below is a sectional view of the human ear.


What does the human ear do?
The ear receives sound and converts it into signals that our brain can understand, similar to how a computer converts sound into binary. The sound travels through the ear canal to the eardrum, which then vibrates. These vibrations set the ossicles moving, and they pass the vibrations on to the cochlea.

The cochlea then sorts the sound by frequency (high, medium or low) using the small hair cells within it. High frequencies are picked up at the beginning of the cochlea as they die off more quickly than lower ones, which are picked up further along. The structure of the cochlea is shown below:



The frequency response of the cochlea is shown below:



The process from hearing a sound to it reaching the auditory nerve is shown below:



A few features of the auditory process are that:
- It separates the left and right ear signals.
- It separates low and high frequency information.
- It also separates timing from intensity information.
- It produces a two-channel set of time-domain signals in contiguous, non-linearly spaced frequency bands.
- At various specialised processing centres in the hierarchy, it can re-integrate and redistribute this information.

Different animals have different audible frequency ranges. For example, bats can hear such high frequencies that they use their hearing to map out their surroundings, but humans do not have this ability as our hearing range is lower. Below is a graph showing the ranges for a few animals:



We discussed “normal” hearing in humans and the ranges that we have. Below is a slide from the lecture showing these.



The MPEG/MP3 audio coding process uses lossy compression. This is where data that a listener would not perceive, even if it were kept, is discarded to save space. It also uses a psychoacoustic model, which is a model of human hearing, to decide what can be discarded. Below is a diagram of the process:



During the lab we used Soundbooth to edit a sound file so we could gain some knowledge of how the software works.

We then tried out some effects on the file. The following notes cover what each effect did to the file, with an explanation of what it does (for future reference):

Analogue Delay: This effect creates both echoes and more subtle effects on the track.
Delays of 35 milliseconds or more create discrete echoes.
Delays of 15–35 milliseconds create a simple chorus or flanging effect. (The results won’t be as effective as the Chorus/Flanger effect, because the delay settings don’t change over time.)
Further reducing a delay to 10–15 milliseconds adds stereo depth to a mono sound.
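
To see what the delay time actually does, here is a minimal sketch of a single-tap delay in Python. It is not how Soundbooth implements the effect; the function name `apply_delay`, the `mix` parameter and the 1000 Hz sample rate are assumptions made up for this example.

```python
def apply_delay(samples, sample_rate, delay_ms, mix=0.5):
    """Mix one delayed copy of the signal back into the original."""
    delay = int(sample_rate * delay_ms / 1000)  # delay length in samples
    out = []
    for n, x in enumerate(samples):
        # Before the delay has elapsed there is nothing to mix in yet
        delayed = samples[n - delay] if n >= delay else 0.0
        out.append(x + mix * delayed)
    return out

# A single click followed by silence, at an assumed 1000 Hz sample rate
click = [1.0] + [0.0] * 99
echoed = apply_delay(click, sample_rate=1000, delay_ms=50)
# The click now appears twice: at sample 0 and, at half volume, at sample 50
```

With a 50 ms delay the copy is heard as a separate echo. Shortening `delay_ms` moves the copy closer to the original until the two blend together.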

Chorus/Flanger: This is a combination of two delay-based effects.
The chorus effect simulates several voices or instruments played at once by adding multiple short delays with a small amount of feedback.
This makes the edited track sound fuller and richer (like a chorus in a song).
Use this effect to enhance vocal tracks or add stereo spaciousness to mono audio.
The Flanger effect makes psychedelic, phase‑shifted sounds by mixing a varying, short delay with the original signal.
This makes the edited track sound like the pitch is being slid up and down, which creates the psychedelic feel.
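
The "varying, short delay" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions (a sine-wave sweep, no feedback, a hypothetical `flanger` function), not Soundbooth's actual algorithm.

```python
import math

def flanger(samples, sample_rate, max_delay_ms=5.0, rate_hz=0.5, mix=0.5):
    """Mix the signal with a copy whose delay sweeps between 0 and max_delay_ms."""
    max_delay = sample_rate * max_delay_ms / 1000  # longest delay in samples
    out = []
    for n, x in enumerate(samples):
        # Slow sine wave scaled to the range 0..1 controls the delay length
        lfo = 0.5 * (1 + math.sin(2 * math.pi * rate_hz * n / sample_rate))
        delay = int(round(lfo * max_delay))
        delayed = samples[n - delay] if n >= delay else 0.0
        out.append(x + mix * delayed)
    return out

# One second of a 220 Hz tone at an assumed 8000 Hz sample rate
tone = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
flanged = flanger(tone, sample_rate=8000)
```

The only difference from a plain delay is that the delay length changes over time, which is what gives the sweeping sound.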

Compressor: This effect will reduce the dynamic range, producing consistent volume levels and increasing perceived loudness.
Compression is particularly effective for voice-overs, because it helps the speaker stand out over musical soundtracks and background audio.
An example would be classical music, which usually isn’t heavily compressed and has dips in volume, whereas newer music is often heavily compressed and has a more consistent volume level.
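
As a rough sketch of what "reducing the dynamic range" means, the Python below turns down anything louder than a threshold. It is heavily simplified: real compressors follow the signal's level over time with attack and release settings, and the names `compress`, `threshold`, `ratio` and `makeup` are my own assumptions.

```python
def compress(samples, threshold=0.5, ratio=4.0, makeup=1.0):
    """Reduce the part of each sample above the threshold by the given ratio."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            # Only the amount over the threshold is scaled down
            level = threshold + (level - threshold) / ratio
        level *= makeup  # optional gain to bring the overall volume back up
        out.append(level if x >= 0 else -level)
    return out

quiet_and_loud = [0.1, 0.9, -0.9]
squashed = compress(quiet_and_loud)
# The quiet sample is unchanged; the loud ones are pulled down towards 0.6
```

Because the loud parts are now closer to the quiet parts, the whole track can be turned up with `makeup`, which is where the increase in perceived loudness comes from.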

Convolution Reverb: This effect will change the echoes in a track to make it sound like it is in a different space (closet, concert hall etc.).
Sound bounces off surfaces like the ceiling, walls and floor on its way to your ears. These reflections reach your ears at almost the same time, meaning that you don’t hear them as separate echoes but as a sonic ambience that creates an impression of space (a hall or a cupboard, for example).
Convolution-based reverbs use impulse files to simulate acoustic spaces. The results are incredibly realistic and life-like.
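
The underlying operation is convolution: every sample of the track triggers a scaled copy of the impulse file. Below is a minimal direct-form sketch in Python; the tiny three-value impulse response is made up for illustration, and real reverbs use much longer recordings and faster methods.

```python
def convolve(signal, impulse_response):
    """Direct convolution: each input sample adds a scaled copy of the impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Made-up "room": the direct sound followed by two quieter reflections
room = [1.0, 0.5, 0.25]
dry = [1.0, 0.0, 0.0]
wet = convolve(dry, room)
```

Feeding in a single click gives back the impulse response itself, which is why recording a click in a real space captures the sound of that space.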

Distortion: Use the Distortion effect to simulate blown car speakers, muffled microphones, or overdriven amplifiers.
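
One common way to get this kind of sound is clipping, where the peaks of the waveform are flattened. The sketch below shows a hard and a soft version in Python. Whether Soundbooth uses either of these is an assumption on my part, and the function names are hypothetical.

```python
import math

def hard_clip(samples, limit=0.5):
    """Chop off anything beyond +/- limit, like an overloaded speaker."""
    return [max(-limit, min(limit, x)) for x in samples]

def soft_clip(samples, drive=5.0):
    """Boost the signal, then round off the peaks smoothly with tanh."""
    return [math.tanh(drive * x) for x in samples]

wave = [0.2, 0.9, -0.9]
harsh = hard_clip(wave)
warm = soft_clip(wave)
```

Hard clipping sounds harsher because the waveform gets sharp corners, while the tanh curve bends the peaks more gently.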

Dynamics: This effect can be used as a compressor, limiter or expander.
As a compressor and limiter, this effect reduces dynamic range, producing consistent volume levels.
As an expander, it increases dynamic range by reducing the level of low‑level signals. (With extreme expander settings, you can totally eliminate noise that falls below a specific amplitude threshold.)
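
The expander side can be sketched in the same sample-by-sample way. In this Python example anything below the threshold is pushed even further down, and a very high ratio behaves like a noise gate. The function name `expand` and its parameters are assumptions for illustration, not Soundbooth settings.

```python
def expand(samples, threshold=0.1, ratio=2.0):
    """Downward expander: make quiet samples (below the threshold) even quieter."""
    out = []
    for x in samples:
        level = abs(x)
        if 0 < level < threshold:
            # In decibel terms, the distance below the threshold is multiplied by ratio
            level = threshold * (level / threshold) ** ratio
        out.append(level if x >= 0 else -level)
    return out

noisy = [0.5, 0.05, -0.05]
cleaner = expand(noisy)               # quiet samples drop to about a quarter of the threshold
gated = expand(noisy, ratio=50.0)     # extreme setting: quiet samples almost vanish
```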

EQ: Graphics: This effect boosts or cuts specific frequency bands and provides a visual representation of the resulting EQ curve.
Unlike the parametric equalizer, the graphic equalizer uses preset frequency bands for quick and easy equalization.
An example would be making a voice sound like someone talking to you through an old telephone (muffled), or changing the sound for a voice-over.

EQ: Parametric: This effect provides maximum control over tonal equalization.
Unlike the graphic equalizer, which only gives you a fixed number of frequency bands, this one gives you total control over the frequencies.
For example, you can simultaneously reduce a small range of frequencies centered around 1000 Hz, boost a broad low-frequency shelf starting around 80 Hz, and insert a 60-Hz notch filter.
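
To make the 60 Hz notch from that example concrete, here is a sketch of a notch filter in Python. It uses the widely published "Audio EQ Cookbook" biquad formulas rather than anything taken from Soundbooth, and the sample rate, `q` value and function names are my own assumptions.

```python
import math

def notch_coefficients(sample_rate, centre_hz, q=5.0):
    """Biquad notch coefficients (cookbook form), normalised so a[0] is 1."""
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Run the samples through a second-order filter, one sample at a time."""
    out = []
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def sine(freq_hz, sample_rate, n):
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

rate = 8000  # assumed sample rate
b, a = notch_coefficients(rate, 60.0)
hum = biquad(sine(60.0, rate, rate), b, a)      # mains hum: should be removed
tone = biquad(sine(1000.0, rate, rate), b, a)   # 1000 Hz tone: should pass through
```

After a short settling time the 60 Hz hum is almost completely removed while the 1000 Hz tone is left nearly untouched, which is what makes a notch useful for cleaning up mains hum.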

Mastering: This effect optimizes audio files for a particular medium, such as radio, video, CD, or the web.
Before mastering audio, consider the requirements of the destination medium. If the destination is the web, for example, the file will likely be played over computer speakers that poorly reproduce bass sounds. To compensate, you can boost bass frequencies during the equalization stage of the mastering process.

Phaser: This effect is similar to flanging: it shifts the phase of an audio signal and recombines it with the original, creating psychedelic effects.
But unlike the Flanger effect, which uses variable delays, the Phaser effect sweeps a series of phase-shifting filters to and from an upper frequency.
Phasing can dramatically alter the stereo image, creating unearthly sounds.

Vocal Enhancer: This will quickly improve the quality of voice-over recordings.
It reduces sibilance and plosives, as well as microphone handling noise (low rumbles).
It will give vocals a characteristic radio sound.
The Music mode optimizes soundtracks so they better complement a voice-over.
