Friday 14 December 2012

Week 12: 14/12/2012


Today was the last day of the trimester and it consisted of us sitting the last assessment for the class.

I felt that I did not do too badly in the final test, as I scored 11.67 out of 20. This was an improvement on the last two tests, as I scored 10 out of 20 on both of them.

I feel that this class gave me a better insight into sound and images, as I now understand how to improve the quality of both, as well as the sound and image techniques that make each one sound or look a certain way.
It also gave me a better understanding of file sizes and the best compression to use for both video and images so that quality is not affected. I feel that this class was very beneficial.

Friday 7 December 2012

Week Eleven: 07/12/2012


This week in the lecture we did some tutorial questions as there are no more lectures to do.

The questions were as follows:

Q1. A true colour image
640x512 pixels
RAW uncompressed
What is the minimum storage space required?

Ans. Total = 640 x 512 x 3 bytes = 983,040 bytes
≈ 0.9 MB

Q2. If a video player played images of the above type at 30fps
What is the bitrate?

Ans. 30 x 0.9 MB/s of image data
In bits: 30 x 8 x 0.9 MB/s (the x8 converts bytes to bits)
≈ 235 Mbit/s

Q3. Video(including audio) is transmitted
1080x1024
120fps
3bytes per pixel
3D
AUDIO 24 bit sample
96kHz sample rate
7 channel surroundsound
Lasts 90mins
If no compression, what is the minimum file size?

Ans. 262069862400 bytes
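Q1 and Q2 can be checked with a quick script (a sketch assuming, as the working above does, that 1 MB is 10^6 bytes; Q3 follows the same pattern, multiplying in the frame rate, the 3D factor, the duration and the audio stream):

```python
# Q1: minimum storage for one RAW, uncompressed true-colour frame
width, height = 640, 512
bytes_per_pixel = 3            # true colour: one byte each for R, G, B

frame_bytes = width * height * bytes_per_pixel
print(frame_bytes)             # 983040 bytes, i.e. roughly 0.9 MB

# Q2: bitrate at 30 frames per second (x8 converts bytes to bits)
fps = 30
bitrate_bits = frame_bytes * fps * 8
print(bitrate_bits / 1e6)      # 235.9296, i.e. about 235 Mbit/s
```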

These might be useful for revision for the test next week.

LAB
In the lab I went through all my blogs to check they were up to date and put in some links and pictures that I had previously forgotten to include.
I also uploaded the video from last week to YouTube; I had been having a problem doing this at home.
I feel that my blogs have enough information to help me pass my test next week.

AT HOME 
Between now and next week I will look at the example tests on Moodle and revise all the PowerPoint slides and my blog to ensure I pass the test next week.

Friday 30 November 2012

Week 10: 30/11/2012


Today in the lecture we spoke about moving images and digital videos.

Warning - There is a lot of text here! BUT it is VERY useful so look at it!

Moving Images
We spoke about persistence of vision, a theory that states that the human eye retains an image on the retina for one twenty-fifth of a second, giving the brain the illusion that the image is moving.
This, however, is an old idea and is now regarded as the myth of persistence of vision.
A more plausible explanation of motion perception lies in two distinct perceptual illusions:



Digital Video
We also spoke about the amount of space needed for videos.
Uncompressed HD video files can be very large: for example, at 3 bytes per pixel, 1920x1080 resolution and 60 frames per second, the data rate is 373.2 megabytes per second.
This is approximately 1 gigabyte every three seconds.
Even as we stand today this is an extreme amount of data.
This is why we have many varieties of compression algorithms and standards to dramatically reduce the amount of data used in video storage, processing, streaming and transmission.
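The HD figure above can be checked in a few lines (a sketch assuming 1 MB is 10^6 bytes and 1 GB is 10^9 bytes):

```python
# Uncompressed HD video data rate from the lecture example
width, height = 1920, 1080
bytes_per_pixel = 3
fps = 60

rate = width * height * bytes_per_pixel * fps   # bytes per second
print(rate / 1e6)        # 373.248 -> about 373.2 MB per second
print(3 * rate / 1e9)    # about 1.1 GB every three seconds
```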

Below is some VERY IMPORTANT  terminology:



We then spoke of different video file formats:

MPEG-1 - Development of the MPEG-1 standard began in 1988; it was finalised in 1992, when the first MPEG-1 decoder was made available. Compressing video to about 26:1 and audio to 6:1, the MPEG-1 format was designed to compress VHS-quality raw digital video and CD audio with a minimum of quality loss.
Today, it is the most widely compatible lossy compression format in the world (though with very blocky compression artifacts). The MPEG-1 standard is part of the same standard that gives us the MP3 audio format. Fortunately, MPEG-1 video and Layer I/II audio can now be implemented in applications royalty free and without licence fees, since the patents expired in 2003.



MPEG-2 - The MPEG-2 format was an improvement on the MPEG-1 format. MPEG-1 had less efficient audio compression and was restricted in the packet types it accepted. It also did not support interlaced video.
MPEG-2 is the format of choice for digital television broadcasts.
Work on MPEG-2 began in 1990, before the first draft of MPEG-1 was ever written. It was intended to extend the MPEG-1 format to provide full broadcast-quality video at high bitrates, between 3 and 5 Mbit/s.



MPEG-4 - Is essentially a patented collection of methods to define compression of video and audio, designating a standard for a group of audio and video codecs (coders/decoders). MPEG-4 encompasses many of the features of MPEG-1 and MPEG-2, while adding support for 3D rendering, Digital Rights Management (DRM) and other types of interactivity.

QuickTime - Appeared in 1991 under a proprietary license from Apple, beating Microsoft’s Video for Windows to the "market" by nearly a full year.
For QuickTime video playback on Linux, possibly the best programs that can handle most QuickTime files are VLC and MPlayer, both of which are in the PCLinuxOS repository.
In 1998 the ISO approved the QuickTime file format as the basis of the MPEG-4 file format. The benefit is that MOV and MP4 files (containers) are interchangeable in a QuickTime-only environment (meaning running in an "official" QuickTime player, like QuickTime on Mac OS X or QuickTime for Windows), since both use the same MPEG-4 codecs.

AVI (Audio Video Interleave) - Appeared in 1992 from Microsoft as part of its Video for Windows technology. It is basically a file container that allows synchronized audio and video playback.
Since AVI files do not contain pixel aspect ratio information, and many players render AVI files with square pixels, the frame (image) may appear stretched or squeezed horizontally when played back. However, VLC and MPlayer have solved most problems related to the playback of AVI files.
Although being "older" technology, there is a benefit to using AVI files. Because of it being around for so long, coupled with Microsoft’s market penetration, AVI files can frequently be played back on the widest variety of systems and software, second only to MPEG-1. It has gained widespread acceptance and adoption throughout the computer industry, and can be successfully played back, so long as the end user has the proper codec installed to decode the video properly. Additionally, the AVI format is well documented, not only from Microsoft, but also many, many third parties.


WMV (Windows Media Video) - Is made with several different proprietary codecs from Microsoft. It has gained adoption for use with Blu-ray discs.
The WMV files are often wrapped in the ASF, or Advanced Systems Format. WMV files, themselves, are not encoded. Rather, the ASF wrapper is often responsible for providing the support for Digital Rights Management, or DRM. Based on Windows Media 9, WMV files can also be placed inside an AVI container.  In that case, the WMV file claims the AVI file extension.
WMV files can be played on PCLinuxOS, using VLC, MPlayer, or most any other program that uses the FFmpeg implementation of the WMV codecs.


3GP - is actually two similar formats. The first, 3GPP, is designed as a container format for GSM phones (in the U.S., primary GSM wireless carriers are AT&T and T-Mobile). The second, 3GPP2, is designed as a container format for CDMA phones (in the U.S., primary CDMA wireless carriers are Verizon and Sprint). 3GPP files will often carry a 3GP file extension, while 3GPP2 files will often carry a 3G2 file extension.
(A little complex) 3GP and 3G2 files store video streams using MPEG-4 Part 2, H.263, or AVC/H.264 codecs. Some cell phones will use the MP4 file extension to represent 3GP video. Both formats were designed to decrease storage and bandwidth requirements to accommodate mobile phones.
Software support under PCLinuxOS is, once again, achieved with VLC and MPlayer. Additionally, 3GP files (and most 3G2 files) can be encoded and decoded with FFmpeg.

FLV (Flash Video) - Is a file container format used primarily to deliver video over the Internet. In fact, it has become the de facto format of choice for such sites as YouTube, Google Video, Yahoo! Video, Metacafe and many news outlets. While the FLV format is an open format, the codecs used to produce FLV files are mostly patented. The most common codecs used are Sorenson Spark (an H.263 codec variant) and On2’s VP6. FLV files can also be encoded as H.264, as in the more recent releases of Adobe Flash.



The greater the compression on a video, the greater the loss of information.
Algorithms that compress video predictively still have problems with fast, unpredictable and detailed motion!
Automatic Video Quality Assessment could be the solution.

LAB

In the lab we produced an edited video using clips and music given to us. Below is a link to the finished video that I made (there was an issue with time at the end, so the music goes a bit strange, sorry):



Friday 23 November 2012

Week 9: 23/11/2012


Today we spoke about Digital Image Processing and why we use it.

We use it for editing pictures: it provides a flexible environment for successive experimental attempts to achieve some desired effect.
It allows us to manipulate, enhance and transform photos in ways that are not available when using darkroom-based photography.

We also spoke about digital camera imaging systems and digital camera image capture. Below are the slides from the PowerPoint, as they describe these better than I would:



We spoke about pixelization; this can be seen by the human eye if the sensor array resolution is too low. If you increase the number of cells in the sensor array, then the resolution of the image will also increase. Modern sensor devices have more than one million cells.
Below is a picture of pixelization:


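As an illustration of the idea, here is a tiny hypothetical sketch (not from the lecture): averaging blocks of pixels simulates a sensor whose cells are too large, producing visible pixelization:

```python
def pixelate(image, block):
    """Replace each block x block region with its average value,
    simulating a sensor array whose cells are too large."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(0, h, block):
        for x in range(0, w, block):
            ys = range(y, min(y + block, h))
            xs = range(x, min(x + block, w))
            vals = [image[yy][xx] for yy in ys for xx in xs]
            avg = sum(vals) // len(vals)
            for yy in ys:
                for xx in xs:
                    out[yy][xx] = avg
    return out

# A 2x2 checkerboard collapses to a single flat grey block:
print(pixelate([[0, 255], [255, 0]], 2))  # [[127, 127], [127, 127]]
```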
To capture images in colour, red, green and blue filters are placed over the photocells.
Each cell is assigned three 8-bit numbers (giving 2^8 = 256 levels) corresponding to its red, green and blue brightness values, e.g. a pixel might have:
  • red brightness level of 227
  • green level of 166
  • blue level of 97
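To make the 8-bit idea concrete, here is a small sketch using the example pixel above (the 24-bit packing order shown is one common convention, not something from the lecture):

```python
# One pixel's colour as three 8-bit values (the example from the notes)
r, g, b = 227, 166, 97
assert all(0 <= v <= 255 for v in (r, g, b))  # 2**8 = 256 levels each

# A common 24-bit "true colour" packing: red, then green, then blue
packed = (r << 16) | (g << 8) | b
print(hex(packed))  # 0xe3a661
```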
Below are the slides for the digital camera optics and the digital image fundamentals slides (for extra info):



Pixels are individually coloured; they are only an approximation of the actual subject colour.

The dynamic range of a visual scene is effectively the number of colours or shades of grey (greyscale).
However, the range of a digitized image is fixed by the number of bits (the bit depth) the digital system uses to represent each pixel.
This determines the maximum number of colours or shades of grey in the palette.
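That relationship is just 2 raised to the power of the bit depth; a quick check:

```python
# Number of distinct colours/levels for some common bit depths
for bit_depth in (1, 8, 24):
    print(bit_depth, 2 ** bit_depth)
# 1 bit  -> 2 (black and white)
# 8 bit  -> 256 (typical greyscale)
# 24 bit -> 16777216 ("true colour")
```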

Below is an image of what a typical digital image processing system looks like:


We spoke about what digital image processing was:



Analysis:






Manipulation:




Enhancement:










NOTE: http://lodev.org/cgtutor/filtering.html A site that lets you look at how filtering works.

Transformation:



In the lab:

In the lab we looked at tutorial videos for Adobe Premiere Pro CS4.
The link to the tutorial site is:



Friday 16 November 2012

Week 8: 16/11/2012


This week in the lecture we discussed light and how it moves in the air.

What is light?
Light is a form of energy detected by the human eye. Unlike sound, it does not require a medium to propagate: it can travel from the Sun through the vacuum of outer space to reach Earth.

Light is a transverse wave. [Transverse wave - like ripples on a pond, where the water molecules move up and down, perpendicular to the direction the wave travels.]

We also discussed light waves and the range visible to humans. Below is a slide explaining this.


Below is the electromagnetic Spectrum:


We also discussed the velocity of light and how it changes when it passes through air or glass. In air, light travels about one million times faster than the speed of sound (approximately 330 m/s).
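A quick check of that "one million times" figure (a sketch taking light at 3 x 10^8 m/s and sound at roughly 330 m/s in air):

```python
speed_of_light = 3e8   # metres per second (approximately)
speed_of_sound = 330   # metres per second in air (approximately)

ratio = speed_of_light / speed_of_sound
print(round(ratio))    # about 909091, i.e. roughly one million
```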



We also briefly discussed the frequency and wavelength of a lightwave but we have already discussed this in week 2.

Also we talked about the visible light spectrum and how light bends when passed through a prism; this is due to refraction, with the denser glass bending the different wavelengths by different amounts (dispersion).

We also discussed its effects on the environment, such as reflection. I will be reviewing these more at a later date.

In the Lab we took an image of a church tower and edited it using Photoshop.

We applied a default filter to the image, and it caused it to get lighter and have white pixel areas on it.
We then applied preset masks and they did the following:

High Pass - made the image become grayer.
Maximum - made the image go lighter and blurry.
Minimum - made the image go darker and blurry.

I then created a custom filter that had all zeros except the centre value, which was one; this caused nothing to happen to the image. This happens because each output pixel is just the input pixel multiplied by one, with every neighbour contributing zero, so no value changes.

I then created a new filter with a two-by-two matrix of ones at its centre; this made the picture so much brighter that it is nearly all white.

I then created a new filter with a two-by-two matrix with +ones on one diagonal and -ones on the other; this made the picture go almost entirely black, with only some faint white outlines left.
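The filters above can be sketched in a few lines of code. This is a minimal hypothetical example (not Photoshop's actual implementation) showing how kernel filtering works and why the identity kernel leaves the image unchanged:

```python
def apply_kernel(image, kernel):
    """Convolve a 2D greyscale image with a 3x3 kernel.

    Edge pixels are handled by clamping coordinates to the image border,
    and results are clamped to the 0-255 range.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    sy = min(max(y + ky - 1, 0), h - 1)
                    sx = min(max(x + kx - 1, 0), w - 1)
                    total += image[sy][sx] * kernel[ky][kx]
            out[y][x] = min(max(int(total), 0), 255)
    return out

image = [[10, 50, 90],
         [40, 120, 200],
         [70, 160, 250]]

identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

# The identity kernel copies each pixel through unchanged; a kernel whose
# values sum to more than one brightens the image, and one that sums to
# zero (like the +1/-1 diagonals) leaves mostly black with edge outlines.
print(apply_kernel(image, identity) == image)  # True
```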



Friday 9 November 2012

Week 7: 09/11/2012


This week in the lecture we discussed lighting and how its position affects pictures.

What positions of lights are there and what effect do they have?

Front lighting will cover the picture subject in light.

Side-lighting will create some shadows, this is good for drawing portraits.

Back-lighting will cover the picture subject in shadows.

A more in-depth explanation of all three:

Front light
Lighting a subject directly from the front removes quite a bit of depth from the resulting image. To accomplish a front lighting effect without losing your depth, have a light on each side of the camera, about 45 degrees upward, pointing down at the subject. This setup gives a wider front light that seems less intense and can preserve the depth of the subject.

Side light
Side light is great for emphasizing the shape and texture of an object. It clarifies an object's form, bringing out roughness and bumps. A blend between front and side light is common, as it communicates shape and form, while softening the flaws that direct side lighting can reveal.

Back light
Back light is wonderful for accentuating edges, and emphasizing the depth of an image. Back light often gives a thin edge of light around objects, called rim lighting, although it's hard to see it if the light is positioned directly behind the subject. Giving a foreground object a rim light will make it stand out from the background, accentuating the division in depth.


There is also:

Top light
Direct top light alone can make for a very sad and almost spooky feeling. Although we're used to seeing subjects lit from above (sunlight and most indoor lighting), there are usually other light sources filling in the shadows. Therefore, to achieve this effect, fill lights, if used, must be dramatically reduced in intensity.


Top-lighting
Bottom light
Bottom light is the light we're least accustomed to seeing. It has an intense impact when used, making objects look completely different and often sinister.

Bottom-lighting

We also spoke about how the higher the contrast of a picture, the clearer it will look.


We then went to the lab, where we completed an exercise on editing a sound file to remove an unwanted sound and edit the file with effects to create the illusion of being in a room.

To do this I cut out the unwanted sound so only the speech was left. I then added a reverb to the file to create the feel that the speaker was in a church or other large space.

I also increased the volume of the speaker to give the appearance that he was angry, by making the sample louder and adding a fade-in effect.
I feel the exercise was a success, as I managed to complete it without any help and the file sounded great when it was done.

Friday 2 November 2012

Week 6: 02/11/2012


This week in the Lecture we did another test to see how much we have learned and improved since the last one.

We then went to the lab and used Soundbooth to edit the SopranAscendDescend file again.

When the Compression effect, For Voice – Moderate, was applied, the waveform's amplitude decreased in size and the sound became quieter.

The Use of Audio Compressors - What are the uses?

The first, and mostly the only, reason to use compressors should be for the sound. If used properly, a compressor – or more correctly a limiter - will place an absolute cap on the maximum level that can be passed.
This is invaluable for preventing a large PA system from distorting, or making certain that the ADC (Analogue to Digital Converter) does not clip (exceed the maximum conversion voltage).
Digital distortion is extremely unpleasant, and is to be avoided, as with all forms of hard clipping.
There are many other reasons to use compression, for example, many instruments don’t have the sustain that musicians desire. So by using compression, as the signal fades, the compressor increases its gain, so the note lasts longer.
For more examples and more reasons on why it’s used visit: http://sound.westhost.com/compression.htm#why_use
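A very rough sketch of what a compressor does to sample values above a threshold (hypothetical values; real compressors work on the signal envelope over time, with attack and release controls):

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Reduce the level of samples above the threshold.

    Above the threshold, the output rises only 1/ratio as fast as the
    input. A very large ratio behaves like a limiter: an absolute cap.
    """
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

# Quiet samples pass through; loud ones are pulled down toward the cap.
print(compress([0.2, 0.9, -1.0]))
```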

Spectral Frequency Display



This is the spectral frequency display of the file we are editing; it’s consistent with the waveform.



This is the spectral frequency display of the new file, englishwords2, we are editing; it's consistent with the waveform. Also, it is in spikes rather than continuous lines, as the words are spoken individually and are not a constant sound like singing.

Reverb
After applying the convolution reverb, clean room – aggressive, I found that the file had taken on a more computerized quality. It sounds more robotic.
After applying the convolution reverb, roller disco – aggressive, I found that the file had taken on a more echo-like quality. It sounds like it was recorded in a large open room.

How reverb is created in a Room
It is created when sound bounces off the walls, floor and ceiling; these delayed reflections return to the recording equipment, creating the reverb.

How reverb is created by a computer on a WAV file
The computer takes the track and mixes delayed, quieter copies of it back into the original to create the illusion of reverb.
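A toy sketch of that delayed-copy idea (a feedback comb filter, a basic building block of algorithmic reverb; convolution reverb, as used in Soundbooth, instead convolves the track with a recorded impulse response of a real room):

```python
def add_reverb(samples, delay=2, decay=0.5):
    """Mix delayed, decaying copies of the signal back into itself
    (a feedback comb filter - a toy building block of reverb)."""
    out = list(samples)
    for i in range(delay, len(out)):
        out[i] += out[i - delay] * decay
    return out

# A single impulse grows a decaying tail of echoes:
print(add_reverb([1.0, 0.0, 0.0, 0.0, 0.0]))  # [1.0, 0.0, 0.5, 0.0, 0.25]
```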

Computer Speech Transcription
I tried out this feature on the computer and it managed to pick out a few words with little problem, but it had difficulty with some of the other words. It thought some words were in fact two, for example "freedom" came out as "free and".

The spectrogram above shows that most of the speech energy comes from the middle of the spoken word. The beginning and end of the word do not have the same amount of speech energy.