Monday, March 21, 2016


In the words of one of the founders of Unix, the best thing about Unix is its community, not the tools, files, C language, portability or open source, and the worst thing about Unix is that there are many communities.

Good, Bad and Ugly about Unix

Friday, March 18, 2016

telecine 2:3 pulldown

Films are shot and projected at 24 frames per second. NTSC-based televisions and monitors display 60 fields (half pictures) per second. The process of converting from the film frame rate to the television rate is called telecine.

First, each input frame is divided into two fields, one holding the odd scan lines and one holding the even scan lines. This is called interlacing. It halves the bandwidth needed for transmission.
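
As a minimal sketch of the field split described above, assume a frame is just a list of scan lines, with lines counted from 1 so that the odd field holds lines 1, 3, 5 and so on. The names `split_fields` and `weave` are illustrative and not taken from any video library.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its odd and even fields."""
    odd_field = frame[0::2]   # scan lines 1, 3, 5, ... (list indices 0, 2, 4, ...)
    even_field = frame[1::2]  # scan lines 2, 4, 6, ... (list indices 1, 3, 5, ...)
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into one full frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    # With an odd number of scan lines the odd field has one line left over.
    frame.extend(odd_field[len(even_field):])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd_field, even_field = split_fields(frame)
print(odd_field)   # the odd scan lines
print(even_field)  # the even scan lines
```

Weaving the two fields of a single still frame restores it exactly. The difficulty discussed later comes from fields that belong to different moments in time.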

Next comes a field duplication method called 3:2 pulldown. Since 60/24 = 2.5 = 5/2, we need to create five fields out of every two frames. The first frame contributes two fields and the second frame contributes three, and the pattern repeats, with the fields alternating between odd and even scan lines.
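
The cadence can be sketched in a few lines of Python. This is only an illustration under simple assumptions: frames are represented by their names, each field is a (frame, parity) pair, and the helper names `pulldown_fields` and `pair_into_pictures` are made up for this example.

```python
def pulldown_fields(frames):
    """Return the field sequence for the two-then-three field cadence."""
    fields = []
    parity = 0  # 0 means the next field is odd, 1 means it is even
    for index, frame in enumerate(frames):
        repeat = 2 if index % 2 == 0 else 3  # 2 fields, then 3, then 2, ...
        for _ in range(repeat):
            fields.append((frame, ("odd", "even")[parity]))
            parity ^= 1  # fields always alternate odd, even, odd, ...
    return fields


def pair_into_pictures(fields):
    """Group consecutive fields two at a time into interlaced pictures."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_fields(["A", "B", "C", "D"])
pictures = pair_into_pictures(fields)
# Show only which film frame each half of a picture came from.
print(["".join(field[0] for field in picture) for picture in pictures])
```

Four film frames become ten fields, or five interlaced pictures. The third and fourth pictures mix fields from two different film frames, which is why telecined material needs extra care when deinterlacing.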

If the input is already interlaced for broadcasting and we want to show it on a computer monitor, we need at least to deinterlace it to display it properly. This is not as simple as combining two fields into one picture. It needs some averaging of pixels from adjacent fields, because the pictures are not static: the two fields are half pictures taken from two consecutive moments of motion. If the input is telecined, deinterlacing takes more than averaging adjacent fields and involves finding the right fields to combine.

The term pulldown is actually about a slowdown. Color television broadcasting was required to let old black and white televisions continue to work. Broadcasters continued to transmit luminance (lightness or brightness) information alongside the chrominance (color) information. They were also concerned about the additional bandwidth needed to carry color information to black and white televisions. They slowed the frame rate slightly, transmitting only 1000 frames in the interval in which 1001 frames had been transmitted earlier.
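
The effect of the 1000/1001 slowdown on the nominal rates can be worked out with exact fractions. This is a small sketch; the variable names are illustrative, and the assumption that film is slowed by the same factor during telecine is stated here rather than taken from the text above.

```python
from fractions import Fraction

SLOWDOWN = Fraction(1000, 1001)  # 1000 frames in the time 1001 used to take

ntsc_frame_rate = 30 * SLOWDOWN  # nominal 30 frames per second
ntsc_field_rate = 60 * SLOWDOWN  # nominal 60 fields per second
film_rate = 24 * SLOWDOWN        # assumption: film slowed by the same factor

print(ntsc_frame_rate, float(ntsc_frame_rate))
print(ntsc_field_rate, float(ntsc_field_rate))
print(film_rate, float(film_rate))

# The 5:2 relation between fields and film frames survives the slowdown.
print(ntsc_field_rate / film_rate)
```

These work out to roughly 29.97 frames, 59.94 fields and 23.976 film frames per second.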

Wikipedia page 3:2 pulldown process
Apple tutorial telecine process
Microsoft doc temporal rate conversion
Framerate follies trick

Saturday, March 12, 2016

Gamma coding

Humans are more sensitive to relative differences between darker tones than to relative differences between lighter ones. This sensitivity follows a kind of power law.

If we encode the image linearly for all tones, we need too many bits to store it and too much bandwidth to transmit it. Assume humans can respond to luminance values over a range of 100 to 1 and can detect two values whose contrast ratio exceeds 1.01, a 1% difference. To keep the step at 1% of the darkest value, we need to encode the values 1, 1.01, 1.02, ..., 99.99, 100, so the total number of codes is 99/0.01 or 9900, which takes 14 bits for each tone. Linear encoding also spends few codes on the shadow values that humans are sensitive to, which is a lost opportunity to improve quality, and spends many codes on the highlights that humans are not sensitive to, which is extra cost with no improvement in visual quality. So linear encoding is poor from both the compression and the quality point of view.

If we use the observation that humans cannot tell most highlight values apart, we can employ nonlinear encoding. If we encode only ratios, starting from 1, then 1.01, 1.01^2, ..., we need just log(100)/log(1.01) or around 463 codes, which fits in nine bits. If a television or display has a contrast ratio of only 50:1, log(50)/log(1.01) is around 393 codes, which still takes nine bits at a 1% step; eight bits are enough only if the step is relaxed to about 1.5%.
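
The code counts for a 100:1 range can be checked with a short Python sketch. The function names are illustrative, and the 100:1 range and 1% threshold are the assumptions used in the text.

```python
import math


def linear_code_count(contrast_ratio, step):
    """Codes needed when values from 1 to contrast_ratio use a fixed step."""
    return round((contrast_ratio - 1) / step)


def ratio_code_count(contrast_ratio, ratio):
    """Codes needed when each code is `ratio` times the previous one."""
    return math.ceil(math.log(contrast_ratio) / math.log(ratio))


def bits_needed(code_count):
    """Smallest number of bits that can label code_count distinct codes."""
    return (code_count - 1).bit_length()


linear = linear_code_count(100, 0.01)
nonlinear = ratio_code_count(100, 1.01)
print(linear, bits_needed(linear))        # linear coding of a 100:1 range
print(nonlinear, bits_needed(nonlinear))  # ratio coding of the same range
```

Rounding the ratio count up gives 463 codes, so the nonlinear scheme saves five bits per tone over the 14 bits of the linear one.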

Power law: Vout = Vin^γ

Gamma (γ) is the slope of the log-log plot of output against input. Most of our images are encoded with a gamma value of 0.45 and decoded with a gamma value of 1/0.45 or 2.2. There is another power law at work in CRT-based display monitors: the light produced on the display is approximately proportional to the applied voltage raised to the power 2.5. This exponent is also called gamma. It is an amazing coincidence that the vision gamma or image gamma is roughly the inverse of the display or monitor gamma. The net effect of applying both gammas is called the system gamma, and ideally it should be 1.0.
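
A pure power law is easy to try out in Python. This sketch assumes values normalized to the range 0 to 1 and ignores the linear segment near black that real video standards add. The constant and function names are illustrative.

```python
ENCODING_GAMMA = 0.45       # applied when the image is encoded
DECODING_GAMMA = 1 / 0.45   # about 2.2, applied when it is decoded
CRT_GAMMA = 2.5             # approximate response of a CRT display


def apply_gamma(value, gamma):
    """Power law Vout = Vin^gamma for a value normalized to 0..1."""
    return value ** gamma


linear = 0.18  # a mid-gray scene value, chosen only for illustration
encoded = apply_gamma(linear, ENCODING_GAMMA)
decoded = apply_gamma(encoded, DECODING_GAMMA)
print(encoded)  # the encoded value sits well above the linear one
print(decoded)  # decoding recovers the linear value (up to rounding)

# Exponents multiply when power laws are chained, so the system gamma
# is the product of the encoding gamma and the display gamma.
system_gamma_crt = ENCODING_GAMMA * CRT_GAMMA
print(system_gamma_crt)
```

With an encoding gamma of 0.45 and a CRT gamma of 2.5 the product is about 1.125, close to but not exactly the ideal 1.0.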


Gamma correction
Gamma FAQ
System gamma

Thursday, March 3, 2016

Mirrors and images

Characteristics of the image formed when an object is placed before a plane mirror.

  • Image size is same as the size of the object. Magnification factor is one.
  • Distance of image from mirror is same as the distance of object from mirror.
  • Image is upright and not inverted.
  • Image appears left and right reversed.
  • Image is virtual. It appears to form behind the mirror, where the light cannot even reach.

A single mirror produces one image. When an object is placed between two mirrors, there will be more than one image. When the mirrors are placed together to form a right angle, the number of images formed is three: one in each mirror and one in the crease. When the mirrors are placed at 60 degrees, the number of images formed is five. When the angle is theta, the number of images will be (360/theta - 1).

A more accurate formula depends on both the angle and the position of the object. It comes from a paper by V.M. Kulkarni in 1960.

  • When 180/theta is integer x, then the answer is 2*x - 1,
  • When 180/theta is integer x + 0.5, then the answer is 2*x when the object is on angular bisector or symmetrically located with respect to two mirrors or 2*x + 1 otherwise.
  • When 180/theta is integer x + (n/q), then the answer is 2*x or 2*x +1 depending on whether the object is located on central angular sector of (q-2*n) about the angle bisector.
  • When 180/theta is integer x + (n/q), then the answer is 2*x or 2*x +1 depending on whether the object is located on central angular bisector of (2*n-1) about the angle bisector.
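
The first two cases of the list above can be sketched in Python. This is a partial illustration only: the function name `image_count` and its `on_bisector` parameter are made up for this example, and the two n/q cases are deliberately left out because they need the object's exact angular position.

```python
from fractions import Fraction


def image_count(theta_degrees, on_bisector=False):
    """Number of images for two mirrors at theta_degrees (first two cases only)."""
    ratio = Fraction(180) / Fraction(theta_degrees)
    x = ratio.numerator // ratio.denominator  # integer part of 180/theta
    fractional_part = ratio - x
    if fractional_part == 0:
        return 2 * x - 1
    if fractional_part == Fraction(1, 2):
        return 2 * x if on_bisector else 2 * x + 1
    raise ValueError("180/theta is not x or x + 0.5; the n/q cases are not covered")


print(image_count(90))                    # mirrors at a right angle
print(image_count(60))                    # mirrors at 60 degrees
print(image_count(72, on_bisector=True))  # object on the angle bisector
print(image_count(72))                    # object off the bisector
```

When 180/theta is an integer, 2*x - 1 equals 360/theta - 1, so the first case agrees with the simple formula given earlier.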

Light from Physics Classroom
Patterns in Multiple Reflections