Tuesday, 28 February 2017

The Pixel Age


The most common and primitive method of video signal transmission is between any illuminated object and the human eye. Light from the sun, or any alternative source, reflects off an object towards us at, well, the speed of light. In today’s digital world, littered with video streaming wherever we turn, things happen, rather, at the speed of data packets.

Analogue signals will never be conquered by the digital world. This is based on pure physics. Analogue light and sound waves exist all around us, mostly without human interference. Another reason why we’ll always need the analogue spectrum is that humans are analogue beings. Our eyes and ears receive analogue light and sound waves in order for us to see and hear, and our vocal cords produce analogue vibrations for us to be audible. In fact, any digital communication device in the modern world requires A-D (Analogue to Digital) and D-A conversion at either end of the system to make it usable for human beings. The digital part is purely the technology used to transport the information between end-points without quality loss.

Media streaming is nothing new. The most prevalent example of streamed video and audio is probably the popular website YouTube. Many other web services also use the World Wide Web to distribute video content. Video streaming is, however, not limited to internet connectivity, and there are many applications which require video distribution over Local Area Networks (LAN) or Wide Area Networks (WAN) in an IP, point-to-point, or multipoint network. The conventional way to distribute video is over copper-cabled systems, from elementary Radio Frequency (RF) networks to higher-resolution video formats such as RGBHV (Red, Green, Blue, Horizontal, Vertical), commonly (and incorrectly) termed VGA signals. When digital video technologies surfaced, they brought along High Definition (HD) resolutions and formats such as HDMI (High-Definition Multimedia Interface), DVI (Digital Visual Interface), DisplayPort and SDI (Serial Digital Interface), with the latter still being distributed on RG59 coaxial cable. In parallel with the development of digital video technology, Information Technology (IT) also took off at a breakneck pace.
When an image is displayed optically with an overhead or slide projector, for example, the image is created from light projected through a filmstrip and a lens onto a distant surface, which magnifies every little bit of detail and shows perfectly smooth lines and curves. In order to reproduce the same image digitally, one would have to subdivide the entire frame into tiny blocks (pixels) and colour them individually to form a pixelated image. The more pixels used in the frame, the better the image quality will appear, but theoretically it’s impossible to produce a perfect curve using square pixels. Even when the pixels are so tiny that the human eye cannot differentiate between them, a curve may appear smooth, but it will always consist of tiny squares. Therefore, higher resolutions are the best way to produce quality digital video images.
In order to convert the information of one pixel into digital form, the analogue wave is plotted on an X and Y axis and broken up into samples at a specific sample rate. The coordinates of each sample are then converted to a binary system – a numerical system that only utilises ones and noughts and most commonly uses eight digits at a time to represent any number between nought and 255. Each digit is known as a ‘bit’, and eight bits make up one ‘byte’, the unit used in the most common 8-bit systems. The high number of pixels in an image, along with other relevant information, thus results in a large amount of data – kilobytes and megabytes. Moving video is made up of a series of still images displayed in quick sequence to create the illusion of a moving object. Standard formats use 24, 25 and 30 Frames per Second (FPS). The amount of data for one still image thus needs to be multiplied by the frame rate (FPS), which suggests the bandwidth required to transmit the video signal over the IP network for every second that it plays – for example in Mbps (megabits per second).
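As a rough back-of-the-envelope sketch of that calculation (the resolution, bit depth and frame rate below are illustrative assumptions, not figures from any particular system), in Python:

    # Rough estimate of the raw (uncompressed) data rate of a video signal.
    def raw_bandwidth_mbps(width, height, bits_per_pixel, fps):
        # Bits in one frame, multiplied by frames per second, expressed in Mbps.
        bits_per_frame = width * height * bits_per_pixel
        return bits_per_frame * fps / 1_000_000

    # Example: 1080p at 24 bits per pixel (8 bits each for R, G and B), 30 FPS
    print(raw_bandwidth_mbps(1920, 1080, 24, 30))  # roughly 1493 Mbps, about 1.5 Gbps

Even before any network overhead is added, a single uncompressed HD stream like this already exceeds a typical 1Gbps office network, which is exactly the problem discussed next.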
Once a signal is digital, the challenge is that the bandwidth required by HD video signals exceeds the capacity of most commonly used IP networks. These networks are adequate for basic networking requirements, but video streaming will consume the available data flow and congest the entire network, leaving it sluggish or unusable for other users. Higher-capacity networks are available, but at an inflated cost that is difficult to justify for video streaming alone, unless purposely required.
Video Compression
This brings us to the reason why certain video signals are compressed for IP distribution. Many different compression formats are available and currently in use. These are divided into lossless and lossy codecs. Many video codecs are necessarily lossy, simply because they work by eliminating information to reduce bandwidth. Lossy codecs compress video using a variety of algorithms. Basic lossy codecs throw away data at regular intervals, which is effective at reducing bandwidth but may result in much lower image quality. Another effective lossy approach is based on an analysis of the nature of human vision, dismissing excess information that the human eye would find visually redundant, such as very similar colours. Human vision is much more sensitive to brightness (luma) than colour (chroma) and thus will not distinguish between pixels of very similar colour. A third technique is called ‘chroma subsampling’, which reduces colour information at regular intervals. Sampling ratios such as 4:2:2, 4:2:0 and 4:1:1 indicate how many chroma samples are kept relative to luma samples across a row of pixels and from one row to the next.
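As a minimal sketch of the chroma subsampling idea (assuming the image has already been split into separate luma and chroma planes, and using simple 2x2 averaging, which is only one of several ways real codecs subsample):

    import numpy as np

    def subsample_chroma_420(chroma_plane):
        # 4:2:0-style reduction: the luma plane is kept at full resolution,
        # but each 2x2 block of a chroma plane is averaged into a single
        # value, quartering the amount of colour data to transmit.
        h, w = chroma_plane.shape  # assumes even dimensions for simplicity
        blocks = chroma_plane.reshape(h // 2, 2, w // 2, 2)
        return blocks.mean(axis=(1, 3))

    # Illustrative chroma plane for a 4x4 image (the values are made up)
    cb = np.array([[100, 102, 200, 198],
                   [101, 103, 199, 201],
                   [ 50,  52,  60,  62],
                   [ 51,  53,  61,  63]], dtype=float)
    print(subsample_chroma_420(cb))  # one chroma value per 2x2 block of pixels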
Lossless codecs, as the name indicates, do not discard any information and thus compress video signals by preventing duplicate pixel information from being transmitted. Duplicate pixels exist where large areas of an image are made up of the same pixel information, such as a large background of the same colour with little motion, or in adjacent video frames from long scenes with fixed camera positions, such as filmed interviews, which result in many consecutive frames containing nearly identical pixel information. In these frames, only the changing pixels need to be transmitted. Another form of video compression is to group pixel values together, averaging several pixels into one larger rectangular block of a single value.
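A minimal sketch of that frame-difference idea (the array sizes and values are purely illustrative; real codecs work on blocks and motion vectors rather than individual pixels):

    import numpy as np

    def changed_pixels(previous_frame, current_frame):
        # Return the coordinates and new values of only those pixels that
        # differ from the previous frame, so static areas such as fixed
        # backgrounds cost nothing to transmit.
        mask = previous_frame != current_frame
        return np.argwhere(mask), current_frame[mask]

    prev = np.zeros((4, 4), dtype=np.uint8)   # an entirely static frame
    curr = prev.copy()
    curr[1, 2] = 255                          # a single pixel changes
    coords, values = changed_pixels(prev, curr)
    print(coords, values)  # only one pixel needs to be sent for this frame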
Most of these systems allow the user to adjust variable settings prior to compression, such as a specific resolution, frame rate reduction and bit rate, preconfigured for maximum quality versus maximum compression. Multiple formats have been developed over the years. JPEG, HDV and MPEG-2 are examples of lossy compression, with the latter compressing data across multiple frames (interframe) instead of only individual frames (intraframe). MPEG-4 and its Part 10 (H.264/AVC) are some of the most popular compression formats used in current systems.
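As a rough sketch of that quality-versus-compression trade-off (the resolutions and frame rates are illustrative assumptions; the codec then compresses whatever raw rate remains):

    # Effect of pre-compression settings on the raw data rate, using the same
    # estimate as earlier. All figures are illustrative.
    def raw_bandwidth_mbps(width, height, bits_per_pixel, fps):
        return width * height * bits_per_pixel * fps / 1_000_000

    full_rate    = raw_bandwidth_mbps(1920, 1080, 24, 30)  # about 1493 Mbps
    reduced_rate = raw_bandwidth_mbps(1280,  720, 24, 15)  # about 332 Mbps

    # Dropping to 720p and halving the frame rate already cuts the raw rate
    # roughly 4.5 times before the codec's own compression is applied.
    print(full_rate, reduced_rate, full_rate / reduced_rate)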
Uncompressed video distribution is also seeing the light of day, but in order to distribute uncompressed video one would need a network with the required capacity, such as 10Gbps, and who knows what else the future holds. All that we know for now is that the future of video distribution lies in IP systems.