The most common and primitive method of video signal transmission is between an illuminated object and the human eye. Light from the sun, or any other source, reflects off an object towards us at, well, the speed of light. In today’s digital world, littered with video streaming wherever we turn, things happen instead at the speed of data packets.
Analogue signals will never be conquered by the digital world; this is pure physics. Analogue light and sound waves exist all around us, mostly without human interference. Another reason we will always need the analogue spectrum is that humans are analogue beings: our eyes and ears receive analogue waves in order to see and hear, and our vocal cords produce analogue vibrations for us to be audible. In fact, any digital communication device in the modern world requires A-D (Analogue-to-Digital) and D-A conversion at either end of the system to make it usable for human beings. The digital part is purely the technology used to transport the information between end-points without quality loss.
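The A-D conversion mentioned above can be sketched in a few lines: an analogue wave is measured at a fixed sample rate and each measurement is quantised to an 8-bit value. This is a minimal illustration, not a real ADC; the tone frequency, sample rate and bit depth are arbitrary example figures.

```python
import math

def sample_and_quantise(frequency_hz, sample_rate_hz, duration_s, bits=8):
    """Sample a sine wave and quantise each sample to an unsigned integer code."""
    levels = 2 ** bits                     # 256 levels for 8 bits
    n_samples = int(sample_rate_hz * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz             # time of this sample
        value = math.sin(2 * math.pi * frequency_hz * t)  # analogue value in [-1, 1]
        codes.append(round((value + 1) / 2 * (levels - 1)))  # map to 0..255
    return codes

# One cycle of a 1 kHz tone sampled at 8 kHz gives 8 samples, each 0..255
codes = sample_and_quantise(1000, 8000, 0.001)
```

A D-A converter at the far end would run the same mapping in reverse, turning the stream of 8-bit codes back into a continuous waveform.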
Media streaming is nothing new. The most prevalent example of streamed video and audio is probably the popular website YouTube, and many other web services also use the World Wide Web to distribute video content. Video streaming is, however, not limited to internet connectivity, and there are many applications which require video distribution over Local Area Networks (LAN) or Wide Area Networks (WAN) in an IP, point-to-point or multipoint network. The conventional way to distribute video is over copper-cabled systems, ranging from elementary Radio Frequency (RF) networks to higher-resolution video formats such as RGBHV (Red, Green, Blue, Horizontal, Vertical), commonly (and incorrectly) termed VGA signals. When digital video technologies surfaced, they brought along High Definition (HD) resolutions and formats such as HDMI (High-Definition Multimedia Interface), DVI (Digital Visual Interface), DisplayPort and SDI (Serial Digital Interface), with the latter still being distributed on RG59 coaxial cable. Parallel to the development of digital video technology, Information Technology (IT) also took off at a breakneck pace.
When an image is displayed optically with an overhead or slide projector, for example, the image is created from light projected through a filmstrip and a lens onto a distant surface, which magnifies every little bit of detail and shows perfect lines and curves. In order to reproduce the same image digitally, one would have to subdivide the entire frame into tiny blocks (pixels) and colour them individually to form a pixelated image. The more pixels used in the frame, the better the image quality will appear, but it is theoretically impossible to produce a perfect curve using square pixels. Even when the pixels are so tiny that the human eye cannot differentiate between them, a curve may appear smooth, but it will in fact always consist of tiny squares. Therefore, higher resolutions are the best way to produce quality digital video images.
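The square-pixel limitation is easy to demonstrate: rasterising a circle onto a coarse grid turns the smooth curve into a staircase of filled squares. A hypothetical sketch, with the grid and radius chosen purely for illustration:

```python
def rasterise_circle(size, radius):
    """Return a text grid where '#' marks pixels whose centre lies inside the circle."""
    centre = size / 2
    rows = []
    for y in range(size):
        row = ""
        for x in range(size):
            # Each pixel is either fully on or fully off -- there is no partial
            # pixel, so the circle's edge becomes a staircase of squares.
            inside = (x + 0.5 - centre) ** 2 + (y + 0.5 - centre) ** 2 <= radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows

grid = rasterise_circle(12, 5)
for line in grid:
    print(line)
```

Doubling the grid size makes the staircase finer but never eliminates it, which is exactly why higher resolutions look better without ever being geometrically perfect.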
In order to convert the information of one digital pixel, the analogue wave is plotted on an X and Y axis and broken up into samples at a specific sample rate. The coordinates of each sample are then converted to a binary system – a numerical system that only utilises ones and noughts and most commonly uses eight digits simultaneously to represent any number between nought and 255. Each digit is known as a ‘bit’, and in the most generally used 8-bit systems, eight bits equal one ‘byte’. The high number of pixels in an image, along with other relevant information, thus results in a large amount of data – kilobytes and megabytes. Moving video is made up of a series of still images displayed in quick sequence to create the illusion of a moving object; standard formats use 24, 25 and 30 Frames per Second (FPS). The amount of data for one still image thus needs to be multiplied by the refresh rate (FPS), which indicates the bandwidth required to transmit the video signal over the IP network for every second that it plays – for example in Mbps (megabits per second).
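The arithmetic above can be made concrete: multiply the bits in one uncompressed frame by the frame rate to get the raw bandwidth. The resolution, colour depth and frame rate below are just example figures.

```python
def raw_bandwidth_mbps(width, height, bits_per_pixel, fps):
    """Raw (uncompressed) video bandwidth in megabits per second."""
    bits_per_frame = width * height * bits_per_pixel  # one still image
    bits_per_second = bits_per_frame * fps            # multiplied by the refresh rate
    return bits_per_second / 1_000_000                # bits -> megabits

# 1920x1080, 24-bit colour (8 bits each for R, G and B), 30 FPS
print(raw_bandwidth_mbps(1920, 1080, 24, 30))  # prints 1492.992, i.e. ~1.5 Gbps
```

A figure of roughly 1.5 Gbps for a single uncompressed 1080p stream makes it immediately clear why an ordinary 1 Gbps office network cannot carry it, which is the problem the next paragraph describes.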
Once a signal is digital, the challenge is that the bandwidth required by HD video signals exceeds the capacity of most commonly used IP networks. These networks are adequate for basic networking requirements, but video streaming will consume the available data flow and congest the entire network, leaving it unusable for other users. Higher-capacity networks are available, but at an inflated cost which is difficult to justify for video streaming alone, unless purposely required.
Video Compression
This brings us to the reason why certain video signals are compressed for IP distribution. Many different compression formats are available and currently in use, divided into lossless and lossy codecs. Many video codecs are necessarily lossy, simply because reducing bandwidth means eliminating information. Lossy codecs compress video using many different algorithms. Basic lossy codecs throw away data at regular intervals, which is effective at reducing bandwidth but may result in much lower image quality. Another effective lossy approach is based on an analysis of human vision, dismissing information that the eye would find visually redundant, such as close colour variances: human vision is much more sensitive to brightness (luma) than to colour (chroma), and will thus not distinguish between pixels of closely related colour. A third technique is ‘chroma subsampling’, which reduces colour information at regular intervals. Chroma sample ratios are often written as 4:2:2, 4:2:0, 4:1:1 and similar, indicating how many chroma samples are kept across a row of pixels and how they change from one row to the next.
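Chroma subsampling can be illustrated directly: in a 4:2:0 scheme, every 2x2 block of pixels keeps its four luma values but shares one averaged pair of chroma values, halving the total data before any further coding. A simplified sketch on a toy image (the pixel values are made up):

```python
def subsample_420(y, cb, cr):
    """4:2:0 subsampling: keep full-resolution luma, average chroma over 2x2 blocks.

    y, cb and cr are 2D lists of equal, even dimensions. Returns (y, cb_small,
    cr_small), where the chroma planes are half-size in both directions.
    """
    h, w = len(y), len(y[0])

    def shrink(plane):
        return [
            [
                # Average the four chroma values in each 2x2 block into one
                (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) // 4
                for c in range(0, w, 2)
            ]
            for r in range(0, h, 2)
        ]

    return y, shrink(cb), shrink(cr)

# A 2x2 test image: four luma values survive, each chroma plane collapses to one sample
y  = [[100, 110], [120, 130]]
cb = [[40, 44], [48, 52]]
cr = [[60, 60], [60, 60]]
_, cb_s, cr_s = subsample_420(y, cb, cr)
print(cb_s, cr_s)  # prints [[46]] [[60]]
```

Because the eye is far more sensitive to luma than to chroma, the reconstructed image looks almost unchanged despite carrying half the data.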
Lossless codecs, as the name indicates, do not discard any information; instead they compress video signals by preventing duplicate pixel information from being transmitted. Duplicate pixels exist where large areas in an image are made up of the same pixel information, such as a large background of a single colour with little motion, or adjacent video frames from long scenes in a movie or fixed film sets such as filmed interviews. These result in many frames carrying similar pixel information, and in such frames only the changing pixels will be transmitted. Another form of video compression is to group pixel values together, averaging several pixels into one large rectangular pixel of the same value.
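The "only the changing pixels are transmitted" idea can be sketched as a simple frame difference: compare each pixel of the new frame with the previous one and send only the positions that changed. Real codecs are far more sophisticated (motion compensation, entropy coding); this shows just the principle, on frames modelled as flat lists of pixel values.

```python
def frame_delta(previous, current):
    """Return {pixel_index: new_value} for every pixel that changed."""
    return {i: v for i, (p, v) in enumerate(zip(previous, current)) if p != v}

def apply_delta(previous, delta):
    """Reconstruct the current frame from the previous frame plus the delta."""
    frame = list(previous)
    for i, v in delta.items():
        frame[i] = v
    return frame

# A mostly static scene: only two of eight pixels change between frames,
# so only two values need to be transmitted instead of all eight.
frame_a = [5, 5, 5, 9, 9, 5, 5, 5]
frame_b = [5, 5, 5, 9, 9, 7, 7, 5]
delta = frame_delta(frame_a, frame_b)
print(delta)  # prints {5: 7, 6: 7}
```

Since the receiver can rebuild the new frame exactly from the old frame plus the delta, no information is lost, which is what makes this kind of redundancy removal lossless.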
Most of these systems allow the user to adjust settings prior to compression, such as a specific resolution, frame rate reduction and bit rate, preconfigured for maximum quality versus maximum compression. Multiple formats have been developed over the years. JPEG, HDV and MPEG-2 are examples of lossy compression, where the latter compresses data over multiple frames (interframe) instead of individual frames (intraframe). MPEG-4 and its Part 10 (H.264) are some of the most popular compression formats used in current systems.
Uncompressed video is also seeing the light of day, but in order to distribute uncompressed video one would need a network with the required capacity, such as 10 Gbps – and who knows what else the future holds. All that we know for now is that the future of video distribution lies in IP systems.