Video

A video format is not the same thing as the file extension of the video file: the extension usually identifies only the container. A format is the combination of both parts, i.e. Format = Codec + Container

Codec:

  • Short for coder-decoder. It is the algorithm used to encode (compress) the video into bytes and to decode it again for playback
  • There are two types of compression algorithms
    1. Intraframe Compression - Every frame is compressed on its own and stored in full, which gives high quality output and easy editing, but large files. For example,
      • M-JPEG - Motion JPEG, stores each frame as a JPEG image
      • ProRes - Family of intraframe codecs made by Apple and used in films, ranging from ProRes 422 Proxy (low data rate) to ProRes 4444 XQ (high data rate)
      • DNxHD - Digital Nonlinear Extensible High Definition, family of intraframe codecs made by Avid
      • DNxHR - High resolution version of DNxHD, supporting 4K video and above
    2. Interframe Compression - Stores a full frame only occasionally and, for the frames in between, only the change from the previous frame. This gives much smaller files, typically at some cost in quality and editing convenience. For example,
      • H.264 or AVC - Advanced Video Coding, defined by MPEG (Moving Picture Experts Group) together with the ITU-T, produces relatively small video files, recommended for YouTube
      • H.265 or HEVC - High Efficiency Video Coding, defined by MPEG (Moving Picture Experts Group), produces roughly the same quality as H.264 at about half the size, but requires more computing power
      • H.262 or MPEG-2 - Defined by MPEG (Moving Picture Experts Group), needs the least computing power of the three but produces larger video files. It is widely used for SD digital TV broadcasts and DVDs
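
The difference between the two compression types can be sketched with toy data. This is only an illustration, assuming a "frame" is a flat list of pixel values; real codecs use motion prediction and transforms, and the function names here are made up for the example.

```python
from typing import List

Frame = List[int]    # a toy "frame": a flat list of pixel values
Encoded = List[list]  # first entry is a full frame, the rest are deltas


def encode_intraframe(frames: List[Frame]) -> List[Frame]:
    """Intraframe idea: every frame is stored in full."""
    return [list(frame) for frame in frames]


def encode_interframe(frames: List[Frame]) -> Encoded:
    """Interframe idea: store the first frame in full, then only
    (pixel_index, new_value) pairs for pixels that changed."""
    if not frames:
        return []
    encoded: Encoded = [list(frames[0])]  # the keyframe
    for previous, current in zip(frames, frames[1:]):
        delta = [(i, new) for i, (old, new) in enumerate(zip(previous, current))
                 if old != new]
        encoded.append(delta)
    return encoded


def decode_interframe(encoded: Encoded) -> List[Frame]:
    """Rebuild each frame by applying its delta to the previous frame."""
    if not encoded:
        return []
    frames = [list(encoded[0])]
    for delta in encoded[1:]:
        frame = list(frames[-1])
        for index, value in delta:
            frame[index] = value
        frames.append(frame)
    return frames


if __name__ == "__main__":
    video = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 0, 5]]
    print(encode_intraframe(video))  # three full frames
    print(encode_interframe(video))  # one full frame plus two small deltas
```

When little changes between frames the deltas stay tiny, which is why interframe codecs produce smaller files; the price is that decoding a frame needs the frames before it.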

Container:

  • Wrapper that stores the video stream, audio streams, subtitles and metadata. Common containers are as follows
    • MP4 - Defined by MPEG (Moving Picture Experts Group), recommended for YouTube
    • AVI - Audio Video Interleave, developed by Microsoft
    • MOV - Short for Movie, developed by Apple
    • MXF - Material Exchange Format, used in professional setups
    • 3GP & 3G2 - Developed by the Third Generation Partnership Project (3GPP, and 3GPP2 for 3G2) for use on mobile phones
    • MTS, M2TS & TS - Created for AVCHD (Advanced Video Coding High Definition) and Blu-ray, they stand for MPEG transport stream, MPEG-2 transport stream and transport stream
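
Because the extension only hints at the container, a more reliable check is to look at the first bytes of the file. The sketch below is a rough guess-by-signature helper, not a full parser: the function name is made up, only a few containers are covered, and older MOV files may not carry the "ftyp" marker at all.

```python
def sniff_container(header: bytes) -> str:
    """Guess a container family from the first bytes of a file."""
    # Matroska and WebM files start with the EBML magic number.
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "Matroska/WebM"
    # AVI is a RIFF file whose form type is "AVI ".
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    # MP4, 3GP and modern MOV files carry an "ftyp" box at offset 4.
    if header[4:8] == b"ftyp":
        return "MP4/MOV/3GP family"
    # MPEG transport stream packets start with the sync byte 0x47.
    if header[:1] == b"\x47":
        return "possibly MPEG transport stream"
    return "unknown"


if __name__ == "__main__":
    print(sniff_container(b"\x00\x00\x00\x18ftypmp42"))
    print(sniff_container(b"RIFF\x00\x00\x00\x00AVI LIST"))
```

In practice you would read the first dozen bytes with open(path, "rb").read(12) and pass them in. Note that this says nothing about the codec inside, which is stored further into the file.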

Formats:

  • H.264 + MP4 - Widely used for online streaming and YouTube, .mp4 file extension, universal compatibility, good compression efficiency. Used in web streaming, social media, mobile playback
  • H.265/HEVC + MP4 - Next-generation streaming format, .mp4 file extension, about 50% smaller file sizes than H.264 at the same quality. Used in 4K streaming, bandwidth-constrained environments
  • VP9 + WebM - Open web streaming format, .webm file extension, royalty-free, optimized for web delivery. Used in web browsers, HTML5 video
  • ProRes + MOV - Apple’s professional editing standard, .mov file extension. Used in Final Cut Pro, professional video editing
    • MOV + ProRes 422 (standard editing quality)
    • MOV + ProRes 4444 (highest quality with alpha channel)
  • DNxHD/DNxHR + MXF - Professional broadcast format, .mxf file extension, frame-accurate editing, metadata support. Used in Avid Media Composer, broadcast workflows
  • DV + AVI - Digital Video format, .avi file extension, good compatibility with Windows systems. Used in older camcorders, legacy systems
  • H.263 + 3GP - Mobile phone video, .3gp file extension, very small file sizes for mobile devices. Used in older mobile phones, low-bandwidth scenarios
  • H.264 + MKV - Flexible container with a reliable codec, .mkv file extension, supports chapters, multiple subtitles, metadata. Used in high-definition video storage, multiple audio tracks
  • M-JPEG + MOV - Motion JPEG format, .mov file extension, each frame is independently encoded. Used in frame-by-frame editing, security cameras
  • H.264 for HLS + MP4 - HTTP Live Streaming, no single file extension as the video is delivered as a stream of short segments listed in a .m3u8 playlist, adjusts quality based on connection speed. Used in adaptive bitrate streaming
  • H.265 + DASH - Dynamic Adaptive Streaming over HTTP, no single file extension as the video is delivered as a stream of short segments listed in a .mpd manifest, efficient delivery of 4K content. Used in modern streaming platforms
  • AV1 + MP4 - Next-generation open codec, .mp4 file extension, superior compression, royalty-free. Used by Netflix and YouTube
  • VP8 + WebM - Web-optimized combination, .webm file extension, low-latency streaming, open-source. Used in WebRTC video calls, browser-based applications

Audio

Digital audio files provide an approximation of analog audio waveforms. Before any compression is applied, their quality is determined by two parameters:

  1. Sample Rate - How many times a second the magnitude of the analog audio signal is measured. The most common sample rate is 44.1kHz, which stores a digital value 44,100 times every second. It is used on all audio CDs and is considered high quality because it can capture frequencies up to about 22kHz, which covers the range of human hearing. Professional setups normally use 48kHz; to reduce quality loss during production they use 88.2kHz and 96kHz, and 176.4kHz and 192kHz when even more quality is required
  2. Bit Depth - Most audio formats sample data using a method called pulse code modulation (PCM), which allocates a fixed number of bits to represent the height of the audio wave. Bit depth is the number of bits allocated to each sample. 8-bit audio is considered low quality, as the magnitude of the audio wave has to be recorded on a scale of just 256 values. Stepping up to 16-bit audio provides 65,536 values (0 to 65,535), which gives much higher quality and is the bit depth used for audio CDs and many digital audio files. Bit depths of 24-bit (3 bytes, 0 to 16,777,215) and 32-bit floating point (magnitudes from about 1.2 x 10^-38 to 3.4 x 10^38) are used when higher quality is needed
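
The two parameters together fix how much data uncompressed PCM audio needs. The following is a small arithmetic sketch; the function names are illustrative, and the channel count (2 for stereo) is an extra input that the notes above do not cover.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values an integer PCM sample can take."""
    return 2 ** bit_depth


def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


def pcm_size_bytes(sample_rate_hz: int, bit_depth: int, channels: int,
                   seconds: int) -> int:
    """Bytes needed to store the given duration of uncompressed PCM audio."""
    return pcm_bit_rate(sample_rate_hz, bit_depth, channels) * seconds // 8


if __name__ == "__main__":
    print(quantization_levels(8))              # 256
    print(quantization_levels(16))             # 65536
    print(pcm_bit_rate(44_100, 16, 2))         # 1411200 bits/s for CD audio
    print(pcm_size_bytes(44_100, 16, 2, 60))   # 10584000 bytes per minute
```

So one minute of CD-quality stereo takes roughly 10.6 MB before any compression, which is the baseline the compressed formats below are measured against.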

The concepts of codec, container and format also apply to audio files, as described below

High Quality, Non-Compressed Audio:
Used in production environments

  • WAV - Waveform Audio File Format, made by IBM and Microsoft in 1991, uses a container with the .wav extension
  • BWF - Broadcast Wave Format, made by the EBU in 1997 by extending WAV. The audio data is identical, and many BWF files use the .wav extension rather than .bwf
  • AIFF - Audio Interchange File Format, developed by Apple in 1988, uses a container with the .aif or .aiff extension
  • All three of the above support sample rates up to 192kHz at up to 32-bit float bit depth
  • DSD - Direct Stream Digital, made by Sony and Philips in 1999 for their Super Audio CD (SACD) format
    • DSD abandons the pulse code modulation encoding method, which usually allocates 8, 16, 24 or 32 bits of data to each audio sample
    • Instead, DSD uses pulse density modulation, which allocates just one bit per sample. Each bit nudges the reconstructed signal up or down, so the amplitude is carried by the density of 1s over time
    • It has a sample rate of 2.8MHz for the variant called DSD64, with other variants sampling at 5.6, 11.2 or even 22.4MHz. This lets DSD offer excellent audio quality, but it is difficult to edit
    • While DSD was developed for Super Audio CDs, these actually store data in a losslessly compressed version of DSD called DST (Direct Stream Transfer)
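
The one-bit idea can be illustrated with a first-order sigma-delta modulator, a common textbook way of producing a pulse-density stream. This is a simplified sketch under that assumption, not the modulator used by real DSD hardware, and the function names are invented for the example.

```python
from typing import Iterable, List


def one_bit_modulate(samples: Iterable[float]) -> List[int]:
    """Turn samples in the range -1.0..1.0 into a stream of single bits.

    A running error (the integrator) decides each bit, so the share of
    1s in the output follows the input amplitude.
    """
    integrator = 0.0
    feedback = 0.0  # the value the previous bit represented (+1.0 or -1.0)
    bits: List[int] = []
    for sample in samples:
        integrator += sample - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pulse_density(bits: List[int]) -> float:
    """Fraction of 1s in the stream; amplitude is roughly 2 * density - 1."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    for level in (-1.0, 0.0, 0.5):
        stream = one_bit_modulate([level] * 1000)
        print(level, pulse_density(stream))
```

A constant input of 0.5 yields about three 1s in every four bits, and silence (0.0) yields an even mix of 1s and 0s. Because the information is spread across many bits, operations like volume changes cannot be applied sample by sample, which is one reason DSD is hard to edit.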

High Quality, Lossless Compression Audio:
Here the audio files are compressed to take less space, but no quality is lost: the original audio can be restored exactly

  • FLAC - Free Lossless Audio Codec, released in 2001 and maintained by the non-profit Xiph.org Foundation, uses a container with the .flac extension
  • ALAC - Apple Lossless Audio Codec, made by Apple in 2004, uses a container with the .m4a extension
  • Both FLAC and ALAC handle high-resolution audio such as 192kHz sample rates at 24-bit depth, and file sizes may be up to 70% smaller than WAV or AIFF
  • Dolby TrueHD - Lossless member of the Dolby family of codecs, based on Meridian Lossless Packing (MLP). It should not be confused with AC-4, which is a separate lossy Dolby codec
  • Monkey's Audio - Free losslessly compressed format that uses the .ape extension, made by Matthew Ashland in 2000
  • WavPack - Acts like a zip compressor for audio files, preserving all headers and metadata so the restored files are identical to the original. It uses the .wv extension and was made by David Bryant in 1998
  • WMA Lossless - Lossless version of Windows Media Audio

Lower Quality, Lossy Compression:
These formats sacrifice quality to reduce file size and, in turn, bit rate (the number of bits transmitted per second, which matters when streaming music over the internet)

  • MP3 - Depending on the codec used, MP3 means one of “MPEG-1 Audio Layer III”, “MPEG-2 Audio Layer III” or “MPEG-2.5 Audio Layer III”. It was made by the Fraunhofer Society in 1991 and uses the .mp3 file extension. It supports sample rates up to 48kHz and data rates up to 320kbps
  • AAC - Advanced Audio Coding, made in 1997, supports up to 96kHz at up to 384kbps. It can be delivered in many containers, such as .aac, .m4a, .m4b and .m4r
  • AC-3 - Audio Codec 3, made by Dolby and marketed as Dolby Digital. Its successor, Enhanced AC-3, is also called “E-AC-3”, “EC-3”, “Dolby Digital Plus” or “DDP”
  • MQA - Master Quality Authenticated, a proprietary high-resolution format whose future is uncertain
  • Ogg Vorbis - Open, non-proprietary format that uses the .ogg extension
  • Opus - Open format that is great for streaming and uses the .opus extension
  • WMA - Lossy version of Windows Media Audio

Image

There are two types of compression in image formats

  1. Lossy Compression - Discards data and reduces the image quality, for example by using the Discrete Cosine Transform (DCT)
  2. Non-Lossy Compression - Reduces file size while retaining all the information, for example by using run-length encoding (RLE)
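
Run-length encoding is simple enough to show in full. Here is a minimal sketch, assuming one row of an image is a list of pixel values; the function names are illustrative, and real formats pack the runs into bytes rather than Python tuples.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (how many times, pixel value)


def rle_encode(pixels: List[int]) -> List[Run]:
    """Collapse each run of identical neighbouring pixels into (count, value)."""
    runs: List[Run] = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1] = (runs[-1][0] + 1, value)  # extend the current run
        else:
            runs.append((1, value))              # start a new run
    return runs


def rle_decode(runs: List[Run]) -> List[int]:
    """Expand the runs back into the exact original pixels."""
    pixels: List[int] = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


if __name__ == "__main__":
    row = [255] * 6 + [0] * 2 + [255] * 4
    encoded = rle_encode(row)
    print(encoded)                     # [(6, 255), (2, 0), (4, 255)]
    print(rle_decode(encoded) == row)  # True
```

Decoding gives back every pixel unchanged, which is what makes the method non-lossy. It only saves space on images with long flat runs, such as logos and screenshots; on noisy photographs the run list can end up larger than the original.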

Color Channels: Most image formats store data in three color channels using the red/green/blue (RGB) color space, while professional print work also uses the four channels cyan/magenta/yellow/black (CMYK). In addition, some formats also support a single-channel black and white (greyscale) mode

Different Image formats also store different quantities of color information

  • Most formats can support 8 bits per channel (8bpc), which with a three-channel RGB color space delivers 24-bit color. This works for most cases
  • Some formats also support 16bpc and 32bpc, which provide greater flexibility in image manipulation
  • Others offer only 8-bit color in total (a palette of 256 colors) to save file space
  • In addition to color values, some raster formats can store individual pixel transparency in an alpha channel. For example, an image with RGB values becomes RGBA when an alpha channel is added
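
The numbers above follow from simple multiplication. The sketch below works them out; the function names and the 1920 x 1080 example image are assumptions made for illustration.

```python
def bits_per_pixel(channels: int, bits_per_channel: int) -> int:
    """Total bits stored for one pixel."""
    return channels * bits_per_channel


def color_count(color_channels: int, bits_per_channel: int) -> int:
    """Number of distinct colors (count only the color channels, not alpha)."""
    return 2 ** (color_channels * bits_per_channel)


def uncompressed_size_bytes(width: int, height: int, channels: int,
                            bits_per_channel: int) -> int:
    """Bytes needed for the raw pixel grid, before any compression."""
    return width * height * bits_per_pixel(channels, bits_per_channel) // 8


if __name__ == "__main__":
    print(bits_per_pixel(3, 8))                        # 24 (RGB at 8bpc)
    print(color_count(3, 8))                           # 16777216 colors
    print(bits_per_pixel(4, 8))                        # 32 (RGBA at 8bpc)
    print(uncompressed_size_bytes(1920, 1080, 3, 8))   # 6220800 bytes
    print(uncompressed_size_bytes(1920, 1080, 4, 8))   # 8294400 bytes
```

Adding an alpha channel to an 8bpc RGB image therefore raises the raw size by a third, and moving from 8bpc to 16bpc doubles it.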

There are two types of Image formats
Raster Image Formats:
A raster format stores the image as a grid of pixels, also called a bitmap. This means that when you zoom into the image you see the jagged edges of individual pixels. It is best suited to photographs and other images that are hard to define mathematically, and to images where quality under scaling matters less

  • JPEG - Created by the Joint Photographic Experts Group in 1992. It stores raster images in either RGB or CMYK with DCT lossy compression, using the .jpg or .jpeg extension. It has no alpha channel and a maximum of 24-bit color. It is widely supported, as most devices can capture and read it, and is widely used for sharing photographs. In some cases it is also used in print workflows, in which case it should be stored at high resolution in the CMYK color space
  • PNG - Portable Network Graphics, first released in 1996. It stores raster images using non-lossy compression and includes an alpha channel, which makes it very attractive for website assets. On the flip side, it is commonly saved as 8-bit (PNG-8) or 24-bit (PNG-24) color and is RGB only, with no CMYK option. There is also PNG-32, which includes the alpha channel by allocating 8 bits per pixel to each of the red, green, blue and alpha channels. It uses the .png file extension
  • GIF - Graphics Interchange Format, made by CompuServe in 1987. It is raster based with non-lossy compression and has a small file size, as it is limited to 8-bit color with a maximum palette of 256 colors. It is widely used in web graphics, including animations, and uses the .gif file extension
  • TIFF - Tagged Image File Format, made in 1986. It stores raster images with no compression, non-lossy compression or lossy compression, depending on the options chosen when the file is saved. It supports RGB, CMYK and B&W with 8, 16 or 32bpc, and an alpha channel is also included. This makes it ideal for professional graphics, especially in print applications. It has wide software support, and some high-end cameras save TIFF files directly. It uses the .tif or .tiff file extension
  • PSD - Photoshop Document, the native Adobe Photoshop file format. It mainly stores non-compressed raster images, but can include vector text and other elements. It supports RGB, CMYK and B&W with 8, 16 or 32bpc, and is layer based with an alpha channel per layer
  • RAW - A native raster digital camera file format that preserves information that is lost when choosing JPEG or TIFF files. It offers an RGB color space with 12, 14 or 16bpc depending on the camera model. Each image is a dump from the camera’s sensor, so no data is thrown away by the camera’s internal image processing. Most image viewers cannot read it; only specialized software such as Photoshop can. The file extension varies by manufacturer (for example .cr2, .nef or .arw), with .raw as a generic one

Vector Image Formats:
A vector format stores each part of the image as a mathematical formula. This means that when you zoom into the image it is still sharp. It is best suited to images that include logos, text and other elements for which quality is the highest priority. This format removes the need for alpha channels to deliver transparency and allows a wide range of color spaces to be used

  • AI - Adobe Illustrator, the dominant file format for creating vector graphic artwork. It can be edited in many other packages and uses the .ai file extension
  • SVG - Scalable Vector Graphics, made by the World Wide Web Consortium (W3C) as an open standard. It is the native file format of the Inkscape vector editor and uses the .svg file extension
  • EPS - Encapsulated PostScript, vector based and able to embed raster images
  • PDF - Portable Document Format, vector based and able to embed raster images
  • CDR - Native CorelDRAW format

File

File compression reduces the size of a file to create a smaller version that uses less space. Often the compressed file is an archive format that includes multiple files. File compression can be both lossy and non-lossy; the focus here is non-lossy file compression

Non-lossy file compression maintains all data by identifying repeated data patterns and replacing them with codes that take up less space. An early method was Huffman coding, published by David Huffman in 1952. In 1977 and 1978, Abraham Lempel and Jacob Ziv defined the lossless compression methods LZ77 and LZ78. These are dictionary (substitution) coders, as they replace repeated data patterns with a reference to an entry in a dictionary
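
Huffman coding gives frequent symbols short bit codes and rare symbols long ones. The following is a compact sketch of the idea that works on characters of a string; the function names are made up, and the codes are kept as text strings of "0" and "1" for readability instead of being packed into real bits.

```python
import heapq
from collections import Counter
from typing import Dict


def huffman_codes(data: str) -> Dict[str, str]:
    """Build a prefix-free bit code in which frequent symbols get short codes."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(weight, order, {symbol: ""})
            for order, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_order = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; one side gets a leading 0, the other a 1.
        weight_a, _, left = heapq.heappop(heap)
        weight_b, _, right = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in left.items()}
        merged.update({symbol: "1" + code for symbol, code in right.items()})
        heapq.heappush(heap, (weight_a + weight_b, next_order, merged))
        next_order += 1
    return heap[0][2]


def huffman_encode(data: str, codes: Dict[str, str]) -> str:
    return "".join(codes[symbol] for symbol in data)


def huffman_decode(bits: str, codes: Dict[str, str]) -> str:
    lookup = {code: symbol for symbol, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # safe because no code is a prefix of another
            decoded.append(lookup[current])
            current = ""
    return "".join(decoded)


if __name__ == "__main__":
    text = "aaaaaaaabbbc"
    codes = huffman_codes(text)
    bits = huffman_encode(text, codes)
    print(codes, len(bits), "bits instead of", len(text) * 8)
    print(huffman_decode(bits, codes) == text)
```

For the sample text the 12 characters shrink from 96 bits to 16, because "a" appears most often and gets a one-bit code. DEFLATE, described below, applies this kind of coding to the output of an LZ77 stage.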

  • zip - In 1986, Phil Katz founded a company called PKWARE to create data compression software. In 1989 it introduced a lossless file compression format called zip along with a DOS program called PKZIP. Its most common compression algorithm is DEFLATE, which uses a combination of LZ77 and Huffman coding
  • gz - In 1992, Jean-loup Gailly and Mark Adler released the GZip file compression format and software for the GNU project. It uses the DEFLATE compression algorithm. GZip is not an archiving format, so it can only compress a single file. It uses the .gz extension
  • rar - In 1993, Eugene Roshal created the Roshal Archive format (RAR). It is proprietary: the license allows anybody to create software capable of decompressing a RAR, but RAR files can only be created with commercial applications like WinRAR
  • 7z - In 1999, Igor Pavlov released 7-Zip, which has a native archiving format called 7z. It can use a number of different compression algorithms, including LZMA and LZMA2. LZMA stands for Lempel-Ziv-Markov chain algorithm and is based on a variant of LZ77. LZMA2 compression is faster than LZMA as it makes better use of multiple processor cores
  • lzma - LZMA also has its own non-archiving file compression format, with the .lzma extension. XZ Utils can create .lzma files, and it also has its own XZ format (.xz), which is non-archiving and uses LZMA2 compression
  • BZip2 - Open source file compression utility. It has a native format, bz2, and uses a compression algorithm based on the Burrows-Wheeler Transform (BWT)
  • Compressed TAR files - In 1979, the Tape Archive (TAR) format was developed at AT&T Bell Labs. It was created to package files together for storage on tape and is not a file compression format. However, TAR files are often compressed using other methods
    • GZip can be used to create compressed TAR files with a .tgz or .tar.gz extension
    • Similarly, it can be done using XZ and BZip2, giving the file extensions .tar.xz and .tar.bz2
  • zipx - In 2008, WinZip made a new version of the zip format called Zipx (Zip extended), which creates smaller files using newer compression methods such as XZ
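
Several of these formats can be tried directly from the Python standard library, which ships the zlib (DEFLATE), gzip, bz2, lzma and tarfile modules. The sketch below compares them on the same input and builds a small .tar.gz in memory; the helper names and the sample data are assumptions made for the example, and RAR, 7z and Zipx are left out because the standard library has no module for them.

```python
import bz2
import gzip
import io
import lzma
import tarfile
import zlib
from typing import Dict, List


def compressed_sizes(data: bytes) -> Dict[str, int]:
    """Compress the same bytes with several algorithms and return each size.

    Every result is decompressed again to confirm nothing was lost.
    """
    codecs = {
        "deflate (zlib)": (zlib.compress, zlib.decompress),
        "gzip (.gz)": (gzip.compress, gzip.decompress),
        "bzip2 (.bz2)": (bz2.compress, bz2.decompress),
        "xz (.xz)": (lzma.compress, lzma.decompress),
    }
    sizes: Dict[str, int] = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        if decompress(packed) != data:
            raise ValueError(f"{name} did not round-trip")
        sizes[name] = len(packed)
    return sizes


def make_tar_gz(files: Dict[str, bytes]) -> bytes:
    """Package several files with tar, then compress the archive with gzip."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def list_tar_gz(blob: bytes) -> List[str]:
    """Return the file names stored in a .tar.gz held in memory."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    sample = b"the quick brown fox " * 500  # 10,000 bytes of repeated text
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {len(sample)} -> {size} bytes")
    blob = make_tar_gz({"a.txt": b"hello", "b.txt": b"world"})
    print(list_tar_gz(blob))
```

The tar step shows the split described above: tar only bundles the files, and gzip then compresses the single resulting stream. Highly repetitive input like the sample compresses far better than typical files, so the printed sizes are not representative of real-world ratios.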