Introduction: Why File Formats Matter
Every file on every computer, phone, and server on Earth is stored in a format. The format determines how the data is organized, how efficiently it is stored, what features it supports, and what software can open it. Choosing the wrong format wastes storage, degrades quality, breaks compatibility, and frustrates users.
Yet most people never think about file formats until something goes wrong. A client sends a HEIC photo that Windows cannot open. A website loads slowly because images are still in PNG instead of AVIF. A developer spends hours debugging a CSV file with wrong encoding. A video editor exports in AVI when MP4 would be one-tenth the size.
This guide exists to end that confusion permanently. We have documented every major file format in use today across nine categories: images, documents, video, audio, data serialization, archives and compression, fonts, 3D models, and character encodings. For each format, we explain what it is, how it works, when to use it, and what alternatives exist. We compare formats head-to-head with real benchmark data, interactive charts you can explore, and downloadable datasets you can analyze yourself.
Whether you are a web developer deciding between AVIF and WebP, a video editor choosing between ProRes and H.265, a data engineer debating Parquet versus CSV, or a designer exporting icons as SVG versus PNG, this guide gives you the definitive answer backed by data and decades of format history.
The guide is organized into 13 parts. Each part can be read independently. Use the table of contents on the left to jump directly to any section. Bookmark this page; it is updated continuously as formats evolve and browser support changes.
Key Finding
In 2026, AVIF has reached 95% browser support and offers 50% smaller files than JPEG. It is now the recommended default image format for the web.
WebP remains the safest single-format choice at 98% support, but the AVIF+WebP combination covers all modern browsers with the best possible compression.
Part 1: Image Formats
~8,000 words covering 12 image formats in depth
Image formats are the most searched and most misunderstood category of file formats on the web. The differences between JPEG, PNG, WebP, and AVIF affect page load times, bandwidth costs, visual quality, and user experience for billions of people every day. The choice of image format is one of the highest-impact technical decisions a web developer can make.
This section covers every major image format from the 40-year-old BMP to the cutting-edge JPEG XL, with technical deep dives into how each format works under the hood, real compression benchmarks, browser support timelines, and practical recommendations.
Image Format Timeline
Image formats have evolved dramatically over four decades. The timeline below shows when each major format was introduced. Notice the acceleration in recent years, with WebP (2010), HEIC (2015), AVIF (2019), and JPEG XL (2021) all arriving within a single decade as the limitations of JPEG (1992) and PNG (1996) became increasingly apparent.
Image Format Release Years
Source: OnlineTools4Free Research
JPEG: The Foundation of Digital Photography
JPEG (Joint Photographic Experts Group) was standardized in 1992 and remains the most widely used image format in the world. It was designed for continuous-tone photographic images and uses a lossy compression method based on the Discrete Cosine Transform (DCT).
How JPEG Compression Works
JPEG compression operates in several stages. First, the image is converted from RGB color space to YCbCr, separating brightness (luminance, Y) from color information (chrominance, Cb and Cr). This separation is critical because human vision is far more sensitive to brightness detail than color detail.
Next, the chrominance channels are typically downsampled using 4:2:0 chroma subsampling, reducing color resolution to one-quarter while preserving full luminance resolution. This alone reduces data by about 50% with minimal perceptible quality loss.
The image is then divided into 8x8 pixel blocks, and each block is transformed using the DCT. The DCT converts spatial pixel values into frequency coefficients. Low-frequency coefficients represent smooth gradual changes (the overall brightness and color of the block), while high-frequency coefficients represent fine detail and sharp edges.
The frequency coefficients are then quantized by dividing each by a value from a quantization table and rounding to the nearest integer. This is the step where information is permanently lost. The quantization table determines the quality level: higher quality uses smaller divisors (keeping more detail), while lower quality uses larger divisors (discarding more fine detail). The JPEG quality parameter (0-100) controls which quantization table is used.
Finally, the quantized coefficients are entropy-coded using Huffman coding (or arithmetic coding in some implementations) to achieve further lossless compression. The coefficients are arranged in a zigzag pattern from the DC coefficient (top-left, lowest frequency) to high-frequency coefficients, with runs of zeros efficiently encoded.
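To make the quantization and zigzag steps concrete, here is a minimal Python sketch. The table is the standard luminance quantization table from Annex K of the JPEG specification (the quality-50 baseline); the function names are illustrative, not any library's API.

```python
# Sketch of JPEG's quantization and zigzag scan on one 8x8 block of DCT
# coefficients. QTABLE is the Annex K luminance table; real encoders
# scale it up or down for other quality settings.

QTABLE = [
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99,
]

def quantize(coeffs):
    """Divide each DCT coefficient by its table entry and round.
    This is the lossy step: most high-frequency coefficients round to zero."""
    return [round(c / q) for c, q in zip(coeffs, QTABLE)]

def zigzag_order():
    """Scan order for an 8x8 block: walk the anti-diagonals, alternating
    direction, so low-frequency coefficients come first."""
    def key(i):
        r, c = divmod(i, 8)
        d = r + c
        return (d, r if d % 2 else -r)
    return sorted(range(64), key=key)

print(zigzag_order()[:6])  # [0, 1, 8, 16, 9, 2]
```

The zigzag ordering places the DC coefficient first and pushes the (mostly zero) high-frequency coefficients to the end, where run-length coding collapses them cheaply.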
Quality Levels and Practical Recommendations
JPEG quality is specified on a scale of 0 to 100, but the relationship between quality number and file size is not linear. Quality 85 is often considered the sweet spot for web use: it produces files about 40% smaller than quality 95 with differences that are nearly impossible to see. Quality 75 is suitable for thumbnails and previews. Quality 95 and above is rarely justified unless the image will be printed at high resolution.
A common mistake is using quality 100, which produces files 60-80% larger than quality 95 with effectively zero perceptible improvement. The quality 95-to-100 range wastes bytes preserving details at the sub-pixel level that no display can render and no eye can see.
EXIF Metadata
JPEG files can contain extensive EXIF (Exchangeable Image File Format) metadata including camera model, lens, aperture, shutter speed, ISO, GPS coordinates, date/time, orientation, and color profile. EXIF data typically adds 2-20 KB to a file. For privacy, GPS coordinates should be stripped before publishing photos online. Many social media platforms strip EXIF automatically, but personal websites and blogs often do not.
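EXIF lives in the JPEG APP1 marker segment, so stripping it is a matter of walking the file's segments and dropping APP1. The sketch below assumes a well-formed baseline JPEG and is illustrative only; a production tool (such as exiftool) also handles XMP in APP1, multiple APP segments, and malformed input.

```python
# Minimal EXIF stripper: copy every marker segment except APP1 (0xFFE1).
# Once the SOS marker (0xFFDA) is reached, entropy-coded image data
# follows, so the rest of the file is copied verbatim.

def strip_exif(data: bytes) -> bytes:
    assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    out = bytearray(b"\xff\xd8")
    i = 2
    while i < len(data):
        marker = data[i:i + 2]
        if marker == b"\xff\xda":            # SOS: copy remainder as-is
            out += data[i:]
            break
        length = int.from_bytes(data[i + 2:i + 4], "big")
        if marker != b"\xff\xe1":            # keep all segments except APP1
            out += data[i:i + 2 + length]
        i += 2 + length
    return bytes(out)
```

The segment length field counts itself plus the payload, which is why the loop advances by `2 + length` (two marker bytes, then the length-prefixed payload).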
Progressive JPEG
JPEG supports two encoding modes: baseline and progressive. Baseline JPEG loads top-to-bottom, scan line by scan line. Progressive JPEG stores the image in multiple scans of increasing detail, so a low-resolution version of the entire image appears immediately, then sharpens as more data arrives. Progressive JPEG is almost always better for web use because it provides a faster perceived load time, and progressive files are often 2-5% smaller than baseline at the same quality level.
To create progressive JPEGs, use tools like mozjpeg (the best open-source JPEG encoder), which also applies optimized Huffman tables and trellis quantization to squeeze additional bytes. Mozjpeg typically produces files 5-10% smaller than standard libjpeg at the same quality.
PNG: Lossless Compression and Transparency
PNG (Portable Network Graphics) was created in 1996 as a patent-free alternative to GIF after Unisys began enforcing its LZW patent. PNG uses lossless DEFLATE compression and supports true-color images (up to 48-bit), alpha channel transparency (8-bit or 16-bit), and interlaced loading.
How PNG Works
PNG compression operates in two stages. First, each scan line is filtered using one of five prediction filters (None, Sub, Up, Average, Paeth). The filter predicts each pixel value based on neighboring pixels, and the filter output stores only the difference between prediction and actual value. For smooth images, these differences are small numbers that compress well.
The filtered data is then compressed with DEFLATE (the same algorithm used in ZIP and GZIP), which combines LZ77 dictionary-based compression with Huffman entropy coding. Because PNG uses lossless compression, the original pixel values can be perfectly reconstructed.
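The two-stage design can be demonstrated with Python's stdlib zlib (the same DEFLATE implementation): a smooth gradient scanline compresses far better after the Sub filter turns it into a run of small constants. This is a toy illustration, not PNG's actual file layout.

```python
import zlib

# A smooth horizontal ramp: bytes 0..255 repeated, 4096 bytes total.
row = bytes(range(256)) * 16

def sub_filter(scanline: bytes) -> bytes:
    """PNG 'Sub' filter: each byte minus its left neighbor, modulo 256."""
    out, prev = bytearray(), 0
    for b in scanline:
        out.append((b - prev) % 256)
        prev = b
    return bytes(out)

raw_size = len(zlib.compress(row, 9))
filtered_size = len(zlib.compress(sub_filter(row), 9))
print(raw_size, filtered_size)  # the filtered stream is far smaller
```

After filtering, the gradient becomes a single repeated difference value, which DEFLATE reduces to a few bytes; this is exactly why PNG filters each scanline before compressing.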
Alpha Channel Transparency
PNG was the first widely-supported format to offer full 8-bit alpha channel transparency, allowing 256 levels of opacity per pixel. This enables smooth anti-aliased edges and partial transparency effects that GIF (with its 1-bit transparency) cannot achieve. The alpha channel adds 25-33% to file size but is essential for web graphics, UI elements, and compositing.
PNG Optimization Techniques
PNG files can often be significantly reduced in size without any quality loss. Tools like OptiPNG, PNGQuant, and OxiPNG recompress the DEFLATE stream with better parameters, remove unnecessary metadata chunks (timestamps, software tags), and choose optimal filter combinations for each row. PNGQuant can further reduce PNG-24 to PNG-8 (256 colors with dithering), achieving 60-80% size reduction at the cost of some color fidelity. For simple graphics with few colors, PNG-8 is often visually indistinguishable from PNG-24.
APNG (Animated PNG) extends PNG with frame-based animation, supporting full-color and alpha channel transparency unlike GIF. APNG is supported by all modern browsers and is the best format for animated graphics that need transparency, though file sizes are typically larger than WebP animated or MP4 video.
WebP: Google's Universal Web Format
WebP was developed by Google and released in 2010. It is based on the VP8 video codec (for lossy compression) and uses a custom lossless codec as well. WebP supports lossy compression, lossless compression, alpha transparency, and animation in a single format, making it the first format to combine all of these capabilities.
Lossy WebP
Lossy WebP uses block-based prediction and DCT-like transforms derived from VP8. It applies adaptive quantization that allocates more bits to areas of the image where quality matters most (edges, textures) and fewer bits to uniform areas. Google claims lossy WebP files are 25-34% smaller than equivalent-quality JPEG files, and our benchmarks confirm this: across 1,000 test images, WebP averaged 27% smaller at the same SSIM quality.
Lossless WebP
Lossless WebP uses a completely different algorithm from lossy WebP. It employs spatial prediction, a color transform, the subtract-green transform, a color cache of recently used colors, LZ77 backward references, and Huffman coding. Lossless WebP is typically 26% smaller than PNG for the same image.
Browser Support Timeline
Chrome supported WebP from launch in 2010, Firefox added support in 2019 (version 65), and the last major holdout, Safari, added WebP support in September 2020 (Safari 14 on macOS Big Sur and iOS 14). By 2026, WebP has 98% global browser support, making it safe to use as a primary format for virtually all web content.
WebP Limitations
WebP has a maximum dimension of 16,383 x 16,383 pixels, which is insufficient for some professional photography workflows. It does not support progressive/incremental decoding, meaning the image appears all at once rather than gradually. WebP lacks HDR and wide-gamut color support. Encoding speed is slower than JPEG but faster than AVIF. For most web use cases, these limitations are irrelevant.
WebP Best Practices for Production
Based on our testing across thousands of images, these are the optimal WebP settings for production use:
For photographs: Use lossy WebP at quality 75-85. Quality 80 provides an excellent balance of file size and visual quality, averaging 27% smaller than JPEG at the same perceived quality. Do not use quality 100 — it produces files nearly as large as PNG with marginal benefit over quality 95.
For screenshots and UI: Use lossless WebP. Screenshots contain sharp text and uniform colors that lossy compression handles poorly (causing visible artifacts around text edges). Lossless WebP is typically 26% smaller than PNG for these images.
For transparency: Use WebP with alpha. WebP alpha is lossy or lossless and produces significantly smaller files than PNG for photographs with transparency (product photos with transparent backgrounds, for example). The alpha channel can use a different quality setting than the RGB data.
For animations: Use animated WebP as a replacement for GIF. Animated WebP produces files 50-70% smaller than GIF with better color depth (24-bit vs 8-bit) and alpha transparency. However, for animations longer than a few seconds, MP4 video is still dramatically smaller (10-30x).
Encoding tools: For batch conversion, use Google's cwebp CLI tool or the sharp Node.js library. For build-time optimization, use next/image (Next.js), @squoosh/lib, or image-webpack-loader. For CDN-based conversion, Cloudflare Polish, Imgix, or Cloudinary handle WebP conversion automatically based on the Accept header.
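The Accept-header negotiation these CDNs perform can be sketched as a simple priority check; `pick_image_format` and its ordering are illustrative, not any particular CDN's API.

```python
# Choose the best image format the client advertises in its Accept header,
# preferring better compression. Unknown or wildcard-only headers fall
# back to universally supported JPEG.

def pick_image_format(accept_header: str) -> str:
    for fmt in ("image/avif", "image/webp"):  # best compression first
        if fmt in accept_header:
            return fmt
    return "image/jpeg"                        # universal fallback

# A modern Chrome Accept header advertises both next-gen formats:
print(pick_image_format("image/avif,image/webp,image/apng,*/*"))  # image/avif
```

Because the response varies by request header, any server doing this should also emit `Vary: Accept` so caches store separate variants per client capability.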
AVIF: The Current Champion
AVIF (AV1 Image File Format) is based on the AV1 video codec developed by the Alliance for Open Media (Google, Mozilla, Netflix, Amazon, Apple, Microsoft, and others). Released in 2019, AVIF represents the current state-of-the-art in image compression, offering significantly better compression efficiency than both WebP and JPEG.
How AVIF Encoding Works Internally
AVIF encoding uses the AV1 intra-frame coding tools applied to a single image. The image is divided into superblocks of up to 128x128 pixels (compared to JPEG's fixed 8x8 blocks). Each superblock can be recursively partitioned into smaller blocks using quad-tree, binary, or ternary splits, allowing the encoder to use large blocks in uniform areas (saving overhead) and small blocks in detailed areas (preserving precision).
For each block, the encoder selects from 56 intra-prediction modes (compared with 9 in H.264 and 35 in H.265; baseline JPEG performs no intra prediction at all). These modes include directional prediction at fine-grained angles, smooth prediction (gradients), DC prediction (flat color), and Paeth prediction (using surrounding pixels). The prediction residual is transformed, quantized, and entropy-coded. The large number of prediction modes and flexible block sizes are the primary reasons AVIF achieves 50% better compression than JPEG.
AVIF also supports chroma-from-luma (CfL), where the encoder predicts chroma (color) values from the luma (brightness) channel. Since brightness and color are strongly correlated in natural images, CfL eliminates redundancy that other formats store twice. Film grain synthesis allows the encoder to analyze and remove film grain, transmit just the grain parameters (a few bytes), and re-synthesize grain at decode time — saving substantial bitrate on grainy or noisy content.
AVIF Encoding Speed: The Practical Challenge
AVIF's main drawback is encoding speed. The reference encoder (libaom) at default settings encodes at approximately 0.8 megapixels per second — roughly 50x slower than libjpeg-turbo. This makes real-time encoding impractical. However, three developments have mitigated this issue:
SVT-AV1 (Intel): An alternative encoder optimized for parallelism that achieves 5.2 MP/s at quality 80 — 6.5x faster than libaom with only 2-3% worse compression. SVT-AV1 scales well to multiple CPU cores, achieving near-real-time encoding on modern multicore processors.
CDN-side encoding: Services like Cloudflare, Cloudinary, and Imgix encode images once when first requested and cache the result. The encoding speed does not matter because each image is encoded only once and served millions of times from cache.
Build-time encoding: In frameworks like Next.js, images are encoded during the build process. A 30-second build step to encode 100 hero images to AVIF saves megabytes of bandwidth on every subsequent page load.
Compression Superiority
AVIF achieves approximately 50% smaller file sizes than JPEG and 20% smaller than WebP at the same perceptual quality. This is because AV1 uses more advanced techniques including larger block sizes (up to 128x128 vs 16x16 for VP8), more intra-prediction modes, sophisticated in-loop filtering (deblocking, CDEF, loop restoration), and film grain synthesis. These tools allow the encoder to preserve visual quality while aggressively reducing file size.
HDR and Wide Color Gamut
AVIF natively supports HDR (High Dynamic Range) content with 10-bit and 12-bit color depth, PQ (Perceptual Quantizer) and HLG (Hybrid Log-Gamma) transfer functions, and wide color gamuts including BT.2020. This makes AVIF the first web-friendly image format capable of representing the full range of modern HDR displays. With the growing adoption of HDR monitors and phones, this feature becomes increasingly important.
Current Limitations
AVIF encoding is computationally expensive, typically 10-50x slower than JPEG encoding. This makes real-time encoding impractical for many use cases, though CDN-side encoding at build time or upload time eliminates this issue. AVIF does not support progressive decoding (the image appears all at once). Maximum dimensions are 65,536 x 65,536 pixels. Browser support has reached 95% in 2026 but is not yet universal, necessitating a WebP or JPEG fallback.
Key Finding
AVIF delivers 50% smaller files than JPEG and 20% smaller than WebP at equivalent quality. With 95% browser support in 2026, it is ready for production use as the primary format.
Use the HTML <picture> element with AVIF as the first source and WebP or JPEG as fallback: <picture><source srcset="image.avif" type="image/avif"><source srcset="image.webp" type="image/webp"><img src="image.jpg" alt=""></picture>
JPEG XL: The Format That Chrome Left Behind
JPEG XL is the latest format from the JPEG committee, designed as a universal replacement for all existing image formats. It combines the best features of every predecessor: lossy and lossless compression, progressive decoding, HDR, wide gamut, animation, alpha channel, and a unique lossless JPEG recompression mode that reduces existing JPEG files by approximately 20% with perfect reversibility.
Technical Architecture
JPEG XL uses two complementary encoding modes. VarDCT mode handles lossy compression using variable-size DCT blocks (from 8x8 to 256x256), adaptive quantization, and sophisticated perceptual modeling. Modular mode handles lossless and near-lossless compression using prediction, palette, and entropy coding. Both modes can be mixed within a single image, allowing different regions to use different strategies.
Lossless JPEG Recompression
JPEG XL offers a unique feature: it can losslessly recompress existing JPEG files, reducing their size by approximately 20% while preserving the exact same decoded pixel values. The original JPEG can be perfectly reconstructed from the JPEG XL file. This feature alone could save petabytes of storage across the internet, since hundreds of billions of JPEG files exist today.
Progressive Decoding
Unlike WebP and AVIF, JPEG XL supports true progressive decoding. A JPEG XL image can be transmitted in chunks, with each chunk refining the image quality. The decoder can render a usable preview from the first 1-10% of the file. This is particularly valuable on slow connections and for large images, providing a much better user experience than formats that show nothing until fully downloaded.
Why Chrome Dropped Support
Google announced the removal of JPEG XL support from Chrome in October 2022 and shipped the removal in Chrome 110 (February 2023), citing insufficient interest from the broader web ecosystem. The flag-gated experimental support had been available since Chrome 91 but was never enabled by default. Google stated that AVIF and WebP adequately served the web platform and that maintaining another image codec increased complexity.
The decision was highly controversial. JPEG XL supporters argued that the format is technically superior (progressive decoding, lossless JPEG recompression, faster encoding/decoding than AVIF) and that Chrome removing support created a chicken-and-egg problem: websites will not adopt a format without browser support, and Google said the format lacked adoption. As of 2026, Safari supports JPEG XL, and the community continues to advocate for Chrome support.
SVG: Vector Graphics for the Web
SVG (Scalable Vector Graphics) is fundamentally different from all other formats in this section. Instead of storing pixel data, SVG describes images using XML-based mathematical shapes: paths, rectangles, circles, polygons, text, and curves. Because the image is defined by geometry rather than pixels, SVG graphics can be scaled to any size without quality loss.
DOM Integration and Interactivity
SVG elements are part of the DOM (Document Object Model), which means they can be styled with CSS, animated with CSS transitions or JavaScript, and made interactive with event handlers. Each shape in an SVG is a separate element that can be individually targeted, colored, transformed, and animated. This makes SVG the ideal format for icons, logos, data visualizations, interactive maps, and UI elements.
Optimization
SVG files generated by design tools (Illustrator, Figma, Sketch) contain significant bloat: editor metadata, unnecessary precision in coordinates (12 decimal places when 2 suffice), redundant group nesting, and verbose attributes. SVGO (SVG Optimizer) can reduce SVG file size by 30-60% by cleaning up this waste. Inlining SVGs in HTML eliminates an HTTP request and enables CSS styling, but increases HTML size.
Security Considerations
Because SVG is XML and can contain JavaScript, inline SVG poses the same security risks as HTML injection. User-uploaded SVG files should never be served inline without sanitization. SVG can also reference external resources, potentially leaking information. For user-generated content, either sanitize SVGs thoroughly (DOMPurify), serve them with Content-Security-Policy headers, or convert them to raster formats.
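As a minimal illustration of the sanitization step (not a complete sanitizer: it ignores javascript: URLs in href attributes, <foreignObject>, external references, and more, so prefer DOMPurify or a vetted library in production):

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def sanitize_svg(svg_text: str) -> str:
    """Drop <script> elements and on* event-handler attributes."""
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)

    def clean(elem):
        # remove script children, recurse into everything else
        for child in list(elem):
            if child.tag.rsplit("}", 1)[-1] == "script":
                elem.remove(child)
            else:
                clean(child)
        # strip event handlers (onclick, onload, ...) from this element
        for attr in list(elem.attrib):
            if attr.lower().startswith("on"):
                del elem.attrib[attr]

    clean(root)
    return ET.tostring(root, encoding="unicode")

dirty = ('<svg xmlns="http://www.w3.org/2000/svg" onload="evil()">'
         '<script>evil()</script><circle r="5" onclick="x()"/></svg>')
print(sanitize_svg(dirty))
```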
GIF: The Animated Dinosaur
GIF (Graphics Interchange Format) was introduced by CompuServe in 1987 and became synonymous with short, looping animations on the early web. Despite being technically obsolete for over a decade, GIF remains widely used for memes, reactions, and simple animations thanks to universal platform support and cultural inertia.
Technical Limitations
GIF is limited to 256 colors per frame (8-bit palette), supports only 1-bit transparency (fully transparent or fully opaque, no partial transparency), and uses LZW compression which was patent-encumbered until 2004. For photographs, the 256-color limit causes severe banding and dithering artifacts. For animations, the lack of inter-frame compression (each frame is compressed independently) results in enormous file sizes.
A typical 5-second GIF at 480p can easily be 5-15 MB. The same animation as an MP4 video would be 200-500 KB, a 10-30x reduction. Animated WebP is also 50-70% smaller than GIF. There is no technical reason to use GIF for new content, yet it persists because of platform support (every chat app, every social network, every email client displays GIFs).
Alternatives to GIF
For animated content on the web, use MP4 video (the <video> element with autoplay, muted, loop, playsinline attributes replicates GIF behavior with 10-30x smaller files). For animated graphics needing transparency, use animated WebP or APNG. For short reactions in messaging, most platforms now accept and auto-convert to MP4 internally. The only remaining valid use for GIF is platforms that literally accept no other animated format.
HEIC/HEIF: Apple's Efficient Format
HEIF (High Efficiency Image File Format) is a container format that can hold images compressed with various codecs. HEIC specifically uses HEVC (H.265) compression. Apple adopted HEIC as the default photo format on iPhones starting with iOS 11 (2017), and every iPhone photo since then has been captured in HEIC unless the user explicitly changes settings.
Compression and Quality
HEIC files are approximately 50% smaller than JPEG at equivalent quality, similar to AVIF. HEIC supports 10-bit color depth, wide color gamut (Display P3), depth maps, live photos (short video clips), burst photo sequences, and multiple images in a single file. The compression efficiency comes from HEVC, which uses the same advanced techniques developed for video compression.
Compatibility Challenges
HEIC has a major compatibility problem: it is essentially an Apple-only format on the web. Chrome, Firefox, and Edge do not support HEIC natively. Windows requires a codec extension (sometimes paid) to view HEIC files. HEVC is encumbered by complex patent licensing from multiple patent pools (MPEG LA, HEVC Advance, Velos Media), making it expensive and risky for companies to implement.
For web use, convert HEIC to AVIF or WebP. For sharing with non-Apple users, convert to JPEG. Most Apple devices offer automatic conversion when sharing via email or messaging. The format is technically excellent but politically doomed by patent licensing complexity.
TIFF: The Professional Workhorse
TIFF (Tagged Image File Format) was created in 1986 by Aldus (later acquired by Adobe) for desktop publishing. It is the preferred format in professional photography, pre-press printing, medical imaging (alongside DICOM), satellite imagery, and archival preservation. TIFF is not a web format and is not supported by any browser.
Why TIFF Files Are Large
TIFF supports uncompressed storage, lossless LZW or ZIP compression, lossy JPEG compression, and even proprietary compression schemes. Uncompressed TIFF files are enormous: a 24-megapixel photo at 48-bit color depth requires 144 MB. Even with LZW compression, the same photo might be 50-80 MB. TIFF supports up to 64-bit per channel color depth, multiple layers, spot colors, clipping paths, and ICC color profiles, all of which add to file size.
Professional photographers shoot in camera RAW and convert to TIFF for editing because TIFF preserves all image data without generation loss. The editing workflow is: RAW capture, convert to 16-bit TIFF, edit in Photoshop/Lightroom, export to JPEG/WebP/AVIF for delivery. TIFF serves as the lossless master copy.
BMP: The Uncompressed Legacy
BMP (Bitmap Image File) was introduced by Microsoft and IBM in 1986 for Windows and OS/2. It stores pixel data with essentially no compression (RLE compression is supported but rarely used). A 1920x1080 24-bit BMP file is exactly 6,220,854 bytes, every time: a 54-byte header plus three bytes (blue, green, red) for every pixel, with each row padded to a multiple of four bytes.
BMP has no legitimate use on the modern web. It persists in legacy Windows applications, embedded systems with limited processing power (where decompression overhead is unacceptable), and as a teaching format for image processing courses because its simple structure makes it easy to read and write programmatically. If someone sends you a BMP file, convert it to PNG (for lossless) or JPEG/WebP (for lossy).
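The fixed size quoted above is easy to verify: a 24-bit BMP is a 54-byte header (14-byte file header plus 40-byte BITMAPINFOHEADER) followed by rows of 3 bytes per pixel, each row padded to a multiple of four bytes. A quick sketch:

```python
# Compute the exact on-disk size of an uncompressed 24-bit BMP.

def bmp24_file_size(width: int, height: int) -> int:
    row_bytes = width * 3                     # 3 bytes (B, G, R) per pixel
    padded_row = (row_bytes + 3) // 4 * 4     # rows padded to 4-byte multiples
    return 54 + padded_row * height           # 54-byte header + pixel data

print(bmp24_file_size(1920, 1080))  # 6220854
```

For 1920-pixel rows, 1920 x 3 = 5,760 bytes is already a multiple of four, so no padding is added and the total is 5,760 x 1,080 + 54 = 6,220,854 bytes.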
JPEG 2000: Wavelet Compression for Professionals
JPEG 2000 was standardized in 2000 as a next-generation replacement for JPEG. Instead of the DCT (Discrete Cosine Transform) used by JPEG, JPEG 2000 uses the DWT (Discrete Wavelet Transform), which analyzes the entire image at once rather than 8x8 blocks. This eliminates the characteristic blocking artifacts of JPEG at low quality settings and enables better compression at low bitrates.
How Wavelet Compression Works
The DWT decomposes the image into multiple frequency subbands at different scales. At each level, the image is split into four subbands: LL (low-frequency both horizontally and vertically — a smaller version of the image), LH (horizontally low, vertically high — horizontal edges), HL (horizontally high, vertically low — vertical edges), and HH (high-frequency both directions — diagonal detail). The LL subband is recursively decomposed, creating a pyramid of subbands from coarse to fine detail.
Quantization is applied to the wavelet coefficients, with fine-detail subbands quantized more aggressively than coarse subbands. Because the transform operates on the entire image rather than fixed blocks, there are no block boundaries and therefore no blocking artifacts. The main artifacts at low quality are blurring (from discarding fine detail subbands) and ringing (from the Gibbs phenomenon near sharp edges).
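The subband split can be made concrete with the simplest wavelet, a one-level 2D Haar transform. JPEG 2000 actually uses the 5/3 (lossless) and 9/7 (lossy) wavelets, but the LL/LH/HL/HH structure is identical.

```python
# One level of a 2D Haar decomposition over 2x2 pixel neighborhoods.
# Input: a 2D list with even dimensions. Output: four half-size subbands.

def haar_level(img):
    LL, LH, HL, HH = [], [], [], []
    for r in range(0, len(img), 2):
        ll, lh, hl, hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]          # top-left, top-right
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            ll.append((a + b + d + e) / 4)  # LL: local average (smaller image)
            lh.append((a + b - d - e) / 4)  # LH: vertical change (horizontal edges)
            hl.append((a - b + d - e) / 4)  # HL: horizontal change (vertical edges)
            hh.append((a - b - d + e) / 4)  # HH: diagonal detail
        LL.append(ll); LH.append(lh); HL.append(hl); HH.append(hh)
    return LL, LH, HL, HH
```

On a vertical-stripe image, all the detail energy lands in HL (vertical edges), exactly as described above; a flat image produces zeros in every subband except LL. Recursing on LL yields the coarse-to-fine pyramid.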
Embedded Bitstream and Region of Interest
JPEG 2000 supports an embedded bitstream: a lower-quality version of the image is encoded first, with additional quality layers added incrementally. A decoder can stop at any point and display the image at whatever quality has been received so far. This is more flexible than JPEG's progressive mode because quality can be truncated at any byte position, not just at scan boundaries.
The Region of Interest (ROI) feature allows specific areas of the image to be encoded at higher quality than the rest. For example, a face in a group photo can be preserved at full quality while the background is compressed more aggressively. No other web image format supports ROI encoding.
Why JPEG 2000 Did Not Replace JPEG
Despite its technical superiority, JPEG 2000 failed to gain widespread adoption for several reasons: (1) it was computationally expensive (10-20x slower to encode/decode than JPEG), (2) patent licensing was complex and expensive, (3) no major browser added support (only Safari, through macOS Core Image), and (4) the quality improvement over JPEG was not dramatic enough at typical web bitrates to justify the cost. JPEG 2000 found niches in digital cinema (DCI mandates JPEG 2000 for movie distribution), medical imaging (DICOM supports JPEG 2000), and satellite imagery, where its lossless mode and high bit-depth support are critical.
ICO: The Favicon Format
ICO is a container format designed by Microsoft for Windows icons, containing one or more images at different resolutions (16x16, 32x32, 48x48, 64x64, 128x128, 256x256). Each image can be stored as BMP or PNG data. ICO is primarily used for favicons, the small icons displayed in browser tabs, bookmarks, and address bars.
For modern web development, you no longer need a multi-resolution ICO file. Use a 32x32 ICO for legacy browser compatibility and supplement with PNG favicons specified via <link rel="icon"> tags. Apple Touch icons should be 180x180 PNG. SVG favicons (<link rel="icon" type="image/svg+xml">) are supported by modern browsers and scale perfectly to any size.
The Next-Gen Format War: WebP vs AVIF vs JPEG XL
The three next-generation image formats represent different philosophies. WebP prioritizes compatibility and simplicity. AVIF prioritizes compression efficiency and HDR support. JPEG XL prioritizes feature completeness and backward compatibility with JPEG. Let us compare them head-to-head.
Next-Gen Image Format Comparison (higher is better)
Source: OnlineTools4Free Research
AVIF leads in compression efficiency and HDR support. JPEG XL leads in decoding speed, progressive decode, and lossless recompression of existing JPEGs. WebP leads in browser support and encoding speed. The ideal strategy in 2026 is to serve AVIF as the primary format with WebP as the fallback, using the HTML <picture> element for content negotiation.
If JPEG XL regains Chrome support (which remains possible given continued advocacy and Safari support), it could become the single best format for the web due to its combination of progressive decoding, excellent compression, and fast decode speed. Until then, AVIF+WebP is the winning combination.
Browser Support Over Time (2015-2026)
Browser Support for Next-Gen Image Formats (% of global users)
Source: OnlineTools4Free Research
Compression Ratio Comparison
The chart below shows average file size (in KB) for a typical 2000x1500 photograph at different quality levels. AVIF consistently produces the smallest files, followed by JPEG XL, HEIC, WebP, and JPEG. PNG is not shown at different quality levels because it is always lossless (constant 890 KB regardless of quality setting).
Average File Size by Quality Level (KB)
Source: OnlineTools4Free Research
Complete Image Format Comparison
The table below compares all 12 image formats across their key attributes.
Image Formats: Full Comparison (12 formats)
| Format | Extension | Year | Compression Type | Color Depth | Alpha | Animation | HDR | Progressive | Browser % | Avg KB | Best For | Royalty Free | Encode Speed | Decode Speed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JPEG | .jpg, .jpeg | 1992 | Lossy | 24-bit | No | No | No | Yes | 100% | 245 | Photographs | Yes | Fast | Fast |
| PNG | .png | 1996 | Lossless | 48-bit | Yes | APNG | No | Interlaced | 100% | 890 | Graphics, transparency | Yes | Medium | Fast |
| WebP | .webp | 2010 | Both | 32-bit | Yes | Yes | No | No | 98% | 178 | General web use | Yes | Medium | Fast |
| AVIF | .avif | 2019 | Both | 36-bit | Yes | AVIS | Yes | No | 95% | 142 | Photos, HDR | Yes | Slow | Medium |
| JPEG XL | .jxl | 2021 | Both | 32-bit | Yes | Yes | Yes | Yes | ~35% | 155 | High-fidelity photos | Yes | Medium | Fast |
| HEIC/HEIF | .heic, .heif | 2015 | Lossy | 30-bit | Yes | HEVC sequences | Yes | No | ~21% | 165 | Apple ecosystem | No | Medium | Medium |
| GIF | .gif | 1987 | Lossless (256) | 8-bit | 1-bit | Yes | No | Interlaced | 100% | 456 | Simple animations | Yes | Fast | Fast |
| TIFF | .tif, .tiff | 1986 | Both | 64-bit | Yes | Multi-page | Yes | No | 0% | 2048 | Print, archival | Yes | Fast | Medium |
| BMP | .bmp | 1986 | Uncompressed | 32-bit | Yes (v4+) | No | No | No | ~90% | 2048 | Legacy systems | Yes | Instant | Instant |
| SVG | .svg | 1999 | Vector | Unlimited | Yes | SMIL, CSS, JS | N/A | Streaming | 99% | 12 | Icons, logos | Yes | N/A | Fast |
| ICO | .ico | 1985 | Lossless | 32-bit | Yes | No | No | No | 100% | 15 | Favicons | Yes | Fast | Fast |
| JPEG 2000 | .jp2, .j2k | 2001 | Both | 48-bit | Yes | MJ2 | Yes | Yes | ~15% | 160 | Cinema, archival | Partially | Slow | Slow |
Convert Between Image Formats
Use the tool below to convert images between formats right here. Try converting a JPEG to WebP or AVIF to see the file size difference yourself.
Perceptual Quality: SSIM by Quality Level
SSIM (Structural Similarity Index) measures how similar two images appear to the human eye, on a scale of 0 to 1 where 1 means identical. The chart below shows SSIM scores at each quality level for five formats, averaged across 500 test photographs at 2000x1500 pixels. AVIF consistently achieves the highest SSIM at every quality level, meaning it preserves visual fidelity better than any other format at the same file size.
At quality 80 (the sweet spot for web use), AVIF reaches 0.981 SSIM while JPEG is at 0.960. That 0.021 difference is small numerically but visually meaningful: it corresponds to noticeably fewer artifacts around text, hair, and sharp edges. JPEG XL is close behind AVIF at 0.978, confirming its technical excellence despite limited browser support.
SSIM Quality Score by Compression Level (higher is better)
Source: OnlineTools4Free Research
PSNR Quality Comparison
PSNR (Peak Signal-to-Noise Ratio) is measured in decibels and provides a mathematical quality assessment. Higher values indicate less distortion. A PSNR above 40 dB is generally considered excellent for web use, and above 45 dB is considered visually lossless for most content. The chart below shows PSNR at different quality levels.
At quality 80, AVIF achieves 44.0 dB, comfortably in the "excellent" range and approaching the visually lossless threshold, while JPEG achieves 40.1 dB. The 3.9 dB advantage translates to approximately 60% less visual error. For quality-critical applications like e-commerce product photos or medical imaging, this difference directly impacts the viewer's ability to see fine details.
PSNR Quality (dB) by Compression Level (higher is better)
Source: OnlineTools4Free Research
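Because PSNR is logarithmic, a decibel gap converts directly into a ratio of mean-squared error. A small Python sketch makes the arithmetic behind the 3.9 dB figure explicit; the formula is the standard PSNR definition, and the 44.0/40.1 dB inputs are the benchmark values above:

```python
import math

def psnr(max_val: float, mse: float) -> float:
    """Peak Signal-to-Noise Ratio in decibels."""
    return 10 * math.log10(max_val ** 2 / mse)

def mse_ratio(psnr_high: float, psnr_low: float) -> float:
    """How many times more mean-squared error the lower-PSNR image carries."""
    return 10 ** ((psnr_high - psnr_low) / 10)

ratio = mse_ratio(44.0, 40.1)   # AVIF vs JPEG at quality 80
print(f"JPEG carries {ratio:.2f}x the squared error of AVIF")
print(f"equivalently, AVIF has {100 * (1 - 1 / ratio):.0f}% less error")
```

Running it shows JPEG carrying about 2.45 times the squared error of AVIF, i.e. roughly 59% less error for AVIF, which is where the "approximately 60%" figure comes from.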
Image Format Adoption on Websites (2015-2026)
The chart below tracks the percentage of websites among the top 10 million that use each image format, based on HTTP Archive data. JPEG usage has declined steadily from 78% in 2015 to 48% in 2026 as WebP and AVIF have gained ground. SVG adoption has grown the most dramatically, from 12% to 60%, driven by icon systems and design systems replacing raster icons. WebP crossed 50% in 2024 and AVIF reached 40% in 2026.
Image Format Usage on Top 10M Websites (% of sites)
Source: OnlineTools4Free Research
Encoding and Decoding Speed Benchmark
Format encoding speed directly impacts your image pipeline. AVIF encoding with the reference aom encoder is extremely slow (0.8 megapixels/second at quality 80), making it impractical for real-time encoding. However, SVT-AV1 brings AVIF encoding to a practical 5.2 MP/s. JPEG XL offers the best balance of compression efficiency and encoding speed at 6.5 MP/s. Traditional JPEG with libjpeg-turbo remains the fastest lossy encoder at 42 MP/s.
For decoding speed (which affects page rendering), JPEG leads at 62-85 MP/s depending on encoder. JPEG XL decodes at 55 MP/s, faster than WebP (48 MP/s) and significantly faster than AVIF (28 MP/s). This is why JPEG XL proponents argue it is a better web format despite Chrome dropping support: it loads faster and provides a better progressive loading experience.
Image Encoding & Decoding Speed (megapixels/second)
| Format/Encoder | Encode Q80 | Encode Q95 | Encode Lossless | Decode Speed |
|---|---|---|---|---|
| JPEG (mozjpeg) | 8.2 | 5.8 | n/a | 62 |
| JPEG (libjpeg-turbo) | 42 | 38 | n/a | 85 |
| PNG (libpng) | n/a | n/a | 5.2 | 45 |
| PNG (OxiPNG) | n/a | n/a | 3.8 | 45 |
| WebP (libwebp) | 12.5 | 8.8 | 2.5 | 48 |
| AVIF (libavif/aom) | 0.8 | 0.4 | 0.15 | 28 |
| AVIF (libavif/SVT) | 5.2 | 3.5 | 0.8 | 28 |
| JPEG XL (libjxl) | 6.5 | 4.2 | 3.0 | 55 |
| HEIC (libheif) | 2.8 | 1.5 | 0.5 | 22 |
| GIF (gifski) | n/a | n/a | 1.8 | 72 |
| BMP | n/a | n/a | 250 | 280 |
| TIFF (libtiff) | n/a | n/a | 42 | 48 |
(n/a marks modes a format does not support.)
File Size by Content Type
Different image content compresses very differently depending on the format. Screenshots and flat illustrations show the largest gap between formats because AVIF and WebP excel at sharp edges and flat colors. Photographs show a smaller but still significant advantage. The table below shows actual file sizes for a 1920x1080 image at quality 80 across different content types.
Average File Size (KB) by Content Type at Quality 80, 1920x1080
| Content Type | JPEG (KB) | WebP (KB) | AVIF (KB) | JXL (KB) | PNG (KB) |
|---|---|---|---|---|---|
| Photograph (landscape) | 245 | 175 | 135 | 148 | 2800 |
| Photograph (portrait) | 210 | 152 | 118 | 128 | 2400 |
| Screenshot (UI) | 85 | 52 | 38 | 42 | 180 |
| Illustration (flat) | 68 | 35 | 28 | 30 | 95 |
| Chart/Graph | 42 | 22 | 18 | 20 | 48 |
| Meme (text overlay) | 125 | 88 | 68 | 75 | 350 |
| Product photo (white BG) | 155 | 108 | 82 | 92 | 1200 |
| Medical scan (grayscale) | 180 | 125 | 98 | 108 | 1600 |
| Satellite imagery | 320 | 228 | 178 | 195 | 3200 |
| Texture (game) | 280 | 198 | 155 | 168 | 2100 |
JPEG File Structure Explained
Understanding the internal structure of a JPEG file helps explain why certain optimizations work. A JPEG file is composed of segments, each beginning with a marker (0xFF followed by a byte identifying the segment type). The file always starts with SOI (0xFFD8) and ends with EOI (0xFFD9). Between these markers, the file contains metadata, quantization tables, Huffman tables, and the compressed image data.
JPEG File Structure (typical 245 KB photo)
| Section | Bytes | Description |
|---|---|---|
| SOI Marker | 2 | Start of Image (0xFFD8) |
| APP0/APP1 (JFIF/EXIF) | 200 | Metadata, camera info, GPS |
| DQT (Quantization Tables) | 134 | 2 tables: luminance + chrominance |
| SOF0 (Frame Header) | 17 | Image dimensions, components, sampling |
| DHT (Huffman Tables) | 420 | DC + AC tables for Y, Cb, Cr |
| SOS (Scan Header) | 12 | Component selector, spectral selection |
| Compressed Data | ~244,000 | Entropy-coded DCT coefficients (bulk of file) |
| EOI Marker | 2 | End of Image (0xFFD9) |
The metadata section (APP0/APP1) is where EXIF, IPTC, and XMP data live. Stripping metadata can save 2-20 KB per image. The quantization tables (DQT) control quality — mozjpeg optimizes these tables to achieve 5-10% better compression than standard libjpeg. The Huffman tables (DHT) provide the entropy coding — mozjpeg also optimizes these for additional savings. The bulk of the file is the compressed scan data, which contains the DCT coefficients for all 8x8 blocks arranged in MCU (Minimum Coded Unit) order.
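The marker walk described above can be sketched in a few lines of Python. The byte string here is a hand-built stub with one APP0 segment, not a decodable image, and the sketch stops at the segment level: a real parser must treat everything after the SOS marker specially, because the entropy-coded scan data is not length-prefixed.

```python
import struct

# Hand-built stub: SOI, one APP0 segment, EOI. Not a decodable image --
# just enough structure to walk the markers.
stub = bytes([0xFF, 0xD8,                          # SOI
              0xFF, 0xE0, 0x00, 0x04, 0xAB, 0xCD,  # APP0: length 4 (incl. the 2 length bytes)
              0xFF, 0xD9])                         # EOI

def list_segments(data: bytes):
    """Return (marker, payload) pairs for each segment in a JPEG byte stream.

    Stops at SOS or EOI: the scan data after SOS has no length field,
    so a real decoder must scan it byte-by-byte for the next marker.
    """
    assert data[:2] == b"\xff\xd8", "missing SOI marker"
    segments = [("SOI", b"")]
    pos = 2
    while pos < len(data):
        assert data[pos] == 0xFF, "expected marker byte"
        marker = data[pos + 1]
        if marker in (0xD9, 0xDA):                 # EOI or SOS: stop here
            segments.append(("EOI" if marker == 0xD9 else "SOS", b""))
            break
        # Segment length is big-endian and includes its own 2 bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((f"0xFF{marker:02X}", data[pos + 4:pos + 2 + length]))
        pos += 2 + length
    return segments

print(list_segments(stub))  # [('SOI', b''), ('0xFFE0', b'\xab\xcd'), ('EOI', b'')]
```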
PNG Chunk Structure
PNG files are organized into chunks, each with a 4-byte length, 4-byte type, variable data, and 4-byte CRC32 checksum. Three chunks are required: IHDR (image header, always first), one or more IDAT (image data), and IEND (image end, always last). All other chunks are optional and provide metadata, color management, or animation capabilities.
PNG Chunk Types
| Chunk | Required | Description |
|---|---|---|
| IHDR | Yes | Image header: width, height, bit depth, color type, interlace |
| PLTE | No | Palette for indexed-color images (256 RGB entries max) |
| IDAT | Yes | Image data: filtered + DEFLATE-compressed pixel data |
| IEND | Yes | Image end marker (empty, 0 bytes of data) |
| tEXt | No | Textual metadata (key-value, Latin-1 encoding) |
| iTXt | No | International text metadata (UTF-8) |
| zTXt | No | Compressed textual metadata |
| gAMA | No | Gamma correction value |
| cHRM | No | Chromaticity coordinates of display primaries |
| sRGB | No | Standard RGB color space rendering intent |
| iCCP | No | Embedded ICC color profile |
| bKGD | No | Default background color |
| pHYs | No | Physical pixel dimensions (DPI) |
| tIME | No | Last modification timestamp |
| acTL | No | APNG animation control (frame count, loops) |
| fcTL | No | APNG frame control (size, offset, timing) |
| fdAT | No | APNG frame data (like IDAT but for subsequent frames) |
The chunk naming convention encodes important information: uppercase first letter means the chunk is critical (must be understood), lowercase means ancillary (can be ignored). Uppercase second letter means the chunk is public (defined by the spec), lowercase means private (application-defined). This design allows PNG readers to safely ignore chunks they do not understand while still correctly rendering the image.
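As a sketch of the mechanics, assuming only the Python standard library, the following builds a synthetic two-chunk stream (IHDR plus IEND, with no IDAT, so it is not a renderable image) and parses it back, verifying each CRC and reading the critical/ancillary case bit:

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype: bytes, data: bytes) -> bytes:
    # The CRC covers the type and data fields, not the length field.
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

# Synthetic stream: a 1x1 grayscale IHDR followed by IEND. There is no
# IDAT, so this is a chunk-walk demo, not a renderable image.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)   # 13-byte IHDR payload
blob = PNG_SIG + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND", b"")

def read_chunks(data: bytes):
    """Return (type, length, is_critical) for every chunk, verifying CRCs."""
    assert data[:8] == PNG_SIG, "not a PNG stream"
    pos, chunks = 8, []
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        assert crc == zlib.crc32(ctype + payload), f"bad CRC in {ctype!r}"
        critical = not (ctype[0] & 0x20)   # bit 5 clear = uppercase = critical
        chunks.append((ctype.decode("ascii"), length, critical))
        pos += 12 + length
    return chunks

print(read_chunks(blob))  # [('IHDR', 13, True), ('IEND', 0, True)]
```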
Image Format Adoption by Industry
Different industries adopt new image formats at different rates. SaaS and social media companies lead AVIF adoption (32-35%) because they have dedicated performance teams and serve millions of images daily, making even small per-image savings translate to significant bandwidth cost reductions. Government websites lag behind (5% AVIF, 20% WebP) due to conservative technology stacks and compliance requirements. The chart below shows current adoption rates across ten industry sectors.
Image Format Adoption by Industry Sector (% of sites)
Source: OnlineTools4Free Research
Code Example: Responsive Images with Modern Formats
The HTML <picture> element enables serving different formats to different browsers. The browser evaluates <source> elements in order and uses the first format it supports. This pattern delivers AVIF to Chrome/Firefox/Safari 16+, WebP to older Safari and other browsers, and JPEG as the universal fallback.
<picture>
<source srcset="photo.avif" type="image/avif">
<source srcset="photo.webp" type="image/webp">
<img src="photo.jpg" alt="Description"
width="1200" height="800"
loading="lazy"
decoding="async">
</picture>
In Next.js, the built-in Image component handles format negotiation automatically. It serves AVIF to supporting browsers, WebP to others, and optimizes quality and dimensions based on the device.
import Image from 'next/image';
export default function Hero() {
return (
<Image
src="/hero.jpg"
alt="Hero image"
width={1200}
height={800}
priority
sizes="(max-width: 768px) 100vw, 1200px"
/>
);
}
WebP: Detailed Browser Support Timeline
WebP support was not added all at once. Different features were enabled at different times, and the gap between the earliest adopters (Opera in 2011, Chrome in 2012) and Safari, the last major browser to ship it in 2020, was nearly a decade.
WebP Feature Support by Browser Version
| Browser | Lossy | Lossless | Animated | Alpha |
|---|---|---|---|---|
| Chrome | 17 (2012) | 23 (2012) | 32 (2014) | 23 (2012) |
| Firefox | 65 (2019) | 65 (2019) | 65 (2019) | 65 (2019) |
| Safari | 14 (2020) | 14 (2020) | 16 (2022) | 14 (2020) |
| Edge | 18 (2018) | 18 (2018) | 79 (2020) | 18 (2018) |
| Opera | 11 (2011) | 12 (2012) | 19 (2014) | 12 (2012) |
| Samsung Internet | 4.0 (2016) | 4.0 (2016) | 5.0 (2017) | 4.0 (2016) |
AVIF: Detailed Browser Support Timeline
AVIF adoption has been faster than WebP, with all major browsers adding support within three years. Still images were supported first, followed by animated AVIF and HDR AVIF.
AVIF Feature Support by Browser Version
| Browser | Still Images | Animated | HDR | Sequences |
|---|---|---|---|---|
| Chrome | 85 (2020) | 93 (2021) | 98 (2022) | 100 (2022) |
| Firefox | 93 (2021) | 113 (2023) | 110 (2023) | 113 (2023) |
| Safari | 16 (2022) | 17 (2023) | 17 (2023) | 17 (2023) |
| Edge | 85 (2020) | 93 (2021) | 98 (2022) | 100 (2022) |
| Opera | 71 (2020) | 79 (2021) | 84 (2022) | 86 (2022) |
| Samsung Internet | 14.0 (2021) | 17.0 (2022) | 18.0 (2023) | 18.0 (2023) |
Part 2: Document Formats
~6,000 words covering 8 document formats
Document formats determine how text, layout, images, and metadata are stored and rendered. The right format depends on whether the document needs to be edited, preserved exactly, printed, or distributed electronically. This section covers the formats that power every office, courtroom, university, and government agency in the world.
PDF: The Universal Document
PDF (Portable Document Format) was created by Adobe co-founder John Warnock in 1993 with a radical vision: a document format that looks exactly the same on every device, operating system, and printer. PDF achieved this by embedding everything needed to render the document (fonts, images, vector graphics, text) into a single self-contained file.
How PDF Works Internally
A PDF file is a collection of objects organized in a cross-reference table. The main objects are: pages (defining dimensions and content streams), content streams (sequences of drawing operators that paint text and graphics), font objects (embedded font programs), image objects (compressed pixel data), and the document catalog (structure tree, outlines, named destinations).
PDF uses its own page description language derived from PostScript. When you "print to PDF," the printer driver converts the application output into PDF drawing operators that specify exact positions for every character, line, and image. This is why PDF preserves layout perfectly but makes editing difficult: the document does not contain logical structure (paragraphs, headings, tables), only visual positions.
PDF Rendering: Why It Looks the Same Everywhere
PDF achieves its "looks the same everywhere" guarantee through three mechanisms: (1) all fonts are embedded in the file (either as subsets or full fonts), so rendering does not depend on installed system fonts; (2) the coordinate system is absolute (points from the bottom-left corner), so every element has an exact position; (3) images are embedded at their display resolution, not referenced externally. The trade-off is accessibility and responsiveness. Because PDF positions every character individually, the document cannot reflow to fit different screen sizes. A PDF designed for A4 paper is frustrating to read on a phone.
PDF Generation Tools Compared
For generating PDFs programmatically, the landscape in 2026 includes: Puppeteer/Playwright (render HTML to PDF via headless Chrome — best for complex layouts), WeasyPrint (Python, CSS-based, excellent for print stylesheets), pdf-lib (JavaScript, low-level PDF manipulation), Reportlab (Python, direct PDF generation), and LaTeX (best for academic/mathematical content). For most web applications, Puppeteer/Playwright provides the best fidelity because it uses the same rendering engine as Chrome.
PDF Versions and Features
PDF has evolved through many versions. PDF 1.4 (2001) added transparency. PDF 1.5 (2003) added object streams for better compression. PDF 1.7 (2008) became ISO 32000-1. PDF 2.0 (ISO 32000-2, 2020) added unencrypted wrapper documents, page-level output intents, associated files, and improved accessibility features. Most PDF files in the wild are PDF 1.4 to 1.7.
PDF/A: Archival Preservation
PDF/A (ISO 19005) is a subset of PDF designed for long-term preservation. It requires all fonts to be embedded, prohibits encryption and password protection, forbids references to external content, mandates device-independent color (ICC profiles), and disallows JavaScript and multimedia. PDF/A-1 is based on PDF 1.4, PDF/A-2 on PDF 1.7, and PDF/A-3 allows embedded files of any format. Government archives, legal systems, and libraries worldwide mandate PDF/A for permanent records.
PDF Security and Encryption
PDF supports two types of passwords: a user password (required to open the document) and an owner password (required to change permissions like printing, copying, editing). PDF 1.6+ uses AES-128 encryption; PDF 2.0 uses AES-256. However, PDF permission restrictions are enforceable only by compliant readers and can be bypassed by tools that ignore them. For truly secure documents, use document-level encryption plus access controls at the server level.
PDF Accessibility
A major criticism of PDF is accessibility. Because PDF stores visual positions rather than logical structure, screen readers cannot determine reading order, heading levels, or table structures unless the document includes a "tag tree" — a parallel structure that maps visual elements to semantic roles. PDF/UA (Universal Accessibility, ISO 14289) defines requirements for accessible PDFs. Creating accessible PDFs requires conscious effort during authoring, not as an afterthought.
DOCX: Microsoft's Open XML
DOCX is the default format for Microsoft Word since Office 2007. Unlike the older binary .doc format, DOCX is based on Office Open XML (OOXML), an ISO/IEC standard (29500, originally ECMA-376). A DOCX file is actually a ZIP archive containing XML files that describe the document content, styles, relationships, and embedded media.
Internal Structure
Unzipping a DOCX file reveals: word/document.xml (the main content), word/styles.xml (paragraph and character styles), word/fontTable.xml (fonts used), word/settings.xml (document settings), [Content_Types].xml (MIME types), and _rels/.rels (relationships between parts). Images are stored in word/media/. This structure means DOCX files can be programmatically generated and manipulated by any tool that can read XML and ZIP.
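Because the container is plain ZIP plus XML, the Python standard library is enough to demonstrate the round trip. The archive built here is a minimal stand-in (a real DOCX also needs [Content_Types].xml and _rels/.rels, so Word would reject it), but the part layout and WordprocessingML namespace are the real ones:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A tiny in-memory stand-in, not a fully valid DOCX.
doc_xml = (f'<w:document xmlns:w="{W}"><w:body>'
           '<w:p><w:r><w:t>Hello, OOXML</w:t></w:r></w:p>'
           '</w:body></w:document>')
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("word/document.xml", doc_xml)

# Read it back: unzip, parse the XML, join the text runs (w:t elements).
with zipfile.ZipFile(buf) as z:
    root = ET.fromstring(z.read("word/document.xml"))
text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
print(text)  # Hello, OOXML
```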
DOCX vs DOC
The older .doc format (Word 97-2003) used a proprietary binary format based on Microsoft's Compound File Binary Format. It was difficult to parse, prone to corruption, and could not be read reliably by non-Microsoft software. DOCX solved these problems by using open standards (XML, ZIP), making documents smaller (40-75% smaller than .doc due to ZIP compression), and enabling interoperability with LibreOffice, Google Docs, and other software.
Compatibility Considerations
While DOCX is an open standard, complex formatting (tables with merged cells, text boxes, SmartArt, advanced typography, VBA macros) may render differently in LibreOffice, Google Docs, or older Word versions. For documents that must look identical everywhere, PDF is the safer choice. For documents that need to be edited collaboratively, DOCX is the standard, supplemented by real-time co-editing in Word Online or Google Docs.
ODT: The Open Document Standard
ODT (Open Document Text) is part of the OpenDocument Format (ODF, ISO/IEC 26300) developed by OASIS. It is the default format for LibreOffice Writer and was designed from the ground up as a truly open, vendor-neutral standard. Like DOCX, ODT is a ZIP archive containing XML files, but it uses a different XML schema (ODF vs OOXML).
ODT is mandated by several governments (including the UK, France, and India) for official documents to avoid vendor lock-in. LibreOffice reads and writes both ODT and DOCX, making it a practical bridge between the two ecosystems. For most text documents, ODT and DOCX are functionally equivalent.
RTF: Rich Text Format
RTF (Rich Text Format) was introduced by Microsoft in 1987 as a cross-platform rich text exchange format. It uses a plain-text markup syntax with backslash-escaped control words (similar to LaTeX in spirit). RTF supports basic formatting: fonts, colors, bold, italic, tables, images, and hyperlinks.
RTF is useful in situations where DOCX is not supported but plain text is insufficient. It works across Windows, macOS, and Linux without requiring specific software. However, RTF lacks many modern features (styles, comments, tracked changes, embedded objects), produces larger files than DOCX, and has been mostly superseded by DOCX for document exchange.
TXT / Plain Text: The Simplest Format
Plain text files contain only characters with no formatting, no embedded images, no metadata (beyond what the file system provides). They are the most universally compatible file format: every operating system, every text editor, every programming language can read and write plain text. Code, configuration files, logs, and README files are all plain text.
Encoding: The Hidden Complexity
The critical question with plain text is encoding: how are characters represented as bytes? ASCII uses 7 bits per character and supports only 128 characters (English letters, digits, basic punctuation). UTF-8 uses 1-4 bytes per character and supports all 149,000+ Unicode characters. Latin-1 (ISO 8859-1) uses 1 byte per character for 256 Western European characters.
In 2026, the answer is simple: always use UTF-8. It is backward-compatible with ASCII, handles every language and emoji, and is the default encoding for the web (98.2% of all websites). Specify encoding explicitly in HTTP headers (Content-Type: text/plain; charset=utf-8) and file headers (BOM or magic comment) to prevent misinterpretation.
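A short Python sketch illustrates both properties: UTF-8's variable length and its ASCII compatibility, plus the Latin-1 mismatch that produces mojibake:

```python
# UTF-8 is variable-length: 1 byte for ASCII, up to 4 bytes for emoji.
samples = ["A", "é", "€", "😀"]
for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"{ch!r} -> {len(encoded)} byte(s): {encoded.hex()}")

# ASCII bytes are valid UTF-8 unchanged (backward compatibility):
assert "plain ascii".encode("ascii") == "plain ascii".encode("utf-8")

# The same character in Latin-1 vs UTF-8 differs: decoding with the
# wrong table is the root cause of mojibake like "Ã©" for "é".
assert "é".encode("latin-1") == b"\xe9"
assert "é".encode("utf-8") == b"\xc3\xa9"
assert b"\xc3\xa9".decode("latin-1") == "Ã©"
```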
Line Ending Wars: CRLF vs LF
One of the most persistent cross-platform compatibility issues is line endings. Windows uses CRLF (Carriage Return + Line Feed, \r\n, bytes 0x0D 0x0A). Unix/Linux/macOS uses LF only (\n, byte 0x0A). Classic Mac OS (pre-2001) used CR only (\r, byte 0x0D). When a file with Windows line endings is opened on Linux (or vice versa), tools may display extra characters, scripts may fail to execute, and diff tools may show every line as changed.
The solution is Git's core.autocrlf setting: set it to "true" on Windows (convert LF to CRLF on checkout, CRLF to LF on commit) and "input" on Mac/Linux (convert CRLF to LF on commit, no conversion on checkout). Better yet, use a .gitattributes file to specify line ending behavior per file type: * text=auto normalizes all text files to LF in the repository.
Modern editors (VS Code, JetBrains, Sublime Text) display the current line ending in the status bar and allow switching. The .editorconfig file can enforce consistent line endings across a project: end_of_line = lf.
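The normalization Git performs on commit can be sketched as two string replacements; the order matters, because CRLF must be collapsed before stray CRs:

```python
def normalize_eol(data: bytes) -> bytes:
    """Normalize CRLF (Windows) and bare CR (classic Mac OS) to LF."""
    # CRLF first, otherwise the lone-CR pass would split every CRLF in two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

mixed = b"windows\r\nclassic mac\runix\n"
print(normalize_eol(mixed))  # b'windows\nclassic mac\nunix\n'
```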
Markdown: Human-Readable Markup
Markdown was created by John Gruber in 2004 as a lightweight markup language that is readable as plain text but can be converted to HTML. Its syntax uses punctuation characters to indicate formatting: # for headings, * for emphasis, - for lists, ``` for code blocks, and [text](url) for links.
Markdown Flavors
Gruber's original Markdown specification was intentionally ambiguous, leading to incompatible implementations. CommonMark (2014) is a strict specification that resolves ambiguities. GitHub Flavored Markdown (GFM) extends CommonMark with tables, task lists, autolinks, and strikethrough. MDX adds JSX component embedding for React documentation. Most Markdown processors in 2026 are CommonMark-compatible.
Markdown dominates documentation in the software industry: README.md files, GitHub wikis, GitBook, Docusaurus, MkDocs, and most static site generators use Markdown. It is also used in note-taking apps (Obsidian, Notion, Bear), chat platforms (Discord, Slack), and CMS platforms (Ghost, Strapi).
Markdown Flavor Comparison
The table below compares seven major Markdown flavors by feature support. CommonMark is the strictest baseline; GFM adds the most commonly needed extensions; MDX and Obsidian MD are the most feature-rich.
Markdown Flavor Feature Comparison
| Flavor | Tables | Task Lists | Footnotes | Math | Frontmatter | Used By |
|---|---|---|---|---|---|---|
| CommonMark | No | No | No | No | No | Reference standard |
| GitHub Flavored (GFM) | Yes | Yes | No | Yes (2022) | No | GitHub READMEs, Issues |
| MDX | Yes (GFM) | Yes | Plugin | Plugin | Yes | Docusaurus, Next.js docs |
| Obsidian MD | Yes | Yes | Yes | Yes (LaTeX) | Yes | Obsidian note-taking |
| Pandoc MD | Yes (grid/pipe) | No | Yes | Yes (LaTeX) | Yes | Academic papers, book conversion |
| R Markdown | Yes | No | Yes | Yes | Yes | R statistical analysis, Quarto |
| GitLab Flavored | Yes | Yes | Yes | Yes | Yes | GitLab wikis, Issues |
MDX: Markdown with Components
MDX (Markdown + JSX) allows embedding React components directly in Markdown documents. This enables interactive documentation with live code examples, charts, and widgets alongside prose. MDX is used by Docusaurus (Meta), Next.js documentation, Storybook, and many component library documentation sites.
An MDX file looks like standard Markdown but can import and render React components: <Chart data={salesData} /> renders an interactive chart inline with the text. The MDX compiler transforms .mdx files into React components at build time, giving you the readability of Markdown with the power of React.
For non-React ecosystems, similar approaches exist: Markdoc (Stripe, for any framework), AsciiDoc (Red Hat/IBM documentation), and reStructuredText (Python docs, Sphinx). Each trades some of Markdown's simplicity for additional features like admonitions, tabs, cross-references, and automatic API documentation generation.
LaTeX: Academic Typesetting
LaTeX (pronounced "lah-tech" or "lay-tech") is a document preparation system created by Leslie Lamport in 1984 as a set of macros for Donald Knuth's TeX typesetting engine. It is the standard format for academic papers, theses, and technical books in mathematics, physics, computer science, and engineering.
LaTeX excels at mathematical notation, automatic numbering (equations, figures, tables, sections), bibliography management (BibTeX, BibLaTeX), cross-references, and consistent typographic quality. A LaTeX document is a plain text file with markup commands that is compiled into PDF. The compilation process handles line breaking, page breaking, hyphenation, and typographic spacing according to professional publishing rules.
The learning curve for LaTeX is steep compared to WYSIWYG editors, but the payoff for technical documents is substantial: consistent formatting across hundreds of pages, automatic numbering that never breaks, and mathematical notation that is impossible to achieve in Word. Overleaf provides a browser-based LaTeX editor with real-time collaboration, reducing the setup barrier.
Spreadsheet Formats: XLSX, ODS, and CSV
Spreadsheet formats deserve special attention because they are among the most commonly exchanged file types in business. XLSX (Office Open XML Spreadsheet) is the default format for Microsoft Excel since 2007. Like DOCX, it is a ZIP archive containing XML files. XLSX supports formulas, formatting, charts, pivot tables, VBA macros, and multiple worksheets. The maximum sheet size is 1,048,576 rows by 16,384 columns.
ODS (OpenDocument Spreadsheet) is the open standard equivalent, used by LibreOffice Calc. It supports the same core features as XLSX but with different XML schemas. For simple spreadsheets, XLSX and ODS are interchangeable. For complex spreadsheets with VBA macros, XLSX is required.
CSV remains the universal exchange format for tabular data because every spreadsheet program, database, and programming language can read it. However, CSV loses all formatting, formulas, multiple sheets, and data types. When exporting for data analysis (pandas, R, SQL), CSV is usually the best choice. When exporting for human consumption (reports, financial statements), XLSX preserves the presentation.
Google Sheets stores data in a proprietary cloud format and converts on export. For automated data pipelines, the Google Sheets API returns JSON directly, bypassing file formats entirely. For offline backup, export as XLSX (highest fidelity) or CSV (most portable).
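The type loss is easy to demonstrate with Python's stdlib csv module: whatever you write, every value comes back as a string, which is why every importer has to infer or be told the column types:

```python
import csv
import io

rows = [["product", "units", "price"],
        ["Widget", 12, 3.50]]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
buf.seek(0)
recovered = list(csv.reader(buf))

print(recovered[1])  # ['Widget', '12', '3.5']
# The int and the float are gone: everything is a string now, which is
# why pandas, R, and SQL importers must guess or be told the dtypes.
assert all(isinstance(v, str) for v in recovered[1])
```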
EPUB: The E-Book Standard
EPUB (Electronic Publication) is the standard format for reflowable e-books, supported by every major e-reader except Amazon Kindle (which uses KF8/AZW3, though Kindle has accepted EPUB since 2022). An EPUB file is a ZIP archive containing XHTML content files, a CSS stylesheet, images, fonts, an OPF package file (metadata and spine), and a navigation document (an XHTML nav file in EPUB 3; the legacy NCX file in EPUB 2).
EPUB 3 (current version) uses HTML5, CSS3, SVG, and MathML for content, supports JavaScript for interactive elements, includes media overlays (synchronized audio narration), and defines accessibility requirements (EPUB Accessibility 1.1). The key design principle is reflowable content: text reflows to fit the screen size, font size, and reader preferences, unlike PDF which preserves fixed layout.
DRM is optional in EPUB: Adobe DRM and Apple FairPlay are the most common protection schemes. DRM-free EPUB is preferred by publishers like Tor Books and O'Reilly Media because it allows readers to read on any device without restrictions.
Document Format Comparison
Document Formats: Full Comparison
| Format | Extension | Year | Creator | Standard | Editability | Layout Preserved | Encryption | Accessibility | Digital Signatures | Cross-Platform | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PDF | .pdf | 1993 | Adobe | ISO 32000 | Low | Yes | Yes | PDF/UA | Yes | Excellent | Fixed-layout documents |
| DOCX | .docx | 2007 | Microsoft | ECMA-376 | High | Depends | Yes | Partial | Yes | Good | Editable documents |
| ODT | .odt | 2005 | OASIS | ISO/IEC 26300 | High | Depends | Yes | Partial | Yes | Good | Open-source workflows |
| RTF | .rtf | 1987 | Microsoft | Published spec | High | Partial | No | Poor | No | Excellent | Cross-editor compatibility |
| TXT | .txt | 1960 | Various | N/A | Maximum | No | No | Perfect | No | Perfect | Plain text, code, logs |
| Markdown | .md | 2004 | John Gruber | CommonMark | High | Partial | No | Good (rendered) | No | Perfect | Documentation, README |
| LaTeX | .tex | 1984 | Leslie Lamport | De facto | Medium | Yes (compiled) | No | Compiled PDF | No | Excellent | Academic papers, math |
| EPUB | .epub | 2007 | IDPF | ISO/IEC 23736 | Medium | Reflowable | DRM (optional) | EPUB Accessibility | Yes | Good | E-books |
DOCX vs Google Docs: The Cloud Shift
The rise of cloud-based document editors has changed the document format landscape. Google Docs stores documents in its own proprietary format on Google's servers and converts to DOCX, PDF, or other formats on export. This means the "native" format is never actually DOCX — it is a Google-internal representation that uses Operational Transform (OT) for real-time collaboration.
Microsoft 365 (Word Online) takes the opposite approach: documents are stored as DOCX files on OneDrive, and the online editor reads and writes the same OOXML format as desktop Word. This provides better format fidelity when switching between web and desktop editing but limits some real-time collaboration features.
Notion, Coda, and similar tools represent a third approach: they abandon document formats entirely in favor of block-based databases. Content is stored as structured blocks (paragraphs, tables, embeds) that can be rendered as documents, databases, or kanban boards. Export to DOCX, PDF, or Markdown is available but is a secondary concern.
For organizations choosing a document strategy: if format fidelity and offline access matter, use DOCX with Microsoft 365. If real-time collaboration is the priority and format lock-in is acceptable, use Google Docs. If your content is primarily structured knowledge (wikis, documentation), consider Notion or Obsidian with Markdown.
HTML as a Document Format
HTML is often overlooked as a document format, but it is the most widely used document format in history. Every web page is an HTML document. Unlike PDF (fixed layout) or DOCX (editable), HTML is a reflowable, semantic document format that adapts to any screen size, supports accessibility natively, and can be styled with CSS.
For technical documentation, HTML (generated from Markdown via static site generators like Docusaurus, MkDocs, or VitePress) offers significant advantages over PDF: full-text search, deep linking to sections, responsive layout, syntax highlighting, and interactive elements. The trade-off is that HTML documents lack the fixed-layout guarantee of PDF — they look different on different screens (by design).
Single-file HTML documents (.html) are also useful for archival: they can embed CSS, images (as data URIs), and JavaScript in a single self-contained file. The MHTML format (Multipart HTML, .mht) packages a web page with all its resources into one file, though browser support for creating MHTML is limited to Chromium-based browsers.
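As a sketch of the data-URI technique, the snippet below inlines an image into an HTML tag using Python's standard base64 module. The placeholder bytes and the alt text are illustrative; in practice you would read the real image file from disk.

```python
import base64

def embed_image_as_data_uri(image_bytes, mime="image/png", alt=""):
    """Inline an image into an HTML page as a base64 data URI (no external file)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f'<img alt="{alt}" src="data:{mime};base64,{b64}">'

# Demo with placeholder bytes; in practice:
#   tag = embed_image_as_data_uri(open("logo.png", "rb").read(), alt="Logo")
tag = embed_image_as_data_uri(b"\x89PNG...", alt="demo")
print(tag.startswith('<img alt="demo" src="data:image/png;base64,'))  # True
```

The trade-off: base64 inflates the payload by about 33%, so this makes sense for archival single-file pages, not for performance-sensitive delivery.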
Document Format Adoption by Industry
Different industries have dramatically different format preferences. Legal and financial sectors are dominated by PDF. Academic research relies heavily on LaTeX. Software development has embraced Markdown almost universally. The chart below shows adoption rates (percentage of organizations regularly using each format) across eight industries.
Document Format Adoption by Industry (%)
Source: OnlineTools4Free Research
Key Finding
Software engineering has the most unique format profile: 85% Markdown adoption (highest of any industry) and only 35% PDF usage (lowest of any industry).
This reflects the developer preference for plain-text, version-controllable documentation over binary office formats.
Part 3: Video Formats
~6,000 words covering codecs, containers, and streaming
Video formats are the most complex category in this guide because video involves two separate concepts that are frequently confused: codecs and containers. A codec (H.264, H.265, VP9, AV1) is the algorithm that compresses and decompresses video data. A container (MP4, WebM, MKV, MOV) is the file format that packages compressed video, audio, subtitles, and metadata into a single file.
Understanding this distinction is critical: an MP4 file is not "in H.264 format." MP4 is the container; H.264 is one of many codecs that can be stored inside an MP4 container. The same MP4 container could hold H.264, H.265, or AV1 video with AAC, AC-3, or Opus audio.
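The container/codec split is visible in the file structure itself. An MP4 file is a sequence of "boxes" (a 4-byte big-endian size plus a four-character type); the compressed codec payload is just bytes inside an mdat box, while metadata boxes describe which codec those bytes use. The minimal walker below is a sketch that parses the top-level box framing of a synthetic byte string (not a playable file); it ignores 64-bit and to-end sizes.

```python
import struct

def parse_boxes(data, offset=0, end=None):
    """Walk ISO BMFF (MP4) boxes: each is a 4-byte big-endian size + 4-byte type."""
    end = len(data) if end is None else end
    boxes = []
    while offset + 8 <= end:
        size, = struct.unpack_from(">I", data, offset)
        btype = data[offset + 4:offset + 8].decode("ascii")
        boxes.append((btype, size))
        if size < 8:
            break  # size 0 (to end of file) / 1 (64-bit size) not handled in this sketch
        offset += size
    return boxes

# Synthetic top-level layout: ftyp + mdat (just the box framing, not a playable file)
ftyp = struct.pack(">I", 16) + b"ftyp" + b"isom" + struct.pack(">I", 512)
mdat = struct.pack(">I", 12) + b"mdat" + b"\x00" * 4
print(parse_boxes(ftyp + mdat))  # [('ftyp', 16), ('mdat', 12)]
```

A real inspector would descend moov → trak → mdia → minf → stbl → stsd to read the codec's four-character code, which is exactly why "MP4" alone never tells you the codec.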
Video Compression Fundamentals: How Video Codecs Think
All modern video codecs (H.264, H.265, VP9, AV1) use the same fundamental approach: block-based hybrid coding. This approach has three stages that operate on every block (macroblock/CTU/superblock) of every frame.
Stage 1: Prediction. The encoder predicts what each block will look like based on either the current frame (intra prediction — using neighboring blocks) or reference frames (inter prediction — using previously encoded frames via motion vectors). Intra prediction uses angular modes to predict from edges, corners, and gradients. Inter prediction searches reference frames to find the best matching block and records the displacement as a motion vector.
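A toy illustration of inter-prediction motion search: the brute-force block matcher below (a simplified sketch, not a production motion-estimation algorithm) finds the displacement that minimizes the sum of absolute differences (SAD) between a block of the current frame and candidate positions in the reference frame.

```python
import numpy as np

def best_motion_vector(ref, cur_block, top, left, radius=4):
    """Full-search block matching: find the (dy, dx) displacement into the
    reference frame that minimizes SAD against the current block."""
    bh, bw = cur_block.shape
    best = (0, 0, float("inf"))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue
            sad = np.abs(ref[y:y+bh, x:x+bw].astype(int) - cur_block.astype(int)).sum()
            if sad < best[2]:
                best = (dy, dx, sad)
    return best

ref = np.zeros((32, 32), dtype=np.uint8)
ref[8:16, 8:16] = 200            # bright patch in the reference frame
cur = np.zeros((32, 32), dtype=np.uint8)
cur[8:16, 10:18] = 200           # same patch, moved 2 px right in the current frame

dy, dx, sad = best_motion_vector(ref, cur[8:16, 8:16], top=8, left=8)
print(dy, dx, sad)  # 0 -2 0: "copy this block from 2 px left in the reference"
```

Real encoders replace this exhaustive search with hierarchical and predictive search strategies, but the objective (a motion vector plus a small residual) is the same.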
Stage 2: Transform and Quantization. The difference between the prediction and the actual block (the "residual") is transformed using a DCT-like transform (integer DCT for H.264, variable-size DCT for H.265/AV1) to separate low-frequency and high-frequency components. The frequency coefficients are then quantized (rounded, losing information) — this is the lossy step. The QP (Quantization Parameter) or CRF (Constant Rate Factor) controls how aggressively coefficients are quantized.
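The transform-and-quantize stage can be sketched with a floating-point orthonormal DCT (real codecs use scaled integer approximations of the same transform family). Note how the energy concentrates in the low-frequency corner and how quantization zeroes out most coefficients; that is where the bit savings come from.

```python
import numpy as np

N = 4
# Orthonormal DCT-II basis matrix (codecs use scaled-integer versions of this)
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
D = np.sqrt(2 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
D[0, :] = np.sqrt(1 / N)

residual = np.array([[52, 55, 61, 66],
                     [70, 61, 64, 73],
                     [63, 59, 55, 90],
                     [67, 61, 68, 104]], dtype=float)

coeffs = D @ residual @ D.T          # transform: energy piles into the top-left (low freq)
Q = 16                               # quantization step: bigger Q = lossier
quantized = np.round(coeffs / Q)     # the lossy step: rounding discards detail
recon = D.T @ (quantized * Q) @ D    # decoder side: dequantize + inverse transform

print(int(np.count_nonzero(quantized)))      # only a few of 16 coefficients survive
print(float(np.abs(recon - residual).max())) # small, bounded reconstruction error
```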
Stage 3: Entropy Coding. The quantized coefficients, motion vectors, and coding decisions are entropy-coded using context-adaptive binary arithmetic coding (CABAC in H.264/H.265, a variant in AV1). CABAC adapts its probability models based on surrounding coded data, achieving better compression than fixed probability tables. The output is the final compressed bitstream.
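Why adaptation helps can be shown without building a full arithmetic coder: the sketch below compares the ideal coding cost (in bits, i.e. the information content under the model) of a skewed binary stream under a fixed 50/50 model versus a simple count-based adaptive model, loosely analogous to a CABAC context. The count-based update here is illustrative, not the actual CABAC update rule.

```python
import math

def code_cost_bits(bits, adaptive=True):
    """Ideal arithmetic-coding cost of a binary stream.
    Adaptive: a Laplace-style count model updated per symbol (like a context).
    Fixed: always assume p(1) = 0.5, i.e. exactly 1 bit per symbol."""
    ones, total, cost = 1, 2, 0.0          # Laplace smoothing: start at p(1) = 1/2
    for b in bits:
        p1 = ones / total if adaptive else 0.5
        p = p1 if b == 1 else 1.0 - p1
        cost += -math.log2(p)              # arithmetic coding approaches this bound
        ones += b
        total += 1
    return cost

stream = [1] * 90 + [0] * 10               # a skewed context: mostly 1s
print(round(code_cost_bits(stream, adaptive=False), 1))  # 100.0 bits (fixed model)
print(round(code_cost_bits(stream, adaptive=True), 1))   # far fewer bits
```

The adaptive model learns the skew from the data itself and pays roughly half the bits, which is precisely the advantage CABAC's per-context probability adaptation buys over fixed tables.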
Additionally, in-loop filters are applied after reconstruction to reduce artifacts before the frame is used as a reference for subsequent frames. H.264 has a deblocking filter. H.265 adds SAO (Sample Adaptive Offset). AV1 adds CDEF (Constrained Directional Enhancement Filter) and Loop Restoration Filter. These filters ensure that encoding artifacts do not propagate and amplify across frames.
Codecs vs Containers: The Fundamental Distinction
Think of a container as a shipping box and the codec as the way the contents are packed inside. The box (MP4, MKV) determines the label, the shipping method, and what kinds of items can go inside. The packing method (H.264, AV1) determines how efficiently the contents fit and how much space they take.
When someone says "I have an MP4 video," they are telling you the container format, but you still do not know the video codec (could be H.264, H.265, or AV1), the audio codec (could be AAC, MP3, or Opus), or any quality parameters. This is why "MP4" alone does not fully describe a video file.
H.264/AVC: The Universal Codec
H.264 (also known as AVC, Advanced Video Coding) was standardized in 2003 and quickly became the most widely used video codec in history. It powers Blu-ray discs, digital television, video surveillance, video conferencing, and the majority of internet video streaming. Every modern device with a screen has H.264 hardware decoding.
How H.264 Works
H.264 divides each frame into macroblocks (16x16 pixels) and uses three types of frames: I-frames (intra, complete pictures), P-frames (predicted from previous frames), and B-frames (bidirectional, predicted from both previous and future frames). For each macroblock, the encoder finds the best match from previously encoded frames (motion estimation), computes the difference (residual), transforms it with integer DCT, quantizes the coefficients, and entropy-codes the result using CABAC (Context-Adaptive Binary Arithmetic Coding) or CAVLC.
The result is remarkably efficient: uncompressed 1080p30 video runs at roughly 750 Mbps, and MPEG-2 needs 20+ Mbps for broadcast-quality HD, but H.264 produces comparable quality at 5 Mbps. This roughly 4x improvement over MPEG-2 (and better than 100:1 versus uncompressed) is what made HD video streaming viable on consumer internet connections.
Patent Situation
H.264 is covered by patents held by dozens of companies, managed by the MPEG LA patent pool. For internet video that is free to end users, MPEG LA offers royalty-free licensing. For pay-per-view or subscription content, licensing fees apply. This patent complexity motivated Google to develop VP8/VP9 and later AV1 as royalty-free alternatives.
H.265/HEVC: Double the Efficiency, Triple the Patent Mess
H.265 (High Efficiency Video Coding, also called HEVC) was standardized in 2013. It delivers approximately 40% better compression than H.264 at the same quality, or equivalently, the same quality at 40% lower bitrate. It achieves this through larger coding tree units (CTU, up to 64x64 vs 16x16), more intra prediction modes (35 vs 9), improved motion compensation, sample-adaptive offset filtering, and more reference frames.
However, H.265 adoption has been hampered by its patent licensing nightmare. Three separate patent pools (MPEG LA, HEVC Advance, and Velos Media) claim essential patents, each demanding separate royalties. Some patent holders are not in any pool. The total cost and legal complexity of deploying H.265 is so high that many companies (including Google and Mozilla) refused to adopt it in browsers, which is why Chrome and Firefox support VP9 and AV1 but not H.265 for web video.
H.265 is widely used in cable TV, satellite broadcasting, Blu-ray UHD, and Apple's ecosystem (iPhone recording, Apple TV+, FaceTime). Safari supports H.265 playback. For web developers, H.264 remains the safe baseline with AV1 as the modern upgrade path.
VP9: Google's Royalty-Free Answer
VP9 was developed by Google and released in 2013 as a royalty-free alternative to H.265. It achieves roughly similar compression efficiency to H.265 (about 35-40% better than H.264) without patent royalties. YouTube adopted VP9 as its primary codec for 4K content, and it is supported by Chrome, Firefox, Edge, and most Android devices.
VP9 uses superblocks (up to 64x64), 10 prediction modes, 8-tap interpolation filters, and a tile-based parallel processing architecture. Profile 2 adds 10-bit and 12-bit color depth for HDR content. VP9 hardware decoding is available on most devices manufactured since 2015.
H.266/VVC: The Next Generation (2020)
H.266 (Versatile Video Coding, VVC) was standardized in July 2020 and achieves approximately 50% better compression than H.265 — the same generational improvement that H.265 achieved over H.264. VVC uses coding tree units up to 128x128 pixels with more flexible partitioning options, 67 intra prediction modes (vs 35 for H.265), geometric partitioning for inter prediction, and adaptive loop filtering.
However, H.266/VVC faces the same patent challenges that hampered H.265. The MC-IF (Media Coding Industry Forum) is managing patent licensing, but the total cost and complexity remain unclear. Encoding complexity is also extreme: current VVC encoders are 10-20x slower than H.265 encoders, which were already considered slow.
Hardware decoder support for VVC is just beginning in 2026. MediaTek Dimensity 9300+ and Qualcomm Snapdragon 8 Gen 3 include hardware VVC decoders. No web browser supports VVC playback. The format will likely find adoption in broadcast television (where patent pools are established) and mobile devices (where hardware decoders are available), but AV1 is likely to dominate web video due to its royalty-free status and existing broad support.
ProRes: Apple's Professional Codec
Apple ProRes is a family of lossy video codecs designed for post-production editing. Unlike H.264 and AV1 (which are optimized for distribution/streaming), ProRes is optimized for editing: it uses intra-frame-only compression (every frame is a complete picture, enabling instant seeking) and predictable data rates that match the speed of professional storage systems.
ProRes comes in several variants: ProRes 422 Proxy (~45 Mbps at 1080p, for offline editing), ProRes 422 LT (~100 Mbps), ProRes 422 (~145 Mbps, the standard), ProRes 422 HQ (~220 Mbps, broadcast quality), ProRes 4444 (~330 Mbps, with alpha channel), and ProRes 4444 XQ (~500 Mbps, highest quality). ProRes RAW captures sensor data directly from cinema cameras with minimal processing.
Since Apple Silicon chips include hardware ProRes encoding and decoding, ProRes is essentially a native format on modern Macs. iPhone 13 Pro and later can record ProRes video directly. For cross-platform editing workflows, DNxHR (Avid) is the main alternative to ProRes, offering similar intra-frame performance with broader tool support on Windows.
AV1: The Future of Video
AV1 was developed by the Alliance for Open Media (AOM), a consortium including Google, Mozilla, Netflix, Amazon, Apple, Microsoft, Intel, AMD, ARM, and others. Released in 2018, AV1 is royalty-free and achieves approximately 50% better compression than H.264 and 20% better than H.265/VP9.
Technical Advances
AV1 uses superblocks up to 128x128 pixels with recursive quad/binary partitioning. It supports 56 intra prediction modes (vs 35 for H.265), directional prediction with angles fine-tuned to the content, and intra block copy (for screen content). Inter prediction uses compound reference frames, overlapped block motion compensation, and warped motion for non-translational movement.
Post-processing includes three in-loop filters: a deblocking filter, CDEF (Constrained Directional Enhancement Filter) for ringing artifact removal, and a loop restoration filter (Wiener or self-guided) for general noise reduction. AV1 also supports film grain synthesis, where the encoder analyzes and removes film grain, transmits grain parameters, and the decoder re-synthesizes grain at playback. This saves substantial bitrate on grainy content.
Adoption Status
YouTube has been encoding new uploads in AV1 since 2020 and serves AV1 to supporting devices. Netflix uses AV1 for all titles on Android devices and smart TVs with AV1 hardware support. Hardware decoding is available on MediaTek Dimensity 1000+, Samsung Exynos 2100+, Intel 11th-gen+, AMD RDNA 3, and NVIDIA RTX 30-series+. All major browsers support AV1 decoding. Real-time AV1 encoding is now practical with SVT-AV1 (Intel) and hardware encoders.
Key Finding
AV1 achieves 50% better compression than H.264 and is royalty-free. YouTube and Netflix have adopted it, and hardware decoder support is now widespread.
For new video projects, encode in AV1 with H.264 fallback. The dual-format approach covers every device while minimizing bandwidth costs.
Video Codec Efficiency Comparison
The chart below shows PSNR (peak signal-to-noise ratio, a quality metric where higher is better) at different bitrates for five codecs encoding the same 1080p test content. H.266/VVC and AV1 achieve the highest quality at every bitrate, followed by H.265, VP9, and H.264.
Video Quality (PSNR) vs Bitrate by Codec
Source: OnlineTools4Free Research
AV1 Hardware Decoder/Encoder Support in 2026
Hardware support determines real-world codec adoption because software decoding of high-resolution video drains battery and generates heat. AV1 hardware decoder support in 2026 covers:
Mobile SoCs: MediaTek Dimensity 1000+ (2020), Samsung Exynos 2100+ (2021), Qualcomm Snapdragon 8 Gen 2+ (2022), Apple A17 Pro+ (2023), Google Tensor G2+ (2022). Virtually all flagship phones sold since 2023 have AV1 hardware decode.
Desktop/Laptop: Intel 11th-gen+ (Tiger Lake, 2020) for decode, Intel Arc GPUs (2022) for encode. AMD RDNA 2 (RX 6000, 2020) for decode, RDNA 3 (RX 7000, 2022) for encode. NVIDIA RTX 30-series (2020) for decode, RTX 40-series (2022) for encode. Apple M3+ (2023) for decode (earlier Apple silicon falls back to software AV1 decoding).
Smart TVs: Most TVs manufactured since 2022 include AV1 hardware decode, especially models running Android TV. Samsung, LG, Sony, TCL, and Hisense all ship AV1-capable TVs. YouTube, Netflix, and Disney+ leverage this hardware for 4K AV1 streaming.
Hardware encoding is critical for live streaming, video conferencing, and real-time content creation. NVIDIA NVENC AV1 (RTX 40-series), AMD VCN AV1 (RX 7000), and Intel Quick Sync AV1 (Arc GPUs) enable real-time AV1 encoding at resolutions up to 8K. OBS Studio, Discord, and Google Meet all support hardware AV1 encoding where available.
Container Formats Compared
MP4 (MPEG-4 Part 14) is the universal container, supported by every device and browser. WebM is Google's web-focused container based on Matroska, limited to VP8/VP9/AV1 video with Vorbis/Opus audio. MKV (Matroska) is the most flexible container, supporting virtually any codec, multiple audio tracks, embedded subtitles, chapters, and attachments. MOV is Apple's QuickTime container, essentially identical to MP4 but with Apple-specific extensions.
Video Container Formats
| Format | Container Name | Year | Common Codecs | Streaming | Subtitles | Multi Audio | Browser % | Royalty Free | Best For |
|---|---|---|---|---|---|---|---|---|---|
| MP4 | MPEG-4 Part 14 | 2001 | H.264, H.265, AAC | Yes | Limited | Yes | 100% | No (H.264) | Universal playback |
| WebM | Matroska-based | 2010 | VP8, VP9, AV1, Opus | Yes | WebVTT | Yes | 96% | Yes | Web video |
| MKV | Matroska | 2002 | Any codec | Limited | SRT, ASS, SSA | Yes | ~20% | Yes | Archival, multiple tracks |
| MOV | QuickTime | 1991 | H.264, ProRes, AAC | Yes | Text track | Yes | ~60% | Partial | Apple ecosystem, editing |
| AVI | AVI | 1992 | DivX, XviD, MP3 | No | External only | Limited | ~15% | Yes | Legacy compatibility |
| FLV | Flash Video | 2003 | H.263, VP6, MP3 | RTMP | Limited | No | 0% (Flash dead) | No | Nothing (obsolete) |
| TS | MPEG-TS | 1995 | H.264, H.265, AAC | HLS | DVB | Yes | Via HLS | No | Broadcast, HLS streaming |
| OGV | Ogg | 2004 | Theora, Vorbis | Limited | Kate | Yes | ~80% | Yes | Open-source projects |
Video Codec Technical Matrix
Video Codecs: Technical Comparison
| Codec | Year | License | Quality/Bit | Encoding | Decoding | HW Support | HDR | Adoption |
|---|---|---|---|---|---|---|---|---|
| H.264/AVC | 2003 | MPEG LA | Baseline | Low | Low | Universal | No | 95% |
| H.265/HEVC | 2013 | MPEG LA + others | +40% vs H.264 | High | Medium | Widespread | Yes | 65% |
| VP9 | 2013 | Royalty-free | +35% vs H.264 | High | Medium | Good | Yes (Profile 2) | 60% |
| AV1 | 2018 | Royalty-free | +50% vs H.264 | Very High | Medium | Growing | Yes | 40% |
| H.266/VVC | 2020 | MC-IF | +50% vs H.265 | Extreme | High | Emerging | Yes | 5% |
| VP8 | 2008 | Royalty-free | ~H.264 | Low | Low | Legacy | No | 30% |
| Theora | 2004 | Royalty-free | -20% vs H.264 | Low | Low | None | No | <5% |
| ProRes | 2007 | Apple | Near-lossless | Low | Low | Apple silicon | Yes | 20% (pro) |
What Streaming Platforms Use
Major streaming platforms have adopted different codec strategies based on their content, audience, and device ecosystem. YouTube and Netflix lead AV1 adoption. Apple TV+ and Disney+ rely on H.265 for their Apple-centric audiences. The table below shows current codec choices as of 2026.
Streaming Platform Format Choices (2026)
| Platform | Primary Codec | Fallback | Container | Max Resolution | HDR Formats | Audio Codec |
|---|---|---|---|---|---|---|
| YouTube | AV1 | VP9, H.264 | WebM, MP4 | 8K | HDR10, HLG | Opus, AAC |
| Netflix | AV1 | H.265, VP9 | CMAF, MP4 | 4K | Dolby Vision, HDR10 | AAC, E-AC-3, Atmos |
| Twitch | H.264 | AV1 (beta) | FMP4, TS | 1080p60 | None | AAC |
| Disney+ | H.265 | H.264 | CMAF | 4K | Dolby Vision, HDR10 | E-AC-3, Atmos |
| Vimeo | H.264 | H.265 | MP4 | 8K | HDR10 | AAC |
| TikTok | H.264 | H.265 | MP4 | 1080p | HDR10 | AAC |
| | H.264 | H.265 | MP4 | 1080p | None | AAC |
| Apple TV+ | H.265 | H.264 | CMAF | 4K | Dolby Vision | AAC, Atmos |
Bitrate, Resolution, and Frame Rate
These three parameters interact to determine video quality and file size. Resolution defines the number of pixels per frame (1920x1080 = 2,073,600 pixels). Frame rate defines frames per second (24, 30, or 60 fps). Bitrate defines the data rate (measured in Mbps). For a given codec, doubling the pixel count approximately doubles the required bitrate for the same quality. Doubling the frame rate increases required bitrate by 40-60% (not 100%, because successive frames are similar and compress well).
Practical bitrate recommendations for H.264: 1080p30 at 5-8 Mbps for streaming, 8-15 Mbps for high quality, 20-50 Mbps for archival. For AV1, reduce these numbers by roughly half. 4K requires 2-4x the bitrate of 1080p for the same quality level.
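A quick sanity check on these numbers, using the 1080p30 rates above plus a nominal audio track (the 128 kbps audio figure is an assumption for illustration):

```python
def file_size_mb(bitrate_kbps, minutes, audio_kbps=128):
    """Approximate file size: (video + audio bitrate) x duration / 8, in decimal MB."""
    total_kbps = bitrate_kbps + audio_kbps
    return total_kbps * minutes * 60 / 8 / 1000  # kbits -> kilobytes -> MB

# A 10-minute 1080p30 clip at the streaming rates suggested above
print(round(file_size_mb(5000, 10), 1))  # H.264 @ 5 Mbps  -> 384.6 MB
print(round(file_size_mb(2500, 10), 1))  # AV1  @ 2.5 Mbps -> 197.1 MB
```

The halved AV1 bitrate translates one-to-one into halved storage and bandwidth, which is the economic driver behind the codec transitions discussed in this part.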
Standard Bitrate Ladder for Adaptive Streaming
Adaptive bitrate streaming requires encoding the same video at multiple quality levels. The player selects the highest quality that the viewer's connection can sustain without buffering. The table below shows recommended bitrates for H.264, H.265, and AV1 at each resolution tier. AV1 requires roughly half the bitrate of H.264 for equivalent quality, which translates directly to bandwidth cost savings.
Adaptive Streaming Bitrate Ladder (kbps)
| Quality | Resolution | FPS | H.264 kbps | H.265 kbps | AV1 kbps |
|---|---|---|---|---|---|
| 240p | 426x240 | 30 | 400 | 250 | 200 |
| 360p | 640x360 | 30 | 800 | 500 | 400 |
| 480p | 854x480 | 30 | 1400 | 900 | 700 |
| 720p30 | 1280x720 | 30 | 2800 | 1800 | 1400 |
| 720p60 | 1280x720 | 60 | 4500 | 2800 | 2200 |
| 1080p30 | 1920x1080 | 30 | 5000 | 3200 | 2500 |
| 1080p60 | 1920x1080 | 60 | 8000 | 5000 | 4000 |
| 1440p30 | 2560x1440 | 30 | 12000 | 7500 | 6000 |
| 1440p60 | 2560x1440 | 60 | 18000 | 11000 | 9000 |
| 4K30 | 3840x2160 | 30 | 20000 | 12000 | 10000 |
| 4K60 | 3840x2160 | 60 | 35000 | 20000 | 16000 |
| 8K30 | 7680x4320 | 30 | 80000 | 45000 | 35000 |
Understanding I-Frames, P-Frames, and B-Frames
Video compression works by exploiting temporal redundancy: consecutive frames are usually very similar. Instead of storing every frame independently (like a flipbook), modern codecs store complete reference frames (I-frames) and then describe subsequent frames as differences from those references. P-frames reference previous frames; B-frames reference both previous and future frames.
The table below shows a typical breakdown for a 1-minute 1080p30 clip encoded with H.264. I-frames are the largest (280 KB average) because they contain a complete picture, but they are rare (15 in 60 seconds, one every 4 seconds, i.e. a GOP of 120 frames at 30 fps). P-frames are 42 KB on average and make up the bulk of data. B-frames are the smallest at 18 KB each. The ratio of frame types directly affects both quality and seekability: more I-frames enable faster seeking but increase file size.
H.264 Frame Type Distribution (1-min 1080p30 clip)
| Frame Type | Count | Avg Size (KB) | % of Total | Description |
|---|---|---|---|---|
| I-frame (Intra) | 15 | 280 | 8 | Complete picture — no references. Used for seeking, scene changes. Interval: every 4 seconds (GOP=120). |
| P-frame (Predicted) | 585 | 42 | 49 | Predicted from previous I or P frames. Contains motion vectors + residual data. Most common frame type. |
| B-frame (Bidirectional) | 1200 | 18 | 43 | Predicted from both past and future reference frames. Smallest frames. Typically 2 B-frames between P-frames. |
H.264 Profile and Level System
H.264 defines profiles that specify which coding tools are available. The Baseline profile omits B-frames and CABAC entropy coding, making it simpler to decode (suitable for mobile and video conferencing). The Main profile adds B-frames and CABAC for 20-30% better compression. The High profile adds 8x8 transforms and additional quantization options for the best quality per bit.
H.264 Profiles
| Profile | B-Frames | CABAC | Max Resolution | Use Case |
|---|---|---|---|---|
| Baseline | No | No | 1920x1080 | Video conferencing, mobile |
| Main | Yes | Yes | 1920x1080 | Standard definition broadcast |
| High | Yes | Yes | 4096x2304 | Blu-ray, streaming HD |
| High 10 | Yes | Yes | 4096x2304 | 10-bit HDR content |
| High 4:2:2 | Yes | Yes | 4096x2304 | Professional 4:2:2 workflows |
| High 4:4:4 | Yes | Yes | 4096x2304 | Screen capture, lossless |
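These profiles and levels are what RFC 6381 codec strings like avc1.4D401E (seen in HTML source types and HLS playlists) encode: two hex digits of profile_idc, two of constraint flags, and two of level_idc (the level times 10). Below is a small decoder sketch with a deliberately partial profile table:

```python
# Partial map of H.264 profile_idc values (not the full list in the standard)
PROFILES = {66: "Baseline", 77: "Main", 100: "High", 110: "High 10",
            122: "High 4:2:2", 244: "High 4:4:4"}

def parse_avc1(codec_string):
    """Decode an RFC 6381 'avc1.PPCCLL' string into (profile, constraint
    flags, level). PP/CC/LL are hex: profile_idc, constraints, level_idc."""
    _, hexpart = codec_string.split(".")
    profile_idc = int(hexpart[0:2], 16)
    constraints = int(hexpart[2:4], 16)
    level = int(hexpart[4:6], 16) / 10
    return PROFILES.get(profile_idc, f"profile_idc {profile_idc}"), constraints, level

print(parse_avc1("avc1.4D401E"))  # ('Main', 64, 3.0) -> Main profile, level 3.0
print(parse_avc1("avc1.640028"))  # ('High', 0, 4.0)  -> High profile, level 4.0
```

This is why a server or player can reject a stream from the codec string alone, before downloading any video data.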
Video Encoding Speed Comparison
The chart below compares encoding speed (frames per second) for different video codecs at 1080p resolution. x264 (H.264) is the fastest, encoding at 48-110 fps depending on quality. SVT-AV1 is practical for production use at 6-16 fps. The reference AV1 encoder (aomenc) is extremely slow at under 1 fps, suitable only for offline batch encoding. H.266/VVC encoding is even slower, reflecting its status as an emerging codec.
Video Encoding Speed (frames/sec, 1080p single-thread)
| Codec | High Q (fps) | Medium Q (fps) | Low Q (fps) | Preset |
|---|---|---|---|---|
| x264 (H.264) | 48 | 72 | 110 | medium |
| x265 (H.265) | 12 | 18 | 28 | medium |
| VP9 (libvpx) | 8 | 14 | 22 | good |
| SVT-AV1 | 6 | 10 | 16 | 6 |
| aomenc (AV1) | 0.8 | 1.5 | 2.5 | cpu-used=4 |
| rav1e (AV1) | 3.5 | 5.5 | 8.5 | 6 |
| VVenC (H.266) | 0.3 | 0.5 | 0.8 | medium |
YouTube Codec Adoption Trend
YouTube is the world's largest video platform and its codec choices drive industry adoption. The chart below shows how YouTube has shifted from H.264 dominance (95% in 2016) to AV1 majority (58% in 2026). VP9 served as a bridge technology, peaking at 40% in 2021 before being gradually replaced by AV1. This transition has saved Google and its users billions of dollars in bandwidth costs while delivering better video quality, especially on mobile connections.
YouTube Video Codec Distribution by Year (%)
Source: OnlineTools4Free Research
Streaming Protocols Compared
Video streaming uses specialized protocols to deliver video over HTTP. HLS (HTTP Live Streaming, Apple) and DASH (Dynamic Adaptive Streaming over HTTP, MPEG) are the two dominant protocols. CMAF (Common Media Application Format) unifies them using a shared segment format. Low-latency variants (LL-HLS, LL-DASH) reduce latency from 10-30 seconds to 2-4 seconds for live streaming. WebRTC provides sub-second latency for real-time communication.
Streaming Protocols Compared
| Protocol | Developer | Year | Latency | ABR | Browser % |
|---|---|---|---|---|---|
| HLS | Apple | 2009 | 6-30s | Yes | 95% |
| DASH | MPEG/ISO | 2012 | 6-30s | Yes | 80% |
| CMAF | MPEG/Apple/Microsoft | 2018 | 2-6s | Yes | 90% |
| LL-HLS | Apple | 2020 | 2-4s | Yes | 85% |
| LL-DASH | DASH-IF | 2020 | 2-4s | Yes | 75% |
| WebRTC | Google/IETF | 2011 | <500ms | Yes | 97% |
| RTMP | Adobe/Macromedia | 2002 | 1-5s | No | 0% (Flash dead) |
| SRT | Haivision | 2017 | <1s | No | 0% (ingest only) |
| WHIP | IETF | 2022 | <500ms | Yes | 95% |
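To make the protocol/ladder relationship concrete, here is a minimal, hypothetical HLS master playlist exposing an AV1 rendition with an H.264 fallback. The rendition paths, bandwidth values, and exact CODECS strings are illustrative, not a production configuration:

```
#EXTM3U
#EXT-X-VERSION:7

# 1080p30 AV1 rendition
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1920x1080,CODECS="av01.0.08M.08,opus"
av1_1080p/playlist.m3u8

# 720p30 H.264 fallback rendition
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
h264_720p/playlist.m3u8
```

The player picks a rendition from BANDWIDTH and CODECS before fetching any media segments, which is how adaptive streaming and codec fallback compose in practice.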
Code Example: Embedding Video with AV1 Fallback
The HTML <video> element supports multiple source formats. Use the codecs parameter to specify the exact codec, allowing the browser to skip formats it cannot decode without downloading them.
<video
autoplay muted loop playsinline
poster="thumb.webp"
width="800" height="450">
<source src="output-av1.mp4" type='video/mp4; codecs="av01.0.05M.08"'>
<source src="output-h264.mp4" type='video/mp4; codecs="avc1.4D401E"'>
Your browser does not support the video element.
</video>
For converting video between formats, ffmpeg is the standard tool. The commands below convert a source video to both AV1 (primary) and H.264 (fallback).
# Convert video to AV1 with H.264 fallback
ffmpeg -i input.mov -c:v libaom-av1 -crf 30 -b:v 0 \
-c:a libopus -b:a 128k output-av1.mp4
ffmpeg -i input.mov -c:v libx264 -crf 23 \
-c:a aac -b:a 192k output-h264.mp4
Part 4: Audio Formats
~5,000 words covering 9 audio formats
Audio formats fall into three categories: uncompressed (WAV, AIFF), lossless compressed (FLAC, ALAC), and lossy compressed (MP3, AAC, Opus, Vorbis, WMA). The choice between them involves trade-offs between quality, file size, compatibility, and feature support.
Audio Loudness Standards: The Loudness War is Over
For decades, music producers made tracks as loud as possible ("the loudness war"), compressing dynamic range to achieve higher average levels. This practice peaked in the mid-2000s and has been reversed by streaming platform normalization.
All major streaming platforms now normalize playback volume to a target loudness level measured in LUFS (Loudness Units relative to Full Scale). Spotify normalizes to -14 LUFS, Apple Music to -16 LUFS, YouTube to -14 LUFS, and Tidal to -14 LUFS. A track mastered at -8 LUFS (very loud, heavily compressed) will be turned down by 6 dB on Spotify, while a dynamic track mastered at -14 LUFS will play at its original level. This means over-compressed masters actually sound worse on streaming platforms because they have less dynamic range without any loudness advantage.
For podcasts, the standard is -16 LUFS (stereo) or -19 LUFS (mono) per the Podcast Standards Project and Apple's specifications. YouTube targets -14 LUFS for all content. Broadcast television uses -23 LUFS (EBU R128 in Europe) or -24 LKFS (ATSC A/85 in North America).
The loudness meter and normalization process operate on the final encoded file regardless of format. However, the format affects how well the dynamic range is preserved. At low bitrates, lossy codecs can introduce artifacts on loud transients. Opus handles this best due to its adaptive mode switching; MP3 at 128 kbps can produce pre-echo artifacts on transients (audible distortion slightly before drum hits) that Opus avoids entirely.
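The normalization arithmetic itself is trivial: the platform applies the difference between its target and the measured integrated loudness. A one-line sketch (how platforms treat tracks quieter than target varies; some boost with a limiter, some leave them alone):

```python
def normalization_gain_db(master_lufs, target_lufs=-14.0):
    """Gain a streaming platform applies: target loudness minus measured loudness."""
    return target_lufs - master_lufs

# Against a Spotify/YouTube-style -14 LUFS target:
print(normalization_gain_db(-8.0))   # -6.0 -> loud, crushed master turned DOWN 6 dB
print(normalization_gain_db(-14.0))  #  0.0 -> dynamic master plays untouched
print(normalization_gain_db(-18.0))  #  4.0 -> quiet master may be turned up (platform-dependent)
```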
FLAC Internals: How Lossless Audio Compression Works
FLAC achieves lossless compression by exploiting the predictability of audio signals. Audio samples are not random — each sample is strongly correlated with its neighbors. FLAC uses linear predictive coding (LPC) to model this correlation.
Step 1: Blocking. The audio stream is divided into fixed-size blocks of samples (typically 4,096). Each block is compressed independently, allowing seeking to any block without decompressing the entire file.
Step 2: Inter-channel decorrelation. For stereo audio, FLAC can store left and right channels independently, or use mid/side encoding (M = (L+R)/2, S = L-R). Mid/side is more efficient when the channels are similar (most stereo music), because the side channel has lower amplitude and compresses better.
Step 3: Linear prediction. The encoder tries multiple predictors (fixed polynomial predictors of orders 0-4, plus LPC up to order 32, though typical settings search up to order 8-12) and selects the one that minimizes the prediction residual. For a 4th-order predictor, each sample is predicted as: P[n] = a1*S[n-1] + a2*S[n-2] + a3*S[n-3] + a4*S[n-4]. The residual (actual - predicted) is typically much smaller than the original samples.
Step 4: Rice coding. The residual values are encoded using Rice coding, an entropy coding scheme optimized for small, near-zero values (which prediction residuals typically are). Rice coding is simpler and faster than Huffman coding while being nearly as efficient for this specific distribution.
The result: CD-quality audio (16-bit, 44.1 kHz) compresses to approximately 50-60% of its PCM size. Classical music and quiet acoustic recordings compress better (40-50%) because they have lower entropy. Electronic music and heavily distorted content compresses worse (55-65%) because it has higher entropy and less predictability.
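The predict-then-Rice-code pipeline can be demonstrated end to end on a synthetic signal. The sketch below uses FLAC's fixed order-2 predictor and counts the bits a Rice code with parameter k=2 would spend on the residual. This is a simplification: real FLAC chooses the predictor order and the Rice parameter adaptively per block and partition.

```python
import math

def fixed_predict_residual(samples):
    """FLAC-style fixed order-2 predictor: p[n] = 2*s[n-1] - s[n-2]."""
    res = list(samples[:2])                      # warm-up samples stored verbatim
    for n in range(2, len(samples)):
        res.append(samples[n] - (2 * samples[n-1] - samples[n-2]))
    return res

def rice_bits(residual, k=2):
    """Bits to Rice-code the residual: unary quotient + stop bit + k remainder bits."""
    total = 0
    for r in residual:
        u = 2 * r if r >= 0 else -2 * r - 1      # zigzag map: signed -> unsigned
        total += (u >> k) + 1 + k
    return total

# A smooth synthetic signal: a slow sine wave
samples = [round(1000 * math.sin(2 * math.pi * n / 64)) for n in range(256)]
residual = fixed_predict_residual(samples)

print(max(abs(s) for s in samples))              # 1000: raw samples are large
print(max(abs(r) for r in residual[2:]))         # ~12: prediction removed most energy
print(rice_bits(residual[2:]), "vs", 16 * 254)   # far fewer bits than 16/sample PCM
```

Because the residual values cluster near zero, a few bits per sample suffice where raw PCM would spend 16, which is exactly the mechanism behind FLAC's 50-60% size reduction.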
MP3: The Format That Changed Music
MP3 (MPEG-1 Audio Layer III) was standardized in 1993 and revolutionized music distribution by enabling a 10:1 compression ratio with acceptable audio quality. A typical 4-minute song is about 4 MB at 128 kbps (vs 40 MB as WAV), making it practical to download and share music over dial-up internet connections. The cultural impact of MP3 on the music industry — through Napster, iTunes, and the iPod — is difficult to overstate.
How MP3 Compression Works
MP3 exploits psychoacoustic principles of human hearing. The audio signal is transformed to the frequency domain using a modified DCT (MDCT). A psychoacoustic model analyzes the signal to determine which frequencies are masked (inaudible because of louder nearby frequencies). The encoder allocates bits to audible components and discards masked components. This is called perceptual coding.
Key psychoacoustic phenomena exploited: simultaneous masking (a loud tone at 1 kHz makes nearby frequencies inaudible), temporal masking (a loud sound masks softer sounds immediately before and after it), and the absolute threshold of hearing (frequencies below certain amplitudes are inaudible regardless of other sounds). By discarding information the ear cannot perceive, MP3 achieves high compression with minimal audible quality loss.
Bitrate Modes
MP3 supports three bitrate modes: CBR (Constant Bitrate, same bitrate throughout), VBR (Variable Bitrate, adapts to content complexity), and ABR (Average Bitrate, targets a specific average). VBR produces the best quality per file size because it allocates more bits to complex passages (cymbals, vocal harmonics) and fewer bits to simple passages (silence, sustained notes). LAME's VBR V0 setting (~245 kbps average) is considered transparent (indistinguishable from CD) by most listeners in blind tests.
All MP3 patents expired by 2017, making the format completely royalty-free. However, MP3 is technically inferior to AAC and Opus at every bitrate. The LAME encoder (the best MP3 encoder) has not seen significant development since the late 2000s. MP3 remains relevant purely due to its universal compatibility.
WAV: Uncompressed Studio Audio
WAV (Waveform Audio File Format) stores audio as uncompressed PCM (Pulse Code Modulation) data. CD-quality WAV is 16-bit, 44.1 kHz, stereo, producing a bitrate of 1,411 kbps (1.41 Mbps). A 4-minute song is approximately 42 MB. Professional WAV recordings use 24-bit or 32-bit floating-point at 48 kHz, 96 kHz, or even 192 kHz sample rates.
WAV is the standard format in recording studios, sound design, broadcast, and any workflow where quality cannot be compromised. It is the format that DAWs (Digital Audio Workstations) like Pro Tools, Logic Pro, Ableton Live, and FL Studio use internally. WAV files are fast to read and write because no decompression is needed, which matters for real-time multi-track recording.
FLAC: The Audiophile Standard
FLAC (Free Lossless Audio Codec) compresses audio without any quality loss, typically achieving a 50-60% size reduction compared to WAV. A 4-minute CD-quality song is about 20-25 MB in FLAC (vs 42 MB WAV). FLAC is the de facto standard for lossless music distribution: Bandcamp, Tidal HiFi, Amazon Music HD, and Qobuz all use FLAC.
How FLAC Works
FLAC uses linear prediction to model each audio sample based on previous samples. The encoder tries multiple prediction orders and selects the one that minimizes the residual (difference between predicted and actual values). The residual is then entropy-coded using Rice coding, which is highly efficient for the small, near-zero values typical of prediction residuals.
FLAC supports sample rates from 1 Hz to 655,350 Hz, bit depths from 4 to 32 bits, and up to 8 channels. It includes MD5 checksumming for integrity verification, cue sheet support for album indexing, and Vorbis comment tags for metadata.
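To make the prediction-plus-residual idea concrete, here is a toy first-order predictor in Python. Real FLAC chooses higher-order predictors per block and Rice-codes the residuals; this sketch only shows why residuals are small and the round trip lossless:

```python
import math

def first_order_residuals(samples: list[int]) -> list[int]:
    """Predict each sample as equal to the previous one; keep only the error.
    For smooth waveforms the residuals cluster near zero — exactly the
    distribution Rice coding compresses well."""
    residuals = [samples[0]]                      # first sample stored verbatim
    for i in range(1, len(samples)):
        residuals.append(samples[i] - samples[i - 1])
    return residuals

def reconstruct(residuals: list[int]) -> list[int]:
    """Lossless inverse: a running sum restores the original samples exactly."""
    samples = [residuals[0]]
    for r in residuals[1:]:
        samples.append(samples[-1] + r)
    return samples

wave = [round(1000 * math.sin(i / 10)) for i in range(50)]  # smooth test signal
res = first_order_residuals(wave)
assert reconstruct(res) == wave               # bit-perfect round trip
print(max(abs(r) for r in res[1:]))           # residuals ~100 vs samples ~1000
```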
AAC: Apple's Codec of Choice
AAC (Advanced Audio Coding) was standardized in 1997 as the successor to MP3 within the MPEG family. It uses more advanced psychoacoustic modeling, larger transform window sizes (2048 samples vs 576 for MP3), temporal noise shaping, and perceptual noise substitution to achieve better quality than MP3 at every bitrate. At 128 kbps, AAC sounds equivalent to 192 kbps MP3.
Apple adopted AAC for iTunes, iPod, iPhone, Apple Music, FaceTime, and every Apple product. Spotify uses 256 kbps AAC for premium streaming on web and desktop. Most broadcast and streaming services use AAC for their audio tracks.
AAC Profiles
AAC comes in several profiles: AAC-LC (Low Complexity, the most common, used by iTunes and Spotify), HE-AAC v1 (High Efficiency, adds Spectral Band Replication for excellent quality at low bitrates, used in digital radio), and HE-AAC v2 (adds Parametric Stereo for very low bitrates). Note that Apple Lossless (ALAC) is not an AAC profile: it is a separate lossless codec that merely shares the .m4a container with AAC.
Opus: The Best Audio Codec
Opus is the best lossy audio codec available in 2026 by virtually every metric. Standardized by the IETF in 2012 (RFC 6716), it is open-source, royalty-free, and designed for both speech and music. Opus dynamically blends two coding modes: SILK (developed by Skype for speech) and CELT (Constrained Energy Lapped Transform, for music), adapting in real-time to the content.
Quality at Every Bitrate
Opus delivers quality that matches or exceeds every other lossy codec at every bitrate. At 64 kbps, Opus sounds better than AAC at 96 kbps and MP3 at 128 kbps. At 128 kbps, Opus is transparent (indistinguishable from the original) for the vast majority of content. This has been confirmed in multiple independent listening tests, notably those collected by the HydrogenAudio community.
Opus supports bitrates from 6 kbps (very low quality speech) to 510 kbps (transparent music), sample rates from 8 kHz to 48 kHz, and up to 255 channels. It has an algorithmic latency as low as 2.5 ms, making it ideal for real-time communication. Discord, WebRTC, all modern VoIP, and many game engines use Opus.
Key Finding
Opus is the best lossy audio codec at every bitrate. At 128 kbps it is transparent (indistinguishable from CD). It is open-source, royalty-free, and supported by 97% of browsers.
For new projects, Opus should be the default choice for both voice and music. Use AAC only for Apple-specific workflows, and MP3 only for legacy compatibility.
OGG Vorbis: The Pioneer of Open Audio
Vorbis is an open-source, royalty-free audio codec developed by the Xiph.Org Foundation and typically stored in the Ogg container format. Released in 2000, Vorbis was created as a free alternative to MP3 and AAC during the era of aggressive patent enforcement. At equivalent bitrates, Vorbis generally matches or slightly exceeds MP3 quality, particularly at lower bitrates (below 128 kbps).
Vorbis has been largely superseded by Opus (from the same Xiph.Org Foundation), but it remains widely used in gaming (Unreal Engine, Unity, many game titles), Wikipedia (audio files), and some streaming services. Spotify used Ogg Vorbis for years (320 kbps for Premium).
AIFF: Apple's Uncompressed Format
AIFF (Audio Interchange File Format) is essentially Apple's equivalent of WAV. Created by Apple in 1988, it stores uncompressed PCM audio with the same quality characteristics as WAV. The primary differences are byte order (AIFF uses big-endian, WAV uses little-endian), metadata format (AIFF uses its own chunk format), and ecosystem (AIFF is common in macOS/Logic Pro workflows, WAV is universal).
AIFF-C is a compressed variant that supports various codecs, but it is rarely used. For practical purposes, WAV and AIFF are interchangeable for uncompressed audio storage. Choose whichever your DAW and workflow prefer.
WMA: Microsoft's Legacy Codec
WMA (Windows Media Audio) was Microsoft's proprietary audio codec, developed as a competitor to MP3 and AAC. WMA Standard offers quality comparable to MP3 at similar bitrates. WMA Pro supports multichannel (5.1, 7.1) and 24-bit audio. WMA Lossless offers FLAC-like compression.
WMA's primary claim to fame was deep DRM integration with Windows Media Player, making it the preferred format for early digital music stores. This DRM dependency backfired: when Microsoft shut down its DRM servers, purchased WMA files became unplayable. The format is effectively dead for new content but persists in legacy media libraries. Do not use WMA for new projects.
ALAC: Apple's Lossless Codec
ALAC (Apple Lossless Audio Codec) is Apple's lossless audio format, functionally equivalent to FLAC but stored in the M4A container. Apple open-sourced ALAC in 2011, making it royalty-free. ALAC achieves similar compression ratios to FLAC (50-60% of WAV size) and is natively supported on all Apple devices and iTunes/Apple Music.
The practical difference between FLAC and ALAC is ecosystem support. FLAC is universal: supported by Android, Linux, Windows, web browsers, and most audio software. ALAC is primarily an Apple ecosystem format. For cross-platform compatibility, FLAC is the better choice. For Apple-only workflows, ALAC avoids any potential compatibility issues with Apple software.
DSD: The Audiophile Niche
DSD (Direct Stream Digital) uses a fundamentally different approach to digital audio. Instead of PCM's multi-bit samples at a moderate rate (16-bit at 44.1 kHz), DSD uses 1-bit samples at an extremely high rate (2.8224 MHz for DSD64, 5.6448 MHz for DSD128). This produces a bitstream that directly represents the analog waveform using delta-sigma modulation.
DSD was developed by Sony and Philips for the SACD (Super Audio CD) format. DSD files are stored in DFF or DSF containers and are enormous: about 42 MB per minute for DSD64 stereo. Niche streaming services (NativeDSD) offer DSD content, but for 99.9% of listeners, 24-bit FLAC at 96 kHz is indistinguishable from DSD and far more practical.
Spatial Audio: Dolby Atmos and Beyond
Spatial audio represents sound in three dimensions. Dolby Atmos uses object-based audio where each sound source has a position (x, y, z) and the renderer adapts to the speaker configuration or headphones. Atmos is supported by Apple Music, Tidal, Amazon Music, and Netflix.
Ambisonics encodes a complete sound field as spherical harmonic channels. First-order ambisonics (FOA) uses 4 channels; higher-order uses more for better spatial resolution. Ambisonics is used in VR (YouTube 360, Facebook 360) because it can be freely rotated without artifacts — the listener can look in any direction.
For web audio, the Web Audio API provides the PannerNode for basic 3D positioning of sound sources. For immersive experiences, Resonance Audio (Google) and Mach1 provide higher-quality spatial audio rendering in the browser. The audio format is typically Opus or AAC; the spatial metadata is applied at the rendering stage.
Audio Quality vs File Size
The chart below shows perceptual audio quality (on a 1-10 scale based on MUSHRA listening tests) at different bitrates for four lossy codecs. Opus achieves the best quality at every bitrate. At 128 kbps, Opus is essentially transparent, while MP3 still has audible artifacts on certain content.
Audio Quality Score vs Bitrate
Source: OnlineTools4Free Research
Complete Audio Format Comparison
Audio Formats: Full Comparison
| Format | Year | Type | Codec | Max Bitrate | Sample Rates | Bit Depth | Channels | Gapless | Royalty Free | Browser % | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MP3 | 1993 | Lossy | MPEG-1 Layer 3 | 320 kbps | 8-48 kHz | 16-bit | Stereo | Hack (LAME) | Yes (2017) | 100% | Universal compatibility |
| AAC | 1997 | Lossy | AAC-LC, HE-AAC | 512 kbps | 8-96 kHz | 16-bit | 7.1 | Yes | No | 100% | Apple, streaming |
| OGG Vorbis | 2000 | Lossy | Vorbis | 500 kbps | 8-192 kHz | 16-bit | 255 | Yes | Yes | 95% | Gaming, open-source |
| Opus | 2012 | Lossy | Opus | 510 kbps | 8-48 kHz | 16-24 bit | 255 | Yes | Yes | 97% | VoIP, streaming, web |
| FLAC | 2001 | Lossless | FLAC | Variable | 1-655 kHz | 4-32 bit | 8 | Yes | Yes | 92% | Audiophile, archival |
| WAV | 1991 | Uncompressed | PCM | N/A | 8-384 kHz | 8-64 bit | 65,535 | Yes | Yes | 100% | Studio recording |
| AIFF | 1988 | Uncompressed | PCM | N/A | 8-384 kHz | 8-32 bit | 6 | Yes | Yes | ~70% | Apple pro audio |
| WMA | 1999 | Lossy | WMA Standard | 384 kbps | 8-48 kHz | 16-bit | 5.1 | Limited | No | ~15% | Windows ecosystem |
| ALAC | 2004 | Lossless | ALAC | Variable | 1-384 kHz | 16-32 bit | 8 | Yes | Yes (2011) | ~60% | Apple lossless |
Sample Rate, Bit Depth, and Channels Explained
Sample rate determines the highest frequency that can be captured. By the Nyquist theorem, the maximum reproducible frequency is half the sample rate. CD audio at 44.1 kHz captures frequencies up to 22.05 kHz, exceeding the typical human hearing range of 20 Hz to 20 kHz. Higher sample rates (96 kHz, 192 kHz) are used in professional recording to capture ultrasonic harmonics and to provide wider headroom for processing, but they offer no audible benefit for playback.
Bit depth determines the dynamic range (the ratio between the loudest and softest sounds). 16-bit audio provides 96 dB of dynamic range, sufficient for CD and most listening environments. 24-bit audio provides 144 dB of dynamic range, which exceeds the threshold of pain (~130 dB) and is used in recording to prevent clipping during performance.
Channels define the spatial audio layout. Mono (1 channel) is used for speech, podcasts, and AM radio. Stereo (2 channels) is standard for music. Surround formats include 5.1 (six channels: front left, center, front right, rear left, rear right, subwoofer), 7.1 (eight channels), and spatial audio formats like Dolby Atmos (up to 128 tracks with object-based positioning).
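Both rules of thumb above (Nyquist frequency is half the sample rate; dynamic range is roughly 6.02 dB per bit) are one-liners:

```python
import math

def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest reproducible frequency: half the sample rate (Nyquist theorem)."""
    return sample_rate_hz / 2

def dynamic_range_db(bits: int) -> float:
    """Dynamic range of b-bit PCM: 20*log10(2^b), about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(nyquist_hz(44_100))              # 22050.0 Hz — just above human hearing
print(round(dynamic_range_db(16)))     # 96 dB (CD)
print(round(dynamic_range_db(24)))     # 144 dB (studio recording)
```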
Frequency Response by Codec and Bitrate
At low bitrates, lossy audio codecs aggressively cut high frequencies because most musical energy is in the lower frequencies. The table below shows the highest frequency preserved at each bitrate. MP3 at 64 kbps cuts everything above 8 kHz, producing a noticeably muffled sound. Opus at the same bitrate preserves frequencies up to 16 kHz, sounding dramatically better. At 128 kbps and above, all modern codecs preserve the full audible range (20 kHz).
Highest Preserved Frequency (Hz) by Bitrate
Source: OnlineTools4Free Research
Audio Blind Test Results (MUSHRA Scores)
MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) is the standard methodology for subjective audio quality evaluation. Listeners compare encoded samples against the hidden original on a 0-100 scale, where 100 means indistinguishable from the original. The data below represents averages from 20 expert listeners across 10 critical music samples (orchestral, vocal, percussion, electronic, jazz).
The results confirm Opus's dominance: at 128 kbps it scores 94/100 (transparent for most listeners), while MP3 LAME at the same bitrate scores 78. The practical takeaway: Opus at 96 kbps sounds as good as MP3 at 256 kbps, saving 60% of bandwidth.
MUSHRA Blind Test Scores by Codec (100 = transparent)
Source: OnlineTools4Free Research
Complete Sample Rate Reference
The Nyquist theorem states that a digital audio system can perfectly reproduce frequencies up to half the sample rate. CD audio at 44.1 kHz captures up to 22.05 kHz, comfortably exceeding the typical human hearing range of 20 Hz to 20 kHz. Higher sample rates are used in professional recording not for audible benefit, but for processing headroom and to push anti-aliasing filters well above the audible range.
Audio Sample Rates and Nyquist Frequencies
| Sample Rate | Nyquist Freq | Quality Tier | Common Use |
|---|---|---|---|
| 8,000 Hz | 4,000 Hz | Telephone | VoIP, old telephony |
| 16,000 Hz | 8,000 Hz | Wideband speech | HD Voice, podcasts |
| 22,050 Hz | 11,025 Hz | AM radio | Low-quality streaming |
| 44,100 Hz | 22,050 Hz | CD | Music distribution, streaming |
| 48,000 Hz | 24,000 Hz | DVD/Broadcast | Video audio, DAWs |
| 88,200 Hz | 44,100 Hz | High-res | Professional recording |
| 96,000 Hz | 48,000 Hz | High-res | Professional recording, Blu-ray |
| 176,400 Hz | 88,200 Hz | Ultra high-res | Mastering, archival |
| 192,000 Hz | 96,000 Hz | Ultra high-res | Studio mastering |
| 352,800 Hz | 176,400 Hz | DSD-equivalent | Audiophile niche |
| 384,000 Hz | 192,000 Hz | Maximum | Research, measurement |
Part 5: Data & Serialization Formats
~5,000 words covering 11 data formats
Data formats determine how structured information is stored and transmitted between systems. The right format depends on whether humans need to read it, how fast it needs to be parsed, whether a schema is required, and what ecosystem you are building for.
JSON Variants: JSON5, JSONC, JSON Lines
Standard JSON's lack of comments and trailing commas has spawned several variants that address common frustrations while maintaining JSON's simplicity.
JSONC (JSON with Comments) extends JSON with // and /* */ comments. Used by VS Code settings (settings.json), TypeScript (tsconfig.json), and ESLint configurations. JSONC parsers strip comments before parsing as standard JSON.
JSON5 extends JSON more aggressively: single-quoted strings, trailing commas, unquoted object keys, Infinity/NaN, hex numbers, and multiline strings. Babel accepts JSON5 syntax in its .babelrc configuration, and some build tools support it. JSON5 is a superset of JSON: all valid JSON is valid JSON5.
JSON Lines (JSONL, NDJSON) stores one JSON object per line, separated by newlines. This format is ideal for log files, streaming data, and large datasets because each line can be parsed independently without loading the entire file into memory. Tools like jq, DuckDB, and pandas can process JSONL files efficiently.
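A minimal sketch of the JSON Lines pattern using only the standard library (an in-memory buffer stands in for a real log file):

```python
import io
import json

records = [{"id": 1, "event": "login"}, {"id": 2, "event": "purchase"}]

# Write: one JSON object per line. Appending a record never requires
# re-reading or re-parsing what is already in the file.
buf = io.StringIO()
for rec in records:
    buf.write(json.dumps(rec) + "\n")

# Read: each line parses independently, so memory use stays constant
# no matter how large the file grows.
buf.seek(0)
parsed = [json.loads(line) for line in buf if line.strip()]
assert parsed == records
```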
JSON Merge Patch (RFC 7396) and JSON Patch (RFC 6902) define standard ways to express partial updates to JSON documents. JSON Merge Patch uses a simple merge semantic (set new values, null to delete). JSON Patch uses an array of operations (add, remove, replace, move, copy, test) for precise manipulation.
JSON: The Language of the Web
JSON (JavaScript Object Notation) was formalized by Douglas Crockford in 2001 and standardized as ECMA-404 and RFC 8259. It has become the universal data interchange format for the web, used by virtually every REST API, configuration file, and NoSQL database. JSON's success comes from its simplicity: it supports only six data types (string, number, boolean, null, array, object) with a grammar that fits on a business card.
JSON Syntax and Types
JSON values are one of: strings (double-quoted), numbers (integer or floating-point, no hex, no NaN/Infinity), booleans (true or false), null, arrays (ordered lists in square brackets), or objects (unordered key-value pairs in curly braces, keys must be strings).
Common pitfalls: JSON does not support comments (a frequent source of frustration for configuration files), trailing commas are invalid, single quotes are not valid (only double quotes), and there is no date type (dates are typically ISO 8601 strings). JSON numbers have no integer/float distinction and no precision guarantee (JavaScript Number.MAX_SAFE_INTEGER is 2^53 - 1; larger integers lose precision).
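These pitfalls are easy to demonstrate with any strict JSON parser (Python's standard library here):

```python
import json

# Comments, trailing commas, and single quotes are all rejected:
for text in ['{"a": 1,}', "{'a': 1}", '{"a": 1 /* comment */}']:
    try:
        json.loads(text)
    except json.JSONDecodeError:
        pass  # expected — each variant is invalid JSON

# Python keeps big integers exact, but any consumer that stores JSON
# numbers as IEEE-754 doubles (JavaScript, many databases) silently
# rounds them above 2^53 - 1:
big = 2**53 + 1                              # 9007199254740993
assert json.loads(json.dumps(big)) == big    # exact in Python
assert float(big) == float(2**53)            # the double-precision collision
```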
JSON Schema
JSON Schema is a vocabulary for describing and validating JSON data. It defines the expected structure, data types, required fields, value constraints (minimum, maximum, pattern), and relationships between properties. JSON Schema powers OpenAPI (Swagger) API documentation, form validation in React JSON Schema Form, and configuration validation in VS Code settings.
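To show what schema validation involves, here is a deliberately tiny validator covering only `required`, `type`, and `minimum` — a toy for illustration; real projects should use a full implementation such as the jsonschema package:

```python
def validate(instance: dict, schema: dict) -> list[str]:
    """Return a list of violations for a small JSON Schema subset."""
    type_map = {"string": str, "integer": int, "number": (int, float)}
    errors = []
    for field in schema.get("required", []):
        if field not in instance:
            errors.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        if field not in instance:
            continue
        value = instance[field]
        expected = type_map.get(rules.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{field}: expected {rules['type']}")
        elif "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{field}: below minimum {rules['minimum']}")
    return errors

schema = {
    "required": ["name", "age"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer", "minimum": 0}},
}
assert validate({"name": "Ada", "age": 36}, schema) == []
assert validate({"age": -1}, schema) == [
    "missing required field: name",
    "age: below minimum 0",
]
```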
XML: The Enterprise Veteran
XML (Extensible Markup Language) was standardized by the W3C in 1998 and dominated data interchange for a decade before JSON overtook it. XML uses a tag-based syntax similar to HTML, with strict nesting, required closing tags, and case-sensitive element names. XML is more verbose than JSON (an XML representation of the same data is typically 30-50% larger) but offers features JSON lacks.
When XML is Still Needed
XML remains the standard for: SOAP web services (enterprise APIs), SVG (vector graphics), XHTML (strict HTML), RSS/Atom feeds, Office documents (DOCX, XLSX are XML inside ZIP), Android layouts, Maven/Gradle configuration, SAML (authentication), and many government and healthcare standards (HL7 FHIR, NIEM). If you work in enterprise software, finance, healthcare, or government, you will encounter XML regularly.
XML's advantages over JSON: namespaces (allowing elements from different vocabularies to coexist), schemas (XSD provides far more powerful validation than JSON Schema), XSLT transforms (converting XML to HTML, PDF, or other XML), and comments. XML also supports mixed content (text interleaved with elements), which is natural for document markup.
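A small example of namespaces in action with Python's standard library parser (the feed content is invented for illustration; the namespace URIs are the real Atom and Dublin Core ones):

```python
import xml.etree.ElementTree as ET

# Two vocabularies coexist in one document, disambiguated by namespace URIs —
# something JSON has no native mechanism for.
doc = """<feed xmlns="http://www.w3.org/2005/Atom"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Format Guide</title>
  <dc:creator>Editorial Team</dc:creator>
</feed>"""

ns = {"atom": "http://www.w3.org/2005/Atom",
      "dc": "http://purl.org/dc/elements/1.1/"}
root = ET.fromstring(doc)
print(root.find("atom:title", ns).text)     # Format Guide
print(root.find("dc:creator", ns).text)     # Editorial Team
```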
YAML: Human-Friendly Configuration
YAML (originally "Yet Another Markup Language," now "YAML Ain't Markup Language") was created in 2001 as a human-readable data serialization format. It uses indentation instead of brackets, supports comments (# prefix), and infers types (true, 1, 1.5, null are automatically parsed as boolean, integer, float, null).
YAML Gotchas
YAML's flexibility creates notorious gotchas. The unquoted country code NO (for Norway) is parsed as the boolean false under YAML 1.1 rules. An unquoted 1.0 is a float, not a string, so version numbers silently lose trailing zeros (1.10 becomes 1.1). Indentation must use spaces, never tabs. Multiline strings come in six modes: two block styles (literal |, folded >) combined with three chomping indicators (strip, clip, keep). The "YAML Norway Problem" led YAML 1.2 to restrict boolean recognition to true/false, but many parsers still follow YAML 1.1 rules.
YAML dominates DevOps configuration: Kubernetes manifests, Docker Compose files, GitHub Actions workflows, Ansible playbooks, GitLab CI, CircleCI, and many more (Terraform is the notable exception, using its own HCL format). Its readability makes it the best format for configuration files that humans edit regularly.
CSV: The Simplest Data Exchange
CSV (Comma-Separated Values) has been used since the 1970s for tabular data exchange. Despite its apparent simplicity, CSV has surprising complexity: there is no universal standard (RFC 4180 is the closest), delimiter choice varies (comma, semicolon in European locales, tab), quoting rules differ between implementations, encoding is unspecified, and there is no type system (everything is a string).
CSV Encoding Issues
The most common CSV problem is encoding. Excel on Windows opens CSV files as Windows-1252 by default, mangling any UTF-8 characters. The fix: add a UTF-8 BOM (byte order mark, EF BB BF) at the beginning of the file, which tells Excel to use UTF-8. Alternatively, use TSV (tab-separated values), which Excel handles more consistently.
Other CSV pitfalls: numbers with leading zeros (like zip codes "00501") are converted to integers by Excel. Dates in ambiguous formats (01/02/03) are interpreted differently in US/EU locales. Fields containing commas, quotes, or newlines must be enclosed in double quotes, with internal quotes escaped by doubling (""). These edge cases cause countless data import bugs.
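The BOM fix and the quote-doubling rule are both easy to get right with the standard library's csv module (the utf-8-sig codec emits the BOM on write and strips it on read):

```python
import csv
import io

rows = [["zip", "note"], ["00501", 'contains, comma and "quote"']]

# Write with a UTF-8 BOM so Excel detects the encoding correctly.
buf = io.BytesIO()
text = io.TextIOWrapper(buf, encoding="utf-8-sig", newline="")
csv.writer(text).writerows(rows)
text.flush()
raw = buf.getvalue()

assert raw.startswith(b"\xef\xbb\xbf")               # BOM present (EF BB BF)
# The risky field is wrapped in quotes, internal quotes doubled (RFC 4180):
assert b'"contains, comma and ""quote"""' in raw

# Reading back restores the values exactly — leading zeros included,
# because csv never converts fields to numbers.
parsed = list(csv.reader(raw.decode("utf-8-sig").splitlines()))
assert parsed == rows
```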
JSON Performance Optimization
JSON parsing performance matters at scale. A typical REST API response is 5-50 KB of JSON; at 10,000 requests per second, that is 50-500 MB/s of JSON parsing. Several strategies can reduce the cost:
1. Use a fast parser. Standard JSON.parse in V8 is well-optimized (~1,800 ops/sec for 100KB payloads), but simdjson (SIMD-accelerated C++ parser with Node.js bindings) achieves 12,000 ops/sec — a 6.6x improvement. For Python, orjson is 10-20x faster than the standard json module.
2. Minify before transit. Removing whitespace typically reduces JSON size by 20-30%. Combined with GZIP or Brotli compression, a 100 KB formatted JSON response becomes approximately 12-18 KB over the wire. Most web frameworks minify JSON responses by default.
3. Avoid unnecessary data. The most effective JSON optimization is not sending data you do not need. Use field selection (GraphQL, sparse fieldsets in REST) to return only requested fields. A typical API response includes 40-60% of fields that the client ignores.
4. Use streaming parsers. For very large JSON files (10 MB+), streaming parsers (SAX-style) process the file incrementally without loading it entirely into memory. JSON Lines (NDJSON) is even better: each line is an independent JSON object, enabling line-by-line processing with any standard JSON parser.
5. Consider binary alternatives. For internal service-to-service communication where human readability is not needed, Protocol Buffers or FlatBuffers provide 3-10x smaller payloads and 20-50x faster parsing. The engineering cost is defining .proto schemas and generating code.
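Strategies 1 and 2 are easy to quantify with the standard library (the payload below is synthetic, so treat the exact numbers as illustrative):

```python
import gzip
import json

# A repetitive payload shaped like a typical list-of-records API response:
data = {"users": [{"id": i, "name": f"user{i}", "active": i % 2 == 0}
                  for i in range(500)]}

pretty = json.dumps(data, indent=2).encode()                 # formatted response
minified = json.dumps(data, separators=(",", ":")).encode()  # whitespace stripped
compressed = gzip.compress(minified)                         # over-the-wire form

print(len(pretty), len(minified), len(compressed))
assert len(minified) < len(pretty)            # minification always helps
assert len(compressed) < len(minified) // 3   # repeated keys compress very well
```

In practice, minify and compress at the transport layer (most frameworks and reverse proxies do both by default), and spend your optimization effort on strategy 3: not sending unneeded fields at all.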
TOML: Typed Configuration
TOML (Tom's Obvious, Minimal Language) was created by Tom Preston-Werner (GitHub co-founder) in 2013 as a configuration format that avoids YAML's complexity while adding explicit types. TOML distinguishes between strings, integers, floats, booleans, dates, times, arrays, and tables using clear syntax with no ambiguity.
TOML is the standard configuration format for the Rust ecosystem (Cargo.toml, rustfmt.toml), Python packaging (pyproject.toml), and is supported by many other tools. Because strings are always quoted and types are explicit, TOML is immune to the YAML Norway Problem and similar type-inference gotchas.
Protocol Buffers: Binary Speed
Protocol Buffers (Protobuf) is Google's binary serialization format, used internally for virtually all data storage and RPC communication at Google. Messages are defined in .proto files using a schema language, then compiled to language-specific code (C++, Java, Python, Go, etc.) for serialization and deserialization.
Protobuf produces messages 3-10x smaller than JSON and parses 20-100x faster. It achieves this through field numbering (instead of string keys), varint encoding for integers, binary representation of all values, and schema-driven encoding that omits default values. The trade-off is human readability: Protobuf binary data is not human-readable, and debugging requires schema-aware tools.
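The varint scheme is simple enough to sketch in a few lines (helper name ours):

```python
def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint: 7 payload bits per byte, least-significant
    group first; the high bit of each byte flags 'more bytes follow'."""
    out = bytearray()
    while True:
        byte = n & 0x7F               # low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes coming
        else:
            out.append(byte)          # final byte, high bit clear
            return bytes(out)

assert encode_varint(1) == b"\x01"          # small ints cost one byte, not four
assert encode_varint(300) == b"\xac\x02"    # the classic two-byte example
assert len(encode_varint(2**32 - 1)) == 5   # worst case for a 32-bit value
```

This is why field values like IDs, counts, and enum tags are so compact in Protobuf compared to fixed-width binary layouts or decimal strings in JSON.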
Protobuf is the serialization format for gRPC (Google's RPC framework), used by major companies including Google, Netflix, Square, Lyft, and CoreOS. If you need high-throughput, low-latency data exchange between services, Protobuf with gRPC is the current gold standard.
MessagePack: Binary JSON
MessagePack is a binary serialization format that is schema-less (like JSON) but more compact and faster to parse. It represents the same data types as JSON (strings, numbers, booleans, null, arrays, maps) but in a binary encoding that is typically 30-50% smaller than JSON. MessagePack is often described as "binary JSON" — it preserves JSON's flexibility while eliminating the overhead of text parsing.
MessagePack is popular in gaming (notably Unity projects), in infrastructure tools such as Fluentd (log transport) and Redis (Lua scripting), in real-time systems, and in any situation where JSON's human readability is not needed but its schema-less flexibility is. Libraries are available for 50+ programming languages.
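To see where the savings come from, here is a toy encoder for a tiny MessagePack subset (small maps, short strings, small non-negative ints only; use the real msgpack library in practice):

```python
import json

def pack(value) -> bytes:
    """Encode a value using MessagePack's fixmap/fixstr/fixint forms only."""
    if isinstance(value, dict) and len(value) <= 15:
        out = bytearray([0x80 | len(value)])      # fixmap: 1000xxxx
        for k, v in value.items():
            out += pack(k) + pack(v)
        return bytes(out)
    if isinstance(value, str) and len(value.encode()) <= 31:
        data = value.encode()
        return bytes([0xA0 | len(data)]) + data   # fixstr: 101xxxxx
    if isinstance(value, int) and 0 <= value <= 127:
        return bytes([value])                     # positive fixint: 0xxxxxxx
    raise ValueError("outside this sketch's subset")

obj = {"id": 7, "name": "ada"}
packed = pack(obj)
assert packed == b"\x82\xa2id\x07\xa4name\xa3ada"   # 14 bytes
print(len(packed), "vs", len(json.dumps(obj, separators=(",", ":"))))  # 14 vs 21
```

Type tags and length prefixes replace quotes, colons, and braces, so the parser never scans for delimiters — that is where both the size and speed advantages come from.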
INI: The Simplest Configuration Format
INI files (introduced in the early 1980s for MS-DOS) are the simplest configuration format: sections in square brackets, key-value pairs separated by = or :, and comments starting with ; or #. INI has no standard specification, which means every parser handles edge cases differently (does it support nested sections? multi-line values? escaping?).
Despite its limitations, INI persists in many contexts: Git configuration (.gitconfig), PHP configuration (php.ini), Python packaging (setup.cfg), systemd unit files, desktop entry files (.desktop on Linux), and Windows registry exports. For new projects, TOML is the recommended upgrade from INI — it adds explicit types, nested tables, and arrays while maintaining INI's readability.
.env: Environment Variable Files
The .env file format (popularized by the dotenv library) stores environment variables as KEY=VALUE pairs, one per line. It is the standard way to configure application secrets (API keys, database URLs) in development. The .env file is loaded into the process environment at startup and should never be committed to version control.
.env is not a proper data format — it has no specification, no types (everything is a string), no nesting, and inconsistent quote handling across implementations. Despite this, it has become universal: Node.js (dotenv), Python (python-dotenv), Ruby (dotenv-rails), Go (godotenv), and most frameworks support it. Docker and Docker Compose also read .env files. For sensitive configuration, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, Infisical) rather than .env files in production.
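The format is simple enough that a minimal parser fits in a dozen lines — which is also why implementations disagree on the edge cases. A sketch (real loaders such as python-dotenv also handle export prefixes, interpolation, and multiline values):

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines; skip blanks and '#' comments; strip matching quotes."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env

sample = """
# local development settings
DATABASE_URL=postgres://localhost/dev
API_KEY="not-a-real-key"
DEBUG=true
"""
env = parse_env(sample)
assert env["API_KEY"] == "not-a-real-key"   # quotes stripped
assert env["DEBUG"] == "true"               # everything is a string — no real types
```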
HCL: HashiCorp Configuration Language
HCL (HashiCorp Configuration Language) is used by Terraform, Packer, Vault, Consul, and other HashiCorp tools. It was designed to be more human-friendly than JSON while being more machine-friendly than YAML. HCL uses a block-based syntax with explicit types, expressions, and functions — making it closer to a programming language than a data format.
HCL supports interpolation (${var.name}), conditional expressions, for-each loops, and functions. This makes Terraform configurations readable and maintainable for infrastructure-as-code workflows. HCL can be parsed from and converted to JSON, which is useful for programmatic generation.
While HCL is specific to the HashiCorp ecosystem, its design philosophy — structured blocks with typed attributes — has influenced other configuration formats. The Pkl language (Apple, 2024), CUE (Google), and Dhall are similar configuration languages that validate structure at the format level rather than relying on external schema tools.
FlatBuffers: Zero-Copy Deserialization
FlatBuffers, developed by Google, takes a radically different approach to serialization. Instead of deserializing data into language-native objects (which requires allocating memory and copying bytes), FlatBuffers allows direct access to serialized data without unpacking. The receiver reads fields directly from the binary buffer using calculated offsets.
This zero-copy approach makes FlatBuffers the fastest serialization format for deserialization: 50,000+ operations per second compared to 15,000 for Protocol Buffers and 1,800 for JSON.parse. The trade-off is that FlatBuffers is more complex to use (you cannot easily inspect the data without the schema) and produces slightly larger messages than Protobuf because it includes offset tables.
FlatBuffers is used in game engines (Cocos2d-x adopted it for data serialization), in mobile apps (Facebook's Android app famously switched to FlatBuffers for local storage), and in any system where deserialization latency is critical. Google uses FlatBuffers for TensorFlow Lite model files (.tflite).
CBOR: The IoT Binary Format
CBOR (Concise Binary Object Representation, RFC 8949) is a binary data format designed for IoT and constrained environments. Like MessagePack, CBOR is a binary version of the JSON data model (maps, arrays, strings, numbers, booleans, null). Unlike MessagePack, CBOR is an IETF standard with well-defined extension points, deterministic encoding rules, and support for tags (typed values like dates, URIs, and big numbers).
CBOR is used in COSE (CBOR Object Signing and Encryption, the backbone of FIDO2/WebAuthn), CoAP (Constrained Application Protocol, HTTP for IoT), and CTAP2 (Client to Authenticator Protocol, the YubiKey protocol). Its small parser footprint (typically 1-5 KB of code) makes it suitable for microcontrollers with kilobytes of RAM.
For web developers, CBOR is most relevant through WebAuthn: the attestation and assertion data exchanged during passwordless authentication with FIDO2 security keys is encoded in CBOR. Understanding CBOR structure helps debug authentication failures in WebAuthn implementations.
SQLite: The Database as File Format
SQLite is not traditionally classified as a data format, but it deserves mention because it is increasingly used as one. A SQLite database is a single cross-platform file that can be read by any programming language with a SQLite library (essentially all of them). Unlike CSV or JSON, SQLite supports typed columns, indexes, transactions, and complex queries.
The "SQLite as a file format" approach is advocated by D. Richard Hipp (SQLite creator) and is used by numerous applications: Firefox (bookmarks, history, cookies), Chrome (history, cookies, web data), macOS Photos, WhatsApp, Signal, and many mobile apps. For datasets that are too large or complex for CSV but too small to justify a server-based database, SQLite is an excellent choice.
Recent developments have made SQLite even more attractive as a data format. Litestream enables real-time replication of SQLite databases to S3. Turso/LibSQL adds multi-tenant capabilities. sql.js brings SQLite to the browser via WebAssembly. DuckDB can directly query Parquet, CSV, and JSON files using SQL, blurring the line between data formats and databases.
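The pattern looks like this with Python's built-in sqlite3 module (the table and values are invented for illustration):

```python
import os
import sqlite3
import tempfile

# One ordinary file holds typed columns, constraints, and indexes —
# capabilities CSV and JSON lack.
path = os.path.join(tempfile.mkdtemp(), "bookmarks.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE bookmarks (url TEXT NOT NULL, visits INTEGER DEFAULT 0)")
con.executemany("INSERT INTO bookmarks VALUES (?, ?)",
                [("https://example.com", 12), ("https://example.org", 3)])
con.commit()
con.close()

# Any other process — or any other language with a SQLite driver —
# can open the same file and query it with SQL:
con = sqlite3.connect(path)
(total,) = con.execute("SELECT SUM(visits) FROM bookmarks").fetchone()
assert total == 15
con.close()
```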
Avro and Parquet: Big Data Formats
Apache Avro and Apache Parquet are designed for big data processing. Avro is a row-oriented format with an embedded schema, ideal for data serialization in Hadoop, Kafka, and streaming pipelines. Each Avro file contains its schema, making it self-describing and enabling schema evolution (adding/removing fields without breaking existing consumers).
Parquet is a columnar format, storing all values of each column together rather than all columns of each row. This enables dramatic compression (similar values in a column compress well) and selective column reading (query engines read only the columns needed, skipping the rest). Parquet is the standard format for data lakes, analytics engines (Spark, Presto, Athena, BigQuery), and data warehouses.
Rule of thumb: use Avro for write-heavy streaming workloads (Kafka, event sourcing) and Parquet for read-heavy analytical workloads (data lakes, reporting, dashboards).
Data Format Comparison
Data & Serialization Formats: Full Comparison
| Format | Year | Readable | Typed | Schema | Comments | Binary Data | Streaming | Parse Speed | File Size | Best For |
|---|---|---|---|---|---|---|---|---|---|---|
| JSON | 2001 | Yes | Basic (no int/float distinction) | JSON Schema | No | Base64 string | JSON Lines | Fast | Medium | APIs, config, web |
| XML | 1998 | Verbose | XSD types | XSD, DTD, RelaxNG | Yes | Base64 | SAX, StAX | Medium | Large | Enterprise, SOAP, config |
| YAML | 2001 | Best | Yes (inferred) | YAML Schema | Yes | !!binary | Multi-document | Slow | Small | Config, CI/CD, K8s |
| CSV | 1972 | Yes | No | RFC 4180 | No (convention) | No | Line-by-line | Very Fast | Small | Tabular data, spreadsheets |
| TSV | 1970 | Yes | No | IANA | No | No | Line-by-line | Very Fast | Small | Copy-paste from spreadsheets |
| TOML | 2013 | Yes | Yes | Taplo | Yes | No | No | Fast | Small | App config, Cargo.toml |
| INI | 1981 | Yes | No | None | Yes | No | No | Very Fast | Minimal | Simple key-value config |
| Protocol Buffers | 2008 | No (binary) | Yes | .proto files | Yes (.proto) | Native | Yes (delimited) | Fastest | Minimal | gRPC, high-perf APIs |
| MessagePack | 2008 | No (binary) | Yes | No standard | No | Native | Yes | Very Fast | Minimal | Binary JSON alternative |
| Avro | 2009 | No (binary) | Yes | Embedded | No | Native | Yes | Very Fast | Minimal | Hadoop, data pipelines |
| Parquet | 2013 | No (binary) | Yes | Embedded | No | Native | Row groups | Fast (columnar) | Minimal | Analytics, data lakes |
Data Format Popularity by Use Case
Data Format Adoption by Use Case (%)
Source: OnlineTools4Free Research
Key Finding
JSON dominates web APIs (92%) and mobile apps (85%). YAML dominates CI/CD (90%). Parquet dominates big data analytics (65%). No single format is best for everything.
Choose based on use case: JSON for APIs, YAML for config, CSV for tabular exchange, Protobuf for high-performance services, Parquet for analytics.
Format and Validate JSON
Use the tool below to format, validate, and minify JSON data. Paste your JSON to check its syntax, or minify it for production use.
Try it yourself
Json Formatter
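The core operations (pretty-printing, minifying, and compressing for transfer) are one-liners in most languages. A sketch using Python's standard library; this is illustrative, not how the tool above is implemented:

```python
import gzip
import json

data = {"users": [{"name": f"user{i}", "active": i % 2 == 0} for i in range(100)]}

pretty = json.dumps(data, indent=2)                 # human-readable
minified = json.dumps(data, separators=(",", ":"))  # no whitespace at all
gzipped = gzip.compress(minified.encode())          # what HTTP transfer adds

print(len(pretty), len(minified), len(gzipped))
```

Minifying strips only whitespace, so `json.loads` recovers the identical document; gzip then exploits the repetitive structure that remains.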
Data Format Parsing Performance
Parsing speed matters for APIs handling thousands of requests per second. The benchmark below tests serialization and deserialization of 10,000 records with 15 fields each. FlatBuffers is the fastest for deserialization (50,000 ops/sec) because it uses zero-copy access — the data is read directly from the buffer without unpacking. Protocol Buffers deserialize at 15,000 ops/sec. JSON.parse is respectable at 1,800 ops/sec, but simdjson (a SIMD-accelerated parser) reaches 12,000 ops/sec.
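Numbers like these come from micro-benchmarks of repeated serialize/deserialize cycles. A minimal sketch of the measurement pattern, using Python's json as the subject (this is not the harness that produced the table; one "op" here is one full 10,000-record document):

```python
import json
import time

records = [{"id": i, "name": f"item-{i}", "price": i * 0.5} for i in range(10_000)]

def ops_per_sec(fn, repeat=5):
    """Time fn() several times and report iterations/second for the best run."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return 1.0 / best

payload = json.dumps(records)
print(f"serialize:   {ops_per_sec(lambda: json.dumps(records)):.1f} ops/sec")
print(f"deserialize: {ops_per_sec(lambda: json.loads(payload)):.1f} ops/sec")
```

Taking the best of several runs (rather than the mean) reduces noise from GC pauses and OS scheduling, which is standard practice in micro-benchmarking.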
Data Format Parsing Performance (ops/sec, 10K records)
12 rows
| Format | Serialize ops/s | Deserialize ops/s | Size (KB) | Language |
|---|---|---|---|---|
| JSON (JSON.parse) | 1200 | 1800 | 2450 | JavaScript |
| JSON (simdjson) | N/A (parse-only) | 12000 | 2450 | C++ |
| XML (SAX) | 400 | 2200 | 4200 | Java |
| XML (DOM) | 400 | 600 | 4200 | Java |
| YAML | 200 | 350 | 2100 | Python |
| CSV | 5000 | 8000 | 1200 | Python |
| Protocol Buffers | 8000 | 15000 | 480 | C++ |
| MessagePack | 4500 | 9000 | 650 | Go |
| Avro | 6000 | 10000 | 520 | Java |
| TOML | 1000 | 1500 | 1800 | Rust |
| CBOR | 5500 | 11000 | 590 | Go |
| FlatBuffers | 12000 | 50000 | 510 | C++ |
Data Format Size Comparison
How much space does the same data take in different formats? Using 1,000 KB of JSON as a baseline, the chart below shows relative sizes. Parquet is the smallest at 15% of JSON size, thanks to columnar compression. Protocol Buffers are 28%, MessagePack is 35%. Even plain gzipped JSON is only 18% of the original. XML is 45% larger than JSON due to verbose tag syntax.
Data Format Size Comparison (1MB JSON baseline)
12 rows
| Format | Size (KB) | Ratio vs JSON | Readable | Schema Required |
|---|---|---|---|---|
| JSON | 1000 | 1.00x | Yes | No |
| JSON (minified) | 720 | 0.72x | No | No |
| JSON (gzipped) | 180 | 0.18x | No | No |
| XML | 1450 | 1.45x | Yes | No |
| YAML | 860 | 0.86x | Yes | No |
| CSV | 420 | 0.42x | Yes | No |
| MessagePack | 350 | 0.35x | No | No |
| Protocol Buffers | 280 | 0.28x | No | Yes |
| Avro | 310 | 0.31x | No | Yes |
| Parquet | 150 | 0.15x | No | Yes |
| CBOR | 340 | 0.34x | No | No |
| FlatBuffers | 320 | 0.32x | No | Yes |
Code Example: JSON Schema Validation
JSON Schema validates the structure and values of JSON documents. The example below defines a schema with name (non-empty string), email (valid format), and age (integer 0-150); only name and email are required. This schema can power form validation, API request validation, and configuration file checking.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"name": { "type": "string", "minLength": 1 },
"email": { "type": "string", "format": "email" },
"age": { "type": "integer", "minimum": 0, "maximum": 150 }
},
"required": ["name", "email"]
}
Part 6: Archive & Compression Formats
~4,000 words covering archives and compression algorithms
Archive formats bundle multiple files into one, often with compression to reduce total size. Compression algorithms reduce the size of individual files or data streams. Some formats (ZIP, 7z, RAR) combine archiving and compression. Others (GZIP, Brotli, ZSTD) are pure compression formats that operate on single streams, typically paired with TAR for multi-file archiving.
Compression Concepts: Dictionary, Entropy, and Window
Before diving into specific formats, understanding three core compression concepts helps explain why different algorithms perform differently.
Dictionary-based compression (LZ77/LZ78 family) works by finding repeated sequences in the data and replacing subsequent occurrences with references to earlier ones. The "dictionary" is built from the data itself as it is processed. Longer repeated sequences and more recent occurrences produce better compression. The window size determines how far back the encoder can look for matches: DEFLATE uses 32 KB, Brotli uses 16 MB, ZSTD uses up to 128 MB.
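The window-size limit is easy to observe with the Python standard library. Below, 40 KB of random data repeated once is effectively incompressible to DEFLATE, because the repeat lies beyond its 32 KB window, while LZMA's much larger window catches the duplication:

```python
import lzma
import os
import zlib

# 40 KB of random bytes, repeated once: the second copy starts 40 KB back,
# beyond DEFLATE's 32 KB window but well inside LZMA's.
block = os.urandom(40 * 1024)
payload = block + block

deflate_size = len(zlib.compress(payload, 9))   # ~80 KB: cannot see the repeat
lzma_size = len(lzma.compress(payload, preset=9))  # ~40 KB: deduplicates it
print(deflate_size, lzma_size)
```

The same effect explains why Brotli and ZSTD outperform GZIP on large files with long-range redundancy, such as concatenated logs or database dumps.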
Entropy coding (Huffman, ANS, arithmetic) assigns shorter binary codes to more frequent symbols and longer codes to rarer symbols. In English text, 'e' appears far more often than 'z', so Huffman coding assigns 'e' a 3-bit code and 'z' a 12-bit code. The theoretical minimum encoding length is the Shannon entropy of the data, measured in bits per symbol. Modern entropy coders (ANS, arithmetic) approach this limit within 0.01%.
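Shannon entropy is straightforward to compute. A small Python helper, illustrative rather than tied to any particular codec:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Average information content in bits per symbol: the theoretical
    lower bound that entropy coders approach."""
    total = len(data)
    return sum(
        -(n / total) * math.log2(n / total) for n in Counter(data).values()
    )

print(shannon_entropy(b"abababab"))        # 1.0 -> one bit per symbol
print(shannon_entropy(bytes(range(256))))  # 8.0 -> every byte equally likely
```

A file of uniformly random bytes has 8 bits of entropy per byte, which is exactly why it cannot be compressed: there are no statistical patterns left to exploit.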
Context modeling improves compression by using surrounding data to predict the next symbol. In English text after "th", the next letter is most likely 'e', 'a', or 'i'. A context-aware encoder uses different probability tables for different contexts, achieving better compression than a single global table. Brotli and ZSTD both use context modeling; DEFLATE and LZ4 do not.
All practical compression algorithms combine dictionary-based and entropy coding stages. The dictionary stage exploits exact repetitions (structural redundancy), and the entropy stage exploits statistical patterns (statistical redundancy). The compression ratio is bounded by the data's intrinsic entropy: truly random data cannot be compressed at all.
ZIP: The Universal Archive
ZIP was created by Phil Katz in 1989 and is the most widely supported archive format in the world. Windows, macOS, Linux, iOS, and Android all include native ZIP support. No third-party software is needed to create or extract ZIP files on any major operating system.
ZIP uses DEFLATE compression by default, which provides a good balance of compression ratio and speed. Each file in a ZIP archive is compressed independently, allowing random access to individual files without decompressing the entire archive. This is a key advantage over solid archives (7z, RAR) but comes at the cost of lower compression ratio, since the encoder cannot exploit similarities between files.
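Python's standard library makes the per-file independence easy to see: a single member can be read without decompressing its siblings. A minimal sketch using an in-memory archive:

```python
import io
import zipfile

buf = io.BytesIO()

# Write three independently compressed members.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("readme.txt", "hello " * 1000)
    zf.writestr("data/a.csv", "id,name\n1,alpha\n2,beta\n")
    zf.writestr("data/b.csv", "id,name\n3,gamma\n")

# Random access: only the requested member is decompressed.
with zipfile.ZipFile(buf) as zf:
    one = zf.read("data/a.csv").decode()
print(one)
```

This is the property that makes ZIP suitable as a container format for other formats: DOCX, XLSX, JAR, EPUB, and APK are all ZIP archives whose members are read individually.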
ZIP64 Extensions
The original ZIP format was limited to 4 GB per file and 65,535 files per archive. ZIP64 extensions (supported since 2001) remove these limits, allowing individual files up to 16 exabytes and unlimited file count. Modern ZIP tools use ZIP64 automatically when needed.
ZIP Encryption
ZIP supports two encryption methods: the original PKZIP encryption (known as "traditional" or "ZipCrypto") and AES-256 encryption. ZipCrypto is weak and can be cracked within minutes; never use it for sensitive data. AES-256 ZIP encryption (supported by WinZip, 7-Zip, and macOS Archive Utility) is cryptographically secure but not supported by all tools.
GZIP: Web Compression Workhorse
GZIP was created in 1992 by Jean-loup Gailly and Mark Adler as a free replacement for the Unix compress program. It uses the same DEFLATE algorithm as ZIP but operates on single files. GZIP is the foundation of web compression: HTTP Content-Encoding: gzip compresses HTML, CSS, JavaScript, and JSON responses, typically reducing transfer sizes by 60-80%.
For multi-file archiving on Unix/Linux, GZIP is paired with TAR: first TAR creates an uncompressed archive of multiple files, then GZIP compresses the entire archive. The result is a .tar.gz (or .tgz) file. This two-step approach is standard in the Unix world and produces better compression than ZIP because the solid archive allows GZIP to exploit inter-file redundancy.
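The typical 60-80% reduction on markup is easy to reproduce with Python's gzip module. A sketch on synthetic HTML; actual savings depend on the content:

```python
import gzip

# Repetitive markup, as typical HTML is: repeated tags and attributes.
html = ('<li class="item"><a href="/page">entry</a></li>\n' * 500).encode()

compressed = gzip.compress(html, compresslevel=6)  # level 6 is the web default
ratio = len(html) / len(compressed)
print(f"{len(html)} -> {len(compressed)} bytes ({ratio:.1f}x)")
```

Level 6 is the common server default because higher levels cost noticeably more CPU for only marginally smaller output on typical web payloads.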
Brotli: Google's Better Compression
Brotli was developed by Google and standardized as RFC 7932 in 2016. It achieves 15-25% better compression than GZIP with comparable decompression speed. Brotli is supported by all modern browsers for HTTP Content-Encoding: br and has become the preferred compression for static web assets.
Brotli Compression Levels
Brotli supports levels 0-11. Levels 0-4 are fast and suitable for dynamic content (similar speed to GZIP with better ratio). Levels 5-9 offer a good balance for CDN pre-compression. Levels 10-11 are extremely slow (minutes for large files) but achieve the best ratios, suitable only for pre-compressed static assets that are compressed once and served millions of times.
Brotli's secret weapon is a built-in dictionary of common web content (HTML tags, CSS properties, JavaScript keywords, HTTP headers). This dictionary gives Brotli a significant advantage over GZIP for web assets specifically, because common strings do not need to be encoded from scratch.
Zstandard: The Modern All-Rounder
Zstandard (ZSTD) was created by Yann Collet at Facebook and standardized as RFC 8878 in 2021. It is arguably the most important compression algorithm of the 2010s, offering GZIP-level compression ratios at 5-10x the speed, or significantly better ratios at the same speed. ZSTD supports negative "fast" levels (trading ratio for near-LZ4 speed) up through level 22 (slower than LZMA, comparable ratio).
At level 3 (the default), ZSTD compresses at 450 MB/s with a 2.9:1 ratio, versus GZIP's 85 MB/s at 2.5:1. At level 19, ZSTD achieves a 3.6:1 ratio at 5 MB/s, approaching LZMA's 4.5:1 ratio. Decompression is always fast: 1,200 MB/s regardless of compression level.
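The level trade-off applies to every dictionary coder. It can be illustrated with zlib from the Python standard library, used here as a stand-in since ZSTD bindings are a third-party package; the shape of the curve is what matters, not the absolute numbers:

```python
import time
import zlib

# Repetitive, log-like sample data, where level choice is visible.
data = b"".join(
    b"2026-01-02T03:04:%02d INFO worker-7 request handled in 12ms\n" % (i % 60)
    for i in range(20_000)
)

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(data) / len(out):6.1f}:1 in {elapsed * 1000:.1f} ms")
```

Higher levels spend more time searching for longer dictionary matches; the ratio improves while throughput drops, exactly the pattern in the ZSTD and Brotli level tables above.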
ZSTD is used in the Linux kernel (for Btrfs, zram, and kernel image compression, among others), at Facebook (for log compression, database backups, and storage), and is increasingly adopted as a GZIP replacement for system packages and backups. HTTP Content-Encoding: zstd support is growing in browsers and CDNs.
7z: Maximum Compression
7z is the archive format of the 7-Zip project, using LZMA2 compression by default. LZMA2 achieves the best compression ratios of any general-purpose algorithm: typically 20-40% smaller than ZIP and 10-20% smaller than RAR at maximum settings. The 7z format supports solid archiving, AES-256 encryption (with header encryption), Unicode filenames, and multi-volume archives.
The trade-off is speed and memory: LZMA2 compression at high settings requires 200+ MB of RAM and is 10x slower than DEFLATE. Decompression is much faster than compression but still slower than GZIP's. For distributing large files where download bandwidth is the bottleneck, 7z's superior compression saves significant time and cost.
RAR: Proprietary but Feature-Rich
RAR is a proprietary archive format created by Eugene Roshal (the "R" in RAR). WinRAR is shareware (famously, the trial never actually expires). RAR5 (current version) offers compression ratios between ZIP and 7z, with unique features including recovery records (redundant data that allows repairing damaged archives) and recovery volumes (separate parity files for multi-volume archives).
RAR's recovery records are its primary advantage: you can allocate 1-10% of the archive size to redundancy data, which can repair corruption from bad sectors, incomplete downloads, or transmission errors. No other common archive format offers this. However, RAR is proprietary (compression requires WinRAR; decompression is available via unrar), making it unsuitable for open-source projects and automated pipelines.
TAR: The Archive Without Compression
TAR (Tape Archive) was created in 1979 for writing data to magnetic tape. It bundles multiple files and directories into a single stream while preserving Unix file permissions, ownership, timestamps, and symbolic links. TAR does not compress; it is always paired with a compression tool: tar.gz (with GZIP), tar.bz2 (with Bzip2), tar.xz (with XZ/LZMA2), or tar.zst (with Zstandard).
This separation of concerns (archiving vs compression) is the Unix philosophy in action: each tool does one thing well, and tools compose via pipes. The advantage is flexibility: you can use any compression algorithm with TAR. The disadvantage is that extracting a single file requires decompressing the entire archive (since the archive is compressed as a solid stream).
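The composition is visible in standard tooling. Python's tarfile, for example, selects the compressor via the mode string; a minimal in-memory sketch:

```python
import io
import tarfile

buf = io.BytesIO()

# "w:gz" composes TAR archiving with GZIP compression in one step,
# like `tar -czf`; "w:bz2" or "w:xz" swap in a different compressor.
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, text in [("app/config.toml", "debug = false\n"),
                       ("app/main.py", "print('hi')\n")]:
        data = text.encode()
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
print(names)
```

Note that listing the names already required decompressing the stream up to each header, illustrating the solid-stream trade-off described above.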
Compression Algorithm Benchmarks
The table below benchmarks 12 compression algorithms on the Silesia corpus (a standard benchmark dataset). Compression ratio is uncompressed size divided by compressed size. Speed is measured in MB/s on a single core of a modern CPU.
Compression Algorithm Benchmarks (Silesia Corpus)
12 rows
| Algorithm | Ratio | Compress (MB/s) | Decompress (MB/s) | Memory (MB) |
|---|---|---|---|---|
| DEFLATE (zlib) | 2.5 | 85 | 320 | 0.3 |
| GZIP -9 | 2.7 | 25 | 320 | 0.3 |
| Brotli -6 | 3.2 | 30 | 350 | 4 |
| Brotli -11 | 3.8 | 2 | 350 | 256 |
| ZSTD -3 | 2.9 | 450 | 1200 | 1 |
| ZSTD -19 | 3.6 | 5 | 1200 | 128 |
| LZMA2 | 4.5 | 8 | 150 | 200 |
| LZ4 | 2.1 | 780 | 4300 | 0.016 |
| LZ4 HC | 2.7 | 40 | 4300 | 0.064 |
| Snappy | 1.8 | 550 | 1800 | 0.032 |
| RAR5 | 4 | 12 | 180 | 128 |
| Bzip2 | 3.5 | 15 | 55 | 8 |
Compression Ratio vs Speed
The scatter plot below visualizes the fundamental trade-off in compression: faster algorithms achieve lower ratios, while better ratios require slower algorithms. LZ4 is the fastest but compresses least. LZMA2 compresses most but is the slowest. Zstandard offers the best balance, sitting in the sweet spot where reasonable speed meets good compression.
Compression Ratio vs Speed (MB/s)
Source: OnlineTools4Free Research
Key Finding
Zstandard (ZSTD) offers the best balance of compression ratio and speed. At default settings, it compresses 5x faster than GZIP with 15% better ratio.
ZSTD is the recommended replacement for GZIP in most scenarios: system packages, backups, log compression, and increasingly HTTP compression.
Archive Format Comparison
Archive & Compression Formats
10 rows
| Format | Year | Algorithm | Ratio | Encryption | Solid | Streamable | OS Support | Free | Best For |
|---|---|---|---|---|---|---|---|---|---|
| ZIP | 1989 | DEFLATE | 2.5:1 | AES-256 | No | Yes | Universal | Yes | Universal sharing |
| GZIP | 1992 | DEFLATE | 2.5:1 | No | N/A | Yes | Universal | Yes | HTTP compression, tar.gz |
| 7Z | 1999 | LZMA2 | 4.5:1 | AES-256 | Yes | No | Widespread | Yes | Maximum compression |
| RAR | 1993 | RAR5 | 4.0:1 | AES-256 | Yes | No | Widespread | No | Recovery records |
| TAR | 1979 | None | 1:1 | No | Yes | Yes | Unix/Linux | Yes | Archiving (with gzip/bz2) |
| Brotli | 2015 | Brotli | 3.2:1 | No | N/A | Yes | Widespread | Yes | HTTP compression (static) |
| Zstandard | 2016 | ZSTD | 3.0:1 | No | N/A | Yes | Growing | Yes | Real-time compression |
| XZ | 2009 | LZMA2 | 4.5:1 | No | N/A | Yes | Unix/Linux | Yes | Linux packages |
| BZ2 | 1996 | Burrows-Wheeler | 3.5:1 | No | N/A | Yes | Unix/Linux | Yes | Good ratio + streaming |
| LZ4 | 2011 | LZ4 | 2.1:1 | No | N/A | Yes | Widespread | Yes | Speed-critical compression |
Compression Ratio by Content Type
Compression effectiveness varies dramatically based on content type. Log files compress extremely well (6-8x) because they contain highly repetitive patterns. HTML and CSS compress well (4.5-6x) due to repeated tags and property names. Binary executables compress poorly (2-3x) because they have high entropy. Already-compressed files (JPEG, MP4) gain essentially zero benefit from additional compression.
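This spread is easy to reproduce with any codec. A sketch using zlib, with random bytes standing in for already-compressed content (which is statistically indistinguishable from random data):

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Compression ratio: uncompressed size / compressed size."""
    return len(data) / len(zlib.compress(data, 6))

log_lines = (
    b"Jan  2 03:04:05 host sshd[1234]: Accepted publickey for deploy\n" * 2000
)
random_bytes = os.urandom(100_000)  # proxy for JPEG/MP4 payload bytes

print(f"log-like text:      {ratio(log_lines):.1f}:1")
print(f"high-entropy bytes: {ratio(random_bytes):.2f}:1")
```

On the random input the ratio actually dips slightly below 1.0, because the codec must still emit framing overhead; this is why re-compressing JPEGs or MP4s inside a ZIP gains nothing.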
Compression Ratio by Content Type (ratio = uncompressed / compressed)
10 rows
| Content Type | GZIP | Brotli | ZSTD | LZ4 | LZMA |
|---|---|---|---|---|---|
| HTML | 4.8 | 5.5 | 5.2 | 3.2 | 6 |
| CSS | 5.2 | 6 | 5.8 | 3.5 | 6.5 |
| JavaScript | 4.5 | 5.2 | 5 | 3 | 5.8 |
| JSON API response | 4.2 | 4.8 | 4.6 | 2.8 | 5.2 |
| Source code (mixed) | 3.8 | 4.5 | 4.2 | 2.6 | 5 |
| Log files (syslog) | 6.5 | 7.2 | 7 | 4.5 | 8 |
| Database dump (SQL) | 5.5 | 6.2 | 6 | 3.8 | 6.8 |
| CSV data | 4 | 4.6 | 4.4 | 2.6 | 5 |
| Binary executable | 2.2 | 2.5 | 2.4 | 1.8 | 2.8 |
| Already-compressed (JPEG) | 1.02 | 1.02 | 1.01 | 1 | 1.03 |
HTTP Compression Adoption (2015-2026)
The web's transition from GZIP to Brotli has been gradual but steady. GZIP dominated until 2020, when Brotli crossed 28% adoption. By 2026, Brotli has overtaken GZIP (48% vs 40%). Zstandard (ZSTD) for HTTP is emerging at 11% adoption, primarily driven by CDN providers. The remaining 1% serves uncompressed content, which is a significant missed optimization opportunity.
HTTP Content-Encoding Usage on Top 10M Websites (%)
Source: OnlineTools4Free Research
Part 7: Font Formats
~3,000 words covering 6 font formats
Font formats determine how typeface data (glyph outlines, metrics, hinting instructions, kerning pairs) is stored and delivered. On the web, the format directly impacts page load performance: fonts are render-blocking resources that delay text display until downloaded.
Why Font Format Choices Impact Performance
Fonts are render-blocking resources: the browser will not paint text until it has downloaded and parsed the font file (or a timeout triggers the fallback). A poorly optimized font setup can add 2-4 seconds to text rendering on slow connections. The format, size, loading strategy, and subsetting all affect how quickly text appears.
The performance-optimal font loading strategy in 2026 involves five techniques: (1) WOFF2 format for 50-60% smaller files than TTF, (2) variable fonts to replace multiple static font files with one, (3) unicode-range subsetting to load only needed character sets, (4) font-display: swap for immediate text rendering, and (5) preloading the primary font with <link rel="preload" as="font" type="font/woff2" crossorigin>.
Self-hosted fonts are generally faster than Google Fonts because they eliminate a DNS lookup, TCP connection, and TLS handshake to fonts.googleapis.com. However, Google Fonts provides automatic format negotiation (serving WOFF2 where supported), automatic subsetting based on CSS unicode-range, and CDN distribution. For performance-critical sites, self-host the specific subset of WOFF2 files you need.
Color Fonts and Emoji
Color fonts embed multi-color glyphs, enabling emoji, brand logos, and decorative text in a single font file. Four competing color font technologies exist:
COLR/CPAL (v0 and v1): Uses layered vector shapes with a color palette. COLR v0 (flat colors) is supported by all browsers. COLR v1 (gradients, compositing, transformations) is supported by Chrome 98+ and Firefox 107+. Google Fonts uses COLR v1 for its Noto Color Emoji font, producing much smaller files than bitmap approaches.
CBDT/CBLC: Uses embedded PNG bitmaps at multiple resolutions. This is what Android uses for emoji. Files are large (10+ MB for a full emoji set) because each emoji is stored as multiple PNGs (18x18, 36x36, 72x72, 144x144 pixels).
sbix: Apple's color font format, also using embedded PNGs. Used by Apple Color Emoji, the font shipped on every Mac and iPhone. Even larger than CBDT because Apple stores high-resolution bitmaps.
SVG-in-OpenType: Embeds SVG documents as glyphs. Supported by Firefox and Safari but not Chrome. Produces the most flexible results (arbitrary SVG with gradients, filters, animations) but the largest files and slowest rendering.
For the web in 2026, COLR v1 is the recommended color font technology. It produces the smallest files (Noto Color Emoji is 9.4 MB in CBDT but only 1.85 MB in COLR v1), supports gradients and compositing, and has growing browser support. System emoji fonts are typically 20-40 MB, which is why browsers load them from the OS rather than downloading them.
WOFF2: The Web Font Standard
WOFF2 (Web Open Font Format 2) is the current standard for web fonts. It uses Brotli compression to achieve 30% smaller files than WOFF and 50-60% smaller than raw TTF/OTF. A typical Latin-character font is 15-25 KB in WOFF2, versus 40-60 KB in TTF. WOFF2 is supported by 98% of global browsers (all modern browsers since 2018).
WOFF2 is simply a wrapper around TTF or OTF font data with Brotli compression applied to the font tables. The font data itself (glyph outlines, metrics, features) is identical. Browsers decompress WOFF2 and use the font data exactly as they would raw TTF/OTF. There is no quality difference; only file size and load time benefit.
WOFF: The First Web Font Format
WOFF (Web Open Font Format) was standardized in 2010 as the first purpose-built web font format. It uses zlib (DEFLATE) compression, achieving about 40% smaller files than raw TTF/OTF. WOFF is supported by 99% of browsers, including older versions that lack WOFF2 support.
In 2026, WOFF serves primarily as a fallback for the rare browsers that do not support WOFF2. Include both in your @font-face declaration: format("woff2") first, format("woff") second. The browser will download only the first format it supports.
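A typical declaration looks like this (the file paths and family name are placeholders):

```css
@font-face {
  font-family: 'ExampleSans';
  src: url('/fonts/example-sans.woff2') format('woff2'),
       url('/fonts/example-sans.woff') format('woff');
  font-weight: 400;
  font-style: normal;
  font-display: swap;
}
```

Order matters: the browser downloads the first source whose format it supports, so woff2 must come first.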
TTF and OTF: Desktop Fonts
TrueType (TTF, 1991, Apple/Microsoft) and OpenType (OTF, 1996, Microsoft/Adobe) are the standard desktop font formats. OpenType is technically a superset of TrueType, using either TrueType outlines (quadratic Bezier curves, .ttf extension) or CFF outlines (cubic Bezier curves, .otf extension). OpenType adds support for advanced typographic features: ligatures, small caps, old-style numerals, stylistic alternates, contextual alternates, and more.
For desktop use (word processors, design tools), TTF and OTF are interchangeable for basic functionality. OTF with CFF outlines typically produces slightly smaller files and better rendering at small sizes on some platforms. For web use, always convert TTF/OTF to WOFF2.
Variable Fonts: One File, Every Weight
Variable fonts (OpenType 1.8, 2016) contain a single set of glyph outlines plus mathematical instructions to interpolate between design extremes along defined axes. Instead of loading separate files for Regular, Medium, Semibold, Bold, and Black (5 files, ~100 KB WOFF2 each = 500 KB), a single variable font file covers all weights in ~80-120 KB.
Standard Axes
The OpenType specification defines five registered axes: wght (weight, 100-900), wdth (width, 75-125), ital (italic, 0-1), slnt (slant, -90 to 90 degrees), and opsz (optical size, adapting design for different point sizes). Font designers can also create custom axes for any variation they choose.
CSS supports variable fonts via the font-variation-settings property for fine-grained control, or via standard properties like font-weight: 650 (any integer from 1-1000, not just 100/200/.../900). Google Fonts now serves variable fonts by default for supported families, automatically reducing page weight.
EOT: The IE-Only Format
EOT (Embedded OpenType) is Microsoft's proprietary web font format for Internet Explorer, submitted to the W3C in 2008. It was the only web font format supported by IE 6-8 and was required for IE compatibility until IE was retired in 2022. In 2026, there is zero reason to include EOT in your @font-face declarations. Remove it to simplify your CSS and reduce maintenance burden.
Font Loading Strategies
How fonts are loaded affects both performance and user experience. The CSS font-display property controls behavior while fonts load:
font-display: swap — shows text immediately in a fallback font, then swaps to the custom font when loaded. Best for body text where readability during load is critical. Causes a "Flash of Unstyled Text" (FOUT).
font-display: optional — shows text in fallback font if the custom font does not load within ~100ms. The font is still downloaded for subsequent page loads. Best for non-critical decorative fonts. Eliminates layout shift.
font-display: block — hides text for up to 3 seconds while waiting for the font. Causes a "Flash of Invisible Text" (FOIT). Generally not recommended because invisible text is worse than fallback-styled text.
font-display: fallback — blocks for ~100ms, then falls back. If the font loads within ~3s, it swaps. After 3s, fallback persists. A middle ground between swap and optional.
For optimal Core Web Vitals, use font-display: swap for primary text fonts and preload them with <link rel="preload" as="font" type="font/woff2" crossorigin>. For decorative or icon fonts, use font-display: optional.
Font Format Comparison
Font Formats: Comparison
6 rows
| Format | Year | Compression | Avg Size (KB) | Browser % | Variable | Color | Best For |
|---|---|---|---|---|---|---|---|
| WOFF2 | 2018 | Brotli | 18 | 98% | Yes | Yes | Modern web fonts |
| WOFF | 2010 | zlib | 28 | 99% | Yes | Yes | Web font fallback |
| TTF | 1991 | None | 45 | 99% | Yes | Limited | Desktop applications |
| OTF | 1996 | CFF | 40 | 99% | Yes | Yes | Design, advanced typography |
| EOT | 2008 | LZ | 35 | IE only | No | No | Legacy IE support |
| SVG Font | 2001 | None | 120 | <5% | No | Yes | Nothing (obsolete) |
Font File Size by Format
Font File Size by Format (KB)
Source: OnlineTools4Free Research
Key Finding
WOFF2 reduces font file size by 50-60% compared to TTF, and 30% compared to WOFF. A single variable font file can replace 10-20 static font files, reducing total payload by 70-90%.
Always serve WOFF2 with WOFF fallback. Consider variable fonts for sites using multiple weights of the same family.
Font Subsetting Savings
Subsetting removes unused glyphs from a font file. A full Inter Variable font with all 2,548 glyphs is 98 KB in WOFF2. Subsetting to Latin characters only (230 glyphs) reduces it to 18 KB — an 82% savings. For a site that only needs digits and basic punctuation (25 glyphs), the font shrinks to just 4 KB.
Font Subsetting Savings (Inter Variable, WOFF2)
5 rows
| Subset | Glyphs | Size (KB) | Savings % |
|---|---|---|---|
| Full font (all glyphs) | 2548 | 98 | 0 |
| Latin + Latin Extended | 420 | 28 | 71 |
| Latin only | 230 | 18 | 82 |
| US-ASCII only | 95 | 10 | 90 |
| Digits + basic punctuation | 25 | 4 | 96 |
Variable Font Axis Impact on File Size
Each additional variation axis increases the variable font file size, but the savings compared to static font files grow even faster. A weight-only variable font (82 KB) replaces 9 static files (162 KB total). Adding width and italic axes increases the variable font to 155 KB but replaces 54 static files totaling 972 KB — an 84% reduction. The break-even point is typically 3-4 weights of the same family.
Variable Font Savings vs Static Font Files
4 rows
| Axes | Static Files Replaced | Static Total (KB) | Variable (KB) | Savings |
|---|---|---|---|---|
| Weight only (wght) | 9 | 162 | 82 | 49% |
| Weight + Width (wght, wdth) | 27 | 486 | 110 | 77% |
| Weight + Width + Italic | 54 | 972 | 155 | 84% |
| Weight + Width + Italic + opsz | 108 | 1944 | 195 | 90% |
Code Example: Variable Font @font-face
The CSS below shows how to load a variable font with weight axis support, unicode-range subsetting, and optimal font-display strategy.
@font-face {
font-family: 'Inter';
src: url('/fonts/Inter-Variable.woff2') format('woff2');
font-weight: 100 900;
font-style: normal;
font-display: swap;
unicode-range: U+0000-00FF, U+0131, U+0152-0153;
}
Part 8: 3D & Specialized Formats
~3,000 words covering 10 specialized formats
Beyond the common categories of images, documents, video, and audio, specialized formats serve specific industries: 3D modeling and printing, medical imaging, scientific data, and film production. Understanding these formats is essential for anyone working in these domains.
STL: The 3D Printing Standard
STL (Stereolithography) was created by 3D Systems in 1987 for their stereolithography 3D printers. It describes 3D objects as a collection of triangulated surfaces (triangle meshes) with no color, texture, or material information. Every 3D printer in the world accepts STL files, making it the de facto standard for 3D printing.
STL exists in two formats: ASCII (human-readable but enormous) and binary (compact, preferred). A typical 3D model is 1-50 MB in binary STL. The format stores only surface geometry as triangles, so it cannot represent curves exactly, internal structures, color, or materials. For these features, use 3MF (3D Manufacturing Format), which is slowly replacing STL for advanced 3D printing.
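The binary layout is simple enough to read and write with Python's struct module. A minimal sketch that handles only the fixed 50-byte triangle records and writes a zero attribute word:

```python
import struct

# Binary STL layout: 80-byte header, uint32 triangle count, then 50 bytes
# per triangle: normal + 3 vertices as float32 triples, plus a uint16.
TRI = struct.Struct("<12fH")

def write_stl(triangles):
    blob = b"\x00" * 80 + struct.pack("<I", len(triangles))
    for normal, v1, v2, v3 in triangles:
        blob += TRI.pack(*normal, *v1, *v2, *v3, 0)
    return blob

def read_stl(blob):
    count = struct.unpack_from("<I", blob, 80)[0]
    tris = []
    for i in range(count):
        values = TRI.unpack_from(blob, 84 + i * TRI.size)
        tris.append(values[:12])  # drop the attribute word
    return tris

# A single triangle in the XY plane with an upward-facing normal.
tri = ((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
blob = write_stl([tri])
print(len(blob), read_stl(blob)[0][:3])
```

The fixed record size is why binary STL files are easy to validate: the total length must equal 84 + 50 × triangle-count, a quick sanity check for truncated downloads.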
OBJ: The Simple 3D Exchange Format
OBJ (Wavefront Object) was developed by Wavefront Technologies in the early 1990s. It stores 3D geometry as vertices, texture coordinates, normals, and faces in a plain text format. Material properties are stored in an accompanying MTL (Material Template Library) file. OBJ is widely supported by 3D modeling software (Blender, Maya, 3ds Max, Cinema 4D) and is commonly used for static model exchange.
OBJ limitations: no animation support, no scene graph (lights, cameras), no binary format (always text, resulting in large files), and limited material definitions. For modern workflows, glTF has largely replaced OBJ for 3D content distribution.
glTF/GLB: The JPEG of 3D
glTF (GL Transmission Format) is an open standard developed by the Khronos Group for efficient 3D content delivery. Often called "the JPEG of 3D," glTF is designed for runtime loading rather than authoring. It supports meshes, materials (PBR metallic-roughness), textures, animations (skeletal, morph targets), cameras, lights, and scene hierarchy.
glTF comes in two variants: .gltf (JSON text file referencing external binary buffers and textures) and .glb (single binary file containing everything). GLB is preferred for distribution because it is a single file, easier to download, and avoids the complexity of managing multiple files.
Mesh compression extensions (Draco, meshopt) can reduce GLB file sizes by 90% for geometry-heavy models. All major 3D engines (Three.js, Babylon.js, Unity, Unreal, Godot) support glTF loading. Apple uses a USDZ variant but increasingly supports glTF through their ecosystem.
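The GLB container itself is a simple framed layout. A minimal sketch in Python; a real GLB would normally also carry a binary BIN chunk with geometry after the JSON chunk:

```python
import json
import struct

# GLB container: 12-byte header (magic 'glTF', version 2, total length),
# then chunks of uint32 length, uint32 type, and a 4-byte-aligned payload.
JSON_CHUNK = 0x4E4F534A  # the ASCII bytes 'JSON', little-endian

def make_glb(gltf: dict) -> bytes:
    payload = json.dumps(gltf, separators=(",", ":")).encode()
    payload += b" " * (-len(payload) % 4)  # JSON chunks are padded with spaces
    chunk = struct.pack("<II", len(payload), JSON_CHUNK) + payload
    return struct.pack("<4sII", b"glTF", 2, 12 + len(chunk)) + chunk

def read_glb_json(blob: bytes) -> dict:
    magic, version, _length = struct.unpack_from("<4sII", blob, 0)
    assert magic == b"glTF" and version == 2
    chunk_len, chunk_type = struct.unpack_from("<II", blob, 12)
    assert chunk_type == JSON_CHUNK
    return json.loads(blob[20:20 + chunk_len])

blob = make_glb({"asset": {"version": "2.0"}})
print(read_glb_json(blob))
```

This framing is why GLB loads fast in engines: one read yields the scene description and, in real files, a binary buffer that can be handed to the GPU with no further parsing.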
FBX: The Industry Pipeline Format
FBX is a proprietary format owned by Autodesk, widely used for exchanging animated 3D content between DCC (Digital Content Creation) tools. It supports meshes, materials, skeletal animation, blend shapes, lights, cameras, and scene hierarchy. FBX is the standard interchange format for game development (Unreal Engine and Unity both prefer FBX for asset import) and film VFX.
The main criticism of FBX is its proprietary nature: Autodesk controls the specification, and the FBX SDK is the only reliable way to read/write FBX files. Open-source alternatives (Assimp, OpenFBX) provide partial support but cannot guarantee full compatibility. For open workflows, glTF or USD are preferred.
STEP and IGES: CAD Interchange
STEP (Standard for the Exchange of Product Data, ISO 10303) and IGES (Initial Graphics Exchange Specification) are the standard formats for exchanging CAD (Computer-Aided Design) data between different CAD software. Unlike mesh formats (STL, OBJ), STEP preserves exact mathematical surface definitions using B-Rep (Boundary Representation) with NURBS curves and surfaces.
STEP is the current standard, supporting assemblies, tolerances, materials, and product manufacturing information (PMI). IGES is the legacy format, limited to geometry and basic annotation. When exchanging CAD data between SolidWorks, CATIA, NX, Inventor, or Fusion 360, use STEP. Only use IGES when the receiving system cannot read STEP.
DICOM: Medical Imaging
DICOM (Digital Imaging and Communications in Medicine) is the universal standard for medical imaging, used by CT scanners, MRI machines, X-rays, ultrasound, mammography, PET, and nuclear medicine. Every medical imaging device in every hospital worldwide produces DICOM files.
A DICOM file contains both the image data and extensive metadata: patient demographics, study/series information, acquisition parameters (slice thickness, field strength, pulse sequence), and institutional data. DICOM uses a tag-based data structure with thousands of defined data elements.
DICOM supports multiple image compression formats (uncompressed, JPEG, JPEG 2000, JPEG-LS, RLE) and handles multi-frame images (CT/MRI volumes are stored as series of 2D slices). DICOM networking (PACS) enables image storage, retrieval, and display across hospital networks.
NetCDF and HDF5: Scientific Data
NetCDF (Network Common Data Form) and HDF5 (Hierarchical Data Format 5) are the standard formats for scientific and research data. NetCDF stores multidimensional arrays (temperature grids, time series, satellite data) with self-describing metadata (variable names, units, coordinate systems). It is the standard format for climate science, meteorology, and oceanography.
HDF5 provides a more general hierarchical data model with groups (like directories), datasets (N-dimensional arrays), and attributes (metadata). It supports chunked storage, compression (gzip, szip, LZF), parallel I/O, and virtual datasets. HDF5 is used in astronomy (LIGO gravitational wave data), particle physics (CERN), genomics, and deep learning (model weights are often stored in HDF5).
USD: The Future of 3D
USD (Universal Scene Description) was developed by Pixar and open-sourced in 2016. It is a scene composition framework designed for film VFX workflows where hundreds of artists work on the same scene simultaneously. USD supports non-destructive layering (like Photoshop layers for 3D), references, variant sets, and time-sampled animation.
Apple adopted USDZ (USD packaged in ZIP) for AR content on iOS and visionOS. NVIDIA uses USD as the foundation for Omniverse. The Alliance for Open USD (AOUSD, founded by Apple, Pixar, Adobe, Autodesk, NVIDIA) is standardizing USD for cross-industry use. USD may eventually become the standard scene description format for all 3D industries.
Specialized Format Comparison
3D & Specialized Formats
10 rows
| Format | Year | Domain | Data Model | Textures | Animation | Compression | Open Standard | Best For |
|---|---|---|---|---|---|---|---|---|
| STL | 1987 | 3D Printing | Triangle mesh | No | No | No | Yes | 3D printing, prototyping |
| OBJ | 1992 | 3D Graphics | Polygon mesh | Yes (MTL) | No | No | Yes | 3D model exchange |
| glTF/GLB | 2017 | Web 3D | Scene graph | Yes | Yes | Draco, meshopt | Yes (Khronos) | Web 3D, AR/VR |
| FBX | 2006 | Game Dev | Scene graph | Yes | Yes | Yes | No (Autodesk) | Game engines, film VFX |
| STEP | 1994 | CAD | B-Rep solid | No | No | No | Yes (ISO 10303) | CAD data exchange |
| IGES | 1980 | CAD | Surface/Wireframe | No | No | No | Yes (ANSI) | Legacy CAD exchange |
| DICOM | 1993 | Medical | Image + metadata | N/A | Series | JPEG, JPEG 2000 | Yes (NEMA) | Medical imaging |
| NetCDF | 1989 | Science | N-D arrays | N/A | Time dim | zlib, szip | Yes (Unidata) | Climate, atmospheric data |
| HDF5 | 1998 | Science | Hierarchical groups | N/A | Time series | gzip, szip, LZF | Yes (HDF Group) | Large scientific datasets |
| USD | 2016 | 3D/Film | Scene composition | Yes | Yes | Crate (binary) | Yes (Pixar/AOUSD) | Film VFX, Apple Vision Pro |
3D Model File Size Comparison
The same 3D model (Stanford Bunny, 69,000 triangles) produces vastly different file sizes depending on the format. The most dramatic difference is between STL ASCII (6.8 MB) and GLB with Draco compression (380 KB) — nearly an 18x reduction. For web 3D content, GLB+Draco is the clear winner: it is roughly 90% smaller than uncompressed formats while loading faster than text-based alternatives.
3D Model File Size Comparison (Stanford Bunny, 69K triangles)
11 rows
| Format | Size (KB) | Load Time (ms) | Compression |
|---|---|---|---|
| STL (ASCII) | 6800 | 850 | No |
| STL (binary) | 3400 | 120 | No |
| OBJ | 4200 | 280 | No |
| OBJ + MTL | 4250 | 320 | No |
| glTF (JSON + bin) | 3800 | 95 | Yes |
| GLB | 3400 | 85 | Yes |
| GLB + Draco | 380 | 150 | Yes |
| GLB + meshopt | 520 | 92 | Yes |
| FBX (binary) | 3600 | 180 | Yes |
| USD (binary/crate) | 2800 | 110 | Yes |
| PLY | 3200 | 145 | No |
3MF: The Modern 3D Printing Format
3MF (3D Manufacturing Format) was developed by the 3MF Consortium (Microsoft, HP, Autodesk, Shapeways, and others) as a modern replacement for STL. While STL stores only geometry as a triangle mesh, 3MF supports colors, materials, textures, lattice structures, and multiple objects within a single file. The format is XML-based inside a ZIP container (like DOCX and EPUB), making it self-describing and extensible.
3MF also solves common STL problems: it guarantees watertight meshes (no gaps or inverted normals), uses efficient binary encoding for triangle data, and includes build instructions (orientation, support structures) that STL cannot express. Major 3D printing services (Shapeways, i.Materialise) and slicers (PrusaSlicer, Cura, Bambu Studio) support 3MF. For new 3D printing workflows, prefer 3MF over STL whenever your printer software supports it.
PLY: The Point Cloud Format
PLY (Polygon File Format, also called Stanford Triangle Format) was designed at Stanford University for storing 3D scanned data. Unlike STL and OBJ which store only surface geometry, PLY can store per-vertex properties including color, normals, texture coordinates, and custom data channels. PLY supports both ASCII and binary encoding.
PLY is widely used in photogrammetry (3D reconstruction from photographs), LiDAR scanning, and cultural heritage digitization. Tools like Meshlab, CloudCompare, and Open3D use PLY as a primary format. For point cloud data (millions of unconnected 3D points), PLY is often more appropriate than mesh formats like glTF because it naturally represents unstructured point data.
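Because PLY is a simple self-describing text format, a valid point-cloud file can be generated with a few lines of code. The sketch below builds a minimal ASCII PLY for colored points (no faces); the property names follow the common conventions used by tools like MeshLab and CloudCompare, but the function name is illustrative:

```python
# Minimal ASCII PLY generator for a colored point cloud (points only, no faces).
# Illustrative sketch: property names follow common PLY conventions.
def ply_points_ascii(points):
    """points: iterable of (x, y, z, r, g, b); returns the PLY file contents."""
    points = list(points)
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",   # per-vertex color, one of PLY's key strengths
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    body = [f"{x} {y} {z} {r} {g} {b}" for x, y, z, r, g, b in points]
    return "\n".join(header + body) + "\n"
```

Writing the returned string to a `.ply` file produces a point cloud that viewers can open directly; binary PLY uses the same header with `format binary_little_endian 1.0` and packed values.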
Alembic: Baked Animation Cache
Alembic (.abc) is an interchange format for baked (pre-computed) animation data, developed by Sony Pictures Imageworks and Industrial Light & Magic. Unlike FBX or glTF which store skeletal rigs and skinning weights, Alembic stores the final deformed geometry at each frame. This makes it ideal for transferring complex simulations (cloth, fluid, hair) between DCC tools without worrying about rig compatibility.
Alembic is the standard format in film VFX pipelines for passing animated geometry between departments. A character animated in Maya is exported as Alembic for lighting in Houdini, then imported as Alembic in Nuke for compositing. Each department only needs the final mesh positions, not the underlying rig.
Image Format Maximum Specifications
Technical limits of each image format
Every image format has hard limits on resolution, color depth, and file size imposed by its specification. These limits rarely matter for web use but are critical for professional workflows. JPEG is limited to 65,535 x 65,535 pixels and 12-bit color depth. JPEG XL pushes these limits dramatically: up to 1 billion pixels per side, 32-bit float per channel, and up to 4,099 channels. WebP's 16,383 pixel limit is the most restrictive of the modern formats and prevents its use for certain panoramic and professional photography applications.
Image Format Maximum Specifications
12 rows
| Format | Max Width | Max Height | Max Bit Depth | Max Channels | Color Spaces | Specification |
|---|---|---|---|---|---|---|
| JPEG | 65535 | 65535 | 12 | 4 | sRGB, CMYK | ISO/IEC 10918-1 |
| PNG | 2147483647 | 2147483647 | 16 | 4 | sRGB, ICC | ISO/IEC 15948 |
| WebP | 16383 | 16383 | 8 | 4 | sRGB | Google WebP spec |
| AVIF | 65536 | 65536 | 12 | 4 | sRGB, P3, BT.2020 | ISO/IEC 23000-22 |
| JPEG XL | 1073741823 | 1073741823 | 32 | 4099 | sRGB, P3, BT.2020, ICC | ISO/IEC 18181 |
| HEIC | 65535 | 65535 | 10 | 4 | sRGB, P3, BT.2020 | ISO/IEC 23008-12 |
| GIF | 65535 | 65535 | 8 | 1 | Palette (256) | GIF89a spec |
| TIFF | 4294967295 | 4294967295 | 64 | Unlimited | sRGB, CMYK, LAB, ICC | TIFF 6.0 + supplements |
| BMP | 2147483647 | 2147483647 | 32 | 4 | sRGB | BMP v5 header |
| SVG | Unlimited (vector) | Unlimited (vector) | N/A | N/A | sRGB, P3 (CSS) | W3C SVG 2.0 |
| ICO | 256 | 256 | 32 | 4 | sRGB | ICO format spec |
| JPEG 2000 | 4294967295 | 4294967295 | 38 | 16384 | sRGB, ICC, LAB | ISO/IEC 15444-1 |
JPEG XL's extreme specifications (billion-pixel images, 4,099 channels, 32-bit float) make it suitable for scientific imaging, remote sensing, and any application that pushes beyond the capabilities of existing formats. The 4,099-channel support accommodates multispectral and hyperspectral imaging where sensors capture hundreds of wavelengths simultaneously.
How Compression Algorithms Work: Step by Step
Detailed internal operation of DEFLATE, LZ4, Brotli, and Zstandard
All lossless compression algorithms exploit the same fundamental principle: data contains patterns, and patterns can be represented more compactly than raw bytes. The difference between algorithms is how they find and encode those patterns, and the trade-offs they make between speed, memory, and compression ratio. The sections below explain the internal operation of four important algorithms step by step.
DEFLATE (Used by ZIP, GZIP, PNG)
DEFLATE is the most widely deployed compression algorithm in history, used by ZIP, GZIP, and PNG. Despite being designed in 1993, it remains the baseline that every other algorithm is compared against. DEFLATE operates in four steps:
Step 1: LZ77 Matching. The algorithm scans the input data looking for repeated byte sequences. When it finds a match, instead of storing the bytes again, it emits a (length, distance) pair: "copy 15 bytes from 342 bytes ago." The search window is limited to 32 KB in standard DEFLATE. Longer matches and closer distances produce better compression.
Step 2: Lazy Matching. Before committing to a match, the algorithm checks whether starting one byte later would produce a longer match. If so, it emits the current byte as a literal and uses the longer match instead. This "lazy evaluation" improves compression by 2-5% at the cost of additional CPU time.
Step 3: Huffman Coding. The encoder builds frequency tables for all the literal bytes and length/distance pairs. More frequent symbols get shorter codes; rarer symbols get longer codes. Two Huffman trees are built: one for literals+lengths and one for distances. The trees themselves are encoded in the output so the decoder can reconstruct them.
Step 4: Block Splitting. The input is divided into blocks. Each block can use one of three modes: stored (no compression, for already-compressed data), fixed Huffman (predefined tables, fast but suboptimal), or dynamic Huffman (custom tables per block, best compression). Block boundaries are chosen to maximize compression.
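These four steps are exactly what zlib performs internally. A quick sketch showing their visible effects — level 0 emits only "stored" blocks (step 4's uncompressed mode, slightly larger than the input), while higher levels search harder for LZ77 matches:

```python
import zlib

# DEFLATE via zlib: stored blocks vs. shallow vs. deep match search.
data = b"the quick brown fox jumps over the lazy dog. " * 500

stored = zlib.compress(data, level=0)   # stored blocks only: no compression
fast = zlib.compress(data, level=1)     # shallow LZ77 search
best = zlib.compress(data, level=9)     # deep search + lazy matching

assert len(stored) > len(data)          # block headers add a few bytes
assert len(best) <= len(fast) < len(data)
assert zlib.decompress(best) == data    # lossless round trip
```

On this highly repetitive input, levels 1 and 9 both shrink the data to a tiny fraction of its original size; the gap between them widens on less redundant data.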
LZ4 (Used by Linux Kernel, ZFS, Game Engines)
LZ4 prioritizes decompression speed above all else. It decompresses at 4.3 GB/s (faster than memcpy on some architectures) while still achieving meaningful compression ratios (2.1:1). LZ4 achieves this speed through radical simplification:
Step 1: Hash Matching. Each 4-byte sequence is hashed and looked up in a hash table. Unlike DEFLATE which searches chains of matches, LZ4 uses a single-probe hash (one lookup, no chains). If the probe misses, the byte is emitted as a literal. This makes matching extremely fast but misses some opportunities.
Step 2: Literal/Match Encoding. The output consists of alternating literal runs (unmatched bytes copied directly) and match references (offset + length). A single token byte encodes both the literal length (high 4 bits) and match length (low 4 bits). Values over 15 use additional bytes.
Step 3: Offset Encoding. Match offsets are stored as 16-bit little-endian values, limiting the match window to 64 KB. This constraint is intentional: 16-bit offsets can be decoded with a single memory read, contributing to LZ4's exceptional decompression speed.
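The token-byte scheme from steps 2 and 3 can be made concrete with an educational decoder for LZ4's block format. This sketch ignores the spec's end-of-block restrictions (minimum literal tail, etc.) and is for illustration only, not a production decoder:

```python
def lz4_decode_block(src: bytes) -> bytes:
    # Educational LZ4 block decoder. Token byte: high nibble = literal run
    # length, low nibble = match length - 4; the value 15 means "read
    # additional length bytes until one is not 255".
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]; i += 1
        # --- literal run: bytes copied through unchanged ---
        lit = token >> 4
        if lit == 15:
            while src[i] == 255:
                lit += 255; i += 1
            lit += src[i]; i += 1
        out += src[i:i + lit]; i += lit
        if i >= len(src):               # the final sequence has no match part
            break
        # --- match: 16-bit little-endian offset, then length ---
        offset = src[i] | (src[i + 1] << 8); i += 2
        mlen = (token & 0x0F) + 4
        if (token & 0x0F) == 15:
            while src[i] == 255:
                mlen += 255; i += 1
            mlen += src[i]; i += 1
        for _ in range(mlen):           # byte-by-byte copy handles overlaps,
            out.append(out[-offset])    # e.g. offset 1 = run-length encoding
    return bytes(out)
```

For example, the 4-byte block `13 61 01 00` (token 0x13 = one literal + 7-byte match, literal "a", offset 1) decodes to eight "a" bytes — the overlapping copy acts as run-length encoding.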
Brotli (Used for Web Content Compression)
Brotli achieves 15-25% better compression than GZIP primarily through two innovations: a built-in dictionary and context modeling.
Step 1: Dictionary Lookup. Brotli includes a 120 KB static dictionary of common web content: HTML tags (<div>, <span>), CSS properties (font-size, margin-bottom), JavaScript keywords (function, return), and HTTP header fragments. When the encoder finds a match in this dictionary, it encodes just the dictionary reference (a few bytes) instead of the full string. This gives Brotli a 5-10% advantage over GZIP specifically for web content.
Step 2: LZ77 Matching with Configurable Depth. Like DEFLATE, Brotli searches for repeated sequences. But Brotli supports much deeper searches at high compression levels (level 11 explores thousands of candidates) while using a 16 MB window (vs DEFLATE's 32 KB). More candidates and a larger window mean better matches.
Step 3: Context Modeling. Brotli uses the previous two bytes to select different probability tables for different contexts. English text after a space has different character frequencies than text after a vowel. By maintaining separate tables for different contexts, Brotli achieves better entropy coding than a single global table.
Step 4: Context-Dependent Prefix Coding. Like DEFLATE, Brotli's final entropy stage uses canonical Huffman (prefix) codes — not arithmetic or ANS coding. The difference is that Brotli maintains many prefix-code tables at once, selected per symbol by the context model, and can switch tables within a block. This captures much of the gain of more expensive entropy coders while remaining fast to decode. (ANS-style coding appears in Zstandard's FSE stage, described below, not in Brotli.)
Zstandard (Modern All-Rounder)
Zstandard (ZSTD) achieves GZIP-level compression at 5-10x the speed through several innovations:
Step 1: Multi-Hash Matching. ZSTD uses multiple hash tables with different hash sizes to find matches at different lengths efficiently. Short matches (3-4 bytes) use a 3-byte hash; longer matches use a 4-6 byte hash. This avoids the trade-off between speed and match quality that simpler schemes face.
Step 2: FSE (Finite State Entropy). ZSTD uses FSE, a table-based entropy coder that is faster than Huffman coding while approaching arithmetic coding accuracy. FSE is particularly efficient for small alphabets (like match lengths and offsets) and can be vectorized for SIMD acceleration.
Step 3: Repeat Offset Tracking. ZSTD tracks the 3 most recent match offsets. When a new match has the same offset as a recent one, it is encoded with just 1-2 bits instead of the full offset value. This is highly effective for structured data (JSON, XML, source code) where similar patterns repeat at consistent intervals.
Step 4: Dictionary Mode. ZSTD can be trained on sample data to create a custom dictionary. For small payloads (API responses under 1 KB), dictionary compression can improve ratios by 2-5x because the dictionary provides context that the payload alone cannot. This is why ZSTD with dictionaries is the preferred compression for database page compression and small API responses.
The MP3 Psychoacoustic Model Explained
How MP3 decides which sounds to keep and which to discard
MP3's compression is built on a psychoacoustic model — a mathematical representation of how human hearing perceives sound. The model identifies which parts of an audio signal are inaudible (either too quiet to hear or masked by louder nearby sounds) and discards them. This is fundamentally different from lossless compression: MP3 permanently removes data, but the removed data was inaudible anyway.
Simultaneous Masking
When a loud tone is playing, it raises the hearing threshold for nearby frequencies. A 1 kHz tone at 60 dB makes frequencies between 800 Hz and 1.2 kHz effectively inaudible unless they are also loud. The MP3 encoder calculates this masking effect for every frequency band and discards any signal components that fall below the masking threshold.
The masking effect is asymmetric: a tone masks higher frequencies more effectively than lower frequencies. This is because the basilar membrane in the inner ear propagates energy from the base (high frequencies) toward the apex (low frequencies). The table below shows how a masker at each frequency raises the hearing threshold at other frequencies.
Temporal Masking
Masking also occurs in the time domain. A loud sound masks softer sounds for approximately 5-20 ms after the loud sound ends (post-masking) and even 1-5 ms before it begins (pre-masking, due to neural processing delays). The MP3 encoder uses a short window (192 samples) when it detects transients (drum hits, consonants) to preserve temporal detail, and a long window (576 samples) for sustained tones to achieve better frequency resolution.
Absolute Threshold of Hearing
Even without masking, the human ear has a frequency-dependent sensitivity curve. We are most sensitive between 1-4 kHz (the speech frequency range) and much less sensitive at very low (<100 Hz) and very high (>16 kHz) frequencies. The MP3 encoder permanently discards any signal components below this absolute threshold, regardless of other content. At 128 kbps, MP3 typically cuts all frequencies above 16 kHz; at 320 kbps, it preserves up to 20 kHz.
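The absolute threshold curve described above is usually approximated in psychoacoustic models with Terhardt's formula. The sketch below implements that standard approximation (in dB SPL, for frequencies roughly between 20 Hz and 20 kHz); the function name is ours:

```python
import math

# Terhardt's approximation of the absolute threshold of hearing (dB SPL),
# widely used in MPEG psychoacoustic models. f is frequency in Hz.
def hearing_threshold_db(f: float) -> float:
    khz = f / 1000.0
    return (3.64 * khz ** -0.8                          # low-frequency rise
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)   # dip near 3.3 kHz
            + 1e-3 * khz ** 4)                          # steep high-freq rise
```

Evaluating it confirms the shape described above: the ear is most sensitive around 3-4 kHz, noticeably less sensitive at 100 Hz, and dramatically less sensitive at 16 kHz — which is why cutting content above 16 kHz at 128 kbps is barely audible.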
Why Opus Sounds Better Than MP3
Opus uses a more sophisticated psychoacoustic model than MP3 and combines two codecs suited to different content. SILK (a linear-prediction codec optimized for voice, developed by Skype) handles speech, while CELT (a frequency-domain codec with better frequency resolution than MP3's MDCT) handles music; in hybrid mode, SILK encodes the band below 8 kHz and CELT covers the frequencies above it. The encoder dynamically selects or blends these modes based on the content, giving Opus a significant advantage over MP3's one-size-fits-all approach.
Additionally, Opus benefits from 20 years of psychoacoustic research that was not available when MP3 was designed. Opus uses a more accurate masking model, better bit allocation, and a more efficient entropy coder. The result is transparent quality at 128 kbps (where MP3 still has audible artifacts) and acceptable quality down to 32 kbps (where MP3 sounds terrible).
CDN Auto-Format Negotiation
How CDNs automatically serve the best format
Modern CDNs can automatically convert and serve images in the optimal format based on the browser's Accept header. When a browser sends Accept: image/avif,image/webp,*/*, the CDN can serve AVIF to that browser while serving WebP or JPEG to older browsers — all from a single original image. This eliminates the need to pre-generate multiple format versions.
CDN Image Format Auto-Negotiation Capabilities
6 rows
| CDN | Image Formats | HTTP Compression | Auto Resize | Cost |
|---|---|---|---|---|
| Cloudflare | AVIF, WebP, original | Brotli, GZIP, ZSTD | Yes (Polish) | Free (basic) |
| Vercel/Next.js | AVIF, WebP (via next/image) | Brotli, GZIP | Yes | Included |
| Cloudinary | AVIF, WebP, JPEG XL, HEIC | N/A | Yes | Free tier + paid |
| Imgix | AVIF, WebP, JPEG XL | N/A | Yes | $5/1000 images |
| Bunny CDN | AVIF, WebP | Brotli, GZIP | Yes (Optimizer) | $0.01/GB |
| Fastly | AVIF, WebP, JPEG XL | Brotli, GZIP, ZSTD | Yes | $0.08/GB |
For most websites, using a CDN with automatic format negotiation is the simplest path to serving modern image formats. You upload JPEG or PNG originals, and the CDN handles conversion, caching, and content negotiation automatically. This approach requires zero changes to your HTML — the same <img> tag serves different formats to different browsers.
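The negotiation logic a CDN applies is straightforward. The sketch below shows a simplified server-side version: it checks the browser's Accept header for modern formats in order of preference (the function name is ours, and real implementations also honor q-values and cache with `Vary: Accept`):

```python
def pick_image_format(accept_header: str) -> str:
    # Simplified content negotiation: prefer AVIF, then WebP, then the
    # universal JPEG fallback. Ignores q-values for brevity.
    accept = accept_header.lower()
    if "image/avif" in accept:
        return "avif"
    if "image/webp" in accept:
        return "webp"
    return "jpeg"   # every browser can decode JPEG
```

A 2026-era Chrome sending `Accept: image/avif,image/webp,*/*` gets AVIF; an older browser sending only `*/*` gets JPEG — all from one original image.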
Key Finding
CDN-based format negotiation is the lowest-effort path to serving AVIF and WebP. Cloudflare Polish, Vercel next/image, and Cloudinary all handle format conversion automatically from a single source image.
For sites not using these CDNs, the HTML <picture> element with multiple <source> tags provides the same capability with slightly more markup.
Part 9: Encoding & Character Sets
~3,000 words covering encodings and binary-to-text transforms
Character encoding determines how text is stored as bytes. A wrong encoding turns readable text into garbled nonsense (known as "mojibake"). This section explains every major encoding scheme, why UTF-8 won the encoding war, and the binary-to-text encodings that enable binary data in text-only contexts.
Unicode: One Character Set to Rule Them All
Before Unicode, the world had hundreds of incompatible character encodings. Japanese text used Shift_JIS. Russian used KOI8-R. Arabic used ISO 8859-6. Chinese used GB2312. A document written in one encoding would display as garbled nonsense ("mojibake") if opened with a different encoding. International text that mixed scripts was essentially impossible.
Unicode, first published in 1991 and now maintained by the Unicode Consortium, solved this by defining a single character set that includes every character from every writing system in the world. As of Unicode 16.0 (2024), it contains 154,998 characters covering 168 scripts, from modern Latin and CJK to ancient Egyptian hieroglyphics. Unicode also includes 3,790 emoji, mathematical symbols, musical notation, and Braille patterns.
Unicode assigns each character a code point: a number from U+0000 to U+10FFFF. The total code space is 1,114,112 positions, of which about 13% are currently assigned. The space is divided into 17 planes of 65,536 code points each. Plane 0 (Basic Multilingual Plane, BMP) contains the most commonly used characters. Planes 1-16 (Supplementary Planes) contain emoji, rare scripts, CJK extensions, and mathematical symbols.
The critical distinction: Unicode defines which characters exist and what their code points are. It does not define how those code points are stored as bytes. That is the job of encodings: UTF-8, UTF-16, and UTF-32 are three different ways to encode the same Unicode code points as byte sequences.
ASCII: Where It All Began
ASCII (American Standard Code for Information Interchange) was standardized in 1963 and defines 128 characters using 7 bits per character. The first 32 codes (0-31) are control characters (carriage return, line feed, tab, bell). Codes 32-126 are printable: uppercase and lowercase Latin letters, digits 0-9, punctuation, and a few symbols.
ASCII was designed for English and does not support accented characters, non-Latin scripts, or symbols beyond basic punctuation. However, ASCII compatibility is the foundation of all modern encodings: UTF-8, Latin-1, and Windows-1252 are all supersets of ASCII for codes 0-127.
UTF-8: The Universal Encoding
UTF-8 was designed by Ken Thompson and Rob Pike in September 1992 — famously sketched on a placemat at a New Jersey diner. It is a variable-width encoding that represents Unicode code points using 1 to 4 bytes:
1 byte (0xxxxxxx): ASCII characters U+0000 to U+007F. This means all ASCII text is valid UTF-8 without any modification — backward compatibility that was critical for adoption.
2 bytes (110xxxxx 10xxxxxx): Latin, Greek, Cyrillic, Arabic, Hebrew characters U+0080 to U+07FF. Most European and Middle Eastern text uses 1-2 bytes per character.
3 bytes (1110xxxx 10xxxxxx 10xxxxxx): Chinese, Japanese, Korean (CJK) characters, most of the BMP (Basic Multilingual Plane), U+0800 to U+FFFF.
4 bytes (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx): emoji, rare scripts, historical characters, U+10000 to U+10FFFF.
UTF-8 is self-synchronizing: the first byte of each character uniquely identifies the character length, and continuation bytes (10xxxxxx) cannot be confused with start bytes. This means if bytes are lost or corrupted, only the affected characters are damaged; the decoder can resynchronize at the next start byte.
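Both properties — the 1-to-4-byte tiers and self-synchronization — can be verified directly in Python:

```python
# One character from each UTF-8 length tier.
samples = {"a": 1, "é": 2, "中": 3, "😀": 4}
for ch, nbytes in samples.items():
    assert len(ch.encode("utf-8")) == nbytes

# Self-synchronization: every continuation byte matches 10xxxxxx, so a
# decoder can always locate the next character boundary after corruption.
for b in "中".encode("utf-8")[1:]:
    assert 0b10000000 <= b <= 0b10111111
```

Because no continuation byte can be mistaken for a start byte, a decoder dropped into the middle of a UTF-8 stream loses at most one character before resynchronizing.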
As of 2026, 98.2% of all websites use UTF-8. The W3C, WHATWG, and IETF all recommend UTF-8 as the default encoding. There is no longer any valid reason to use any other encoding for new content.
UTF-16: Windows and Java Internals
UTF-16 uses 2 or 4 bytes per character. Characters in the Basic Multilingual Plane (U+0000 to U+FFFF) use 2 bytes. Characters outside the BMP (U+10000 and above, including emoji) use a surrogate pair of two 16-bit code units (4 bytes total).
UTF-16 is used internally by Windows (the NTFS filesystem, Win32 API, and .NET all use UTF-16), Java (String objects are sequences of UTF-16 code units), and JavaScript (string indices are UTF-16 code unit positions, which is why "emoji".length returns unexpected values). It is rarely used for file storage or web content because it is not backward-compatible with ASCII and has byte order issues (big-endian vs little-endian, requiring a BOM).
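The surrogate-pair mechanism can be computed by hand. The sketch below derives the pair for U+1F600 (😀) and shows why string lengths differ between languages:

```python
# Computing the UTF-16 surrogate pair for a non-BMP code point (U+1F600).
cp = 0x1F600
v = cp - 0x10000                  # 20-bit value split across two code units
high = 0xD800 + (v >> 10)         # high (lead) surrogate: top 10 bits
low = 0xDC00 + (v & 0x3FF)        # low (trail) surrogate: bottom 10 bits
assert (high, low) == (0xD83D, 0xDE00)

# Python counts code points; UTF-16 needs two code units (4 bytes) here.
assert len("😀") == 1
assert "😀".encode("utf-16-be") == b"\xd8\x3d\xde\x00"
# In JavaScript, "😀".length is 2 — string indices are UTF-16 code units.
```

This is exactly the "unexpected length" behavior mentioned above: JavaScript and Java count code units, not characters.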
UTF-32: Fixed Width, Maximum Waste
UTF-32 uses exactly 4 bytes for every character, providing a direct mapping between code points and code units. This simplifies certain string operations (random access by code point index is O(1)) but wastes enormous space: ASCII text in UTF-32 is 4x the size of UTF-8. UTF-32 is used internally by some text-processing libraries, and CPython's flexible string representation (PEP 393) falls back to a 4-byte-per-character form for strings containing non-BMP characters, but UTF-32 is essentially never used for file storage or network transmission.
Legacy Encodings: Latin-1, Windows-1252, and Others
Before Unicode, every language or region had its own encoding. Latin-1 (ISO 8859-1) covers Western European languages with 256 characters. Windows-1252 is Microsoft's extension of Latin-1, adding typographic characters (curly quotes, em dash, euro sign) in the 128-159 range where Latin-1 has control characters. This difference causes the common bug where Word documents show 'smart quotes' as garbage characters on non-Windows systems.
Shift_JIS (Japanese), GB2312/GBK/GB18030 (Chinese), EUC-KR (Korean), and KOI8-R (Russian) are legacy encodings that are still encountered in older data. All should be converted to UTF-8 when possible. The Python chardet library and the ICU library can detect encoding automatically for conversion.
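Mojibake is easy to reproduce: decode bytes with the wrong encoding and each multi-byte UTF-8 sequence shatters into separate Latin-1 characters:

```python
# Classic mojibake: UTF-8 bytes misread as Latin-1. The two bytes of "é"
# (0xC3 0xA9) decode as two separate Latin-1 characters, "Ã" and "©".
raw = "café".encode("utf-8")
assert raw.decode("latin-1") == "cafÃ©"

# The reverse mistake (legacy single-byte data read as UTF-8) typically
# raises UnicodeDecodeError rather than producing readable garbage.
```

Seeing "Ã©", "â€™", or similar sequences in text is a near-certain sign that UTF-8 bytes were decoded as Latin-1 or Windows-1252 somewhere in the pipeline.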
Base64: Binary to Text
Base64 encoding converts arbitrary binary data into a text representation using 64 ASCII characters: A-Z (0-25), a-z (26-51), 0-9 (52-61), + (62), / (63), with = for padding. Every 3 bytes of input produce 4 characters of output, resulting in a 33% size increase.
Base64 is used for: email attachments (MIME encoding), embedding images in CSS/HTML (data: URIs), storing binary data in JSON or XML, HTTP Basic Authentication (encoding username:password), and JWT (JSON Web Token) payloads. Base64url is a variant that replaces + with - and / with _, making it safe for URLs.
Important: Base64 is encoding, not encryption. It provides zero security — anyone can decode it. Never use Base64 to "protect" sensitive data.
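Python's standard library implements all of this. The sketch below shows the 3-bytes-in, 4-characters-out mapping, padding, and the URL-safe variant:

```python
import base64

# 3 input bytes -> 4 output characters (~33% overhead).
assert base64.b64encode(b"Man") == b"TWFu"
# Inputs not divisible by 3 are padded with "=".
assert base64.b64encode(b"Ma") == b"TWE="
assert base64.b64decode("TWFu") == b"Man"

# Base64url swaps "+" and "/" for "-" and "_" (URL- and filename-safe).
assert base64.urlsafe_b64encode(b"\xfb\xff") == b"-_8="
```

And the security caveat holds by construction: `base64.b64decode` reverses the encoding with no key involved, so Base64 hides nothing.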
URL Encoding: Percent Encoding
URLs can only contain a subset of ASCII characters: letters, digits, and a few symbols (- _ . ~). All other characters must be percent-encoded: the character's UTF-8 bytes are represented as %XX where XX is the hexadecimal byte value. Space becomes %20 (or + in form data), the euro sign becomes %E2%82%AC (three UTF-8 bytes), and emoji become even longer sequences.
URL encoding is defined in RFC 3986. Common mistakes: encoding the entire URL (only the path and query parameters should be encoded, not the scheme, host, or port), double-encoding (encoding already-encoded characters), and forgetting that + means space only in application/x-www-form-urlencoded data (in regular URLs, + is literal).
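`urllib.parse` demonstrates the rules — including the space-handling and `safe` pitfalls just described:

```python
from urllib.parse import quote, quote_plus, unquote

assert quote("€") == "%E2%82%AC"    # three UTF-8 bytes, percent-encoded
assert quote("a b") == "a%20b"      # space in a path segment
assert quote_plus("a b") == "a+b"   # space in form data only
assert unquote("%E2%82%AC") == "€"

# quote() leaves "/" alone by default (safe="/"); pass safe="" when the
# value is a single path component that must not be split.
assert quote("a/b", safe="") == "a%2Fb"
```

Note that `quote_plus` output must only be used in `application/x-www-form-urlencoded` contexts; in a plain URL path, "+" is a literal plus sign.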
Encoding Comparison
Character Encodings Compared
9 rows
| Encoding | Year | Bits/Char | Max Characters | ASCII Compatible | Self-Sync | Web Usage | Coverage | Best For |
|---|---|---|---|---|---|---|---|---|
| ASCII | 1963 | 7 | 128 | N/A | Yes | 1.2% | English only | Legacy systems, protocols |
| UTF-8 | 1993 | 8-32 | 1112064 | ASCII | Yes | 98.2% | All Unicode | Web, files, everything |
| UTF-16 | 1996 | 16-32 | 1112064 | No | Partial | 0.01% | All Unicode | Windows internals, Java |
| UTF-32 | 2000 | 32 | 1112064 | No | Yes | <0.01% | All Unicode | Fixed-width processing |
| Latin-1 (ISO 8859-1) | 1987 | 8 | 256 | ASCII | Yes | 0.6% | Western European | Legacy Western content |
| Windows-1252 | 1985 | 8 | 256 | ASCII (mostly) | Yes | 0.1% | Western European+ | Legacy Windows apps |
| Shift_JIS | 1982 | 8-16 | 7000 | ASCII (mostly) | No | <0.1% | Japanese | Legacy Japanese systems |
| GB18030 | 2000 | 8-32 | 1112064 | ASCII, GBK | Yes | <0.1% | All Unicode + CJK | Chinese government standard |
| EUC-KR | 1991 | 8-16 | 17000 | ASCII | No | <0.1% | Korean | Legacy Korean systems |
Key Finding
UTF-8 is used by 98.2% of all websites and should be the only encoding used for new content. It supports all 149,000+ Unicode characters while maintaining full backward compatibility with ASCII.
Always specify encoding explicitly: <meta charset='utf-8'> in HTML, Content-Type: text/plain; charset=utf-8 in HTTP headers, and UTF-8 as the default in your editor and database.
Character Encoding Adoption Over Time
The web's transition to UTF-8 is one of the most successful standardization efforts in technology history. In 2010, only 50% of websites used UTF-8. By 2026, that figure has reached 98.2%. The remaining holdouts are primarily legacy content in regional encodings (Latin-1, Windows-1252, Shift_JIS). For any new content, there is zero reason to use anything other than UTF-8.
Character Encoding Usage on the Web (%)
Source: OnlineTools4Free Research
File Format Detection: Magic Bytes
Operating systems and web servers often determine file types by examining the first few bytes of a file (called "magic bytes" or "file signatures") rather than relying on file extensions, which can be easily changed. The table below shows the signature bytes for common formats. Knowing these signatures is useful for debugging content-type issues, building file upload validators, and understanding why renaming a .txt file to .jpg does not make it an image.
File Format Magic Bytes (Signatures)
25 rows
| Format | Hex Bytes | ASCII |
|---|---|---|
| JPEG | FF D8 FF | N/A |
| PNG | 89 50 4E 47 0D 0A 1A 0A | .PNG.... |
| GIF87a | 47 49 46 38 37 61 | GIF87a |
| GIF89a | 47 49 46 38 39 61 | GIF89a |
| WebP | 52 49 46 46 .. .. .. .. 57 45 42 50 | RIFF....WEBP |
| AVIF | .. .. .. .. 66 74 79 70 61 76 69 66 | ....ftypavif |
| HEIC | .. .. .. .. 66 74 79 70 68 65 69 63 | ....ftypheic |
| BMP | 42 4D | BM |
| TIFF (LE) | 49 49 2A 00 | II*. |
| TIFF (BE) | 4D 4D 00 2A | MM.* |
| PDF | 25 50 44 46 | %PDF |
| ZIP | 50 4B 03 04 | PK.. |
| GZIP | 1F 8B | N/A |
| 7z | 37 7A BC AF 27 1C | 7z.... |
| RAR5 | 52 61 72 21 1A 07 01 00 | Rar!.... |
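A file-upload validator based on these signatures takes only a few lines. The sketch below checks leading bytes against entries from the table above (the function and table names are ours; formats like WebP need an offset check because the signature is split across the RIFF header):

```python
# Minimal magic-byte sniffer using signatures from the table above.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip"),   # also DOCX, EPUB, JAR, 3MF...
    (b"\x1f\x8b", "gzip"),
]

def sniff(data: bytes):
    for sig, name in SIGNATURES:
        if data.startswith(sig):
            return name
    # WebP: RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

Note the ZIP comment: many modern formats (DOCX, EPUB, 3MF) are ZIP containers, so signature sniffing identifies the container, not the specific document type — a validator must also inspect the archive contents.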
Common MIME Types Reference
MIME types (also called media types or content types) are sent in HTTP Content-Type headers to tell the browser how to handle a response. Using the wrong MIME type causes rendering failures: a CSS file served as text/html will not be applied; a WOFF2 font served without the correct type may be blocked by CORS. The table below lists the most commonly used MIME types.
Common MIME Types
30 rows
| Extension | MIME Type | Category |
|---|---|---|
| .html | text/html | Document |
| .css | text/css | Document |
| .js | text/javascript | Document |
| .json | application/json | Data |
| .xml | application/xml | Data |
| .csv | text/csv | Data |
| .jpg | image/jpeg | Image |
| .png | image/png | Image |
| .gif | image/gif | Image |
| .webp | image/webp | Image |
| .avif | image/avif | Image |
| .svg | image/svg+xml | Image |
| .jxl | image/jxl | Image |
| .ico | image/x-icon | Image |
| .mp4 | video/mp4 | Video |
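Python ships a MIME-type lookup table in the `mimetypes` module, which covers the common extensions above. Newer formats may be absent from the built-in table on older Python versions, so a sketch of registering one explicitly:

```python
import mimetypes

# Extension -> MIME type lookup (returns a (type, encoding) tuple).
assert mimetypes.guess_type("photo.jpg") == ("image/jpeg", None)
assert mimetypes.guess_type("data.json")[0] == "application/json"

# Newer types such as image/avif may be missing on older Pythons;
# add_type() registers them for the current process.
mimetypes.add_type("image/avif", ".avif")
assert mimetypes.guess_type("img.avif")[0] == "image/avif"
```

On the server side, the same mapping should drive the `Content-Type` header; serving a correct type is what prevents the CSS- and font-blocking failures described above.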
Part 10: Glossary of File Format Terms
80 terms defined with context and related tools
This glossary defines 80 essential terms used throughout this guide and in file format discussions generally. Each term includes a 2-3 sentence definition and, where applicable, a link to a related tool. Terms are organized alphabetically within categories.
The categories are: 3D, Audio, Audio/Image, Audio/Video, Compression, Data, Document, Encoding, Font, General, Image, Image/Video, Video, and Web.
Part 11: Frequently Asked Questions
30 questions answered in detail
These are the most common questions about file formats, drawn from search data, forums, and reader submissions. Each answer is concise but complete, providing actionable guidance rather than vague generalities.
What is the best image format for the web in 2026?
AVIF is the best overall image format for the web in 2026, offering 50% smaller files than JPEG with better quality. It has 95% browser support. Use WebP as a fallback for the remaining 5%. For transparency and simple graphics, WebP or AVIF with alpha channels are ideal. PNG remains necessary for pixel-perfect lossless images.
Should I use WebP or AVIF?
Use AVIF as your primary format for photographs and complex images — it offers 20% better compression than WebP. Use WebP as a fallback for browsers that do not support AVIF. WebP is still the safer choice if you can only serve one modern format, since it has 98% browser support vs 95% for AVIF. For animations, animated WebP is more widely supported than animated AVIF.
Why did Chrome drop JPEG XL support?
Google announced the removal in October 2022 and shipped Chrome 110 without JPEG XL support in early 2023, citing insufficient interest from the web ecosystem and a preference to focus on AVIF and WebP. The decision was controversial because JPEG XL offers unique features like lossless JPEG recompression and progressive decoding. Safari still supports JPEG XL, and there is ongoing community pressure to reverse the decision.
What is the difference between a codec and a container?
A codec (H.264, AV1, AAC) is the algorithm that encodes and decodes media data — it determines the compression method and quality. A container (MP4, MKV, WebM) is the file format that wraps encoded streams together — it determines how video, audio, and subtitles are packaged. An MP4 container can hold H.264 video with AAC audio, or H.265 video with AC-3 audio.
Is MP3 dead?
MP3 is not dead but it is technically obsolete. All MP3 patents expired in 2017, making it royalty-free. However, AAC offers better quality at the same bitrate, and Opus is superior to both. MP3 remains widely used due to universal compatibility. For new projects, use Opus (best quality-per-bit) or AAC (widest ecosystem support). For archival, use FLAC (lossless).
What is the best audio format for quality?
For maximum quality, use a lossless format: FLAC (widely supported, open source), ALAC (Apple ecosystem), or WAV/AIFF (uncompressed, largest files). For lossy audio, Opus at 128+ kbps is transparent (indistinguishable from lossless) for most listeners. For streaming, AAC at 256 kbps or Opus at 128 kbps are both excellent choices.
JSON vs YAML: which should I use?
Use JSON for APIs, data exchange, and machine-to-machine communication — it is the universal standard, parsed faster, and unambiguous. Use YAML for configuration files that humans edit frequently — it supports comments, is more readable, and requires less punctuation. YAML is the standard for Kubernetes, CI/CD pipelines, and many DevOps tools.
What is the best compression format?
It depends on your priority. For best compression ratio: 7z with LZMA2. For fastest compression: LZ4 or Zstandard. For universal compatibility: ZIP. For web content: Brotli (static) or Zstandard (dynamic). For Linux packages: XZ or Zstandard. Zstandard (ZSTD) is the best all-around choice in 2026, offering near-LZMA ratios at 50x the speed.
How do I choose a video format for my website?
Use MP4 with H.264 for maximum compatibility (100% browser support). For better compression, serve WebM with VP9 or AV1 to supporting browsers. Use the HTML <video> element with multiple <source> tags to offer AV1 first, VP9 second, and H.264 as fallback. For live streaming, use HLS (HTTP Live Streaming) with H.264 segments.
What is UTF-8 and why should I care?
UTF-8 is the dominant text encoding for the web (98.2% of websites). It can represent every character in Unicode (149,000+ characters including all languages, symbols, and emoji) while remaining backward-compatible with ASCII. Use UTF-8 for everything: HTML, CSS, JavaScript, JSON, databases. Specify it explicitly: <meta charset="utf-8"> in HTML, and "encoding": "utf-8" in your editor.
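UTF-8's variable-width design is easy to check in Node.js, where Buffer.byteLength reports the encoded size of a string. A minimal sketch:

```javascript
// Node.js: UTF-8 is variable-width -- ASCII stays 1 byte, emoji take 4
console.log(Buffer.byteLength('A', 'utf8'));  // 1 (ASCII range, backward compatible)
console.log(Buffer.byteLength('é', 'utf8'));  // 2 (Latin-1 supplement)
console.log(Buffer.byteLength('日', 'utf8')); // 3 (CJK)
console.log(Buffer.byteLength('😀', 'utf8')); // 4 (emoji, outside the Basic Multilingual Plane)
```

This is why UTF-8 never penalizes plain English text: ASCII content stays exactly the same size it was before Unicode.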
PDF vs DOCX: when should I use each?
Use PDF when the document layout must be preserved exactly (contracts, reports, printed materials). Use DOCX when the document needs to be edited by others (collaboration, templates, drafts). PDF is read-only by design; DOCX is editable by design. For archival, use PDF/A. For e-books, use EPUB instead of either.
What font format should I use on the web?
Use WOFF2 as your primary web font format — it uses Brotli compression and is supported by 98% of browsers. Include WOFF as a fallback for older browsers. Never serve raw TTF or OTF on the web — they are uncompressed and significantly larger. Use the @font-face CSS rule with format hints: format("woff2") and format("woff").
What is the difference between lossy and lossless compression?
Lossy compression permanently removes data to achieve higher compression ratios (10:1 to 50:1). JPEG, MP3, and H.264 are lossy. Lossless compression preserves all original data and achieves lower ratios (2:1 to 4:1). PNG, FLAC, and ZIP are lossless. Use lossy for distribution (smaller files) and lossless for archival/editing (preserves quality).
Is SVG better than PNG for icons?
Yes, SVG is almost always better than PNG for icons, logos, and simple graphics. SVG files are resolution-independent (sharp at any size), typically smaller (a simple icon might be 1 KB as SVG vs 5 KB as PNG), styleable with CSS, animatable, and accessible. Use PNG only when the graphic is too complex for vector representation (photographs, textures).
What is HEIC and can I use it on the web?
HEIC (High Efficiency Image Container) uses HEVC compression and is the default photo format on iPhones since iOS 11. It offers 50% smaller files than JPEG with similar quality. However, web support is limited to Safari (~21% browser support). Convert HEIC to WebP or AVIF for web use. HEIC also has patent licensing issues that limit adoption.
CSV vs JSON for data exchange?
Use CSV for simple tabular data (spreadsheets, databases, data science). CSV is universally supported, smaller, and faster to parse. Use JSON for structured/nested data (APIs, configurations, complex objects). JSON supports types, nesting, and arrays. For big data analytics, consider Parquet (columnar, compressed, typed) over both CSV and JSON.
What is Brotli and should I use it?
Brotli is a compression algorithm by Google that achieves 15-25% better compression than GZIP with similar decompression speed. It is supported by all modern browsers for HTTP Content-Encoding. Use Brotli for static assets (CSS, JS, HTML) where you can pre-compress at high quality. For dynamic content, GZIP or Zstandard may be faster. Configure your CDN to serve Brotli with fallback to GZIP.
What is the best format for 3D models on the web?
glTF 2.0 (GL Transmission Format) is the standard for 3D on the web, often called "the JPEG of 3D." Use GLB (binary glTF) for single-file distribution. glTF supports PBR materials, animations, and Draco/meshopt compression. It is supported by Three.js, Babylon.js, Unity, Unreal, Blender, and all major 3D tools.
How do I reduce the size of a PDF?
Reduce PDF size by: (1) compressing embedded images (JPEG quality 75-85), (2) subsetting fonts (include only used characters), (3) removing metadata and unused objects, (4) using PDF optimization tools (Ghostscript, qpdf), (5) avoiding high-resolution images when not needed for print. A typical 10 MB PDF can often be reduced to 1-3 MB without visible quality loss.
What is the difference between AVI and MP4?
Both are container formats, but MP4 is far superior for modern use. MP4 supports modern codecs (H.264, H.265, AV1), streaming, subtitles, chapters, and metadata. AVI is a 1992-era format limited to older codecs, with no native streaming support. AVI files are typically larger due to outdated codecs. Always use MP4 for new video projects.
Can I convert a lossy format back to lossless?
No. Converting a lossy file (JPEG, MP3) to a lossless format (PNG, FLAC) preserves the current quality but does not restore lost data. The resulting file will be larger without any quality improvement. This is a common misconception. Always keep your original lossless files and create lossy versions from them for distribution.
What is the maximum file size for different formats?
JPEG: ~4 GB, PNG: limited by memory (~2^31 pixels), WebP: 16,383 x 16,383 pixels, AVIF: 65,536 x 65,536 pixels, ZIP: 16 exabytes (ZIP64), TAR (ustar): 8 GB per file, MP4: ~8 EB, PDF: ~10 GB (practical). File system limits also apply: FAT32 max is 4 GB, NTFS is 16 EB, ext4 is 16 TB.
What is Base64 and when should I use it?
Base64 encodes binary data as ASCII text, increasing size by ~33%. Use it for: (1) embedding small images in CSS/HTML (data URIs, under 1-2 KB), (2) sending binary data in JSON/XML APIs, (3) email attachments (MIME encoding), (4) storing binary in text-only systems. Avoid for large files — the 33% overhead makes it inefficient. Use multipart/form-data for file uploads instead.
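The 33% figure comes straight from the encoding: every 3 input bytes become 4 output characters. A quick Node.js check:

```javascript
// Node.js: Base64 maps 3 bytes -> 4 ASCII characters (~33% larger)
const binary = Buffer.alloc(3000, 0xab); // 3,000 arbitrary bytes
const encoded = binary.toString('base64');

console.log(encoded.length);                                // 4000: exactly 4/3 of the input
console.log(Buffer.from(encoded, 'base64').equals(binary)); // true: fully reversible
```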
What is the difference between RGB and CMYK?
RGB (Red, Green, Blue) is an additive color model for screens — combining all three at full intensity produces white. CMYK (Cyan, Magenta, Yellow, Key/black) is a subtractive model for print — combining all four produces black. Screen content should use RGB; print content should use CMYK. Converting between them can shift colors because CMYK has a smaller gamut.
How do variable fonts work?
A variable font contains a single outline for each glyph plus mathematical instructions to interpolate between design extremes (called axes). Common axes: weight (100-900), width (condensed-expanded), slant (-12 to 0 degrees), optical size. A single variable font file replaces 10-20 static font files, reducing page weight by 70-90%. CSS: font-variation-settings or font-weight: 100-900.
What is the safest format for long-term archival?
For documents: PDF/A (ISO 19005) — self-contained, no external dependencies. For images: TIFF (uncompressed or lossless) or PNG. For audio: WAV or FLAC. For video: MKV with FFV1 codec (lossless, open). For data: CSV or JSON (plain text, human-readable). Avoid proprietary formats (PSD, DOC, WMA) that may become unsupported. Store with checksums (SHA-256) for integrity verification.
Why are my WebP images larger than JPEG?
WebP can produce larger files than JPEG in specific cases: (1) quality setting too high (WebP quality 100 is wasteful), (2) images with few colors or flat areas where JPEG excels, (3) very small images where header overhead matters. For best results, use WebP quality 75-85 (not 100). At equivalent perceptual quality, WebP should be 25-34% smaller than JPEG for photographs.
What is AV1 and why does it matter?
AV1 is an open, royalty-free video codec created by the Alliance for Open Media (Google, Apple, Netflix, Amazon, etc.). It delivers 50% better compression than H.264 and 20% better than H.265, with no patent royalties. YouTube and Netflix use AV1 for streaming. Hardware support is growing (MediaTek Dimensity, Intel Arc, NVIDIA RTX 40-series). AV1 is the future of video.
How do I choose between ZIP, 7z, and RAR?
Use ZIP for sharing with anyone — it is natively supported by every OS. Use 7z for maximum compression of large files — LZMA2 compresses 20-40% better than DEFLATE. Use RAR only if you need recovery records (for protecting large archives on unreliable media). Avoid RAR for sharing because it requires proprietary software. For modern use, Zstandard (.tar.zst) is an excellent alternative for Linux/developers.
Part 12: Recommendations & Decision Trees
Format selection guides for every use case
After 50,000+ words of analysis, here are the practical recommendations. Use these decision trees to choose the right format for any situation. Each tree walks you through a series of questions to reach the optimal format.
Image Format Quick Reference Guide
The table below provides a one-stop reference for choosing the right image format for any situation. For each common use case, we recommend a primary format, a fallback, and the format to avoid — with the reasoning behind each recommendation.
Image Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Hero image (photograph) | AVIF | WebP | PNG, BMP | Best compression for photos |
| Product photo (e-commerce) | AVIF | WebP | PNG (unless cutout) | Smallest size, good quality |
| Logo / icon | SVG | PNG-8 | JPEG (no transparency) | Resolution-independent |
| Screenshot / UI mockup | WebP lossless | PNG | JPEG (text artifacts) | Sharp text preservation |
| Animated content | MP4 video | Animated WebP | GIF (10-30x larger) | Dramatically smaller |
| Thumbnail (< 100px) | WebP | JPEG | PNG | WebP has lower header overhead |
| Print (300 DPI) | TIFF | PNG | JPEG (artifacts visible) | Lossless, CMYK support |
| Email attachment | JPEG | PNG | WebP (limited client support) | Universal email client support |
| Social media post | JPEG (platform converts) | PNG | HEIC (only Apple) | Platforms re-encode anyway |
| Favicon | SVG + ICO fallback | PNG 32x32 | JPEG | SVG scales perfectly |
| Chart / infographic | SVG | PNG | JPEG (text artifacts) | Sharp lines, small size |
| Medical imaging | DICOM | JPEG 2000 | Lossy JPEG | Diagnostic accuracy critical |
| Satellite / aerial | JPEG 2000 | TIFF + JPEG XL | PNG (enormous) | Wavelet compression for large images |
| Archival preservation | TIFF (uncompressed) | PNG | Lossy formats | No generation loss ever |
Video Format Quick Reference Guide
Video Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Web streaming (general) | AV1 in MP4 | H.264 in MP4 | AVI, FLV | Best compression, royalty-free |
| Live streaming | H.264 via HLS/DASH | VP9 via WebRTC | AV1 (encoding too slow) | Real-time encoding required |
| Video editing timeline | ProRes in MOV | DNxHR in MXF | H.264/H.265 (decode lag) | Intra-frame for instant seeking |
| Screen recording | H.264 CRF 18 | VP9 | Uncompressed | Sharp text, manageable size |
| Mobile upload | H.264 in MP4 | HEVC in MP4 | AV1 (slow to encode on device) | Hardware encode available everywhere |
| Archival | FFV1 in MKV | ProRes 4444 | Lossy codecs | Lossless, open-source |
| Social media post | H.264 in MP4 | N/A | MKV, WebM (platforms reject) | All platforms accept H.264 MP4 |
| 4K HDR content | AV1 with HDR10 | H.265 with Dolby Vision | H.264 (no HDR) | AV1 is royalty-free, great quality |
Audio Format Quick Reference Guide
Audio Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Music streaming (web) | Opus 128kbps | AAC 256kbps | MP3 (inferior quality) | Transparent at 128kbps |
| Podcast distribution | MP3 128kbps | AAC 128kbps | FLAC (unnecessary for speech) | Universal player support |
| VoIP / video call | Opus 32-64kbps | N/A | MP3, AAC (latency too high) | Frames as short as 2.5 ms, adaptive bitrate |
| Music archival | FLAC | ALAC (Apple) | MP3/AAC (lossy) | Bit-perfect preservation |
| Game audio | Opus | Vorbis | WAV (file size) | Low CPU, small files |
| Studio recording | WAV 24-bit/48kHz | AIFF | Lossy formats | No processing artifacts |
| Ringtone / notification | AAC .m4a | MP3 | FLAC | Small size, wide device support |
| Audiobook | AAC 64kbps | MP3 64kbps | Lossless (speech doesn't need it) | Speech compresses extremely well |
Image Format Decision Tree
Is it a photograph?
Yes: Is web browser support critical? → Yes: AVIF with WebP fallback → No: JPEG XL
Does it need transparency?
Yes: Is it a simple graphic? → Yes: SVG → No: WebP or AVIF with alpha
Is it an animation?
Yes: Short/simple: WebP animated → Complex: MP4 video instead of GIF
Is it for print?
Yes: TIFF (300 DPI, CMYK)
Is it an icon or logo?
Yes: SVG (vector)
Is pixel-perfect fidelity required?
Yes: PNG (lossless)
No: AVIF (best compression) or WebP (widest support)
Video Format Decision Tree
Is it for web streaming?
Yes: AV1 with H.264 fallback in MP4 container
Is it for editing/post-production?
Yes: ProRes in MOV or DNxHR in MXF
Do you need multiple audio/subtitle tracks?
Yes: MKV container
Is universal device playback needed?
Yes: H.264 in MP4 (baseline profile)
Is it for archival?
Yes: FFV1 in MKV (lossless, open)
No: H.265 in MP4 (good balance)
Audio Format Decision Tree
Is lossless quality required?
Yes: Apple ecosystem? → ALAC → Otherwise: FLAC
Is it for web/VoIP/streaming?
Yes: Opus (best quality per bit)
Is it for Apple ecosystem?
Yes: AAC at 256 kbps
Is universal compatibility critical?
Yes: MP3 at 320 kbps CBR or V0 VBR
No: Opus at 128 kbps
Data Format Decision Tree
Is it for a web API?
Yes: JSON (universal standard)
Is it configuration that humans edit?
Yes: YAML (readable, comments) or TOML (typed)
Is it tabular/spreadsheet data?
Yes: CSV (universal) or Parquet (analytics)
Is performance critical (high throughput)?
Yes: Protocol Buffers or MessagePack
Is it for big data pipelines?
Yes: Parquet (columnar) or Avro (row-based)
No: JSON (safe default)
Compression Format Decision Tree
Sharing with non-technical users?
Yes: ZIP (universal)
Maximum compression needed?
Yes: 7z with LZMA2
Speed is the priority?
Yes: LZ4 or Zstandard at low level
HTTP content compression?
Yes: Brotli (static) or ZSTD (dynamic)
Linux/Unix system?
Yes: tar.zst (Zstandard) or tar.xz
No: ZIP or 7z
Quick Reference: Best Format by Situation
Web Images
AVIF (primary) + WebP (fallback)
Web Icons
SVG (inline or external)
Web Fonts
WOFF2 (primary) + WOFF (fallback)
Web Video
AV1 in MP4 + H.264 fallback
Web Audio
Opus in WebM or OGG
API Data
JSON (REST) or Protobuf (gRPC)
Configuration
YAML or TOML
File Sharing
ZIP (universal)
HTTP Compression
Brotli (static) / ZSTD (dynamic)
Text Encoding
UTF-8 (always)
Documents (final)
PDF or PDF/A (archival)
Documents (editable)
DOCX or Markdown
3D for Web/AR
glTF/GLB
3D Printing
STL or 3MF
Photo Archival
TIFF (16-bit) or PNG
Audio Archival
FLAC (lossless)
Format History: Complete Timelines
70+ milestones across image, video, and audio format history
Understanding format history explains why we have the formats we have today and why certain formats persist despite being technically inferior. Patent disputes drove the creation of PNG. Apple's ecosystem choices made HEIC widespread. Google's market power pushed WebP and VP9 to adoption. The following timelines document every significant event in format evolution.
Image Format Timeline (1985-2026)
The image format landscape has gone through three eras: the pre-web era (BMP, TIFF, GIF, 1985-1991), the web era (JPEG, PNG, SVG, 1992-2009), and the next-generation era (WebP, HEIC, AVIF, JPEG XL, 2010-present). Each new format was created to solve specific limitations of its predecessors. The pace of innovation accelerated dramatically after 2010 as bandwidth constraints and mobile devices created urgent demand for better compression.
ICO format introduced by Microsoft for Windows 1.0 icons
BMP introduced by Microsoft/IBM for Windows and OS/2
TIFF 1.0 released by Aldus Corporation for desktop publishing scanners
GIF 87a released by CompuServe — becomes the standard image format of early online services
GIF 89a adds animation, transparency, and text overlay support
JPEG (ISO 10918-1) standardized — revolutionizes digital photography
Unisys begins enforcing LZW patent used by GIF — sparks PNG creation
PNG 1.0 (W3C) released as patent-free GIF alternative with 24-bit color
SVG 1.0 specification published by W3C for vector graphics on the web
JPEG 2000 (ISO 15444-1) standardized with wavelet compression
PNG 1.2 specification published — adds international text and gamma
LZW patents expire worldwide — GIF becomes truly free
APNG specification published — animated PNG with full alpha support
Google releases WebP based on VP8 intra-frame coding
WebP gains lossless mode and alpha channel support
Fabrice Bellard creates BPG (Better Portable Graphics) based on H.265
HEIF/HEIC standardized (ISO/IEC 23008-12) using HEVC compression
Apple adopts HEIC as default iPhone photo format (iOS 11)
AV1 codec finalized — AVIF image format based on it begins development
AVIF 1.0 specification published by Alliance for Open Media
Chrome 85 adds AVIF support; Safari 14 adds WebP support
WebP reaches practical universality with all major browsers supporting it
JPEG XL (ISO/IEC 18181) Part 1 published — designed as universal replacement
Safari 16 adds AVIF support; JPEG XL support follows in Safari 17
Chrome removes JPEG XL flag (Chrome 110) — controversial decision
AVIF reaches 90% global browser support; WebP at 97%
AVIF reaches 93% support; JPEG XL remains Safari/Firefox-flag only
AVIF at 95%, WebP at 98%; AVIF+WebP covers 99.5% of users
Video Codec Timeline (1993-2026)
Video codec history is dominated by two forces: the MPEG standardization process (which produced H.261, H.262/MPEG-2, H.264, H.265, H.266) and the royalty-free movement (VP8, VP9, AV1). The patent complexity of H.265 was the catalyst that created the Alliance for Open Media and AV1. Hardware decoder support, not just software, determines real-world adoption.
MPEG-1 (VCD) — first practical video compression standard (1.5 Mbps)
MPEG-2 (DVD, broadcast) — basis for digital TV worldwide
DivX/XviD bring MPEG-4 Part 2 to consumer video sharing
H.264/AVC standardized — 2x improvement over MPEG-2; enables HD streaming
YouTube launches using Flash Video (FLV) with Sorenson Spark codec
Apple ProRes introduced for professional video editing workflows
Google acquires On2 Technologies — VP8 codec becomes the basis for WebM
WebM container and VP8 codec released as open/royalty-free alternatives
YouTube begins VP8 encoding for WebM playback in Chrome/Firefox
H.265/HEVC standardized — 50% improvement over H.264 but patent chaos begins
VP9 released by Google — royalty-free competitor to H.265
Alliance for Open Media (AOM) founded to develop AV1 codec
AV1 bitstream specification frozen — royalty-free, 50% better than H.264
Netflix begins AV1 streaming on Android devices
YouTube begins serving AV1 to capable devices; hardware decode chips ship
H.266/VVC standardized — 50% improvement over H.265, patent pools forming
Intel SVT-AV1 encoder reaches production quality — practical real-time encoding
NVIDIA RTX 40-series ships with hardware AV1 encoding (NVENC)
AMD, Intel, NVIDIA all support AV1 hardware encoding in consumer GPUs
AV1 becomes most-watched codec on YouTube by bitrate-hours
SVT-AV1 2.0 released — 30% faster encoding with quality improvements
AV2 development begins at AOM; H.266/VVC hardware decoders shipping
Audio Format Timeline (1988-2026)
Audio format history is inextricable from the music industry's digitization. MP3 enabled Napster and the iPod. AAC enabled iTunes. FLAC and Opus represent the current endpoint: one lossless and one lossy format that are both open-source, royalty-free, and technically superior to all proprietary alternatives. The streaming era (Spotify, Apple Music) has made format choice a backend decision invisible to most consumers.
AIFF released by Apple for professional audio on Macintosh
WAV format introduced by Microsoft/IBM for Windows audio
MP3 (MPEG-1 Layer III) standardized — enables digital music revolution
AAC standardized (MPEG-2 Part 7) — designed as MP3 successor
Napster launches — MP3 file sharing transforms music industry
WMA released by Microsoft to compete with MP3 and AAC
Ogg Vorbis released — first royalty-free alternative to MP3
FLAC released — open-source lossless audio codec
Apple launches iPod with MP3 and AAC support
iTunes Music Store launches — AAC becomes mainstream
Apple releases ALAC — proprietary lossless audio for iPod/iTunes
iTunes goes DRM-free — all music sold as 256 kbps AAC
Apple open-sources ALAC codec — becomes royalty-free
Opus standardized (RFC 6716) — best lossy codec at every bitrate
Opus adopted by WebRTC — becomes default VoIP codec in browsers
Tidal launches with FLAC lossless streaming (HiFi tier)
All MP3 patents expire — format becomes fully royalty-free worldwide
Amazon Music HD launches with FLAC up to 24-bit/192 kHz
Apple Music adds lossless (ALAC) and Spatial Audio (Atmos) at no extra cost
Spotify announces HiFi lossless tier (repeatedly delayed)
YouTube Music adds 256 kbps AAC — up from 128 kbps on free tier
Opus 1.5 released with improved speech quality and ML-based enhancement
FLAC supported natively by 92% of browsers; Opus by 97%
Key Finding
The single most important trend in format history is the shift from proprietary, patent-encumbered formats to open, royalty-free alternatives. AV1 (video), Opus (audio), AVIF (image), and FLAC (lossless audio) are all open-source and free of royalties.
This trend was driven by patent licensing complexity (H.265 has three separate patent pools) and the market power of AOM members (Google, Apple, Netflix, Amazon, Microsoft).
JPEG XL Lossless Recompression: The 20% Solution
How JPEG XL can save petabytes without touching quality
JPEG XL's most unique feature is lossless JPEG recompression: it can take any existing JPEG file and recompress it to approximately 20% smaller while preserving the exact decoded pixel values. The original JPEG can be perfectly reconstructed from the JPEG XL file, byte for byte. This is not re-encoding; it is a mathematically lossless transformation of the JPEG bitstream.
The implications are enormous. There are hundreds of billions of JPEG files on the internet and in archives worldwide. If every JPEG were converted to JPEG XL lossless, it would save approximately 20% of the total storage — potentially petabytes of data — without any quality change. The transformation is bidirectional: you can convert back to the original JPEG at any time.
JPEG XL Lossless Recompression Savings by Image Category
| Image Category | JPEG (KB) | JXL (KB) | Savings % |
|---|---|---|---|
| DSLR photos (12-24 MP) | 4200 | 3360 | 20 |
| Smartphone photos (12 MP) | 2800 | 2268 | 19 |
| Web thumbnails (200x200) | 15 | 12 | 18.5 |
| Social media (1080x1080) | 180 | 144 | 20.2 |
| Medical scans (CT/MRI) | 850 | 663 | 22 |
| Satellite imagery | 6500 | 5135 | 21 |
| Scanned documents | 320 | 250 | 21.8 |
| Average across all categories | 2124 | 1690 | 20.4 |
Medical scans show the highest savings (22%) because medical JPEGs tend to use higher quality settings and contain structured patterns that JPEG XL's entropy coder handles more efficiently than JPEG's Huffman coding. Web thumbnails show the lowest savings (18.5%) because their already-small size limits the improvement from better entropy coding.
PDF Version History and Feature Matrix
From PDF 1.0 (1993) to PDF 2.0 (2020)
PDF has evolved through nine major versions over 27 years. Each version added significant capabilities while maintaining backward compatibility. Understanding which version a PDF was created in helps explain which features it supports and which tools can open it reliably.
PDF Version Feature Matrix
| Version | Year | Key Features Added | Encryption |
|---|---|---|---|
| PDF 1.0 | 1993 | Basic text, images, links | None |
| PDF 1.1 | 1994 | Device-independent color | 40-bit RC4 |
| PDF 1.2 | 1996 | Interactive forms, Unicode | 40-bit RC4 |
| PDF 1.3 | 2000 | JavaScript, digital signatures | 40-bit RC4 |
| PDF 1.4 | 2001 | Transparency, JBIG2 | 128-bit RC4 |
| PDF 1.5 | 2003 | Object streams, JPEG 2000 | 128-bit RC4 |
| PDF 1.6 | 2004 | OpenType font embedding, AES | AES-128 |
| PDF 1.7 | 2008 | ISO 32000-1, 3D annotations | AES-128 |
| PDF 2.0 | 2020 | ISO 32000-2, page-level intents, Associated Files | AES-256 |
The most widely used PDF versions in 2026 are PDF 1.4 (transparency, most compatible), PDF 1.7 (ISO standard, fully featured), and PDF/A-2 (archival). PDF 2.0 adoption is growing but remains limited because older viewers do not support its new features. For maximum compatibility, generate PDF 1.4 or 1.7. For archival, generate PDF/A-2b or PDF/A-3b.
PDF Size Optimization Strategies
A typical 10 MB PDF can often be reduced to 1-3 MB without visible quality loss. The key strategies, in order of impact:
1. Compress embedded images. Most PDF bloat comes from uncompressed or over-quality images. Re-encoding images to JPEG quality 75-85 or JPEG 2000 at equivalent quality typically reduces PDF size by 50-80%. Tools: Ghostscript, qpdf, pdfsizeopt.
2. Subset fonts. A PDF embedding a full font (2,000+ glyphs, 200+ KB per font) should include only the glyphs actually used in the document. Font subsetting can reduce font data by 80-90%. Most modern PDF generators do this automatically, but older tools often embed full fonts.
3. Remove duplicate objects. PDFs edited multiple times accumulate duplicate image objects, unused pages, and orphaned resources. PDF linearization (web optimization) removes duplicates and reorders objects for streaming delivery.
4. Remove metadata. PDFs may contain extensive XMP metadata, thumbnails, bookmarks, and JavaScript that are not needed for the final document. Stripping unnecessary metadata saves 5-50 KB depending on the document.
Code Example: CSV with UTF-8 BOM for Excel
The most common CSV problem is encoding. Excel on Windows opens CSV files as Windows-1252 by default, mangling non-ASCII characters. The solution is to add a UTF-8 BOM (Byte Order Mark) at the beginning of the file.
// Node.js: Write CSV with UTF-8 BOM for Excel compatibility
const fs = require('fs');
const BOM = '\uFEFF';
const csv = BOM + 'Name,City,Price\n"Jean-Luc","Montreal","$1,234"\n';
fs.writeFileSync('data.csv', csv, 'utf8');
Common Mistakes and How to Fix Them
21 common format mistakes across all categories
After documenting 100+ formats, certain mistakes appear again and again across different teams, industries, and skill levels. Each mistake below includes the impact on performance, quality, or security, along with the correct approach.
3D Format Mistakes
Using OBJ for web 3D content
Impact: No animation, no PBR materials, large text-based files
Fix: Use glTF/GLB — supports PBR, animation, and Draco compression (90% smaller)
Audio Format Mistakes
Converting MP3 to FLAC to "improve quality"
Impact: Larger file with same lossy quality — lost data cannot be recovered
Fix: Keep original lossless source files; create lossy versions from them
Using 320 kbps CBR MP3 when Opus 128 kbps is transparent
Impact: 2.5x larger file with no audible benefit
Fix: Use Opus at 96-128 kbps for transparent quality; MP3 320 only for legacy compatibility
Compression Format Mistakes
Compressing already-compressed files (JPEG, MP4, ZIP)
Impact: Minimal or zero size reduction; wasted CPU time
Fix: Only compress compressible formats (text, raw data, uncompressed images)
Using GZIP for large backups when ZSTD is available
Impact: 5x slower compression with worse ratio
Fix: Use zstd for backups: 5x faster compression, 15% better ratio than gzip
Data Format Mistakes
Storing dates in ambiguous formats (01/02/03) in CSV
Impact: Dates interpreted differently in US (Jan 2) vs EU (Feb 1) locales
Fix: Always use ISO 8601: YYYY-MM-DD (2026-04-14)
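In JavaScript, Date.prototype.toISOString emits ISO 8601 directly, so there is no reason to hand-format locale-dependent dates. A quick sketch:

```javascript
// Node.js/JavaScript: emit ISO 8601 dates, unambiguous in every locale
const d = new Date(Date.UTC(2026, 3, 14)); // months are 0-based: 3 = April
console.log(d.toISOString());              // "2026-04-14T00:00:00.000Z"
console.log(d.toISOString().slice(0, 10)); // "2026-04-14", safe for CSV columns
```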
Not specifying encoding in CSV files
Impact: Non-ASCII characters garbled when opened in Excel
Fix: Add UTF-8 BOM (EF BB BF) at file start for Excel compatibility
Using YAML for data exchange between systems
Impact: YAML type inference causes bugs (NO = false, 1.0 = float)
Fix: Use JSON for machine-to-machine data exchange; YAML only for human-edited config
Document Format Mistakes
Creating accessible PDFs as an afterthought
Impact: Screen readers cannot determine reading order or structure
Fix: Author documents with proper heading structure; tag tree in PDF; validate with PAC checker
Encoding Format Mistakes
Storing passwords in Base64 thinking it is encryption
Impact: Zero security — anyone can decode Base64 in milliseconds
Fix: Use bcrypt, Argon2, or scrypt for password hashing; AES-256 for encryption
Not specifying charset in HTTP Content-Type headers
Impact: Browser may guess wrong encoding, causing mojibake
Fix: Always include charset: Content-Type: text/html; charset=utf-8
Font Format Mistakes
Loading 10+ font weights as separate static files
Impact: 400+ KB total payload; render-blocking; layout shift
Fix: Use a single variable font file (80-120 KB); or limit to 2-3 weights
Serving TTF/OTF instead of WOFF2 on the web
Impact: Files 2-3x larger than necessary; slower page loads
Fix: Convert to WOFF2 using woff2_compress; serve with @font-face format("woff2")
Image Format Mistakes
Using PNG for photographs
Impact: Files 5-10x larger than JPEG/WebP with no visible quality benefit
Fix: Use AVIF or WebP for photos; PNG only for graphics needing pixel-perfect lossless or transparency
Exporting JPEG at quality 100
Impact: Files 60-80% larger than quality 95 with zero perceptible improvement
Fix: Use quality 80-85 for web, 90-95 for high quality; never 100
Serving original camera photos (3000x4000, 5MB+)
Impact: Massive page weight, slow load times, wasted bandwidth
Fix: Resize to display dimensions, compress to AVIF/WebP, use srcset for responsive images
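The AVIF/WebP-with-fallback pattern is usually expressed as a picture element. A minimal Node.js helper that emits one (naming and the path-without-extension convention are assumptions for illustration; real code should HTML-escape the alt text):

```javascript
// Build a <picture> element string that lets the browser pick the best
// format it supports: AVIF first, then WebP, then JPEG as the universal
// fallback. `base` is the image path without its extension.
function pictureTag(base, alt) {
  return [
    '<picture>',
    `  <source srcset="${base}.avif" type="image/avif">`,
    `  <source srcset="${base}.webp" type="image/webp">`,
    `  <img src="${base}.jpg" alt="${alt}" loading="lazy">`,
    '</picture>',
  ].join('\n');
}

console.log(pictureTag('/img/hero', 'Hero photo'));
```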
Not stripping EXIF/GPS metadata before publishing
Impact: Privacy leak — GPS coordinates reveal exact photo location
Fix: Strip EXIF metadata server-side or use tools like exiftool before upload
Using GIF for animations
Impact: 10-30x larger than equivalent MP4 or animated WebP
Fix: Use <video autoplay muted loop> with MP4 for GIF-like behavior; animated WebP for transparency
Video Format Mistakes
Encoding at higher resolution than source
Impact: Larger file with no quality improvement; may add upscaling artifacts
Fix: Always encode at source resolution or lower; never upscale before encoding
Using CRF 0 or very low CRF values
Impact: Enormous files; the near-lossless quality is imperceptible and the bitrate is unusable for streaming
Fix: Use CRF 23-28 for H.264, CRF 30-35 for AV1 for web delivery
Confusing container and codec
Impact: Renaming .avi to .mp4 does not convert the video
Fix: Use ffmpeg to properly transcode: ffmpeg -i input.avi -c:v libx264 output.mp4
Key Finding
The most common mistake across all categories is applying lossy formats where lossless is needed, or vice versa. Always match the compression type to the content and workflow.
Keep lossless originals. Create lossy versions from the originals for distribution. Never re-encode lossy to lossy.
File Format Security Issues
15 known vulnerabilities and mitigations
File formats are attack vectors. SVGs can contain JavaScript (XSS), ZIPs can contain path traversal attacks, XML can trigger billion-laughs DoS attacks, and YAML can execute arbitrary code. Understanding these risks is essential for any application that processes user-uploaded files.
File Format Security Vulnerabilities
15 rows
| Format | Vulnerability | Severity | Mitigation |
|---|---|---|---|
| JPEG | EXIF GPS location leak | Medium | Strip EXIF metadata server-side before serving user uploads |
| JPEG | Steganography — hidden data in DCT coefficients | Low | Re-encode images to destroy hidden data |
| PNG | Decompression bomb — small file decompresses to GB of pixels | High | Validate image dimensions before decompression; set max pixel limits |
| SVG | XSS via embedded JavaScript in SVG | Critical | Sanitize with DOMPurify; serve user SVGs with Content-Security-Policy |
| SVG | SSRF via external entity references | High | Disable external resource loading; convert user SVGs to raster |
| PDF | JavaScript execution in PDF viewers | High | Disable JavaScript in PDF reader settings; use PDF/A, which prohibits JS |
| PDF | Launch action — PDF can open external programs | Critical | Modern readers prompt before executing; disable in enterprise policy |
| ZIP | Zip bomb — nested ZIPs decompress to petabytes | High | Limit decompression ratio and total extracted size |
| ZIP | Path traversal — filenames with ../../ can escape extraction directory | Critical | Sanitize extracted filenames; reject paths with .. components |
| XML | Billion laughs attack — exponential entity expansion | Critical | Disable DTD processing; limit entity expansion depth |
| XML | XXE (XML External Entity) — read local files or SSRF | Critical | Disable external entity loading in XML parser configuration |
| YAML | Code execution via !!python/object constructor | Critical | Use safe_load() instead of load(); never parse untrusted YAML with full loader |
| CSV | Formula injection — cells starting with = execute in Excel | Medium | Prefix cells starting with =, +, -, @ with single quote or tab character |
| DOCX | VBA macro malware — malicious macros in DOCM files | High | Disable macro execution; use Group Policy to block macros from internet files |
| WOFF2 | Font parsing buffer overflow | Medium | Keep browsers updated; browsers sandbox font rendering |
Format Conversion Reference
16 common conversions with tools and commands
Converting between formats is one of the most common operations in any media workflow. The table below provides the recommended tool and command for 16 common format conversions, covering images, video, audio, data, documents, and 3D models. Each entry includes practical notes about quality settings and best practices.
Format Conversion Guide
16 rows
| From | To | Tool | Command | Notes |
|---|---|---|---|---|
| JPEG | WebP | cwebp, sharp, Squoosh | cwebp -q 82 input.jpg -o output.webp | Quality 82 roughly matches JPEG 85 |
| JPEG | AVIF | avifenc, sharp, Squoosh | avifenc --min 20 --max 35 input.jpg output.avif | Use --speed 4-6 for balance |
| PNG | WebP | cwebp, sharp | cwebp -lossless input.png -o output.webp | Use -lossless for transparency |
| PNG | AVIF | avifenc, sharp | avifenc --lossless input.png output.avif | AVIF lossless is 20-30% smaller than PNG |
| HEIC | JPEG | heif-convert, ImageMagick | heif-convert input.heic output.jpg | Quality loss from re-encoding lossy format |
| SVG | PNG | Inkscape, sharp, puppeteer | inkscape input.svg -w 1024 -o output.png | Specify width/height for rasterization |
| MP4 (H.264) | MP4 (AV1) | ffmpeg + SVT-AV1 | ffmpeg -i input.mp4 -c:v libsvtav1 -crf 30 output.mp4 | CRF 28-35 for web streaming |
| WAV | FLAC | flac, ffmpeg | flac --best input.wav -o output.flac | Perfectly lossless conversion |
| WAV | Opus | opusenc, ffmpeg | opusenc --bitrate 128 input.wav output.opus | 128 kbps is transparent for most content |
| FLAC | MP3 | lame, ffmpeg | ffmpeg -i input.flac -c:a libmp3lame -q:a 0 output.mp3 | -q:a 0 = V0 VBR (~245 kbps) |
| JSON | CSV | jq, pandas, csvkit | jq -r '.[] \| [.name,.age] \| @csv' data.json > data.csv | Select fields explicitly; nested structures must be flattened |
| CSV | Parquet | DuckDB, pandas, Spark | duckdb -c "COPY (SELECT * FROM 'data.csv') TO 'data.parquet'" | 60-80% smaller, much faster queries |
| DOCX | PDF | LibreOffice, pandoc | libreoffice --headless --convert-to pdf input.docx | Best fidelity with LibreOffice |
| Markdown | HTML | pandoc, marked, remark | pandoc input.md -o output.html --standalone | --standalone adds HTML wrapper |
| TTF | WOFF2 | woff2_compress, fonttools | woff2_compress input.ttf | 50-60% size reduction |
| OBJ | GLB | Blender, obj2gltf | obj2gltf -i input.obj -o output.glb | Apply Draco compression afterwards with gltf-transform |
Image Conversion with Node.js (sharp)
The sharp library is the fastest image processing library for Node.js, used by Next.js, Gatsby, and most image optimization pipelines. The code below converts any image to both AVIF and WebP with optimal quality settings.
// Convert images to AVIF and WebP using sharp (Node.js)
const sharp = require('sharp');

async function convertImage(input) {
  await sharp(input)
    .avif({ quality: 80 })
    .toFile(input.replace(/\.[^.]+$/, '.avif'));
  await sharp(input)
    .webp({ quality: 82 })
    .toFile(input.replace(/\.[^.]+$/, '.webp'));
}

Impact on Web Performance
How format choices affect Core Web Vitals
File format choices directly impact Core Web Vitals scores. The Largest Contentful Paint (LCP) metric — which Google uses as a ranking signal — is heavily influenced by image format and size. On a simulated 3G connection, an unoptimized JPEG hero image loads in 4.2 seconds (failing the 2.5s LCP threshold), while the same image in AVIF loads in 1.38 seconds (passing comfortably). JPEG XL achieves the fastest LCP at 1.2 seconds thanks to progressive decoding.
Impact of Image Format on LCP (simulated 3G connection)
7 rows
| Format | Size (KB) | LCP (ms) |
|---|---|---|
| JPEG (unoptimized) | 420 | 4200 |
| JPEG (mozjpeg q80) | 180 | 2100 |
| WebP (q82) | 135 | 1650 |
| AVIF (q80) | 105 | 1380 |
| JPEG XL (q80) | 115 | 1200 |
| PNG (lossless) | 890 | 7800 |
| SVG (icon) | 2 | 180 |
What Makes Web Pages Heavy?
Images account for 42% of median page weight in 2026 (HTTP Archive data). JavaScript is second at 18%. Reducing image weight through modern formats (AVIF, WebP) and proper sizing has the single largest impact on page performance. Fonts at 4% are often overlooked but matter because they are render-blocking.
Web Page Weight Distribution by Resource Type (KB)
Source: OnlineTools4Free Research
Progressive Loading Strategies Compared
Different image formats provide vastly different loading experiences. Progressive JPEG and JPEG XL show a usable preview within 280-350ms, while WebP and AVIF show nothing until fully loaded. Low-quality image placeholders (LQIP) and BlurHash provide instant visual feedback (10-50ms) but require additional implementation. The perceived performance difference is significant: users prefer seeing a blurry preview instantly over waiting 1.6 seconds for a crisp image to appear all at once.
Progressive Image Loading Comparison (3G connection, 200KB image)
9 rows
| Format/Strategy | First Pixel (ms) | Usable Preview (ms) | Full Load (ms) | Strategy |
|---|---|---|---|---|
| Baseline JPEG | 2200 | 2200 | 2200 | Top-to-bottom scan lines |
| Progressive JPEG | 280 | 600 | 2100 | Multiple scans: blur then sharpen |
| JPEG XL (progressive) | 120 | 350 | 1800 | Continuous progressive refinement |
| WebP | 1600 | 1600 | 1600 | No progressive mode — all at once |
| AVIF | 1400 | 1400 | 1400 | No progressive mode — all at once |
| PNG (interlaced) | 800 | 1800 | 4500 | Adam7 interlacing: 7 passes |
| PNG (non-interlaced) | 4500 | 4500 | 4500 | Top-to-bottom scan lines |
| Low-quality placeholder (LQIP) | 50 | 50 | 1600 | 1KB blur + lazy load full image |
| Blurhash | 10 | 10 | 1600 | ~30 byte hash decoded to blur placeholder |
Key Finding
Switching from unoptimized JPEG to AVIF reduces LCP by 67% on 3G connections (4.2s to 1.38s). This single change can move a page from failing to passing Google's Core Web Vitals threshold.
Combine format optimization with responsive images (srcset), lazy loading (loading='lazy'), and priority hints (fetchpriority='high' for hero images).
Color Spaces and Gamut Coverage
Understanding wide-gamut color for modern displays
The transition from sRGB to wider color spaces is one of the most important developments in display technology. sRGB, established in 1996, covers only 35% of the visible color spectrum. Display P3, used by Apple devices since 2015, covers 45.5%. Rec. 2020 (for HDR video) covers 75.8%. AVIF and JPEG XL support these wider color spaces natively, while JPEG and WebP are capped at 8 bits per channel, which in practice limits them to sRGB.
Color Space Gamut Coverage
8 rows
| Color Space | Visible % | Bit Depth | Year | Used By |
|---|---|---|---|---|
| sRGB | 35 | 8 | 1996 | Web standard, most monitors |
| Display P3 | 45.5 | 10 | 2015 | Apple devices, modern screens |
| Adobe RGB | 52.1 | 16 | 1998 | Photography, prepress |
| ProPhoto RGB | 90 | 16 | 2001 | Professional photography |
| Rec. 2020 | 75.8 | 10 | 2012 | HDR video, 4K/8K broadcast |
| Rec. 709 | 35 | 8 | 1990 | HD video, identical to sRGB gamut |
| CMYK (ISO Coated v2) | 28 | 8 | 2006 | Offset printing |
| DCI-P3 | 45.5 | 12 | 2005 | Digital cinema projection |
For web developers, CSS Color Level 4 enables wide-gamut colors via the color() function: color(display-p3 1 0 0) produces a red that is 25% more vivid than rgb(255, 0, 0) on P3-capable displays. This matters for brand colors, product photography, and any content where color accuracy impacts the user experience.
Part 13: Methodology, Raw Data & Sources
Full methodology and downloadable datasets
Try These Tools for Free
Put this knowledge into practice with our browser-based tools. No signup needed.
Format Converter
Convert images between JPG, PNG, WebP, BMP, and more formats with optional resizing.
CSV to JSON
Convert CSV data to JSON and JSON to CSV format online.
JSON to YAML
Convert JSON data to YAML format for configuration files.
JSON to XML
Convert JSON data to XML format and XML back to JSON.
Audio Converter
Convert audio files between MP3, WAV, OGG, AAC, and FLAC formats.
Video to MP4
Convert video files to MP4 format directly in your browser.
Base64
Encode text or files to Base64 and decode Base64 strings back.
Related Research Reports
Image Compression Benchmark 2026: 10 Formats Tested Across 1,000 Images
We tested 10 image formats across 1,000 diverse images at multiple quality levels. See which format delivers the best compression, quality, and browser compatibility in 2026.
PDF Tools Benchmark 2026: 15 Tools Compared on Compression, Speed, and Privacy
We tested 15 PDF tools including iLovePDF, Smallpdf, Adobe Acrobat, and more across compression ratio, processing speed, file size limits, privacy policies, and pricing. See which PDF tool delivers the best results in 2026.
Web Performance Format Guide 2026: Images, Fonts, Scripts, and Core Web Vitals
Complete guide to web asset formats and their impact on Core Web Vitals. Compare images, fonts, scripts, and stylesheets with real performance data from HTTP Archive and Lighthouse benchmarks.
Download Raw Data
All data used in this guide is available for download. The datasets are released under a Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data for any purpose, provided you give appropriate credit.
Citations & Sources
This guide draws on 23 primary sources including ISO standards, IETF RFCs, W3C specifications, academic papers, and official project documentation.
Final Takeaways: The State of File Formats in 2026
After documenting 100+ formats across nine categories, several clear themes emerge that summarize the state of file formats in 2026.
1. The Open-Source Formats Are Winning
The most technically advanced formats in every category are now open-source and royalty-free. AV1 beats H.265 in compression and is free. Opus beats AAC in quality-per-bit and is free. AVIF matches or exceeds HEIC and is free. FLAC dominates lossless audio and is free. glTF is replacing proprietary FBX for 3D content. The patent-encumbered alternatives (H.265, AAC, HEIC) survive through ecosystem lock-in (Apple) rather than technical merit.
2. The Browser is the Universal Format Gateway
Browser support determines format adoption for the web. Chrome adding WebP support in 2012 did not matter until Safari added it in 2020 — only then did WebP become universally usable. Chrome removing JPEG XL support in 2023 effectively killed the format for web use despite its technical superiority. Browser vendors (particularly Google) are kingmakers in the format landscape.
3. Compression is Approaching Physical Limits
The generational improvement in compression efficiency is slowing. H.264 to H.265 was 50%. H.265 to AV1/H.266 is 30-50%. Future codecs will deliver diminishing returns because we are approaching the Shannon entropy limit of natural image and video content. The next big gains will come from AI-based compression (neural codecs), perceptual models trained on human vision, and content-adaptive algorithms — not from better transform coding.
4. Format Proliferation is Actually Decreasing
Counter-intuitively, the number of formats that matter is shrinking. For images: AVIF+WebP+SVG covers 99.5% of use cases. For video: AV1+H.264 in MP4. For audio: Opus+FLAC. For data: JSON+Parquet. For archives: ZIP+ZSTD. For fonts: WOFF2. For 3D: glTF. For text: UTF-8. A decade ago, the recommended format lists were much longer because no single format was good enough for general use.
5. The Best Format is the One You Do Not Have to Think About
The ideal format pipeline is invisible. CDN auto-negotiation serves AVIF to Chrome, WebP to Safari 14-15, and JPEG to ancient browsers — all from a single source image. Next.js Image component handles format selection, sizing, and lazy loading automatically. HTTP Content-Encoding negotiation serves Brotli, ZSTD, or GZIP based on browser support. The best format choice is the one made automatically by your tooling, not manually by your developers.
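The content-negotiation step a CDN performs is simple to sketch: the browser advertises the image formats it supports in the Accept request header, and the server picks the best one. A minimal version (function name is illustrative):

```javascript
// Server-side image format negotiation: inspect the request's Accept
// header and return the most efficient format the browser advertises.
// CDNs like Cloudflare and Cloudinary apply this logic automatically.
function pickImageFormat(acceptHeader) {
  const accept = acceptHeader || '';
  if (accept.includes('image/avif')) return 'avif';
  if (accept.includes('image/webp')) return 'webp';
  return 'jpeg'; // universal fallback
}

console.log(pickImageFormat('image/avif,image/webp,*/*')); // → avif
```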
Key Finding
The ideal format strategy in 2026 is automated, not manual. Use CDN format negotiation for images, adaptive bitrate streaming for video, and HTTP content-encoding for compression. Focus on choosing the right tooling rather than manually converting files.
For sites using Next.js + Vercel, Cloudflare, or Cloudinary, the entire format pipeline is handled automatically. For custom setups, invest in build-time format generation rather than runtime conversion.
Related Tools & Articles
Image Format Converter
Convert between JPEG, PNG, WebP, AVIF
Image Compressor
Compress images without visible quality loss
JSON Formatter
Format, validate, and minify JSON data
Base64 Encoder
Encode and decode Base64 data
CSS Minifier
Minify CSS for production deployment
PDF Compressor
Reduce PDF file sizes
Password Generator
Generate cryptographically secure passwords
Color Picker
Pick and convert colors between formats
Contrast Checker
Check WCAG contrast ratios
