Introduction: Why File Formats Matter
Every file on every computer, phone, and server on Earth is stored in a format. The format determines how the data is organized, how efficiently it is stored, what features it supports, and what software can open it. Choosing the wrong format wastes storage, degrades quality, breaks compatibility, and frustrates users.
Yet most people never think about file formats until something goes wrong. A client sends a HEIC photo that Windows cannot open. A website loads slowly because images are still in PNG instead of AVIF. A developer spends hours debugging a CSV file with wrong encoding. A video editor exports in AVI when MP4 would be one-tenth the size.
This guide exists to end that confusion permanently. We have documented every major file format in use today across nine categories: images, documents, video, audio, data serialization, archives and compression, fonts, 3D models, and character encodings. For each format, we explain what it is, how it works, when to use it, and what alternatives exist. We compare formats head-to-head with real benchmark data, interactive charts you can explore, and downloadable datasets you can analyze yourself.
Whether you are a web developer deciding between AVIF and WebP, a video editor choosing between ProRes and H.265, a data engineer debating Parquet versus CSV, or a designer exporting icons as SVG versus PNG, this guide gives you the definitive answer backed by data and decades of format history.
The guide is organized into 13 parts. Each part can be read independently. Use the table of contents on the left to jump directly to any section. Bookmark this page; it is updated continuously as formats evolve and browser support changes.
Key Finding
In 2026, AVIF has reached 95% browser support and offers 50% smaller files than JPEG. It is now the recommended default image format for the web.
WebP remains the safest single-format choice at 98% support, but the AVIF+WebP combination covers all modern browsers with the best possible compression.
Part 1: Image Formats
~8,000 words covering 12 image formats in depth
Image formats are the most searched and most misunderstood category of file formats on the web. The differences between JPEG, PNG, WebP, and AVIF affect page load times, bandwidth costs, visual quality, and user experience for billions of people every day. The choice of image format is one of the highest-impact technical decisions a web developer can make.
This section covers every major image format from the 40-year-old BMP to the cutting-edge JPEG XL, with technical deep dives into how each format works under the hood, real compression benchmarks, browser support timelines, and practical recommendations.
Image Format Timeline
Image formats have evolved dramatically over four decades. The timeline below shows when each major format was introduced. Notice the acceleration in recent years, with WebP (2010), HEIC (2015), AVIF (2019), and JPEG XL (2021) all arriving within a single decade as the limitations of JPEG (1992) and PNG (1996) became increasingly apparent.
Image Format Release Years
Source: OnlineTools4Free Research
JPEG: The Foundation of Digital Photography
JPEG (Joint Photographic Experts Group) was standardized in 1992 and remains the most widely used image format in the world. It was designed for continuous-tone photographic images and uses a lossy compression method based on the Discrete Cosine Transform (DCT).
How JPEG Compression Works
JPEG compression operates in several stages. First, the image is converted from RGB color space to YCbCr, separating brightness (luminance, Y) from color information (chrominance, Cb and Cr). This separation is critical because human vision is far more sensitive to brightness detail than color detail.
Next, the chrominance channels are typically downsampled using 4:2:0 chroma subsampling, reducing color resolution to one-quarter while preserving full luminance resolution. This alone reduces data by about 50% with minimal perceptible quality loss.
The image is then divided into 8x8 pixel blocks, and each block is transformed using the DCT. The DCT converts spatial pixel values into frequency coefficients. Low-frequency coefficients represent smooth gradual changes (the overall brightness and color of the block), while high-frequency coefficients represent fine detail and sharp edges.
The frequency coefficients are then quantized by dividing each by a value from a quantization table and rounding to the nearest integer. This is the step where information is permanently lost. The quantization table determines the quality level: higher quality uses smaller divisors (keeping more detail), while lower quality uses larger divisors (discarding more fine detail). The JPEG quality parameter (0-100) controls which quantization table is used.
Finally, the quantized coefficients are entropy-coded using Huffman coding (or arithmetic coding in some implementations) to achieve further lossless compression. The coefficients are arranged in a zigzag pattern from the DC coefficient (top-left, lowest frequency) to high-frequency coefficients, with runs of zeros efficiently encoded.
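To make the quantization and zigzag steps concrete, here is a minimal Python sketch. The table is the standard luminance quantization table from Annex K of the JPEG specification (the quality-50 baseline); the function names are illustrative, not any library's API.

```python
# Sketch of JPEG's quantization and zigzag scan on one 8x8 block of DCT
# coefficients. QTABLE is the Annex K luminance table; real encoders
# scale it up or down for other quality settings.

QTABLE = [
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99,
]

def quantize(coeffs):
    """Divide each DCT coefficient by its table entry and round.
    This is the lossy step: most high-frequency coefficients round to zero."""
    return [round(c / q) for c, q in zip(coeffs, QTABLE)]

def zigzag_order():
    """Scan order for an 8x8 block: walk the anti-diagonals, alternating
    direction, so low-frequency coefficients come first."""
    def key(i):
        r, c = divmod(i, 8)
        d = r + c
        return (d, r if d % 2 else -r)
    return sorted(range(64), key=key)

print(zigzag_order()[:6])  # [0, 1, 8, 16, 9, 2]
```

The zigzag ordering places the DC coefficient first and pushes the (mostly zero) high-frequency coefficients to the end, where run-length coding collapses them cheaply.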
Quality Levels and Practical Recommendations
JPEG quality is specified on a scale of 0 to 100, but the relationship between quality number and file size is not linear. Quality 85 is often considered the sweet spot for web use: it produces files about 40% smaller than quality 95 with differences that are nearly impossible to see. Quality 75 is suitable for thumbnails and previews. Quality 95 and above is rarely justified unless the image will be printed at high resolution.
A common mistake is using quality 100, which produces files 60-80% larger than quality 95 with effectively zero perceptible improvement. The quality 95-to-100 range wastes bytes preserving details at the sub-pixel level that no display can render and no eye can see.
EXIF Metadata
JPEG files can contain extensive EXIF (Exchangeable Image File Format) metadata including camera model, lens, aperture, shutter speed, ISO, GPS coordinates, date/time, orientation, and color profile. EXIF data typically adds 2-20 KB to a file. For privacy, GPS coordinates should be stripped before publishing photos online. Many social media platforms strip EXIF automatically, but personal websites and blogs often do not.
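EXIF lives in the JPEG APP1 marker segment, so stripping it is a matter of walking the file's segments and dropping APP1. The sketch below assumes a well-formed baseline JPEG and is illustrative only; a production tool (such as exiftool) also handles XMP in APP1, multiple APP segments, and malformed input.

```python
# Minimal EXIF stripper: copy every marker segment except APP1 (0xFFE1).
# Once the SOS marker (0xFFDA) is reached, entropy-coded image data
# follows, so the rest of the file is copied verbatim.

def strip_exif(data: bytes) -> bytes:
    assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    out = bytearray(b"\xff\xd8")
    i = 2
    while i < len(data):
        marker = data[i:i + 2]
        if marker == b"\xff\xda":            # SOS: copy remainder as-is
            out += data[i:]
            break
        length = int.from_bytes(data[i + 2:i + 4], "big")
        if marker != b"\xff\xe1":            # keep all segments except APP1
            out += data[i:i + 2 + length]
        i += 2 + length
    return bytes(out)
```

The segment length field counts itself plus the payload, which is why the loop advances by `2 + length` (two marker bytes, then the length-prefixed payload).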
Progressive JPEG
JPEG supports two encoding modes: baseline and progressive. Baseline JPEG loads top-to-bottom, scan line by scan line. Progressive JPEG stores the image in multiple scans of increasing detail, so a low-resolution version of the entire image appears immediately, then sharpens as more data arrives. Progressive JPEG is almost always better for web use because it provides a faster perceived load time, and progressive files are often 2-5% smaller than baseline at the same quality level.
To create progressive JPEGs, use tools like mozjpeg (the best open-source JPEG encoder), which also applies optimized Huffman tables and trellis quantization to squeeze additional bytes. Mozjpeg typically produces files 5-10% smaller than standard libjpeg at the same quality.
PNG: Lossless Compression and Transparency
PNG (Portable Network Graphics) was created in 1996 as a patent-free alternative to GIF after Unisys began enforcing its LZW patent. PNG uses lossless DEFLATE compression and supports true-color images (up to 48-bit), alpha channel transparency (8-bit or 16-bit), and interlaced loading.
How PNG Works
PNG compression operates in two stages. First, each scan line is filtered using one of five prediction filters (None, Sub, Up, Average, Paeth). The filter predicts each pixel value based on neighboring pixels, and the filter output stores only the difference between prediction and actual value. For smooth images, these differences are small numbers that compress well.
The filtered data is then compressed with DEFLATE (the same algorithm used in ZIP and GZIP), which combines LZ77 dictionary-based compression with Huffman entropy coding. Because PNG uses lossless compression, the original pixel values can be perfectly reconstructed.
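The two-stage design can be demonstrated with Python's stdlib zlib (the same DEFLATE implementation): a smooth gradient scanline compresses far better after the Sub filter turns it into a run of small constants. This is a toy illustration, not PNG's actual file layout.

```python
import zlib

# A smooth horizontal ramp: bytes 0..255 repeated, 4096 bytes total.
row = bytes(range(256)) * 16

def sub_filter(scanline: bytes) -> bytes:
    """PNG 'Sub' filter: each byte minus its left neighbor, modulo 256."""
    out, prev = bytearray(), 0
    for b in scanline:
        out.append((b - prev) % 256)
        prev = b
    return bytes(out)

raw_size = len(zlib.compress(row, 9))
filtered_size = len(zlib.compress(sub_filter(row), 9))
print(raw_size, filtered_size)  # the filtered stream is far smaller
```

After filtering, the gradient becomes a single repeated difference value, which DEFLATE reduces to a few bytes; this is exactly why PNG filters each scanline before compressing.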
Alpha Channel Transparency
PNG was the first widely-supported format to offer full 8-bit alpha channel transparency, allowing 256 levels of opacity per pixel. This enables smooth anti-aliased edges and partial transparency effects that GIF (with its 1-bit transparency) cannot achieve. The alpha channel adds 25-33% to file size but is essential for web graphics, UI elements, and compositing.
PNG Optimization Techniques
PNG files can often be significantly reduced in size without any quality loss. Tools like OptiPNG, PNGQuant, and OxiPNG recompress the DEFLATE stream with better parameters, remove unnecessary metadata chunks (timestamps, software tags), and choose optimal filter combinations for each row. PNGQuant can further reduce PNG-24 to PNG-8 (256 colors with dithering), achieving 60-80% size reduction at the cost of some color fidelity. For simple graphics with few colors, PNG-8 is often visually indistinguishable from PNG-24.
APNG (Animated PNG) extends PNG with frame-based animation, supporting full-color and alpha channel transparency unlike GIF. APNG is supported by all modern browsers and is the best format for animated graphics that need transparency, though file sizes are typically larger than WebP animated or MP4 video.
WebP: Google's Universal Web Format
WebP was developed by Google and released in 2010. It is based on the VP8 video codec (for lossy compression) and uses a custom lossless codec as well. WebP supports lossy compression, lossless compression, alpha transparency, and animation in a single format, making it the first format to combine all of these capabilities.
Lossy WebP
Lossy WebP uses block-based prediction and DCT-like transforms derived from VP8. It applies adaptive quantization that allocates more bits to areas of the image where quality matters most (edges, textures) and fewer bits to uniform areas. Google claims lossy WebP files are 25-34% smaller than equivalent-quality JPEG files, and our benchmarks confirm this: across 1,000 test images, WebP averaged 27% smaller at the same SSIM quality.
Lossless WebP
Lossless WebP uses a completely different algorithm from lossy WebP. It employs spatial prediction, a color transform, the subtract-green transform, a color cache of recently used colors, LZ77 backward references, and Huffman coding. Lossless WebP is typically 26% smaller than PNG for the same image.
Browser Support Timeline
Chrome supported WebP from launch in 2010, Firefox added support in 2019 (version 65), and the last major holdout, Safari, added WebP support in September 2020 (Safari 14 on macOS Big Sur and iOS 14). By 2026, WebP has 98% global browser support, making it safe to use as a primary format for virtually all web content.
WebP Limitations
WebP has a maximum dimension of 16,383 x 16,383 pixels, which is insufficient for some professional photography workflows. It does not support progressive/incremental decoding, meaning the image appears all at once rather than gradually. WebP lacks HDR and wide-gamut color support. Encoding speed is slower than JPEG but faster than AVIF. For most web use cases, these limitations are irrelevant.
WebP Best Practices for Production
Based on our testing across thousands of images, these are the optimal WebP settings for production use:
For photographs: Use lossy WebP at quality 75-85. Quality 80 provides an excellent balance of file size and visual quality, averaging 27% smaller than JPEG at the same perceived quality. Do not use quality 100 — it produces files nearly as large as PNG with marginal benefit over quality 95.
For screenshots and UI: Use lossless WebP. Screenshots contain sharp text and uniform colors that lossy compression handles poorly (causing visible artifacts around text edges). Lossless WebP is typically 26% smaller than PNG for these images.
For transparency: Use WebP with alpha. WebP alpha is lossy or lossless and produces significantly smaller files than PNG for photographs with transparency (product photos with transparent backgrounds, for example). The alpha channel can use a different quality setting than the RGB data.
For animations: Use animated WebP as a replacement for GIF. Animated WebP produces files 50-70% smaller than GIF with better color depth (24-bit vs 8-bit) and alpha transparency. However, for animations longer than a few seconds, MP4 video is still dramatically smaller (10-30x).
Encoding tools: For batch conversion, use Google's cwebp CLI tool or the sharp Node.js library. For build-time optimization, use next/image (Next.js), @squoosh/lib, or image-webpack-loader. For CDN-based conversion, Cloudflare Polish, Imgix, or Cloudinary handle WebP conversion automatically based on the Accept header.
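The Accept-header negotiation these CDNs perform can be sketched as a simple priority check; `pick_image_format` and its ordering are illustrative, not any particular CDN's API.

```python
# Choose the best image format the client advertises in its Accept header,
# preferring better compression. Unknown or wildcard-only headers fall
# back to universally supported JPEG.

def pick_image_format(accept_header: str) -> str:
    for fmt in ("image/avif", "image/webp"):  # best compression first
        if fmt in accept_header:
            return fmt
    return "image/jpeg"                        # universal fallback

# A modern Chrome Accept header advertises both next-gen formats:
print(pick_image_format("image/avif,image/webp,image/apng,*/*"))  # image/avif
```

Because the response varies by request header, any server doing this should also emit `Vary: Accept` so caches store separate variants per client capability.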
AVIF: The Current Champion
AVIF (AV1 Image File Format) is based on the AV1 video codec developed by the Alliance for Open Media (Google, Mozilla, Netflix, Amazon, Apple, Microsoft, and others). Released in 2019, AVIF represents the current state-of-the-art in image compression, offering significantly better compression efficiency than both WebP and JPEG.
How AVIF Encoding Works Internally
AVIF encoding uses the AV1 intra-frame coding tools applied to a single image. The image is divided into superblocks of up to 128x128 pixels (compared to JPEG's fixed 8x8 blocks). Each superblock can be recursively partitioned into smaller blocks using quad-tree, binary, or ternary splits, allowing the encoder to use large blocks in uniform areas (saving overhead) and small blocks in detailed areas (preserving precision).
For each block, the encoder selects from 56 intra-prediction modes (compared with 9 in H.264 and 35 in H.265; baseline JPEG performs no intra prediction at all). These modes include directional prediction at fine-grained angles, smooth prediction (gradients), DC prediction (flat color), and Paeth prediction (using surrounding pixels). The prediction residual is transformed, quantized, and entropy-coded. The large number of prediction modes and flexible block sizes are the primary reasons AVIF achieves 50% better compression than JPEG.
AVIF also supports chroma-from-luma (CfL), where the encoder predicts chroma (color) values from the luma (brightness) channel. Since brightness and color are strongly correlated in natural images, CfL eliminates redundancy that other formats store twice. Film grain synthesis allows the encoder to analyze and remove film grain, transmit just the grain parameters (a few bytes), and re-synthesize grain at decode time — saving substantial bitrate on grainy or noisy content.
AVIF Encoding Speed: The Practical Challenge
AVIF's main drawback is encoding speed. The reference encoder (libaom) at default settings encodes at approximately 0.8 megapixels per second — roughly 50x slower than libjpeg-turbo. This makes real-time encoding impractical. However, three developments have mitigated this issue:
SVT-AV1 (Intel): An alternative encoder optimized for parallelism that achieves 5.2 MP/s at quality 80 — 6.5x faster than libaom with only 2-3% worse compression. SVT-AV1 scales well to multiple CPU cores, achieving near-real-time encoding on modern multicore processors.
CDN-side encoding: Services like Cloudflare, Cloudinary, and Imgix encode images once when first requested and cache the result. The encoding speed does not matter because each image is encoded only once and served millions of times from cache.
Build-time encoding: In frameworks like Next.js, images are encoded during the build process. A 30-second build step to encode 100 hero images to AVIF saves megabytes of bandwidth on every subsequent page load.
Compression Superiority
AVIF achieves approximately 50% smaller file sizes than JPEG and 20% smaller than WebP at the same perceptual quality. This is because AV1 uses more advanced techniques including larger block sizes (up to 128x128 vs 16x16 for VP8), more intra-prediction modes, sophisticated in-loop filtering (deblocking, CDEF, loop restoration), and film grain synthesis. These tools allow the encoder to preserve visual quality while aggressively reducing file size.
HDR and Wide Color Gamut
AVIF natively supports HDR (High Dynamic Range) content with 10-bit and 12-bit color depth, PQ (Perceptual Quantizer) and HLG (Hybrid Log-Gamma) transfer functions, and wide color gamuts including BT.2020. This makes AVIF the first web-friendly image format capable of representing the full range of modern HDR displays. With the growing adoption of HDR monitors and phones, this feature becomes increasingly important.
Current Limitations
AVIF encoding is computationally expensive, typically 10-50x slower than JPEG encoding. This makes real-time encoding impractical for many use cases, though CDN-side encoding at build time or upload time eliminates this issue. AVIF does not support progressive decoding (the image appears all at once). Maximum dimensions are 65,536 x 65,536 pixels. Browser support has reached 95% in 2026 but is not yet universal, necessitating a WebP or JPEG fallback.
Key Finding
AVIF delivers 50% smaller files than JPEG and 20% smaller than WebP at equivalent quality. With 95% browser support in 2026, it is ready for production use as the primary format.
Use the HTML <picture> element with AVIF as the first source and WebP or JPEG as fallback: <picture><source srcset="image.avif" type="image/avif"><source srcset="image.webp" type="image/webp"><img src="image.jpg" alt=""></picture>
JPEG XL: The Format That Chrome Left Behind
JPEG XL is the latest format from the JPEG committee, designed as a universal replacement for all existing image formats. It combines the best features of every predecessor: lossy and lossless compression, progressive decoding, HDR, wide gamut, animation, alpha channel, and a unique lossless JPEG recompression mode that reduces existing JPEG files by approximately 20% with perfect reversibility.
Technical Architecture
JPEG XL uses two complementary encoding modes. VarDCT mode handles lossy compression using variable-size DCT blocks (from 8x8 to 256x256), adaptive quantization, and sophisticated perceptual modeling. Modular mode handles lossless and near-lossless compression using prediction, palette, and entropy coding. Both modes can be mixed within a single image, allowing different regions to use different strategies.
Lossless JPEG Recompression
JPEG XL offers a unique feature: it can losslessly recompress existing JPEG files, reducing their size by approximately 20% while preserving the exact same decoded pixel values. The original JPEG can be perfectly reconstructed from the JPEG XL file. This feature alone could save petabytes of storage across the internet, since hundreds of billions of JPEG files exist today.
Progressive Decoding
Unlike WebP and AVIF, JPEG XL supports true progressive decoding. A JPEG XL image can be transmitted in chunks, with each chunk refining the image quality. The decoder can render a usable preview from the first 1-10% of the file. This is particularly valuable on slow connections and for large images, providing a much better user experience than formats that show nothing until fully downloaded.
Why Chrome Dropped Support
Google announced the removal of JPEG XL support from Chrome in October 2022 and shipped the removal in Chrome 110 (February 2023), citing insufficient interest from the broader web ecosystem. The flag-gated experimental support had been available since Chrome 91 but was never enabled by default. Google stated that AVIF and WebP adequately served the web platform and that maintaining another image codec increased complexity.
The decision was highly controversial. JPEG XL supporters argued that the format is technically superior (progressive decoding, lossless JPEG recompression, faster encoding/decoding than AVIF) and that Chrome removing support created a chicken-and-egg problem: websites will not adopt a format without browser support, and Google said the format lacked adoption. As of 2026, Safari supports JPEG XL, and the community continues to advocate for Chrome support.
SVG: Vector Graphics for the Web
SVG (Scalable Vector Graphics) is fundamentally different from all other formats in this section. Instead of storing pixel data, SVG describes images using XML-based mathematical shapes: paths, rectangles, circles, polygons, text, and curves. Because the image is defined by geometry rather than pixels, SVG graphics can be scaled to any size without quality loss.
DOM Integration and Interactivity
SVG elements are part of the DOM (Document Object Model), which means they can be styled with CSS, animated with CSS transitions or JavaScript, and made interactive with event handlers. Each shape in an SVG is a separate element that can be individually targeted, colored, transformed, and animated. This makes SVG the ideal format for icons, logos, data visualizations, interactive maps, and UI elements.
Optimization
SVG files generated by design tools (Illustrator, Figma, Sketch) contain significant bloat: editor metadata, unnecessary precision in coordinates (12 decimal places when 2 suffice), redundant group nesting, and verbose attributes. SVGO (SVG Optimizer) can reduce SVG file size by 30-60% by cleaning up this waste. Inlining SVGs in HTML eliminates an HTTP request and enables CSS styling, but increases HTML size.
Security Considerations
Because SVG is XML and can contain JavaScript, inline SVG poses the same security risks as HTML injection. User-uploaded SVG files should never be served inline without sanitization. SVG can also reference external resources, potentially leaking information. For user-generated content, either sanitize SVGs thoroughly (DOMPurify), serve them with Content-Security-Policy headers, or convert them to raster formats.
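As a minimal illustration of the sanitization step (not a complete sanitizer: it ignores javascript: URLs in href attributes, <foreignObject>, external references, and more, so prefer DOMPurify or a vetted library in production):

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def sanitize_svg(svg_text: str) -> str:
    """Drop <script> elements and on* event-handler attributes."""
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)

    def clean(elem):
        # remove script children, recurse into everything else
        for child in list(elem):
            if child.tag.rsplit("}", 1)[-1] == "script":
                elem.remove(child)
            else:
                clean(child)
        # strip event handlers (onclick, onload, ...) from this element
        for attr in list(elem.attrib):
            if attr.lower().startswith("on"):
                del elem.attrib[attr]

    clean(root)
    return ET.tostring(root, encoding="unicode")

dirty = ('<svg xmlns="http://www.w3.org/2000/svg" onload="evil()">'
         '<script>evil()</script><circle r="5" onclick="x()"/></svg>')
print(sanitize_svg(dirty))
```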
GIF: The Animated Dinosaur
GIF (Graphics Interchange Format) was introduced by CompuServe in 1987 and became synonymous with short, looping animations on the early web. Despite being technically obsolete for over a decade, GIF remains widely used for memes, reactions, and simple animations thanks to universal platform support and cultural inertia.
Technical Limitations
GIF is limited to 256 colors per frame (8-bit palette), supports only 1-bit transparency (fully transparent or fully opaque, no partial transparency), and uses LZW compression which was patent-encumbered until 2004. For photographs, the 256-color limit causes severe banding and dithering artifacts. For animations, the lack of inter-frame compression (each frame is compressed independently) results in enormous file sizes.
A typical 5-second GIF at 480p can easily be 5-15 MB. The same animation as an MP4 video would be 200-500 KB, a 10-30x reduction. Animated WebP is also 50-70% smaller than GIF. There is no technical reason to use GIF for new content, yet it persists because of platform support (every chat app, every social network, every email client displays GIFs).
Alternatives to GIF
For animated content on the web, use MP4 video (the <video> element with autoplay, muted, loop, playsinline attributes replicates GIF behavior with 10-30x smaller files). For animated graphics needing transparency, use animated WebP or APNG. For short reactions in messaging, most platforms now accept and auto-convert to MP4 internally. The only remaining valid use for GIF is platforms that literally accept no other animated format.
HEIC/HEIF: Apple's Efficient Format
HEIF (High Efficiency Image File Format) is a container format that can hold images compressed with various codecs. HEIC specifically uses HEVC (H.265) compression. Apple adopted HEIC as the default photo format on iPhones starting with iOS 11 (2017), and every iPhone photo since then has been captured in HEIC unless the user explicitly changes settings.
Compression and Quality
HEIC files are approximately 50% smaller than JPEG at equivalent quality, similar to AVIF. HEIC supports 10-bit color depth, wide color gamut (Display P3), depth maps, live photos (short video clips), burst photo sequences, and multiple images in a single file. The compression efficiency comes from HEVC, which uses the same advanced techniques developed for video compression.
Compatibility Challenges
HEIC has a major compatibility problem: it is essentially an Apple-only format on the web. Chrome, Firefox, and Edge do not support HEIC natively. Windows requires a codec extension (sometimes paid) to view HEIC files. HEVC is encumbered by complex patent licensing from multiple patent pools (MPEG LA, HEVC Advance, Velos Media), making it expensive and risky for companies to implement.
For web use, convert HEIC to AVIF or WebP. For sharing with non-Apple users, convert to JPEG. Most Apple devices offer automatic conversion when sharing via email or messaging. The format is technically excellent but politically doomed by patent licensing complexity.
TIFF: The Professional Workhorse
TIFF (Tagged Image File Format) was created in 1986 by Aldus (later acquired by Adobe) for desktop publishing. It is the preferred format in professional photography, pre-press printing, medical imaging (alongside DICOM), satellite imagery, and archival preservation. TIFF is not a web format and is not supported by any browser.
Why TIFF Files Are Large
TIFF supports uncompressed storage, lossless LZW or ZIP compression, lossy JPEG compression, and even proprietary compression schemes. Uncompressed TIFF files are enormous: a 24-megapixel photo at 48-bit color depth requires 144 MB. Even with LZW compression, the same photo might be 50-80 MB. TIFF supports up to 64-bit per channel color depth, multiple layers, spot colors, clipping paths, and ICC color profiles, all of which add to file size.
Professional photographers shoot in camera RAW and convert to TIFF for editing because TIFF preserves all image data without generation loss. The editing workflow is: RAW capture, convert to 16-bit TIFF, edit in Photoshop/Lightroom, export to JPEG/WebP/AVIF for delivery. TIFF serves as the lossless master copy.
BMP: The Uncompressed Legacy
BMP (Bitmap Image File) was introduced by Microsoft and IBM in 1986 for Windows and OS/2. It stores pixel data with essentially no compression (RLE compression is supported but rarely used). A 1920x1080 24-bit BMP file is exactly 6,220,854 bytes, every time: a 54-byte header plus three bytes (blue, green, red) for every pixel, with each row padded to a multiple of four bytes.
BMP has no legitimate use on the modern web. It persists in legacy Windows applications, embedded systems with limited processing power (where decompression overhead is unacceptable), and as a teaching format for image processing courses because its simple structure makes it easy to read and write programmatically. If someone sends you a BMP file, convert it to PNG (for lossless) or JPEG/WebP (for lossy).
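The fixed size quoted above is easy to verify: a 24-bit BMP is a 54-byte header (14-byte file header plus 40-byte BITMAPINFOHEADER) followed by rows of 3 bytes per pixel, each row padded to a multiple of four bytes. A quick sketch:

```python
# Compute the exact on-disk size of an uncompressed 24-bit BMP.

def bmp24_file_size(width: int, height: int) -> int:
    row_bytes = width * 3                     # 3 bytes (B, G, R) per pixel
    padded_row = (row_bytes + 3) // 4 * 4     # rows padded to 4-byte multiples
    return 54 + padded_row * height           # 54-byte header + pixel data

print(bmp24_file_size(1920, 1080))  # 6220854
```

For 1920-pixel rows, 1920 x 3 = 5,760 bytes is already a multiple of four, so no padding is added and the total is 5,760 x 1,080 + 54 = 6,220,854 bytes.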
JPEG 2000: Wavelet Compression for Professionals
JPEG 2000 was standardized in 2000 as a next-generation replacement for JPEG. Instead of the DCT (Discrete Cosine Transform) used by JPEG, JPEG 2000 uses the DWT (Discrete Wavelet Transform), which analyzes the entire image at once rather than 8x8 blocks. This eliminates the characteristic blocking artifacts of JPEG at low quality settings and enables better compression at low bitrates.
How Wavelet Compression Works
The DWT decomposes the image into multiple frequency subbands at different scales. At each level, the image is split into four subbands: LL (low-frequency both horizontally and vertically — a smaller version of the image), LH (horizontally low, vertically high — horizontal edges), HL (horizontally high, vertically low — vertical edges), and HH (high-frequency both directions — diagonal detail). The LL subband is recursively decomposed, creating a pyramid of subbands from coarse to fine detail.
Quantization is applied to the wavelet coefficients, with fine-detail subbands quantized more aggressively than coarse subbands. Because the transform operates on the entire image rather than fixed blocks, there are no block boundaries and therefore no blocking artifacts. The main artifacts at low quality are blurring (from discarding fine detail subbands) and ringing (from the Gibbs phenomenon near sharp edges).
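The subband split can be made concrete with the simplest wavelet, a one-level 2D Haar transform. JPEG 2000 actually uses the 5/3 (lossless) and 9/7 (lossy) wavelets, but the LL/LH/HL/HH structure is identical.

```python
# One level of a 2D Haar decomposition over 2x2 pixel neighborhoods.
# Input: a 2D list with even dimensions. Output: four half-size subbands.

def haar_level(img):
    LL, LH, HL, HH = [], [], [], []
    for r in range(0, len(img), 2):
        ll, lh, hl, hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]          # top-left, top-right
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            ll.append((a + b + d + e) / 4)  # LL: local average (smaller image)
            lh.append((a + b - d - e) / 4)  # LH: vertical change (horizontal edges)
            hl.append((a - b + d - e) / 4)  # HL: horizontal change (vertical edges)
            hh.append((a - b - d + e) / 4)  # HH: diagonal detail
        LL.append(ll); LH.append(lh); HL.append(hl); HH.append(hh)
    return LL, LH, HL, HH
```

On a vertical-stripe image, all the detail energy lands in HL (vertical edges), exactly as described above; a flat image produces zeros in every subband except LL. Recursing on LL yields the coarse-to-fine pyramid.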
Embedded Bitstream and Region of Interest
JPEG 2000 supports an embedded bitstream: a lower-quality version of the image is encoded first, with additional quality layers added incrementally. A decoder can stop at any point and display the image at whatever quality has been received so far. This is more flexible than JPEG's progressive mode because quality can be truncated at any byte position, not just at scan boundaries.
The Region of Interest (ROI) feature allows specific areas of the image to be encoded at higher quality than the rest. For example, a face in a group photo can be preserved at full quality while the background is compressed more aggressively. No other web image format supports ROI encoding.
Why JPEG 2000 Did Not Replace JPEG
Despite its technical superiority, JPEG 2000 failed to gain widespread adoption for several reasons: (1) it was computationally expensive (10-20x slower to encode/decode than JPEG), (2) patent licensing was complex and expensive, (3) no major browser added support (only Safari, through macOS Core Image), and (4) the quality improvement over JPEG was not dramatic enough at typical web bitrates to justify the cost. JPEG 2000 found niches in digital cinema (DCI mandates JPEG 2000 for movie distribution), medical imaging (DICOM supports JPEG 2000), and satellite imagery, where its lossless mode and high bit-depth support are critical.
ICO: The Favicon Format
ICO is a container format designed by Microsoft for Windows icons, containing one or more images at different resolutions (16x16, 32x32, 48x48, 64x64, 128x128, 256x256). Each image can be stored as BMP or PNG data. ICO is primarily used for favicons, the small icons displayed in browser tabs, bookmarks, and address bars.
For modern web development, you no longer need a multi-resolution ICO file. Use a 32x32 ICO for legacy browser compatibility and supplement with PNG favicons specified via <link rel="icon"> tags. Apple Touch icons should be 180x180 PNG. SVG favicons (<link rel="icon" type="image/svg+xml">) are supported by modern browsers and scale perfectly to any size.
The Next-Gen Format War: WebP vs AVIF vs JPEG XL
The three next-generation image formats represent different philosophies. WebP prioritizes compatibility and simplicity. AVIF prioritizes compression efficiency and HDR support. JPEG XL prioritizes feature completeness and backward compatibility with JPEG. Let us compare them head-to-head.
Next-Gen Image Format Comparison (higher is better)
Source: OnlineTools4Free Research
AVIF leads in compression efficiency and HDR support. JPEG XL leads in decoding speed, progressive decode, and lossless recompression of existing JPEGs. WebP leads in browser support and encoding speed. The ideal strategy in 2026 is to serve AVIF as the primary format with WebP as the fallback, using the HTML <picture> element for content negotiation.
If JPEG XL regains Chrome support (which remains possible given continued advocacy and Safari support), it could become the single best format for the web due to its combination of progressive decoding, excellent compression, and fast decode speed. Until then, AVIF+WebP is the winning combination.
Browser Support Over Time (2015-2026)
Browser Support for Next-Gen Image Formats (% of global users)
Source: OnlineTools4Free Research
Compression Ratio Comparison
The chart below shows average file size (in KB) for a typical 2000x1500 photograph at different quality levels. AVIF consistently produces the smallest files, followed by JPEG XL, HEIC, WebP, and JPEG. PNG is not shown at different quality levels because it is always lossless (constant 890 KB regardless of quality setting).
Average File Size by Quality Level (KB)
Source: OnlineTools4Free Research
Complete Image Format Comparison
The table below compares all 12 image formats across their key attributes.
Image Formats: Full Comparison (12 formats)
| Format | Extension | Year | Compression Type | Color Depth | Alpha | Animation | HDR | Progressive | Browser % | Avg KB | Best For | Royalty Free | Encode Speed | Decode Speed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JPEG | .jpg, .jpeg | 1992 | Lossy | 24-bit | No | No | No | Yes | 100% | 245 | Photographs | Yes | Fast | Fast |
| PNG | .png | 1996 | Lossless | 48-bit | Yes | APNG | No | Interlaced | 100% | 890 | Graphics, transparency | Yes | Medium | Fast |
| WebP | .webp | 2010 | Both | 32-bit | Yes | Yes | No | No | 98% | 178 | General web use | Yes | Medium | Fast |
| AVIF | .avif | 2019 | Both | 36-bit | Yes | AVIS | Yes | No | 95% | 142 | Photos, HDR | Yes | Slow | Medium |
| JPEG XL | .jxl | 2021 | Both | 32-bit | Yes | Yes | Yes | Yes | ~35% | 155 | High-fidelity photos | Yes | Medium | Fast |
| HEIC/HEIF | .heic, .heif | 2015 | Lossy | 30-bit | Yes | HEVC sequences | Yes | No | ~21% | 165 | Apple ecosystem | No | Medium | Medium |
| GIF | .gif | 1987 | Lossless (256) | 8-bit | 1-bit | Yes | No | Interlaced | 100% | 456 | Simple animations | Yes | Fast | Fast |
| TIFF | .tif, .tiff | 1986 | Both | 64-bit | Yes | Multi-page | Yes | No | 0% | 2048 | Print, archival | Yes | Fast | Medium |
| BMP | .bmp | 1986 | Uncompressed | 32-bit | Yes (v4+) | No | No | No | ~90% | 2048 | Legacy systems | Yes | Instant | Instant |
| SVG | .svg | 1999 | Vector | Unlimited | Yes | SMIL, CSS, JS | N/A | Streaming | 99% | 12 | Icons, logos | Yes | N/A | Fast |
| ICO | .ico | 1985 | Lossless | 32-bit | Yes | No | No | No | 100% | 15 | Favicons | Yes | Fast | Fast |
| JPEG 2000 | .jp2, .j2k | 2001 | Both | 48-bit | Yes | MJ2 | Yes | Yes | ~15% | 160 | Cinema, archival | Partially | Slow | Slow |
Convert Between Image Formats
Use the tool below to convert images between formats right here. Try converting a JPEG to WebP or AVIF to see the file size difference yourself.
Perceptual Quality: SSIM by Quality Level
SSIM (Structural Similarity Index) measures how similar two images appear to the human eye, on a scale of 0 to 1 where 1 means identical. The chart below shows SSIM scores at each quality level for five formats, averaged across 500 test photographs at 2000x1500 pixels. AVIF consistently achieves the highest SSIM at every quality level, meaning it preserves visual fidelity better than any other format at the same file size.
At quality 80 (the sweet spot for web use), AVIF reaches 0.981 SSIM while JPEG is at 0.960. That 0.021 difference is small numerically but visually meaningful: it corresponds to noticeably fewer artifacts around text, hair, and sharp edges. JPEG XL is close behind AVIF at 0.978, confirming its technical excellence despite limited browser support.
SSIM Quality Score by Compression Level (higher is better)
Source: OnlineTools4Free Research
PSNR Quality Comparison
PSNR (Peak Signal-to-Noise Ratio) is measured in decibels and provides a mathematical quality assessment. Higher values indicate less distortion. A PSNR above 40 dB is generally considered excellent for web use, and above 45 dB is considered visually lossless for most content. The chart below shows PSNR at different quality levels.
At quality 80, AVIF achieves 44.0 dB, comfortably in the "excellent" range and approaching the visually lossless threshold, while JPEG achieves 40.1 dB. The 3.9 dB advantage translates to approximately 60% less visual error. For quality-critical applications like e-commerce product photos or medical imaging, this difference directly impacts the viewer's ability to see fine details.
PSNR Quality (dB) by Compression Level (higher is better)
Source: OnlineTools4Free Research
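Because PSNR is logarithmic, a decibel gap converts directly into a ratio of mean-squared error. A small Python sketch makes the arithmetic behind the 3.9 dB figure explicit; the formula is the standard PSNR definition, and the 44.0/40.1 dB inputs are the benchmark values above:

```python
import math

def psnr(max_val: float, mse: float) -> float:
    """Peak Signal-to-Noise Ratio in decibels."""
    return 10 * math.log10(max_val ** 2 / mse)

def mse_ratio(psnr_high: float, psnr_low: float) -> float:
    """How many times more mean-squared error the lower-PSNR image carries."""
    return 10 ** ((psnr_high - psnr_low) / 10)

ratio = mse_ratio(44.0, 40.1)   # AVIF vs JPEG at quality 80
print(f"JPEG carries {ratio:.2f}x the squared error of AVIF")
print(f"equivalently, AVIF has {100 * (1 - 1 / ratio):.0f}% less error")
```

Running it shows JPEG carrying about 2.45 times the squared error of AVIF, i.e. roughly 59% less error for AVIF, which is where the "approximately 60%" figure comes from.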
Image Format Adoption on Websites (2015-2026)
The chart below tracks the percentage of websites among the top 10 million that use each image format, based on HTTP Archive data. JPEG usage has declined steadily from 78% in 2015 to 48% in 2026 as WebP and AVIF have gained ground. SVG adoption has grown the most dramatically, from 12% to 60%, driven by icon systems and design systems replacing raster icons. WebP crossed 50% in 2024 and AVIF reached 40% in 2026.
Image Format Usage on Top 10M Websites (% of sites)
Source: OnlineTools4Free Research
Encoding and Decoding Speed Benchmark
Format encoding speed directly impacts your image pipeline. AVIF encoding with the reference aom encoder is extremely slow (0.8 megapixels/second at quality 80), making it impractical for real-time encoding. However, SVT-AV1 brings AVIF encoding to a practical 5.2 MP/s. JPEG XL offers the best balance of compression efficiency and encoding speed at 6.5 MP/s. Traditional JPEG with libjpeg-turbo remains the fastest lossy encoder at 42 MP/s.
For decoding speed (which affects page rendering), JPEG leads at 62-85 MP/s depending on encoder. JPEG XL decodes at 55 MP/s, faster than WebP (48 MP/s) and significantly faster than AVIF (28 MP/s). This is why JPEG XL proponents argue it is a better web format despite Chrome dropping support: it loads faster and provides a better progressive loading experience.
Image Encoding & Decoding Speed (megapixels/second)
| Format/Encoder | Encode Q80 | Encode Q95 | Encode Lossless | Decode Speed |
|---|---|---|---|---|
| JPEG (mozjpeg) | 8.2 | 5.8 | n/a | 62 |
| JPEG (libjpeg-turbo) | 42 | 38 | n/a | 85 |
| PNG (libpng) | n/a | n/a | 5.2 | 45 |
| PNG (OxiPNG) | n/a | n/a | 3.8 | 45 |
| WebP (libwebp) | 12.5 | 8.8 | 2.5 | 48 |
| AVIF (libavif/aom) | 0.8 | 0.4 | 0.15 | 28 |
| AVIF (libavif/SVT) | 5.2 | 3.5 | 0.8 | 28 |
| JPEG XL (libjxl) | 6.5 | 4.2 | 3.0 | 55 |
| HEIC (libheif) | 2.8 | 1.5 | 0.5 | 22 |
| GIF (gifski) | n/a | n/a | 1.8 | 72 |
| BMP | n/a | n/a | 250 | 280 |
| TIFF (libtiff) | n/a | n/a | 42 | 48 |
(n/a marks modes a format does not support.)
File Size by Content Type
Different image content compresses very differently depending on the format. Screenshots and flat illustrations show the largest gap between formats because AVIF and WebP excel at sharp edges and flat colors. Photographs show a smaller but still significant advantage. The table below shows actual file sizes for a 1920x1080 image at quality 80 across different content types.
Average File Size (KB) by Content Type at Quality 80, 1920x1080
| Content Type | JPEG (KB) | WebP (KB) | AVIF (KB) | JXL (KB) | PNG (KB) |
|---|---|---|---|---|---|
| Photograph (landscape) | 245 | 175 | 135 | 148 | 2800 |
| Photograph (portrait) | 210 | 152 | 118 | 128 | 2400 |
| Screenshot (UI) | 85 | 52 | 38 | 42 | 180 |
| Illustration (flat) | 68 | 35 | 28 | 30 | 95 |
| Chart/Graph | 42 | 22 | 18 | 20 | 48 |
| Meme (text overlay) | 125 | 88 | 68 | 75 | 350 |
| Product photo (white BG) | 155 | 108 | 82 | 92 | 1200 |
| Medical scan (grayscale) | 180 | 125 | 98 | 108 | 1600 |
| Satellite imagery | 320 | 228 | 178 | 195 | 3200 |
| Texture (game) | 280 | 198 | 155 | 168 | 2100 |
JPEG File Structure Explained
Understanding the internal structure of a JPEG file helps explain why certain optimizations work. A JPEG file is composed of segments, each beginning with a marker (0xFF followed by a byte identifying the segment type). The file always starts with SOI (0xFFD8) and ends with EOI (0xFFD9). Between these markers, the file contains metadata, quantization tables, Huffman tables, and the compressed image data.
JPEG File Structure (typical 245 KB photo)
| Section | Bytes | Description |
|---|---|---|
| SOI Marker | 2 | Start of Image (0xFFD8) |
| APP0/APP1 (JFIF/EXIF) | 200 | Metadata, camera info, GPS |
| DQT (Quantization Tables) | 134 | 2 tables: luminance + chrominance |
| SOF0 (Frame Header) | 17 | Image dimensions, components, sampling |
| DHT (Huffman Tables) | 420 | DC + AC tables for Y, Cb, Cr |
| SOS (Scan Header) | 12 | Component selector, spectral selection |
| Compressed Data | ~244,000 | Entropy-coded DCT coefficients (bulk of file) |
| EOI Marker | 2 | End of Image (0xFFD9) |
The metadata section (APP0/APP1) is where EXIF, IPTC, and XMP data live. Stripping metadata can save 2-20 KB per image. The quantization tables (DQT) control quality — mozjpeg optimizes these tables to achieve 5-10% better compression than standard libjpeg. The Huffman tables (DHT) provide the entropy coding — mozjpeg also optimizes these for additional savings. The bulk of the file is the compressed scan data, which contains the DCT coefficients for all 8x8 blocks arranged in MCU (Minimum Coded Unit) order.
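The marker walk described above can be sketched in a few lines of Python. The byte string here is a hand-built stub with one APP0 segment, not a decodable image, and the sketch stops at the segment level: a real parser must treat everything after the SOS marker specially, because the entropy-coded scan data is not length-prefixed.

```python
import struct

# Hand-built stub: SOI, one APP0 segment, EOI. Not a decodable image --
# just enough structure to walk the markers.
stub = bytes([0xFF, 0xD8,                          # SOI
              0xFF, 0xE0, 0x00, 0x04, 0xAB, 0xCD,  # APP0: length 4 (incl. the 2 length bytes)
              0xFF, 0xD9])                         # EOI

def list_segments(data: bytes):
    """Return (marker, payload) pairs for each segment in a JPEG byte stream.

    Stops at SOS or EOI: the scan data after SOS has no length field,
    so a real decoder must scan it byte-by-byte for the next marker.
    """
    assert data[:2] == b"\xff\xd8", "missing SOI marker"
    segments = [("SOI", b"")]
    pos = 2
    while pos < len(data):
        assert data[pos] == 0xFF, "expected marker byte"
        marker = data[pos + 1]
        if marker in (0xD9, 0xDA):                 # EOI or SOS: stop here
            segments.append(("EOI" if marker == 0xD9 else "SOS", b""))
            break
        # Segment length is big-endian and includes its own 2 bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((f"0xFF{marker:02X}", data[pos + 4:pos + 2 + length]))
        pos += 2 + length
    return segments

print(list_segments(stub))  # [('SOI', b''), ('0xFFE0', b'\xab\xcd'), ('EOI', b'')]
```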
PNG Chunk Structure
PNG files are organized into chunks, each with a 4-byte length, 4-byte type, variable data, and 4-byte CRC32 checksum. Three chunks are required: IHDR (image header, always first), one or more IDAT (image data), and IEND (image end, always last). All other chunks are optional and provide metadata, color management, or animation capabilities.
PNG Chunk Types
| Chunk | Required | Description |
|---|---|---|
| IHDR | Yes | Image header: width, height, bit depth, color type, interlace |
| PLTE | No | Palette for indexed-color images (256 RGB entries max) |
| IDAT | Yes | Image data: filtered + DEFLATE-compressed pixel data |
| IEND | Yes | Image end marker (empty, 0 bytes of data) |
| tEXt | No | Textual metadata (key-value, Latin-1 encoding) |
| iTXt | No | International text metadata (UTF-8) |
| zTXt | No | Compressed textual metadata |
| gAMA | No | Gamma correction value |
| cHRM | No | Chromaticity coordinates of display primaries |
| sRGB | No | Standard RGB color space rendering intent |
| iCCP | No | Embedded ICC color profile |
| bKGD | No | Default background color |
| pHYs | No | Physical pixel dimensions (DPI) |
| tIME | No | Last modification timestamp |
| acTL | No | APNG animation control (frame count, loops) |
| fcTL | No | APNG frame control (size, offset, timing) |
| fdAT | No | APNG frame data (like IDAT but for subsequent frames) |
The chunk naming convention encodes important information: uppercase first letter means the chunk is critical (must be understood), lowercase means ancillary (can be ignored). Uppercase second letter means the chunk is public (defined by the spec), lowercase means private (application-defined). This design allows PNG readers to safely ignore chunks they do not understand while still correctly rendering the image.
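As a sketch of the mechanics, assuming only the Python standard library, the following builds a synthetic two-chunk stream (IHDR plus IEND, with no IDAT, so it is not a renderable image) and parses it back, verifying each CRC and reading the critical/ancillary case bit:

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype: bytes, data: bytes) -> bytes:
    # The CRC covers the type and data fields, not the length field.
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

# Synthetic stream: a 1x1 grayscale IHDR followed by IEND. There is no
# IDAT, so this is a chunk-walk demo, not a renderable image.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)   # 13-byte IHDR payload
blob = PNG_SIG + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND", b"")

def read_chunks(data: bytes):
    """Return (type, length, is_critical) for every chunk, verifying CRCs."""
    assert data[:8] == PNG_SIG, "not a PNG stream"
    pos, chunks = 8, []
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        assert crc == zlib.crc32(ctype + payload), f"bad CRC in {ctype!r}"
        critical = not (ctype[0] & 0x20)   # bit 5 clear = uppercase = critical
        chunks.append((ctype.decode("ascii"), length, critical))
        pos += 12 + length
    return chunks

print(read_chunks(blob))  # [('IHDR', 13, True), ('IEND', 0, True)]
```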
Image Format Adoption by Industry
Different industries adopt new image formats at different rates. SaaS and social media companies lead AVIF adoption (32-35%) because they have dedicated performance teams and serve millions of images daily, making even small per-image savings translate to significant bandwidth cost reductions. Government websites lag behind (5% AVIF, 20% WebP) due to conservative technology stacks and compliance requirements. The chart below shows current adoption rates across ten industry sectors.
Image Format Adoption by Industry Sector (% of sites)
Source: OnlineTools4Free Research
Code Example: Responsive Images with Modern Formats
The HTML <picture> element enables serving different formats to different browsers. The browser evaluates <source> elements in order and uses the first format it supports. This pattern delivers AVIF to Chrome/Firefox/Safari 16+, WebP to older Safari and other browsers, and JPEG as the universal fallback.
<picture>
<source srcset="photo.avif" type="image/avif">
<source srcset="photo.webp" type="image/webp">
<img src="photo.jpg" alt="Description"
width="1200" height="800"
loading="lazy"
decoding="async">
</picture>
In Next.js, the built-in Image component handles format negotiation automatically. It serves AVIF to supporting browsers, WebP to others, and optimizes quality and dimensions based on the device.
import Image from 'next/image';
export default function Hero() {
return (
<Image
src="/hero.jpg"
alt="Hero image"
width={1200}
height={800}
priority
sizes="(max-width: 768px) 100vw, 1200px"
/>
);
}
WebP: Detailed Browser Support Timeline
WebP support was not added all at once. Different features were enabled at different times, and the gap between the earliest adopters (Opera in 2011, Chrome in 2012) and Safari, the last major browser to ship it in 2020, was nearly a decade.
WebP Feature Support by Browser Version
| Browser | Lossy | Lossless | Animated | Alpha |
|---|---|---|---|---|
| Chrome | 17 (2012) | 23 (2012) | 32 (2014) | 23 (2012) |
| Firefox | 65 (2019) | 65 (2019) | 65 (2019) | 65 (2019) |
| Safari | 14 (2020) | 14 (2020) | 16 (2022) | 14 (2020) |
| Edge | 18 (2018) | 18 (2018) | 79 (2020) | 18 (2018) |
| Opera | 11 (2011) | 12 (2012) | 19 (2014) | 12 (2012) |
| Samsung Internet | 4.0 (2016) | 4.0 (2016) | 5.0 (2017) | 4.0 (2016) |
AVIF: Detailed Browser Support Timeline
AVIF adoption has been faster than WebP, with all major browsers adding support within three years. Still images were supported first, followed by animated AVIF and HDR AVIF.
AVIF Feature Support by Browser Version
| Browser | Still Images | Animated | HDR | Sequences |
|---|---|---|---|---|
| Chrome | 85 (2020) | 93 (2021) | 98 (2022) | 100 (2022) |
| Firefox | 93 (2021) | 113 (2023) | 110 (2023) | 113 (2023) |
| Safari | 16 (2022) | 17 (2023) | 17 (2023) | 17 (2023) |
| Edge | 85 (2020) | 93 (2021) | 98 (2022) | 100 (2022) |
| Opera | 71 (2020) | 79 (2021) | 84 (2022) | 86 (2022) |
| Samsung Internet | 14.0 (2021) | 17.0 (2022) | 18.0 (2023) | 18.0 (2023) |
Part 2: Document Formats
~6,000 words covering 8 document formats
Document formats determine how text, layout, images, and metadata are stored and rendered. The right format depends on whether the document needs to be edited, preserved exactly, printed, or distributed electronically. This section covers the formats that power every office, courtroom, university, and government agency in the world.
PDF: The Universal Document
PDF (Portable Document Format) was created by Adobe co-founder John Warnock in 1993 with a radical vision: a document format that looks exactly the same on every device, operating system, and printer. PDF achieved this by embedding everything needed to render the document (fonts, images, vector graphics, text) into a single self-contained file.
How PDF Works Internally
A PDF file is a collection of objects organized in a cross-reference table. The main objects are: pages (defining dimensions and content streams), content streams (sequences of drawing operators that paint text and graphics), font objects (embedded font programs), image objects (compressed pixel data), and the document catalog (structure tree, outlines, named destinations).
PDF uses its own page description language derived from PostScript. When you "print to PDF," the printer driver converts the application output into PDF drawing operators that specify exact positions for every character, line, and image. This is why PDF preserves layout perfectly but makes editing difficult: the document does not contain logical structure (paragraphs, headings, tables), only visual positions.
PDF Rendering: Why It Looks the Same Everywhere
PDF achieves its "looks the same everywhere" guarantee through three mechanisms: (1) all fonts are embedded in the file (either as subsets or full fonts), so rendering does not depend on installed system fonts; (2) the coordinate system is absolute (points from the bottom-left corner), so every element has an exact position; (3) images are embedded at their display resolution, not referenced externally. The trade-off is accessibility and responsiveness. Because PDF positions every character individually, the document cannot reflow to fit different screen sizes. A PDF designed for A4 paper is frustrating to read on a phone.
PDF Generation Tools Compared
For generating PDFs programmatically, the landscape in 2026 includes: Puppeteer/Playwright (render HTML to PDF via headless Chrome — best for complex layouts), WeasyPrint (Python, CSS-based, excellent for print stylesheets), pdf-lib (JavaScript, low-level PDF manipulation), Reportlab (Python, direct PDF generation), and LaTeX (best for academic/mathematical content). For most web applications, Puppeteer/Playwright provides the best fidelity because it uses the same rendering engine as Chrome.
PDF Versions and Features
PDF has evolved through many versions. PDF 1.4 (2001) added transparency. PDF 1.5 (2003) added object streams for better compression. PDF 1.7 (2008) became ISO 32000-1. PDF 2.0 (ISO 32000-2, 2020) added unencrypted wrapper documents, page-level output intents, associated files, and improved accessibility features. Most PDF files in the wild are PDF 1.4 to 1.7.
PDF/A: Archival Preservation
PDF/A (ISO 19005) is a subset of PDF designed for long-term preservation. It requires all fonts to be embedded, prohibits encryption and password protection, forbids references to external content, mandates device-independent color (ICC profiles), and disallows JavaScript and multimedia. PDF/A-1 is based on PDF 1.4, PDF/A-2 on PDF 1.7, and PDF/A-3 allows embedded files of any format. Government archives, legal systems, and libraries worldwide mandate PDF/A for permanent records.
PDF Security and Encryption
PDF supports two types of passwords: a user password (required to open the document) and an owner password (required to change permissions like printing, copying, editing). PDF 1.6+ uses AES-128 encryption; PDF 2.0 uses AES-256. However, PDF permission restrictions are enforceable only by compliant readers and can be bypassed by tools that ignore them. For truly secure documents, use document-level encryption plus access controls at the server level.
PDF Accessibility
A major criticism of PDF is accessibility. Because PDF stores visual positions rather than logical structure, screen readers cannot determine reading order, heading levels, or table structures unless the document includes a "tag tree" — a parallel structure that maps visual elements to semantic roles. PDF/UA (Universal Accessibility, ISO 14289) defines requirements for accessible PDFs. Creating accessible PDFs requires conscious effort during authoring, not as an afterthought.
DOCX: Microsoft's Open XML
DOCX is the default format for Microsoft Word since Office 2007. Unlike the older binary .doc format, DOCX is based on Office Open XML (OOXML), an ISO/IEC standard (29500, originally ECMA-376). A DOCX file is actually a ZIP archive containing XML files that describe the document content, styles, relationships, and embedded media.
Internal Structure
Unzipping a DOCX file reveals: word/document.xml (the main content), word/styles.xml (paragraph and character styles), word/fontTable.xml (fonts used), word/settings.xml (document settings), [Content_Types].xml (MIME types), and _rels/.rels (relationships between parts). Images are stored in word/media/. This structure means DOCX files can be programmatically generated and manipulated by any tool that can read XML and ZIP.
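Because the container is plain ZIP plus XML, the Python standard library is enough to demonstrate the round trip. The archive built here is a minimal stand-in (a real DOCX also needs [Content_Types].xml and _rels/.rels, so Word would reject it), but the part layout and WordprocessingML namespace are the real ones:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A tiny in-memory stand-in, not a fully valid DOCX.
doc_xml = (f'<w:document xmlns:w="{W}"><w:body>'
           '<w:p><w:r><w:t>Hello, OOXML</w:t></w:r></w:p>'
           '</w:body></w:document>')
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("word/document.xml", doc_xml)

# Read it back: unzip, parse the XML, join the text runs (w:t elements).
with zipfile.ZipFile(buf) as z:
    root = ET.fromstring(z.read("word/document.xml"))
text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
print(text)  # Hello, OOXML
```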
DOCX vs DOC
The older .doc format (Word 97-2003) used a proprietary binary format based on Microsoft's Compound File Binary Format. It was difficult to parse, prone to corruption, and could not be read reliably by non-Microsoft software. DOCX solved these problems by using open standards (XML, ZIP), making documents smaller (40-75% smaller than .doc due to ZIP compression), and enabling interoperability with LibreOffice, Google Docs, and other software.
Compatibility Considerations
While DOCX is an open standard, complex formatting (tables with merged cells, text boxes, SmartArt, advanced typography, VBA macros) may render differently in LibreOffice, Google Docs, or older Word versions. For documents that must look identical everywhere, PDF is the safer choice. For documents that need to be edited collaboratively, DOCX is the standard, supplemented by real-time co-editing in Word Online or Google Docs.
ODT: The Open Document Standard
ODT (Open Document Text) is part of the OpenDocument Format (ODF, ISO/IEC 26300) developed by OASIS. It is the default format for LibreOffice Writer and was designed from the ground up as a truly open, vendor-neutral standard. Like DOCX, ODT is a ZIP archive containing XML files, but it uses a different XML schema (ODF vs OOXML).
ODT is mandated by several governments (including the UK, France, and India) for official documents to avoid vendor lock-in. LibreOffice reads and writes both ODT and DOCX, making it a practical bridge between the two ecosystems. For most text documents, ODT and DOCX are functionally equivalent.
RTF: Rich Text Format
RTF (Rich Text Format) was introduced by Microsoft in 1987 as a cross-platform rich text exchange format. It uses a plain-text markup syntax with backslash-escaped control words (similar to LaTeX in spirit). RTF supports basic formatting: fonts, colors, bold, italic, tables, images, and hyperlinks.
RTF is useful in situations where DOCX is not supported but plain text is insufficient. It works across Windows, macOS, and Linux without requiring specific software. However, RTF lacks many modern features (styles, comments, tracked changes, embedded objects), produces larger files than DOCX, and has been mostly superseded by DOCX for document exchange.
TXT / Plain Text: The Simplest Format
Plain text files contain only characters with no formatting, no embedded images, no metadata (beyond what the file system provides). They are the most universally compatible file format: every operating system, every text editor, every programming language can read and write plain text. Code, configuration files, logs, and README files are all plain text.
Encoding: The Hidden Complexity
The critical question with plain text is encoding: how are characters represented as bytes? ASCII uses 7 bits per character and supports only 128 characters (English letters, digits, basic punctuation). UTF-8 uses 1-4 bytes per character and supports all 149,000+ Unicode characters. Latin-1 (ISO 8859-1) uses 1 byte per character for 256 Western European characters.
In 2026, the answer is simple: always use UTF-8. It is backward-compatible with ASCII, handles every language and emoji, and is the default encoding for the web (98.2% of all websites). Specify encoding explicitly in HTTP headers (Content-Type: text/plain; charset=utf-8) and file headers (BOM or magic comment) to prevent misinterpretation.
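A short Python sketch illustrates both properties: UTF-8's variable length and its ASCII compatibility, plus the Latin-1 mismatch that produces mojibake:

```python
# UTF-8 is variable-length: 1 byte for ASCII, up to 4 bytes for emoji.
samples = ["A", "é", "€", "😀"]
for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"{ch!r} -> {len(encoded)} byte(s): {encoded.hex()}")

# ASCII bytes are valid UTF-8 unchanged (backward compatibility):
assert "plain ascii".encode("ascii") == "plain ascii".encode("utf-8")

# The same character in Latin-1 vs UTF-8 differs: decoding with the
# wrong table is the root cause of mojibake like "Ã©" for "é".
assert "é".encode("latin-1") == b"\xe9"
assert "é".encode("utf-8") == b"\xc3\xa9"
assert b"\xc3\xa9".decode("latin-1") == "Ã©"
```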
Line Ending Wars: CRLF vs LF
One of the most persistent cross-platform compatibility issues is line endings. Windows uses CRLF (Carriage Return + Line Feed, \r\n, bytes 0x0D 0x0A). Unix/Linux/macOS uses LF only (\n, byte 0x0A). Classic Mac OS (pre-2001) used CR only (\r, byte 0x0D). When a file with Windows line endings is opened on Linux (or vice versa), tools may display extra characters, scripts may fail to execute, and diff tools may show every line as changed.
The solution is Git's core.autocrlf setting: set it to "true" on Windows (convert LF to CRLF on checkout, CRLF to LF on commit) and "input" on Mac/Linux (convert CRLF to LF on commit, no conversion on checkout). Better yet, use a .gitattributes file to specify line ending behavior per file type: * text=auto normalizes all text files to LF in the repository.
Modern editors (VS Code, JetBrains, Sublime Text) display the current line ending in the status bar and allow switching. The .editorconfig file can enforce consistent line endings across a project: end_of_line = lf.
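The normalization Git performs on commit can be sketched as two string replacements; the order matters, because CRLF must be collapsed before stray CRs:

```python
def normalize_eol(data: bytes) -> bytes:
    """Normalize CRLF (Windows) and bare CR (classic Mac OS) to LF."""
    # CRLF first, otherwise the lone-CR pass would split every CRLF in two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

mixed = b"windows\r\nclassic mac\runix\n"
print(normalize_eol(mixed))  # b'windows\nclassic mac\nunix\n'
```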
Markdown: Human-Readable Markup
Markdown was created by John Gruber in 2004 as a lightweight markup language that is readable as plain text but can be converted to HTML. Its syntax uses punctuation characters to indicate formatting: # for headings, * for emphasis, - for lists, ``` for code blocks, and [text](url) for links.
Markdown Flavors
Gruber's original Markdown specification was intentionally ambiguous, leading to incompatible implementations. CommonMark (2014) is a strict specification that resolves ambiguities. GitHub Flavored Markdown (GFM) extends CommonMark with tables, task lists, autolinks, and strikethrough. MDX adds JSX component embedding for React documentation. Most Markdown processors in 2026 are CommonMark-compatible.
Markdown dominates documentation in the software industry: README.md files, GitHub wikis, GitBook, Docusaurus, MkDocs, and most static site generators use Markdown. It is also used in note-taking apps (Obsidian, Notion, Bear), chat platforms (Discord, Slack), and CMS platforms (Ghost, Strapi).
Markdown Flavor Comparison
The table below compares seven major Markdown flavors by feature support. CommonMark is the strictest baseline; GFM adds the most commonly needed extensions; MDX and Obsidian MD are the most feature-rich.
Markdown Flavor Feature Comparison
| Flavor | Tables | Task Lists | Footnotes | Math | Frontmatter | Used By |
|---|---|---|---|---|---|---|
| CommonMark | No | No | No | No | No | Reference standard |
| GitHub Flavored (GFM) | Yes | Yes | No | Yes (2022) | No | GitHub READMEs, Issues |
| MDX | Yes (GFM) | Yes | Plugin | Plugin | Yes | Docusaurus, Next.js docs |
| Obsidian MD | Yes | Yes | Yes | Yes (LaTeX) | Yes | Obsidian note-taking |
| Pandoc MD | Yes (grid/pipe) | No | Yes | Yes (LaTeX) | Yes | Academic papers, book conversion |
| R Markdown | Yes | No | Yes | Yes | Yes | R statistical analysis, Quarto |
| GitLab Flavored | Yes | Yes | Yes | Yes | Yes | GitLab wikis, Issues |
MDX: Markdown with Components
MDX (Markdown + JSX) allows embedding React components directly in Markdown documents. This enables interactive documentation with live code examples, charts, and widgets alongside prose. MDX is used by Docusaurus (Meta), Next.js documentation, Storybook, and many component library documentation sites.
An MDX file looks like standard Markdown but can import and render React components: <Chart data={salesData} /> renders an interactive chart inline with the text. The MDX compiler transforms .mdx files into React components at build time, giving you the readability of Markdown with the power of React.
For non-React ecosystems, similar approaches exist: Markdoc (Stripe, for any framework), AsciiDoc (Red Hat/IBM documentation), and reStructuredText (Python docs, Sphinx). Each trades some of Markdown's simplicity for additional features like admonitions, tabs, cross-references, and automatic API documentation generation.
LaTeX: Academic Typesetting
LaTeX (pronounced "lah-tech" or "lay-tech") is a document preparation system created by Leslie Lamport in 1984 as a set of macros for Donald Knuth's TeX typesetting engine. It is the standard format for academic papers, theses, and technical books in mathematics, physics, computer science, and engineering.
LaTeX excels at mathematical notation, automatic numbering (equations, figures, tables, sections), bibliography management (BibTeX, BibLaTeX), cross-references, and consistent typographic quality. A LaTeX document is a plain text file with markup commands that is compiled into PDF. The compilation process handles line breaking, page breaking, hyphenation, and typographic spacing according to professional publishing rules.
The learning curve for LaTeX is steep compared to WYSIWYG editors, but the payoff for technical documents is substantial: consistent formatting across hundreds of pages, automatic numbering that never breaks, and mathematical notation that is impossible to achieve in Word. Overleaf provides a browser-based LaTeX editor with real-time collaboration, reducing the setup barrier.
Spreadsheet Formats: XLSX, ODS, and CSV
Spreadsheet formats deserve special attention because they are among the most commonly exchanged file types in business. XLSX (Office Open XML Spreadsheet) is the default format for Microsoft Excel since 2007. Like DOCX, it is a ZIP archive containing XML files. XLSX supports formulas, formatting, charts, pivot tables, VBA macros, and multiple worksheets. The maximum sheet size is 1,048,576 rows by 16,384 columns.
ODS (OpenDocument Spreadsheet) is the open standard equivalent, used by LibreOffice Calc. It supports the same core features as XLSX but with different XML schemas. For simple spreadsheets, XLSX and ODS are interchangeable. For complex spreadsheets with VBA macros, XLSX is required.
CSV remains the universal exchange format for tabular data because every spreadsheet program, database, and programming language can read it. However, CSV loses all formatting, formulas, multiple sheets, and data types. When exporting for data analysis (pandas, R, SQL), CSV is usually the best choice. When exporting for human consumption (reports, financial statements), XLSX preserves the presentation.
Google Sheets stores data in a proprietary cloud format and converts on export. For automated data pipelines, the Google Sheets API returns JSON directly, bypassing file formats entirely. For offline backup, export as XLSX (highest fidelity) or CSV (most portable).
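The type loss is easy to demonstrate with Python's stdlib csv module: whatever you write, every value comes back as a string, which is why every importer has to infer or be told the column types:

```python
import csv
import io

rows = [["product", "units", "price"],
        ["Widget", 12, 3.50]]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
buf.seek(0)
recovered = list(csv.reader(buf))

print(recovered[1])  # ['Widget', '12', '3.5']
# The int and the float are gone: everything is a string now, which is
# why pandas, R, and SQL importers must guess or be told the dtypes.
assert all(isinstance(v, str) for v in recovered[1])
```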
EPUB: The E-Book Standard
EPUB (Electronic Publication) is the standard format for reflowable e-books, supported by every major e-reader except Amazon Kindle (which uses KF8/AZW3, though Kindle has accepted EPUB since 2022). An EPUB file is a ZIP archive containing XHTML content files, a CSS stylesheet, images, fonts, an OPF package file (metadata and spine), and a navigation document (an XHTML nav file in EPUB 3; the legacy NCX file in EPUB 2).
EPUB 3 (current version) uses HTML5, CSS3, SVG, and MathML for content, supports JavaScript for interactive elements, includes media overlays (synchronized audio narration), and defines accessibility requirements (EPUB Accessibility 1.1). The key design principle is reflowable content: text reflows to fit the screen size, font size, and reader preferences, unlike PDF which preserves fixed layout.
DRM is optional in EPUB: Adobe DRM and Apple FairPlay are the most common protection schemes. DRM-free EPUB is preferred by publishers like Tor Books and O'Reilly Media because it allows readers to read on any device without restrictions.
Document Format Comparison
Document Formats: Full Comparison
| Format | Extension | Year | Creator | Standard | Editability | Layout Preserved | Encryption | Accessibility | Digital Signatures | Cross-Platform | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PDF | .pdf | 1993 | Adobe | ISO 32000 | Low | Yes | Yes | PDF/UA | Yes | Excellent | Fixed-layout documents |
| DOCX | .docx | 2007 | Microsoft | ECMA-376 | High | Depends | Yes | Partial | Yes | Good | Editable documents |
| ODT | .odt | 2005 | OASIS | ISO/IEC 26300 | High | Depends | Yes | Partial | Yes | Good | Open-source workflows |
| RTF | .rtf | 1987 | Microsoft | Published spec | High | Partial | No | Poor | No | Excellent | Cross-editor compatibility |
| TXT | .txt | 1960 | Various | N/A | Maximum | No | No | Perfect | No | Perfect | Plain text, code, logs |
| Markdown | .md | 2004 | John Gruber | CommonMark | High | Partial | No | Good (rendered) | No | Perfect | Documentation, README |
| LaTeX | .tex | 1984 | Leslie Lamport | De facto | Medium | Yes (compiled) | No | Compiled PDF | No | Excellent | Academic papers, math |
| EPUB | .epub | 2007 | IDPF | ISO/IEC 23736 | Medium | Reflowable | DRM (optional) | EPUB Accessibility | Yes | Good | E-books |
DOCX vs Google Docs: The Cloud Shift
The rise of cloud-based document editors has changed the document format landscape. Google Docs stores documents in its own proprietary format on Google's servers and converts to DOCX, PDF, or other formats on export. This means the "native" format is never actually DOCX — it is a Google-internal representation that uses Operational Transform (OT) for real-time collaboration.
Microsoft 365 (Word Online) takes the opposite approach: documents are stored as DOCX files on OneDrive, and the online editor reads and writes the same OOXML format as desktop Word. This provides better format fidelity when switching between web and desktop editing but limits some real-time collaboration features.
Notion, Coda, and similar tools represent a third approach: they abandon document formats entirely in favor of block-based databases. Content is stored as structured blocks (paragraphs, tables, embeds) that can be rendered as documents, databases, or kanban boards. Export to DOCX, PDF, or Markdown is available but is a secondary concern.
For organizations choosing a document strategy: if format fidelity and offline access matter, use DOCX with Microsoft 365. If real-time collaboration is the priority and format lock-in is acceptable, use Google Docs. If your content is primarily structured knowledge (wikis, documentation), consider Notion or Obsidian with Markdown.
HTML as a Document Format
HTML is often overlooked as a document format, but it is the most widely used document format in history. Every web page is an HTML document. Unlike PDF (fixed layout) or DOCX (editable), HTML is a reflowable, semantic document format that adapts to any screen size, supports accessibility natively, and can be styled with CSS.
For technical documentation, HTML (generated from Markdown via static site generators like Docusaurus, MkDocs, or VitePress) offers significant advantages over PDF: full-text search, deep linking to sections, responsive layout, syntax highlighting, and interactive elements. The trade-off is that HTML documents lack the fixed-layout guarantee of PDF — they look different on different screens (by design).
Single-file HTML documents (.html) are also useful for archival: they can embed CSS, images (as data URIs), and JavaScript in a single self-contained file. The MHTML format (Multipart HTML, .mht) packages a web page with all its resources into one file, though browser support for creating MHTML is limited to Chromium-based browsers.
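As a sketch of the data-URI technique, the snippet below inlines an image into an HTML tag using Python's standard base64 module. The placeholder bytes and the alt text are illustrative; in practice you would read the real image file from disk.

```python
import base64

def embed_image_as_data_uri(image_bytes, mime="image/png", alt=""):
    """Inline an image into an HTML page as a base64 data URI (no external file)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f'<img alt="{alt}" src="data:{mime};base64,{b64}">'

# Demo with placeholder bytes; in practice:
#   tag = embed_image_as_data_uri(open("logo.png", "rb").read(), alt="Logo")
tag = embed_image_as_data_uri(b"\x89PNG...", alt="demo")
print(tag.startswith('<img alt="demo" src="data:image/png;base64,'))  # True
```

The trade-off: base64 inflates the payload by about 33%, so this makes sense for archival single-file pages, not for performance-sensitive delivery.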
Document Format Adoption by Industry
Different industries have dramatically different format preferences. Legal and financial sectors are dominated by PDF. Academic research relies heavily on LaTeX. Software development has embraced Markdown almost universally. The chart below shows adoption rates (percentage of organizations regularly using each format) across eight industries.
Document Format Adoption by Industry (%)
Source: OnlineTools4Free Research
Key Finding
Software engineering has the most unique format profile: 85% Markdown adoption (highest of any industry) and only 35% PDF usage (lowest of any industry).
This reflects the developer preference for plain-text, version-controllable documentation over binary office formats.
Part 3: Video Formats
~6,000 words covering codecs, containers, and streaming
Video formats are the most complex category in this guide because video involves two separate concepts that are frequently confused: codecs and containers. A codec (H.264, H.265, VP9, AV1) is the algorithm that compresses and decompresses video data. A container (MP4, WebM, MKV, MOV) is the file format that packages compressed video, audio, subtitles, and metadata into a single file.
Understanding this distinction is critical: an MP4 file is not "in H.264 format." MP4 is the container; H.264 is one of many codecs that can be stored inside an MP4 container. The same MP4 container could hold H.264, H.265, or AV1 video with AAC, AC-3, or Opus audio.
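The container/codec split is visible in the file structure itself. An MP4 file is a sequence of "boxes" (a 4-byte big-endian size plus a four-character type); the compressed codec payload is just bytes inside an mdat box, while metadata boxes describe which codec those bytes use. The minimal walker below is a sketch that parses the top-level box framing of a synthetic byte string (not a playable file); it ignores 64-bit and to-end sizes.

```python
import struct

def parse_boxes(data, offset=0, end=None):
    """Walk ISO BMFF (MP4) boxes: each is a 4-byte big-endian size + 4-byte type."""
    end = len(data) if end is None else end
    boxes = []
    while offset + 8 <= end:
        size, = struct.unpack_from(">I", data, offset)
        btype = data[offset + 4:offset + 8].decode("ascii")
        boxes.append((btype, size))
        if size < 8:
            break  # size 0 (to end of file) / 1 (64-bit size) not handled in this sketch
        offset += size
    return boxes

# Synthetic top-level layout: ftyp + mdat (just the box framing, not a playable file)
ftyp = struct.pack(">I", 16) + b"ftyp" + b"isom" + struct.pack(">I", 512)
mdat = struct.pack(">I", 12) + b"mdat" + b"\x00" * 4
print(parse_boxes(ftyp + mdat))  # [('ftyp', 16), ('mdat', 12)]
```

A real inspector would descend moov → trak → mdia → minf → stbl → stsd to read the codec's four-character code, which is exactly why "MP4" alone never tells you the codec.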
Video Compression Fundamentals: How Video Codecs Think
All modern video codecs (H.264, H.265, VP9, AV1) use the same fundamental approach: block-based hybrid coding. This approach has three stages that operate on every block (macroblock/CTU/superblock) of every frame.
Stage 1: Prediction. The encoder predicts what each block will look like based on either the current frame (intra prediction — using neighboring blocks) or reference frames (inter prediction — using previously encoded frames via motion vectors). Intra prediction uses angular modes to predict from edges, corners, and gradients. Inter prediction searches reference frames to find the best matching block and records the displacement as a motion vector.
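A toy illustration of inter-prediction motion search: the brute-force block matcher below (a simplified sketch, not a production motion-estimation algorithm) finds the displacement that minimizes the sum of absolute differences (SAD) between a block of the current frame and candidate positions in the reference frame.

```python
import numpy as np

def best_motion_vector(ref, cur_block, top, left, radius=4):
    """Full-search block matching: find the (dy, dx) displacement into the
    reference frame that minimizes SAD against the current block."""
    bh, bw = cur_block.shape
    best = (0, 0, float("inf"))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue
            sad = np.abs(ref[y:y+bh, x:x+bw].astype(int) - cur_block.astype(int)).sum()
            if sad < best[2]:
                best = (dy, dx, sad)
    return best

ref = np.zeros((32, 32), dtype=np.uint8)
ref[8:16, 8:16] = 200            # bright patch in the reference frame
cur = np.zeros((32, 32), dtype=np.uint8)
cur[8:16, 10:18] = 200           # same patch, moved 2 px right in the current frame

dy, dx, sad = best_motion_vector(ref, cur[8:16, 8:16], top=8, left=8)
print(dy, dx, sad)  # 0 -2 0: "copy this block from 2 px left in the reference"
```

Real encoders replace this exhaustive search with hierarchical and predictive search strategies, but the objective (a motion vector plus a small residual) is the same.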
Stage 2: Transform and Quantization. The difference between the prediction and the actual block (the "residual") is transformed using a DCT-like transform (integer DCT for H.264, variable-size DCT for H.265/AV1) to separate low-frequency and high-frequency components. The frequency coefficients are then quantized (rounded, losing information) — this is the lossy step. The QP (Quantization Parameter) or CRF (Constant Rate Factor) controls how aggressively coefficients are quantized.
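The transform-and-quantize stage can be sketched with a floating-point orthonormal DCT (real codecs use scaled integer approximations of the same transform family). Note how the energy concentrates in the low-frequency corner and how quantization zeroes out most coefficients; that is where the bit savings come from.

```python
import numpy as np

N = 4
# Orthonormal DCT-II basis matrix (codecs use scaled-integer versions of this)
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
D = np.sqrt(2 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
D[0, :] = np.sqrt(1 / N)

residual = np.array([[52, 55, 61, 66],
                     [70, 61, 64, 73],
                     [63, 59, 55, 90],
                     [67, 61, 68, 104]], dtype=float)

coeffs = D @ residual @ D.T          # transform: energy piles into the top-left (low freq)
Q = 16                               # quantization step: bigger Q = lossier
quantized = np.round(coeffs / Q)     # the lossy step: rounding discards detail
recon = D.T @ (quantized * Q) @ D    # decoder side: dequantize + inverse transform

print(int(np.count_nonzero(quantized)))      # only a few of 16 coefficients survive
print(float(np.abs(recon - residual).max())) # small, bounded reconstruction error
```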
Stage 3: Entropy Coding. The quantized coefficients, motion vectors, and coding decisions are entropy-coded using context-adaptive binary arithmetic coding (CABAC in H.264/H.265, a variant in AV1). CABAC adapts its probability models based on surrounding coded data, achieving better compression than fixed probability tables. The output is the final compressed bitstream.
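Why adaptation helps can be shown without building a full arithmetic coder: the sketch below compares the ideal coding cost (in bits, i.e. the information content under the model) of a skewed binary stream under a fixed 50/50 model versus a simple count-based adaptive model, loosely analogous to a CABAC context. The count-based update here is illustrative, not the actual CABAC update rule.

```python
import math

def code_cost_bits(bits, adaptive=True):
    """Ideal arithmetic-coding cost of a binary stream.
    Adaptive: a Laplace-style count model updated per symbol (like a context).
    Fixed: always assume p(1) = 0.5, i.e. exactly 1 bit per symbol."""
    ones, total, cost = 1, 2, 0.0          # Laplace smoothing: start at p(1) = 1/2
    for b in bits:
        p1 = ones / total if adaptive else 0.5
        p = p1 if b == 1 else 1.0 - p1
        cost += -math.log2(p)              # arithmetic coding approaches this bound
        ones += b
        total += 1
    return cost

stream = [1] * 90 + [0] * 10               # a skewed context: mostly 1s
print(round(code_cost_bits(stream, adaptive=False), 1))  # 100.0 bits (fixed model)
print(round(code_cost_bits(stream, adaptive=True), 1))   # far fewer bits
```

The adaptive model learns the skew from the data itself and pays roughly half the bits, which is precisely the advantage CABAC's per-context probability adaptation buys over fixed tables.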
Additionally, in-loop filters are applied after reconstruction to reduce artifacts before the frame is used as a reference for subsequent frames. H.264 has a deblocking filter. H.265 adds SAO (Sample Adaptive Offset). AV1 adds CDEF (Constrained Directional Enhancement Filter) and Loop Restoration Filter. These filters ensure that encoding artifacts do not propagate and amplify across frames.
Codecs vs Containers: The Fundamental Distinction
Think of a container as a shipping box and the codec as the way the contents are packed inside. The box (MP4, MKV) determines the label, the shipping method, and what kinds of items can go inside. The packing method (H.264, AV1) determines how efficiently the contents fit and how much space they take.
When someone says "I have an MP4 video," they are telling you the container format, but you still do not know the video codec (could be H.264, H.265, or AV1), the audio codec (could be AAC, MP3, or Opus), or any quality parameters. This is why "MP4" alone does not fully describe a video file.
H.264/AVC: The Universal Codec
H.264 (also known as AVC, Advanced Video Coding) was standardized in 2003 and quickly became the most widely used video codec in history. It powers Blu-ray discs, digital television, video surveillance, video conferencing, and the majority of internet video streaming. Every modern device with a screen has H.264 hardware decoding.
How H.264 Works
H.264 divides each frame into macroblocks (16x16 pixels) and uses three types of frames: I-frames (intra, complete pictures), P-frames (predicted from previous frames), and B-frames (bidirectional, predicted from both previous and future frames). For each macroblock, the encoder finds the best match from previously encoded frames (motion estimation), computes the difference (residual), transforms it with integer DCT, quantizes the coefficients, and entropy-codes the result using CABAC (Context-Adaptive Binary Arithmetic Coding) or CAVLC.
The result is remarkably efficient: uncompressed 1080p30 video runs at roughly 750 Mbps, and MPEG-2 needs 20+ Mbps for broadcast-quality HD, but H.264 produces comparable quality at 5 Mbps. This roughly 4x improvement over MPEG-2 (and better than 100:1 versus uncompressed) is what made HD video streaming viable on consumer internet connections.
Patent Situation
H.264 is covered by patents held by dozens of companies, managed by the MPEG LA patent pool. For internet video that is free to end users, MPEG LA offers royalty-free licensing. For pay-per-view or subscription content, licensing fees apply. This patent complexity motivated Google to develop VP8/VP9 and later AV1 as royalty-free alternatives.
H.265/HEVC: Double the Efficiency, Triple the Patent Mess
H.265 (High Efficiency Video Coding, also called HEVC) was standardized in 2013. It delivers approximately 40% better compression than H.264 at the same quality, or equivalently, the same quality at 40% lower bitrate. It achieves this through larger coding tree units (CTU, up to 64x64 vs 16x16), more intra prediction modes (35 vs 9), improved motion compensation, sample-adaptive offset filtering, and more reference frames.
However, H.265 adoption has been hampered by its patent licensing nightmare. Three separate patent pools (MPEG LA, HEVC Advance, and Velos Media) claim essential patents, each demanding separate royalties. Some patent holders are not in any pool. The total cost and legal complexity of deploying H.265 is so high that many companies (including Google and Mozilla) refused to adopt it in browsers, which is why Chrome and Firefox support VP9 and AV1 but not H.265 for web video.
H.265 is widely used in cable TV, satellite broadcasting, Blu-ray UHD, and Apple's ecosystem (iPhone recording, Apple TV+, FaceTime). Safari supports H.265 playback. For web developers, H.264 remains the safe baseline with AV1 as the modern upgrade path.
VP9: Google's Royalty-Free Answer
VP9 was developed by Google and released in 2013 as a royalty-free alternative to H.265. It achieves roughly similar compression efficiency to H.265 (about 35-40% better than H.264) without patent royalties. YouTube adopted VP9 as its primary codec for 4K content, and it is supported by Chrome, Firefox, Edge, and most Android devices.
VP9 uses superblocks (up to 64x64), 10 prediction modes, 8-tap interpolation filters, and a tile-based parallel processing architecture. Profile 2 adds 10-bit and 12-bit color depth for HDR content. VP9 hardware decoding is available on most devices manufactured since 2015.
H.266/VVC: The Next Generation (2020)
H.266 (Versatile Video Coding, VVC) was standardized in July 2020 and achieves approximately 50% better compression than H.265 — the same generational improvement that H.265 achieved over H.264. VVC uses coding tree units up to 128x128 pixels with more flexible partitioning options, 67 intra prediction modes (vs 35 for H.265), geometric partitioning for inter prediction, and adaptive loop filtering.
However, H.266/VVC faces the same patent challenges that hampered H.265. The MC-IF (Media Coding Industry Forum) is managing patent licensing, but the total cost and complexity remain unclear. Encoding complexity is also extreme: current VVC encoders are 10-20x slower than H.265 encoders, which were already considered slow.
Hardware decoder support for VVC is just beginning in 2026. MediaTek Dimensity 9300+ and Qualcomm Snapdragon 8 Gen 3 include hardware VVC decoders. No web browser supports VVC playback. The format will likely find adoption in broadcast television (where patent pools are established) and mobile devices (where hardware decoders are available), but AV1 is likely to dominate web video due to its royalty-free status and existing broad support.
ProRes: Apple's Professional Codec
Apple ProRes is a family of lossy video codecs designed for post-production editing. Unlike H.264 and AV1 (which are optimized for distribution/streaming), ProRes is optimized for editing: it uses intra-frame-only compression (every frame is a complete picture, enabling instant seeking) and predictable data rates that match the speed of professional storage systems.
ProRes comes in several variants: ProRes 422 Proxy (~45 Mbps at 1080p, for offline editing), ProRes 422 LT (~100 Mbps), ProRes 422 (~145 Mbps, the standard), ProRes 422 HQ (~220 Mbps, broadcast quality), ProRes 4444 (~330 Mbps, with alpha channel), and ProRes 4444 XQ (~500 Mbps, highest quality). ProRes RAW captures sensor data directly from cinema cameras with minimal processing.
Since Apple Silicon chips include hardware ProRes encoding and decoding, ProRes is essentially a native format on modern Macs. iPhone 13 Pro and later can record ProRes video directly. For cross-platform editing workflows, DNxHR (Avid) is the main alternative to ProRes, offering similar intra-frame performance with broader tool support on Windows.
AV1: The Future of Video
AV1 was developed by the Alliance for Open Media (AOM), a consortium including Google, Mozilla, Netflix, Amazon, Apple, Microsoft, Intel, AMD, ARM, and others. Released in 2018, AV1 is royalty-free and achieves approximately 50% better compression than H.264 and 20% better than H.265/VP9.
Technical Advances
AV1 uses superblocks up to 128x128 pixels with recursive quad/binary partitioning. It supports 56 intra prediction modes (vs 35 for H.265), directional prediction with angles fine-tuned to the content, and intra block copy (for screen content). Inter prediction uses compound reference frames, overlapped block motion compensation, and warped motion for non-translational movement.
Post-processing includes three in-loop filters: a deblocking filter, CDEF (Constrained Directional Enhancement Filter) for ringing artifact removal, and a loop restoration filter (Wiener or self-guided) for general noise reduction. AV1 also supports film grain synthesis, where the encoder analyzes and removes film grain, transmits grain parameters, and the decoder re-synthesizes grain at playback. This saves substantial bitrate on grainy content.
Adoption Status
YouTube has been encoding new uploads in AV1 since 2020 and serves AV1 to supporting devices. Netflix uses AV1 for all titles on Android devices and smart TVs with AV1 hardware support. Hardware decoding is available on MediaTek Dimensity 1000+, Samsung Exynos 2100+, Intel 11th-gen+, AMD RDNA 3, and NVIDIA RTX 30-series+. All major browsers support AV1 decoding. Real-time AV1 encoding is now practical with SVT-AV1 (Intel) and hardware encoders.
Key Finding
AV1 achieves 50% better compression than H.264 and is royalty-free. YouTube and Netflix have adopted it, and hardware decoder support is now widespread.
For new video projects, encode in AV1 with H.264 fallback. The dual-format approach covers every device while minimizing bandwidth costs.
Video Codec Efficiency Comparison
The chart below shows PSNR (peak signal-to-noise ratio, a quality metric where higher is better) at different bitrates for five codecs encoding the same 1080p test content. H.266/VVC and AV1 achieve the highest quality at every bitrate, followed by H.265, VP9, and H.264.
Video Quality (PSNR) vs Bitrate by Codec
Source: OnlineTools4Free Research
AV1 Hardware Decoder/Encoder Support in 2026
Hardware support determines real-world codec adoption because software decoding of high-resolution video drains battery and generates heat. AV1 hardware decoder support in 2026 covers:
Mobile SoCs: MediaTek Dimensity 1000+ (2020), Samsung Exynos 2100+ (2021), Qualcomm Snapdragon 8 Gen 2+ (2022), Apple A17 Pro+ (2023), Google Tensor G2+ (2022). Virtually all flagship phones sold since 2023 have AV1 hardware decode.
Desktop/Laptop: Intel 11th-gen+ (Tiger Lake, 2020) for decode, Intel Arc GPUs (2022) for encode. AMD RDNA 2 (RX 6000, 2020) for decode, RDNA 3 (RX 7000, 2022) for encode. NVIDIA RTX 30-series (2020) for decode, RTX 40-series (2022) for encode. Apple M3+ (2023) for decode (earlier Apple silicon falls back to software AV1 decoding).
Smart TVs: Most TVs manufactured since 2022 include AV1 hardware decode, especially models running Android TV. Samsung, LG, Sony, TCL, and Hisense all ship AV1-capable TVs. YouTube, Netflix, and Disney+ leverage this hardware for 4K AV1 streaming.
Hardware encoding is critical for live streaming, video conferencing, and real-time content creation. NVIDIA NVENC AV1 (RTX 40-series), AMD VCN AV1 (RX 7000), and Intel Quick Sync AV1 (Arc GPUs) enable real-time AV1 encoding at resolutions up to 8K. OBS Studio, Discord, and Google Meet all support hardware AV1 encoding where available.
Container Formats Compared
MP4 (MPEG-4 Part 14) is the universal container, supported by every device and browser. WebM is Google's web-focused container based on Matroska, limited to VP8/VP9/AV1 video with Vorbis/Opus audio. MKV (Matroska) is the most flexible container, supporting virtually any codec, multiple audio tracks, embedded subtitles, chapters, and attachments. MOV is Apple's QuickTime container, essentially identical to MP4 but with Apple-specific extensions.
Video Container Formats
| Format | Container Name | Year | Common Codecs | Streaming | Subtitles | Multi Audio | Browser % | Royalty Free | Best For |
|---|---|---|---|---|---|---|---|---|---|
| MP4 | MPEG-4 Part 14 | 2001 | H.264, H.265, AAC | Yes | Limited | Yes | 100% | No (H.264) | Universal playback |
| WebM | Matroska-based | 2010 | VP8, VP9, AV1, Opus | Yes | WebVTT | Yes | 96% | Yes | Web video |
| MKV | Matroska | 2002 | Any codec | Limited | SRT, ASS, SSA | Yes | ~20% | Yes | Archival, multiple tracks |
| MOV | QuickTime | 1991 | H.264, ProRes, AAC | Yes | Text track | Yes | ~60% | Partial | Apple ecosystem, editing |
| AVI | AVI | 1992 | DivX, XviD, MP3 | No | External only | Limited | ~15% | Yes | Legacy compatibility |
| FLV | Flash Video | 2003 | H.263, VP6, MP3 | RTMP | Limited | No | 0% (Flash dead) | No | Nothing (obsolete) |
| TS | MPEG-TS | 1995 | H.264, H.265, AAC | HLS | DVB | Yes | Via HLS | No | Broadcast, HLS streaming |
| OGV | Ogg | 2004 | Theora, Vorbis | Limited | Kate | Yes | ~80% | Yes | Open-source projects |
Video Codec Technical Matrix
Video Codecs: Technical Comparison
| Codec | Year | License | Quality/Bit | Encoding | Decoding | HW Support | HDR | Adoption |
|---|---|---|---|---|---|---|---|---|
| H.264/AVC | 2003 | MPEG LA | Baseline | Low | Low | Universal | No | 95% |
| H.265/HEVC | 2013 | MPEG LA + others | +40% vs H.264 | High | Medium | Widespread | Yes | 65% |
| VP9 | 2013 | Royalty-free | +35% vs H.264 | High | Medium | Good | Yes (Profile 2) | 60% |
| AV1 | 2018 | Royalty-free | +50% vs H.264 | Very High | Medium | Growing | Yes | 40% |
| H.266/VVC | 2020 | MC-IF | +50% vs H.265 | Extreme | High | Emerging | Yes | 5% |
| VP8 | 2008 | Royalty-free | ~H.264 | Low | Low | Legacy | No | 30% |
| Theora | 2004 | Royalty-free | -20% vs H.264 | Low | Low | None | No | <5% |
| ProRes | 2007 | Apple | Near-lossless | Low | Low | Apple silicon | Yes | 20% (pro) |
What Streaming Platforms Use
Major streaming platforms have adopted different codec strategies based on their content, audience, and device ecosystem. YouTube and Netflix lead AV1 adoption. Apple TV+ and Disney+ rely on H.265 for their Apple-centric audiences. The table below shows current codec choices as of 2026.
Streaming Platform Format Choices (2026)
| Platform | Primary Codec | Fallback | Container | Max Resolution | HDR Formats | Audio Codec |
|---|---|---|---|---|---|---|
| YouTube | AV1 | VP9, H.264 | WebM, MP4 | 8K | HDR10, HLG | Opus, AAC |
| Netflix | AV1 | H.265, VP9 | CMAF, MP4 | 4K | Dolby Vision, HDR10 | AAC, E-AC-3, Atmos |
| Twitch | H.264 | AV1 (beta) | FMP4, TS | 1080p60 | None | AAC |
| Disney+ | H.265 | H.264 | CMAF | 4K | Dolby Vision, HDR10 | E-AC-3, Atmos |
| Vimeo | H.264 | H.265 | MP4 | 8K | HDR10 | AAC |
| TikTok | H.264 | H.265 | MP4 | 1080p | HDR10 | AAC |
| | H.264 | H.265 | MP4 | 1080p | None | AAC |
| Apple TV+ | H.265 | H.264 | CMAF | 4K | Dolby Vision | AAC, Atmos |
Bitrate, Resolution, and Frame Rate
These three parameters interact to determine video quality and file size. Resolution defines the number of pixels per frame (1920x1080 = 2,073,600 pixels). Frame rate defines frames per second (24, 30, or 60 fps). Bitrate defines the data rate (measured in Mbps). For a given codec, doubling the pixel count approximately doubles the required bitrate for the same quality. Doubling the frame rate increases required bitrate by 40-60% (not 100%, because successive frames are similar and compress well).
Practical bitrate recommendations for H.264: 1080p30 at 5-8 Mbps for streaming, 8-15 Mbps for high quality, 20-50 Mbps for archival. For AV1, reduce these numbers by roughly half. 4K requires 2-4x the bitrate of 1080p for the same quality level.
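A quick sanity check on these numbers, using the 1080p30 rates above plus a nominal audio track (the 128 kbps audio figure is an assumption for illustration):

```python
def file_size_mb(bitrate_kbps, minutes, audio_kbps=128):
    """Approximate file size: (video + audio bitrate) x duration / 8, in decimal MB."""
    total_kbps = bitrate_kbps + audio_kbps
    return total_kbps * minutes * 60 / 8 / 1000  # kbits -> kilobytes -> MB

# A 10-minute 1080p30 clip at the streaming rates suggested above
print(round(file_size_mb(5000, 10), 1))  # H.264 @ 5 Mbps  -> 384.6 MB
print(round(file_size_mb(2500, 10), 1))  # AV1  @ 2.5 Mbps -> 197.1 MB
```

The halved AV1 bitrate translates one-to-one into halved storage and bandwidth, which is the economic driver behind the codec transitions discussed in this part.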
Standard Bitrate Ladder for Adaptive Streaming
Adaptive bitrate streaming requires encoding the same video at multiple quality levels. The player selects the highest quality that the viewer's connection can sustain without buffering. The table below shows recommended bitrates for H.264, H.265, and AV1 at each resolution tier. AV1 requires roughly half the bitrate of H.264 for equivalent quality, which translates directly to bandwidth cost savings.
Adaptive Streaming Bitrate Ladder (kbps)
| Quality | Resolution | FPS | H.264 kbps | H.265 kbps | AV1 kbps |
|---|---|---|---|---|---|
| 240p | 426x240 | 30 | 400 | 250 | 200 |
| 360p | 640x360 | 30 | 800 | 500 | 400 |
| 480p | 854x480 | 30 | 1400 | 900 | 700 |
| 720p30 | 1280x720 | 30 | 2800 | 1800 | 1400 |
| 720p60 | 1280x720 | 60 | 4500 | 2800 | 2200 |
| 1080p30 | 1920x1080 | 30 | 5000 | 3200 | 2500 |
| 1080p60 | 1920x1080 | 60 | 8000 | 5000 | 4000 |
| 1440p30 | 2560x1440 | 30 | 12000 | 7500 | 6000 |
| 1440p60 | 2560x1440 | 60 | 18000 | 11000 | 9000 |
| 4K30 | 3840x2160 | 30 | 20000 | 12000 | 10000 |
| 4K60 | 3840x2160 | 60 | 35000 | 20000 | 16000 |
| 8K30 | 7680x4320 | 30 | 80000 | 45000 | 35000 |
Understanding I-Frames, P-Frames, and B-Frames
Video compression works by exploiting temporal redundancy: consecutive frames are usually very similar. Instead of storing every frame independently (like a flipbook), modern codecs store complete reference frames (I-frames) and then describe subsequent frames as differences from those references. P-frames reference previous frames; B-frames reference both previous and future frames.
The table below shows a typical breakdown for a 1-minute 1080p30 clip encoded with H.264. I-frames are the largest (280 KB average) because they contain a complete picture, but they are rare (15 in 60 seconds, one every 4 seconds, i.e. a GOP of 120 frames at 30 fps). P-frames are 42 KB on average and make up the bulk of data. B-frames are the smallest at 18 KB each. The ratio of frame types directly affects both quality and seekability: more I-frames enable faster seeking but increase file size.
H.264 Frame Type Distribution (1-min 1080p30 clip)
| Frame Type | Count | Avg Size (KB) | % of Total | Description |
|---|---|---|---|---|
| I-frame (Intra) | 15 | 280 | 8 | Complete picture — no references. Used for seeking, scene changes. Interval: every 4 seconds (GOP=120). |
| P-frame (Predicted) | 585 | 42 | 49 | Predicted from previous I or P frames. Contains motion vectors + residual data. Most common frame type. |
| B-frame (Bidirectional) | 1200 | 18 | 43 | Predicted from both past and future reference frames. Smallest frames. Typically 2 B-frames between P-frames. |
H.264 Profile and Level System
H.264 defines profiles that specify which coding tools are available. The Baseline profile omits B-frames and CABAC entropy coding, making it simpler to decode (suitable for mobile and video conferencing). The Main profile adds B-frames and CABAC for 20-30% better compression. The High profile adds 8x8 transforms and additional quantization options for the best quality per bit.
H.264 Profiles
| Profile | B-Frames | CABAC | Max Resolution | Use Case |
|---|---|---|---|---|
| Baseline | No | No | 1920x1080 | Video conferencing, mobile |
| Main | Yes | Yes | 1920x1080 | Standard definition broadcast |
| High | Yes | Yes | 4096x2304 | Blu-ray, streaming HD |
| High 10 | Yes | Yes | 4096x2304 | 10-bit HDR content |
| High 4:2:2 | Yes | Yes | 4096x2304 | Professional 4:2:2 workflows |
| High 4:4:4 | Yes | Yes | 4096x2304 | Screen capture, lossless |
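These profiles and levels are what RFC 6381 codec strings like avc1.4D401E (seen in HTML source types and HLS playlists) encode: two hex digits of profile_idc, two of constraint flags, and two of level_idc (the level times 10). Below is a small decoder sketch with a deliberately partial profile table:

```python
# Partial map of H.264 profile_idc values (not the full list in the standard)
PROFILES = {66: "Baseline", 77: "Main", 100: "High", 110: "High 10",
            122: "High 4:2:2", 244: "High 4:4:4"}

def parse_avc1(codec_string):
    """Decode an RFC 6381 'avc1.PPCCLL' string into (profile, constraint
    flags, level). PP/CC/LL are hex: profile_idc, constraints, level_idc."""
    _, hexpart = codec_string.split(".")
    profile_idc = int(hexpart[0:2], 16)
    constraints = int(hexpart[2:4], 16)
    level = int(hexpart[4:6], 16) / 10
    return PROFILES.get(profile_idc, f"profile_idc {profile_idc}"), constraints, level

print(parse_avc1("avc1.4D401E"))  # ('Main', 64, 3.0) -> Main profile, level 3.0
print(parse_avc1("avc1.640028"))  # ('High', 0, 4.0)  -> High profile, level 4.0
```

This is why a server or player can reject a stream from the codec string alone, before downloading any video data.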
Video Encoding Speed Comparison
The chart below compares encoding speed (frames per second) for different video codecs at 1080p resolution. x264 (H.264) is the fastest, encoding at 48-110 fps depending on quality. SVT-AV1 is practical for production use at 6-16 fps. The reference AV1 encoder (aomenc) is extremely slow at under 1 fps, suitable only for offline batch encoding. H.266/VVC encoding is even slower, reflecting its status as an emerging codec.
Video Encoding Speed (frames/sec, 1080p single-thread)
| Codec | High Q (fps) | Medium Q (fps) | Low Q (fps) | Preset |
|---|---|---|---|---|
| x264 (H.264) | 48 | 72 | 110 | medium |
| x265 (H.265) | 12 | 18 | 28 | medium |
| VP9 (libvpx) | 8 | 14 | 22 | good |
| SVT-AV1 | 6 | 10 | 16 | 6 |
| aomenc (AV1) | 0.8 | 1.5 | 2.5 | cpu-used=4 |
| rav1e (AV1) | 3.5 | 5.5 | 8.5 | 6 |
| VVenC (H.266) | 0.3 | 0.5 | 0.8 | medium |
YouTube Codec Adoption Trend
YouTube is the world's largest video platform and its codec choices drive industry adoption. The chart below shows how YouTube has shifted from H.264 dominance (95% in 2016) to AV1 majority (58% in 2026). VP9 served as a bridge technology, peaking at 40% in 2021 before being gradually replaced by AV1. This transition has saved Google and its users billions of dollars in bandwidth costs while delivering better video quality, especially on mobile connections.
YouTube Video Codec Distribution by Year (%)
Source: OnlineTools4Free Research
Streaming Protocols Compared
Video streaming uses specialized protocols to deliver video over HTTP. HLS (HTTP Live Streaming, Apple) and DASH (Dynamic Adaptive Streaming over HTTP, MPEG) are the two dominant protocols. CMAF (Common Media Application Format) unifies them using a shared segment format. Low-latency variants (LL-HLS, LL-DASH) reduce latency from 10-30 seconds to 2-4 seconds for live streaming. WebRTC provides sub-second latency for real-time communication.
Streaming Protocols Compared
| Protocol | Developer | Year | Latency | ABR | Browser % |
|---|---|---|---|---|---|
| HLS | Apple | 2009 | 6-30s | Yes | 95% |
| DASH | MPEG/ISO | 2012 | 6-30s | Yes | 80% |
| CMAF | MPEG/Apple/Microsoft | 2018 | 2-6s | Yes | 90% |
| LL-HLS | Apple | 2020 | 2-4s | Yes | 85% |
| LL-DASH | DASH-IF | 2020 | 2-4s | Yes | 75% |
| WebRTC | Google/IETF | 2011 | <500ms | Yes | 97% |
| RTMP | Adobe/Macromedia | 2002 | 1-5s | No | 0% (Flash dead) |
| SRT | Haivision | 2017 | <1s | No | 0% (ingest only) |
| WHIP | IETF | 2022 | <500ms | Yes | 95% |
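To make the protocol/ladder relationship concrete, here is a minimal, hypothetical HLS master playlist exposing an AV1 rendition with an H.264 fallback. The rendition paths, bandwidth values, and exact CODECS strings are illustrative, not a production configuration:

```
#EXTM3U
#EXT-X-VERSION:7

# 1080p30 AV1 rendition
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1920x1080,CODECS="av01.0.08M.08,opus"
av1_1080p/playlist.m3u8

# 720p30 H.264 fallback rendition
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
h264_720p/playlist.m3u8
```

The player picks a rendition from BANDWIDTH and CODECS before fetching any media segments, which is how adaptive streaming and codec fallback compose in practice.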
Code Example: Embedding Video with AV1 Fallback
The HTML <video> element supports multiple source formats. Use the codecs parameter to specify the exact codec, allowing the browser to skip formats it cannot decode without downloading them.
<video
autoplay muted loop playsinline
poster="thumb.webp"
width="800" height="450">
<source src="output-av1.mp4" type='video/mp4; codecs="av01.0.05M.08"'>
<source src="output-h264.mp4" type='video/mp4; codecs="avc1.4D401E"'>
Your browser does not support the video element.
</video>
For converting video between formats, ffmpeg is the standard tool. The commands below convert a source video to both AV1 (primary) and H.264 (fallback).
# Convert video to AV1 with H.264 fallback
ffmpeg -i input.mov -c:v libaom-av1 -crf 30 -b:v 0 \
-c:a libopus -b:a 128k output-av1.mp4
ffmpeg -i input.mov -c:v libx264 -crf 23 \
-c:a aac -b:a 192k output-h264.mp4
Part 4: Audio Formats
~5,000 words covering 9 audio formats
Audio formats fall into three categories: uncompressed (WAV, AIFF), lossless compressed (FLAC, ALAC), and lossy compressed (MP3, AAC, Opus, Vorbis, WMA). The choice between them involves trade-offs between quality, file size, compatibility, and feature support.
Audio Loudness Standards: The Loudness War is Over
For decades, music producers made tracks as loud as possible ("the loudness war"), compressing dynamic range to achieve higher average levels. This practice peaked in the mid-2000s and has been reversed by streaming platform normalization.
All major streaming platforms now normalize playback volume to a target loudness level measured in LUFS (Loudness Units relative to Full Scale). Spotify normalizes to -14 LUFS, Apple Music to -16 LUFS, YouTube to -14 LUFS, and Tidal to -14 LUFS. A track mastered at -8 LUFS (very loud, heavily compressed) will be turned down by 6 dB on Spotify, while a dynamic track mastered at -14 LUFS will play at its original level. This means over-compressed masters actually sound worse on streaming platforms because they have less dynamic range without any loudness advantage.
For podcasts, the standard is -16 LUFS (stereo) or -19 LUFS (mono) per the Podcast Standards Project and Apple's specifications. YouTube targets -14 LUFS for all content. Broadcast television uses -23 LUFS (EBU R128 in Europe) or -24 LKFS (ATSC A/85 in North America).
The loudness meter and normalization process operate on the final encoded file regardless of format. However, the format affects how well the dynamic range is preserved. At low bitrates, lossy codecs can introduce artifacts on loud transients. Opus handles this best due to its adaptive mode switching; MP3 at 128 kbps can produce pre-echo artifacts on transients (audible distortion slightly before drum hits) that Opus avoids entirely.
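The normalization arithmetic itself is trivial: the platform applies the difference between its target and the measured integrated loudness. A one-line sketch (how platforms treat tracks quieter than target varies; some boost with a limiter, some leave them alone):

```python
def normalization_gain_db(master_lufs, target_lufs=-14.0):
    """Gain a streaming platform applies: target loudness minus measured loudness."""
    return target_lufs - master_lufs

# Against a Spotify/YouTube-style -14 LUFS target:
print(normalization_gain_db(-8.0))   # -6.0 -> loud, crushed master turned DOWN 6 dB
print(normalization_gain_db(-14.0))  #  0.0 -> dynamic master plays untouched
print(normalization_gain_db(-18.0))  #  4.0 -> quiet master may be turned up (platform-dependent)
```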
FLAC Internals: How Lossless Audio Compression Works
FLAC achieves lossless compression by exploiting the predictability of audio signals. Audio samples are not random — each sample is strongly correlated with its neighbors. FLAC uses linear predictive coding (LPC) to model this correlation.
Step 1: Blocking. The audio stream is divided into fixed-size blocks of samples (typically 4,096). Each block is compressed independently, allowing seeking to any block without decompressing the entire file.
Step 2: Inter-channel decorrelation. For stereo audio, FLAC can store left and right channels independently, or use mid/side encoding (M = (L+R)/2, S = L-R). Mid/side is more efficient when the channels are similar (most stereo music), because the side channel has lower amplitude and compresses better.
Step 3: Linear prediction. The encoder tries multiple predictors (fixed polynomial predictors of orders 0-4, plus LPC up to order 32, though typical settings search up to order 8-12) and selects the one that minimizes the prediction residual. For a 4th-order predictor, each sample is predicted as: P[n] = a1*S[n-1] + a2*S[n-2] + a3*S[n-3] + a4*S[n-4]. The residual (actual - predicted) is typically much smaller than the original samples.
Step 4: Rice coding. The residual values are encoded using Rice coding, an entropy coding scheme optimized for small, near-zero values (which prediction residuals typically are). Rice coding is simpler and faster than Huffman coding while being nearly as efficient for this specific distribution.
The result: CD-quality audio (16-bit, 44.1 kHz) compresses to approximately 50-60% of its PCM size. Classical music and quiet acoustic recordings compress better (40-50%) because they have lower entropy. Electronic music and heavily distorted content compresses worse (55-65%) because it has higher entropy and less predictability.
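The predict-then-Rice-code pipeline can be demonstrated end to end on a synthetic signal. The sketch below uses FLAC's fixed order-2 predictor and counts the bits a Rice code with parameter k=2 would spend on the residual. This is a simplification: real FLAC chooses the predictor order and the Rice parameter adaptively per block and partition.

```python
import math

def fixed_predict_residual(samples):
    """FLAC-style fixed order-2 predictor: p[n] = 2*s[n-1] - s[n-2]."""
    res = list(samples[:2])                      # warm-up samples stored verbatim
    for n in range(2, len(samples)):
        res.append(samples[n] - (2 * samples[n-1] - samples[n-2]))
    return res

def rice_bits(residual, k=2):
    """Bits to Rice-code the residual: unary quotient + stop bit + k remainder bits."""
    total = 0
    for r in residual:
        u = 2 * r if r >= 0 else -2 * r - 1      # zigzag map: signed -> unsigned
        total += (u >> k) + 1 + k
    return total

# A smooth synthetic signal: a slow sine wave
samples = [round(1000 * math.sin(2 * math.pi * n / 64)) for n in range(256)]
residual = fixed_predict_residual(samples)

print(max(abs(s) for s in samples))              # 1000: raw samples are large
print(max(abs(r) for r in residual[2:]))         # ~12: prediction removed most energy
print(rice_bits(residual[2:]), "vs", 16 * 254)   # far fewer bits than 16/sample PCM
```

Because the residual values cluster near zero, a few bits per sample suffice where raw PCM would spend 16, which is exactly the mechanism behind FLAC's 50-60% size reduction.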
MP3: The Format That Changed Music
MP3 (MPEG-1 Audio Layer III) was standardized in 1993 and revolutionized music distribution by enabling a 10:1 compression ratio with acceptable audio quality. A typical 4-minute song is about 4 MB at 128 kbps (vs 40 MB as WAV), making it practical to download and share music over dial-up internet connections. The cultural impact of MP3 on the music industry — through Napster, iTunes, and the iPod — is difficult to overstate.
How MP3 Compression Works
MP3 exploits psychoacoustic principles of human hearing. The audio signal is transformed to the frequency domain using a modified DCT (MDCT). A psychoacoustic model analyzes the signal to determine which frequencies are masked (inaudible because of louder nearby frequencies). The encoder allocates bits to audible components and discards masked components. This is called perceptual coding.
Key psychoacoustic phenomena exploited: simultaneous masking (a loud tone at 1 kHz makes nearby frequencies inaudible), temporal masking (a loud sound masks softer sounds immediately before and after it), and the absolute threshold of hearing (frequencies below certain amplitudes are inaudible regardless of other sounds). By discarding information the ear cannot perceive, MP3 achieves high compression with minimal audible quality loss.
Bitrate Modes
MP3 supports three bitrate modes: CBR (Constant Bitrate, same bitrate throughout), VBR (Variable Bitrate, adapts to content complexity), and ABR (Average Bitrate, targets a specific average). VBR produces the best quality per file size because it allocates more bits to complex passages (cymbals, vocal harmonics) and fewer bits to simple passages (silence, sustained notes). LAME's VBR V0 setting (~245 kbps average) is considered transparent (indistinguishable from CD) by most listeners in blind tests.
All MP3 patents expired by 2017, making the format completely royalty-free. However, MP3 is technically inferior to AAC and Opus at every bitrate. The LAME encoder (the best MP3 encoder) has not seen significant development since the late 2000s. MP3 remains relevant purely due to its universal compatibility.
WAV: Uncompressed Studio Audio
WAV (Waveform Audio File Format) stores audio as uncompressed PCM (Pulse Code Modulation) data. CD-quality WAV is 16-bit, 44.1 kHz, stereo, producing a bitrate of 1,411 kbps (1.41 Mbps). A 4-minute song is approximately 42 MB. Professional WAV recordings use 24-bit or 32-bit floating-point at 48 kHz, 96 kHz, or even 192 kHz sample rates.
WAV is the standard format in recording studios, sound design, broadcast, and any workflow where quality cannot be compromised. It is the format that DAWs (Digital Audio Workstations) like Pro Tools, Logic Pro, Ableton Live, and FL Studio use internally. WAV files are fast to read and write because no decompression is needed, which matters for real-time multi-track recording.
FLAC: The Audiophile Standard
FLAC (Free Lossless Audio Codec) compresses audio without any quality loss, typically achieving a 50-60% size reduction compared to WAV. A 4-minute CD-quality song is about 20-25 MB in FLAC (vs 42 MB WAV). FLAC is the de facto standard for lossless music distribution: Bandcamp, Tidal HiFi, Amazon Music HD, and Qobuz all use FLAC.
How FLAC Works
FLAC uses linear prediction to model each audio sample based on previous samples. The encoder tries multiple prediction orders and selects the one that minimizes the residual (difference between predicted and actual values). The residual is then entropy-coded using Rice coding, which is highly efficient for the small, near-zero values typical of prediction residuals.
FLAC supports sample rates from 1 Hz to 655,350 Hz, bit depths from 4 to 32 bits, and up to 8 channels. It includes MD5 checksumming for integrity verification, cue sheet support for album indexing, and Vorbis comment tags for metadata.
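To make the prediction-plus-residual idea concrete, here is a toy first-order predictor in Python. Real FLAC chooses higher-order predictors per block and Rice-codes the residuals; this sketch only shows why residuals are small and the round trip lossless:

```python
import math

def first_order_residuals(samples: list[int]) -> list[int]:
    """Predict each sample as equal to the previous one; keep only the error.
    For smooth waveforms the residuals cluster near zero — exactly the
    distribution Rice coding compresses well."""
    residuals = [samples[0]]                      # first sample stored verbatim
    for i in range(1, len(samples)):
        residuals.append(samples[i] - samples[i - 1])
    return residuals

def reconstruct(residuals: list[int]) -> list[int]:
    """Lossless inverse: a running sum restores the original samples exactly."""
    samples = [residuals[0]]
    for r in residuals[1:]:
        samples.append(samples[-1] + r)
    return samples

wave = [round(1000 * math.sin(i / 10)) for i in range(50)]  # smooth test signal
res = first_order_residuals(wave)
assert reconstruct(res) == wave               # bit-perfect round trip
print(max(abs(r) for r in res[1:]))           # residuals ~100 vs samples ~1000
```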
AAC: Apple's Codec of Choice
AAC (Advanced Audio Coding) was standardized in 1997 as the successor to MP3 within the MPEG family. It uses more advanced psychoacoustic modeling, larger transform window sizes (2048 samples vs 576 for MP3), temporal noise shaping, and perceptual noise substitution to achieve better quality than MP3 at every bitrate. At 128 kbps, AAC sounds equivalent to 192 kbps MP3.
Apple adopted AAC for iTunes, iPod, iPhone, Apple Music, FaceTime, and every Apple product. Spotify uses 256 kbps AAC for premium streaming on web and desktop. Most broadcast and streaming services use AAC for their audio tracks.
AAC Profiles
AAC comes in several profiles: AAC-LC (Low Complexity, the most common, used by iTunes and Spotify), HE-AAC v1 (High Efficiency, adds Spectral Band Replication for excellent quality at low bitrates, used in digital radio), and HE-AAC v2 (adds Parametric Stereo for very low bitrates). Note that Apple Lossless (ALAC) is not an AAC profile: it is a separate lossless codec that merely shares the .m4a container with AAC.
Opus: The Best Audio Codec
Opus is the best lossy audio codec available in 2026 by virtually every metric. Standardized by the IETF in 2012 (RFC 6716), it is open-source, royalty-free, and designed for both speech and music. Opus dynamically blends two coding modes: SILK (developed by Skype for speech) and CELT (Constrained Energy Lapped Transform, for music), adapting in real-time to the content.
Quality at Every Bitrate
Opus delivers quality that matches or exceeds every other lossy codec at every bitrate. At 64 kbps, Opus sounds better than AAC at 96 kbps and MP3 at 128 kbps. At 128 kbps, Opus is transparent (indistinguishable from the original) for the vast majority of content. This has been confirmed in multiple independent listening tests, notably those collected by the HydrogenAudio community.
Opus supports bitrates from 6 kbps (very low quality speech) to 510 kbps (transparent music), sample rates from 8 kHz to 48 kHz, and up to 255 channels. It has an algorithmic latency as low as 2.5 ms, making it ideal for real-time communication. Discord, WebRTC, all modern VoIP, and many game engines use Opus.
Key Finding
Opus is the best lossy audio codec at every bitrate. At 128 kbps it is transparent (indistinguishable from CD). It is open-source, royalty-free, and supported by 97% of browsers.
For new projects, Opus should be the default choice for both voice and music. Use AAC only for Apple-specific workflows, and MP3 only for legacy compatibility.
OGG Vorbis: The Pioneer of Open Audio
Vorbis is an open-source, royalty-free audio codec developed by the Xiph.Org Foundation and typically stored in the Ogg container format. Released in 2000, Vorbis was created as a free alternative to MP3 and AAC during the era of aggressive patent enforcement. At equivalent bitrates, Vorbis generally matches or slightly exceeds MP3 quality, particularly at lower bitrates (below 128 kbps).
Vorbis has been largely superseded by Opus (from the same Xiph.Org Foundation), but it remains widely used in gaming (Unreal Engine, Unity, many game titles), Wikipedia (audio files), and some streaming services. Spotify used Ogg Vorbis for years (320 kbps for Premium).
AIFF: Apple's Uncompressed Format
AIFF (Audio Interchange File Format) is essentially Apple's equivalent of WAV. Created by Apple in 1988, it stores uncompressed PCM audio with the same quality characteristics as WAV. The primary differences are byte order (AIFF uses big-endian, WAV uses little-endian), metadata format (AIFF uses its own chunk format), and ecosystem (AIFF is common in macOS/Logic Pro workflows, WAV is universal).
AIFF-C is a compressed variant that supports various codecs, but it is rarely used. For practical purposes, WAV and AIFF are interchangeable for uncompressed audio storage. Choose whichever your DAW and workflow prefer.
WMA: Microsoft's Legacy Codec
WMA (Windows Media Audio) was Microsoft's proprietary audio codec, developed as a competitor to MP3 and AAC. WMA Standard offers quality comparable to MP3 at similar bitrates. WMA Pro supports multichannel (5.1, 7.1) and 24-bit audio. WMA Lossless offers FLAC-like compression.
WMA's primary claim to fame was deep DRM integration with Windows Media Player, making it the preferred format for early digital music stores. This DRM dependency backfired: when Microsoft shut down its DRM servers, purchased WMA files became unplayable. The format is effectively dead for new content but persists in legacy media libraries. Do not use WMA for new projects.
ALAC: Apple's Lossless Codec
ALAC (Apple Lossless Audio Codec) is Apple's lossless audio format, functionally equivalent to FLAC but stored in the M4A container. Apple open-sourced ALAC in 2011, making it royalty-free. ALAC achieves similar compression ratios to FLAC (50-60% of WAV size) and is natively supported on all Apple devices and iTunes/Apple Music.
The practical difference between FLAC and ALAC is ecosystem support. FLAC is universal: supported by Android, Linux, Windows, web browsers, and most audio software. ALAC is primarily an Apple ecosystem format. For cross-platform compatibility, FLAC is the better choice. For Apple-only workflows, ALAC avoids any potential compatibility issues with Apple software.
DSD: The Audiophile Niche
DSD (Direct Stream Digital) uses a fundamentally different approach to digital audio. Instead of PCM's multi-bit samples at a moderate rate (16-bit at 44.1 kHz), DSD uses 1-bit samples at an extremely high rate (2.8224 MHz for DSD64, 5.6448 MHz for DSD128). This produces a bitstream that directly represents the analog waveform using delta-sigma modulation.
DSD was developed by Sony and Philips for the SACD (Super Audio CD) format. DSD files are stored in DFF or DSF containers and are enormous: about 42 MB per minute for DSD64 stereo. Niche streaming services (NativeDSD) offer DSD content, but for 99.9% of listeners, 24-bit FLAC at 96 kHz is indistinguishable from DSD and far more practical.
Spatial Audio: Dolby Atmos and Beyond
Spatial audio represents sound in three dimensions. Dolby Atmos uses object-based audio where each sound source has a position (x, y, z) and the renderer adapts to the speaker configuration or headphones. Atmos is supported by Apple Music, Tidal, Amazon Music, and Netflix.
Ambisonics encodes a complete sound field as spherical harmonic channels. First-order ambisonics (FOA) uses 4 channels; higher-order uses more for better spatial resolution. Ambisonics is used in VR (YouTube 360, Facebook 360) because it can be freely rotated without artifacts — the listener can look in any direction.
For web audio, the Web Audio API provides the PannerNode for basic 3D positioning of sound sources. For immersive experiences, Resonance Audio (Google) and Mach1 provide higher-quality spatial audio rendering in the browser. The audio format is typically Opus or AAC; the spatial metadata is applied at the rendering stage.
Audio Quality vs File Size
The chart below shows perceptual audio quality (on a 1-10 scale based on MUSHRA listening tests) at different bitrates for four lossy codecs. Opus achieves the best quality at every bitrate. At 128 kbps, Opus is essentially transparent, while MP3 still has audible artifacts on certain content.
Audio Quality Score vs Bitrate
Source: OnlineTools4Free Research
Complete Audio Format Comparison
Audio Formats: Full Comparison
| Format | Year | Type | Codec | Max Bitrate | Sample Rates | Bit Depth | Channels | Gapless | Royalty Free | Browser % | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MP3 | 1993 | Lossy | MPEG-1 Layer 3 | 320 kbps | 8-48 kHz | 16-bit | Stereo | Hack (LAME) | Yes (2017) | 100% | Universal compatibility |
| AAC | 1997 | Lossy | AAC-LC, HE-AAC | 512 kbps | 8-96 kHz | 16-bit | 7.1 | Yes | No | 100% | Apple, streaming |
| OGG Vorbis | 2000 | Lossy | Vorbis | 500 kbps | 8-192 kHz | 16-bit | 255 | Yes | Yes | 95% | Gaming, open-source |
| Opus | 2012 | Lossy | Opus | 510 kbps | 8-48 kHz | 16-24 bit | 255 | Yes | Yes | 97% | VoIP, streaming, web |
| FLAC | 2001 | Lossless | FLAC | Variable | 1-655 kHz | 4-32 bit | 8 | Yes | Yes | 92% | Audiophile, archival |
| WAV | 1991 | Uncompressed | PCM | N/A | 8-384 kHz | 8-64 bit | 65,535 | Yes | Yes | 100% | Studio recording |
| AIFF | 1988 | Uncompressed | PCM | N/A | 8-384 kHz | 8-32 bit | 6 | Yes | Yes | ~70% | Apple pro audio |
| WMA | 1999 | Lossy | WMA Standard | 384 kbps | 8-48 kHz | 16-bit | 5.1 | Limited | No | ~15% | Windows ecosystem |
| ALAC | 2004 | Lossless | ALAC | Variable | 1-384 kHz | 16-32 bit | 8 | Yes | Yes (2011) | ~60% | Apple lossless |
Sample Rate, Bit Depth, and Channels Explained
Sample rate determines the highest frequency that can be captured. By the Nyquist theorem, the maximum reproducible frequency is half the sample rate. CD audio at 44.1 kHz captures frequencies up to 22.05 kHz, exceeding the typical human hearing range of 20 Hz to 20 kHz. Higher sample rates (96 kHz, 192 kHz) are used in professional recording to capture ultrasonic harmonics and to provide wider headroom for processing, but they offer no audible benefit for playback.
Bit depth determines the dynamic range (the ratio between the loudest and softest sounds). 16-bit audio provides 96 dB of dynamic range, sufficient for CD and most listening environments. 24-bit audio provides 144 dB of dynamic range, which exceeds the threshold of pain (~130 dB) and is used in recording to prevent clipping during performance.
Channels define the spatial audio layout. Mono (1 channel) is used for speech, podcasts, and AM radio. Stereo (2 channels) is standard for music. Surround formats include 5.1 (six channels: front left, center, front right, rear left, rear right, subwoofer), 7.1 (eight channels), and spatial audio formats like Dolby Atmos (up to 128 tracks with object-based positioning).
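Both rules of thumb above (Nyquist frequency is half the sample rate; dynamic range is roughly 6.02 dB per bit) are one-liners:

```python
import math

def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest reproducible frequency: half the sample rate (Nyquist theorem)."""
    return sample_rate_hz / 2

def dynamic_range_db(bits: int) -> float:
    """Dynamic range of b-bit PCM: 20*log10(2^b), about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(nyquist_hz(44_100))              # 22050.0 Hz — just above human hearing
print(round(dynamic_range_db(16)))     # 96 dB (CD)
print(round(dynamic_range_db(24)))     # 144 dB (studio recording)
```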
Frequency Response by Codec and Bitrate
At low bitrates, lossy audio codecs aggressively cut high frequencies because most musical energy is in the lower frequencies. The table below shows the highest frequency preserved at each bitrate. MP3 at 64 kbps cuts everything above 8 kHz, producing a noticeably muffled sound. Opus at the same bitrate preserves frequencies up to 16 kHz, sounding dramatically better. At 128 kbps and above, all modern codecs preserve the full audible range (20 kHz).
Highest Preserved Frequency (Hz) by Bitrate
Source: OnlineTools4Free Research
Audio Blind Test Results (MUSHRA Scores)
MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) is the standard methodology for subjective audio quality evaluation. Listeners compare encoded samples against the hidden original on a 0-100 scale, where 100 means indistinguishable from the original. The data below represents averages from 20 expert listeners across 10 critical music samples (orchestral, vocal, percussion, electronic, jazz).
The results confirm Opus's dominance: at 128 kbps it scores 94/100 (transparent for most listeners), while MP3 LAME at the same bitrate scores 78. The practical takeaway: Opus at 96 kbps sounds as good as MP3 at 256 kbps, saving 60% of bandwidth.
MUSHRA Blind Test Scores by Codec (100 = transparent)
Source: OnlineTools4Free Research
Complete Sample Rate Reference
The Nyquist theorem states that a digital audio system can perfectly reproduce frequencies up to half the sample rate. CD audio at 44.1 kHz captures up to 22.05 kHz, comfortably exceeding the typical human hearing range of 20 Hz to 20 kHz. Higher sample rates are used in professional recording not for audible benefit, but for processing headroom and to push anti-aliasing filters well above the audible range.
Audio Sample Rates and Nyquist Frequencies
| Sample Rate | Nyquist Freq | Quality Tier | Common Use |
|---|---|---|---|
| 8,000 Hz | 4,000 Hz | Telephone | VoIP, old telephony |
| 16,000 Hz | 8,000 Hz | Wideband speech | HD Voice, podcasts |
| 22,050 Hz | 11,025 Hz | AM radio | Low-quality streaming |
| 44,100 Hz | 22,050 Hz | CD | Music distribution, streaming |
| 48,000 Hz | 24,000 Hz | DVD/Broadcast | Video audio, DAWs |
| 88,200 Hz | 44,100 Hz | High-res | Professional recording |
| 96,000 Hz | 48,000 Hz | High-res | Professional recording, Blu-ray |
| 176,400 Hz | 88,200 Hz | Ultra high-res | Mastering, archival |
| 192,000 Hz | 96,000 Hz | Ultra high-res | Studio mastering |
| 352,800 Hz | 176,400 Hz | DSD-equivalent | Audiophile niche |
| 384,000 Hz | 192,000 Hz | Maximum | Research, measurement |
Part 5: Data & Serialization Formats
~5,000 words covering 11 data formats
Data formats determine how structured information is stored and transmitted between systems. The right format depends on whether humans need to read it, how fast it needs to be parsed, whether a schema is required, and what ecosystem you are building for.
JSON Variants: JSON5, JSONC, JSON Lines
Standard JSON's lack of comments and trailing commas has spawned several variants that address common frustrations while maintaining JSON's simplicity.
JSONC (JSON with Comments) extends JSON with // and /* */ comments. Used by VS Code settings (settings.json), TypeScript (tsconfig.json), and ESLint configurations. JSONC parsers strip comments before parsing as standard JSON.
JSON5 extends JSON more aggressively: single-quoted strings, trailing commas, unquoted object keys, Infinity/NaN, hex numbers, and multiline strings. Babel accepts JSON5 syntax in its .babelrc configuration, and some build tools support it. JSON5 is a superset of JSON: all valid JSON is valid JSON5.
JSON Lines (JSONL, NDJSON) stores one JSON object per line, separated by newlines. This format is ideal for log files, streaming data, and large datasets because each line can be parsed independently without loading the entire file into memory. Tools like jq, DuckDB, and pandas can process JSONL files efficiently.
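A minimal sketch of the JSON Lines pattern using only the standard library (an in-memory buffer stands in for a real log file):

```python
import io
import json

records = [{"id": 1, "event": "login"}, {"id": 2, "event": "purchase"}]

# Write: one JSON object per line. Appending a record never requires
# re-reading or re-parsing what is already in the file.
buf = io.StringIO()
for rec in records:
    buf.write(json.dumps(rec) + "\n")

# Read: each line parses independently, so memory use stays constant
# no matter how large the file grows.
buf.seek(0)
parsed = [json.loads(line) for line in buf if line.strip()]
assert parsed == records
```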
JSON Merge Patch (RFC 7396) and JSON Patch (RFC 6902) define standard ways to express partial updates to JSON documents. JSON Merge Patch uses a simple merge semantic (set new values, null to delete). JSON Patch uses an array of operations (add, remove, replace, move, copy, test) for precise manipulation.
JSON: The Language of the Web
JSON (JavaScript Object Notation) was formalized by Douglas Crockford in 2001 and standardized as ECMA-404 and RFC 8259. It has become the universal data interchange format for the web, used by virtually every REST API, configuration file, and NoSQL database. JSON's success comes from its simplicity: it supports only six data types (string, number, boolean, null, array, object) with a grammar that fits on a business card.
JSON Syntax and Types
JSON values are one of: strings (double-quoted), numbers (integer or floating-point, no hex, no NaN/Infinity), booleans (true or false), null, arrays (ordered lists in square brackets), or objects (unordered key-value pairs in curly braces, keys must be strings).
Common pitfalls: JSON does not support comments (a frequent source of frustration for configuration files), trailing commas are invalid, single quotes are not valid (only double quotes), and there is no date type (dates are typically ISO 8601 strings). JSON numbers have no integer/float distinction and no precision guarantee (JavaScript Number.MAX_SAFE_INTEGER is 2^53 - 1; larger integers lose precision).
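These pitfalls are easy to demonstrate with any strict JSON parser (Python's standard library here):

```python
import json

# Comments, trailing commas, and single quotes are all rejected:
for text in ['{"a": 1,}', "{'a': 1}", '{"a": 1 /* comment */}']:
    try:
        json.loads(text)
    except json.JSONDecodeError:
        pass  # expected — each variant is invalid JSON

# Python keeps big integers exact, but any consumer that stores JSON
# numbers as IEEE-754 doubles (JavaScript, many databases) silently
# rounds them above 2^53 - 1:
big = 2**53 + 1                              # 9007199254740993
assert json.loads(json.dumps(big)) == big    # exact in Python
assert float(big) == float(2**53)            # the double-precision collision
```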
JSON Schema
JSON Schema is a vocabulary for describing and validating JSON data. It defines the expected structure, data types, required fields, value constraints (minimum, maximum, pattern), and relationships between properties. JSON Schema powers OpenAPI (Swagger) API documentation, form validation in React JSON Schema Form, and configuration validation in VS Code settings.
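To show what schema validation involves, here is a deliberately tiny validator covering only `required`, `type`, and `minimum` — a toy for illustration; real projects should use a full implementation such as the jsonschema package:

```python
def validate(instance: dict, schema: dict) -> list[str]:
    """Return a list of violations for a small JSON Schema subset."""
    type_map = {"string": str, "integer": int, "number": (int, float)}
    errors = []
    for field in schema.get("required", []):
        if field not in instance:
            errors.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        if field not in instance:
            continue
        value = instance[field]
        expected = type_map.get(rules.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{field}: expected {rules['type']}")
        elif "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{field}: below minimum {rules['minimum']}")
    return errors

schema = {
    "required": ["name", "age"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer", "minimum": 0}},
}
assert validate({"name": "Ada", "age": 36}, schema) == []
assert validate({"age": -1}, schema) == [
    "missing required field: name",
    "age: below minimum 0",
]
```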
XML: The Enterprise Veteran
XML (Extensible Markup Language) was standardized by the W3C in 1998 and dominated data interchange for a decade before JSON overtook it. XML uses a tag-based syntax similar to HTML, with strict nesting, required closing tags, and case-sensitive element names. XML is more verbose than JSON (an XML representation of the same data is typically 30-50% larger) but offers features JSON lacks.
When XML is Still Needed
XML remains the standard for: SOAP web services (enterprise APIs), SVG (vector graphics), XHTML (strict HTML), RSS/Atom feeds, Office documents (DOCX, XLSX are XML inside ZIP), Android layouts, Maven/Gradle configuration, SAML (authentication), and many government and healthcare standards (HL7 FHIR, NIEM). If you work in enterprise software, finance, healthcare, or government, you will encounter XML regularly.
XML's advantages over JSON: namespaces (allowing elements from different vocabularies to coexist), schemas (XSD provides far more powerful validation than JSON Schema), XSLT transforms (converting XML to HTML, PDF, or other XML), and comments. XML also supports mixed content (text interleaved with elements), which is natural for document markup.
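A small example of namespaces in action with Python's standard library parser (the feed content is invented for illustration; the namespace URIs are the real Atom and Dublin Core ones):

```python
import xml.etree.ElementTree as ET

# Two vocabularies coexist in one document, disambiguated by namespace URIs —
# something JSON has no native mechanism for.
doc = """<feed xmlns="http://www.w3.org/2005/Atom"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Format Guide</title>
  <dc:creator>Editorial Team</dc:creator>
</feed>"""

ns = {"atom": "http://www.w3.org/2005/Atom",
      "dc": "http://purl.org/dc/elements/1.1/"}
root = ET.fromstring(doc)
print(root.find("atom:title", ns).text)     # Format Guide
print(root.find("dc:creator", ns).text)     # Editorial Team
```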
YAML: Human-Friendly Configuration
YAML (originally "Yet Another Markup Language," now "YAML Ain't Markup Language") was created in 2001 as a human-readable data serialization format. It uses indentation instead of brackets, supports comments (# prefix), and infers types (true, 1, 1.5, null are automatically parsed as boolean, integer, float, null).
YAML Gotchas
YAML's flexibility creates notorious gotchas. The unquoted country code NO (for Norway) is parsed as the boolean false under YAML 1.1 rules. An unquoted 1.0 is a float, not a string, so version numbers silently lose trailing zeros (1.10 becomes 1.1). Indentation must use spaces, never tabs. Multiline strings come in six modes: two block styles (literal |, folded >) combined with three chomping indicators (strip, clip, keep). The "YAML Norway Problem" led YAML 1.2 to restrict boolean recognition to true/false, but many parsers still follow YAML 1.1 rules.
YAML dominates DevOps configuration: Kubernetes manifests, Docker Compose files, GitHub Actions workflows, Ansible playbooks, GitLab CI, CircleCI, and many more (Terraform is the notable exception, using its own HCL format). Its readability makes it the best format for configuration files that humans edit regularly.
CSV: The Simplest Data Exchange
CSV (Comma-Separated Values) has been used since the 1970s for tabular data exchange. Despite its apparent simplicity, CSV has surprising complexity: there is no universal standard (RFC 4180 is the closest), delimiter choice varies (comma, semicolon in European locales, tab), quoting rules differ between implementations, encoding is unspecified, and there is no type system (everything is a string).
CSV Encoding Issues
The most common CSV problem is encoding. Excel on Windows opens CSV files as Windows-1252 by default, mangling any UTF-8 characters. The fix: add a UTF-8 BOM (byte order mark, EF BB BF) at the beginning of the file, which tells Excel to use UTF-8. Alternatively, use TSV (tab-separated values), which Excel handles more consistently.
Other CSV pitfalls: numbers with leading zeros (like zip codes "00501") are converted to integers by Excel. Dates in ambiguous formats (01/02/03) are interpreted differently in US/EU locales. Fields containing commas, quotes, or newlines must be enclosed in double quotes, with internal quotes escaped by doubling (""). These edge cases cause countless data import bugs.
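The BOM fix and the quote-doubling rule are both easy to get right with the standard library's csv module (the utf-8-sig codec emits the BOM on write and strips it on read):

```python
import csv
import io

rows = [["zip", "note"], ["00501", 'contains, comma and "quote"']]

# Write with a UTF-8 BOM so Excel detects the encoding correctly.
buf = io.BytesIO()
text = io.TextIOWrapper(buf, encoding="utf-8-sig", newline="")
csv.writer(text).writerows(rows)
text.flush()
raw = buf.getvalue()

assert raw.startswith(b"\xef\xbb\xbf")               # BOM present (EF BB BF)
# The risky field is wrapped in quotes, internal quotes doubled (RFC 4180):
assert b'"contains, comma and ""quote"""' in raw

# Reading back restores the values exactly — leading zeros included,
# because csv never converts fields to numbers.
parsed = list(csv.reader(raw.decode("utf-8-sig").splitlines()))
assert parsed == rows
```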
JSON Performance Optimization
JSON parsing performance matters at scale. A typical REST API response is 5-50 KB of JSON; at 10,000 requests per second, that is 50-500 MB/s of JSON parsing. Several strategies can reduce the cost:
1. Use a fast parser. Standard JSON.parse in V8 is well-optimized (~1,800 ops/sec for 100KB payloads), but simdjson (SIMD-accelerated C++ parser with Node.js bindings) achieves 12,000 ops/sec — a 6.6x improvement. For Python, orjson is 10-20x faster than the standard json module.
2. Minify before transit. Removing whitespace typically reduces JSON size by 20-30%. Combined with GZIP or Brotli compression, a 100 KB formatted JSON response becomes approximately 12-18 KB over the wire. Most web frameworks minify JSON responses by default.
3. Avoid unnecessary data. The most effective JSON optimization is not sending data you do not need. Use field selection (GraphQL, sparse fieldsets in REST) to return only requested fields. A typical API response includes 40-60% of fields that the client ignores.
4. Use streaming parsers. For very large JSON files (10 MB+), streaming parsers (SAX-style) process the file incrementally without loading it entirely into memory. JSON Lines (NDJSON) is even better: each line is an independent JSON object, enabling line-by-line processing with any standard JSON parser.
5. Consider binary alternatives. For internal service-to-service communication where human readability is not needed, Protocol Buffers or FlatBuffers provide 3-10x smaller payloads and 20-50x faster parsing. The engineering cost is defining .proto schemas and generating code.
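Strategies 1 and 2 are easy to quantify with the standard library (the payload below is synthetic, so treat the exact numbers as illustrative):

```python
import gzip
import json

# A repetitive payload shaped like a typical list-of-records API response:
data = {"users": [{"id": i, "name": f"user{i}", "active": i % 2 == 0}
                  for i in range(500)]}

pretty = json.dumps(data, indent=2).encode()                 # formatted response
minified = json.dumps(data, separators=(",", ":")).encode()  # whitespace stripped
compressed = gzip.compress(minified)                         # over-the-wire form

print(len(pretty), len(minified), len(compressed))
assert len(minified) < len(pretty)            # minification always helps
assert len(compressed) < len(minified) // 3   # repeated keys compress very well
```

In practice, minify and compress at the transport layer (most frameworks and reverse proxies do both by default), and spend your optimization effort on strategy 3: not sending unneeded fields at all.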
TOML: Typed Configuration
TOML (Tom's Obvious, Minimal Language) was created by Tom Preston-Werner (GitHub co-founder) in 2013 as a configuration format that avoids YAML's complexity while adding explicit types. TOML distinguishes between strings, integers, floats, booleans, dates, times, arrays, and tables using clear syntax with no ambiguity.
TOML is the standard configuration format for the Rust ecosystem (Cargo.toml, rustfmt.toml), Python packaging (pyproject.toml), and is supported by many other tools. Because strings are always quoted and types are explicit, TOML is immune to the YAML Norway Problem and similar type-inference gotchas.
Protocol Buffers: Binary Speed
Protocol Buffers (Protobuf) is Google's binary serialization format, used internally for virtually all data storage and RPC communication at Google. Messages are defined in .proto files using a schema language, then compiled to language-specific code (C++, Java, Python, Go, etc.) for serialization and deserialization.
Protobuf produces messages 3-10x smaller than JSON and parses 20-100x faster. It achieves this through field numbering (instead of string keys), varint encoding for integers, binary representation of all values, and schema-driven encoding that omits default values. The trade-off is human readability: Protobuf binary data is not human-readable, and debugging requires schema-aware tools.
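The varint scheme is simple enough to sketch in a few lines (helper name ours):

```python
def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint: 7 payload bits per byte, least-significant
    group first; the high bit of each byte flags 'more bytes follow'."""
    out = bytearray()
    while True:
        byte = n & 0x7F               # low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes coming
        else:
            out.append(byte)          # final byte, high bit clear
            return bytes(out)

assert encode_varint(1) == b"\x01"          # small ints cost one byte, not four
assert encode_varint(300) == b"\xac\x02"    # the classic two-byte example
assert len(encode_varint(2**32 - 1)) == 5   # worst case for a 32-bit value
```

This is why field values like IDs, counts, and enum tags are so compact in Protobuf compared to fixed-width binary layouts or decimal strings in JSON.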
Protobuf is the serialization format for gRPC (Google's RPC framework), used by major companies including Google, Netflix, Square, Lyft, and CoreOS. If you need high-throughput, low-latency data exchange between services, Protobuf with gRPC is the current gold standard.
MessagePack: Binary JSON
MessagePack is a binary serialization format that is schema-less (like JSON) but more compact and faster to parse. It represents the same data types as JSON (strings, numbers, booleans, null, arrays, maps) but in a binary encoding that is typically 30-50% smaller than JSON. MessagePack is often described as "binary JSON" — it preserves JSON's flexibility while eliminating the overhead of text parsing.
MessagePack is popular in gaming (notably Unity projects), in infrastructure tools such as Fluentd (log transport) and Redis (Lua scripting), in real-time systems, and in any situation where JSON's human readability is not needed but its schema-less flexibility is. Libraries are available for 50+ programming languages.
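To see where the savings come from, here is a toy encoder for a tiny MessagePack subset (small maps, short strings, small non-negative ints only; use the real msgpack library in practice):

```python
import json

def pack(value) -> bytes:
    """Encode a value using MessagePack's fixmap/fixstr/fixint forms only."""
    if isinstance(value, dict) and len(value) <= 15:
        out = bytearray([0x80 | len(value)])      # fixmap: 1000xxxx
        for k, v in value.items():
            out += pack(k) + pack(v)
        return bytes(out)
    if isinstance(value, str) and len(value.encode()) <= 31:
        data = value.encode()
        return bytes([0xA0 | len(data)]) + data   # fixstr: 101xxxxx
    if isinstance(value, int) and 0 <= value <= 127:
        return bytes([value])                     # positive fixint: 0xxxxxxx
    raise ValueError("outside this sketch's subset")

obj = {"id": 7, "name": "ada"}
packed = pack(obj)
assert packed == b"\x82\xa2id\x07\xa4name\xa3ada"   # 14 bytes
print(len(packed), "vs", len(json.dumps(obj, separators=(",", ":"))))  # 14 vs 21
```

Type tags and length prefixes replace quotes, colons, and braces, so the parser never scans for delimiters — that is where both the size and speed advantages come from.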
INI: The Simplest Configuration Format
INI files (introduced in the early 1980s for MS-DOS) are the simplest configuration format: sections in square brackets, key-value pairs separated by = or :, and comments starting with ; or #. INI has no standard specification, which means every parser handles edge cases differently (does it support nested sections? multi-line values? escaping?).
Despite its limitations, INI persists in many contexts: Git configuration (.gitconfig), PHP configuration (php.ini), Python packaging (setup.cfg), systemd unit files, desktop entry files (.desktop on Linux), and Windows registry exports. For new projects, TOML is the recommended upgrade from INI — it adds explicit types, nested tables, and arrays while maintaining INI's readability.
.env: Environment Variable Files
The .env file format (popularized by the dotenv library) stores environment variables as KEY=VALUE pairs, one per line. It is the standard way to configure application secrets (API keys, database URLs) in development. The .env file is loaded into the process environment at startup and should never be committed to version control.
.env is not a proper data format — it has no specification, no types (everything is a string), no nesting, and inconsistent quote handling across implementations. Despite this, it has become universal: Node.js (dotenv), Python (python-dotenv), Ruby (dotenv-rails), Go (godotenv), and most frameworks support it. Docker and Docker Compose also read .env files. For sensitive configuration, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, Infisical) rather than .env files in production.
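The format is simple enough that a minimal parser fits in a dozen lines — which is also why implementations disagree on the edge cases. A sketch (real loaders such as python-dotenv also handle export prefixes, interpolation, and multiline values):

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines; skip blanks and '#' comments; strip matching quotes."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env

sample = """
# local development settings
DATABASE_URL=postgres://localhost/dev
API_KEY="not-a-real-key"
DEBUG=true
"""
env = parse_env(sample)
assert env["API_KEY"] == "not-a-real-key"   # quotes stripped
assert env["DEBUG"] == "true"               # everything is a string — no real types
```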
HCL: HashiCorp Configuration Language
HCL (HashiCorp Configuration Language) is used by Terraform, Packer, Vault, Consul, and other HashiCorp tools. It was designed to be more human-friendly than JSON while being more machine-friendly than YAML. HCL uses a block-based syntax with explicit types, expressions, and functions — making it closer to a programming language than a data format.
HCL supports interpolation (${var.name}), conditional expressions, for-each loops, and functions. This makes Terraform configurations readable and maintainable for infrastructure-as-code workflows. HCL can be parsed from and converted to JSON, which is useful for programmatic generation.
While HCL is specific to the HashiCorp ecosystem, its design philosophy — structured blocks with typed attributes — has influenced other configuration formats. The Pkl language (Apple, 2024), CUE (Google), and Dhall are similar configuration languages that validate structure at the format level rather than relying on external schema tools.
FlatBuffers: Zero-Copy Deserialization
FlatBuffers, developed by Google, takes a radically different approach to serialization. Instead of deserializing data into language-native objects (which requires allocating memory and copying bytes), FlatBuffers allows direct access to serialized data without unpacking. The receiver reads fields directly from the binary buffer using calculated offsets.
This zero-copy approach makes FlatBuffers the fastest serialization format for deserialization: 50,000+ operations per second compared to 15,000 for Protocol Buffers and 1,800 for JSON.parse. The trade-off is that FlatBuffers is more complex to use (you cannot easily inspect the data without the schema) and produces slightly larger messages than Protobuf because it includes offset tables.
FlatBuffers is used in game engines (Cocos2d-x adopted it for data serialization), in mobile apps (Facebook's Android app famously switched to FlatBuffers for local storage), and in any system where deserialization latency is critical. Google uses FlatBuffers for TensorFlow Lite model files (.tflite).
CBOR: The IoT Binary Format
CBOR (Concise Binary Object Representation, RFC 8949) is a binary data format designed for IoT and constrained environments. Like MessagePack, CBOR is a binary version of the JSON data model (maps, arrays, strings, numbers, booleans, null). Unlike MessagePack, CBOR is an IETF standard with well-defined extension points, deterministic encoding rules, and support for tags (typed values like dates, URIs, and big numbers).
CBOR is used in COSE (CBOR Object Signing and Encryption, the backbone of FIDO2/WebAuthn), CoAP (Constrained Application Protocol, HTTP for IoT), and CTAP2 (Client to Authenticator Protocol, the YubiKey protocol). Its small parser footprint (typically 1-5 KB of code) makes it suitable for microcontrollers with kilobytes of RAM.
For web developers, CBOR is most relevant through WebAuthn: the attestation and assertion data exchanged during passwordless authentication with FIDO2 security keys is encoded in CBOR. Understanding CBOR structure helps debug authentication failures in WebAuthn implementations.
SQLite: The Database as File Format
SQLite is not traditionally classified as a data format, but it deserves mention because it is increasingly used as one. A SQLite database is a single cross-platform file that can be read by any programming language with a SQLite library (essentially all of them). Unlike CSV or JSON, SQLite supports typed columns, indexes, transactions, and complex queries.
The "SQLite as a file format" approach is advocated by D. Richard Hipp (SQLite creator) and is used by numerous applications: Firefox (bookmarks, history, cookies), Chrome (history, cookies, web data), macOS Photos, WhatsApp, Signal, and many mobile apps. For datasets that are too large or complex for CSV but too small to justify a server-based database, SQLite is an excellent choice.
Recent developments have made SQLite even more attractive as a data format. Litestream enables real-time replication of SQLite databases to S3. Turso/LibSQL adds multi-tenant capabilities. sql.js brings SQLite to the browser via WebAssembly. DuckDB can directly query Parquet, CSV, and JSON files using SQL, blurring the line between data formats and databases.
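The pattern looks like this with Python's built-in sqlite3 module (the table and values are invented for illustration):

```python
import os
import sqlite3
import tempfile

# One ordinary file holds typed columns, constraints, and indexes —
# capabilities CSV and JSON lack.
path = os.path.join(tempfile.mkdtemp(), "bookmarks.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE bookmarks (url TEXT NOT NULL, visits INTEGER DEFAULT 0)")
con.executemany("INSERT INTO bookmarks VALUES (?, ?)",
                [("https://example.com", 12), ("https://example.org", 3)])
con.commit()
con.close()

# Any other process — or any other language with a SQLite driver —
# can open the same file and query it with SQL:
con = sqlite3.connect(path)
(total,) = con.execute("SELECT SUM(visits) FROM bookmarks").fetchone()
assert total == 15
con.close()
```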
Avro and Parquet: Big Data Formats
Apache Avro and Apache Parquet are designed for big data processing. Avro is a row-oriented format with an embedded schema, ideal for data serialization in Hadoop, Kafka, and streaming pipelines. Each Avro file contains its schema, making it self-describing and enabling schema evolution (adding/removing fields without breaking existing consumers).
Parquet is a columnar format, storing all values of each column together rather than all columns of each row. This enables dramatic compression (similar values in a column compress well) and selective column reading (query engines read only the columns needed, skipping the rest). Parquet is the standard format for data lakes, analytics engines (Spark, Presto, Athena, BigQuery), and data warehouses.
Rule of thumb: use Avro for write-heavy streaming workloads (Kafka, event sourcing) and Parquet for read-heavy analytical workloads (data lakes, reporting, dashboards).
Data Format Comparison
Data & Serialization Formats: Full Comparison
| Format | Year | Readable | Typed | Schema | Comments | Binary Data | Streaming | Parse Speed | File Size | Best For |
|---|---|---|---|---|---|---|---|---|---|---|
| JSON | 2001 | Yes | Basic (no int/float distinction) | JSON Schema | No | Base64 string | JSON Lines | Fast | Medium | APIs, config, web |
| XML | 1998 | Verbose | XSD types | XSD, DTD, RelaxNG | Yes | Base64 | SAX, StAX | Medium | Large | Enterprise, SOAP, config |
| YAML | 2001 | Best | Yes (inferred) | YAML Schema | Yes | !!binary | Multi-document | Slow | Small | Config, CI/CD, K8s |
| CSV | 1972 | Yes | No | RFC 4180 | No (convention) | No | Line-by-line | Very Fast | Small | Tabular data, spreadsheets |
| TSV | 1970 | Yes | No | IANA | No | No | Line-by-line | Very Fast | Small | Copy-paste from spreadsheets |
| TOML | 2013 | Yes | Yes | Taplo | Yes | No | No | Fast | Small | App config, Cargo.toml |
| INI | 1981 | Yes | No | None | Yes | No | No | Very Fast | Minimal | Simple key-value config |
| Protocol Buffers | 2008 | No (binary) | Yes | .proto files | Yes (.proto) | Native | Yes (delimited) | Fastest | Minimal | gRPC, high-perf APIs |
| MessagePack | 2008 | No (binary) | Yes | No standard | No | Native | Yes | Very Fast | Minimal | Binary JSON alternative |
| Avro | 2009 | No (binary) | Yes | Embedded | No | Native | Yes | Very Fast | Minimal | Hadoop, data pipelines |
| Parquet | 2013 | No (binary) | Yes | Embedded | No | Native | Row groups | Fast (columnar) | Minimal | Analytics, data lakes |
Data Format Popularity by Use Case
Data Format Adoption by Use Case (%)
Source: OnlineTools4Free Research
Key Finding
JSON dominates web APIs (92%) and mobile apps (85%). YAML dominates CI/CD (90%). Parquet dominates big data analytics (65%). No single format is best for everything.
Choose based on use case: JSON for APIs, YAML for config, CSV for tabular exchange, Protobuf for high-performance services, Parquet for analytics.
Format and Validate JSON
Use the tool below to format, validate, and minify JSON data. Paste your JSON to check its syntax, or minify it for production use.
Try it yourself
Json Formatter
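The core operations (pretty-printing, minifying, and compressing for transfer) are one-liners in most languages. A sketch using Python's standard library; this is illustrative, not how the tool above is implemented:

```python
import gzip
import json

data = {"users": [{"name": f"user{i}", "active": i % 2 == 0} for i in range(100)]}

pretty = json.dumps(data, indent=2)                 # human-readable
minified = json.dumps(data, separators=(",", ":"))  # no whitespace at all
gzipped = gzip.compress(minified.encode())          # what HTTP transfer adds

print(len(pretty), len(minified), len(gzipped))
```

Minifying strips only whitespace, so `json.loads` recovers the identical document; gzip then exploits the repetitive structure that remains.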
Data Format Parsing Performance
Parsing speed matters for APIs handling thousands of requests per second. The benchmark below tests serialization and deserialization of 10,000 records with 15 fields each. FlatBuffers is the fastest for deserialization (50,000 ops/sec) because it uses zero-copy access — the data is read directly from the buffer without unpacking. Protocol Buffers deserialize at 15,000 ops/sec. JSON.parse is respectable at 1,800 ops/sec, but simdjson (a SIMD-accelerated parser) reaches 12,000 ops/sec.
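Numbers like these come from micro-benchmarks of repeated serialize/deserialize cycles. A minimal sketch of the measurement pattern, using Python's json as the subject (this is not the harness that produced the table; one "op" here is one full 10,000-record document):

```python
import json
import time

records = [{"id": i, "name": f"item-{i}", "price": i * 0.5} for i in range(10_000)]

def ops_per_sec(fn, repeat=5):
    """Time fn() several times and report iterations/second for the best run."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return 1.0 / best

payload = json.dumps(records)
print(f"serialize:   {ops_per_sec(lambda: json.dumps(records)):.1f} ops/sec")
print(f"deserialize: {ops_per_sec(lambda: json.loads(payload)):.1f} ops/sec")
```

Taking the best of several runs (rather than the mean) reduces noise from GC pauses and OS scheduling, which is standard practice in micro-benchmarking.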
Data Format Parsing Performance (ops/sec, 10K records)
12 rows
| Format | Serialize ops/s | Deserialize ops/s | Size (KB) | Language |
|---|---|---|---|---|
| JSON (JSON.parse) | 1200 | 1800 | 2450 | JavaScript |
| JSON (simdjson) | N/A (parse-only) | 12000 | 2450 | C++ |
| XML (SAX) | 400 | 2200 | 4200 | Java |
| XML (DOM) | 400 | 600 | 4200 | Java |
| YAML | 200 | 350 | 2100 | Python |
| CSV | 5000 | 8000 | 1200 | Python |
| Protocol Buffers | 8000 | 15000 | 480 | C++ |
| MessagePack | 4500 | 9000 | 650 | Go |
| Avro | 6000 | 10000 | 520 | Java |
| TOML | 1000 | 1500 | 1800 | Rust |
| CBOR | 5500 | 11000 | 590 | Go |
| FlatBuffers | 12000 | 50000 | 510 | C++ |
Data Format Size Comparison
How much space does the same data take in different formats? Using 1,000 KB of JSON as a baseline, the chart below shows relative sizes. Parquet is the smallest at 15% of JSON size, thanks to columnar compression. Protocol Buffers are 28%, MessagePack is 35%. Even plain gzipped JSON is only 18% of the original. XML is 45% larger than JSON due to verbose tag syntax.
Data Format Size Comparison (1MB JSON baseline)
12 rows
| Format | Size (KB) | Ratio vs JSON | Readable | Schema Required |
|---|---|---|---|---|
| JSON | 1000 | 1.00x | Yes | No |
| JSON (minified) | 720 | 0.72x | No | No |
| JSON (gzipped) | 180 | 0.18x | No | No |
| XML | 1450 | 1.45x | Yes | No |
| YAML | 860 | 0.86x | Yes | No |
| CSV | 420 | 0.42x | Yes | No |
| MessagePack | 350 | 0.35x | No | No |
| Protocol Buffers | 280 | 0.28x | No | Yes |
| Avro | 310 | 0.31x | No | Yes |
| Parquet | 150 | 0.15x | No | Yes |
| CBOR | 340 | 0.34x | No | No |
| FlatBuffers | 320 | 0.32x | No | Yes |
Code Example: JSON Schema Validation
JSON Schema validates the structure and values of JSON documents. The example below defines a schema with name (non-empty string), email (valid format), and age (integer 0-150); only name and email are required. This schema can power form validation, API request validation, and configuration file checking.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"name": { "type": "string", "minLength": 1 },
"email": { "type": "string", "format": "email" },
"age": { "type": "integer", "minimum": 0, "maximum": 150 }
},
"required": ["name", "email"]
}
Part 6: Archive & Compression Formats
~4,000 words covering archives and compression algorithms
Archive formats bundle multiple files into one, often with compression to reduce total size. Compression algorithms reduce the size of individual files or data streams. Some formats (ZIP, 7z, RAR) combine archiving and compression. Others (GZIP, Brotli, ZSTD) are pure compression formats that operate on single streams, typically paired with TAR for multi-file archiving.
Compression Concepts: Dictionary, Entropy, and Window
Before diving into specific formats, understanding three core compression concepts helps explain why different algorithms perform differently.
Dictionary-based compression (LZ77/LZ78 family) works by finding repeated sequences in the data and replacing subsequent occurrences with references to earlier ones. The "dictionary" is built from the data itself as it is processed. Longer repeated sequences and more recent occurrences produce better compression. The window size determines how far back the encoder can look for matches: DEFLATE uses 32 KB, Brotli uses 16 MB, ZSTD uses up to 128 MB.
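The window-size limit is easy to observe with the Python standard library. Below, 40 KB of random data repeated once is effectively incompressible to DEFLATE, because the repeat lies beyond its 32 KB window, while LZMA's much larger window catches the duplication:

```python
import lzma
import os
import zlib

# 40 KB of random bytes, repeated once: the second copy starts 40 KB back,
# beyond DEFLATE's 32 KB window but well inside LZMA's.
block = os.urandom(40 * 1024)
payload = block + block

deflate_size = len(zlib.compress(payload, 9))   # ~80 KB: cannot see the repeat
lzma_size = len(lzma.compress(payload, preset=9))  # ~40 KB: deduplicates it
print(deflate_size, lzma_size)
```

The same effect explains why Brotli and ZSTD outperform GZIP on large files with long-range redundancy, such as concatenated logs or database dumps.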
Entropy coding (Huffman, ANS, arithmetic) assigns shorter binary codes to more frequent symbols and longer codes to rarer symbols. In English text, 'e' appears far more often than 'z', so Huffman coding assigns 'e' a 3-bit code and 'z' a 12-bit code. The theoretical minimum encoding length is the Shannon entropy of the data, measured in bits per symbol. Modern entropy coders (ANS, arithmetic) approach this limit within 0.01%.
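Shannon entropy is straightforward to compute. A small Python helper, illustrative rather than tied to any particular codec:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Average information content in bits per symbol: the theoretical
    lower bound that entropy coders approach."""
    total = len(data)
    return sum(
        -(n / total) * math.log2(n / total) for n in Counter(data).values()
    )

print(shannon_entropy(b"abababab"))        # 1.0 -> one bit per symbol
print(shannon_entropy(bytes(range(256))))  # 8.0 -> every byte equally likely
```

A file of uniformly random bytes has 8 bits of entropy per byte, which is exactly why it cannot be compressed: there are no statistical patterns left to exploit.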
Context modeling improves compression by using surrounding data to predict the next symbol. In English text after "th", the next letter is most likely 'e', 'a', or 'i'. A context-aware encoder uses different probability tables for different contexts, achieving better compression than a single global table. Brotli and ZSTD both use context modeling; DEFLATE and LZ4 do not.
All practical compression algorithms combine dictionary-based and entropy coding stages. The dictionary stage exploits exact repetitions (structural redundancy), and the entropy stage exploits statistical patterns (statistical redundancy). The compression ratio is bounded by the data's intrinsic entropy: truly random data cannot be compressed at all.
ZIP: The Universal Archive
ZIP was created by Phil Katz in 1989 and is the most widely supported archive format in the world. Windows, macOS, Linux, iOS, and Android all include native ZIP support. No third-party software is needed to create or extract ZIP files on any major operating system.
ZIP uses DEFLATE compression by default, which provides a good balance of compression ratio and speed. Each file in a ZIP archive is compressed independently, allowing random access to individual files without decompressing the entire archive. This is a key advantage over solid archives (7z, RAR) but comes at the cost of lower compression ratio, since the encoder cannot exploit similarities between files.
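Python's standard library makes the per-file independence easy to see: a single member can be read without decompressing its siblings. A minimal sketch using an in-memory archive:

```python
import io
import zipfile

buf = io.BytesIO()

# Write three independently compressed members.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("readme.txt", "hello " * 1000)
    zf.writestr("data/a.csv", "id,name\n1,alpha\n2,beta\n")
    zf.writestr("data/b.csv", "id,name\n3,gamma\n")

# Random access: only the requested member is decompressed.
with zipfile.ZipFile(buf) as zf:
    one = zf.read("data/a.csv").decode()
print(one)
```

This is the property that makes ZIP suitable as a container format for other formats: DOCX, XLSX, JAR, EPUB, and APK are all ZIP archives whose members are read individually.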
ZIP64 Extensions
The original ZIP format was limited to 4 GB per file and 65,535 files per archive. ZIP64 extensions (supported since 2001) remove these limits, allowing individual files up to 16 exabytes and unlimited file count. Modern ZIP tools use ZIP64 automatically when needed.
ZIP Encryption
ZIP supports two encryption methods: the original PKZIP encryption (known as "traditional" or "ZipCrypto") and AES-256 encryption. ZipCrypto is weak and can be cracked within minutes; never use it for sensitive data. AES-256 ZIP encryption (supported by WinZip, 7-Zip, and macOS Archive Utility) is cryptographically secure but not supported by all tools.
GZIP: Web Compression Workhorse
GZIP was created in 1992 by Jean-loup Gailly and Mark Adler as a free replacement for the Unix compress program. It uses the same DEFLATE algorithm as ZIP but operates on single files. GZIP is the foundation of web compression: HTTP Content-Encoding: gzip compresses HTML, CSS, JavaScript, and JSON responses, typically reducing transfer sizes by 60-80%.
For multi-file archiving on Unix/Linux, GZIP is paired with TAR: first TAR creates an uncompressed archive of multiple files, then GZIP compresses the entire archive. The result is a .tar.gz (or .tgz) file. This two-step approach is standard in the Unix world and produces better compression than ZIP because the solid archive allows GZIP to exploit inter-file redundancy.
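The typical 60-80% reduction on markup is easy to reproduce with Python's gzip module. A sketch on synthetic HTML; actual savings depend on the content:

```python
import gzip

# Repetitive markup, as typical HTML is: repeated tags and attributes.
html = ('<li class="item"><a href="/page">entry</a></li>\n' * 500).encode()

compressed = gzip.compress(html, compresslevel=6)  # level 6 is the web default
ratio = len(html) / len(compressed)
print(f"{len(html)} -> {len(compressed)} bytes ({ratio:.1f}x)")
```

Level 6 is the common server default because higher levels cost noticeably more CPU for only marginally smaller output on typical web payloads.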
Brotli: Google's Better Compression
Brotli was developed by Google and standardized as RFC 7932 in 2016. It achieves 15-25% better compression than GZIP with comparable decompression speed. Brotli is supported by all modern browsers for HTTP Content-Encoding: br and has become the preferred compression for static web assets.
Brotli Compression Levels
Brotli supports levels 0-11. Levels 0-4 are fast and suitable for dynamic content (similar speed to GZIP with better ratio). Levels 5-9 offer a good balance for CDN pre-compression. Levels 10-11 are extremely slow (minutes for large files) but achieve the best ratios, suitable only for pre-compressed static assets that are compressed once and served millions of times.
Brotli's secret weapon is a built-in dictionary of common web content (HTML tags, CSS properties, JavaScript keywords, HTTP headers). This dictionary gives Brotli a significant advantage over GZIP for web assets specifically, because common strings do not need to be encoded from scratch.
Zstandard: The Modern All-Rounder
Zstandard (ZSTD) was created by Yann Collet at Facebook and standardized as RFC 8878 in 2021. It is arguably the most important compression algorithm of the 2010s, offering GZIP-level compression ratios at 5-10x the speed, or significantly better ratios at the same speed. ZSTD supports negative "fast" levels (trading ratio for near-LZ4 speed) up through level 22 (slower than LZMA, comparable ratio).
At level 3 (the default), ZSTD compresses at 450 MB/s with a 2.9:1 ratio, versus GZIP's 85 MB/s at 2.5:1. At level 19, ZSTD achieves a 3.6:1 ratio at 5 MB/s, approaching LZMA's 4.5:1 ratio. Decompression is always fast: 1,200 MB/s regardless of compression level.
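The level trade-off applies to every dictionary coder. It can be illustrated with zlib from the Python standard library, used here as a stand-in since ZSTD bindings are a third-party package; the shape of the curve is what matters, not the absolute numbers:

```python
import time
import zlib

# Repetitive, log-like sample data, where level choice is visible.
data = b"".join(
    b"2026-01-02T03:04:%02d INFO worker-7 request handled in 12ms\n" % (i % 60)
    for i in range(20_000)
)

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(data) / len(out):6.1f}:1 in {elapsed * 1000:.1f} ms")
```

Higher levels spend more time searching for longer dictionary matches; the ratio improves while throughput drops, exactly the pattern in the ZSTD and Brotli level tables above.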
ZSTD is used in the Linux kernel (for Btrfs, zram, and kernel image compression, among others), at Facebook (for log compression, database backups, and storage), and is increasingly adopted as a GZIP replacement for system packages and backups. HTTP Content-Encoding: zstd support is growing in browsers and CDNs.
7z: Maximum Compression
7z is the archive format of the 7-Zip project, using LZMA2 compression by default. LZMA2 achieves the best compression ratios of any general-purpose algorithm: typically 20-40% smaller than ZIP and 10-20% smaller than RAR at maximum settings. The 7z format supports solid archiving, AES-256 encryption (with header encryption), Unicode filenames, and multi-volume archives.
The trade-off is speed and memory: LZMA2 compression at high settings requires 200+ MB of RAM and is 10x slower than DEFLATE. Decompression is much faster than compression but still slower than GZIP's. For distributing large files where download bandwidth is the bottleneck, 7z's superior compression saves significant time and cost.
RAR: Proprietary but Feature-Rich
RAR is a proprietary archive format created by Eugene Roshal (the "R" in RAR). WinRAR is shareware (famously, the trial never actually expires). RAR5 (current version) offers compression ratios between ZIP and 7z, with unique features including recovery records (redundant data that allows repairing damaged archives) and recovery volumes (separate parity files for multi-volume archives).
RAR's recovery records are its primary advantage: you can allocate 1-10% of the archive size to redundancy data, which can repair corruption from bad sectors, incomplete downloads, or transmission errors. No other common archive format offers this. However, RAR is proprietary (compression requires WinRAR; decompression is available via unrar), making it unsuitable for open-source projects and automated pipelines.
TAR: The Archive Without Compression
TAR (Tape Archive) was created in 1979 for writing data to magnetic tape. It bundles multiple files and directories into a single stream while preserving Unix file permissions, ownership, timestamps, and symbolic links. TAR does not compress; it is always paired with a compression tool: tar.gz (with GZIP), tar.bz2 (with Bzip2), tar.xz (with XZ/LZMA2), or tar.zst (with Zstandard).
This separation of concerns (archiving vs compression) is the Unix philosophy in action: each tool does one thing well, and tools compose via pipes. The advantage is flexibility: you can use any compression algorithm with TAR. The disadvantage is that extracting a single file requires decompressing the entire archive (since the archive is compressed as a solid stream).
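The composition is visible in standard tooling. Python's tarfile, for example, selects the compressor via the mode string; a minimal in-memory sketch:

```python
import io
import tarfile

buf = io.BytesIO()

# "w:gz" composes TAR archiving with GZIP compression in one step,
# like `tar -czf`; "w:bz2" or "w:xz" swap in a different compressor.
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, text in [("app/config.toml", "debug = false\n"),
                       ("app/main.py", "print('hi')\n")]:
        data = text.encode()
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
print(names)
```

Note that listing the names already required decompressing the stream up to each header, illustrating the solid-stream trade-off described above.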
Compression Algorithm Benchmarks
The table below benchmarks 12 compression algorithms on the Silesia corpus (a standard benchmark dataset). Compression ratio is uncompressed size divided by compressed size. Speed is measured in MB/s on a single core of a modern CPU.
Compression Algorithm Benchmarks (Silesia Corpus)
12 rows
| Algorithm | Ratio | Compress (MB/s) | Decompress (MB/s) | Memory (MB) |
|---|---|---|---|---|
| DEFLATE (zlib) | 2.5 | 85 | 320 | 0.3 |
| GZIP -9 | 2.7 | 25 | 320 | 0.3 |
| Brotli -6 | 3.2 | 30 | 350 | 4 |
| Brotli -11 | 3.8 | 2 | 350 | 256 |
| ZSTD -3 | 2.9 | 450 | 1200 | 1 |
| ZSTD -19 | 3.6 | 5 | 1200 | 128 |
| LZMA2 | 4.5 | 8 | 150 | 200 |
| LZ4 | 2.1 | 780 | 4300 | 0.016 |
| LZ4 HC | 2.7 | 40 | 4300 | 0.064 |
| Snappy | 1.8 | 550 | 1800 | 0.032 |
| RAR5 | 4 | 12 | 180 | 128 |
| Bzip2 | 3.5 | 15 | 55 | 8 |
Compression Ratio vs Speed
The scatter plot below visualizes the fundamental trade-off in compression: faster algorithms achieve lower ratios, while better ratios require slower algorithms. LZ4 is the fastest but compresses least. LZMA2 compresses most but is the slowest. Zstandard offers the best balance, sitting in the sweet spot where reasonable speed meets good compression.
Compression Ratio vs Speed (MB/s)
Source: OnlineTools4Free Research
Key Finding
Zstandard (ZSTD) offers the best balance of compression ratio and speed. At default settings, it compresses 5x faster than GZIP with 15% better ratio.
ZSTD is the recommended replacement for GZIP in most scenarios: system packages, backups, log compression, and increasingly HTTP compression.
Archive Format Comparison
Archive & Compression Formats
10 rows
| Format | Year | Algorithm | Ratio | Encryption | Solid | Streamable | OS Support | Free | Best For |
|---|---|---|---|---|---|---|---|---|---|
| ZIP | 1989 | DEFLATE | 2.5:1 | AES-256 | No | Yes | Universal | Yes | Universal sharing |
| GZIP | 1992 | DEFLATE | 2.5:1 | No | N/A | Yes | Universal | Yes | HTTP compression, tar.gz |
| 7Z | 1999 | LZMA2 | 4.5:1 | AES-256 | Yes | No | Widespread | Yes | Maximum compression |
| RAR | 1993 | RAR5 | 4.0:1 | AES-256 | Yes | No | Widespread | No | Recovery records |
| TAR | 1979 | None | 1:1 | No | Yes | Yes | Unix/Linux | Yes | Archiving (with gzip/bz2) |
| Brotli | 2015 | Brotli | 3.2:1 | No | N/A | Yes | Widespread | Yes | HTTP compression (static) |
| Zstandard | 2016 | ZSTD | 3.0:1 | No | N/A | Yes | Growing | Yes | Real-time compression |
| XZ | 2009 | LZMA2 | 4.5:1 | No | N/A | Yes | Unix/Linux | Yes | Linux packages |
| BZ2 | 1996 | Burrows-Wheeler | 3.5:1 | No | N/A | Yes | Unix/Linux | Yes | Good ratio + streaming |
| LZ4 | 2011 | LZ4 | 2.1:1 | No | N/A | Yes | Widespread | Yes | Speed-critical compression |
Compression Ratio by Content Type
Compression effectiveness varies dramatically based on content type. Log files compress extremely well (6-8x) because they contain highly repetitive patterns. HTML and CSS compress well (4.5-6x) due to repeated tags and property names. Binary executables compress poorly (2-3x) because they have high entropy. Already-compressed files (JPEG, MP4) gain essentially zero benefit from additional compression.
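This spread is easy to reproduce with any codec. A sketch using zlib, with random bytes standing in for already-compressed content (which is statistically indistinguishable from random data):

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Compression ratio: uncompressed size / compressed size."""
    return len(data) / len(zlib.compress(data, 6))

log_lines = (
    b"Jan  2 03:04:05 host sshd[1234]: Accepted publickey for deploy\n" * 2000
)
random_bytes = os.urandom(100_000)  # proxy for JPEG/MP4 payload bytes

print(f"log-like text:      {ratio(log_lines):.1f}:1")
print(f"high-entropy bytes: {ratio(random_bytes):.2f}:1")
```

On the random input the ratio actually dips slightly below 1.0, because the codec must still emit framing overhead; this is why re-compressing JPEGs or MP4s inside a ZIP gains nothing.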
Compression Ratio by Content Type (ratio = uncompressed / compressed)
10 rows
| Content Type | GZIP | Brotli | ZSTD | LZ4 | LZMA |
|---|---|---|---|---|---|
| HTML | 4.8 | 5.5 | 5.2 | 3.2 | 6 |
| CSS | 5.2 | 6 | 5.8 | 3.5 | 6.5 |
| JavaScript | 4.5 | 5.2 | 5 | 3 | 5.8 |
| JSON API response | 4.2 | 4.8 | 4.6 | 2.8 | 5.2 |
| Source code (mixed) | 3.8 | 4.5 | 4.2 | 2.6 | 5 |
| Log files (syslog) | 6.5 | 7.2 | 7 | 4.5 | 8 |
| Database dump (SQL) | 5.5 | 6.2 | 6 | 3.8 | 6.8 |
| CSV data | 4 | 4.6 | 4.4 | 2.6 | 5 |
| Binary executable | 2.2 | 2.5 | 2.4 | 1.8 | 2.8 |
| Already-compressed (JPEG) | 1.02 | 1.02 | 1.01 | 1 | 1.03 |
HTTP Compression Adoption (2015-2026)
The web's transition from GZIP to Brotli has been gradual but steady. GZIP dominated until 2020, when Brotli crossed 28% adoption. By 2026, Brotli has overtaken GZIP (48% vs 40%). Zstandard (ZSTD) for HTTP is emerging at 11% adoption, primarily driven by CDN providers. The remaining 1% serves uncompressed content, which is a significant missed optimization opportunity.
HTTP Content-Encoding Usage on Top 10M Websites (%)
Source: OnlineTools4Free Research
Part 7: Font Formats
~3,000 words covering 6 font formats
Font formats determine how typeface data (glyph outlines, metrics, hinting instructions, kerning pairs) is stored and delivered. On the web, the format directly impacts page load performance: fonts are render-blocking resources that delay text display until downloaded.
Why Font Format Choices Impact Performance
Fonts are render-blocking resources: the browser will not paint text until it has downloaded and parsed the font file (or a timeout triggers the fallback). A poorly optimized font setup can add 2-4 seconds to text rendering on slow connections. The format, size, loading strategy, and subsetting all affect how quickly text appears.
The performance-optimal font loading strategy in 2026 involves five techniques: (1) WOFF2 format for 50-60% smaller files than TTF, (2) variable fonts to replace multiple static font files with one, (3) unicode-range subsetting to load only needed character sets, (4) font-display: swap for immediate text rendering, and (5) preloading the primary font with <link rel="preload" as="font" type="font/woff2" crossorigin>.
Self-hosted fonts are generally faster than Google Fonts because they eliminate a DNS lookup, TCP connection, and TLS handshake to fonts.googleapis.com. However, Google Fonts provides automatic format negotiation (serving WOFF2 where supported), automatic subsetting based on CSS unicode-range, and CDN distribution. For performance-critical sites, self-host the specific subset of WOFF2 files you need.
Color Fonts and Emoji
Color fonts embed multi-color glyphs, enabling emoji, brand logos, and decorative text in a single font file. Four competing color font technologies exist:
COLR/CPAL (v0 and v1): Uses layered vector shapes with a color palette. COLR v0 (flat colors) is supported by all browsers. COLR v1 (gradients, compositing, transformations) is supported by Chrome 98+ and Firefox 107+. Google Fonts uses COLR v1 for its Noto Color Emoji font, producing much smaller files than bitmap approaches.
CBDT/CBLC: Uses embedded PNG bitmaps at multiple resolutions. This is what Android uses for emoji. Files are large (10+ MB for a full emoji set) because each emoji is stored as multiple PNGs (18x18, 36x36, 72x72, 144x144 pixels).
sbix: Apple's color font format, also using embedded PNGs. Used by Apple Color Emoji, the font shipped on every Mac and iPhone. Even larger than CBDT because Apple stores high-resolution bitmaps.
SVG-in-OpenType: Embeds SVG documents as glyphs. Supported by Firefox and Safari but not Chrome. Produces the most flexible results (arbitrary SVG with gradients, filters, animations) but the largest files and slowest rendering.
For the web in 2026, COLR v1 is the recommended color font technology. It produces the smallest files (Noto Color Emoji is 9.4 MB in CBDT but only 1.85 MB in COLR v1), supports gradients and compositing, and has growing browser support. System emoji fonts are typically 20-40 MB, which is why browsers load them from the OS rather than downloading them.
WOFF2: The Web Font Standard
WOFF2 (Web Open Font Format 2) is the current standard for web fonts. It uses Brotli compression to achieve 30% smaller files than WOFF and 50-60% smaller than raw TTF/OTF. A typical Latin-character font is 15-25 KB in WOFF2, versus 40-60 KB in TTF. WOFF2 is supported by 98% of global browsers (all modern browsers since 2018).
WOFF2 is simply a wrapper around TTF or OTF font data with Brotli compression applied to the font tables. The font data itself (glyph outlines, metrics, features) is identical. Browsers decompress WOFF2 and use the font data exactly as they would raw TTF/OTF. There is no quality difference; only file size and load time benefit.
WOFF: The First Web Font Format
WOFF (Web Open Font Format) was standardized in 2010 as the first purpose-built web font format. It uses zlib (DEFLATE) compression, achieving about 40% smaller files than raw TTF/OTF. WOFF is supported by 99% of browsers, including older versions that lack WOFF2 support.
In 2026, WOFF serves primarily as a fallback for the rare browsers that do not support WOFF2. Include both in your @font-face declaration: format("woff2") first, format("woff") second. The browser will download only the first format it supports.
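A typical declaration looks like this (the file paths and family name are placeholders):

```css
@font-face {
  font-family: 'ExampleSans';
  src: url('/fonts/example-sans.woff2') format('woff2'),
       url('/fonts/example-sans.woff') format('woff');
  font-weight: 400;
  font-style: normal;
  font-display: swap;
}
```

Order matters: the browser downloads the first source whose format it supports, so woff2 must come first.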
TTF and OTF: Desktop Fonts
TrueType (TTF, 1991, Apple/Microsoft) and OpenType (OTF, 1996, Microsoft/Adobe) are the standard desktop font formats. OpenType is technically a superset of TrueType, using either TrueType outlines (quadratic Bezier curves, .ttf extension) or CFF outlines (cubic Bezier curves, .otf extension). OpenType adds support for advanced typographic features: ligatures, small caps, old-style numerals, stylistic alternates, contextual alternates, and more.
For desktop use (word processors, design tools), TTF and OTF are interchangeable for basic functionality. OTF with CFF outlines typically produces slightly smaller files and better rendering at small sizes on some platforms. For web use, always convert TTF/OTF to WOFF2.
Variable Fonts: One File, Every Weight
Variable fonts (OpenType 1.8, 2016) contain a single set of glyph outlines plus mathematical instructions to interpolate between design extremes along defined axes. Instead of loading separate files for Regular, Medium, Semibold, Bold, and Black (5 files, ~100 KB WOFF2 each = 500 KB), a single variable font file covers all weights in ~80-120 KB.
Standard Axes
The OpenType specification defines five registered axes: wght (weight, 100-900), wdth (width, 75-125), ital (italic, 0-1), slnt (slant, -90 to 90 degrees), and opsz (optical size, adapting design for different point sizes). Font designers can also create custom axes for any variation they choose.
CSS supports variable fonts via the font-variation-settings property for fine-grained control, or via standard properties like font-weight: 650 (any integer from 1-1000, not just 100/200/.../900). Google Fonts now serves variable fonts by default for supported families, automatically reducing page weight.
EOT: The IE-Only Format
EOT (Embedded OpenType) is Microsoft's proprietary web font format for Internet Explorer, submitted to the W3C in 2008. It was the only web font format supported by IE 6-8 and was required for IE compatibility until IE was retired in 2022. In 2026, there is zero reason to include EOT in your @font-face declarations. Remove it to simplify your CSS and reduce maintenance burden.
Font Loading Strategies
How fonts are loaded affects both performance and user experience. The CSS font-display property controls behavior while fonts load:
font-display: swap — shows text immediately in a fallback font, then swaps to the custom font when loaded. Best for body text where readability during load is critical. Causes a "Flash of Unstyled Text" (FOUT).
font-display: optional — shows text in fallback font if the custom font does not load within ~100ms. The font is still downloaded for subsequent page loads. Best for non-critical decorative fonts. Eliminates layout shift.
font-display: block — hides text for up to 3 seconds while waiting for the font. Causes a "Flash of Invisible Text" (FOIT). Generally not recommended because invisible text is worse than fallback-styled text.
font-display: fallback — blocks for ~100ms, then falls back. If the font loads within ~3s, it swaps. After 3s, fallback persists. A middle ground between swap and optional.
For optimal Core Web Vitals, use font-display: swap for primary text fonts and preload them with <link rel="preload" as="font" type="font/woff2" crossorigin>. For decorative or icon fonts, use font-display: optional.
Font Format Comparison
Font Formats: Comparison
6 rows
| Format | Year | Compression | Avg Size (KB) | Browser % | Variable | Color | Best For |
|---|---|---|---|---|---|---|---|
| WOFF2 | 2018 | Brotli | 18 | 98% | Yes | Yes | Modern web fonts |
| WOFF | 2010 | zlib | 28 | 99% | Yes | Yes | Web font fallback |
| TTF | 1991 | None | 45 | 99% | Yes | Limited | Desktop applications |
| OTF | 1996 | CFF | 40 | 99% | Yes | Yes | Design, advanced typography |
| EOT | 2008 | LZ | 35 | IE only | No | No | Legacy IE support |
| SVG Font | 2001 | None | 120 | <5% | No | Yes | Nothing (obsolete) |
Font File Size by Format
Font File Size by Format (KB)
Source: OnlineTools4Free Research
Key Finding
WOFF2 reduces font file size by 50-60% compared to TTF, and 30% compared to WOFF. A single variable font file can replace 10-20 static font files, reducing total payload by 70-90%.
Always serve WOFF2 with WOFF fallback. Consider variable fonts for sites using multiple weights of the same family.
Font Subsetting Savings
Subsetting removes unused glyphs from a font file. A full Inter Variable font with all 2,548 glyphs is 98 KB in WOFF2. Subsetting to Latin characters only (230 glyphs) reduces it to 18 KB — an 82% savings. For a site that only needs digits and basic punctuation (25 glyphs), the font shrinks to just 4 KB.
Font Subsetting Savings (Inter Variable, WOFF2)
5 rows
| Subset | Glyphs | Size (KB) | Savings % |
|---|---|---|---|
| Full font (all glyphs) | 2548 | 98 | 0 |
| Latin + Latin Extended | 420 | 28 | 71 |
| Latin only | 230 | 18 | 82 |
| US-ASCII only | 95 | 10 | 90 |
| Digits + basic punctuation | 25 | 4 | 96 |
Variable Font Axis Impact on File Size
Each additional variation axis increases the variable font file size, but the savings compared to static font files grow even faster. A weight-only variable font (82 KB) replaces 9 static files (162 KB total). Adding width and italic axes increases the variable font to 155 KB but replaces 54 static files totaling 972 KB — an 84% reduction. The break-even point is typically 3-4 weights of the same family.
Variable Font Savings vs Static Font Files
4 rows
| Axes | Static Files Replaced | Static Total (KB) | Variable (KB) | Savings |
|---|---|---|---|---|
| Weight only (wght) | 9 | 162 | 82 | 49% |
| Weight + Width (wght, wdth) | 27 | 486 | 110 | 77% |
| Weight + Width + Italic | 54 | 972 | 155 | 84% |
| Weight + Width + Italic + opsz | 108 | 1944 | 195 | 90% |
Code Example: Variable Font @font-face
The CSS below shows how to load a variable font with weight axis support, unicode-range subsetting, and optimal font-display strategy.
@font-face {
font-family: 'Inter';
src: url('/fonts/Inter-Variable.woff2') format('woff2');
font-weight: 100 900;
font-style: normal;
font-display: swap;
unicode-range: U+0000-00FF, U+0131, U+0152-0153;
}
Part 8: 3D & Specialized Formats
~3,000 words covering 10 specialized formats
Beyond the common categories of images, documents, video, and audio, specialized formats serve specific industries: 3D modeling and printing, medical imaging, scientific data, and film production. Understanding these formats is essential for anyone working in these domains.
STL: The 3D Printing Standard
STL (Stereolithography) was created by 3D Systems in 1987 for their stereolithography 3D printers. It describes 3D objects as a collection of triangulated surfaces (triangle meshes) with no color, texture, or material information. Every 3D printer in the world accepts STL files, making it the de facto standard for 3D printing.
STL exists in two formats: ASCII (human-readable but enormous) and binary (compact, preferred). A typical 3D model is 1-50 MB in binary STL. The format stores only surface geometry as triangles, so it cannot represent curves exactly, internal structures, color, or materials. For these features, use 3MF (3D Manufacturing Format), which is slowly replacing STL for advanced 3D printing.
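The binary layout is simple enough to read and write with Python's struct module. A minimal sketch that handles only the fixed 50-byte triangle records and writes a zero attribute word:

```python
import struct

# Binary STL layout: 80-byte header, uint32 triangle count, then 50 bytes
# per triangle: normal + 3 vertices as float32 triples, plus a uint16.
TRI = struct.Struct("<12fH")

def write_stl(triangles):
    blob = b"\x00" * 80 + struct.pack("<I", len(triangles))
    for normal, v1, v2, v3 in triangles:
        blob += TRI.pack(*normal, *v1, *v2, *v3, 0)
    return blob

def read_stl(blob):
    count = struct.unpack_from("<I", blob, 80)[0]
    tris = []
    for i in range(count):
        values = TRI.unpack_from(blob, 84 + i * TRI.size)
        tris.append(values[:12])  # drop the attribute word
    return tris

# A single triangle in the XY plane with an upward-facing normal.
tri = ((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
blob = write_stl([tri])
print(len(blob), read_stl(blob)[0][:3])
```

The fixed record size is why binary STL files are easy to validate: the total length must equal 84 + 50 × triangle-count, a quick sanity check for truncated downloads.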
OBJ: The Simple 3D Exchange Format
OBJ (Wavefront Object) was developed by Wavefront Technologies in the early 1990s. It stores 3D geometry as vertices, texture coordinates, normals, and faces in a plain text format. Material properties are stored in an accompanying MTL (Material Template Library) file. OBJ is widely supported by 3D modeling software (Blender, Maya, 3ds Max, Cinema 4D) and is commonly used for static model exchange.
OBJ limitations: no animation support, no scene graph (lights, cameras), no binary format (always text, resulting in large files), and limited material definitions. For modern workflows, glTF has largely replaced OBJ for 3D content distribution.
glTF/GLB: The JPEG of 3D
glTF (GL Transmission Format) is an open standard developed by the Khronos Group for efficient 3D content delivery. Often called "the JPEG of 3D," glTF is designed for runtime loading rather than authoring. It supports meshes, materials (PBR metallic-roughness), textures, animations (skeletal, morph targets), cameras, lights, and scene hierarchy.
glTF comes in two variants: .gltf (JSON text file referencing external binary buffers and textures) and .glb (single binary file containing everything). GLB is preferred for distribution because it is a single file, easier to download, and avoids the complexity of managing multiple files.
Mesh compression extensions (Draco, meshopt) can reduce GLB file sizes by 90% for geometry-heavy models. All major 3D engines (Three.js, Babylon.js, Unity, Unreal, Godot) support glTF loading. Apple uses a USDZ variant but increasingly supports glTF through their ecosystem.
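The GLB container itself is a simple framed layout. A minimal sketch in Python; a real GLB would normally also carry a binary BIN chunk with geometry after the JSON chunk:

```python
import json
import struct

# GLB container: 12-byte header (magic 'glTF', version 2, total length),
# then chunks of uint32 length, uint32 type, and a 4-byte-aligned payload.
JSON_CHUNK = 0x4E4F534A  # the ASCII bytes 'JSON', little-endian

def make_glb(gltf: dict) -> bytes:
    payload = json.dumps(gltf, separators=(",", ":")).encode()
    payload += b" " * (-len(payload) % 4)  # JSON chunks are padded with spaces
    chunk = struct.pack("<II", len(payload), JSON_CHUNK) + payload
    return struct.pack("<4sII", b"glTF", 2, 12 + len(chunk)) + chunk

def read_glb_json(blob: bytes) -> dict:
    magic, version, _length = struct.unpack_from("<4sII", blob, 0)
    assert magic == b"glTF" and version == 2
    chunk_len, chunk_type = struct.unpack_from("<II", blob, 12)
    assert chunk_type == JSON_CHUNK
    return json.loads(blob[20:20 + chunk_len])

blob = make_glb({"asset": {"version": "2.0"}})
print(read_glb_json(blob))
```

This framing is why GLB loads fast in engines: one read yields the scene description and, in real files, a binary buffer that can be handed to the GPU with no further parsing.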
FBX: The Industry Pipeline Format
FBX is a proprietary format owned by Autodesk, widely used for exchanging animated 3D content between DCC (Digital Content Creation) tools. It supports meshes, materials, skeletal animation, blend shapes, lights, cameras, and scene hierarchy. FBX is the standard interchange format for game development (Unreal Engine and Unity both prefer FBX for asset import) and film VFX.
The main criticism of FBX is its proprietary nature: Autodesk controls the specification, and the FBX SDK is the only reliable way to read/write FBX files. Open-source alternatives (Assimp, OpenFBX) provide partial support but cannot guarantee full compatibility. For open workflows, glTF or USD are preferred.
STEP and IGES: CAD Interchange
STEP (Standard for the Exchange of Product Data, ISO 10303) and IGES (Initial Graphics Exchange Specification) are the standard formats for exchanging CAD (Computer-Aided Design) data between different CAD software. Unlike mesh formats (STL, OBJ), STEP preserves exact mathematical surface definitions using B-Rep (Boundary Representation) with NURBS curves and surfaces.
STEP is the current standard, supporting assemblies, tolerances, materials, and product manufacturing information (PMI). IGES is the legacy format, limited to geometry and basic annotation. When exchanging CAD data between SolidWorks, CATIA, NX, Inventor, or Fusion 360, use STEP. Only use IGES when the receiving system cannot read STEP.
DICOM: Medical Imaging
DICOM (Digital Imaging and Communications in Medicine) is the universal standard for medical imaging, used by CT scanners, MRI machines, X-rays, ultrasound, mammography, PET, and nuclear medicine. Every medical imaging device in every hospital worldwide produces DICOM files.
A DICOM file contains both the image data and extensive metadata: patient demographics, study/series information, acquisition parameters (slice thickness, field strength, pulse sequence), and institutional data. DICOM uses a tag-based data structure with thousands of defined data elements.
DICOM supports multiple image compression formats (uncompressed, JPEG, JPEG 2000, JPEG-LS, RLE) and handles multi-frame images (CT/MRI volumes are stored as series of 2D slices). DICOM networking (PACS) enables image storage, retrieval, and display across hospital networks.
NetCDF and HDF5: Scientific Data
NetCDF (Network Common Data Form) and HDF5 (Hierarchical Data Format 5) are the standard formats for scientific and research data. NetCDF stores multidimensional arrays (temperature grids, time series, satellite data) with self-describing metadata (variable names, units, coordinate systems). It is the standard format for climate science, meteorology, and oceanography.
HDF5 provides a more general hierarchical data model with groups (like directories), datasets (N-dimensional arrays), and attributes (metadata). It supports chunked storage, compression (gzip, szip, LZF), parallel I/O, and virtual datasets. HDF5 is used in astronomy (LIGO gravitational wave data), particle physics (CERN), genomics, and deep learning (model weights are often stored in HDF5).
USD: The Future of 3D
USD (Universal Scene Description) was developed by Pixar and open-sourced in 2016. It is a scene composition framework designed for film VFX workflows where hundreds of artists work on the same scene simultaneously. USD supports non-destructive layering (like Photoshop layers for 3D), references, variant sets, and time-sampled animation.
Apple adopted USDZ (USD packaged in ZIP) for AR content on iOS and visionOS. NVIDIA uses USD as the foundation for Omniverse. The Alliance for Open USD (AOUSD, founded by Apple, Pixar, Adobe, Autodesk, NVIDIA) is standardizing USD for cross-industry use. USD may eventually become the standard scene description format for all 3D industries.
Specialized Format Comparison
3D & Specialized Formats
10 rows
| Format | Year | Domain | Data Model | Textures | Animation | Compression | Open Standard | Best For |
|---|---|---|---|---|---|---|---|---|
| STL | 1987 | 3D Printing | Triangle mesh | No | No | No | Yes | 3D printing, prototyping |
| OBJ | 1992 | 3D Graphics | Polygon mesh | Yes (MTL) | No | No | Yes | 3D model exchange |
| glTF/GLB | 2017 | Web 3D | Scene graph | Yes | Yes | Draco, meshopt | Yes (Khronos) | Web 3D, AR/VR |
| FBX | 2006 | Game Dev | Scene graph | Yes | Yes | Yes | No (Autodesk) | Game engines, film VFX |
| STEP | 1994 | CAD | B-Rep solid | No | No | No | Yes (ISO 10303) | CAD data exchange |
| IGES | 1980 | CAD | Surface/Wireframe | No | No | No | Yes (ANSI) | Legacy CAD exchange |
| DICOM | 1993 | Medical | Image + metadata | N/A | Series | JPEG, JPEG 2000 | Yes (NEMA) | Medical imaging |
| NetCDF | 1989 | Science | N-D arrays | N/A | Time dim | zlib, szip | Yes (Unidata) | Climate, atmospheric data |
| HDF5 | 1998 | Science | Hierarchical groups | N/A | Time series | gzip, szip, LZF | Yes (HDF Group) | Large scientific datasets |
| USD | 2016 | 3D/Film | Scene composition | Yes | Yes | Crate (binary) | Yes (Pixar/AOUSD) | Film VFX, Apple Vision Pro |
3D Model File Size Comparison
The same 3D model (Stanford Bunny, 69,000 triangles) produces vastly different file sizes depending on the format. The most dramatic difference is between STL ASCII (6.8 MB) and GLB with Draco compression (380 KB) — nearly an 18x reduction. For web 3D content, GLB+Draco is the clear winner: it is roughly 90% smaller than uncompressed formats while loading faster than text-based alternatives.
3D Model File Size Comparison (Stanford Bunny, 69K triangles)
11 rows
| Format | Size (KB) | Load Time (ms) | Compression |
|---|---|---|---|
| STL (ASCII) | 6800 | 850 | No |
| STL (binary) | 3400 | 120 | No |
| OBJ | 4200 | 280 | No |
| OBJ + MTL | 4250 | 320 | No |
| glTF (JSON + bin) | 3800 | 95 | Yes |
| GLB | 3400 | 85 | Yes |
| GLB + Draco | 380 | 150 | Yes |
| GLB + meshopt | 520 | 92 | Yes |
| FBX (binary) | 3600 | 180 | Yes |
| USD (binary/crate) | 2800 | 110 | Yes |
| PLY | 3200 | 145 | No |
3MF: The Modern 3D Printing Format
3MF (3D Manufacturing Format) was developed by the 3MF Consortium (Microsoft, HP, Autodesk, Shapeways, and others) as a modern replacement for STL. While STL stores only geometry as a triangle mesh, 3MF supports colors, materials, textures, lattice structures, and multiple objects within a single file. The format is XML-based inside a ZIP container (like DOCX and EPUB), making it self-describing and extensible.
3MF also solves common STL problems: it guarantees watertight meshes (no gaps or inverted normals), uses efficient binary encoding for triangle data, and includes build instructions (orientation, support structures) that STL cannot express. Major 3D printing services (Shapeways, i.Materialise) and slicers (PrusaSlicer, Cura, Bambu Studio) support 3MF. For new 3D printing workflows, prefer 3MF over STL whenever your printer software supports it.
PLY: The Point Cloud Format
PLY (Polygon File Format, also called Stanford Triangle Format) was designed at Stanford University for storing 3D scanned data. Unlike STL and OBJ which store only surface geometry, PLY can store per-vertex properties including color, normals, texture coordinates, and custom data channels. PLY supports both ASCII and binary encoding.
PLY is widely used in photogrammetry (3D reconstruction from photographs), LiDAR scanning, and cultural heritage digitization. Tools like Meshlab, CloudCompare, and Open3D use PLY as a primary format. For point cloud data (millions of unconnected 3D points), PLY is often more appropriate than mesh formats like glTF because it naturally represents unstructured point data.
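Because PLY is a simple self-describing text format, a valid point-cloud file can be generated with a few lines of code. The sketch below builds a minimal ASCII PLY for colored points (no faces); the property names follow the common conventions used by tools like MeshLab and CloudCompare, but the function name is illustrative:

```python
# Minimal ASCII PLY generator for a colored point cloud (points only, no faces).
# Illustrative sketch: property names follow common PLY conventions.
def ply_points_ascii(points):
    """points: iterable of (x, y, z, r, g, b); returns the PLY file contents."""
    points = list(points)
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",   # per-vertex color, one of PLY's key strengths
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    body = [f"{x} {y} {z} {r} {g} {b}" for x, y, z, r, g, b in points]
    return "\n".join(header + body) + "\n"
```

Writing the returned string to a `.ply` file produces a point cloud that viewers can open directly; binary PLY uses the same header with `format binary_little_endian 1.0` and packed values.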
Alembic: Baked Animation Cache
Alembic (.abc) is an interchange format for baked (pre-computed) animation data, developed by Sony Pictures Imageworks and Industrial Light & Magic. Unlike FBX or glTF which store skeletal rigs and skinning weights, Alembic stores the final deformed geometry at each frame. This makes it ideal for transferring complex simulations (cloth, fluid, hair) between DCC tools without worrying about rig compatibility.
Alembic is the standard format in film VFX pipelines for passing animated geometry between departments. A character animated in Maya is exported as Alembic for lighting in Houdini, then imported as Alembic in Nuke for compositing. Each department only needs the final mesh positions, not the underlying rig.
Image Format Maximum Specifications
Technical limits of each image format
Every image format has hard limits on resolution, color depth, and file size imposed by its specification. These limits rarely matter for web use but are critical for professional workflows. JPEG is limited to 65,535 x 65,535 pixels and 12-bit color depth. JPEG XL pushes these limits dramatically: up to 1 billion pixels per side, 32-bit float per channel, and up to 4,099 channels. WebP's 16,383 pixel limit is the most restrictive of the modern formats and prevents its use for certain panoramic and professional photography applications.
Image Format Maximum Specifications
12 rows
| Format | Max Width | Max Height | Max Bit Depth | Max Channels | Color Spaces | Specification |
|---|---|---|---|---|---|---|
| JPEG | 65535 | 65535 | 12 | 4 | sRGB, CMYK | ISO/IEC 10918-1 |
| PNG | 2147483647 | 2147483647 | 16 | 4 | sRGB, ICC | ISO/IEC 15948 |
| WebP | 16383 | 16383 | 8 | 4 | sRGB | Google WebP spec |
| AVIF | 65536 | 65536 | 12 | 4 | sRGB, P3, BT.2020 | ISO/IEC 23000-22 |
| JPEG XL | 1073741823 | 1073741823 | 32 | 4099 | sRGB, P3, BT.2020, ICC | ISO/IEC 18181 |
| HEIC | 65535 | 65535 | 10 | 4 | sRGB, P3, BT.2020 | ISO/IEC 23008-12 |
| GIF | 65535 | 65535 | 8 | 1 | Palette (256) | GIF89a spec |
| TIFF | 4294967295 | 4294967295 | 64 | Unlimited | sRGB, CMYK, LAB, ICC | TIFF 6.0 + supplements |
| BMP | 2147483647 | 2147483647 | 32 | 4 | sRGB | BMP v5 header |
| SVG | Unlimited (vector) | Unlimited (vector) | N/A | N/A | sRGB, P3 (CSS) | W3C SVG 2.0 |
| ICO | 256 | 256 | 32 | 4 | sRGB | ICO format spec |
| JPEG 2000 | 4294967295 | 4294967295 | 38 | 16384 | sRGB, ICC, LAB | ISO/IEC 15444-1 |
JPEG XL's extreme specifications (billion-pixel images, 4,099 channels, 32-bit float) make it suitable for scientific imaging, remote sensing, and any application that pushes beyond the capabilities of existing formats. The 4,099-channel support accommodates multispectral and hyperspectral imaging where sensors capture hundreds of wavelengths simultaneously.
How Compression Algorithms Work: Step by Step
Detailed internal operation of DEFLATE, LZ4, Brotli, and Zstandard
All lossless compression algorithms exploit the same fundamental principle: data contains patterns, and patterns can be represented more compactly than raw bytes. The difference between algorithms is how they find and encode those patterns, and the trade-offs they make between speed, memory, and compression ratio. The sections below explain the internal operation of four important algorithms step by step.
DEFLATE (Used by ZIP, GZIP, PNG)
DEFLATE is the most widely deployed compression algorithm in history, used by ZIP, GZIP, and PNG. Despite being designed in 1993, it remains the baseline that every other algorithm is compared against. DEFLATE operates in four steps:
Step 1: LZ77 Matching. The algorithm scans the input data looking for repeated byte sequences. When it finds a match, instead of storing the bytes again, it emits a (length, distance) pair: "copy 15 bytes from 342 bytes ago." The search window is limited to 32 KB in standard DEFLATE. Longer matches and closer distances produce better compression.
Step 2: Lazy Matching. Before committing to a match, the algorithm checks whether starting one byte later would produce a longer match. If so, it emits the current byte as a literal and uses the longer match instead. This "lazy evaluation" improves compression by 2-5% at the cost of additional CPU time.
Step 3: Huffman Coding. The encoder builds frequency tables for all the literal bytes and length/distance pairs. More frequent symbols get shorter codes; rarer symbols get longer codes. Two Huffman trees are built: one for literals+lengths and one for distances. The trees themselves are encoded in the output so the decoder can reconstruct them.
Step 4: Block Splitting. The input is divided into blocks. Each block can use one of three modes: stored (no compression, for already-compressed data), fixed Huffman (predefined tables, fast but suboptimal), or dynamic Huffman (custom tables per block, best compression). Block boundaries are chosen to maximize compression.
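These four steps are exactly what zlib performs internally. A quick sketch showing their visible effects — level 0 emits only "stored" blocks (step 4's uncompressed mode, slightly larger than the input), while higher levels search harder for LZ77 matches:

```python
import zlib

# DEFLATE via zlib: stored blocks vs. shallow vs. deep match search.
data = b"the quick brown fox jumps over the lazy dog. " * 500

stored = zlib.compress(data, level=0)   # stored blocks only: no compression
fast = zlib.compress(data, level=1)     # shallow LZ77 search
best = zlib.compress(data, level=9)     # deep search + lazy matching

assert len(stored) > len(data)          # block headers add a few bytes
assert len(best) <= len(fast) < len(data)
assert zlib.decompress(best) == data    # lossless round trip
```

On this highly repetitive input, levels 1 and 9 both shrink the data to a tiny fraction of its original size; the gap between them widens on less redundant data.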
LZ4 (Used by Linux Kernel, ZFS, Game Engines)
LZ4 prioritizes decompression speed above all else. It decompresses at 4.3 GB/s (faster than memcpy on some architectures) while still achieving meaningful compression ratios (2.1:1). LZ4 achieves this speed through radical simplification:
Step 1: Hash Matching. Each 4-byte sequence is hashed and looked up in a hash table. Unlike DEFLATE which searches chains of matches, LZ4 uses a single-probe hash (one lookup, no chains). If the probe misses, the byte is emitted as a literal. This makes matching extremely fast but misses some opportunities.
Step 2: Literal/Match Encoding. The output consists of alternating literal runs (unmatched bytes copied directly) and match references (offset + length). A single token byte encodes both the literal length (high 4 bits) and match length (low 4 bits). Values over 15 use additional bytes.
Step 3: Offset Encoding. Match offsets are stored as 16-bit little-endian values, limiting the match window to 64 KB. This constraint is intentional: 16-bit offsets can be decoded with a single memory read, contributing to LZ4's exceptional decompression speed.
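The token-byte scheme from steps 2 and 3 can be made concrete with an educational decoder for LZ4's block format. This sketch ignores the spec's end-of-block restrictions (minimum literal tail, etc.) and is for illustration only, not a production decoder:

```python
def lz4_decode_block(src: bytes) -> bytes:
    # Educational LZ4 block decoder. Token byte: high nibble = literal run
    # length, low nibble = match length - 4; the value 15 means "read
    # additional length bytes until one is not 255".
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]; i += 1
        # --- literal run: bytes copied through unchanged ---
        lit = token >> 4
        if lit == 15:
            while src[i] == 255:
                lit += 255; i += 1
            lit += src[i]; i += 1
        out += src[i:i + lit]; i += lit
        if i >= len(src):               # the final sequence has no match part
            break
        # --- match: 16-bit little-endian offset, then length ---
        offset = src[i] | (src[i + 1] << 8); i += 2
        mlen = (token & 0x0F) + 4
        if (token & 0x0F) == 15:
            while src[i] == 255:
                mlen += 255; i += 1
            mlen += src[i]; i += 1
        for _ in range(mlen):           # byte-by-byte copy handles overlaps,
            out.append(out[-offset])    # e.g. offset 1 = run-length encoding
    return bytes(out)
```

For example, the 4-byte block `13 61 01 00` (token 0x13 = one literal + 7-byte match, literal "a", offset 1) decodes to eight "a" bytes — the overlapping copy acts as run-length encoding.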
Brotli (Used for Web Content Compression)
Brotli achieves 15-25% better compression than GZIP primarily through two innovations: a built-in dictionary and context modeling.
Step 1: Dictionary Lookup. Brotli includes a 120 KB static dictionary of common web content: HTML tags (<div>, <span>), CSS properties (font-size, margin-bottom), JavaScript keywords (function, return), and HTTP header fragments. When the encoder finds a match in this dictionary, it encodes just the dictionary reference (a few bytes) instead of the full string. This gives Brotli a 5-10% advantage over GZIP specifically for web content.
Step 2: LZ77 Matching with Configurable Depth. Like DEFLATE, Brotli searches for repeated sequences. But Brotli supports much deeper searches at high compression levels (level 11 explores thousands of candidates) while using a 16 MB window (vs DEFLATE's 32 KB). More candidates and a larger window mean better matches.
Step 3: Context Modeling. Brotli uses the previous two bytes to select different probability tables for different contexts. English text after a space has different character frequencies than text after a vowel. By maintaining separate tables for different contexts, Brotli achieves better entropy coding than a single global table.
Step 4: Context-Dependent Prefix Coding. Like DEFLATE, Brotli's final entropy stage uses canonical Huffman (prefix) codes — not arithmetic or ANS coding. The difference is that Brotli maintains many prefix-code tables at once, selected per symbol by the context model, and can switch tables within a block. This captures much of the gain of more expensive entropy coders while remaining fast to decode. (ANS-style coding appears in Zstandard's FSE stage, described below, not in Brotli.)
Zstandard (Modern All-Rounder)
Zstandard (ZSTD) achieves GZIP-level compression at 5-10x the speed through several innovations:
Step 1: Multi-Hash Matching. ZSTD uses multiple hash tables with different hash sizes to find matches at different lengths efficiently. Short matches (3-4 bytes) use a 3-byte hash; longer matches use a 4-6 byte hash. This avoids the trade-off between speed and match quality that simpler schemes face.
Step 2: FSE (Finite State Entropy). ZSTD uses FSE, a table-based entropy coder that is faster than Huffman coding while approaching arithmetic coding accuracy. FSE is particularly efficient for small alphabets (like match lengths and offsets) and can be vectorized for SIMD acceleration.
Step 3: Repeat Offset Tracking. ZSTD tracks the 3 most recent match offsets. When a new match has the same offset as a recent one, it is encoded with just 1-2 bits instead of the full offset value. This is highly effective for structured data (JSON, XML, source code) where similar patterns repeat at consistent intervals.
Step 4: Dictionary Mode. ZSTD can be trained on sample data to create a custom dictionary. For small payloads (API responses under 1 KB), dictionary compression can improve ratios by 2-5x because the dictionary provides context that the payload alone cannot. This is why ZSTD with dictionaries is the preferred compression for database page compression and small API responses.
The MP3 Psychoacoustic Model Explained
How MP3 decides which sounds to keep and which to discard
MP3's compression is built on a psychoacoustic model — a mathematical representation of how human hearing perceives sound. The model identifies which parts of an audio signal are inaudible (either too quiet to hear or masked by louder nearby sounds) and discards them. This is fundamentally different from lossless compression: MP3 permanently removes data, but the removed data was inaudible anyway.
Simultaneous Masking
When a loud tone is playing, it raises the hearing threshold for nearby frequencies. A 1 kHz tone at 60 dB makes frequencies between 800 Hz and 1.2 kHz effectively inaudible unless they are also loud. The MP3 encoder calculates this masking effect for every frequency band and discards any signal components that fall below the masking threshold.
The masking effect is asymmetric: a tone masks higher frequencies more effectively than lower frequencies. This is because the basilar membrane in the inner ear propagates energy from the base (high frequencies) toward the apex (low frequencies). The table below shows how a masker at each frequency raises the hearing threshold at other frequencies.
Temporal Masking
Masking also occurs in the time domain. A loud sound masks softer sounds for approximately 5-20 ms after the loud sound ends (post-masking) and even 1-5 ms before it begins (pre-masking, due to neural processing delays). The MP3 encoder uses a short window (192 samples) when it detects transients (drum hits, consonants) to preserve temporal detail, and a long window (576 samples) for sustained tones to achieve better frequency resolution.
Absolute Threshold of Hearing
Even without masking, the human ear has a frequency-dependent sensitivity curve. We are most sensitive between 1-4 kHz (the speech frequency range) and much less sensitive at very low (<100 Hz) and very high (>16 kHz) frequencies. The MP3 encoder permanently discards any signal components below this absolute threshold, regardless of other content. At 128 kbps, MP3 typically cuts all frequencies above 16 kHz; at 320 kbps, it preserves up to 20 kHz.
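The absolute threshold curve described above is usually approximated in psychoacoustic models with Terhardt's formula. The sketch below implements that standard approximation (in dB SPL, for frequencies roughly between 20 Hz and 20 kHz); the function name is ours:

```python
import math

# Terhardt's approximation of the absolute threshold of hearing (dB SPL),
# widely used in MPEG psychoacoustic models. f is frequency in Hz.
def hearing_threshold_db(f: float) -> float:
    khz = f / 1000.0
    return (3.64 * khz ** -0.8                          # low-frequency rise
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)   # dip near 3.3 kHz
            + 1e-3 * khz ** 4)                          # steep high-freq rise
```

Evaluating it confirms the shape described above: the ear is most sensitive around 3-4 kHz, noticeably less sensitive at 100 Hz, and dramatically less sensitive at 16 kHz — which is why cutting content above 16 kHz at 128 kbps is barely audible.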
Why Opus Sounds Better Than MP3
Opus uses a more sophisticated psychoacoustic model than MP3 and combines two codecs suited to different content. SILK (a linear-prediction codec optimized for voice, developed by Skype) handles speech, while CELT (a frequency-domain codec with better frequency resolution than MP3's MDCT) handles music; in hybrid mode, SILK encodes the band below 8 kHz and CELT covers the frequencies above it. The encoder dynamically selects or blends these modes based on the content, giving Opus a significant advantage over MP3's one-size-fits-all approach.
Additionally, Opus benefits from 20 years of psychoacoustic research that was not available when MP3 was designed. Opus uses a more accurate masking model, better bit allocation, and a more efficient entropy coder. The result is transparent quality at 128 kbps (where MP3 still has audible artifacts) and acceptable quality down to 32 kbps (where MP3 sounds terrible).
CDN Auto-Format Negotiation
How CDNs automatically serve the best format
Modern CDNs can automatically convert and serve images in the optimal format based on the browser's Accept header. When a browser sends Accept: image/avif,image/webp,*/*, the CDN can serve AVIF to that browser while serving WebP or JPEG to older browsers — all from a single original image. This eliminates the need to pre-generate multiple format versions.
CDN Image Format Auto-Negotiation Capabilities
6 rows
| CDN | Image Formats | HTTP Compression | Auto Resize | Cost |
|---|---|---|---|---|
| Cloudflare | AVIF, WebP, original | Brotli, GZIP, ZSTD | Yes (Polish) | Free (basic) |
| Vercel/Next.js | AVIF, WebP (via next/image) | Brotli, GZIP | Yes | Included |
| Cloudinary | AVIF, WebP, JPEG XL, HEIC | N/A | Yes | Free tier + paid |
| Imgix | AVIF, WebP, JPEG XL | N/A | Yes | $5/1000 images |
| Bunny CDN | AVIF, WebP | Brotli, GZIP | Yes (Optimizer) | $0.01/GB |
| Fastly | AVIF, WebP, JPEG XL | Brotli, GZIP, ZSTD | Yes | $0.08/GB |
For most websites, using a CDN with automatic format negotiation is the simplest path to serving modern image formats. You upload JPEG or PNG originals, and the CDN handles conversion, caching, and content negotiation automatically. This approach requires zero changes to your HTML — the same <img> tag serves different formats to different browsers.
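The negotiation logic a CDN applies is straightforward. The sketch below shows a simplified server-side version: it checks the browser's Accept header for modern formats in order of preference (the function name is ours, and real implementations also honor q-values and cache with `Vary: Accept`):

```python
def pick_image_format(accept_header: str) -> str:
    # Simplified content negotiation: prefer AVIF, then WebP, then the
    # universal JPEG fallback. Ignores q-values for brevity.
    accept = accept_header.lower()
    if "image/avif" in accept:
        return "avif"
    if "image/webp" in accept:
        return "webp"
    return "jpeg"   # every browser can decode JPEG
```

A 2026-era Chrome sending `Accept: image/avif,image/webp,*/*` gets AVIF; an older browser sending only `*/*` gets JPEG — all from one original image.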
Key Finding
CDN-based format negotiation is the lowest-effort path to serving AVIF and WebP. Cloudflare Polish, Vercel next/image, and Cloudinary all handle format conversion automatically from a single source image.
For sites not using these CDNs, the HTML <picture> element with multiple <source> tags provides the same capability with slightly more markup.
Part 9: Encoding & Character Sets
~3,000 words covering encodings and binary-to-text transforms
Character encoding determines how text is stored as bytes. A wrong encoding turns readable text into garbled nonsense (known as "mojibake"). This section explains every major encoding scheme, why UTF-8 won the encoding war, and the binary-to-text encodings that enable binary data in text-only contexts.
Unicode: One Character Set to Rule Them All
Before Unicode, the world had hundreds of incompatible character encodings. Japanese text used Shift_JIS. Russian used KOI8-R. Arabic used ISO 8859-6. Chinese used GB2312. A document written in one encoding would display as garbled nonsense ("mojibake") if opened with a different encoding. International text that mixed scripts was essentially impossible.
Unicode, first published in 1991 and now maintained by the Unicode Consortium, solved this by defining a single character set that includes every character from every writing system in the world. As of Unicode 16.0 (2024), it contains 154,998 characters covering 168 scripts, from modern Latin and CJK to ancient Egyptian hieroglyphics. Unicode also includes 3,790 emoji, mathematical symbols, musical notation, and Braille patterns.
Unicode assigns each character a code point: a number from U+0000 to U+10FFFF. The total code space is 1,114,112 positions, of which about 13% are currently assigned. The space is divided into 17 planes of 65,536 code points each. Plane 0 (Basic Multilingual Plane, BMP) contains the most commonly used characters. Planes 1-16 (Supplementary Planes) contain emoji, rare scripts, CJK extensions, and mathematical symbols.
The critical distinction: Unicode defines which characters exist and what their code points are. It does not define how those code points are stored as bytes. That is the job of encodings: UTF-8, UTF-16, and UTF-32 are three different ways to encode the same Unicode code points as byte sequences.
ASCII: Where It All Began
ASCII (American Standard Code for Information Interchange) was standardized in 1963 and defines 128 characters using 7 bits per character. The first 32 codes (0-31) are control characters (carriage return, line feed, tab, bell). Codes 32-126 are printable: uppercase and lowercase Latin letters, digits 0-9, punctuation, and a few symbols.
ASCII was designed for English and does not support accented characters, non-Latin scripts, or symbols beyond basic punctuation. However, ASCII compatibility is the foundation of all modern encodings: UTF-8, Latin-1, and Windows-1252 are all supersets of ASCII for codes 0-127.
UTF-8: The Universal Encoding
UTF-8 was designed by Ken Thompson and Rob Pike in September 1992 — famously sketched on a placemat at a New Jersey diner. It is a variable-width encoding that represents Unicode code points using 1 to 4 bytes:
1 byte (0xxxxxxx): ASCII characters U+0000 to U+007F. This means all ASCII text is valid UTF-8 without any modification — backward compatibility that was critical for adoption.
2 bytes (110xxxxx 10xxxxxx): Latin, Greek, Cyrillic, Arabic, Hebrew characters U+0080 to U+07FF. Most European and Middle Eastern text uses 1-2 bytes per character.
3 bytes (1110xxxx 10xxxxxx 10xxxxxx): Chinese, Japanese, Korean (CJK) characters, most of the BMP (Basic Multilingual Plane), U+0800 to U+FFFF.
4 bytes (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx): emoji, rare scripts, historical characters, U+10000 to U+10FFFF.
UTF-8 is self-synchronizing: the first byte of each character uniquely identifies the character length, and continuation bytes (10xxxxxx) cannot be confused with start bytes. This means if bytes are lost or corrupted, only the affected characters are damaged; the decoder can resynchronize at the next start byte.
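Both properties — the 1-to-4-byte tiers and self-synchronization — can be verified directly in Python:

```python
# One character from each UTF-8 length tier.
samples = {"a": 1, "é": 2, "中": 3, "😀": 4}
for ch, nbytes in samples.items():
    assert len(ch.encode("utf-8")) == nbytes

# Self-synchronization: every continuation byte matches 10xxxxxx, so a
# decoder can always locate the next character boundary after corruption.
for b in "中".encode("utf-8")[1:]:
    assert 0b10000000 <= b <= 0b10111111
```

Because no continuation byte can be mistaken for a start byte, a decoder dropped into the middle of a UTF-8 stream loses at most one character before resynchronizing.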
As of 2026, 98.2% of all websites use UTF-8. The W3C, WHATWG, and IETF all recommend UTF-8 as the default encoding. There is no longer any valid reason to use any other encoding for new content.
UTF-16: Windows and Java Internals
UTF-16 uses 2 or 4 bytes per character. Characters in the Basic Multilingual Plane (U+0000 to U+FFFF) use 2 bytes. Characters outside the BMP (U+10000 and above, including emoji) use a surrogate pair of two 16-bit code units (4 bytes total).
UTF-16 is used internally by Windows (the NTFS filesystem, Win32 API, and .NET all use UTF-16), Java (String objects are sequences of UTF-16 code units), and JavaScript (string indices are UTF-16 code unit positions, which is why "emoji".length returns unexpected values). It is rarely used for file storage or web content because it is not backward-compatible with ASCII and has byte order issues (big-endian vs little-endian, requiring a BOM).
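The surrogate-pair mechanism can be computed by hand. The sketch below derives the pair for U+1F600 (😀) and shows why string lengths differ between languages:

```python
# Computing the UTF-16 surrogate pair for a non-BMP code point (U+1F600).
cp = 0x1F600
v = cp - 0x10000                  # 20-bit value split across two code units
high = 0xD800 + (v >> 10)         # high (lead) surrogate: top 10 bits
low = 0xDC00 + (v & 0x3FF)        # low (trail) surrogate: bottom 10 bits
assert (high, low) == (0xD83D, 0xDE00)

# Python counts code points; UTF-16 needs two code units (4 bytes) here.
assert len("😀") == 1
assert "😀".encode("utf-16-be") == b"\xd8\x3d\xde\x00"
# In JavaScript, "😀".length is 2 — string indices are UTF-16 code units.
```

This is exactly the "unexpected length" behavior mentioned above: JavaScript and Java count code units, not characters.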
UTF-32: Fixed Width, Maximum Waste
UTF-32 uses exactly 4 bytes for every character, providing a direct mapping between code points and code units. This simplifies certain string operations (random access by code point index is O(1)) but wastes enormous space: ASCII text in UTF-32 is 4x the size of UTF-8. UTF-32 is used internally by some text-processing libraries, and CPython's flexible string representation (PEP 393) falls back to a 4-byte-per-character form for strings containing non-BMP characters, but UTF-32 is essentially never used for file storage or network transmission.
Legacy Encodings: Latin-1, Windows-1252, and Others
Before Unicode, every language or region had its own encoding. Latin-1 (ISO 8859-1) covers Western European languages with 256 characters. Windows-1252 is Microsoft's extension of Latin-1, adding typographic characters (curly quotes, em dash, euro sign) in the 128-159 range where Latin-1 has control characters. This difference causes the common bug where Word documents show 'smart quotes' as garbage characters on non-Windows systems.
Shift_JIS (Japanese), GB2312/GBK/GB18030 (Chinese), EUC-KR (Korean), and KOI8-R (Russian) are legacy encodings that are still encountered in older data. All should be converted to UTF-8 when possible. The Python chardet library and the ICU library can detect encoding automatically for conversion.
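Mojibake is easy to reproduce: decode bytes with the wrong encoding and each multi-byte UTF-8 sequence shatters into separate Latin-1 characters:

```python
# Classic mojibake: UTF-8 bytes misread as Latin-1. The two bytes of "é"
# (0xC3 0xA9) decode as two separate Latin-1 characters, "Ã" and "©".
raw = "café".encode("utf-8")
assert raw.decode("latin-1") == "cafÃ©"

# The reverse mistake (legacy single-byte data read as UTF-8) typically
# raises UnicodeDecodeError rather than producing readable garbage.
```

Seeing "Ã©", "â€™", or similar sequences in text is a near-certain sign that UTF-8 bytes were decoded as Latin-1 or Windows-1252 somewhere in the pipeline.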
Base64: Binary to Text
Base64 encoding converts arbitrary binary data into a text representation using 64 ASCII characters: A-Z (0-25), a-z (26-51), 0-9 (52-61), + (62), / (63), with = for padding. Every 3 bytes of input produce 4 characters of output, resulting in a 33% size increase.
Base64 is used for: email attachments (MIME encoding), embedding images in CSS/HTML (data: URIs), storing binary data in JSON or XML, HTTP Basic Authentication (encoding username:password), and JWT (JSON Web Token) payloads. Base64url is a variant that replaces + with - and / with _, making it safe for URLs.
Important: Base64 is encoding, not encryption. It provides zero security — anyone can decode it. Never use Base64 to "protect" sensitive data.
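Python's standard library implements all of this. The sketch below shows the 3-bytes-in, 4-characters-out mapping, padding, and the URL-safe variant:

```python
import base64

# 3 input bytes -> 4 output characters (~33% overhead).
assert base64.b64encode(b"Man") == b"TWFu"
# Inputs not divisible by 3 are padded with "=".
assert base64.b64encode(b"Ma") == b"TWE="
assert base64.b64decode("TWFu") == b"Man"

# Base64url swaps "+" and "/" for "-" and "_" (URL- and filename-safe).
assert base64.urlsafe_b64encode(b"\xfb\xff") == b"-_8="
```

And the security caveat holds by construction: `base64.b64decode` reverses the encoding with no key involved, so Base64 hides nothing.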
URL Encoding: Percent Encoding
URLs can only contain a subset of ASCII characters: letters, digits, and a few symbols (- _ . ~). All other characters must be percent-encoded: the character's UTF-8 bytes are represented as %XX where XX is the hexadecimal byte value. Space becomes %20 (or + in form data), the euro sign becomes %E2%82%AC (three UTF-8 bytes), and emoji become even longer sequences.
URL encoding is defined in RFC 3986. Common mistakes: encoding the entire URL (only the path and query parameters should be encoded, not the scheme, host, or port), double-encoding (encoding already-encoded characters), and forgetting that + means space only in application/x-www-form-urlencoded data (in regular URLs, + is literal).
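`urllib.parse` demonstrates the rules — including the space-handling and `safe` pitfalls just described:

```python
from urllib.parse import quote, quote_plus, unquote

assert quote("€") == "%E2%82%AC"    # three UTF-8 bytes, percent-encoded
assert quote("a b") == "a%20b"      # space in a path segment
assert quote_plus("a b") == "a+b"   # space in form data only
assert unquote("%E2%82%AC") == "€"

# quote() leaves "/" alone by default (safe="/"); pass safe="" when the
# value is a single path component that must not be split.
assert quote("a/b", safe="") == "a%2Fb"
```

Note that `quote_plus` output must only be used in `application/x-www-form-urlencoded` contexts; in a plain URL path, "+" is a literal plus sign.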
Encoding Comparison
Character Encodings Compared
9 rows
| Encoding | Year | Bits/Char | Max Characters | ASCII Compatible | Self-Sync | Web Usage | Coverage | Best For |
|---|---|---|---|---|---|---|---|---|
| ASCII | 1963 | 7 | 128 | N/A | Yes | 1.2% | English only | Legacy systems, protocols |
| UTF-8 | 1993 | 8-32 | 1112064 | ASCII | Yes | 98.2% | All Unicode | Web, files, everything |
| UTF-16 | 1996 | 16-32 | 1112064 | No | Partial | 0.01% | All Unicode | Windows internals, Java |
| UTF-32 | 2000 | 32 | 1112064 | No | Yes | <0.01% | All Unicode | Fixed-width processing |
| Latin-1 (ISO 8859-1) | 1987 | 8 | 256 | ASCII | Yes | 0.6% | Western European | Legacy Western content |
| Windows-1252 | 1985 | 8 | 256 | ASCII (mostly) | Yes | 0.1% | Western European+ | Legacy Windows apps |
| Shift_JIS | 1982 | 8-16 | 7000 | ASCII (mostly) | No | <0.1% | Japanese | Legacy Japanese systems |
| GB18030 | 2000 | 8-32 | 1112064 | ASCII, GBK | Yes | <0.1% | All Unicode + CJK | Chinese government standard |
| EUC-KR | 1991 | 8-16 | 17000 | ASCII | No | <0.1% | Korean | Legacy Korean systems |
Key Finding
UTF-8 is used by 98.2% of all websites and should be the only encoding used for new content. It supports all 149,000+ Unicode characters while maintaining full backward compatibility with ASCII.
Always specify encoding explicitly: <meta charset='utf-8'> in HTML, Content-Type: text/plain; charset=utf-8 in HTTP headers, and UTF-8 as the default in your editor and database.
Character Encoding Adoption Over Time
The web's transition to UTF-8 is one of the most successful standardization efforts in technology history. In 2010, only 50% of websites used UTF-8. By 2026, that figure has reached 98.2%. The remaining holdouts are primarily legacy content in regional encodings (Latin-1, Windows-1252, Shift_JIS). For any new content, there is zero reason to use anything other than UTF-8.
Character Encoding Usage on the Web (%)
Source: OnlineTools4Free Research
File Format Detection: Magic Bytes
Operating systems and web servers often determine file types by examining the first few bytes of a file (called "magic bytes" or "file signatures") rather than relying on file extensions, which can be easily changed. The table below shows the signature bytes for common formats. Knowing these signatures is useful for debugging content-type issues, building file upload validators, and understanding why renaming a .txt file to .jpg does not make it an image.
File Format Magic Bytes (Signatures)
25 rows
| Format | Hex Bytes | ASCII |
|---|---|---|
| JPEG | FF D8 FF | N/A |
| PNG | 89 50 4E 47 0D 0A 1A 0A | .PNG.... |
| GIF87a | 47 49 46 38 37 61 | GIF87a |
| GIF89a | 47 49 46 38 39 61 | GIF89a |
| WebP | 52 49 46 46 .. .. .. .. 57 45 42 50 | RIFF....WEBP |
| AVIF | .. .. .. .. 66 74 79 70 61 76 69 66 | ....ftypavif |
| HEIC | .. .. .. .. 66 74 79 70 68 65 69 63 | ....ftypheic |
| BMP | 42 4D | BM |
| TIFF (LE) | 49 49 2A 00 | II*. |
| TIFF (BE) | 4D 4D 00 2A | MM.* |
| PDF | 25 50 44 46 | %PDF |
| ZIP | 50 4B 03 04 | PK.. |
| GZIP | 1F 8B | N/A |
| 7z | 37 7A BC AF 27 1C | 7z.... |
| RAR5 | 52 61 72 21 1A 07 01 00 | Rar!.... |
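A file-upload validator based on these signatures takes only a few lines. The sketch below checks leading bytes against entries from the table above (the function and table names are ours; formats like WebP need an offset check because the signature is split across the RIFF header):

```python
# Minimal magic-byte sniffer using signatures from the table above.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip"),   # also DOCX, EPUB, JAR, 3MF...
    (b"\x1f\x8b", "gzip"),
]

def sniff(data: bytes):
    for sig, name in SIGNATURES:
        if data.startswith(sig):
            return name
    # WebP: RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

Note the ZIP comment: many modern formats (DOCX, EPUB, 3MF) are ZIP containers, so signature sniffing identifies the container, not the specific document type — a validator must also inspect the archive contents.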
Common MIME Types Reference
MIME types (also called media types or content types) are sent in HTTP Content-Type headers to tell the browser how to handle a response. Using the wrong MIME type causes rendering failures: a CSS file served as text/html will not be applied; a WOFF2 font served without the correct type may be blocked by CORS. The table below lists the most commonly used MIME types.
Common MIME Types
30 rows
| Extension | MIME Type | Category |
|---|---|---|
| .html | text/html | Document |
| .css | text/css | Document |
| .js | text/javascript | Document |
| .json | application/json | Data |
| .xml | application/xml | Data |
| .csv | text/csv | Data |
| .jpg | image/jpeg | Image |
| .png | image/png | Image |
| .gif | image/gif | Image |
| .webp | image/webp | Image |
| .avif | image/avif | Image |
| .svg | image/svg+xml | Image |
| .jxl | image/jxl | Image |
| .ico | image/x-icon | Image |
| .mp4 | video/mp4 | Video |
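Python ships a MIME-type lookup table in the `mimetypes` module, which covers the common extensions above. Newer formats may be absent from the built-in table on older Python versions, so a sketch of registering one explicitly:

```python
import mimetypes

# Extension -> MIME type lookup (returns a (type, encoding) tuple).
assert mimetypes.guess_type("photo.jpg") == ("image/jpeg", None)
assert mimetypes.guess_type("data.json")[0] == "application/json"

# Newer types such as image/avif may be missing on older Pythons;
# add_type() registers them for the current process.
mimetypes.add_type("image/avif", ".avif")
assert mimetypes.guess_type("img.avif")[0] == "image/avif"
```

On the server side, the same mapping should drive the `Content-Type` header; serving a correct type is what prevents the CSS- and font-blocking failures described above.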
Part 10: Glossary of File Format Terms
80 terms defined with context and related tools
This glossary defines 80 essential terms used throughout this guide and in file format discussions generally. Each term includes a 2-3 sentence definition and, where applicable, a link to a related tool. Terms are organized alphabetically within categories.
The categories are: 3D, Audio, Audio/Image, Audio/Video, Compression, Data, Document, Encoding, Font, General, Image, Image/Video, Video, and Web.
Part 11: Frequently Asked Questions
30 questions answered in detail
These are the most common questions about file formats, drawn from search data, forums, and reader submissions. Each answer is concise but complete, providing actionable guidance rather than vague generalities.
What is the best image format for the web in 2026?
AVIF is the best overall image format for the web in 2026, offering 50% smaller files than JPEG with better quality. It has 95% browser support. Use WebP as a fallback for the remaining 5%. For transparency and simple graphics, WebP or AVIF with alpha channels are ideal. PNG remains necessary for pixel-perfect lossless images.
Should I use WebP or AVIF?
Use AVIF as your primary format for photographs and complex images — it offers 20% better compression than WebP. Use WebP as a fallback for browsers that do not support AVIF. WebP is still the safer choice if you can only serve one modern format, since it has 98% browser support vs 95% for AVIF. For animations, animated WebP is more widely supported than animated AVIF.
Why did Chrome drop JPEG XL support?
Google announced the removal in October 2022 and shipped Chrome 110 without JPEG XL support in early 2023, citing insufficient interest from the web ecosystem and a preference to focus on AVIF and WebP. The decision was controversial because JPEG XL offers unique features like lossless JPEG recompression and progressive decoding. Safari still supports JPEG XL, and there is ongoing community pressure to reverse the decision.
What is the difference between a codec and a container?
A codec (H.264, AV1, AAC) is the algorithm that encodes and decodes media data — it determines the compression method and quality. A container (MP4, MKV, WebM) is the file format that wraps encoded streams together — it determines how video, audio, and subtitles are packaged. An MP4 container can hold H.264 video with AAC audio, or H.265 video with AC-3 audio.
Is MP3 dead?
MP3 is not dead but it is technically obsolete. All MP3 patents expired in 2017, making it royalty-free. However, AAC offers better quality at the same bitrate, and Opus is superior to both. MP3 remains widely used due to universal compatibility. For new projects, use Opus (best quality-per-bit) or AAC (widest ecosystem support). For archival, use FLAC (lossless).
What is the best audio format for quality?
For maximum quality, use a lossless format: FLAC (widely supported, open source), ALAC (Apple ecosystem), or WAV/AIFF (uncompressed, largest files). For lossy audio, Opus at 128+ kbps is transparent (indistinguishable from lossless) for most listeners. For streaming, AAC at 256 kbps or Opus at 128 kbps are both excellent choices.
JSON vs YAML: which should I use?
Use JSON for APIs, data exchange, and machine-to-machine communication — it is the universal standard, parsed faster, and unambiguous. Use YAML for configuration files that humans edit frequently — it supports comments, is more readable, and requires less punctuation. YAML is the standard for Kubernetes, CI/CD pipelines, and many DevOps tools.
What is the best compression format?
It depends on your priority. For best compression ratio: 7z with LZMA2. For fastest compression: LZ4 or Zstandard. For universal compatibility: ZIP. For web content: Brotli (static) or Zstandard (dynamic). For Linux packages: XZ or Zstandard. Zstandard (ZSTD) is the best all-around choice in 2026, offering near-LZMA ratios at 50x the speed.
How do I choose a video format for my website?
Use MP4 with H.264 for maximum compatibility (100% browser support). For better compression, serve WebM with VP9 or AV1 to supporting browsers. Use the HTML <video> element with multiple <source> tags to offer AV1 first, VP9 second, and H.264 as fallback. For live streaming, use HLS (HTTP Live Streaming) with H.264 segments.
What is UTF-8 and why should I care?
UTF-8 is the dominant text encoding for the web (98.2% of websites). It can represent every character in Unicode (149,000+ characters including all languages, symbols, and emoji) while remaining backward-compatible with ASCII. Use UTF-8 for everything: HTML, CSS, JavaScript, JSON, databases. Specify it explicitly: <meta charset="utf-8"> in HTML, and "encoding": "utf-8" in your editor.
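UTF-8's variable-width design is easy to check in Node.js, where Buffer.byteLength reports the encoded size of a string. A minimal sketch:

```javascript
// Node.js: UTF-8 is variable-width -- ASCII stays 1 byte, emoji take 4
console.log(Buffer.byteLength('A', 'utf8'));  // 1 (ASCII range, backward compatible)
console.log(Buffer.byteLength('é', 'utf8'));  // 2 (Latin-1 supplement)
console.log(Buffer.byteLength('日', 'utf8')); // 3 (CJK)
console.log(Buffer.byteLength('😀', 'utf8')); // 4 (emoji, outside the Basic Multilingual Plane)
```

This is why UTF-8 never penalizes plain English text: ASCII content stays exactly the same size it was before Unicode.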
PDF vs DOCX: when should I use each?
Use PDF when the document layout must be preserved exactly (contracts, reports, printed materials). Use DOCX when the document needs to be edited by others (collaboration, templates, drafts). PDF is read-only by design; DOCX is editable by design. For archival, use PDF/A. For e-books, use EPUB instead of either.
What font format should I use on the web?
Use WOFF2 as your primary web font format — it uses Brotli compression and is supported by 98% of browsers. Include WOFF as a fallback for older browsers. Never serve raw TTF or OTF on the web — they are uncompressed and significantly larger. Use the @font-face CSS rule with format hints: format("woff2") and format("woff").
What is the difference between lossy and lossless compression?
Lossy compression permanently removes data to achieve higher compression ratios (10:1 to 50:1). JPEG, MP3, and H.264 are lossy. Lossless compression preserves all original data and achieves lower ratios (2:1 to 4:1). PNG, FLAC, and ZIP are lossless. Use lossy for distribution (smaller files) and lossless for archival/editing (preserves quality).
Is SVG better than PNG for icons?
Yes, SVG is almost always better than PNG for icons, logos, and simple graphics. SVG files are resolution-independent (sharp at any size), typically smaller (a simple icon might be 1 KB as SVG vs 5 KB as PNG), styleable with CSS, animatable, and accessible. Use PNG only when the graphic is too complex for vector representation (photographs, textures).
What is HEIC and can I use it on the web?
HEIC (High Efficiency Image Container) uses HEVC compression and is the default photo format on iPhones since iOS 11. It offers 50% smaller files than JPEG with similar quality. However, web support is limited to Safari (~21% browser support). Convert HEIC to WebP or AVIF for web use. HEIC also has patent licensing issues that limit adoption.
CSV vs JSON for data exchange?
Use CSV for simple tabular data (spreadsheets, databases, data science). CSV is universally supported, smaller, and faster to parse. Use JSON for structured/nested data (APIs, configurations, complex objects). JSON supports types, nesting, and arrays. For big data analytics, consider Parquet (columnar, compressed, typed) over both CSV and JSON.
What is Brotli and should I use it?
Brotli is a compression algorithm by Google that achieves 15-25% better compression than GZIP with similar decompression speed. It is supported by all modern browsers for HTTP Content-Encoding. Use Brotli for static assets (CSS, JS, HTML) where you can pre-compress at high quality. For dynamic content, GZIP or Zstandard may be faster. Configure your CDN to serve Brotli with fallback to GZIP.
What is the best format for 3D models on the web?
glTF 2.0 (GL Transmission Format) is the standard for 3D on the web, often called "the JPEG of 3D." Use GLB (binary glTF) for single-file distribution. glTF supports PBR materials, animations, and Draco/meshopt compression. It is supported by Three.js, Babylon.js, Unity, Unreal, Blender, and all major 3D tools.
How do I reduce the size of a PDF?
Reduce PDF size by: (1) compressing embedded images (JPEG quality 75-85), (2) subsetting fonts (include only used characters), (3) removing metadata and unused objects, (4) using PDF optimization tools (Ghostscript, qpdf), (5) avoiding high-resolution images when not needed for print. A typical 10 MB PDF can often be reduced to 1-3 MB without visible quality loss.
What is the difference between AVI and MP4?
Both are container formats, but MP4 is far superior for modern use. MP4 supports modern codecs (H.264, H.265, AV1), streaming, subtitles, chapters, and metadata. AVI is a 1992-era format limited to older codecs, with no native streaming support. AVI files are typically larger due to outdated codecs. Always use MP4 for new video projects.
Can I convert a lossy format back to lossless?
No. Converting a lossy file (JPEG, MP3) to a lossless format (PNG, FLAC) preserves the current quality but does not restore lost data. The resulting file will be larger without any quality improvement. This is a common misconception. Always keep your original lossless files and create lossy versions from them for distribution.
What is the maximum file size for different formats?
JPEG: ~4 GB, PNG: limited by memory (~2^31 pixels), WebP: 16,383 x 16,383 pixels, AVIF: 65,536 x 65,536 pixels, ZIP: 16 exabytes (ZIP64), TAR (ustar): 8 GB per file, MP4: ~8 EB, PDF: ~10 GB (practical). File system limits also apply: FAT32 max is 4 GB, NTFS is 16 EB, ext4 is 16 TB.
What is Base64 and when should I use it?
Base64 encodes binary data as ASCII text, increasing size by ~33%. Use it for: (1) embedding small images in CSS/HTML (data URIs, under 1-2 KB), (2) sending binary data in JSON/XML APIs, (3) email attachments (MIME encoding), (4) storing binary in text-only systems. Avoid for large files — the 33% overhead makes it inefficient. Use multipart/form-data for file uploads instead.
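The 33% figure comes straight from the encoding: every 3 input bytes become 4 output characters. A quick Node.js check:

```javascript
// Node.js: Base64 maps 3 bytes -> 4 ASCII characters (~33% larger)
const binary = Buffer.alloc(3000, 0xab); // 3,000 arbitrary bytes
const encoded = binary.toString('base64');

console.log(encoded.length);                                // 4000: exactly 4/3 of the input
console.log(Buffer.from(encoded, 'base64').equals(binary)); // true: fully reversible
```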
What is the difference between RGB and CMYK?
RGB (Red, Green, Blue) is an additive color model for screens — combining all three at full intensity produces white. CMYK (Cyan, Magenta, Yellow, Key/black) is a subtractive model for print — combining all four produces black. Screen content should use RGB; print content should use CMYK. Converting between them can shift colors because CMYK has a smaller gamut.
How do variable fonts work?
A variable font contains a single outline for each glyph plus mathematical instructions to interpolate between design extremes (called axes). Common axes: weight (100-900), width (condensed-expanded), slant (-12 to 0 degrees), optical size. A single variable font file replaces 10-20 static font files, reducing page weight by 70-90%. CSS: font-variation-settings or font-weight: 100-900.
What is the safest format for long-term archival?
For documents: PDF/A (ISO 19005) — self-contained, no external dependencies. For images: TIFF (uncompressed or lossless) or PNG. For audio: WAV or FLAC. For video: MKV with FFV1 codec (lossless, open). For data: CSV or JSON (plain text, human-readable). Avoid proprietary formats (PSD, DOC, WMA) that may become unsupported. Store with checksums (SHA-256) for integrity verification.
Why are my WebP images larger than JPEG?
WebP can produce larger files than JPEG in specific cases: (1) quality setting too high (WebP quality 100 is wasteful), (2) images with few colors or flat areas where JPEG excels, (3) very small images where header overhead matters. For best results, use WebP quality 75-85 (not 100). At equivalent perceptual quality, WebP should be 25-34% smaller than JPEG for photographs.
What is AV1 and why does it matter?
AV1 is an open, royalty-free video codec created by the Alliance for Open Media (Google, Apple, Netflix, Amazon, etc.). It delivers 50% better compression than H.264 and 20% better than H.265, with no patent royalties. YouTube and Netflix use AV1 for streaming. Hardware support is growing (MediaTek Dimensity, Intel Arc, NVIDIA RTX 40-series). AV1 is the future of video.
How do I choose between ZIP, 7z, and RAR?
Use ZIP for sharing with anyone — it is natively supported by every OS. Use 7z for maximum compression of large files — LZMA2 compresses 20-40% better than DEFLATE. Use RAR only if you need recovery records (for protecting large archives on unreliable media). Avoid RAR for sharing because it requires proprietary software. For modern use, Zstandard (.tar.zst) is an excellent alternative for Linux/developers.
Part 12: Recommendations & Decision Trees
Format selection guides for every use case
After 50,000+ words of analysis, here are the practical recommendations. Use these decision trees to choose the right format for any situation. Each tree walks you through a series of questions to reach the optimal format.
Image Format Quick Reference Guide
The table below provides a one-stop reference for choosing the right image format for any situation. For each common use case, we recommend a primary format, a fallback, and the format to avoid — with the reasoning behind each recommendation.
Image Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Hero image (photograph) | AVIF | WebP | PNG, BMP | Best compression for photos |
| Product photo (e-commerce) | AVIF | WebP | PNG (unless cutout) | Smallest size, good quality |
| Logo / icon | SVG | PNG-8 | JPEG (no transparency) | Resolution-independent |
| Screenshot / UI mockup | WebP lossless | PNG | JPEG (text artifacts) | Sharp text preservation |
| Animated content | MP4 video | Animated WebP | GIF (10-30x larger) | Dramatically smaller |
| Thumbnail (< 100px) | WebP | JPEG | PNG | WebP has lower header overhead |
| Print (300 DPI) | TIFF | PNG | JPEG (artifacts visible) | Lossless, CMYK support |
| Email attachment | JPEG | PNG | WebP (limited client support) | Universal email client support |
| Social media post | JPEG (platform converts) | PNG | HEIC (only Apple) | Platforms re-encode anyway |
| Favicon | SVG + ICO fallback | PNG 32x32 | JPEG | SVG scales perfectly |
| Chart / infographic | SVG | PNG | JPEG (text artifacts) | Sharp lines, small size |
| Medical imaging | DICOM | JPEG 2000 | Lossy JPEG | Diagnostic accuracy critical |
| Satellite / aerial | JPEG 2000 | TIFF + JPEG XL | PNG (enormous) | Wavelet compression for large images |
| Archival preservation | TIFF (uncompressed) | PNG | Lossy formats | No generation loss ever |
Video Format Quick Reference Guide
Video Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Web streaming (general) | AV1 in MP4 | H.264 in MP4 | AVI, FLV | Best compression, royalty-free |
| Live streaming | H.264 via HLS/DASH | VP9 via WebRTC | AV1 (encoding too slow) | Real-time encoding required |
| Video editing timeline | ProRes in MOV | DNxHR in MXF | H.264/H.265 (decode lag) | Intra-frame for instant seeking |
| Screen recording | H.264 CRF 18 | VP9 | Uncompressed | Sharp text, manageable size |
| Mobile upload | H.264 in MP4 | HEVC in MP4 | AV1 (slow to encode on device) | Hardware encode available everywhere |
| Archival | FFV1 in MKV | ProRes 4444 | Lossy codecs | Lossless, open-source |
| Social media post | H.264 in MP4 | N/A | MKV, WebM (platforms reject) | All platforms accept H.264 MP4 |
| 4K HDR content | AV1 with HDR10 | H.265 with Dolby Vision | H.264 (no HDR) | AV1 is royalty-free, great quality |
Audio Format Quick Reference Guide
Audio Format Recommendation by Use Case
| Use Case | Primary Format | Fallback | Avoid | Reason |
|---|---|---|---|---|
| Music streaming (web) | Opus 128kbps | AAC 256kbps | MP3 (inferior quality) | Transparent at 128kbps |
| Podcast distribution | MP3 128kbps | AAC 128kbps | FLAC (unnecessary for speech) | Universal player support |
| VoIP / video call | Opus 32-64kbps | N/A | MP3, AAC (latency too high) | Frames as short as 2.5 ms, adaptive bitrate |
| Music archival | FLAC | ALAC (Apple) | MP3/AAC (lossy) | Bit-perfect preservation |
| Game audio | Opus | Vorbis | WAV (file size) | Low CPU, small files |
| Studio recording | WAV 24-bit/48kHz | AIFF | Lossy formats | No processing artifacts |
| Ringtone / notification | AAC .m4a | MP3 | FLAC | Small size, wide device support |
| Audiobook | AAC 64kbps | MP3 64kbps | Lossless (speech doesn't need it) | Speech compresses extremely well |
Image Format Decision Tree
Is it a photograph?
Yes: Is web browser support critical? → Yes: AVIF with WebP fallback → No: JPEG XL
Does it need transparency?
Yes: Is it a simple graphic? → Yes: SVG → No: WebP or AVIF with alpha
Is it an animation?
Yes: Short/simple: WebP animated → Complex: MP4 video instead of GIF
Is it for print?
Yes: TIFF (300 DPI, CMYK)
Is it an icon or logo?
Yes: SVG (vector)
Is pixel-perfect fidelity required?
Yes: PNG (lossless)
No: AVIF (best compression) or WebP (widest support)
Video Format Decision Tree
Is it for web streaming?
Yes: AV1 with H.264 fallback in MP4 container
Is it for editing/post-production?
Yes: ProRes in MOV or DNxHR in MXF
Do you need multiple audio/subtitle tracks?
Yes: MKV container
Is universal device playback needed?
Yes: H.264 in MP4 (baseline profile)
Is it for archival?
Yes: FFV1 in MKV (lossless, open)
No: H.265 in MP4 (good balance)
Audio Format Decision Tree
Is lossless quality required?
Yes: Apple ecosystem? → ALAC → Otherwise: FLAC
Is it for web/VoIP/streaming?
Yes: Opus (best quality per bit)
Is it for Apple ecosystem?
Yes: AAC at 256 kbps
Is universal compatibility critical?
Yes: MP3 at 320 kbps CBR or V0 VBR
No: Opus at 128 kbps
Data Format Decision Tree
Is it for a web API?
Yes: JSON (universal standard)
Is it configuration that humans edit?
Yes: YAML (readable, comments) or TOML (typed)
Is it tabular/spreadsheet data?
Yes: CSV (universal) or Parquet (analytics)
Is performance critical (high throughput)?
Yes: Protocol Buffers or MessagePack
Is it for big data pipelines?
Yes: Parquet (columnar) or Avro (row-based)
No: JSON (safe default)
Compression Format Decision Tree
Sharing with non-technical users?
Yes: ZIP (universal)
Maximum compression needed?
Yes: 7z with LZMA2
Speed is the priority?
Yes: LZ4 or Zstandard at low level
HTTP content compression?
Yes: Brotli (static) or ZSTD (dynamic)
Linux/Unix system?
Yes: tar.zst (Zstandard) or tar.xz
No: ZIP or 7z
Quick Reference: Best Format by Situation
Web Images
AVIF (primary) + WebP (fallback)
Web Icons
SVG (inline or external)
Web Fonts
WOFF2 (primary) + WOFF (fallback)
Web Video
AV1 in MP4 + H.264 fallback
Web Audio
Opus in WebM or OGG
API Data
JSON (REST) or Protobuf (gRPC)
Configuration
YAML or TOML
File Sharing
ZIP (universal)
HTTP Compression
Brotli (static) / ZSTD (dynamic)
Text Encoding
UTF-8 (always)
Documents (final)
PDF or PDF/A (archival)
Documents (editable)
DOCX or Markdown
3D for Web/AR
glTF/GLB
3D Printing
STL or 3MF
Photo Archival
TIFF (16-bit) or PNG
Audio Archival
FLAC (lossless)
Format History: Complete Timelines
70+ milestones across image, video, and audio format history
Understanding format history explains why we have the formats we have today and why certain formats persist despite being technically inferior. Patent disputes drove the creation of PNG. Apple's ecosystem choices made HEIC widespread. Google's market power pushed WebP and VP9 to adoption. The following timelines document every significant event in format evolution.
Image Format Timeline (1985-2026)
The image format landscape has gone through three eras: the pre-web era (BMP, TIFF, GIF, 1985-1991), the web era (JPEG, PNG, SVG, 1992-2009), and the next-generation era (WebP, HEIC, AVIF, JPEG XL, 2010-present). Each new format was created to solve specific limitations of its predecessors. The pace of innovation accelerated dramatically after 2010 as bandwidth constraints and mobile devices created urgent demand for better compression.
ICO format introduced by Microsoft for Windows 1.0 icons
BMP introduced by Microsoft/IBM for Windows and OS/2
TIFF 1.0 released by Aldus Corporation for desktop publishing scanners
GIF 87a released by CompuServe — becomes the standard image format of early online services
GIF 89a adds animation, transparency, and text overlay support
JPEG (ISO 10918-1) standardized — revolutionizes digital photography
Unisys begins enforcing LZW patent used by GIF — sparks PNG creation
PNG 1.0 (W3C) released as patent-free GIF alternative with 24-bit color
SVG 1.0 specification published by W3C for vector graphics on the web
JPEG 2000 (ISO 15444-1) standardized with wavelet compression
PNG 1.2 specification published — adds international text and gamma
LZW patents expire worldwide — GIF becomes truly free
APNG specification published — animated PNG with full alpha support
Google releases WebP based on VP8 intra-frame coding
WebP gains lossless mode and alpha channel support
Fabrice Bellard creates BPG (Better Portable Graphics) based on H.265
HEIF/HEIC standardized (ISO/IEC 23008-12) using HEVC compression
Apple adopts HEIC as default iPhone photo format (iOS 11)
AV1 codec finalized — AVIF image format based on it begins development
AVIF 1.0 specification published by Alliance for Open Media
Chrome 85 adds AVIF support; Safari 14 adds WebP support
WebP reaches practical universality with all major browsers supporting it
JPEG XL (ISO/IEC 18181) Part 1 published — designed as universal replacement
Safari 16 adds AVIF support; JPEG XL support follows in Safari 17
Chrome removes JPEG XL flag (Chrome 110) — controversial decision
AVIF reaches 90% global browser support; WebP at 97%
AVIF reaches 93% support; JPEG XL remains Safari/Firefox-flag only
AVIF at 95%, WebP at 98%; AVIF+WebP covers 99.5% of users
Video Codec Timeline (1993-2026)
Video codec history is dominated by two forces: the MPEG standardization process (which produced H.261, H.262/MPEG-2, H.264, H.265, H.266) and the royalty-free movement (VP8, VP9, AV1). The patent complexity of H.265 was the catalyst that created the Alliance for Open Media and AV1. Hardware decoder support, not just software, determines real-world adoption.
MPEG-1 (VCD) — first practical video compression standard (1.5 Mbps)
MPEG-2 (DVD, broadcast) — basis for digital TV worldwide
DivX/XviD bring MPEG-4 Part 2 to consumer video sharing
H.264/AVC standardized — 2x improvement over MPEG-2; enables HD streaming
YouTube launches using Flash Video (FLV) with Sorenson Spark codec
Apple ProRes introduced for professional video editing workflows
Google acquires On2 Technologies — VP8 codec becomes the basis for WebM
WebM container and VP8 codec released as open/royalty-free alternatives
YouTube begins VP8 encoding for WebM playback in Chrome/Firefox
H.265/HEVC standardized — 50% improvement over H.264 but patent chaos begins
VP9 released by Google — royalty-free competitor to H.265
Alliance for Open Media (AOM) founded to develop AV1 codec
AV1 bitstream specification frozen — royalty-free, 50% better than H.264
Netflix begins AV1 streaming on Android devices
YouTube begins serving AV1 to capable devices; hardware decode chips ship
H.266/VVC standardized — 50% improvement over H.265, patent pools forming
Intel SVT-AV1 encoder reaches production quality — practical real-time encoding
NVIDIA RTX 40-series ships with hardware AV1 encoding (NVENC)
AMD, Intel, NVIDIA all support AV1 hardware encoding in consumer GPUs
AV1 becomes most-watched codec on YouTube by bitrate-hours
SVT-AV1 2.0 released — 30% faster encoding with quality improvements
AV2 development begins at AOM; H.266/VVC hardware decoders shipping
Audio Format Timeline (1988-2026)
Audio format history is inextricable from the music industry's digitization. MP3 enabled Napster and the iPod. AAC enabled iTunes. FLAC and Opus represent the current endpoint: one lossless and one lossy format that are both open-source, royalty-free, and technically superior to all proprietary alternatives. The streaming era (Spotify, Apple Music) has made format choice a backend decision invisible to most consumers.
AIFF released by Apple for professional audio on Macintosh
WAV format introduced by Microsoft/IBM for Windows audio
MP3 (MPEG-1 Layer III) standardized — enables digital music revolution
AAC standardized (MPEG-2 Part 7) — designed as MP3 successor
Napster launches — MP3 file sharing transforms music industry
WMA released by Microsoft to compete with MP3 and AAC
Ogg Vorbis released — first royalty-free alternative to MP3
FLAC released — open-source lossless audio codec
Apple launches iPod with MP3 and AAC support
iTunes Music Store launches — AAC becomes mainstream
Apple releases ALAC — proprietary lossless audio for iPod/iTunes
iTunes goes DRM-free — all music sold as 256 kbps AAC
Apple open-sources ALAC codec — becomes royalty-free
Opus standardized (RFC 6716) — best lossy codec at every bitrate
Opus adopted by WebRTC — becomes default VoIP codec in browsers
Tidal launches with FLAC lossless streaming (HiFi tier)
All MP3 patents expire — format becomes fully royalty-free worldwide
Amazon Music HD launches with FLAC up to 24-bit/192 kHz
Apple Music adds lossless (ALAC) and Spatial Audio (Atmos) at no extra cost
Spotify announces HiFi lossless tier (repeatedly delayed)
YouTube Music adds 256 kbps AAC — up from 128 kbps on free tier
Opus 1.5 released with improved speech quality and ML-based enhancement
FLAC supported natively by 92% of browsers; Opus by 97%
Key Finding
The single most important trend in format history is the shift from proprietary, patent-encumbered formats to open, royalty-free alternatives. AV1 (video), Opus (audio), AVIF (image), and FLAC (lossless audio) are all open-source and free of royalties.
This trend was driven by patent licensing complexity (H.265 has three separate patent pools) and the market power of AOM members (Google, Apple, Netflix, Amazon, Microsoft).
JPEG XL Lossless Recompression: The 20% Solution
How JPEG XL can save petabytes without touching quality
JPEG XL's most unique feature is lossless JPEG recompression: it can take any existing JPEG file and recompress it to approximately 20% smaller while preserving the exact decoded pixel values. The original JPEG can be perfectly reconstructed from the JPEG XL file, byte for byte. This is not re-encoding; it is a mathematically lossless transformation of the JPEG bitstream.
The implications are enormous. There are hundreds of billions of JPEG files on the internet and in archives worldwide. If every JPEG were converted to JPEG XL lossless, it would save approximately 20% of the total storage — potentially petabytes of data — without any quality change. The transformation is bidirectional: you can convert back to the original JPEG at any time.
JPEG XL Lossless Recompression Savings by Image Category
| Image Category | JPEG (KB) | JXL (KB) | Savings % |
|---|---|---|---|
| DSLR photos (12-24 MP) | 4200 | 3360 | 20 |
| Smartphone photos (12 MP) | 2800 | 2268 | 19 |
| Web thumbnails (200x200) | 15 | 12 | 18.5 |
| Social media (1080x1080) | 180 | 144 | 20.2 |
| Medical scans (CT/MRI) | 850 | 663 | 22 |
| Satellite imagery | 6500 | 5135 | 21 |
| Scanned documents | 320 | 250 | 21.8 |
| Average across all categories | 2124 | 1690 | 20.4 |
Medical scans show the highest savings (22%) because medical JPEGs tend to use higher quality settings and contain structured patterns that JPEG XL's entropy coder handles more efficiently than JPEG's Huffman coding. Web thumbnails show the lowest savings (18.5%) because their already-small size limits the improvement from better entropy coding.
PDF Version History and Feature Matrix
From PDF 1.0 (1993) to PDF 2.0 (2020)
PDF has evolved through nine major versions over 27 years. Each version added significant capabilities while maintaining backward compatibility. Understanding which version a PDF was created in helps explain which features it supports and which tools can open it reliably.
PDF Version Feature Matrix
| Version | Year | Key Features Added | Encryption |
|---|---|---|---|
| PDF 1.0 | 1993 | Basic text, images, links | None |
| PDF 1.1 | 1994 | Device-independent color | 40-bit RC4 |
| PDF 1.2 | 1996 | Interactive forms, Unicode | 40-bit RC4 |
| PDF 1.3 | 2000 | JavaScript, digital signatures | 40-bit RC4 |
| PDF 1.4 | 2001 | Transparency, JBIG2 | 128-bit RC4 |
| PDF 1.5 | 2003 | Object streams, JPEG 2000 | 128-bit RC4 |
| PDF 1.6 | 2004 | OpenType font embedding, AES | AES-128 |
| PDF 1.7 | 2008 | ISO 32000-1, 3D annotations | AES-128 |
| PDF 2.0 | 2020 | ISO 32000-2, page-level intents, Associated Files | AES-256 |
The most widely used PDF versions in 2026 are PDF 1.4 (transparency, most compatible), PDF 1.7 (ISO standard, fully featured), and PDF/A-2 (archival). PDF 2.0 adoption is growing but remains limited because older viewers do not support its new features. For maximum compatibility, generate PDF 1.4 or 1.7. For archival, generate PDF/A-2b or PDF/A-3b.
PDF Size Optimization Strategies
A typical 10 MB PDF can often be reduced to 1-3 MB without visible quality loss. The key strategies, in order of impact:
1. Compress embedded images. Most PDF bloat comes from uncompressed or over-quality images. Re-encoding images to JPEG quality 75-85 or JPEG 2000 at equivalent quality typically reduces PDF size by 50-80%. Tools: Ghostscript, qpdf, pdfsizeopt.
2. Subset fonts. A PDF embedding a full font (2,000+ glyphs, 200+ KB per font) should include only the glyphs actually used in the document. Font subsetting can reduce font data by 80-90%. Most modern PDF generators do this automatically, but older tools often embed full fonts.
3. Remove duplicate objects. PDFs edited multiple times accumulate duplicate image objects, unused pages, and orphaned resources. PDF linearization (web optimization) removes duplicates and reorders objects for streaming delivery.
4. Remove metadata. PDFs may contain extensive XMP metadata, thumbnails, bookmarks, and JavaScript that are not needed for the final document. Stripping unnecessary metadata saves 5-50 KB depending on the document.
Code Example: CSV with UTF-8 BOM for Excel
The most common CSV problem is encoding. Excel on Windows opens CSV files as Windows-1252 by default, mangling non-ASCII characters. The solution is to add a UTF-8 BOM (Byte Order Mark) at the beginning of the file.
// Node.js: Write CSV with UTF-8 BOM for Excel compatibility
const fs = require('fs');
const BOM = '\uFEFF';
const csv = BOM + 'Name,City,Price\n"Jean-Luc","Montreal","$1,234"\n';
fs.writeFileSync('data.csv', csv, 'utf8');
Common Mistakes and How to Fix Them
21 common format mistakes across all categories
After documenting 100+ formats, certain mistakes appear again and again across different teams, industries, and skill levels. Each mistake below includes the impact on performance, quality, or security, along with the correct approach.
3D Format Mistakes
Using OBJ for web 3D content
Impact: No animation, no PBR materials, large text-based files
Fix: Use glTF/GLB — supports PBR, animation, and Draco compression (90% smaller)
Audio Format Mistakes
Converting MP3 to FLAC to "improve quality"
Impact: Larger file with same lossy quality — lost data cannot be recovered
Fix: Keep original lossless source files; create lossy versions from them
Using 320 kbps CBR MP3 when Opus 128 kbps is transparent
Impact: 2.5x larger file with no audible benefit
Fix: Use Opus at 96-128 kbps for transparent quality; MP3 320 only for legacy compatibility
Compression Format Mistakes
Compressing already-compressed files (JPEG, MP4, ZIP)
Impact: Minimal or zero size reduction; wasted CPU time
Fix: Only compress compressible formats (text, raw data, uncompressed images)
Using GZIP for large backups when ZSTD is available
Impact: 5x slower compression with worse ratio
Fix: Use zstd for backups: 5x faster compression, 15% better ratio than gzip
Data Format Mistakes
Storing dates in ambiguous formats (01/02/03) in CSV
Impact: Dates interpreted differently in US (Jan 2) vs EU (Feb 1) locales
Fix: Always use ISO 8601: YYYY-MM-DD (2026-04-14)
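In JavaScript, Date.prototype.toISOString emits ISO 8601 directly, so there is no reason to hand-format locale-dependent dates. A quick sketch:

```javascript
// Node.js/JavaScript: emit ISO 8601 dates, unambiguous in every locale
const d = new Date(Date.UTC(2026, 3, 14)); // months are 0-based: 3 = April
console.log(d.toISOString());              // "2026-04-14T00:00:00.000Z"
console.log(d.toISOString().slice(0, 10)); // "2026-04-14", safe for CSV columns
```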
Not specifying encoding in CSV files
Impact: Non-ASCII characters garbled when opened in Excel
Fix: Add UTF-8 BOM (EF BB BF) at file start for Excel compatibility
Using YAML for data exchange between systems
Impact: YAML type inference causes bugs (NO = false, 1.0 = float)
Fix: Use JSON for machine-to-machine data exchange; YAML only for human-edited config
Document Format Mistakes
Creating accessible PDFs as an afterthought
Impact: Screen readers cannot determine reading order or structure
Fix: Author documents with proper heading structure; tag tree in PDF; validate with PAC checker
Encoding Format Mistakes
Storing passwords in Base64 thinking it is encryption
Impact: Zero security — anyone can decode Base64 in milliseconds
Fix: Use bcrypt, Argon2, or scrypt for password hashing; AES-256 for encryption
Not specifying charset in HTTP Content-Type headers
Impact: Browser may guess wrong encoding, causing mojibake
Fix: Always include charset: Content-Type: text/html; charset=utf-8
Font Format Mistakes
Loading 10+ font weights as separate static files
Impact: 400+ KB total payload; render-blocking; layout shift
Fix: Use a single variable font file (80-120 KB); or limit to 2-3 weights
Serving TTF/OTF instead of WOFF2 on the web
Impact: Files 2-3x larger than necessary; slower page loads
Fix: Convert to WOFF2 using woff2_compress; serve with @font-face format("woff2")
Image Format Mistakes
Using PNG for photographs
Impact: Files 5-10x larger than JPEG/WebP with no visible quality benefit
Fix: Use AVIF or WebP for photos; PNG only for graphics needing pixel-perfect lossless or transparency
Exporting JPEG at quality 100
Impact: Files 60-80% larger than quality 95 with zero perceptible improvement
Fix: Use quality 80-85 for web, 90-95 for high quality; never 100
Serving original camera photos (3000x4000, 5MB+)
Impact: Massive page weight, slow load times, wasted bandwidth
Fix: Resize to display dimensions, compress to AVIF/WebP, use srcset for responsive images
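The AVIF/WebP-with-fallback pattern is usually expressed as a picture element. A minimal Node.js helper that emits one (naming and the path-without-extension convention are assumptions for illustration; real code should HTML-escape the alt text):

```javascript
// Build a <picture> element string that lets the browser pick the best
// format it supports: AVIF first, then WebP, then JPEG as the universal
// fallback. `base` is the image path without its extension.
function pictureTag(base, alt) {
  return [
    '<picture>',
    `  <source srcset="${base}.avif" type="image/avif">`,
    `  <source srcset="${base}.webp" type="image/webp">`,
    `  <img src="${base}.jpg" alt="${alt}" loading="lazy">`,
    '</picture>',
  ].join('\n');
}

console.log(pictureTag('/img/hero', 'Hero photo'));
```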
Not stripping EXIF/GPS metadata before publishing
Impact: Privacy leak — GPS coordinates reveal exact photo location
Fix: Strip EXIF metadata server-side or use tools like exiftool before upload
Using GIF for animations
Impact: 10-30x larger than equivalent MP4 or animated WebP
Fix: Use <video autoplay muted loop> with MP4 for GIF-like behavior; animated WebP for transparency
Video Format Mistakes
Encoding at higher resolution than source
Impact: Larger file with no quality improvement; may add upscaling artifacts
Fix: Always encode at source resolution or lower; never upscale before encoding
Using CRF 0 or very low CRF values
Impact: Enormous files; the near-lossless quality is imperceptible and the bitrate is unusable for streaming
Fix: Use CRF 23-28 for H.264, CRF 30-35 for AV1 for web delivery
Confusing container and codec
Impact: Renaming .avi to .mp4 does not convert the video
Fix: Use ffmpeg to properly transcode: ffmpeg -i input.avi -c:v libx264 output.mp4
Key Finding
The most common mistake across all categories is applying lossy formats where lossless is needed, or vice versa. Always match the compression type to the content and workflow.
Keep lossless originals. Create lossy versions from the originals for distribution. Never re-encode lossy to lossy.
File Format Security Issues
15 known vulnerabilities and mitigations
File formats are attack vectors. SVGs can contain JavaScript (XSS), ZIPs can contain path traversal attacks, XML can trigger billion-laughs DoS attacks, and YAML can execute arbitrary code. Understanding these risks is essential for any application that processes user-uploaded files.
File Format Security Vulnerabilities
15 rows
| Format | Vulnerability | Severity | Mitigation |
|---|---|---|---|
| JPEG | EXIF GPS location leak | Medium | Strip EXIF metadata server-side before serving user uploads |
| JPEG | Steganography — hidden data in DCT coefficients | Low | Re-encode images to destroy hidden data |
| PNG | Decompression bomb — small file decompresses to GB of pixels | High | Validate image dimensions before decompression; set max pixel limits |
| SVG | XSS via embedded JavaScript in SVG | Critical | Sanitize with DOMPurify; serve user SVGs with Content-Security-Policy |
| SVG | SSRF via external entity references | High | Disable external resource loading; convert user SVGs to raster |
| PDF | JavaScript execution in PDF viewers | High | Disable JavaScript in PDF reader settings; use PDF/A, which prohibits JS |
| PDF | Launch action — PDF can open external programs | Critical | Modern readers prompt before executing; disable in enterprise policy |
| ZIP | Zip bomb — nested ZIPs decompress to petabytes | High | Limit decompression ratio and total extracted size |
| ZIP | Path traversal — filenames with ../../ can escape extraction directory | Critical | Sanitize extracted filenames; reject paths with .. components |
| XML | Billion laughs attack — exponential entity expansion | Critical | Disable DTD processing; limit entity expansion depth |
| XML | XXE (XML External Entity) — read local files or SSRF | Critical | Disable external entity loading in XML parser configuration |
| YAML | Code execution via !!python/object constructor | Critical | Use safe_load() instead of load(); never parse untrusted YAML with full loader |
| CSV | Formula injection — cells starting with = execute in Excel | Medium | Prefix cells starting with =, +, -, @ with single quote or tab character |
| DOCX | VBA macro malware — malicious macros in DOCM files | High | Disable macro execution; use Group Policy to block macros from internet files |
| WOFF2 | Font parsing buffer overflow | Medium | Keep browsers updated; browsers sandbox font rendering |
Format Conversion Reference
16 common conversions with tools and commands
Converting between formats is one of the most common operations in any media workflow. The table below provides the recommended tool and command for 16 common format conversions, covering images, video, audio, data, documents, and 3D models. Each entry includes practical notes about quality settings and best practices.
Format Conversion Guide
16 rows
| From | To | Tool | Command | Notes |
|---|---|---|---|---|
| JPEG | WebP | cwebp, sharp, Squoosh | cwebp -q 82 input.jpg -o output.webp | Quality 82 roughly matches JPEG 85 |
| JPEG | AVIF | avifenc, sharp, Squoosh | avifenc --min 20 --max 35 input.jpg output.avif | Use --speed 4-6 for balance |
| PNG | WebP | cwebp, sharp | cwebp -lossless input.png -o output.webp | Use -lossless for transparency |
| PNG | AVIF | avifenc, sharp | avifenc --lossless input.png output.avif | AVIF lossless is 20-30% smaller than PNG |
| HEIC | JPEG | heif-convert, ImageMagick | heif-convert input.heic output.jpg | Quality loss from re-encoding lossy format |
| SVG | PNG | Inkscape, sharp, puppeteer | inkscape input.svg -w 1024 -o output.png | Specify width/height for rasterization |
| MP4 (H.264) | MP4 (AV1) | ffmpeg + SVT-AV1 | ffmpeg -i input.mp4 -c:v libsvtav1 -crf 30 output.mp4 | CRF 28-35 for web streaming |
| WAV | FLAC | flac, ffmpeg | flac --best input.wav -o output.flac | Perfectly lossless conversion |
| WAV | Opus | opusenc, ffmpeg | opusenc --bitrate 128 input.wav output.opus | 128 kbps is transparent for most content |
| FLAC | MP3 | lame, ffmpeg | ffmpeg -i input.flac -c:a libmp3lame -q:a 0 output.mp3 | -q:a 0 = V0 VBR (~245 kbps) |
| JSON | CSV | jq, pandas, csvkit | jq -r '.[] \| [.name,.age] \| @csv' data.json > data.csv | Select fields explicitly; nested structures must be flattened |
| CSV | Parquet | DuckDB, pandas, Spark | duckdb -c "COPY (SELECT * FROM 'data.csv') TO 'data.parquet'" | 60-80% smaller, much faster queries |
| DOCX | PDF | LibreOffice, pandoc | libreoffice --headless --convert-to pdf input.docx | Best fidelity with LibreOffice |
| Markdown | HTML | pandoc, marked, remark | pandoc input.md -o output.html --standalone | --standalone adds HTML wrapper |
| TTF | WOFF2 | woff2_compress, fonttools | woff2_compress input.ttf | 50-60% size reduction |
| OBJ | GLB | Blender, obj2gltf | obj2gltf -i input.obj -o output.glb | Apply Draco compression afterwards with gltf-transform |
Image Conversion with Node.js (sharp)
The sharp library is the fastest image processing library for Node.js, used by Next.js, Gatsby, and most image optimization pipelines. The code below converts any image to both AVIF and WebP with optimal quality settings.
// Convert images to AVIF and WebP using sharp (Node.js)
const sharp = require('sharp');

async function convertImage(input) {
  await sharp(input)
    .avif({ quality: 80 })
    .toFile(input.replace(/\.[^.]+$/, '.avif'));
  await sharp(input)
    .webp({ quality: 82 })
    .toFile(input.replace(/\.[^.]+$/, '.webp'));
}

Impact on Web Performance
How format choices affect Core Web Vitals
File format choices directly impact Core Web Vitals scores. The Largest Contentful Paint (LCP) metric — which Google uses as a ranking signal — is heavily influenced by image format and size. On a simulated 3G connection, an unoptimized JPEG hero image loads in 4.2 seconds (failing the 2.5s LCP threshold), while the same image in AVIF loads in 1.38 seconds (passing comfortably). JPEG XL achieves the fastest LCP at 1.2 seconds thanks to progressive decoding.
Impact of Image Format on LCP (simulated 3G connection)
7 rows
| Format | Size (KB) | LCP (ms) |
|---|---|---|
| JPEG (unoptimized) | 420 | 4200 |
| JPEG (mozjpeg q80) | 180 | 2100 |
| WebP (q82) | 135 | 1650 |
| AVIF (q80) | 105 | 1380 |
| JPEG XL (q80) | 115 | 1200 |
| PNG (lossless) | 890 | 7800 |
| SVG (icon) | 2 | 180 |
What Makes Web Pages Heavy?
Images account for 42% of median page weight in 2026 (HTTP Archive data). JavaScript is second at 18%. Reducing image weight through modern formats (AVIF, WebP) and proper sizing has the single largest impact on page performance. Fonts at 4% are often overlooked but matter because they are render-blocking.
Web Page Weight Distribution by Resource Type (KB)
Source: OnlineTools4Free Research
Progressive Loading Strategies Compared
Different image formats provide vastly different loading experiences. Progressive JPEG and JPEG XL show a usable preview within 280-350ms, while WebP and AVIF show nothing until fully loaded. Low-quality image placeholders (LQIP) and BlurHash provide instant visual feedback (10-50ms) but require additional implementation. The perceived performance difference is significant: users prefer seeing a blurry preview instantly over waiting 1.6 seconds for a crisp image to appear all at once.
Progressive Image Loading Comparison (3G connection, 200KB image)
9 rows
| Format/Strategy | First Pixel (ms) | Usable Preview (ms) | Full Load (ms) | Strategy |
|---|---|---|---|---|
| Baseline JPEG | 2200 | 2200 | 2200 | Top-to-bottom scan lines |
| Progressive JPEG | 280 | 600 | 2100 | Multiple scans: blur then sharpen |
| JPEG XL (progressive) | 120 | 350 | 1800 | Continuous progressive refinement |
| WebP | 1600 | 1600 | 1600 | No progressive mode — all at once |
| AVIF | 1400 | 1400 | 1400 | No progressive mode — all at once |
| PNG (interlaced) | 800 | 1800 | 4500 | Adam7 interlacing: 7 passes |
| PNG (non-interlaced) | 4500 | 4500 | 4500 | Top-to-bottom scan lines |
| Low-quality placeholder (LQIP) | 50 | 50 | 1600 | 1KB blur + lazy load full image |
| Blurhash | 10 | 10 | 1600 | ~30 byte hash decoded to blur placeholder |
Key Finding
Switching from unoptimized JPEG to AVIF reduces LCP by 67% on 3G connections (4.2s to 1.38s). This single change can move a page from failing to passing Google's Core Web Vitals threshold.
Combine format optimization with responsive images (srcset), lazy loading (loading='lazy'), and priority hints (fetchpriority='high' for hero images).
Color Spaces and Gamut Coverage
Understanding wide-gamut color for modern displays
The transition from sRGB to wider color spaces is one of the most important developments in display technology. sRGB, established in 1996, covers only 35% of the visible color spectrum. Display P3, used by Apple devices since 2015, covers 45.5%. Rec. 2020 (for HDR video) covers 75.8%. AVIF and JPEG XL support these wider color spaces natively, while JPEG and WebP are capped at 8 bits per channel, which in practice limits them to sRGB.
Color Space Gamut Coverage
8 rows
| Color Space | Visible % | Bit Depth | Year | Used By |
|---|---|---|---|---|
| sRGB | 35 | 8 | 1996 | Web standard, most monitors |
| Display P3 | 45.5 | 10 | 2015 | Apple devices, modern screens |
| Adobe RGB | 52.1 | 16 | 1998 | Photography, prepress |
| ProPhoto RGB | 90 | 16 | 2001 | Professional photography |
| Rec. 2020 | 75.8 | 10 | 2012 | HDR video, 4K/8K broadcast |
| Rec. 709 | 35 | 8 | 1990 | HD video, identical to sRGB gamut |
| CMYK (ISO Coated v2) | 28 | 8 | 2006 | Offset printing |
| DCI-P3 | 45.5 | 12 | 2005 | Digital cinema projection |
For web developers, CSS Color Level 4 enables wide-gamut colors via the color() function: color(display-p3 1 0 0) produces a red that is 25% more vivid than rgb(255, 0, 0) on P3-capable displays. This matters for brand colors, product photography, and any content where color accuracy impacts the user experience.
Part 13: Methodology, Raw Data & Sources
Full methodology and downloadable datasets
Try These Tools for Free
Put this knowledge into practice with our browser-based tools. No signup needed.
Format Converter
Convert images between JPG, PNG, WebP, BMP, and more formats with optional resizing.
CSV to JSON
Convert CSV data to JSON and JSON to CSV format online.
JSON to YAML
Convert JSON data to YAML format for configuration files.
JSON to XML
Convert JSON data to XML format and XML back to JSON.
Audio Converter
Convert audio files between MP3, WAV, OGG, AAC, and FLAC formats.
Video to MP4
Convert video files to MP4 format directly in your browser.
Base64
Encode text or files to Base64 and decode Base64 strings back.
Related Research Reports
Image Compression Benchmark 2026: 10 Formats Tested Across 1,000 Images
We tested 10 image formats across 1,000 diverse images at multiple quality levels. See which format delivers the best compression, quality, and browser compatibility in 2026.
PDF Tools Benchmark 2026: 15 Tools Compared on Compression, Speed, and Privacy
We tested 15 PDF tools including iLovePDF, Smallpdf, Adobe Acrobat, and more across compression ratio, processing speed, file size limits, privacy policies, and pricing. See which PDF tool delivers the best results in 2026.
Web Performance Format Guide 2026: Images, Fonts, Scripts, and Core Web Vitals
Complete guide to web asset formats and their impact on Core Web Vitals. Compare images, fonts, scripts, and stylesheets with real performance data from HTTP Archive and Lighthouse benchmarks.
Download Raw Data
All data used in this guide is available for download. The datasets are released under a Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data for any purpose, provided you give appropriate credit.
Citations & Sources
This guide draws on 23 primary sources including ISO standards, IETF RFCs, W3C specifications, academic papers, and official project documentation.
Final Takeaways: The State of File Formats in 2026
After documenting 100+ formats across nine categories, several clear themes emerge that summarize the state of file formats in 2026.
1. The Open-Source Formats Are Winning
The most technically advanced formats in every category are now open-source and royalty-free. AV1 beats H.265 in compression and is free. Opus beats AAC in quality-per-bit and is free. AVIF matches or exceeds HEIC and is free. FLAC dominates lossless audio and is free. glTF is replacing proprietary FBX for 3D content. The patent-encumbered alternatives (H.265, AAC, HEIC) survive through ecosystem lock-in (Apple) rather than technical merit.
2. The Browser is the Universal Format Gateway
Browser support determines format adoption for the web. Chrome adding WebP support in 2012 did not matter until Safari added it in 2020 — only then did WebP become universally usable. Chrome removing JPEG XL support in 2023 effectively killed the format for web use despite its technical superiority. Browser vendors (particularly Google) are kingmakers in the format landscape.
3. Compression is Approaching Physical Limits
The generational improvement in compression efficiency is slowing. H.264 to H.265 was 50%. H.265 to AV1/H.266 is 30-50%. Future codecs will deliver diminishing returns because we are approaching the Shannon entropy limit of natural image and video content. The next big gains will come from AI-based compression (neural codecs), perceptual models trained on human vision, and content-adaptive algorithms — not from better transform coding.
4. Format Proliferation is Actually Decreasing
Counter-intuitively, the number of formats that matter is shrinking. For images: AVIF+WebP+SVG covers 99.5% of use cases. For video: AV1+H.264 in MP4. For audio: Opus+FLAC. For data: JSON+Parquet. For archives: ZIP+ZSTD. For fonts: WOFF2. For 3D: glTF. For text: UTF-8. A decade ago, the recommended format lists were much longer because no single format was good enough for general use.
5. The Best Format is the One You Do Not Have to Think About
The ideal format pipeline is invisible. CDN auto-negotiation serves AVIF to Chrome, WebP to Safari 14-15, and JPEG to ancient browsers — all from a single source image. Next.js Image component handles format selection, sizing, and lazy loading automatically. HTTP Content-Encoding negotiation serves Brotli, ZSTD, or GZIP based on browser support. The best format choice is the one made automatically by your tooling, not manually by your developers.
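The content-negotiation step a CDN performs is simple to sketch: the browser advertises the image formats it supports in the Accept request header, and the server picks the best one. A minimal version (function name is illustrative):

```javascript
// Server-side image format negotiation: inspect the request's Accept
// header and return the most efficient format the browser advertises.
// CDNs like Cloudflare and Cloudinary apply this logic automatically.
function pickImageFormat(acceptHeader) {
  const accept = acceptHeader || '';
  if (accept.includes('image/avif')) return 'avif';
  if (accept.includes('image/webp')) return 'webp';
  return 'jpeg'; // universal fallback
}

console.log(pickImageFormat('image/avif,image/webp,*/*')); // → avif
```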
Key Finding
The ideal format strategy in 2026 is automated, not manual. Use CDN format negotiation for images, adaptive bitrate streaming for video, and HTTP content-encoding for compression. Focus on choosing the right tooling rather than manually converting files.
For sites using Next.js + Vercel, Cloudflare, or Cloudinary, the entire format pipeline is handled automatically. For custom setups, invest in build-time format generation rather than runtime conversion.
Related Tools & Articles
Image Format Converter
Convert between JPEG, PNG, WebP, AVIF
Image Compressor
Compress images without visible quality loss
JSON Formatter
Format, validate, and minify JSON data
Base64 Encoder
Encode and decode Base64 data
CSS Minifier
Minify CSS for production deployment
PDF Compressor
Reduce PDF file sizes
Password Generator
Generate cryptographically secure passwords
Color Picker
Pick and convert colors between formats
Contrast Checker
Check WCAG contrast ratios
