Why GIF is static image rather than video?

2023-02-03 ｜ Research ｜ #Words: 4222 ｜中文原版

When I was organizing my disk, I have a question: What is GIFs classified as? Static image files or video files?

Because if you put it in a video, GIF is classified as a static image usually and is composed of images; if you put it in an image, it shows animations.

Then I have new problem: Why GIFs shows animations, but be classified as static images?

I read many blogs, but the answers did not deep enough for me. They like “beacuse GIF is static images, so it is static images”. So I did some research by myself and wrote this blog, hoping to help people who have the same questions as me.

The structure of article is “GIF file format” -> “History of GIF” -> “Conclusion”. Some history about technology is also placed in “The History of GIFs”.

GIF file format

If want to explain the question in depth, we need to introduce the format of GIF file briefly first. And you will make sense why GIF appear in history.

If you want to learn more details of the GIF file format, here is a good article, which is also one of the reference materials for this article: “LZW and GIF explained—- Steve Blackstock” .

A GIF file has 5 major parts:

	+-----------------------+
	| +-------------------+ |
	| |   GIF Signature   | |
	| +-------------------+ |
	| +-------------------+ |
	| | Screen Descriptor | |
	| +-------------------+ |
	| +-------------------+ |
	| | Global Color Map  | |
	| +-------------------+ |
	. . .		    . . .
	| +-------------------+ |    ---+
	| |  Image Descriptor | |	|
	| +-------------------+ |	|
	| +-------------------+ |	|
	| |  Local Color Map  | |	|-   Repeated 1 to n times
	| +-------------------+ |	|
	| +-------------------+ |	|
	| |    Raster Data    | |	|
	| +-------------------+ |    ---+
	. . .		    . . .
	| +-------------------+ |
	|-|   GIF Terminator  |-|
	| +-------------------+ |
	+-----------------------+

For understanding, here is a GIF for explaining:

GIF example

You use hex viewer(like hexdump) to check GIF above and can get the following content:

GIF file in hex

GIF signature（header）

On UNIX, file format is decided by header, rather than file suffix.

The header of GIF file is GIF signature, which is 6 bytes and has two versions: “GIF87a (47h 49h 46h 38h 37h 48h)” and “GIF89a (47h 49h 46h 38h 39h 48h)”.

Screen descriptor describes the size of the image and provides some image informations, which is the 7 bytes in screenshot, I mark them as “size”, “information”, “background color”, and “indicates the end”:
Image size occupies 4 bytes, which 2 bytes are length and 2 bytes are width. So the maximum length and width are 65536 (2^16). The size of example GIF is 72x72(48x48 in hex), and you can see “48” in screenshot above. Two points need to be noted in this: GIF uses big endian, 48 00 means 0048h; secondly, offical name is “screen size”, which comes from history of graphics systems and computer. I won’t explain it, if you are interested in, you can watch “The 8-Bit Guy” videos about early computer graphics systems in Youtube.
Image information occupies 1 byte, but contains a lot of information, so I will use binary format to explain. In above screenshot, f6 is written as “1111 0110” in binary.
1. Bit 1 is the global color mapping symbol, indicating whether use global color mapping later, here 1 means “use”;
2. Bit 2~4 add 1 indicate the color depth. Here is 111(mean “7” in decimal) and add 1 equal to 8, which means the color depth is 8 bits(up to 256 colors). This value will be used by color mapping;
3. Bit 5 is reserved for the future, must be 0;
4. Bit 6~8 add 1 indicates the bits of each pixel in the image. Here is 110(mean “6” in decimal). Adding 1 equals 7, it means “color size” of a single pixel in this image is 7 bits(up to 128 colors). This value is related to the image.
Background color is 00, which means white.
00 indicates the end.

Global Color Map

In screenshot above, the area from tail of the first line to the 00000180 line is the global color mapping part. Although global color mapping is optional, modern GIFs basically enable it and are used to represent the color of each pixel.

What is Color Map

Originally I wanted to omit the explanation of color mapping, but this technology outdated, many people may not understand it. So I will explain it in detail.

Nowadays, many people know that the colors on the screen are mostly composed of three colors: red, green and blue (printer is CMYK). We can describe a color by values of the three colors. But why a “RGB” value can determine a color?

You may answer me: We extract a color from gamut by this value. That’s very close the truth.

Now, please imagine you are a game developer from the 1980s, you only have 8 bits to display delicate colors. 8 bits is only can represent 256 values, and you need to use three colors together, rather than black and white. This sounds impossible! Because today most custom screen have 24 bits for a pixel(each color occupies 8 bit), which is 16777216 color. This gap is too big. How 1980s game developer implement it?

Did you notice that I just say how many colors, and this color may not be evenly distributed. In the GIF above, the main colors are black, deep blue, red, and white, which are very simple in composition and not composed of very complex colors. So we map colors needed to values, not map any color we won’t use. Now a detailed image can be displayed. Many of the games on early game consoles such as Famicom consoles used this technology to achieve screen effects beyond specifications.

This color map can be understood as a palette, and the color space supported by the system or monitor can be understood as a paint box. The paint may have hundreds of thousands of colors, but the palette can only hold dozens of colors. For example, when creating a GIF with a single pixel size of 8 bits in the 24-bit RGB color space, 256 colors are taken out of 16777216 paints (local color mapping is to give each image a palette).

Let’s back to global color mapping. Color mapping is generally arranged in the order of “black to white”. But you can also sort by frequency or other (“Image Descriptors” section for details).

For example, in GIF above, 03h 03h 03h is the darkest color(hair rope):

For example, in GIF above, 03h 03h 03h is the darkest color that is the main color of hair rope

17h 17h 17his the second darkest color：

`17h 17h 17h`is the second darkest color

How to calculate the size of the global color map area

Notice: The global color mapping area of GIF files is not a fixed size. The calculation is as follows:

(1 <<(([Byte of Image Information] & 0x07) + 1)) * 3

For example, f6 in gif above means：

(1 <<((0xf6 & 0x07) + 1)) * 3

0xf6 & 0x07 = 1111 0110b
			& 0000 0111b
			————————————
			  0000 0110b = 6
			  
1 <<((0xf6 & 0x07) + 1)) = 1 << 7 = 1000 0000b = 128

(1 <<((0xf6 & 0x07) + 1)) * 3 = 128*3=384

So the data block size for the global color map above is 384 bytes.

Extension block

At this place, there may be one or more extension blocks, starting with 21h and ending with 00h. They can be used to describe animation intervals and background transparency and many other related information.

The size of the extension block is fixed, but 87a and 89a are not the same size. If it is 87a, then this part is the graphics control extension block (8 bytes).

If it is 89a, then there will be four graphics control extension blocks (8 bytes), plain text extension blocks (15 bytes), application extension blocks (14 bytes), and comment extension blocks (5 to 256 bytes). Extension blocks, each extension block starts with 21h, and the difference lies in the second byte:

If it is f9h, then identify the data block as a graphics control extension block;
If it is 01h, then identify the data block as a plain text extension block;
If it is ffh, then identify the data block as an application extension block;

Still taking the above picture as an example, intercept the part of the expansion block:

GIF extension block

The bytes between the two red lines are extension block.

21h ffh is explained as the application extension block at first.
0bh is the number of bytes in the Application Identifier and Authent Code fields (this value is fixed and must be 0b). The AuthentCode field contains a value that identifies the software application that created the application extension block. If the recognition of this identifier value is successful, the remainder of the data block is read and its data is operated on. If the identifier value cannot be recognized, the remainder of the data block is read and its data is discarded.
4e 45 54 53 43 41 50 45 (hexadecimal) is the application identifier. It means “NETSCAPE” in ASCII code.
32 2e 30 (hexadecimal) is the Authent Code, which means “2.0” in ASCII code.
this part depend on the application and developer.
00h is end block.

Netscape was the first browser to allow users to interact with images, such as the innovative ability to click on an image to jump to a page. It can be said that Netscape saved GIF, otherwise it would have disappeared. This is what the president of CompuServe said in the interview.

Then 21h f9h is represented as a graphics control extension block.

The next 04h is a fixed value, which is the sum of the number of bytes of the Packed, DelayTime and ColorIndex, totally 4 bytes. (In case of GIF87a there is only one 21h) The Packed is 04h in the above figure and 0000 0100 in binary. It is divided into 4 parts:

Bit 1 is used to indicate whether the color transparency field in the ColorIndex field has a value. Here, 0 means there is no value.
Bit 2 is the user input flag. If user input is required before entering the next image sequence (press a key or click the mouse), then it is 1, otherwise it is 0. 0 here indicates that no user operation is required to enter the next image.
Bits 3~5 indicate how to process the image after it is displayed (the numbers here are binary)
- 000 (no disposal method specified)
- 001 (does not process graphics), it is 001 in the picture, so there is no operation.
- 010 (overlay graphic with background color)
- 011 (overwrite graph with previous graph)
Bits 6~8 are reserved, but this subfield is not used in GIF89a, so it is always 0.

You can see the content of the graphics control extension block can make GIF look like PPT.

The next 1 byte is the DelayTime field, the content is 0ah, which means wait 0.1 seconds before playing the next image. 0.1 second is 1/10 second, which can be understood as displaying 10 frames per second. The next 2 bytes are color transparency, here it is 00h, and the first bit in the previous Packed field is 0. That is to say, this field is not skipped, that is, the color transparency is not adjusted. The following byte of ffh should be 00h, which should be the characteristics of the software itself. The last 00h indicates the end of the extension block.

Image

Then let us move to the content of each image in the GIF. The image data in GIF is compressed by LZW (Lempel-Ziv-Welch) algorithm, starting from 2ch (ASCII corresponds to ,), and consists of the following three parts:

Image descriptor;
Local color mapping (of course this is also optional);
Raster data.

Image Descriptor (Local Image Descriptor)

The image descriptor is also called image title, starting with 2ch and ending with 07h, which is the part below:

Image title, starting with 2ch and ending with 07h

This part contains the delimiter 2ch of the image descriptor (1 byte), the coordinates X and Y of the image (2 bytes each), the length and width of the image (2 bytes each), the image and color Information about the mapping data (1 byte), and the terminator 07h (1 byte).

As the above image, the coordinates X and Y of the image are both 00h 00h. The image dimensions are the same as before, 72x72 pixels in length and width. The information of image and color mapping data has 4 subfields. The 00h binary here is 0000 0000, then:

Bit 1 is the local color mapping flag. If this bit is 1, the image has a local color map and the local color map is used; if it is 0, the global color map is used. Here in the figure is 0, indicating the use of global color mapping.
Bit 2 is the interlaced generation flag. If it is 1, the image is interlaced, and 0 is the opposite. Here in the picture is 0, indicating that it is not interlaced.
Bit 3 is a classification flag, indicating whether the color map is arranged by frequency of occurrence (or other factors). Only works with GIF89a though. In the case of GIF87a, this part is always 0. Here in the picture is 0, indicating that there is no sorted color mapping.
Bits 3～4 are reserved bits and are always 0.
Bits 5～8 are the number of local color map entries, used to indicate the number of local color map entries. Here is 0, indicating that this image has no local color mapping.

Local color mapping

Local color mapping is same as global color mapping. And as we can see from the last section, there is no local color mapping section, so we do not talk about it.

Raster data

Raster data is image data, consisting of pieces of data compressed using the LZW (Lempel-Ziv-Welch) algorithm. Each block starts with “block size count bytes”, which values ranging from 1 to 255 (1h to ffh).

Also use the above example to explain. The raster data of the first image starts from the ff of 1b3h. ff means that the next piece of data is 255 bytes:

The ff in the picture indicates that the next piece of data is 255 bytes

Then scroll down, we can see that 2b3h, 3b3h, 4b3h, and 5b3h are all ff, that is, moved back 256 places:

2b3h 3b3h 4b3h 5b3h

But it is c3 in 6b3h:

it is `c3` in `6b3h`

So move back 196 (c3h + 1 = 195 + 1=196) and take a look, which is the position 777h (6b0h + c3h + 1h):

777h

You can see that this position is 00, which means that the raster data of the current image block has ended (because it needs to be understood that the following data block has 0, and then there is no more).

Then continue down, you can see 2c at the position of 780h on the next line, indicating that a new image descriptor has started. The cycle repeats like this until it ends.

However, it should be noted that not every block is in the beautiful format of 255+255+...+xxx, some are messy in actual.

End of GIF file

If you know some low-level about system, you maybe know that files and data can have no start character, but must have an end character. For example, strings in C (or languages like Python that are extended from C) end with \0 to indicate the end of the string. Because as a data stream in the UNIX system, system must know where it ends. So what does the GIF file end with?

The terminator of GIF is ;, which is 3b in hexadecimal. The terminator of GIF is `;`, which is `3b` in hexadecimal

You can see a 00h in front of the terminator 3bh, which indicates the end of a raster data.

What does a static image of a GIF look like?

In the first few years after GIF birthed, GIF was only used for static images. The image below is a GIF file (from “Encyclopedia of Graphics File Formats, 2nd Edition”, a book in 2005). There is only one image in GIF file with resolution of 1419 × 1001. You can right-click to download and see the original image:

Static GIF file from "Encyclopedia of Graphics File Formats, 2nd Edition"

History of GIFs

Birth

background knowledge

GIFs were born before the Internet.

Before the advent of the Internet, there were many computer manufacturers. Unlike today, they are only responsible for assembling or producing part of the hardware. At that time, most computer manufacturers were responsible for developing computer systems and even software. For example, the Apple was one of many computer manufacturers at that time.

Different hardware and purposes led to the systems and applications not being compatible with each other. At that time, when you bought a computer, you would get a book, which would tell you how to program on this computer. For example, the following instruction manual for the Apple II:

Apple II manual

Purpose

The boss of CompuServe Sandy Trevor needed programmers to solve two main problems:

First, CompuServe offers subscriptions (on an hourly basis) to users who need to access email and transfer files, which requires a simple graphical format that can be displayed seamlessly on all computers. In other words, this file must be universal.
Secondly, some people who have used computers for a long time may still remember that when we used dial-up Internet a long time ago, due to the slow speed of the Internet, the pictures on the web page would be refreshed little by little. If there was a problem with the network, we would have to wait for a long time. The image is gone and you need to refresh the page and download it again. So the file must be small.
Because the computer memory at that time was too small, it could not occupy too much memory while maintaining picture quality.

To solve this problem CompuServe developed GIF. GIF can transmit the data stream to the client piece by piece, and then display the picture step by step. Or generate interlaced images, then the image data can be interlaced and transmitted to the client. In this way, although the picture is lossy, you can make up for it. For example, if the picture below is displayed in alternate lines, you can roughly figure out the content of the picture when you see the left and middle images in the picture below. However, if the picture is displayed line by line and displayed bit by bit from top to bottom, it is impossible to figure out the content.

The difference between progressive display and interlaced display

alternatives

Although GIF is lossless compression, but the color of a single image is only 256 colors at most. If you want to achieve delicate colors, you can use the splicing method, which will cause the file to be too large and too complex. Furthermore, even if you success to solve the splicing problem (the file size will be very scary at this time), it can only support up to 24-bit RGB color space, which means that the color of GIF will not be particularly delicate (for current screens).

In other respects, GIF can theoretically display images of up to about 4.3 billion pixels (for this, please see the content of “GIF File Format” above), which is not out of date until now.

However, the size and color of GIF have always been two fatal problems, so the PNG format was invented to eliminate GIF. Do you know the full name of PNG is “PNG’s not GIF”, but this is a story after 1995, when GIF was almost ten years old.

Now we can see that PNG, as a lossless compressed image, has successfully replaced GIF in terms of static images. Because it is better than GIF in all aspects (and at that time, GIF had patent restrictions due to the use of the LZW algorithm, while PNG was completely open source). But there has never been a substitute for the dynamic images of GIF.

In 1999, there was a comprehensive campaign to destroy GIFs on the Internet, but it was not successful. Such behavior is still happening now. If you are interested, you can take a look https://burnallgifs.org/archives/.

There was no success because there was no file that could be easily implemented, no sound and show animation in a loop. There is another more important reason: GIF can be played without any codec. So all browsers can play GIFs.

Although the MNG (Multiple-image Network Graphics) multi-image file around PNG was born in 2001 and its functions and image quality are much better than GIF, it still did not defeat GIF (my personal guess is that it is too complicated, MNG can even add audio track. If I need sound, why not using MP4?).

Conclusion

GIF was originally created to be used as a static image, like today’s PNG and JEPG. However, its structure can be look as a simplified version of Microsoft PowerPoint. The pptx file is a tar package composed of xml and image files. The GIF file is composed of some information blocks, control blocks and LZW compressed raster images. It is a coincidence that Microsoft PowerPoint and GIF were born in the same year. However, you can control the spacing and animation between each raster image in the GIF file, so that it can show animation/motion effects (just like you can use PowerPoint to create an animation, but you can’t say this PowerPoint is a video).

Bold speculation

I have an idea， but due to so long ago, so I can’t verify. If you know something about below content, please tell me, thanks!

At that time, there was a technology trend. Tech Companies to develop something like slideshows on computers; So GIF developers wanted to not only display pictures, but also have some special functions (but it was more likely to be company requirements).

Because GIF can respond to user operations, like the effect of a slideshow. The most famous slideshow software birthed at that time. For example, Power Point, was also born in the same year and released on Apple’s Macintosh in 1985.

If GIF is just to solve the problem of image display, then there is no need to add some unnecessary functions to GIF.

We should remember no Internet at that time, computer science had only been born for more than 20 years, various technologies and applications were not yet mature and personal computers had just been popularized. Some technologies and applications that seem to exist naturally today survived the competition from many parties at that time actually, such as GIF and the Linux kernel.

So I think this is so dramatic: GIF may have had some extra features added by request, but it is precisely because of these features that it has not been eliminated.

By the way, who thought GIFs could be animated first?

I hope these will help someone in need~

References

《LZW and GIF explained—-Steve Blackstock》

《GIF File Format Summary》

《GIF Signature Format: Introduction & Recovery》

《THE HISTORY OF GIFS AND HOW TO USE THEM》