While organizing my disk, I ran into a question: what should GIFs be classified as, static image files or video files?
If you file a GIF with videos, it does not fit, because GIF is usually classified as a static image format and is composed of images; if you file it with images, it shows animation.
That led to a second question: why can GIFs show animation and still be classified as static images?
I read many blogs, but the answers did not go deep enough for me. They amount to “because GIF is a static image format, it is a static image”. So I did some research myself and wrote this post, hoping to help people who have the same question.
The structure of this article is “GIF file format” -> “History of GIF” -> “Conclusion”. Some history of the technology is also placed in “History of GIF”.
To answer the question in depth, we first need to briefly introduce the GIF file format. It will also help you understand why GIF appeared when it did.
If you want to learn more about the GIF file format, here is a good article, which is also one of the references for this article: “LZW and GIF explained”, by Steve Blackstock.
A GIF file has 5 major parts:
+-----------------------+
| +-------------------+ |
| | GIF Signature | |
| +-------------------+ |
| +-------------------+ |
| | Screen Descriptor | |
| +-------------------+ |
| +-------------------+ |
| | Global Color Map | |
| +-------------------+ |
. . . . . .
| +-------------------+ | ---+
| | Image Descriptor | | |
| +-------------------+ | |
| +-------------------+ | |
| | Local Color Map | | |- Repeated 1 to n times
| +-------------------+ | |
| +-------------------+ | |
| | Raster Data | | |
| +-------------------+ | ---+
. . . . . .
| +-------------------+ |
|-| GIF Terminator |-|
| +-------------------+ |
+-----------------------+
To make this easier to follow, here is a GIF to use as an example:
If you open the GIF above in a hex viewer (such as hexdump), you get the following content:
On UNIX, the file format is determined by the file header rather than by the file suffix.
The header of a GIF file is the GIF signature, which is 6 bytes long and has two versions: “GIF87a” (47h 49h 46h 38h 37h 61h) and “GIF89a” (47h 49h 46h 38h 39h 61h).
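As a small illustration of checking the header instead of the suffix, here is a minimal Python sketch. The function name gif_version is made up for this example; it only compares the first 6 bytes with the two signatures.

```python
# Map each known 6-byte GIF signature to its version string.
GIF_SIGNATURES = {b"GIF87a": "87a", b"GIF89a": "89a"}


def gif_version(data: bytes):
    """Return '87a' or '89a' if data starts with a GIF signature, else None."""
    return GIF_SIGNATURES.get(bytes(data[:6]))


# 47 49 46 38 39 61 is "GIF89a" in ASCII.
print(gif_version(bytes.fromhex("47 49 46 38 39 61")))
```

To test a real file, read its first 6 bytes in binary mode and pass them to the function.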
- 48 00 means 0048h, that is 72 in decimal, because the value is stored low byte first. The official name of this field is “screen size”, which comes from the history of graphics systems and computers. I won’t explain it here; if you are interested, you can watch “The 8-Bit Guy” videos about early computer graphics systems on YouTube.
- f6 is written as “1111 0110” in binary:
  - 1 means “use”, that is, a global color map is present;
  - 111 (“7” in decimal) plus 1 equals 8, which means the color depth is 8 bits (up to 256 colors). This value is used by color mapping;
  - 0 is a single bit that is not set here;
  - 110 (“6” in decimal) plus 1 equals 7, which means the “color size” of a single pixel in this image is 7 bits (up to 128 colors). This value is related to the image.
- 00 is the background color, which here means white.
- 00 indicates the end.

In the screenshot above, the area from the tail of the first line to the 00000180 line is the global color map. Although the global color map is optional, practically all modern GIFs include it and use it to represent the color of each pixel.
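The bit fields of the screen descriptor can be unpacked with a few shifts and masks. This is only a sketch: the function name and dictionary keys are my own, and the 7 sample bytes assume that the height is stored the same way as the width (48 00) and that the two trailing bytes are 00 00.

```python
import struct


def parse_screen_descriptor(block: bytes) -> dict:
    """Decode the 7 bytes that follow the 6-byte GIF signature."""
    # "<" = little-endian: two 16-bit values, then three single bytes.
    width, height, packed, background, last = struct.unpack("<HHBBB", block[:7])
    return {
        "width": width,
        "height": height,
        "has_global_color_map": bool(packed & 0x80),     # highest bit
        "color_depth_bits": ((packed >> 4) & 0x07) + 1,  # next 3 bits, plus 1
        "color_map_bits": (packed & 0x07) + 1,           # lowest 3 bits, plus 1
        "background_index": background,
        "last_byte": last,
    }


# Sample bytes assumed from the description: 72x72, packed byte f6.
info = parse_screen_descriptor(bytes.fromhex("48 00 48 00 f6 00 00"))
print(info)
```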
Originally I wanted to omit the explanation of color mapping, but since this technology is outdated, many people may not understand it. So I will explain it in detail.
Nowadays, many people know that colors on a screen are mostly composed of three colors: red, green and blue (printers use CMYK). We can describe a color by the values of these three colors. But why can an “RGB” value determine a color?
You may answer: we pick a color out of the gamut by this value. That’s very close to the truth.
Now, please imagine you are a game developer in the 1980s and you only have 8 bits to display delicate colors. 8 bits can represent only 256 values, and you need to combine three colors rather than just black and white. This sounds impossible! Today most consumer screens use 24 bits per pixel (8 bits for each color), which gives 16777216 colors. The gap is huge. How did a 1980s game developer manage it?
Notice that I only said how many colors there are; those colors do not have to be evenly distributed. In the GIF above, the main colors are black, deep blue, red, and white: a simple composition, not a complex mix of colors. So we map only the colors we need to values, and do not map any color we won’t use. Now a detailed image can be displayed. Many games on early consoles such as the Famicom used this technique to achieve screen effects beyond the hardware’s specifications.
This color map can be understood as a palette, and the color space supported by the system or monitor can be understood as a paint box. The paint box may have hundreds of thousands of colors, but the palette can only hold dozens. For example, when creating a GIF with a pixel size of 8 bits in the 24-bit RGB color space, 256 colors are taken out of 16777216 paints (a local color map gives each image its own palette).
Let’s get back to the global color map. A color map is generally arranged in order from black to white, but you can also sort by frequency or other criteria (see the “Image Descriptors” section for details).
For example, in the GIF above, 03h 03h 03h is the darkest color (the hair tie):
17h 17h 17h is the second darkest color:
Note: the global color map of a GIF file does not have a fixed size. It is calculated as follows:
(1 << (([Byte of Image Information] & 0x07) + 1)) * 3
For example, f6 in the GIF above gives:
(1 << ((0xf6 & 0x07) + 1)) * 3
0xf6 & 0x07 =   1111 0110b
              & 0000 0111b
              ------------
                0000 0110b = 6
1 << ((0xf6 & 0x07) + 1) = 1 << 7 = 1000 0000b = 128
(1 << ((0xf6 & 0x07) + 1)) * 3 = 128 * 3 = 384
So the data block size for the global color map above is 384 bytes.
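The same calculation as a tiny Python sketch. The function name is hypothetical, and the offset at the end assumes the map starts right after the 6-byte signature and the 7-byte screen descriptor.

```python
def global_color_map_size(packed: int) -> int:
    """Size in bytes of the global color map: 2**(N+1) entries of 3 bytes (R, G, B)."""
    entries = 1 << ((packed & 0x07) + 1)  # lowest 3 bits of the packed byte, plus 1
    return entries * 3


size = global_color_map_size(0xF6)
print(size)  # 384

# Assuming the map starts at offset 13 (6-byte signature + 7-byte screen
# descriptor), the first byte after the map is at:
print(hex(13 + size))  # 0x18d
```

An end offset of 18dh agrees with the observation that the map runs into the 00000180 line of the hex dump.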
After the global color map there may be one or more extension blocks, each starting with 21h and ending with 00h. They can describe animation intervals, background transparency, and much other related information.
Extension blocks differ between 87a and 89a. GIF87a only defines a generic extension block, while GIF89a defines four kinds: graphics control extension blocks (8 bytes), plain text extension blocks (15 bytes), application extension blocks (14 bytes), and comment extension blocks (5 to 256 bytes). Each extension block starts with 21h, and the difference lies in the second byte:
- if it is f9h, the data block is a graphics control extension block;
- if it is 01h, the data block is a plain text extension block;
- if it is ffh, the data block is an application extension block.
Still taking the picture above as an example, here is the part containing the extension blocks:
The bytes between the two red lines are the extension blocks.
- 21h ffh first identifies the block as an application extension block.
- 0bh is the number of bytes in the Application Identifier and Authent Code fields (this value is fixed and must be 0b). The Authent Code field contains a value that identifies the software application that created the application extension block. If the identifier is recognized, the remainder of the data block is read and its data is processed. If it is not recognized, the remainder of the data block is read and its data is discarded.
- 4e 45 54 53 43 41 50 45 (hexadecimal) is the application identifier. It means “NETSCAPE” in ASCII.
- 32 2e 30 (hexadecimal) is the Authent Code, which means “2.0” in ASCII.
- 00h is the end of the block.

Netscape was the first browser to allow users to interact with images, such as the innovative ability to click on an image to jump to a page. It can be said that Netscape saved GIF, which would otherwise have disappeared. This is what the president of CompuServe said in an interview.
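To make the byte layout of the application extension header concrete, here is a rough Python sketch. The function name is my own, and it only looks at the first 14 bytes (21h, ffh, 0bh, 8 identifier bytes, 3 Authent Code bytes); whatever application data follows is not interpreted.

```python
def parse_application_extension(block: bytes):
    """Return (identifier, auth_code) for an application extension header, else None."""
    # 21h = extension introducer, ffh = application extension, 0bh = fixed length 11.
    if len(block) < 14 or block[0] != 0x21 or block[1] != 0xFF or block[2] != 0x0B:
        return None
    identifier = block[3:11].decode("ascii", errors="replace")  # 8 bytes
    auth_code = block[11:14].decode("ascii", errors="replace")  # 3 bytes
    return identifier, auth_code


header = bytes.fromhex("21 ff 0b 4e 45 54 53 43 41 50 45 32 2e 30")
print(parse_application_extension(header))
```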
Then 21h f9h identifies a graphics control extension block.
The next 04h is a fixed value: the total number of bytes in the Packed, DelayTime and ColorIndex fields, 4 bytes in all. (In the case of GIF87a there is only one 21h.)
The Packed field is 04h in the figure above, or 0000 0100 in binary. Reading from the lowest bit, it is divided into 4 parts:
- Transparent color flag (1 bit): 0 means there is no transparent color value.
- User input flag (1 bit): if user input is required it is 1, otherwise it is 0. 0 here indicates that no user operation is required to enter the next image.
- Disposal method (3 bits): 000 (no disposal method specified), 001 (does not process graphics), 010 (overlay graphic with background color), or 011 (overwrite graphic with previous graphic). It is 001 in the picture, so there is no operation.
- Reserved (3 bits): 0.
You can see that the content of the graphics control extension block can make a GIF behave like a PPT.
The next 2 bytes are the DelayTime field, stored low byte first and counted in hundredths of a second. The content is 0ah 00h, which means wait 0.1 seconds before playing the next image. 0.1 second is 1/10 second, which can be understood as displaying 10 frames per second.
The next byte is the color transparency (ColorIndex) field, here ffh. The first bit of the Packed field above is 0, so this field is ignored, that is, no color is made transparent. An unused field would be expected to hold 00h; the ffh is probably a quirk of the software that wrote the file.
The last 00h indicates the end of the extension block.
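Here is a hedged Python sketch of decoding a graphics control extension block. It assumes the GIF89a field layout (a 1-byte Packed field, a 2-byte little-endian DelayTime in hundredths of a second, and a 1-byte ColorIndex). The function name, the dictionary keys, and the exact sample byte sequence are assumptions reconstructed from the description, not copied from the original file.

```python
import struct

# Disposal method values, worded as in the text above.
DISPOSAL = {
    0: "no disposal method specified",
    1: "do not dispose",
    2: "overlay with background color",
    3: "overwrite with previous graphic",
}


def parse_graphic_control(block: bytes) -> dict:
    """Decode an 8-byte graphics control extension block."""
    intro, label, size, packed, delay, color_index, end = struct.unpack(
        "<BBBBHBB", block[:8]
    )
    if (intro, label, size, end) != (0x21, 0xF9, 0x04, 0x00):
        raise ValueError("not a graphics control extension block")
    return {
        "transparent_flag": bool(packed & 0x01),   # lowest bit
        "user_input": bool(packed & 0x02),         # next bit
        "disposal": DISPOSAL.get((packed >> 2) & 0x07, "reserved"),
        "delay_hundredths": delay,
        "delay_seconds": delay / 100,
        "color_index": color_index,
    }


# Assumed sample: Packed = 04h, DelayTime = 0ah 00h, ColorIndex = ffh.
gce = parse_graphic_control(bytes.fromhex("21 f9 04 04 0a 00 ff 00"))
print(gce)
```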
Now let us move on to the content of each image in the GIF. The image data in a GIF is compressed with the LZW (Lempel-Ziv-Welch) algorithm. Each image starts with 2ch (which corresponds to “,” in ASCII) and consists of the following three parts: the image descriptor, the local color map, and the raster data.
The image descriptor is also called the image header. It starts with 2ch and is followed by 07h, which is the part below:
This part contains the delimiter 2ch of the image descriptor (1 byte), the X and Y coordinates of the image (2 bytes each), the width and height of the image (2 bytes each), and the information about the image and color mapping data (1 byte). The 07h that follows (1 byte) is the LZW minimum code size, which opens the raster data rather than closing the descriptor.
As the image above shows, the X and Y coordinates of the image are both 00h 00h. The image dimensions are the same as before: 72x72 pixels in width and height.
The information about the image and color mapping data has 4 subfields plus reserved bits. The 00h here is 0000 0000 in binary, so:
- Local color map flag: if it is 1, the image has a local color map and the local color map is used; if it is 0, the global color map is used. Here in the figure it is 0, indicating that the global color map is used.
- Interlace flag: if it is 1, the image is interlaced, and 0 is the opposite. Here in the picture it is 0, indicating that it is not interlaced.
- Sort flag: if the color map is sorted it is 1, otherwise 0. Here in the picture it is 0, indicating that the color map is not sorted.
- Reserved: 0.
- Size of the local color map: 0, indicating that this image has no local color map.

A local color map works the same way as the global color map. As we saw in the previous part, this image has no local color map, so we will not discuss it.
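The image descriptor can be decoded in the same style. This is a sketch under assumptions: the function name and keys are mine, the flags are read from the high bits of the last byte as laid out in the GIF89a specification, and the sample bytes are the 10 descriptor bytes as I understand them from the description (2ch, X = 0, Y = 0, 72 x 72, information byte 00h).

```python
import struct


def parse_image_descriptor(block: bytes) -> dict:
    """Decode the 10-byte image descriptor that starts with 2ch."""
    sep, left, top, width, height, packed = struct.unpack("<BHHHHB", block[:10])
    if sep != 0x2C:
        raise ValueError("image descriptor must start with 2ch")
    has_local_map = bool(packed & 0x80)
    return {
        "left": left,
        "top": top,
        "width": width,
        "height": height,
        "local_color_map": has_local_map,
        "interlaced": bool(packed & 0x40),
        "sorted": bool(packed & 0x20),
        # Bits per pixel of the local color map, or 0 when there is none.
        "local_color_map_bits": (packed & 0x07) + 1 if has_local_map else 0,
    }


desc = parse_image_descriptor(bytes.fromhex("2c 00 00 00 00 48 00 48 00 00"))
print(desc)
```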
Raster data is the image data, consisting of blocks of data compressed with the LZW (Lempel-Ziv-Welch) algorithm. Each block starts with a “block size” count byte, whose value ranges from 1 to 255 (01h to ffh).
Let us use the example above again. The raster data of the first image starts from the ff at 1b3h. ff means that the next block of data is 255 bytes:
Scrolling down, we can see that 2b3h, 3b3h, 4b3h, and 5b3h are all ff, that is, each one is 256 positions further on:
But at 6b3h it is c3:
So move forward 196 positions (c3h + 1 = 195 + 1 = 196) and take a look, which is position 777h (6b3h + c3h + 1h):
You can see that this position is 00, which means that the raster data of the current image has ended (read it as a block whose size is 0: there is no more data after it).
Continuing down, you can see 2c at position 780h on the next line, indicating that a new image descriptor has started. The cycle repeats like this until the file ends.
However, note that not every image comes in the neat format of 255+255+...+xxx; real files are sometimes messier.
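The walk through the blocks described above can be sketched in Python as follows. The function name is made up, and the sample data is synthetic (one ff block, one c3 block, then the 00 end marker), not the real file; the payload is only collected, not LZW-decoded.

```python
def walk_sub_blocks(data: bytes, offset: int):
    """Follow size-prefixed blocks starting at `offset` (the first count byte).

    Returns (payload, end_offset), where end_offset is the position just
    after the 00 byte that ends the raster data.
    """
    chunks = []
    while True:
        if offset >= len(data):
            raise ValueError("data ended before the 00 end marker")
        size = data[offset]      # count byte: 1..255, or 0 for the end
        offset += 1
        if size == 0:
            break
        chunks.append(data[offset:offset + size])
        offset += size           # jump to the next count byte
    return b"".join(chunks), offset


# Synthetic example: a 255-byte block, a 195-byte (c3h) block, then 00.
sample = bytes([0xFF]) + bytes(255) + bytes([0xC3]) + bytes(0xC3) + bytes([0x00])
payload, end = walk_sub_blocks(sample, 0)
print(len(payload), end)

# The jump used in the text: count byte at 6b3h with value c3h.
print(hex(0x6B3 + 0xC3 + 1))  # 0x777
```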
If you know a little about low-level systems, you may know that files and data can do without a start character but need some way to mark the end. For example, strings in C (or in languages such as Python that are implemented in C) end with \0 to indicate the end of the string, because whoever reads a data stream on a UNIX system must know where the data ends. So what does a GIF file end with?
The terminator of a GIF is “;”, which is 3b in hexadecimal.
You can see a 00h in front of the terminator 3bh, which indicates the end of the raster data.
In the first few years after GIF was born, it was only used for static images. The image below is a GIF file (from “Encyclopedia of Graphics File Formats, 2nd Edition”, a book in 2005). The file contains only one image, with a resolution of 1419 × 1001. You can right-click to download it and see the original image:
GIFs were born before the Internet.
Before the advent of the Internet, there were many computer manufacturers. Today, manufacturers are mostly responsible only for assembling or producing part of the hardware; at that time, most of them also developed their own systems and even software. Apple, for example, was one of the many computer manufacturers of that era.
Different hardware and purposes meant that systems and applications were not compatible with each other. At that time, when you bought a computer, you also got a book that told you how to program it. For example, here is the instruction manual for the Apple II:
Sandy Trevor, the boss of CompuServe, needed programmers to solve two main problems:
To solve these problems, CompuServe developed GIF. GIF can transmit the data stream to the client piece by piece and display the picture step by step. It can also produce interlaced images, so that the image data is transmitted to the client in alternating lines. In this way, although the picture is incomplete at first, your eye can fill in the gaps. For example, if the picture below is displayed in alternate lines, you can roughly make out its content from the left and middle images. However, if the picture is displayed line by line, bit by bit from top to bottom, it is impossible to make out the content.
Although GIF uses lossless compression, a single image can have at most 256 colors. If you want more delicate colors, you can splice several images together, but this makes the file too large and too complex. Furthermore, even if you succeed in solving the splicing problem (the file size will be frightening by then), GIF still supports at most the 24-bit RGB color space, which means its colors will never be particularly delicate (by the standards of current screens).
In other respects, GIF can theoretically hold images of up to about 4.3 billion pixels (see the “GIF file format” section above), which is still not out of date.
However, size and color have always been GIF’s two fatal problems, so the PNG format was invented to replace it. PNG officially stands for Portable Network Graphics, but did you know it is also unofficially read as “PNG’s Not GIF”? That, however, is a story from after 1995, when GIF was almost ten years old.
Today we can see that PNG, as a losslessly compressed image format, has successfully replaced GIF for static images, because it is better than GIF in every respect (and at that time GIF was subject to patent restrictions because of its use of the LZW algorithm, while PNG was completely open and patent-free). But there has never been a substitute for GIF’s animated images.
In 1999 there was a broad campaign on the Internet to destroy GIFs, but it was not successful. Such efforts still happen today. If you are interested, you can take a look at https://burnallgifs.org/archives/.
It did not succeed because no other format was as easy to implement while showing silent animation in a loop. There is another, more important reason: GIF can be played without any extra codec, so all browsers can play GIFs.
Although MNG (Multiple-image Network Graphics), the multi-image format built around PNG, was born in 2001 and its features and image quality are much better than GIF’s, it still did not defeat GIF (my personal guess is that it is too complicated: MNG can even add an audio track. If I need sound, why not use MP4?).
GIF was originally created to be used for static images, like today’s PNG and JPEG. However, its structure can be seen as a simplified version of Microsoft PowerPoint. A pptx file is a ZIP package composed of XML and image files. A GIF file is composed of information blocks, control blocks and LZW-compressed raster images. It is a coincidence that Microsoft PowerPoint and GIF were born in the same year. Because you can control the interval and transition between the raster images in a GIF file, it can show animation and motion effects (just as you can use PowerPoint to create an animation, but you can’t call that PowerPoint file a video).
I have an idea, but it all happened so long ago that I can’t verify it. If you know something about what follows, please tell me, thanks!
At that time there was a technology trend: tech companies were developing things like slideshows on computers. So the GIF developers may have wanted not only to display pictures but also to offer some special functions (though it was more likely a company requirement).
GIF can respond to user operations, like a slideshow. The most famous slideshow software was born around that time: PowerPoint, for example, was born in the same year as GIF, 1987, and was first released on Apple’s Macintosh.
If GIF had only been meant to solve the problem of image display, there would have been no need to add these seemingly unnecessary functions.
We should remember that there was no Internet as we know it at that time; computer science was only a little more than 20 years old, many technologies and applications were not yet mature, and personal computers had only just become popular. Some technologies and applications that seem to exist naturally today actually survived fierce competition at that time, such as GIF and the Linux kernel.
So I find this quite dramatic: GIF may have had some extra features added on request, but it is precisely because of these features that it has not been eliminated.
By the way, who first thought that GIFs could be animated?
I hope this helps someone in need~
“LZW and GIF explained”, Steve Blackstock