In ‘Subtleties of Color’, Robert Simmon discusses the importance of color in data visualization and how effective use of color can help convey information and make a point about a dataset. For instance, the author points to the first images of Mars taken from an interplanetary probe, and notes that color has been used to represent spatial datasets with multiple dimensions on subjects ranging from individual atoms to cosmic background radiation. The writer states “Careful use of color enhances clarity, aids storytelling, and draws the viewer into your dataset. Poor use of color can obscure data, or even mislead.” When discussing the problems with color in data visualization, he emphasizes the difference between the representation of color on screen and the perception of color by the human eye. He explains that one of the biggest problems with using color to represent datasets is that computers display and encode color very differently from the way humans perceive it. Computers use the RGB system to represent colors, while humans tend to interpret colors in terms of three characteristics: lightness (or value), hue, and saturation (or chroma). While the cones in our retinas respond to a broad spectrum of colors, computer displays produce colors by combining a few very narrow frequency bands. Our eyes are also more sensitive to certain colors than to others, and may perceive certain values and hues as brighter than others. This unevenness of color perception has been analyzed and largely addressed by the CIE (the International Commission on Illumination), whose work helps translate color accurately between mediums and ensures consistent change across the entire color palette, so that it becomes easier to represent data accurately using perceivable color ranges.
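To make the gap between RGB values and perceived brightness concrete, here is a minimal Python sketch. It is my own illustration and not taken from the article: it uses the standard library's colorsys module for the hue/lightness/saturation view, and the helper relative_luminance is a hypothetical name that applies the commonly used Rec. 709 weights as a rough stand-in for perceived brightness.

```python
import colorsys

def relative_luminance(r, g, b):
    """Rough perceived brightness of linear RGB values in [0, 1].

    Uses the Rec. 709 weights (an assumption for this sketch). For the pure
    primaries below, encoded and linear values coincide, so no gamma
    conversion is needed.
    """
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

primaries = {"green": (0.0, 1.0, 0.0), "blue": (0.0, 0.0, 1.0)}

for name, rgb in primaries.items():
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    print(name, round(lightness, 2), round(relative_luminance(*rgb), 4))
# prints:
# green 0.5 0.7152
# blue 0.5 0.0722
```

Both colors have the same lightness in the computer's HLS model, yet the luminance estimate says the eye reads green as roughly ten times brighter than blue, which is the kind of mismatch the author describes.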
Based on what the writer defines as a ‘perfect palette’, he emphasizes the need for the steps in a color palette to be consistent across its range, so that the change between any two adjacent steps is perceived as equivalent. Consistent relationships between colors on a scale help preserve the quality of the data and convey differences or variations effectively. He also explains that phenomena such as simultaneous contrast, an optical illusion that makes colors appear different (lighter or darker) depending on the colors they are placed on, must be taken into account in order to avoid misconceptions in the representation of data. To take best advantage of the three characteristics of color, the writer advises a linear and proportional change in lightness accompanied by a simultaneous but subtler change in hue and saturation. In this manner, the change in lightness helps represent patterns in the data, the change in hue makes reading quantities easier, and the change in saturation magnifies contrast.
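The advice about a linear change in lightness with a subtler shift in hue and saturation can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the author's method: it works in the CIE L*C*h color space with a D65 white point, uses the standard sRGB conversion matrix, simply clamps colors that fall outside the displayable range, and the function names and default lightness, chroma and hue ranges are hypothetical choices.

```python
import math

# D65 reference white (assumed; the usual white point for sRGB displays).
WHITE = (0.95047, 1.0, 1.08883)

def _f_inv(t):
    """Inverse of the CIELAB nonlinearity."""
    delta = 6 / 29
    return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

def _encode(c):
    """Linear light -> sRGB-encoded value, clamped to the displayable range."""
    c = min(1.0, max(0.0, c))
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def lch_to_hex(lightness, chroma, hue_deg):
    """Convert a CIE L*C*h color to an sRGB hex string."""
    a = chroma * math.cos(math.radians(hue_deg))
    b = chroma * math.sin(math.radians(hue_deg))
    fy = (lightness + 16) / 116
    x = WHITE[0] * _f_inv(fy + a / 500)
    y = WHITE[1] * _f_inv(fy)
    z = WHITE[2] * _f_inv(fy - b / 200)
    red = 3.2406 * x - 1.5372 * y - 0.4986 * z
    green = -0.9689 * x + 1.8758 * y + 0.0415 * z
    blue = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return "#" + "".join(f"{round(255 * _encode(c)):02x}" for c in (red, green, blue))

def sequential_palette(n, light=(25, 90), chroma=(35, 25), hue=(270, 100)):
    """Return n hex colors with equal lightness steps and a gentler hue/chroma drift."""
    if n < 2:
        raise ValueError("a palette needs at least two steps")
    colors = []
    for i in range(n):
        t = i / (n - 1)  # position along the ramp, 0.0 to 1.0
        colors.append(lch_to_hex(
            light[0] + t * (light[1] - light[0]),
            chroma[0] + t * (chroma[1] - chroma[0]),
            hue[0] + t * (hue[1] - hue[0]),
        ))
    return colors

print(sequential_palette(7))
```

Because lightness is interpolated linearly in a perceptual space rather than in RGB, each step from dark to light should look about equally large, while hue and chroma change more quietly in the background.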
Further in the series, the writer also talks about how the choice of color palette may vary based on the type of data being represented. For instance, sequential data is best represented using color palettes that have equal steps of variation from light to dark or vice versa. Divergent data, on the other hand, is better represented by a palette built from two sequential palettes, each with its own changes in hue and saturation across the values. Divergent data is often shown diverging from a central data point, such as temperatures relative to an average temperature or profits and losses in the stock market, and thus it is often more effective to use color to represent the increases and decreases on either side of that point. For bipolar data, the two hues that are used should vary from a central neutral color, as this aids the proper perception of the changes in the data on either side. Divergent palettes are often harder to design because similarities in lightness may make data impossible to read for people with color blindness. Categorical, or qualitative, data uses color to separate areas into distinct categories; each color is usually associated with a specific category, which makes such data easier to read. For larger sets of data with more categories, it is often helpful to use additional elements such as symbols, textures, patterns and labelled elements.
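The idea of a divergent palette anchored on a neutral central color can be shown with a small Python sketch. It is my own illustration with assumed details: the function name diverging_color is hypothetical, the seven hex values are simply an example of a blue-to-red scheme with a near-white midpoint, and the temperatures are made-up inputs.

```python
# Example blue -> neutral -> red palette (hex values chosen for illustration).
PALETTE = ["#2166ac", "#67a9cf", "#d1e5f0", "#f7f7f7", "#fddbc7", "#ef8a62", "#b2182b"]

def diverging_color(value, center, max_dev, palette=PALETTE):
    """Pick a color for value based on its signed distance from center.

    Values at center map to the neutral middle color; values max_dev or more
    away map to the end colors. The scale is symmetric, so equal increases and
    decreases land the same number of steps from the middle.
    """
    if len(palette) % 2 == 0:
        raise ValueError("palette needs an odd length so it has a neutral middle")
    half = len(palette) // 2
    # Scale the deviation to [-1, 1], clamping values beyond max_dev.
    t = max(-1.0, min(1.0, (value - center) / max_dev))
    return palette[half + round(t * half)]

# Hypothetical temperatures compared with a 15 degree average.
for temp in (5, 12, 15, 18, 25):
    print(temp, diverging_color(temp, center=15, max_dev=10))
```

Keeping the scale symmetric around the center is the important design choice here: if the two sides were stretched separately, the neutral color would no longer mark the average and equal deviations would no longer look equal.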
The writer also lays great emphasis on connecting color to meaning in various ways. Sometimes, complicated conventional color palettes such as those used in scientific visualization may not be easily understood by a general audience, and it is therefore better to use color palettes that are widely recognized and that cater to people’s general associations of color based on culture or nature. For instance, representing the ocean with blue and tree cover with green is more likely to be understood by the majority of an audience, since these colors are conventionally associated with these elements. Layering datasets that communicate different but supporting information is extremely informative, and using color palettes that differ in hue and saturation, such that one set of colors is more muted than the other, can be extremely helpful in conveying the information accurately. For datasets that have a specific breakpoint or a drastic difference within a range of values, it is useful to keep the change in lightness consistent but use a sudden change in hue or saturation, perhaps a contrasting color, to mark the area of drastic change. The author also states that areas that do not represent any data should be treated as background and shown in shades of grey, white or black, so that they can easily be differentiated from the areas that represent data points. Sometimes, differences in the data points, in the range they cover, or in the time period can change the way the colors convey the data. For instance, the distinction that color draws between foreground and background may shift when some aspect of the data changes.
Finally, the writer suggests ColorBrewer as a useful tool for creating suitable color palettes for maps and other data visualization graphics. After reading this article, I realized that the use of color in data visualization is far more important than I thought. Very often, my choices of color for graphs and maps have been rather arbitrary, because I saw color only as a distinguishing factor that helps separate one shape, line or bar from another. However, I realize now that color can carry significant underlying meaning, and that making full use of its different characteristics (lightness, hue and saturation) is extremely important. It is just as important to understand where each of these aspects can be used in data visualization, and how differences in value and hue can be translated efficiently into differences in a particular dataset. Another important takeaway from this article is that color is extremely subjective, so it is important to know how a particular target audience perceives color in order to use it appropriately in data visualization.