Scatter Plots

Scatter plots use points, or dots, to represent two different numerical variables in a 2D chart, but they can be graphed in a 3D plane as well. The first variable is independent, while the second variable is dependent on the first. Scatter plots allow one to observe the correlation between the variables.

Shapes, color, and scale show additional variables. The inclusion of a pie graph adds another dimension of information that isn't available on other graph types.
These graphs show the variations of data visualizations that can occur when using a 2D Scatter Plot graph.
The two graphs on the left share the strength of separating the points so that each data point is clear. The two graphs on the right show overplotting, where much of the data is lost; the bottom right graph does solve some overplotting issues by using transparency.
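The effect of transparency on overplotting can be tried out with a short sketch. It assumes matplotlib is available and uses randomly generated points rather than the data behind the figures above; the function name, the point count, and the alpha value of 0.1 are illustrative choices only.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt


def draw_overplotting_demo(n_points=5000, seed=0):
    """Draw the same dense point cloud twice: opaque and semi-transparent."""
    rng = random.Random(seed)
    # Hypothetical data: a tight cluster of normally distributed points.
    xs = [rng.gauss(0, 1) for _ in range(n_points)]
    ys = [rng.gauss(0, 1) for _ in range(n_points)]

    fig, (ax_opaque, ax_alpha) = plt.subplots(
        1, 2, figsize=(8, 4), sharex=True, sharey=True
    )

    # Opaque markers: the centre of the cluster merges into one solid blob.
    ax_opaque.scatter(xs, ys, s=20)
    ax_opaque.set_title("Opaque markers")

    # Semi-transparent markers: overlapping points darken, so density shows.
    ax_alpha.scatter(xs, ys, s=20, alpha=0.1)
    ax_alpha.set_title("Transparent markers (alpha = 0.1)")

    return fig, (ax_opaque, ax_alpha)


if __name__ == "__main__":
    fig, _ = draw_overplotting_demo()
    fig.savefig("overplotting_demo.png")
```

Transparency only helps up to a point: once enough points overlap, the area still saturates to a solid color, which is why the text above says it solves some, not all, overplotting issues.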


Scatter Plot Types:

•Scatter Diagrams with No Correlation: Here the points appear randomly dispersed, making it hard to draw a line through them to estimate the average.

Scatter plot with no correlation.

•Scatter Diagrams with Moderate Correlation: Here the points are more clustered together, making it easier to detect and generalize a relationship within the data set.

Scatter plot with moderate correlation.

•Scatter Diagrams with Strong Correlation: Here the points are clearly clustered together with an apparent relationship between them, making it easy to estimate the average.

Scatter plot with strong correlation.
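One common way to put a number on how tightly the points cluster around a line is the Pearson correlation coefficient, which ranges from -1 to 1 and is near 0 when there is no linear relationship. The sketch below is an illustration only: the text above does not prescribe this measure, the three data sets are synthetic, and the helper names and noise levels are assumptions chosen to mimic the three cases.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def synthetic_sample(slope, noise_sd, n=500, seed=0):
    """Hypothetical data: y = slope * x plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


# More noise relative to the underlying slope means a weaker correlation.
samples = {
    "no correlation": synthetic_sample(slope=0.0, noise_sd=5.0),
    "moderate correlation": synthetic_sample(slope=1.0, noise_sd=5.0),
    "strong correlation": synthetic_sample(slope=1.0, noise_sd=1.0),
}

for name, (xs, ys) in samples.items():
    print(f"{name:>20}: r = {pearson_r(xs, ys):+.2f}")
```

The exact values depend on the random draw, but the magnitude of r should grow from the first sample to the last.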

Each of these correlation types is further categorized as positive or negative. This is decided by the slant of the average, that is, the slope of the trend line through the points.

If the Y value increases as the X value increases, the correlation is positive.

Positive correlation.

If the Y value decreases as the X value increases, the correlation is negative.

Negative correlation.
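The positive/negative distinction can be checked numerically by looking at whether X and Y tend to move above and below their means together. The following is a minimal sketch under that assumption; the function name and the sample values are made up for illustration.

```python
def correlation_direction(xs, ys):
    """Return 'positive', 'negative', or 'none' from the sign of the co-movement."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Positive terms: x and y are on the same side of their means.
    # Negative terms: they are on opposite sides.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "none"


# Hypothetical measurements.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # Y tends to go up as X goes up
falling = [9, 7, 6, 4, 2]   # Y tends to go down as X goes up

print(correlation_direction(xs, rising))   # positive
print(correlation_direction(xs, falling))  # negative
```

This only reports direction, not strength: a weak and a strong positive correlation both return "positive".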


Scatter plots are very good for getting a general average out of a data set: clusters make it easy to see trends in the data, which in turn allows predictions to be made.
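As an illustration of how a trend can be turned into a prediction, the sketch below fits a straight line by ordinary least squares and evaluates it at a new X value. Least squares is one common choice of trend line, not the only one, and the function names and data points here are hypothetical.

```python
def fit_trend_line(xs, ys):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Read the trend line at a given x value."""
    return slope * x + intercept


# Hypothetical observations that roughly follow a rising line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_trend_line(xs, ys)
print(f"trend line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"predicted y at x = 6: {predict(6, slope, intercept):.2f}")
```

Predictions of this kind are only as reliable as the trend itself, and they become less trustworthy the further the new X value lies outside the observed range.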

What makes scatter plots special is their ability to carry many additional variables while still producing understandable graphs. By providing a key, one can vary the shape, scale, hue, saturation, luminance, and/or opacity of the points to depict other relationships in the collected data.

A 3D scatter plot that shows the relationship between three main variables (wind speed, temperature, and solar radiation) and ozone levels in parts per million.
This 2D scatter plot shows the relationship between two main variables (sepal length and sepal width). The graph uses scale, opacity, and color to represent other variables.
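Encoding extra variables in marker size, color, shape, and opacity, as described above, might look like the following in matplotlib. This is a sketch under stated assumptions: the records are invented for illustration (they are not the data behind the figures above), and the function name, marker mapping, and colormap are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt

# Hypothetical records: (x, y, size variable, color variable in 0..1, category)
RECORDS = [
    (1.0, 2.0, 40, 0.1, "a"),
    (2.0, 2.5, 90, 0.4, "a"),
    (3.0, 3.7, 160, 0.8, "a"),
    (1.5, 3.0, 60, 0.3, "b"),
    (2.5, 1.8, 120, 0.6, "b"),
    (3.5, 2.9, 200, 0.9, "b"),
]

# One marker shape per category (the "shape" channel).
MARKERS = {"a": "o", "b": "^"}


def draw_encoded_scatter(records):
    """Plot x against y, using size, color, shape, and opacity for extra variables."""
    fig, ax = plt.subplots()
    points = None
    for category, marker in MARKERS.items():
        rows = [r for r in records if r[4] == category]
        points = ax.scatter(
            [r[0] for r in rows],
            [r[1] for r in rows],
            s=[r[2] for r in rows],   # marker area encodes a third variable
            c=[r[3] for r in rows],   # color encodes a fourth variable
            cmap="viridis",
            vmin=0.0,
            vmax=1.0,                 # shared color scale across categories
            marker=marker,            # shape encodes the category
            alpha=0.6,                # opacity keeps overlapping points visible
            label=f"category {category}",
        )
    # The key: a color bar for the color channel and a legend for the shapes.
    fig.colorbar(points, ax=ax, label="color variable")
    ax.legend(title="Key")
    ax.set_xlabel("x variable")
    ax.set_ylabel("y variable")
    return fig, ax


if __name__ == "__main__":
    fig, _ = draw_encoded_scatter(RECORDS)
    fig.savefig("encoded_scatter.png")
```

Each added channel makes the key longer, so in practice it helps to use only as many encodings as the reader can reasonably decode.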

Pros:

Shows the relationship between two main variables along with additional sub-information. This allows numerous data sets with matching axes to be overlaid onto each other using different plotting point variations.

Non-linear relationships can be graphed as well, giving a wider view of the data and showing that correlation does not always imply causation.

Many variables can be plotted using different axes, hue, saturation, lightness, shapes, size, and transparency.

Easy to identify averages using trend lines.

Cons:

Overplotting makes it hard to decipher points when data is tightly clustered. Large data sets can be hard to visualize because of this.

Flat trend lines provide inconclusive results.

The graph does not provide precise data depictions as values are often rounded off.