Imagine that humanity has a supercomputer which, based on all the data available about our world, is able to calculate the answer to humanity’s ultimate question: what is the meaning of life? Now imagine that the answer to this question is the number 42; suddenly, everything falls apart. What does this number mean, and how does it relate to the original question? The issues you tackle with your data may be particularly complex, but if the results of your analyses are not clearly understood and shared by the people who are able to turn these lessons into action, all this work is of no use. This step of sharing the results of a data analysis is called data visualization, and it sits at the downstream end of the data value chain.
Data visualization is not a specifically modern “science”, but it must be said that the development and democratization of machine learning give a particularly real flavor to the thought experiment from The Hitchhiker’s Guide to the Galaxy: methods such as random forests or neural networks are indeed black boxes whose cogs are built and assembled freely to produce reliable outputs from known inputs, without the layperson being able to “explain” the process that leads from one to the other.
Joking aside, let’s start with a digression
42 is certainly an absolutely edifying answer, but no human understands it. Since the goal of a data processing job is rarely to get two supercomputers, or two super-experts working in isolation, to talk to each other, Douglas Adams should perhaps have equipped his machine with the tools needed to carry out this crucial step that is visualization. All this to say that producing an effective visualization, that is to say one that is understandable and ideally beautiful, is not enough in itself: today more than ever, when you prepare a data visualization you must also be ready to comment on it, and sometimes to explain concepts that may be complicated, or even new, to your audience.
Some questions to ask yourself to produce a dataviz
Creating a data visualization is undoubtedly the moment when human intelligence is most called upon in the data value chain: you have to show empathy for your audience, cut out the superfluous, summarize at a relevant level, tell a story, and so on. For my part, here are some questions I ask myself when I produce a dataviz:
What type of visualization is most relevant to what I want to represent (histogram, pie chart, line chart, etc.)? Articles that offer selection criteria for one type of chart or another can help here.
Does my data visualization make it possible to read and highlight the information I want to convey? This question arises particularly when you have large amounts of data or data covering many scopes.
Can I imagine a new type of visualization to represent my results? With classic tools like Excel, we quickly run into the limits of what the software offers (it is, for example, difficult to represent results with more than 2 or 3 dimensions or axes of analysis).
Will my data visualization make sense to my audience? This depends in particular on the groupings and breakdowns you choose to represent your data (for example, presenting turnover by production island will not necessarily make sense to the members of a COMEX, i.e. an executive committee).
In his 2020 guide to data visualization, Pierre-Nicolas Schwab suggests 5 levels of maturity in data visualization, which I find relevant and take the liberty of borrowing here.
To convince your audience and mobilize people around your analyses, the ideal is to produce level 4 data visualizations. Even if producing “Data Art” is still not accessible to everyone today (programming knowledge, cost of solutions such as Tableau, Qlik or Power BI, etc.), free and open-access solutions (Google Data Studio, Dataiku, etc.) make it possible to reach levels (2 to 3) that already go well beyond what we are used to with Excel.
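To make the question about groupings a little more concrete, here is a minimal sketch in Python (standard library only) that presents the same turnover figures at two levels of detail. Everything in it is an assumption made for illustration: the business units, the production islands, the figures and the helper names `total_by` and `text_bars` are all invented, and the text bars simply stand in for whatever charting tool you actually use.

```python
from collections import defaultdict

# Hypothetical turnover records: (business unit, production island, turnover
# in thousands of euros). All names and figures are invented for illustration.
RECORDS = [
    ("Packaging", "Island A1", 120),
    ("Packaging", "Island A2", 80),
    ("Packaging", "Island A3", 95),
    ("Logistics", "Island B1", 60),
    ("Logistics", "Island B2", 145),
]


def total_by(records, level):
    """Sum turnover at the chosen grouping level (0 = business unit, 1 = island)."""
    totals = defaultdict(int)
    for record in records:
        totals[record[level]] += record[2]
    return dict(totals)


def text_bars(totals, unit=10):
    """Render totals as a crude text bar chart, largest value first."""
    width = max(len(label) for label in totals)
    lines = []
    for label, value in sorted(totals.items(), key=lambda kv: -kv[1]):
        lines.append(f"{label.ljust(width)} | {'#' * (value // unit)} {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print("Operational view (by production island):")
    print(text_bars(total_by(RECORDS, 1)))
    print()
    print("Executive view (by business unit):")
    print(text_bars(total_by(RECORDS, 0)))
```

The data is identical in both views; only the grouping changes. A plant manager would probably want the first view, while an executive committee is more likely to recognize itself in the second.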