At the end of 2018, we began a mission with a large European logistics operator. We conducted a strategic study with this client to evaluate whether it should mobilize its logistics base (vehicles, industrial resources, people) to develop new service activities in France. The study addressed two aspects of feasibility: industrial and economic. From a data point of view, our client was well equipped: with a data lake and the means to capture data in the field, it recorded, for example, all of its logistical actions for the first/last kilometers. These logistical actions alone represented several hundred thousand records per day.
In other words, our client had data in large volumes and at an extremely fine granularity (down to the individual business action): far beyond what we are used to handling in Excel.

New tools for a new life

Force of habit on this type of mission could have led us to dive headlong into our traditional data-manipulation tool: Excel. But we were not that reckless: having understood that the first phase of our daily work would consist of browsing, crossing, enriching, undoing, and redoing data sets of several million lines, we set out to find tools better suited to our needs. As part of our technology watch, we had already identified a potential candidate: Dataiku, a data analysis platform that is easy to use and free
in its demo version (amply sufficient for the needs described above). Dataiku is also the solution we use today for this type of mission. This article is not sponsored, so I will also cite KNIME as an open-source alternative. While Excel is and will remain an excellent data-processing tool for day-to-day management, off-the-shelf solutions such as Dataiku and KNIME are specifically designed for large-volume data analysis. With these tools:

No more machines that crash from computations on more than 100k lines: these solutions do not use (all of) the memory of your machine. When you run processing on your datasets, it is first done
on samples, which give you a preview before launching operations on the full data. Once your cleansing/compute operations are defined, you apply them to all of the data. Since these tools run either on a remote server or on a partition of your machine's memory, the heavy operations that would usually crash your machine are executed here in the background.

No more files lost in the meanders of tree-structured folders: with these tools you work in "projects", where each manipulation step is kept in the history as a dataset accessible in the manipulation tree. This considerably facilitates the sharing and reuse of your processed data
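Outside of these platforms, the same sample-first, full-run-later pattern can be sketched in plain Python with pandas. This is a minimal illustration, not how Dataiku or KNIME work internally; the column names and cleaning steps are hypothetical stand-ins for the client's logistics records:

```python
import io
import pandas as pd

# Hypothetical stand-in for a large export of logistical actions.
csv_data = io.StringIO(
    "timestamp,action_type,parcel_id\n"
    "2018-11-02 08:15:00,pickup,A1\n"
    "2018-11-02 08:17:00,,A2\n"
    "2018-11-02 08:20:00,delivery,A3\n"
)

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Cleaning protocol: defined once, reusable on any sample or chunk."""
    df = df.dropna(subset=["action_type"])  # drop incomplete records
    df = df.assign(timestamp=pd.to_datetime(df["timestamp"]))
    return df

# 1. Preview the protocol on a small sample before committing to a full run.
sample = pd.read_csv(csv_data, nrows=2)
print(clean(sample))

# 2. Apply the same protocol to the full data in bounded-memory chunks.
csv_data.seek(0)
chunks = [clean(chunk) for chunk in pd.read_csv(csv_data, chunksize=2)]
full = pd.concat(chunks, ignore_index=True)
print(len(full))  # rows surviving the cleaning step
```

The point is the separation: the cleaning protocol is defined once as a function, previewed on a sample, then applied chunk by chunk so the full dataset never has to fit in memory at once.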
(Figure: the KNIME manipulation tree)

No need to systematically start from scratch: when you perform operations with Dataiku or KNIME, you are actually defining a data-processing protocol. The same protocol can thus be applied effortlessly to several different data sets. This is particularly useful when you work with structured datasets intended to be updated regularly (repositories, sales data, production data, etc.). I would add that these tools have the good taste to connect to many data sources (JSON, txt, and csv files, SQL databases, etc.) and to offer advanced data-visualization features. They are therefore end-to-end tools for your analyses.

New sources for a new life

It is one thing to
have the tools to manipulate data; it is another to derive value from them. In this case, we were trying to establish the economic and industrial feasibility of new services, which means starting by asking the right questions, for example: for which customer base is my industrial capacity sized? Within my first/last-km logistics, what service windows can I offer my customers? What services do my customers expect? What service niches are of interest to my clients? And so on. Once the questions have been asked, they must be answered. For some of them, our client's own data was sufficient; for others, it needed to be enriched. New data sources enabled this enrichment. Open data has become an