Working in a data-driven world

Antoine Jeanjean, contact@opt2a.com

For several years now, the media and conference speakers have talked a great deal about Artificial Intelligence applied to all kinds of disciplines. The term often covers many different fields, which makes the whole 'Data Market' hard to understand. Here is a proposed classification of these disciplines; it has already been presented at several conferences, where it prompted some debate (and adjustments!). HR departments and managers must recruit new profiles that correspond to these disciplines. Students must choose a future career among all of these options, whose characteristics are often very different. I hope this article will help them better understand these professions.

This article was published on LinkedIn in French on November 11, 2018 - All rights reserved




The proposed classification is split into 6 DATA SCIENCE professions. It is a subjective view that can certainly be improved. To contact me, please send a message to contact@opt2a.com.



# 1 - DATA SCRAPING


The first discipline brings together the professions around Data Scraping. Depending on the company, scraping may consist of installing physical trackers (sensors, cameras, drones, ...) or digital trackers (web tracking, cookies, opt-in, ...). Data can also be retrieved by crawling, i.e. using software tools that capture public data available either as open data or on public websites (a technique used in particular by search engines to index public content). In companies or administrations, data scraping can also mean unifying data that is scattered across departments, or not yet digitized, in order to centralize it in a database. This consolidation often requires establishing exchange standards.


- Profiles: IT Engineer / Developer

- Keywords: trackers, crawling, consolidation, parsing.
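As an illustration of the crawling and parsing step described above, here is a minimal sketch that extracts the links from a page so that a crawler could visit them next. It uses only the Python standard library. The page content, the `example.org` URLs and the names `LinkCollector` and `extract_links` are assumptions made up for this example; a real crawler would also download the pages and respect each site's terms of use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML document and return the links it contains."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real crawler would download it first.
SAMPLE_PAGE = """
<html><body>
  <a href="/datasets/stops.csv">Bus stops</a>
  <a href="https://example.org/open-data">Open data portal</a>
  <a>No target</a>
</body></html>
"""

links = extract_links(SAMPLE_PAGE, "https://example.org/index.html")
```

Each collected link can then be queued for download, which is how a crawler moves from page to page.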



# 2 - DATA STRUCTURING


Once this data has been retrieved, it has to be stored in an orderly way. We talk about data structuring or data management. Indexing techniques are used to move from raw data to structured data. Since these processes often have to run on a regular basis (every minute, hour, day, week, month, ... depending on the activity), ETL (Extract, Transform, Load) tools are used, i.e. algorithms (or software) that carry out massive synchronizations of information from one data source to another, labeling the data along the way. Think of a library that receives a semi-trailer of books at its door at regular intervals: the books must be classified before they are placed on the shelves made available to the public, so that readers can easily and logically find the content they expect. Some learning algorithms improve the structuring of data within data warehouses by gradually reducing the error. These topics relate to Machine Learning applied to classification. When the volumes of data are enormous, it becomes necessary to use distributed and scalable applications while ensuring a suitable storage structure (Data Lake).


- Profiles: Data Engineer

- Keywords: ETL, Indexing, Structured Data, Data Management, Learning Algorithm, Classification, Machine Learning
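To make the ETL idea more concrete, here is a minimal sketch in the spirit of the library analogy: a raw delivery of books is read, cleaned and stored in a structured table. The data, the `books` table and the function names are assumptions invented for this example; real ETL tools handle far larger volumes, scheduling and error recovery.

```python
import csv
import io
import sqlite3

# Hypothetical raw export, as it might arrive from another system.
RAW_EXPORT = """title;author;year
 The Stranger ;Albert Camus;1942
Germinal;Émile Zola;1885
;Unknown;n/a
"""


def extract(raw_text):
    """Extract: read the raw semicolon-separated export into dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text), delimiter=";"))


def transform(rows):
    """Transform: trim text, convert types and drop unusable records."""
    clean = []
    for row in rows:
        title = row["title"].strip()
        year = row["year"].strip()
        if not title or not year.isdigit():
            continue  # skip records without a title or a valid year
        clean.append((title, row["author"].strip(), int(year)))
    return clean


def load(records, connection):
    """Load: write the records into a structured table, replacing older copies."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS books "
        "(title TEXT PRIMARY KEY, author TEXT, year INTEGER)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO books (title, author, year) VALUES (?, ?, ?)",
        records,
    )
    connection.commit()


def run_etl(raw_text, connection):
    """Run the three steps in order; safe to repeat at regular intervals."""
    load(transform(extract(raw_text)), connection)
```

Because the load step replaces existing rows instead of duplicating them, the same job can be run every hour or every day without corrupting the table.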


# 3 - BUSINESS INTELLIGENCE


Once this data is stored, the aim is to distribute it within the organization. What good is having data if it is not used to inform those who want to know more? For this, we generally use Business Intelligence tools, which make it possible to build dashboards presenting key performance indicators (KPIs), graphs, data tables and performance reports. To make them effective, we must think about how the data is organized and find the right indicators, so that they are not drowned in the mass of information. We must also think about security, the management of access rights and data quality. Running verification scripts may be essential to avoid disseminating Fake Data (the consequences of which may be more or less serious depending on the activity). Since data is now often the raw material for managerial decisions, the trust that the Chief Data Officer manages to build across the company in the data being disseminated is essential.


- Profiles: Data Engineer, Database Administrator

- Keywords: Business Intelligence, Dashboard, KPI, Security, High Performance.
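The verification scripts mentioned above can be very simple. Here is a minimal sketch that checks a batch of sales records before a revenue indicator is computed for a dashboard. The field names (`order_id`, `amount`) and the two rules are assumptions chosen for the example; each organization would define its own checks.

```python
def check_sales_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        order_id = row.get("order_id")
        amount = row.get("amount")
        if order_id is None:
            problems.append(f"row {index}: missing order_id")
        elif order_id in seen_ids:
            problems.append(f"row {index}: duplicate order_id {order_id}")
        else:
            seen_ids.add(order_id)
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append(f"row {index}: invalid amount {amount!r}")
    return problems


def total_revenue(rows):
    """Compute the KPI only if the data passes every check."""
    problems = check_sales_rows(rows)
    if problems:
        raise ValueError("; ".join(problems))
    return sum(row["amount"] for row in rows)
```

Refusing to publish an indicator when a check fails is one possible design choice; another is to publish it with a visible warning.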


# 4 - DATA ANALYSIS


In parallel with its dissemination in dashboards, structured data may need to be analyzed in order to create new data, known as Rich Data: forecasts, indicators, quotes, etc. These studies consist of analyzing structured historical data using advanced statistical methods and tools. Starting from one part of the past data (the training set), we build algorithms, laws or probabilities, which we then verify on another part of the past data (the test set) in order to refine the model. When the model is adaptive and continues to evolve on its own, becoming more and more precise as new data is received, we speak of Machine Learning. Data analysis sometimes requires distributed computing tools when the volumes of data are too large, since a single processor is not sufficient to handle data of this size. Decomposition may also be used to improve the analysis.


- Profiles: Data Scientists, Data Analysts, Statisticians

- Keywords: Data analysis, forecasts, statistics, correlations, Machine Learning, R, Python
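The training set / test set principle described above can be sketched in a few lines. In this example, a straight line is fitted on the older part of a sales history and then checked against the most recent weeks. The sales figures are invented, and a simple least-squares line stands in for whatever statistical model a real study would use.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept on (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, points):
    """Average gap between the model's forecasts and the observed values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


# Hypothetical history: (week number, units sold).
HISTORY = [(1, 12), (2, 14), (3, 17), (4, 19), (5, 22), (6, 24), (7, 27), (8, 29)]

training_set = HISTORY[:6]  # older data, used to build the model
test_set = HISTORY[6:]      # most recent data, held back for verification

model = fit_line(training_set)
error = mean_absolute_error(model, test_set)
```

If the error on the test set is much larger than on the training set, the model has probably memorized the past instead of capturing a trend, and it needs to be refined.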


# 5 - OPERATIONS RESEARCH


Operations Research is a discipline based on mathematical methods that help make better decisions. It involves modeling and solving an optimization problem, which seeks to maximize an objective while respecting a set of constraints. In companies, it very often takes the form of decision support tools. Unlike Artificial Intelligence, Operations Research software does not make the decision for humans; it gives users access to all the data (raw or enriched) they need to make their decision. The main themes include planning (personnel, resources, projects, advertising, services, ...), pricing (yield, adaptive pricing, complex pricing, ...), simulation (learning methods, statistical methods, ...) and, more generally, algorithms (distributed computing, high performance, heuristics, ...). Because these tools depend on data, Operations Research generally comes into play only once phases 1 to 4 have been correctly implemented in the organization: it is difficult to set up decision support software if the data (geographic, stock, sales, personnel, ...) is not available and structured.


- Profiles: Operations Research Engineer (More info: www.roadef.org), Revenue Manager, IT Engineer.

- Keywords: Planning, Optimization, Decision Support, Simulation, Algorithm, Yield, Revenue Management, Pricing
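To show what "maximize an objective while respecting a set of constraints" looks like in practice, here is a minimal sketch of a tiny production planning problem. All the figures (margins, hours, wood) are invented for the example, and the exhaustive search used here only works for toy problems; industrial Operations Research tools rely on dedicated solvers and heuristics.

```python
from itertools import product

# Hypothetical data: margin per unit and resource use per unit.
MARGIN = {"chairs": 30, "tables": 50}
HOURS_NEEDED = {"chairs": 2, "tables": 4}
WOOD_NEEDED = {"chairs": 3, "tables": 2}
HOURS_AVAILABLE = 40
WOOD_AVAILABLE = 30


def is_feasible(plan):
    """Constraints: the plan must fit within the available hours and wood."""
    hours = sum(HOURS_NEEDED[item] * qty for item, qty in plan.items())
    wood = sum(WOOD_NEEDED[item] * qty for item, qty in plan.items())
    return hours <= HOURS_AVAILABLE and wood <= WOOD_AVAILABLE


def objective(plan):
    """Objective: the total margin of the plan."""
    return sum(MARGIN[item] * qty for item, qty in plan.items())


def best_plan(max_quantity=20):
    """Try every combination of quantities and keep the best feasible one."""
    names = list(MARGIN)
    best = None
    for quantities in product(range(max_quantity + 1), repeat=len(names)):
        plan = dict(zip(names, quantities))
        if is_feasible(plan) and (best is None or objective(plan) > objective(best)):
            best = plan
    return best
```

In a decision support tool, the plan returned by `best_plan()` would be shown to the planner as a recommendation, together with the data behind it, and the human would keep the final say.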


# 6 - ARTIFICIAL INTELLIGENCE


Artificial Intelligence, in the original sense of the term, covers the disciplines that go beyond Decision Support, with the machine making decisions without human intervention. The methods used are rule engines or more complex algorithms that may rely on automata based on neural networks (classifiers), on general problem solving and on predicate logic. In all cases, these algorithms are first based on data studies (machine learning, deep learning, etc.) and on human configuration.

The main known applications are in Finance and Banking (automated trading), Logistics (robotics and automated task processing), Transport (automated driving), Video Games (computer-controlled opponents) and Customer Service (chatbots).

Currently, fields such as the Military, Medicine and Law are more focused on Simulation, Analysis or Decision Support, with tools that support humans instead of replacing them. Initiatives are also emerging in Art (creating works by applying a style) and Computer Programming (automatic coding of simple applications).


- Profiles: Artificial Intelligence Engineer (More info: www.afia.org), Robotics, Computer Engineer.

- Keywords: Robotics, Neural Networks, Automation, Chatbot.
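The simplest of the methods mentioned above, the rule engine, can be sketched as an ordered list of rules in which the first matching rule decides on its own, with no human in the loop. The rules, thresholds and order fields below are assumptions invented for the example; they illustrate the principle, not a production system.

```python
# Each rule is (name, condition, decision); the first matching rule wins.
# The thresholds are hypothetical and would come from human configuration.
RULES = [
    ("out_of_stock", lambda order: order["stock"] < order["quantity"], "reject"),
    ("unpaid_large_order", lambda order: order["amount"] > 1000 and not order["paid"], "hold"),
    ("default", lambda order: True, "accept"),
]


def decide(order, rules=RULES):
    """Return the decision and the name of the rule that produced it."""
    for name, condition, decision in rules:
        if condition(order):
            return decision, name
    raise LookupError("no rule matched")
```

Returning the name of the rule along with the decision keeps the automated choice explainable, which matters once no human validates each case.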


GOVERNANCE


To complete this presentation of the disciplines and profiles along the data chain: the person who directly 'orchestrates' all of these activities is sometimes the Chief Technology Officer (CTO) or the Chief Data Officer (CDO). Decision Support topics are also sometimes managed by the Director of Revenue Management, the Production Director, the Logistics Director or the Sales Director. Newer topics related to 'Artificial Intelligence' are orchestrated by the R&D Director or the CTO, depending on the organization.


Please feel free to comment by email to discuss your own company or organization.



Antoine Jeanjean, contact@opt2a.com
