Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems.[1] These characteristics can include the size or amount of data, its completeness and correctness, and possible relationships among data elements or among the files and tables in the data.
Data exploration is typically conducted using a combination of automated and manual activities.[1][2][3] Automated activities can include data profiling, data visualization, or tabular reports that give the analyst an initial view into the data and an understanding of its key characteristics.[1]
This is often followed by manual drill-down or filtering of the data to investigate anomalies or patterns surfaced by the automated actions. Data exploration can also require manual scripting and queries against the data (e.g. using languages such as SQL or R) or the use of spreadsheets or similar tools to view the raw data.[4]
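As an illustration of the automated profiling step, the following is a minimal sketch in Python using only the standard library. The function name `profile`, the record layout, and the sample values are hypothetical and chosen for illustration; real profiling tools report many more statistics.

```python
from collections import Counter


def profile(rows):
    """Summarise each column of a list of dict records.

    Treats None and the empty string as missing values (an assumption
    made for this sketch; real datasets may encode missing data differently).
    """
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        summary[col] = {
            "rows": len(values),                      # size of the column
            "missing": len(values) - len(present),    # completeness
            "distinct": len(set(present)),            # cardinality
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return summary


# Hypothetical sample records, not taken from any real dataset.
records = [
    {"city": "Oslo", "temp_c": 4.5},
    {"city": "Lima", "temp_c": 19.0},
    {"city": "Oslo", "temp_c": None},
    {"city": "", "temp_c": 21.5},
]
report = profile(records)
for column, stats in report.items():
    print(column, stats)
```

A summary of this kind gives the analyst the size, completeness, and value ranges of each column before any manual inspection begins.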
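The move from an automated overview to a manual drill-down can be sketched with SQL issued from Python. This is only an illustrative sketch: the in-memory SQLite database, the `orders` table, its columns, and the threshold of 1000 are all assumptions made for the example.

```python
import sqlite3

# Hypothetical table and rows, created in memory purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "north", 120.0),
        (2, "north", 95.0),
        (3, "south", 110.0),
        (4, "south", 9800.0),  # suspiciously large value
        (5, "south", 105.0),
    ],
)

# Overview: one summary row per region, similar to a tabular report.
overview = conn.execute(
    "SELECT region, COUNT(*), AVG(amount), MAX(amount) "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

# Drill-down: the maximum for 'south' looks unusual, so filter to the
# individual rows that exceed an analyst-chosen threshold.
suspects = conn.execute(
    "SELECT id, amount FROM orders "
    "WHERE region = ? AND amount > ? ORDER BY amount DESC",
    ("south", 1000),
).fetchall()

print(overview)
print(suspects)
conn.close()
```

The overview query flags a region whose maximum is far above its typical values, and the filtered query isolates the specific record responsible.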
All of these activities are aimed at creating a mental model and understanding of the data in the mind of the analyst, and defining basic metadata (statistics, structure, relationships) for the data set that can be used in further analysis.[1]
Once this initial understanding of the data has been established, the data can be pruned or refined by removing unusable parts of the data (data cleansing), correcting poorly formatted elements, and defining relevant relationships across datasets.[2] This process is also known as determining data quality.[4]
Data exploration can also refer to the ad hoc querying or visualization of data to identify potential relationships or insights that may be hidden in the data; it does not require assumptions to be formulated beforehand.[1]
Traditionally, this was a key area of focus for statisticians, with John Tukey being a key evangelist in the field.[5] Today, data exploration is more widespread and is the focus of data analysts and data scientists, the latter being a relatively new role within enterprises and larger organizations.
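A minimal sketch of such a cleansing pass is shown below. The field names `name` and `age`, the formatting rules, and the sample rows are hypothetical; which records count as unusable depends on the dataset and the analysis at hand.

```python
def clean(rows):
    """Normalise formatting and drop records that remain unusable.

    Assumes every usable record needs a non-empty name and a
    whole-number age (rules invented for this sketch).
    """
    cleaned = []
    for row in rows:
        # Correct poorly formatted elements: trim whitespace, fix casing.
        name = (row.get("name") or "").strip().title()
        raw_age = str(row.get("age") or "").strip()
        age = int(raw_age) if raw_age.isdigit() else None
        # Remove unusable parts of the data: skip incomplete records.
        if name and age is not None:
            cleaned.append({"name": name, "age": age})
    return cleaned


# Hypothetical raw input with typical formatting problems.
raw = [
    {"name": "  alice smith ", "age": "36"},
    {"name": "Bob Jones", "age": "n/a"},   # unparseable age
    {"name": "", "age": "41"},             # missing name
    {"name": "carol white", "age": " 85 "},
]
tidy = clean(raw)
print(tidy)
```

Comparing the number of rows before and after cleansing also gives a rough indication of data quality.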
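One simple way to look for potential relationships without a prior hypothesis is to compute a correlation for every pair of numeric columns and rank the pairs. The sketch below uses the Pearson correlation coefficient, which is one possible choice among many and captures only linear relationships; the column names and values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical columns, not taken from any real dataset.
columns = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 71, 80],
    "shoe_size": [42, 38, 44, 39, 41],
}

# Rank every pair of columns by the strength of their linear relationship.
ranked = sorted(
    ((abs(pearson(columns[a], columns[b])), a, b)
     for a, b in combinations(columns, 2)),
    reverse=True,
)
for strength, a, b in ranked:
    print(f"{a} vs {b}: |r| = {strength:.3f}")
```

A strongly correlated pair is only a candidate for further investigation, since correlation alone does not establish a causal relationship.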
1. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri. Overview of Data Exploration Techniques. FOSTER Open Science.
2. Kandel, Paepcke, Hellerstein, Heer. Wrangler: Interactive Visual Specification of Data Transformation Scripts. Stanford.edu, 2011.
3. Arnab Nandi; H. V. Jagadish. Guided Interaction: Rethinking the Query-Result Paradigm (PDF). International Conference on Very Large Data Bases (VLDB) 2011.
4. Sean Kandel, Andreas Paepcke, Joseph Hellerstein, Jeffrey Heer. Enterprise Data Analysis and Visualization: An Interview Study. Proc. IEEE Visual Analytics Science & Technology (VAST), Oct 2012. Stanford.edu.
5. Exploratory Data Analysis, Pearson. ISBN 978-0201076165.