Data stream mining (also known as stream learning) is the process of extracting knowledge structures from continuous, rapid data records. A data stream is an ordered sequence of instances that, in many applications, can be read only once or a small number of times, using limited computing and storage capabilities.[1]
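The single-pass, limited-memory constraint can be illustrated with a small sketch. The example below is not taken from the cited sources; it uses Welford's method to maintain a running mean and variance in constant memory. The `sensor_readings` generator and its values are hypothetical stand-ins for a real stream.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class RunningStats:
    """Single-pass mean and variance (Welford's method) in O(1) memory."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Each instance is seen once and then discarded.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; reported as 0.0 for fewer than two instances.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def sensor_readings() -> Iterator[float]:
    """Hypothetical stream: a generator can be consumed only once."""
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value


stats = RunningStats()
for reading in sensor_readings():
    stats.update(reading)

print(stats.count, stats.mean, stats.variance)
```

The summary is available at any point in the stream, and memory use does not grow with the number of instances processed.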
In many data stream mining applications, the goal is to predict the class or value of new instances in the data stream given some knowledge about the class membership or values of previous instances in the data stream.[2]
Machine learning techniques can be used to learn this prediction task from labeled examples in an automated fashion.
Often, concepts from the field of incremental learning are applied to cope with structural changes, on-line learning and real-time demands.
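As a rough illustration of incremental learning on a labeled stream, the sketch below trains an online perceptron one instance at a time and evaluates it in a test-then-train fashion, where each instance is first used for prediction and only afterwards for learning. The stream generator, class name, and parameters are assumptions made for this example and do not come from the cited sources.

```python
import random
from typing import Iterator, List, Tuple

Instance = Tuple[List[float], int]  # (features, label in {-1, +1})


def synthetic_stream(n: int, seed: int = 0) -> Iterator[Instance]:
    """Hypothetical labeled stream: the label is the sign of x1 + x2."""
    rng = random.Random(seed)
    produced = 0
    while produced < n:
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        s = x[0] + x[1]
        if abs(s) < 0.2:  # keep a margin so the toy problem is separable
            continue
        produced += 1
        yield x, (1 if s > 0 else -1)


class OnlinePerceptron:
    """Minimal incremental classifier that updates on one instance at a time."""

    def __init__(self, n_features: int) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x: List[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else -1

    def learn_one(self, x: List[float], y: int) -> None:
        # Update the weights only when the current model misclassifies x.
        if self.predict(x) != y:
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.b += y


model = OnlinePerceptron(n_features=2)
correct = 0
total = 0
for x, y in synthetic_stream(2000):
    correct += int(model.predict(x) == y)  # test on the new instance first
    model.learn_one(x, y)                  # then use its label for training
    total += 1

accuracy = correct / total
print(f"test-then-train accuracy: {accuracy:.3f}")
```

Because every instance is scored before the model has seen its label, no separate hold-out set is needed and the model never stores past instances.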
In many applications, especially those operating in non-stationary environments, the distribution underlying the instances or the rules underlying their labeling may change over time; that is, the goal of the prediction (the class or target value to be predicted) may change over time.[3] This problem is referred to as concept drift. Detecting concept drift is a central issue in data stream mining.[4][5] Other challenges[6] that arise when applying machine learning to streaming data include partially labeled and delayed-label data,[7][8] recovery from concept drift,[1] and temporal dependencies.[9]
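One simple way to picture drift detection is to monitor a model's error indicators and compare a recent window against a reference window. The sketch below is an illustrative toy detector written for this purpose, not a specific published method and not one described in the cited sources; the class name, window size, threshold, and synthetic error stream are all assumptions.

```python
from collections import deque
from typing import Deque, Iterator, List


class WindowDriftDetector:
    """Toy detector: flags drift when the recent error rate exceeds the
    reference error rate by more than `threshold`."""

    def __init__(self, window: int = 50, threshold: float = 0.25) -> None:
        self.window = window
        self.threshold = threshold
        self.reference: List[int] = []              # first `window` errors after a (re)start
        self.recent: Deque[int] = deque(maxlen=window)  # sliding window of later errors

    def update(self, error: int) -> bool:
        """Feed one 0/1 error indicator; return True if drift is flagged."""
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False
        ref_rate = sum(self.reference) / self.window
        recent_rate = sum(self.recent) / self.window
        if recent_rate - ref_rate > self.threshold:
            # Restart so that a new reference is built from post-drift data.
            self.reference.clear()
            self.recent.clear()
            return True
        return False


def error_stream() -> Iterator[int]:
    """Hypothetical error indicators: 10% errors, then 50% from index 200 on."""
    for i in range(400):
        if i < 200:
            yield 1 if i % 10 == 0 else 0
        else:
            yield 1 if i % 2 == 0 else 0


detector = WindowDriftDetector(window=50, threshold=0.25)
detections = [i for i, err in enumerate(error_stream()) if detector.update(err)]
print(detections)
```

In this synthetic run the detector raises a single alarm shortly after the error rate changes. A practical system would typically react to such an alarm by resetting or retraining the model, which corresponds to the recovery step mentioned above.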
Examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data.
Data stream mining can be considered a subfield of data mining, machine learning, and knowledge discovery.
Gomes, Heitor M.; Bifet, Albert; Read, Jesse; Barddal, Jean Paul; Enembreck, Fabrício; Pfahringer, Bernhard; Holmes, Geoff; Abdessalem, Talel (2017-10-01). "Adaptive random forests for evolving data stream classification". Machine Learning. 106 (9): 1469–1495. doi:10.1007/s10994-017-5642-8. hdl:10289/11231. ISSN 1573-0565.
Lemaire, Vincent; Salperwyck, Christophe; Bondu, Alexis (2015), Zimányi, Esteban; Kutsche, Ralf-Detlef (eds.), "A Survey on Supervised Classification on Data Streams", Business Intelligence: 4th European Summer School, eBISS 2014, Berlin, Germany, July 6–11, 2014, Tutorial Lectures, Lecture Notes in Business Information Processing, Springer International Publishing, pp. 88–125, doi:10.1007/978-3-319-17551-5_4, ISBN 978-3-319-17551-5
Gomes, Heitor Murilo; Grzenda, Maciej; Mello, Rodrigo; Read, Jesse; Le Nguyen, Minh Huong; Bifet, Albert (2022-02-28). "A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams". ACM Computing Surveys. 55 (4): 1–42. arXiv:2106.09170. doi:10.1145/3523055. ISSN 0360-0300.
Grzenda, Maciej; Gomes, Heitor Murilo; Bifet, Albert (2019-11-16). "Delayed labelling evaluation for data streams". Data Mining and Knowledge Discovery. 34 (5): 1237–1266. doi:10.1007/s10618-019-00654-y. ISSN 1573-756X.
Žliobaitė, Indrė; Bifet, Albert; Read, Jesse; Pfahringer, Bernhard; Holmes, Geoff (2015-03-01). "Evaluation methods and decision theory for classification of streaming data with temporal dependence". Machine Learning. 98 (3): 455–482. doi:10.1007/s10994-014-5441-4. hdl:10289/8954. ISSN 1573-0565.