Data deduplication


In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve storage utilization, which may in turn lower capital expenditure by reducing the overall amount of storage media required to meet storage capacity needs. It can also be applied to network data transfers to reduce the number of bytes that must be sent.

The deduplication process requires comparison of data 'chunks' (also known as 'byte patterns'), which are unique, contiguous blocks of data. These chunks are identified and stored during a process of analysis, and compared to other chunks within existing data. Whenever a match occurs, the redundant chunk is replaced with a small reference that points to the stored chunk. Because the same byte pattern may occur dozens, hundreds, or even thousands of times (the match frequency depends on the chunk size), the amount of data that must be stored or transferred can be greatly reduced.[1][2]
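
The chunk-and-reference process described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the fixed 4-byte chunk size, the use of SHA-256 digests as references, and the names `deduplicate` and `reconstruct` are all assumptions made for the example. It also assumes that two chunks with the same digest are identical, which real systems may verify byte for byte.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; tiny so the demo is easy to follow


def deduplicate(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}  # digest -> chunk bytes (the single stored copy)
    refs = []   # ordered digests that stand in for the original chunks
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first occurrence
        refs.append(digest)              # every occurrence becomes a reference
    return store, refs


def reconstruct(store, refs) -> bytes:
    """Rebuild the original data by following the references in order."""
    return b"".join(store[d] for d in refs)


data = b"ABCDABCDABCDWXYZ"
store, refs = deduplicate(data)
print(len(refs), "chunks referenced,", len(store), "chunks stored")
```

In this toy input the pattern `ABCD` occurs three times but is stored once. With chunks this small the digest is larger than the chunk itself; the saving only appears at realistic chunk sizes, where a reference is much smaller than the data it replaces.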

A related technique is single-instance (data) storage, which replaces multiple copies of content at the whole-file level with a single shared copy. While it is possible to combine this with other forms of data compression and deduplication, it is distinct from newer approaches to data deduplication (which can operate at the segment or sub-block level).
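
Whole-file single-instance storage can be sketched as follows. The in-memory `files` mapping, the file names, and the function name `single_instance` are hypothetical and stand in for a real file system; SHA-256 is an assumed choice of content identifier.

```python
import hashlib


def single_instance(files):
    """Keep one shared copy per distinct file content.

    `files` maps a file name to its full content as bytes.
    """
    blobs = {}    # digest -> content (one shared copy)
    catalog = {}  # file name -> digest of the shared copy it points to
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        blobs.setdefault(digest, content)
        catalog[name] = digest
    return blobs, catalog


files = {
    "report.txt": b"quarterly numbers",
    "copy_of_report.txt": b"quarterly numbers",
    "report_v2.txt": b"quarterly numbers!",
}
blobs, catalog = single_instance(files)
print(len(files), "files,", len(blobs), "stored copies")
```

Note that `report_v2.txt` differs by a single byte and therefore gets its own full copy. This is the limitation that segment-level or sub-block deduplication is meant to address.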

Deduplication is different from data compression algorithms, such as LZ77 and LZ78. Whereas compression algorithms identify redundant data inside individual files and encode this redundant data more efficiently, the intent of deduplication is to inspect large volumes of data and identify large sections – such as entire files or large sections of files – that are identical, and replace them with a shared copy.
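
The contrast can be illustrated with a small experiment, offered as a sketch under stated assumptions. Two identical files are filled with pseudo-random bytes, which have little internal redundancy for a compressor to exploit. Python's `zlib` (DEFLATE, an LZ77-based format) is applied to each file separately, and a whole-file deduplication pass is applied across both. The file names and the helper `pseudo_random_bytes` are hypothetical.

```python
import hashlib
import zlib


def pseudo_random_bytes(n: int, seed: bytes = b"seed") -> bytes:
    """Deterministic, high-entropy bytes built by chaining SHA-256 digests."""
    out = b""
    block = seed
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]


payload = pseudo_random_bytes(4096)
files = {"a.bin": payload, "b.bin": payload}  # two identical files

# Per-file compression: each file is encoded on its own, so the duplicate
# copy is still stored in full.
compressed_total = sum(len(zlib.compress(c)) for c in files.values())

# Whole-file deduplication: identical contents collapse to one shared copy.
unique = {hashlib.sha256(c).digest(): c for c in files.values()}
dedup_total = sum(len(c) for c in unique.values())

print("compressed separately:", compressed_total, "bytes")
print("deduplicated:", dedup_total, "bytes")
```

The two techniques are complementary rather than competing: the shared copies left after deduplication can still be compressed individually.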

  1. ^ "Understanding Data Deduplication". Druva. 2009-01-09. Archived from the original on 2019-08-06. Retrieved 2019-08-06.
  2. ^ Reference named "snia" is cited in the text but not defined in the source.

25 related entries for: Data deduplication

Data deduplication

In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve...

Word Count : 2693

Dell EMC Data Domain

Dell EMC Data Domain was Dell EMC’s data deduplication storage system. Development began with the founding of Data Domain, and continued since that company’s...

Word Count : 1344

Criticism of Dropbox

security of their files. At the heart of the complaint was the policy of data deduplication, where the system checks if a file has been uploaded before by any...

Word Count : 3202

Data analysis

matching, identifying inaccuracy of data, overall quality of existing data, deduplication, and column segmentation. Such data problems can also be identified...

Word Count : 9552

Deduplication

The term deduplication refers generally to eliminating duplicate or redundant information. Data deduplication...

Word Count : 84

Data storage

Data storage is the recording (storing) of information (data) in a storage medium. Handwriting, phonographic recording, magnetic tape, and optical discs...

Word Count : 923

Backup

called deduplication. It can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach...

Word Count : 6560

DDR SDRAM

Double Data Rate Synchronous Dynamic Random-Access Memory (DDR SDRAM) is a double data rate (DDR) synchronous dynamic random-access memory (SDRAM) class...

Word Count : 2539

ZFS

require external scripts and software for utilization. Native data compression and deduplication, although the latter is largely handled in RAM and is memory...

Word Count : 9912

5D optical data storage

5D optical data storage (also branded as Superman memory crystal, a reference to the Kryptonian memory crystals from the Superman franchise) is an experimental...

Word Count : 1101

Data redundancy

makes the best possible usage of storage. Data maintenance Data deduplication Data scrubbing End-to-end data protection Redundancy (engineering) Redundancy...

Word Count : 509

Computer data storage

Cloud storage Hybrid cloud storage Data deduplication Data proliferation Data storage tag used for capturing research data Disk utility File system List of...

Word Count : 6491

NTFS reparse point

April 2019). "Introduction to Data Deduplication in Windows Server 2012". Microsoft Tech Community. "Data Deduplication interoperability". docs.microsoft...

Word Count : 2250

ReFS

Data deduplication was missing in early versions of ReFS. It was implemented in v3.2, debuting in Windows Server v1709. Support for alternate data streams...

Word Count : 3144

NetVault Backup

Microsoft Exchange Server, DB2, and Teradata. Quest Software offers data deduplication, and protection for NAS filers (NDMP). NetVault Backup is based on...

Word Count : 1386

XFS

copy-on-write (COW) data, data deduplication, reflink copies, online data and metadata scrubbing, highly accurate reporting of data loss or bad sectors...

Word Count : 3084

Btrfs

between snapshots to a binary stream) Incremental backup Out-of-band data deduplication (requires userspace tools) Ability to handle swap files and swap partitions...

Word Count : 6560

NTFS

January 7, 2024. Rick Vanover (14 September 2011). "Windows Server 8 data deduplication". Archived from the original on 2016-07-18. Retrieved 2011-12-02....

Word Count : 8758

LPDDR

Low-Power Double Data Rate (LPDDR), also known as LPDDR SDRAM, is a type of synchronous dynamic random-access memory that consumes less power and is targeted...

Word Count : 3522

Rolling hash

splitting file streams. Such content-defined chunking is often used for data deduplication. Several programs, including gzip (with the --rsyncable option) and...

Word Count : 2009

String metric

analysis, evidence-based machine learning, database data deduplication, data mining, incremental search, data integration, malware detection, and semantic knowledge...

Word Count : 527

Williams tube

and each Williams tube could typically store about 256 to 2560 bits of data. Because the electron beam is essentially inertia-free and can be moved anywhere...

Word Count : 1661

Ext4

block allocation until data is flushed to disk; in contrast, some file systems allocate blocks immediately, even when the data goes into a write cache...

Word Count : 3427

Magnetic tape

important to enable transferring data. Tape data storage is now used more for system backup, data archive and data exchange. The low cost of tape has...

Word Count : 950

Flash memory

for general storage and transfer of data. NAND or NOR flash memory is also often used to store configuration data in digital products, a task previously...

Word Count : 16884
