
Digital Watermarking - Steering the future of security
By A. K. Vanwasi

Online content delivery faces massive hurdles in the absence of a secure framework for protecting valuable data. Digital watermarking—a technology that can be used for control, media identification, tracing and protecting content owner's rights—provides the solution

The business of delivering and distributing multimedia products online, or on CDs and other removable disks, faces huge obstacles because users can make unlimited perfect copies and manipulate the content. Digital watermarking is the technology used for copy control, media identification, tracing and protecting the content owner's rights.

The Internet is an open network that is increasingly used to deliver digital multimedia content. In digital format, content is expressed as streams of ones and zeros that can be transported flawlessly. The content can be copied perfectly any number of times, and a user can also manipulate the files. Good business sense therefore requires two mechanisms: content protection and secure transport over the Internet.

The content protection mechanism attempts to protect the rights of the content creator, distributor and user. The content owner deposits a unique description of the original with a neutral registration authority. This unique description may be a hash value or a textual description. The registration authority then allots a unique identification number to the content and archives the description and the number for future reference. The unique identification number is also conveyed to the content owner.
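The registration step can be sketched in a few lines of Python. This is only an illustration: the in-memory registry, the function names and the choice of SHA-256 as the hash are assumptions made for the example, not details given in the article.

```python
import hashlib
import itertools

# Hypothetical in-memory archive: registration number -> content digest.
_registry = {}
_next_id = itertools.count(1)

def register_content(content: bytes) -> int:
    """Archive a SHA-256 digest of the content and return a new registration number."""
    digest = hashlib.sha256(content).hexdigest()
    reg_no = next(_next_id)
    _registry[reg_no] = digest
    return reg_no

def matches_registration(reg_no: int, content: bytes) -> bool:
    """Check whether content is bit-for-bit what was deposited under reg_no."""
    return _registry.get(reg_no) == hashlib.sha256(content).hexdigest()
```

A hash identifies only an exact copy; a single changed bit gives a different digest, which is why the watermark described next is still needed to recognise processed copies.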

The content owner derives suitable parameters, usually a digital watermark, from this unique identification number. This digital watermark is securely and secretly merged with the original content itself. The quality of the watermarked content is only minimally degraded. When required, the content owner can prove the origin of the content by extracting the watermark from the watermarked content.

In addition to secretly embedding a watermark in content, a content owner/distributor can attach a 'label' that is related to a unique identification number. This label is a public notice that informs a user about the 'Intellectual Property Rights' (IPR) of the content.

The second aspect is the secure transportation of copyright-protected content over the Internet. This requires a secure channel between the two end-points of the transport. Cryptography is an effective solution for the secure transport and distribution of copyright-protected content. Implementing a cryptographic scheme requires specialized hardware and a key management system. Cryptography prevents eavesdropping on and manipulation of copyrighted content during transport over the Internet.

A general framework for secure distribution of digital multimedia contents involves:

  • Allocation of a unique registration number by a registration authority to the content owner for the said contents.
  • Deriving a digital watermark and securely and secretly merging it with the contents to claim ownership rights.
  • Attaching labels to declare that the contents are covered by copyright protection. This is a public warning against copying/manipulation.
  • Allocation of access rights, such as a license, to a user.
  • Secure transport of protected contents using 'Public Key Infrastructure' (PKI).
  • Monitoring traffic during distribution over the Internet so as to prevent any copyright infringement.

Functional requirements of watermarking
A watermark should convey as much information as possible, which implies that the watermark data rate should be high. A watermark should also be secret and accessible to authorized parties only. This requirement is known as the security of the watermark, and it is achieved by the use of cryptographic keys.

A watermark is an integral part of data. It must persist even after signal processing and data manipulation. This also includes malicious manipulation that attempts to remove the watermark. This requirement is known as 'robustness requirement'.

A watermark, though irremovable, should also be imperceptible. It should not noticeably alter the quality of the content. Normally, the degradation in quality is well below one percent.

Depending on the scheme, the watermark recovery process may or may not be allowed to use the original, unwatermarked content.

Generic model of watermarking scheme
The generic model (see Fig. 1) shows how a digital watermarking scheme functions. The first part shows how to create watermarked data. The second part shows how to recover the watermark from test data.

A content owner approaches a neutral registration authority. Depending on the nature of multimedia content, the authority allots a unique registration number. It also archives content and unique registration number for future reference.

The content owner generates a suitable watermark, which can be embedded within the data. Such a watermark should be unobtrusive and secure. To ensure that the watermark is imperceptible, the watermark signal amplitude should be relatively small (about one percent) compared to the average amplitude of the content.

To ensure the security of the embedded digital watermark, one or several secret and cryptographically secure keys have to be used.

To ensure robustness against data manipulation and processing, it is helpful to keep the digital watermark very small and to distribute it redundantly throughout the host data. Thus, a small sample of the watermarked data is enough to extract the digital watermark.
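The benefit of redundancy can be illustrated with a toy model in which the watermark bits are simply repeated and later recovered by majority vote. The function names and the repetition layout are assumptions for this sketch; practical schemes spread the bits through the host signal in more elaborate ways.

```python
def spread_redundantly(bits, copies):
    """Repeat the whole watermark `copies` times so every region carries it."""
    return list(bits) * copies

def recover_by_majority(received, length):
    """Majority-vote each watermark bit over all copies present in `received`.

    `received` is assumed to start on a copy boundary.
    """
    recovered = []
    for i in range(length):
        votes = received[i::length]          # every copy of bit i
        recovered.append(1 if 2 * sum(votes) > len(votes) else 0)
    return recovered

watermark = [1, 0, 1, 1]
carried = spread_redundantly(watermark, copies=5)
for damaged in (0, 5, 10):                   # simulate processing damage
    carried[damaged] ^= 1
print(recover_by_majority(carried, len(watermark)))
```

In this example each damaged bit is outvoted by its four intact copies, and a fragment holding only the last three copies still yields the full watermark.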

The digital watermark, the public/private key and the host data are processed by the watermarking algorithm to generate the watermarked data.

To extract (detect) the watermark, the authorized agency requires the watermark and/or the original host data, the secure/public key and the test data.

All these inputs are processed by the watermark recovery program to extract the watermark or a confidence measure. The confidence measure indicates how closely the recovered watermark matches the original watermark.
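The article does not define the confidence measure. One simple possibility, assumed here purely for illustration, is the fraction of bit positions on which the original and recovered watermarks agree.

```python
def confidence(original_bits, recovered_bits):
    """Fraction of bit positions on which the two watermarks agree (0.0 to 1.0)."""
    if len(original_bits) != len(recovered_bits):
        raise ValueError("watermarks must have the same length")
    if not original_bits:
        raise ValueError("watermarks must not be empty")
    agree = sum(a == b for a, b in zip(original_bits, recovered_bits))
    return agree / len(original_bits)

print(confidence([1, 0, 1, 1], [1, 0, 0, 1]))  # three of four bits agree
```

A detector built on such a measure would declare the watermark present when the value exceeds a chosen threshold; for unrelated random bits the value hovers around 0.5.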

Spread spectrum technique and watermarking
Many digital watermarking schemes deploy spread-spectrum communication. In such schemes a watermark is embedded by adding a pseudonoise (PN) signal. This PN signal functions as a secret key.

This specific PN signal can later be detected by a correlation receiver or matched filter. The probability of false-positive or false-negative detection can be made low by choosing an appropriate amplitude and number of added samples.

It is also possible to subtract the PN signal from the host data. In this case, the correlation receiver will calculate a high negative correlation in the detection process. Thus, by choosing between addition and subtraction it is possible to convey one bit of information. By embedding several such bits in sequence, it is possible to convey arbitrary information.
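The add-or-subtract idea can be sketched as follows. The sketch assumes a zero-mean host signal and derives the PN signal from an integer key with Python's `random` module; the function names, the key values and the amplitude are illustrative choices, not part of any particular watermarking product.

```python
import random

def pn_signal(key, length):
    """Pseudonoise of +1/-1 samples derived from a secret integer key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed_bit(host, bit, key, amplitude=1.0):
    """Add the PN signal to convey bit 1, subtract it to convey bit 0."""
    sign = 1 if bit else -1
    pn = pn_signal(key, len(host))
    return [h + sign * amplitude * p for h, p in zip(host, pn)]

def correlate(data, key):
    """Correlation receiver: average product of the data and the keyed PN signal."""
    pn = pn_signal(key, len(data))
    return sum(d * p for d, p in zip(data, pn)) / len(data)

def detect_bit(data, key):
    """Positive correlation means the PN signal was added, negative means subtracted."""
    return 1 if correlate(data, key) > 0 else 0

# Stand-in for host data: zero-mean noise, much stronger than the watermark.
rng = random.Random(0)
host = [rng.gauss(0, 10) for _ in range(20000)]
marked = embed_bit(host, 1, key=42)
print(detect_bit(marked, key=42))
```

With the correct key the correlation settles close to plus or minus the embedding amplitude, while a wrong key gives a value near zero. Using more samples shrinks the contribution of the host, which is how the error probabilities mentioned above are controlled.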

Types of watermark
Visible watermarks: Visible watermarks are an extension of the concept of logos. Such watermarks are applicable to images only. These logos are inlaid into the image but they are transparent. Such watermarks cannot be removed by cropping the center part of the image. Further, such watermarks are protected against attacks such as statistical analysis.

The drawbacks of visible watermarks are that they degrade the quality of the image and can be detected by visual means only. Thus, it is not possible to detect them with dedicated programs or devices. Such watermarks have applications in maps, graphics and software user interfaces.

Invisible watermark: An invisible watermark is hidden in the content. It can be detected by an authorized agency only. Such watermarks are used for content and/or author authentication and for detecting unauthorized copiers.

Public watermark: Such a watermark can be read or retrieved by anyone using the specialized algorithm. In this sense, public watermarks are not secure. However, public watermarks are useful for carrying IPR information. They are good alternatives to labels.

Fragile watermark: Fragile watermarks are also known as tamper-proof watermarks. Such watermarks are destroyed by data manipulation.

Private watermark: Private watermarks are also known as secure watermarks. To read or retrieve such a watermark, it is necessary to have the secret key.

Perceptual watermarks: A perceptual watermark exploits the aspects of human sensory system to provide invisible yet robust watermark. Such watermarks are also known as transparent watermarks that provide extremely high quality contents.

Bit-stream watermarking: The term is sometimes used for watermarking of compressed data such as video.

Text document watermarking
A text document is a discrete information source. In discrete sources, the contents cannot be modified. Thus, generic watermarking schemes are not applicable. The two approaches to text watermarking are hiding the watermark information in the semantics and hiding it in the text format.

In semantic-based watermarking, the text is designed around the message to be hidden. Thus, misleading information covers watermark information. Such techniques defy scientific approach.

By text format, we mean layout and appearance. Commonly used techniques to hide watermark information are line shift coding, word shift coding and feature coding.

In line shift coding, single lines of the document are shifted upward or downward by very small amounts. The watermark information is encoded in the direction in which the lines are shifted. Watermark recovery is simple because the line spacing in normal text is uniform.
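A minimal sketch of line shift coding, working on a list of baseline positions rather than a real page layout. The convention that every second line carries one bit while its neighbours stay put as references, and the spacing and shift values, are assumptions made for this example.

```python
def encode_line_shifts(num_lines, bits, spacing=12.0, shift=0.3):
    """Return baseline positions (growing downward); odd-indexed lines carry bits.

    A carrier line is moved down for bit 1 and up for bit 0.
    """
    positions = [i * spacing for i in range(num_lines)]
    carriers = range(1, num_lines - 1, 2)
    if len(bits) > len(carriers):
        raise ValueError("not enough lines to carry the watermark")
    for line, bit in zip(carriers, bits):
        positions[line] += shift if bit else -shift
    return positions

def decode_line_shifts(positions, num_bits):
    """Compare the gaps above and below each carrier line; no original is needed."""
    bits = []
    carriers = list(range(1, len(positions) - 1, 2))[:num_bits]
    for line in carriers:
        gap_above = positions[line] - positions[line - 1]
        gap_below = positions[line + 1] - positions[line]
        bits.append(1 if gap_above > gap_below else 0)
    return bits

page = encode_line_shifts(9, [1, 0, 1, 1])
print(decode_line_shifts(page, 4))
```

Because the decoder only compares neighbouring gaps, it keeps working when the whole page is scaled uniformly, for example by a photocopier.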

In word shift coding, words are shifted horizontally in order to modify the spacing between consecutive words. While detecting the watermark, the original word spacing data is required because normally word spacing is variable.
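Word shift coding can be sketched in the same spirit. Here the horizontal start positions of the words on one line are given as a list, and the decoder needs the original positions because natural word spacing is not uniform. The names and numbers are illustrative assumptions.

```python
def encode_word_shifts(word_x, bits, shift=0.5):
    """Shift word i (i >= 1) right for bit 1 and left for bit 0; word 0 is the anchor."""
    if len(bits) > len(word_x) - 1:
        raise ValueError("not enough words to carry the watermark")
    marked = list(word_x)
    for i, bit in enumerate(bits, start=1):
        marked[i] += shift if bit else -shift
    return marked

def decode_word_shifts(marked_x, original_x, num_bits):
    """Recover the bits by comparing each word position with the original layout."""
    return [1 if marked_x[i] > original_x[i] else 0
            for i in range(1, num_bits + 1)]

original_x = [0.0, 31.0, 58.5, 102.0, 120.0]   # made-up word positions
marked_x = encode_word_shifts(original_x, [1, 0, 1])
print(decode_word_shifts(marked_x, original_x, 3))
```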

In feature coding, features of some characters are modified. In a typical case, the lengths of the vertical strokes of characters like b, d and h are modified. While detecting the watermark, the original lengths are known.

The formatted-text method of watermarking can be defeated easily by retyping the whole text using a new character font. The retyping can be done manually or with an automated optical character recognition (OCR) unit. OCR-based techniques are not perfect and require human supervision.

In general, such watermark removal methods are expensive. For text watermarking, the goal is to make watermark removal expensive enough to discourage the copying of copyrighted text. The above methods are also robust enough to survive printing and successive photocopying up to the 10th generation.

Fig. 1: Generic digital watermark embedding and detection scheme

  • The watermark can be a number, text or an image.
  • A secret/public key is used to enforce the security of the watermarked content.
  • For secure transport of watermarked data, encryption/decryption is used.
  • The watermark can be recovered by an authorized agency having the secure key, the watermark and/or the original data.

Software protection
Software is a discrete information source. Not even a single bit may be added to or deleted from software. Thus, the watermarking technique is not suitable for its copyright protection.

The basic objective of a software protection system is to ensure that the software can be distributed openly in protected (encrypted) form but can only be used within a trusted hardware system. Such a system has provision to process owner's license restrictions and protect software as well.

A user first has to obtain the license, which contains information about accessing the software and the decryption key. A user may be allowed access to certain portions of the software for a defined period only.

After obtaining a license, a user can download the encrypted software over the Internet. Alternatively, the distributor can send the software.

Trusted hardware is secure hardware that contains embedded authentication software. A user is therefore required to present a secret key before access is granted. A simple, low-cost solution is to use a smart card in which the secret key is stored.

A trusted hardware must also ensure that the licensed software is also protected against tampering/piracy.

Executable software is aware of access control mechanism. Such software can interrogate the mechanism to determine whether a particular feature is allowed by the license controlling the software.
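How software might interrogate the access control mechanism can be sketched as below. The license record layout, the feature names and the `feature_allowed` helper are all hypothetical; a real system would keep this logic and the key material inside the trusted hardware.

```python
from datetime import date

# Hypothetical license record issued to one user.
license_record = {
    "features": {"view", "print"},      # portions of the software the user may use
    "expires": date(2030, 1, 1),        # end of the licensed period
}

def feature_allowed(record, feature, today):
    """Return True if the license is still valid and covers the requested feature."""
    return today <= record["expires"] and feature in record["features"]

print(feature_allowed(license_record, "print", date(2029, 6, 1)))
```

The application would call such a check before enabling each licensed feature and refuse to proceed when it returns False.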

To ensure a long period of protection, it is essential that the secret information should be minimal. System security depends on storing the private decryption key in a special hardware.

Robustness testing
Evaluation of a watermarking technique is necessary to ascertain its robustness against attacks. Here is a description of important benchmarking software for images.

Stirmark: It is a generic tool for robustness testing of image watermarking software. It is freely available and attempts to remove the hidden watermark using the following procedure: it simulates resampling to emulate printing. It applies a minor, unnoticeable geometric distortion; in other words, the image is slightly stretched, sheared, bent and rotated by an unnoticeable random amount. Further, small and smoothly distributed errors are introduced. Finally, the image is resampled and interpolated.

It has been claimed by the designer that it removes all current watermarks. The effect of the attack is not visually annoying.

Unzign: It is benchmarking software for images in JPEG format. In version 1.1, Unzign introduced pixel jittering in combination with a slight image translation. For many watermarking schemes, Unzign removes the watermark. However, version 1.1 introduces severe distortion. An improved version 1.2 reduces distortion but watermark destruction capability is also reduced.

Watermarking technology is still in its evolutionary stages. It is not as secure as modern cryptography. Moreover, watermarking is not a standalone technology: it must be suitably combined with encryption to offer reliable protection to contents.

A.K. Vanwasi, G.M. (R&D) ITI Ltd. Naini, Allahabad can be reached at vanwasi_nni@itiltd.co.in

Digital Media: Multimedia contents are expressed as streams of ones and zeros. In the digital world, the concept of copy is irrelevant. Technically, there is no distinction between original and nth copy.

Sources: There are two types of information sources--waveform and discrete sources.

Waveform sources: The transmission criterion for such a signal is characterized by fidelity criterion. This implies that there exist several digital representations of a given waveform for a given fidelity. Examples of waveform sources are audio, image or video signals.

Discrete sources: Text, software and data files belong to discrete source category. It is essential to have exact replica of software/data files at the destination end. This requires error-free transport of data files.

Information hiding: Information hiding deals with communication security. It comprises encryption and traffic security. Encryption protects the content during distribution over an open network such as the Internet. However, an attacker knows that secret communication is taking place and can obtain a copy of it. Traffic security pertains to concealing the sender of a message, its receiver or its very existence. Thus, here an attempt is made to have secret (unobtrusive) communication between two parties, the very existence of which is unknown to a possible attacker.

Steganography: It is a sub-discipline of information hiding. Here, secret information is hidden in an innocuous (harmless) message. Such an innocuous message is also known as a cover message.

Watermarking: Watermarking is also a sub-discipline of information hiding. Watermarking is the process of embedding secret and robust identifiers inside audio-visual content. Thus, the watermarking process is generally applicable to waveform type of information sources.

The purpose of watermarking is to establish the copyright of the content creator. In this sense, watermarks are also known as hidden copyright messages.

Watermarking secures the content. Thus, any attempt to modify the content can be easily detected.

Watermarking can trace the path followed by content in a distribution chain. This helps in tracing malicious users.

By detecting watermarks embedded in the content, it is possible to authenticate genuineness.

Label: This is readable public information added to content for IPR protection. It conveys ownership of content, indexing and authenticity. A label does not modify the content. Digital signature is an example of a label.

A label along with valid certification and cryptographic keys allows verification of the origin and the integrity of the content.

It is impossible to prevent a label from being removed or replaced because it is separate from the content. However, a label generally offers the following functionality:

  • Authentication of origin of content.
  • Strict integrity of the bit stream.
  • Integrity of identification numbers and IPR data.
  • Integrity of the meaning of the content.

Fingerprinting: It is a hidden serial number embedded in content. It helps in identifying copyright violators.

Standardization in digital watermarking
Secure Digital Music Initiative (SDMI): SDMI is an industry consortium comprising all the major hardware and software companies in the music industry.

The group is working on copyright protection, copyright management and royalty tracking issues. The group is planning a secure way to download the music and prevent free downloading.

The SDMI specification is built around SDMI-compliant portable devices and portable media that store and play back protected audio content. The specification requires that any SDMI content is protected at all times after it is imported into an SDMI-compliant application, portable device or portable medium.

Unknown contents can be checked in to an SDMI-compliant portable device, but cannot be copied.

SDMI screening technology is still under evaluation. SDMI has devised six watermarking technologies to protect the digitally recorded music.

Digital Video Disc (DVD): DVD is the latest storage technology and provides seven times the storage of a CD. It was developed as a portable medium to deliver data to consumers. One problem that delayed the development of the DVD standard was the protection of copyrighted movies.

One way to secure the content on a DVD is to link a watermark verification process to the proper functioning of the DVD player. Thus, player output is enabled only after verification. Similarly, a DVD-compliant recorder will refuse to record pirated material.

MPEG-4: It is an International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) standard (ISO/IEC 14496). It provides an audio-visual coding standard for very-low-bit-rate channels. Such channels are found in Internet and mobile applications.

This standard also specifies an Intellectual Property Management and Protection (IPMP) interface for content protection. The content-management infrastructure will support content identification, automatic monitoring and tracking of audio-visual objects, prevention of illegal copying, tracking of the modification history of audio-visual objects, and transactions among users, distributors and rights holders.

Technical glossary
Pseudo Noise Sequence (PN Sequence): PN Sequence is also known as maximum length sequence. It has properties of a random binary sequence.

A PN sequence is a periodic binary sequence. It is generated by an n-bit linear feedback shift register. Its bit duration is a fraction of the information bit duration. A PN sequence is used as the spreading sequence in a spread spectrum transmitter. Different PN sequences can be generated by feeding in different n-bit initial sequences.
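A linear feedback shift register is easy to simulate. The sketch below uses a 4-bit register with feedback taken from stages 3 and 4, a configuration chosen here only as a small example; it produces a maximum length sequence with a period of 2^4 - 1 = 15 bits.

```python
def lfsr_sequence(taps, state, length):
    """Generate `length` bits from a linear feedback shift register.

    taps  -- 1-based stage numbers whose contents are XORed to form the feedback
    state -- list of n bits with the initial register contents (not all zero)
    """
    state = list(state)
    if not any(state):
        raise ValueError("the all-zero state never changes")
    out = []
    for _ in range(length):
        out.append(state[-1])               # output is taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]     # shift by one, feedback enters stage 1
    return out

seq = lfsr_sequence(taps=(3, 4), state=[1, 0, 0, 0], length=30)
print(seq[:15])   # one full period
print(seq[15:])   # the same 15 bits again
```

One period of this sequence contains eight ones and seven zeros, which reflects the near balance between ones and zeros that makes such sequences look random.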

Correlation: It is a measure of similarity between two signals. The degree of similarity is often expressed as a number between zero and one. One indicates a perfect match and zero indicates no similarity.
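One common way to compute such a measure is the normalized correlation coefficient, sketched below. Its magnitude lies between zero and one; the sign, which this glossary entry does not discuss, tells whether the signals move together or in opposition. The function name is an assumption for the example.

```python
import math

def normalized_correlation(a, b):
    """Correlation coefficient of two equal-length signals, in the range -1 to 1."""
    if len(a) != len(b) or not a:
        raise ValueError("signals must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return sum(x * y for x, y in zip(da, db)) / denom

print(normalized_correlation([1, 2, 3, 4], [1, 2, 3, 4]))      # identical signals
print(normalized_correlation([1, -1, 1, -1], [1, 1, -1, -1]))  # unrelated signals
```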

Spread spectrum modulation: It is a digital radio frequency modulation technique. This modulation method is immune to interference and interception.

Here, a given binary signal sequence is multiplied with a PN sequence to spread the signal over a much larger frequency band. The resulting signal can lie well below the noise floor.

At the receiver, a correlation receiver/matched filter is used to recover the original binary signal sequence. Precise detection also requires the original PN sequence. By using different PN sequences to spread different binary signal sequences, it is possible to send multiple signal sequences simultaneously over a given frequency band.
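The sharing of one band by several PN sequences can be sketched at baseband, ignoring the radio-frequency part. Each bit is replaced by a keyed sequence of +1/-1 chips, the two spread signals are simply added, and each receiver correlates with its own sequence. The keys, the chip count and the function names are assumptions for this example.

```python
import random

def pn_code(key, chips):
    """PN spreading sequence of +1/-1 chips derived from an integer key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(chips)]

def spread(bits, code):
    """Multiply each bit (mapped to +1/-1) by the whole spreading sequence."""
    signal = []
    for bit in bits:
        symbol = 1 if bit else -1
        signal.extend(symbol * c for c in code)
    return signal

def despread(signal, code):
    """Correlate each code-length block with the sequence and take the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(signal), n):
        block = signal[start:start + n]
        corr = sum(s * c for s, c in zip(block, code))
        bits.append(1 if corr > 0 else 0)
    return bits

code_a, code_b = pn_code(1, 64), pn_code(2, 64)
channel = [x + y for x, y in zip(spread([1, 0, 1], code_a),
                                 spread([0, 0, 1], code_b))]
print(despread(channel, code_a), despread(channel, code_b))
```

Each receiver sees the other transmission only as a small disturbance on its correlation, so both bit streams are recovered from the same summed signal.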

Matched filter: It is used to recover binary signal sequence from the spread spectrum modulated signal. The response of matched filter is matched to PN sequence used at the transmitter, to spread the binary signal sequence.

Echo hiding: It is a method of audio watermarking. It is based on the fact that one cannot perceive short echoes of the order of a millisecond.

Echo hiding embeds data into a cover audio signal. To encode ones and zeros, two types of echoes with different delay are used. Thus, an arbitrary sequence of ones and zeros can be embedded.
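A toy version of echo hiding is sketched below. It assumes the cover signal is white noise, so that a plain autocorrelation at the two candidate delays is enough to tell the echoes apart; detectors for real audio commonly rely on more refined analysis. The delays of 40 and 60 samples correspond to roughly a millisecond at CD sampling rates, and all names and parameter values are illustrative.

```python
import random

def embed_echo_bits(cover, bits, segment, delay0, delay1, alpha=0.5):
    """Add a faint echo to each segment: delay0 encodes bit 0, delay1 encodes bit 1."""
    if len(bits) * segment > len(cover):
        raise ValueError("cover signal is too short for the watermark")
    out = list(cover)
    for k, bit in enumerate(bits):
        delay = delay1 if bit else delay0
        start = k * segment
        for n in range(start + delay, start + segment):
            out[n] += alpha * cover[n - delay]
    return out

def autocorr(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def detect_echo_bits(signal, num_bits, segment, delay0, delay1):
    """Pick, for each segment, the delay at which the signal resembles itself more."""
    bits = []
    for k in range(num_bits):
        seg = signal[k * segment:(k + 1) * segment]
        bits.append(1 if autocorr(seg, delay1) > autocorr(seg, delay0) else 0)
    return bits

rng = random.Random(3)
cover = [rng.gauss(0, 1) for _ in range(16000)]   # stand-in for an audio signal
message = [1, 0, 0, 1]
marked = embed_echo_bits(cover, message, segment=4000, delay0=40, delay1=60)
print(detect_echo_bits(marked, 4, segment=4000, delay0=40, delay1=60))
```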

Digital Video Disc (DVD): DVD is a powerful multifunctional and removable storage device. It stores seven to twenty five times more data than CD. It is nine times faster too. DVD is random access technology. No fast forwarding or rewinding is necessary. It is used to store audio and video.

Joint Photographic Experts Group (JPEG): JPEG has created standards for compressing and coding continuous-tone still images of any size and any sampling rate. The scheme is valid for both monochrome and colour images. Using JPEG, good-quality image compression can be achieved with a compression ratio of 32 to 1.


© Copyright 2001: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by The Business Publications Division of the Indian Express Group of Newspapers. Site managed by BPD