Digital Watermarking - Steering the future of security
By A. K. Vanwasi
Online content delivery faces massive hurdles in the absence of a
secure framework for protecting valuable data. Digital watermarking,
a technology that can be used for copy control, media identification,
tracing and protecting content owners' rights, provides the
solution.
The business of online delivery and distribution of multimedia
products via CD/removable disks faces huge obstacles because
content can be copied perfectly and manipulated without limit
at the user end. Digital watermarking is the technology
used for copy control, media identification, tracing and protecting
content owners' rights.
The Internet is an open network, being increasingly used for
delivery of digital multimedia contents. In the digital format,
content is expressed as streams of ones and zeros that can
be transported flawlessly. The content can be copied perfectly
any number of times. A user can also manipulate these files. Good
business sense therefore necessitates two transaction mechanisms:
content protection and secure transport over the Internet.
The content protection mechanism attempts to protect the rights
of the content creator, distributor and user. The content
owner deposits a unique description of the original to a neutral
registration authority. This unique description
may be a hash value or a textual description. The registration
authority allots a unique identification number to the content
and archives these two for future reference. This unique identification
number is also conveyed to the content owner.
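The registration step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `RegistrationAuthority`, the helper `describe`, the use of SHA-256 as the hash value and the use of a random UUID as the identification number are all hypothetical choices, not something the framework prescribes.

```python
import hashlib
import uuid


class RegistrationAuthority:
    """Toy in-memory registry (hypothetical): unique ID -> deposited description."""

    def __init__(self):
        self._archive = {}

    def register(self, description: str) -> str:
        # Allot a fresh identification number and archive it with the description.
        content_id = uuid.uuid4().hex
        self._archive[content_id] = description
        return content_id

    def lookup(self, content_id: str) -> str:
        return self._archive[content_id]


def describe(content: bytes) -> str:
    # Assumption: the "unique description" is a SHA-256 hash value of the original.
    return hashlib.sha256(content).hexdigest()


authority = RegistrationAuthority()
original = b"example multimedia payload"
content_id = authority.register(describe(original))

# Later, the owner can show that the archived description matches the original.
print(authority.lookup(content_id) == describe(original))
```

A real authority would also need persistent storage, time-stamping and authentication of the depositor, none of which is modelled here.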
The content owner derives suitable
parameters, usually a digital watermark, from this unique
identification number. This digital watermark is securely
and secretly merged with the original content itself. The
quality of the watermarked content
is only minimally degraded. As and when required, the content
owner can prove the origin of creation by extracting the watermark
from the watermarked content.
In addition to secretly embedding a watermark in content,
a content owner/distributor can attach a 'label' that is related
to a unique identification number. This label is a public
notice that informs a user about the 'Intellectual Property
Rights' (IPR) of the content.
The second aspect is the secure transportation of copyright
protected content over the Internet. This requires a secure
channel between two end-points for the content transport.
Cryptography is an effective solution for secure transport/distribution
of copyright protected content. The implementation of a cryptographic
scheme requires specialized hardware and a key management system.
Cryptography prevents eavesdropping and manipulation of copyrighted
content during transport over the Internet.
A general framework for secure distribution
of digital multimedia contents involves:
- Allocation of a unique registration number by a registration
  authority to the content owner for the said contents.
- Deriving digital watermark and securely and secretly merging
  with contents to claim ownership rights.
- Attaching labels to declare that the contents are covered
  by copyright protection. This is public warning against
  copying/manipulation.
- Allocation of access rights, maybe a license to a user.
- Secure transport of protected contents using 'Public Key
  Infrastructure' (PKI).
- Monitoring traffic during distribution over the Internet
  so as to prevent any copyright infringement.
Functional requirements of watermarking
A watermark should convey as much information as possible. This implies
that the watermark data rate should be high. A watermark should
be secret and accessible to authorized parties only.
This requirement is known as security of the watermark. It
is achieved by the use of cryptographic keys.
A watermark is an integral part of data. It must persist even
after signal processing and data manipulation. This also includes
malicious manipulation that attempts to remove the watermark.
This requirement is known as 'robustness requirement'.
A watermark, though irremovable,
should also be imperceptible. It should not noticeably alter
the quality of the content. Normally, the degradation in quality
is well below one percent.
Depending on the scheme, the watermark recovery process may or
may not be allowed to use the original content and the original
digital watermark.
Generic model of watermarking scheme
The generic model (see Fig. 1) shows how a digital watermarking
scheme functions. The first part shows how to create watermarked
data. The second part shows how to recover the watermark from
test data.
A content owner approaches a neutral registration authority.
Depending on the nature of the multimedia
content, the authority allots a unique registration number.
It also archives the content and the unique registration number for
future reference.
The content owner generates a suitable
watermark, which can be embedded within the data. Such a watermark
should be unobtrusive and secure. To ensure that the watermark
is imperceptible, the watermark signal amplitude should be
relatively small (about one percent) compared to the average amplitude
of the content.
To ensure the security of the embedded digital watermark, one or several
secret and cryptographically secure keys have to be used.
To ensure robustness against data manipulation and processing,
it is helpful to keep the digital watermark very small and to ensure
that it is redundantly distributed in the host data. Thus,
while extracting a digital watermark, a small sample of the watermarked
data is enough.
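The benefit of redundancy can be illustrated with a minimal sketch. It assumes, purely for illustration, that the watermark is a short bit list repeated cyclically over the host positions and that the detector knows the offset of the sample it receives; the function names `spread` and `recover` are hypothetical. Recovery uses a majority vote over the copies found in the sample.

```python
def spread(bits, length):
    # Repeat the short watermark cyclically over `length` host positions.
    return [bits[i % len(bits)] for i in range(length)]


def recover(sample, offset, n_bits):
    # Majority vote per watermark bit over the copies present in the sample.
    votes = [[0, 0] for _ in range(n_bits)]  # [count of 0s, count of 1s]
    for k, b in enumerate(sample):
        votes[(offset + k) % n_bits][b] += 1
    return [0 if zeros >= ones else 1 for zeros, ones in votes]


watermark = [1, 0, 1, 1]
layer = spread(watermark, 40)

# Simulate manipulation: two embedded bits are flipped.
layer[5] ^= 1
layer[18] ^= 1

# Only a small sample (positions 4..23) is available to the detector.
sample = layer[4:24]
recovered = recover(sample, offset=4, n_bits=len(watermark))
print(recovered == watermark)
```

Because each watermark bit appears five times in the sample, a single flipped copy is outvoted by the intact ones.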
The digital watermark, the public/private
key and the host data are processed using a watermarking algorithm
to generate the watermarked data.
To extract (detect) the watermark, the authorized agency requires
the watermark and/or the original host data, the secret/public key and
the test data.
All these inputs are processed by the watermark recovery program
to extract the watermark or a confidence measure. The confidence
measure indicates the degree of closeness between the original watermark
and the recovered watermark.
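One simple way to express such a confidence measure, assuming the watermark is a bit sequence, is the fraction of bits on which the original and the recovered watermark agree. The function names and the 0.9 decision threshold below are assumptions for illustration only; practical schemes often use a correlation value instead.

```python
def confidence(original, recovered):
    # Fraction of positions where the two watermarks carry the same bit.
    if len(original) != len(recovered) or not original:
        raise ValueError("watermarks must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(original, recovered))
    return matches / len(original)


def is_present(original, recovered, threshold=0.9):
    # Hypothetical decision rule: accept the watermark above the threshold.
    return confidence(original, recovered) >= threshold


print(confidence([1, 0, 1, 1], [1, 0, 0, 1]))
```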
Spread spectrum technique and watermarking
Many digital watermarking schemes deploy spread-spectrum communication.
In such schemes a watermark is embedded by adding a pseudonoise
(PN) signal to the host data. This PN signal functions as a secret key.
This specific PN signal can later be detected by a correlation
receiver or matched filter. The probability of false-positive
or false-negative detection can be made low by choosing an appropriate
amplitude and number of added samples.
It is also possible to subtract the PN signal from the host
data. In this case, the correlation receiver will calculate a
high negative correlation in the detection process. Thus,
by using the addition
or subtraction process it is possible to convey one bit of
information. By sequentially embedding several such bits, it
is possible to convey arbitrary information.
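The add-or-subtract scheme described above can be sketched as follows. This is a toy illustration, not a production watermarker: the host is synthetic Gaussian noise, the PN signal is a +/-1 sequence drawn from Python's `random` module seeded with the secret key, and the names `pn_signal`, `embed`, `detect`, `AMPLITUDE` and `BLOCK` are assumptions introduced here.

```python
import random

AMPLITUDE = 1.0   # assumed PN amplitude, small next to the host's spread of 10
BLOCK = 8192      # assumed number of host samples that carry one bit


def pn_signal(key, block_index, n):
    # The secret key seeds the generator, so only key holders can rebuild the PN signal.
    rng = random.Random(f"{key}:{block_index}")
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(host, key, bits):
    marked = list(host)
    for i, bit in enumerate(bits):
        pn = pn_signal(key, i, BLOCK)
        sign = 1.0 if bit else -1.0  # add the PN signal for 1, subtract it for 0
        for j, p in enumerate(pn):
            marked[i * BLOCK + j] += sign * AMPLITUDE * p
    return marked


def detect(data, key, n_bits):
    bits = []
    for i in range(n_bits):
        pn = pn_signal(key, i, BLOCK)
        block = data[i * BLOCK:(i + 1) * BLOCK]
        # Correlation receiver: positive correlation -> 1, negative -> 0.
        corr = sum(d * p for d, p in zip(block, pn)) / BLOCK
        bits.append(1 if corr > 0 else 0)
    return bits


host_rng = random.Random(2024)
message = [1, 0, 1, 1]
host = [host_rng.gauss(0.0, 10.0) for _ in range(BLOCK * len(message))]
marked = embed(host, "secret-key", message)
decoded = detect(marked, "secret-key", len(message))
print(decoded == message)
```

The block length illustrates the trade-off mentioned above: with more samples per bit, the host's contribution to the correlation averages out further, so a smaller amplitude still gives reliable detection.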
Types of watermark
Visible watermarks:
Visible watermarks are an extension of the concept of logos.
Such watermarks are applicable
to images only. These logos are inlaid into the image but
they are transparent. Such watermarks cannot be removed by
cropping the center part of the image. Further, such watermarks
are protected against attacks such as statistical analysis.
The drawbacks of visible watermarks are that they degrade the quality
of the image and can be detected by visual means only. Thus, it is not
possible to detect them by dedicated programs or devices.
Such watermarks have applications
in maps, graphics and software user interfaces.
Invisible watermark: An invisible watermark is hidden
in the content. It can be detected by an authorized agency
only. Such watermarks are used for content and/or author authentication
and for detecting unauthorized copiers.
Public watermark: Such a watermark can be read or retrieved
by anyone using the specialized algorithm. In this sense,
public watermarks are not secure. However, public watermarks
are useful for carrying IPR information. They are good alternatives
to labels.
Fragile watermark: Fragile watermarks are also known
as tamper-proof watermarks. Such watermarks are destroyed
by data manipulation.
Private watermark: Private watermarks are also known
as secure watermarks. To read or retrieve such a watermark,
it is necessary to have the secret key.
Perceptual watermarks: A perceptual watermark exploits
aspects of the human sensory system to provide an invisible yet
robust watermark. Such watermarks are also known as transparent
watermarks; they leave the quality of the content extremely high.
Bit-stream watermarking: The term is sometimes used
for watermarking of compressed data such as video.
Text document watermarking
A text document is a discrete information source. In discrete
sources, the content itself cannot be modified. Thus, generic watermarking
schemes are not applicable. The two approaches to text watermarking
are hiding the watermark information in the semantics and hiding it
in the text format.
In semantic-based watermarking, the text is designed around
the message to be hidden. Thus, misleading information covers
the watermark information.
Such techniques defy a scientific
approach.
By text format, we mean layout and appearance. Commonly used
techniques
to hide watermark information are line shift coding, word
shift coding and feature coding.
In line shift coding, single lines of the document are shifted
upward or downward in very small amounts. The watermark information
is encoded in the way lines are shifted upward or downward.
Watermark recovery is simple because a line space in normal
text is uniform.
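Line shift coding can be sketched as follows. The spacing, the shift size and the convention that only every second line carries a bit while its neighbours stay fixed as references are assumptions made for this illustration; positions are vertical offsets that grow down the page.

```python
LINE_SPACING = 12.0  # assumed uniform line spacing, in points
SHIFT = 0.5          # assumed shift; small enough to go unnoticed


def encode(bits, n_lines):
    # Odd-numbered lines carry one bit each; even-numbered lines stay put.
    if n_lines < 2 * len(bits) + 1:
        raise ValueError("not enough lines to carry the watermark")
    positions = [i * LINE_SPACING for i in range(n_lines)]
    for k, bit in enumerate(bits):
        line = 2 * k + 1
        positions[line] += SHIFT if bit else -SHIFT  # down for 1, up for 0
    return positions


def decode(positions, n_bits):
    # No original is needed: compare the gaps to the unshifted neighbours.
    bits = []
    for k in range(n_bits):
        line = 2 * k + 1
        gap_above = positions[line] - positions[line - 1]
        gap_below = positions[line + 1] - positions[line]
        bits.append(1 if gap_above > gap_below else 0)
    return bits


watermark = [1, 0, 1]
page = encode(watermark, n_lines=7)
print(decode(page, len(watermark)) == watermark)
```

Because the decoder only compares neighbouring gaps, a uniform rescaling of the page, as in photocopying with slight enlargement, does not change the decoded bits in this sketch.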
In word shift coding, words are shifted horizontally in order
to modify the spacing between consecutive words. While detecting
the watermark, the original word spacing data is required
because normally word spacing
is variable.
In feature coding, features of some characters are modified.
In a typical case, the lengths of the end lines of characters
like b, d and h are modified. While detecting the watermark,
the original lengths must be known.
The formatted text method of watermarking can be defeated
easily by retyping the whole text using a new character font.
The retyping can be done manually or using an automated 'optical
character recognition' (OCR) unit. The OCR-based techniques
are not perfect and require human supervision.
In general, such watermark removal methods are expensive.
For text watermarking, the goal is to make watermark removal
expensive, so that obtaining the copyrighted text legitimately is
the more attractive option. The above
methods are robust enough to resist printing and successive
photocopying up to the 10th generation.
Fig. 1: Generic digital watermark embedding and detection scheme
- Watermark can be a number, text or image.
- Secret/public key is used to enforce security of watermarked
  content.
- For secure transport of watermarked data, encryption/decryption
  is used.
- Watermark can be recovered by an authorized agency
  having secure key, watermark and/or original data.
Software protection
Software is a discrete information source. Not even a single bit
may be added to or deleted from software. Thus, the watermarking
technique is not suitable for its copyright protection.
The basic objective of a software protection system is to
ensure that the software can be distributed openly in protected
(encrypted) form but can only be used within a trusted hardware
system. Such a system has provision to process owner's license
restrictions and protect software as well.
A user first has to obtain the license, which contains information
about accessing the software and the decryption
key. A user may be allowed access to a certain portion of the software
for a defined period only.
After seeking a license, a user can download the encrypted
software over the Internet. Alternatively, the distributor
can also send the software.
Trusted hardware is secure hardware. It contains embedded
authentication software. Thus, a user is required to present
a secret key before access is granted. A simple low-cost solution
is to use a smart card in which the secret key may be stored.
Trusted hardware must also ensure that the licensed software
is protected against tampering/piracy.
Executable software is aware of the access control mechanism.
Such software can interrogate the mechanism to determine whether
a particular feature is allowed by the license controlling
the software.
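The interrogation step can be pictured with a small sketch. The `License` record, its fields and the `is_allowed` check are hypothetical names chosen for illustration; in the system described here the check would run inside the trusted hardware rather than in ordinary application code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class License:
    features: frozenset   # portions of the software the user may access
    valid_until: date     # end of the licensed period


def is_allowed(lic: License, feature: str, today: date) -> bool:
    # A feature is usable only while the license is valid and lists it.
    return today <= lic.valid_until and feature in lic.features


lic = License(features=frozenset({"viewer", "export"}),
              valid_until=date(2030, 12, 31))

print(is_allowed(lic, "export", date(2030, 6, 1)))
print(is_allowed(lic, "editor", date(2030, 6, 1)))
print(is_allowed(lic, "export", date(2031, 1, 1)))
```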
To ensure a long period of protection, it is essential that
the secret information
should be minimal. System security depends on storing the
private decryption key in a special hardware.
Robustness testing
Evaluation of a watermarking technique is necessary to ascertain
its robustness against attacks. Here is a description of important
benchmarking software for images.
StirMark: It is a generic tool for robustness testing
of image watermarking software. It is freely available and
attempts to remove the hidden watermark using the following
procedure. It simulates resampling to
emulate printing and rescanning. It applies a minor, unnoticeable geometric
distortion. In other words, the image is slightly stretched, sheared,
bent and rotated by an unnoticeable random amount. Further,
small and smoothly distributed
errors are introduced.
Finally, the image is resampled and interpolated.
The designer has claimed that it removes all current
watermarks. The effect of the attack is not visually annoying.
Unzign: It is benchmarking software for images in JPEG
format. In version 1.1, Unzign introduced pixel jittering
in combination with a slight image translation. For many watermarking
schemes, Unzign removes the watermark. However, version 1.1
introduces severe distortion. An improved version 1.2 reduces the
distortion, but its watermark destruction capability is also reduced.
Watermarking technology is still in its evolutionary stages.
It is not as secure as modern cryptography. Moreover, watermarking
is not a standalone technology. Watermarking
must be suitably combined
with encryption to offer reliable protection to content.
A.K. Vanwasi, G.M. (R&D) ITI Ltd. Naini,
Allahabad can be reached at vanwasi_nni@itiltd.co.in
Digital Media:
Multimedia content is expressed as streams of ones and zeros.
In the digital world, the concept of a copy is irrelevant. Technically,
there is no distinction between the original and the nth copy.
Sources: There are two types of information sources: waveform
and discrete sources.
Waveform sources: Transmission of such
a signal is characterized by a fidelity criterion. This implies
that there exist several digital representations of a given
waveform for a given fidelity. Examples of waveform sources
are audio, image and video signals.
Discrete sources: Text, software and data files belong
to the discrete source category. It is essential to have an exact
replica of software/data files at the destination end. This
requires error-free transport of data files.
Information hiding: Information hiding deals with communication
security. It comprises encryption and traffic security.
Encryption protects the content during distribution over an
open network such as the Internet. However, an attacker knows
that secret communication is taking place and can obtain a copy of the
encrypted content. Traffic security pertains to concealing
the sender of a message, its receiver or its very existence. Thus, here
an attempt is made to have secret (unobtrusive) communication
between two parties, the very existence of which is unknown to a possible
attacker.
Steganography: It is a sub-discipline of information
hiding. Here, secret information is hidden in an innocuous
(harmless) message. Such an innocuous message is also known
as cover message.
Watermarking: Watermarking is also a sub-discipline
of information hiding. Watermarking is the process of embedding
secret and robust identifiers inside audio-visual content.
Thus, the watermarking process is generally applicable to
waveform type of information sources.
The purpose of watermarking is to establish the copyright
of the content creator. In this sense, watermarks are also
known as hidden copyright messages.
Watermarking secures the content. Thus, any attempt to modify
the content can be easily detected.
Watermarking can trace the path followed by content in a distribution
chain. This helps in tracing malicious users.
By detecting watermarks embedded in the content, it is possible
to authenticate genuineness.
Label: This is readable public information added to
content for IPR protection. It conveys ownership of content,
indexing and authenticity. A label does not modify the content.
A digital signature is an example of a label.
A label along with valid certification and cryptographic keys
allows verification of the origin and the integrity of the
content.
It is impossible to prevent removal or replacement of the label
because it is separate from the content.
However, a label generally offers the following functionality:
- Authentication of origin of content.
- Strict integrity of the bit stream.
- Integrity of identification numbers and IPR data.
- Integrity of the meaning of the content.
Fingerprinting: It is a hidden serial number embedded in content. It helps
in identifying copyright violators.
Standardization in digital watermarking
Secure Digital Music Initiative (SDMI): SDMI is an industry
consortium comprising all the major hardware and software
companies in the music industry.
The group is working on copyright protection, copyright management
and royalty tracking issues. The group is planning a secure
way to download music and to prevent free downloading.
The SDMI specification is built around SDMI-compliant portable
devices and portable media that store and play back protected
audio content. The specification requires that any SDMI content
is protected at all times after it is imported into an SDMI-compliant
application, portable device or portable medium.
Unknown content can be checked in to an SDMI-compliant portable
device, but cannot be copied.
SDMI screening technology is still under evaluation. SDMI
has devised six watermarking technologies to protect digitally
recorded music.
Digital Video Disc (DVD): DVD is the latest storage
technology; it provides seven times the storage capacity of a CD.
It was developed as a portable medium to deliver data to consumers.
One problem that delayed the development of the DVD standard
is the protection of copyrighted movies.
One way to secure the content on a DVD is to link a watermark
verification process to the proper functioning of the DVD
player. Thus, player output is enabled only after verification.
Similarly, a DVD-compliant recorder will refuse to record
pirated material.
MPEG-4: It is an International Organization for Standardization
(ISO)/International Electrotechnical Commission (IEC) standard
(ISO/IEC 14496). It provides an audio-visual coding standard
for very-low-bit-rate channels. Such channels are found in
Internet and mobile applications.
This standard also specifies an Intellectual Property
Management and Protection (IPMP) interface for content protection. The content-management
infrastructure will provide support for content identification,
automatic monitoring and tracking of audio-visual objects,
prevention of illegal copies, tracking of the modification history of
audio-visual objects and support of transactions among users,
distributors and right holders.
Technical glossary
Pseudo Noise Sequence (PN Sequence): A PN sequence is also known
as a maximum-length sequence. It has properties similar to those of a random
binary sequence.
A PN sequence is a periodic binary sequence. It is generated
by an n-bit linear feedback shift register. Its bit duration
is a fraction of the information bit duration. A PN sequence is used
as the spreading sequence in a spread spectrum transmitter. Different
PN sequences can be generated by feeding in a different n-bit
initial sequence.
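A minimal sketch of such a generator is shown below for n = 4. The tap positions (4, 3), the initial sequence and the function names are illustrative choices; with these taps the register cycles through all 15 non-zero states, so the output has period 2^4 - 1 = 15.

```python
def lfsr_sequence(taps, seed, length):
    # Fibonacci LFSR: output the last stage, feed back the XOR of the tapped stages.
    state = list(seed)
    out = []
    for _ in range(length):
        out.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return out


def periodic_autocorrelation(bits, shift):
    # Map bits to +/-1 chips and correlate the sequence with a cyclic shift of itself.
    chips = [1 if b else -1 for b in bits]
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))


sequence = lfsr_sequence(taps=(4, 3), seed=[1, 0, 0, 0], length=30)
period = sequence[:15]

print(period)
print(sequence[15:] == period)               # the output repeats after 15 bits
print(periodic_autocorrelation(period, 0))   # peak at zero shift
print(periodic_autocorrelation(period, 1))   # small value at any other shift
```

The sharp autocorrelation peak is what makes the sequence noise-like and easy to find with a correlation receiver. In this 4-bit example, a different non-zero initial sequence starts the output at a different point of the same cycle.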
Correlation: It is a measure of similarity between
two signals. The degree of similarity is often expressed as
a number between zero and one. One indicates a perfect match;
zero indicates no similarity.
Spread spectrum modulation: It is a digital radio frequency
modulation technique. This modulation method is highly resistant to
interference and interception.
Here, a given binary signal sequence is multiplied with a
PN sequence to spread the signal over a much larger frequency
band. The resulting signal can lie well below the noise floor.
At the receiver, a correlation receiver/matched filter is used
to recover the original binary signal sequence. Precise
detection also requires the original PN sequence. By using
different PN sequences for spreading different binary signal
sequences, it is possible to send multiple signal sequences
simultaneously over a given frequency band.
Matched filter: It is used to recover the binary signal
sequence from the spread spectrum modulated signal. The response
of the matched filter is matched to the PN sequence used at the transmitter
to spread the binary signal sequence.
Echo hiding: It is a method of audio watermarking.
It is based on the fact that one cannot perceive short echoes
of the order of a millisecond.
Echo hiding embeds data into a cover audio signal. To encode
ones and zeros, two echoes with different delays are
used. Thus, an arbitrary sequence of ones and zeros can be
embedded.
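The idea can be sketched with a toy example. Everything specific here is an assumption for illustration: the cover is synthetic white noise rather than real audio, the delays of 40 and 60 samples correspond to roughly one millisecond at a 44.1 kHz sampling rate, the echo gain is exaggerated for clarity, and detection compares the autocorrelation at the two candidate delays instead of using the cepstrum analysis found in practical detectors.

```python
import random

DELAY_ZERO = 40   # assumed echo delay (in samples) that encodes a 0
DELAY_ONE = 60    # assumed echo delay (in samples) that encodes a 1
ECHO_GAIN = 0.5   # assumed echo strength; exaggerated for this sketch
SEGMENT = 2000    # assumed number of samples that carry one bit


def add_echo(segment, delay):
    # y[n] = x[n] + gain * x[n - delay], with the echo kept inside the segment.
    return [s + (ECHO_GAIN * segment[n - delay] if n >= delay else 0.0)
            for n, s in enumerate(segment)]


def embed(cover, bits):
    out = []
    for k, bit in enumerate(bits):
        segment = cover[k * SEGMENT:(k + 1) * SEGMENT]
        out.extend(add_echo(segment, DELAY_ONE if bit else DELAY_ZERO))
    out.extend(cover[len(bits) * SEGMENT:])  # leave the rest untouched
    return out


def autocorrelation(segment, lag):
    return sum(segment[n] * segment[n - lag] for n in range(lag, len(segment)))


def extract(stego, n_bits):
    bits = []
    for k in range(n_bits):
        segment = stego[k * SEGMENT:(k + 1) * SEGMENT]
        # The embedded echo shows up as a peak at its own delay.
        one = autocorrelation(segment, DELAY_ONE)
        zero = autocorrelation(segment, DELAY_ZERO)
        bits.append(1 if one > zero else 0)
    return bits


rng = random.Random(1)
bits = [1, 0, 0, 1, 1]
cover = [rng.gauss(0.0, 1.0) for _ in range(SEGMENT * len(bits))]
stego = embed(cover, bits)
print(extract(stego, len(bits)) == bits)
```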
Digital Video Disc (DVD): DVD is a powerful, multifunctional
and removable storage medium. It stores seven to twenty-five
times more data than a CD. It is nine times faster too. DVD
is a random-access technology. No fast forwarding or rewinding
is necessary. It is used to store audio and video.
Joint Photographic Experts Group (JPEG): JPEG has created
standards for compressing and coding continuous-tone still
images of any size and any sampling rate. The scheme is valid
for both monochrome and colour images. Using JPEG, good quality
image compression can be achieved with a compression ratio
of 32 to 1.