What is Obfuscation?
Obfuscation is the practice of deliberately making a piece of information hidden, unclear, or difficult to understand. It differs from encryption and hashing: obfuscation does not make the information impossible to read, it only makes it harder for unintended recipients to understand. Anyone who knows the method can still interpret it.
For instance, obfuscation can be used to hide information inside an image, or to store payment details in a way that hides sensitive information such as credit card numbers.
Types of Obfuscation
Steganography
Steganography is a method of obscuring information by embedding it within non-secret data, thus achieving security through obscurity. The information is typically invisible or undetectable to an unintended observer, yet it is still there, hidden in plain sight.
A common technique is Least Significant Bit (LSB) insertion. This approach replaces the least significant bit of each byte in the carrier (e.g. the pixel data of an image) with one bit of the secret data. Because changing the lowest bit alters a byte's value by at most 1, information can be hidden without noticeably changing the carrier, which reduces the chance of the secret message being detected.
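To make LSB insertion concrete, here is a minimal sketch in Python. It assumes the carrier is a plain sequence of bytes standing in for uncompressed pixel data, and that the receiver already knows the length of the secret. The names `embed_lsb` and `extract_lsb` are illustrative, not taken from any library.

```python
def embed_lsb(carrier: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the least significant bit of successive carrier bytes."""
    # Unpack the secret into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for the secret")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the carrier byte, then set it to the secret bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, secret_length: int) -> bytes:
    """Rebuild `secret_length` bytes from the low bits of `stego`."""
    out = bytearray()
    for byte_index in range(secret_length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (stego[byte_index * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)


carrier = bytes(range(200, 240))       # 40 stand-in "pixel" bytes (assumed data)
stego = embed_lsb(carrier, b"hi")      # 2 secret bytes need 16 carrier bytes
recovered = extract_lsb(stego, 2)
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
```

Each carrier byte changes by at most 1, which is why the alteration is hard to see in an image. Note that this sketch only works on uncompressed data: lossy formats such as JPEG would likely destroy the low bits.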
Common Steganography Techniques
There are many forms of steganography as data can be hidden using many unique methods and carrier payloads. Below are some common examples:
- Image Steganography: One of the most common forms is to hide a message within an image file. This process alters pixel values subtly to embed data within the original image, making the changes invisible to the naked eye. You can use this tool to experiment with just one form of Image Steganography!
- Network-based Steganography: In this technique, information is embedded within innocuous-looking packets of data transferred over a network. Attackers can use it to hide malicious activity within normal network traffic.
- Invisible Watermarks: Often used in printing, invisible watermarks such as tiny yellow dots can be applied by printers. These watermarks are hidden from view but can convey important information and be used to track the source of printed documents.
- Audio Steganography: This method involves embedding a secret message within a digital audio file. Techniques similar to image steganography are used, modifying the audio samples to conceal data without significantly affecting the original audio quality. Very subtle background noise can also be added to carry a message, for example in Morse code.
- Video Steganography: This is similar to image steganography but applied on a larger scale. Here, the message is spread across a sequence of frames, which increases the amount of hidden information that can be transmitted. The signal-to-noise ratio must be kept high enough to avoid detection.
- (Here the signal is the original video frame and the noise is the hidden data. Keeping the noise small relative to the signal is important to avoid detection.)
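To give a rough sense of why video offers more room than a single image, the following sketch estimates raw LSB capacity. It assumes uncompressed RGB frames and one hidden bit per colour channel. Real video codecs are usually lossy, so practical capacity would be far lower. The function name `lsb_capacity_bytes` is illustrative.

```python
def lsb_capacity_bytes(width: int, height: int, frames: int,
                       channels: int = 3, bits_per_channel: int = 1) -> int:
    """Upper bound on hidden bytes when using the low bits of every channel."""
    total_bits = width * height * channels * bits_per_channel * frames
    return total_bits // 8


# One 1920x1080 RGB image versus one second of 30 fps video at the same size.
single_image = lsb_capacity_bytes(1920, 1080, frames=1)
one_second = lsb_capacity_bytes(1920, 1080, frames=30)
```

Using more bits per channel raises capacity but also raises the noise added to each frame, which is the trade-off described above.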
Tokenization
Tokenization is a process where sensitive data is replaced with a non-sensitive equivalent known as a token. For example, sensitive information like a UK National Insurance Number might be replaced with a token such as RJ185915b. This token would then represent the sensitive data so that the original information does not get exposed.
This approach is also frequently used in the payment processing industry to protect card numbers during transactions. The key advantage of tokenization is that, unlike encryption or hashing, there is no mathematical relationship between the original data and the token. It also carries less computational overhead, because no encryption step is needed.
Additionally, unlike encryption or hashing, tokenization can preserve the data format of the original information. This allows it to be integrated easily into existing systems. See the credit card example below:
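As a minimal sketch of format-preserving tokenization, the code below replaces a 16-digit test card number with a random token of the same length. The `TokenVault` class is hypothetical and keeps its mapping in memory, whereas a real system would use a hardened, access-controlled vault. Keeping the last four digits is a common but optional convention assumed here.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to card numbers."""

    def __init__(self) -> None:
        self._token_to_card: dict[str, str] = {}

    def tokenize(self, card_number: str) -> str:
        digits = card_number.replace(" ", "")
        if not digits.isdigit() or len(digits) < 5:
            raise ValueError("expected a numeric card number")
        while True:
            # Random digits have no mathematical link to the original number.
            random_part = "".join(
                secrets.choice("0123456789") for _ in range(len(digits) - 4)
            )
            token = random_part + digits[-4:]  # assumption: keep last four digits
            if token != digits and token not in self._token_to_card:
                break
        self._token_to_card[token] = digits
        return token

    def detokenize(self, token: str) -> str:
        # Only a lookup in the vault can recover the original value.
        return self._token_to_card[token]


vault = TokenVault()
card_number = "4111111111111111"  # widely used test number, not a real card
token = vault.tokenize(card_number)
original = vault.detokenize(token)
```

Because the token has the same length and character set as the card number, it can be stored in existing database columns and passed through systems that expect a card-number format.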

Data Masking
Data masking involves the obfuscation or hiding of particular data fields to protect sensitive data such as Personally Identifiable Information (PII). The masked data might still exist in its original form in secure storage but is obscured while in use elsewhere. Different levels of masking may be applied depending on user permissions.
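Permission-dependent masking could be sketched as follows. The role names (`admin`, `support`, and anything else), the record fields, and the functions `mask_email` and `view_record` are all assumptions made for illustration.

```python
def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def view_record(record: dict, role: str) -> dict:
    """Return a copy of `record` masked according to the viewer's role."""
    if role == "admin":
        return dict(record)              # full access: nothing is masked
    masked = dict(record)
    masked["email"] = mask_email(record["email"])
    phone = record["phone"]
    if role == "support":
        # Partial access: only the last four digits stay visible.
        masked["phone"] = "*" * (len(phone) - 4) + phone[-4:]
    else:
        masked["phone"] = "*" * len(phone)   # any other role sees no digits
    return masked


# Fictional example data; the stored record itself is never modified.
record = {"name": "Alice Example", "email": "alice@example.com", "phone": "07700900123"}
support_view = view_record(record, "support")
analyst_view = view_record(record, "analyst")
```

The original record stays intact in storage, and each role only receives a masked copy.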
Techniques of data masking include:
- Substituting/Pseudonymisation: Replacing actual data with fictitious data.
- Shuffling: Randomly reordering values within a dataset so that sensitive values are no longer linked to their owners.
- Encrypting: Applying encryption to hide data from those without the key.
- Redaction/Masking: Using symbols like asterisks to redact data.
- Averaging: Replacing each numerical value in a dataset with the dataset's average, so that individual values are hidden while the overall average is preserved.
Each technique ensures that sensitive data remains unreadable to unauthorised users while maintaining its usability for legitimate purposes.
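Several of the techniques above can be sketched in a few lines of Python. The sample records and function names are made up for illustration. Encryption is left out because it would need a cryptography library.

```python
import random
from statistics import mean

# Fictional sample data using well-known test card numbers.
records = [
    {"name": "Alice", "card": "4111111111111111", "salary": 30000},
    {"name": "Bob", "card": "5500000000000004", "salary": 40000},
    {"name": "Carol", "card": "340000000000009", "salary": 50000},
]


def redact(value: str, visible: int = 4) -> str:
    """Redaction: replace all but the last `visible` characters with asterisks."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def substitute(rows, field, fake_values):
    """Substitution: replace real values with fictitious ones, row by row."""
    return [{**row, field: fake} for row, fake in zip(rows, fake_values)]


def shuffle_field(rows, field, seed=None):
    """Shuffling: reorder one column so values no longer match their owners."""
    values = [row[field] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, field: value} for row, value in zip(rows, values)]


def average_field(rows, field):
    """Averaging: give every row the mean of the column."""
    average = mean(row[field] for row in rows)
    return [{**row, field: average} for row in rows]


redacted_card = redact(records[0]["card"])
renamed = substitute(records, "name", ["User A", "User B", "User C"])
shuffled = shuffle_field(records, "salary", seed=1)
averaged = average_field(records, "salary")
```

Shuffling and averaging keep aggregate statistics usable for analysis, while redaction and substitution are better suited to display and test data.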
