Hide Data and Files

Hiding text
System-internal methods
NTFS Alternate Data Streams
Hiding folders and files
Hiding data in a file
Appending data to a file
Fake header / footer
Mixing data
Rotate data
Encrypt data
Attaching an archive file to another file
Embed binary data in an HTML file
Steganography via XML/HTML tags
Steganography using the LSB method

Hiding text

The simplest way to hide text is to match the text color to the background color. This is possible in word processing programs like Word or OpenOffice, but also in HTML files. Selecting the relevant area will reveal this text.

In HTML, other ways to hide information are conceivable, such as comments. Invisible or hidden elements can be implemented using CSS. The properties "visibility: hidden" and "display: none" can be used to hide elements and their subelements. A look at the source code reveals the hidden area.
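
The look at the source code can also be automated. The following Python sketch lists comments and elements hidden by an inline style. The names HiddenFinder and find_hidden are made up for this illustration, and the sketch assumes that the CSS properties are set directly in a style attribute; rules from style sheets are not evaluated.

```python
from html.parser import HTMLParser


class HiddenFinder(HTMLParser):
    """Collects comments and elements hidden by an inline style (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_comment(self, data):
        # Everything between <!-- and --> ends up here.
        self.findings.append(("comment", data.strip()))

    def handle_starttag(self, tag, attrs):
        # Normalize the inline style so "display: none" and "display:none" match.
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if "display:none" in style or "visibility:hidden" in style:
            self.findings.append(("hidden element", tag))


def find_hidden(html: str) -> list:
    finder = HiddenFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Text that is hidden only by matching the text color to the background color is not detected by this sketch.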

System-internal methods

One of the easiest ways to hide a file is to assign the "Hidden" attribute to it. Additionally, the "System" attribute can also be set. Files with these attributes are not displayed by Windows Explorer using the default settings.

Files can also be encrypted using the file properties (Encrypting File System, EFS). This means that only the user account that encrypted them can read the data.

NTFS Alternate Data Streams

How alternate data streams can be used to store and extract files is described on the page "Alternate Data Streams (ADS)".

Hiding folders and files

Countless programs for hiding folders and files are available online. These use various techniques to conceal folders from the unsuspecting user. Some even lull the user into a false sense of security by suggesting a certain level of protection with a password. Usually, both the hiding and the password protection are purely visual and can be easily circumvented.

Some programs simply set the "Hidden" and/or "System" file attributes for a folder to hide it. However, this only works if the default settings are used in Windows Explorer. Hidden folders can be made visible again using an alternative file manager or by changing the display options in Windows Explorer.

Other programs go a step further and use their own system services to prevent access to the protected folders. In this case, the system must be started without these services to gain access to the protected folders.

However, even running system services cannot stop data recovery programs that read the disk directly, instead of using the Windows file access functions, from accessing these folders and files.

Renaming a file is one way to conceal its contents or make them uninteresting to others. Not only the actual file name can be changed, but also the file extension. For example, the file "Finances.xls" becomes "kbekzld.dll," which can be stored in the Windows system directory so that it is less noticeable than in the user folder. However, there are programs that recognize files based on their content (magic bytes) and thus reveal the actual file type.
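
How such a content-based check could work is shown in the following Python sketch. The function name identify and the small signature table are assumptions for this illustration; real tools know far more formats and also inspect the data beyond the first bytes.

```python
# A few well-known file signatures (magic bytes) at the start of a file.
SIGNATURES = [
    (b"\xFF\xD8\xFF", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"BM", "BMP image"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"7z\xBC\xAF\x27\x1C", "7-Zip archive"),
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "OLE2 file (e.g. DOC, XLS)"),
    (b"MZ", "Windows program file (EXE, DLL)"),
]


def identify(data: bytes) -> str:
    """Return a description based on the first bytes, ignoring the file name."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    # Reading the first 16 bytes is enough for the signatures above.
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

An old-style Excel file renamed to "kbekzld.dll" would still be reported as an OLE2 file, because only the name changed and not the content.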

For container files, such as VeraCrypt files, which only contain encrypted data and random data, file extensions such as ".BIN" or ".DAT" would be suitable, as they do not specify a specific file type.

The peculiarities of Windows or Windows Explorer can also be used to hide folders. For example, Windows Explorer does not display the contents of the "C:\Windows\assembly" folder belonging to the Microsoft .NET Framework, but rather the available assemblies within it. However, using an alternative file manager, the contained folders and files are displayed as usual.

Practices such as using transparent folder icons or filenames consisting of spaces or special characters are also circulating. These methods probably only work for very inexperienced users.

Hiding data in a file

Depending on the file format, data can theoretically be hidden within a binary file. By manipulating position information in the file header, data could be hidden in the middle of the file. The example of a bitmap file shows how the header could be manipulated by a program to insert data unnoticed.

Since the header specifies the offset at which the image data begins, the image data can easily be moved toward the end of the file. The resulting gap can then be used to store other data, which should always be encrypted.

Header

Inserted data

Image data

In order for the image file to be displayed correctly, some changes to the header are necessary. First, the length of the BMP file must be adjusted, as well as the offset and length of the image data. Before inserting data, it is also necessary to check whether the BMP file contains a palette (e.g., with 256 colors).
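
A minimal Python sketch of this idea follows. The function names are invented for this example. The sketch assumes the usual BMP layout with the file size at byte offset 2 and the offset of the image data at byte offset 10, and it inserts the data directly in front of the image data, so that a palette, if present, stays in front of the inserted data. It updates only these two header fields, on the assumption that the amount of image data itself does not change.

```python
import struct


def insert_into_bmp(bmp: bytes, payload: bytes) -> bytes:
    """Insert payload between the header (incl. palette) and the image data."""
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")
    (pixel_offset,) = struct.unpack_from("<I", bmp, 10)
    out = bytearray(bmp[:pixel_offset] + payload + bmp[pixel_offset:])
    struct.pack_into("<I", out, 2, len(out))                      # new file size
    struct.pack_into("<I", out, 10, pixel_offset + len(payload))  # new data offset
    return bytes(out)


def extract_from_bmp(bmp: bytes, length: int) -> bytes:
    """Read back `length` bytes that sit directly in front of the image data."""
    (pixel_offset,) = struct.unpack_from("<I", bmp, 10)
    return bmp[pixel_offset - length:pixel_offset]
```

The length of the inserted data has to be known for extraction; it could, for example, be stored as part of the inserted data itself.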

VeraCrypt container files with a hidden volume also provide space for data. These containers consist of a header for the outer volume and a header for the hidden volume, followed by the data for both volumes. Each header is 64 KB in size. If one of the two headers is overwritten, the other header remains readable. If the first 64 KB is overwritten, the hidden volume remains fully usable. If the second 64 KB is overwritten, the outer volume remains intact. Such a container can therefore conceal a maximum of 64 KB of foreign data, namely in the header area that is not needed.

There is an approach in which video files and VeraCrypt Hidden Volume containers are linked in such a way that both files are theoretically readable and the hidden volume of the VeraCrypt container remains usable. The video file header is written to the first 64 KB of the VeraCrypt container, and the rest of the video data is appended to the container file. This results in the loss of the outer volume, but the hidden volume remains usable. The video should also be playable.

In theory, these methods may work, but in practice, there will be viewing programs that cannot correctly process these changes. For example, an image viewer might search for the image data directly based on the header instead of adhering to the position specified in the header. In the case of the video-VeraCrypt hybrid, some reports indicate that the video no longer works in Windows Media Player, whereas other video players do not seem to have any problems.

Appending data to a file

A very simple method to hide data in a file is to append the data to the end of the file. For this to work smoothly, the carrier file needs a format whose length or end is defined within the file itself, so that the reading program ignores anything that follows. This could be, for example, various image files or program files.

The following uses a JPEG image file for the examples and explanations. In any case, the file format of the carrier file should be known to avoid unwanted side effects from appending data. Every JPEG file ends with the bytes "0xFF" and "0xD9", the EOI (End Of Image) marker. All data following the EOI marker is ignored when displaying the image.

Modified file with text after the EOI marker:

352A 5C91 D0D2 B14A 674C AEAD D0D3 C107  5*\....JgL......
BD60 C57B 8EA6 AEC3 7A0F 5228 D8B5 24CF  .`.{....z.R(..$.
FFD9 4469 6573 2069 7374 2065 696E 2067  ..Dies ist ein g
6568 6569 6D65 7220 5465 7874 21         eheimer Text!

The text following the EOI marker (FFD9) has been appended to the original file. Unless too much data is appended, this will hardly be noticeable.

This cannot be considered concealment or even hiding, as the data is still readable. Even if you open this image file with a text editor, you would still be able to read the appended text at the end of the file. The data should therefore be compressed or encrypted before being appended to the file.

The appending and extracting of data or files can also be automated with a program. All that is required is the starting position or length of the appended file. Multiple files can also be appended in this way. The structure of such a construct could look something like this:

Carrier file

Appended file 1

Length of file 1

Appended file 2

Length of file 2
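
The structure above can be sketched in Python as follows. The function names, the 4-byte little-endian length field, and the parameter count (the number of appended files, which the extracting side has to know) are assumptions for this illustration.

```python
import struct


def append_files(carrier: bytes, files: list[bytes]) -> bytes:
    """Append each file, followed by its length, to the carrier."""
    out = bytearray(carrier)
    for data in files:
        out += data
        out += struct.pack("<I", len(data))
    return bytes(out)


def extract_files(blob: bytes, count: int) -> tuple[bytes, list[bytes]]:
    """Walk backward from the end of the file and cut out `count` files."""
    files = []
    end = len(blob)
    for _ in range(count):
        (length,) = struct.unpack_from("<I", blob, end - 4)
        start = end - 4 - length
        files.append(blob[start:end - 4])
        end = start
    files.reverse()              # restore the original order
    return blob[:end], files     # what remains is the carrier file
```

Because every length field follows its file, the extraction starts at the end of the file and does not need to understand the format of the carrier.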

If you use a program that can extract the attached data as the carrier file, you will get a self-extracting archive, just like with common compression programs.

Fake header / footer

To obfuscate the magic bytes of a file created by an encryption or compression program, fake headers or footers can be added to the file. For example, the MZ signature of a program file, possibly followed by random data, can serve as a fake header, and a version info block can serve as a fake footer of an archive. A file's magic bytes can also be removed or replaced to obfuscate the actual contents.

Mixing data

When writing the data to be hidden, random data can be used to prevent easy reading. If a bit of random data is inserted after each bit, the original program or a self-written program or script is required to extract the data.

If two or more files are to be concatenated, this can also be done bit by bit:

Bit 1 of file 1

Bit 1 of file 2

Bit 2 of file 1

Bit 2 of file 2

...

If one file is longer than the other, the missing bits of the shorter file can be filled with random data.
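
A possible implementation of this bit-by-bit mixing is sketched below in Python. The function names are made up for this example, and the sketch assumes that the original lengths of both files are known when extracting. If random data is passed as the second file, the same code covers the case of inserting a random bit after each data bit.

```python
import os


def to_bits(data: bytes) -> list[int]:
    # Most significant bit first.
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def from_bits(bits: list[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def interleave(file1: bytes, file2: bytes) -> bytes:
    """Alternate the bits of both files; pad the shorter one with random data."""
    size = max(len(file1), len(file2))
    file1 += os.urandom(size - len(file1))
    file2 += os.urandom(size - len(file2))
    mixed = []
    for bit1, bit2 in zip(to_bits(file1), to_bits(file2)):
        mixed += [bit1, bit2]
    return from_bits(mixed)


def deinterleave(mixed: bytes, len1: int, len2: int) -> tuple[bytes, bytes]:
    """Split the mixed data again and cut off the random padding."""
    bits = to_bits(mixed)
    return from_bits(bits[0::2])[:len1], from_bits(bits[1::2])[:len2]
```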

Rotate data

Simple encryption or rotation methods are suitable for making messages more difficult to read.

For example, ROT-13 is sometimes used on websites to display solutions or clues to questions in a way that is not immediately legible. In addition to ROT-13 for letters, ROT-5 can be used for digits and ROT-47 for the printable ASCII characters, which include punctuation and special characters.
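
The three rotations can be sketched in a few lines of Python. The function names are chosen for this example. Each function is its own inverse, so applying it twice returns the original text.

```python
def rot13(text: str) -> str:
    """Rotate the letters A-Z and a-z by 13 positions."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            result.append(chr((ord(ch) - 97 + 13) % 26 + 97))
        elif "A" <= ch <= "Z":
            result.append(chr((ord(ch) - 65 + 13) % 26 + 65))
        else:
            result.append(ch)
    return "".join(result)


def rot5(text: str) -> str:
    """Rotate the digits 0-9 by 5 positions."""
    return "".join(
        chr((ord(ch) - 48 + 5) % 10 + 48) if "0" <= ch <= "9" else ch
        for ch in text
    )


def rot47(text: str) -> str:
    """Rotate the 94 printable ASCII characters from '!' to '~' by 47 positions."""
    return "".join(
        chr((ord(ch) - 33 + 47) % 94 + 33) if 33 <= ord(ch) <= 126 else ch
        for ch in text
    )
```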

A simple type of encryption is bit rotation. If the data is divided into blocks, e.g., two bits each, which are then rotated, the original text becomes unreadable. Ideally, several rotations of varying lengths should be performed.
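
As an illustration, the following Python sketch rotates the bits within each byte, that is, it assumes a block size of eight bits instead of the two bits mentioned above. The function name rotate_bytes and the list counts, which provides a different rotation length for successive bytes, are assumptions for this example.

```python
def rotate_bytes(data: bytes, counts: list[int], decode: bool = False) -> bytes:
    """Rotate the bits of each byte to the left; counts are used cyclically."""
    out = bytearray()
    for index, byte in enumerate(data):
        count = counts[index % len(counts)] % 8
        if decode:
            # Rotating left by 8 - count undoes a left rotation by count.
            count = (8 - count) % 8
        out.append(((byte << count) | (byte >> (8 - count))) & 0xFF)
    return bytes(out)
```

Called with decode=True and the same counts, the function restores the original data. The list of counts takes the role of a very weak key.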

Encrypt data

The encryption method should be chosen based on the purpose and capabilities of the sender and the recipient(s).

For binary data, the binary XOR operator can be used, which also significantly alters the data. A single character or a character sequence can be used as the key for the XOR operation. In both cases, the key is repeated as often as needed to match the length of the data to be encrypted. Another option would be to use pseudorandom data based on a key for the XOR operations. If the key is truly random, at least as long as the data, and used only once, this corresponds to one-time pad encryption (also known as the Vernam cipher).
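
Both XOR variants can be sketched in Python as follows. The function names are made up for this example. The second function uses Python's random module as a stand-in for a key-based pseudorandom generator; this generator is not cryptographically secure and only serves to show the principle.

```python
import random


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR the data with a key that is repeated to the length of the data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def xor_keystream(data: bytes, seed: str) -> bytes:
    """XOR the data with pseudorandom bytes derived from a seed (not secure)."""
    rng = random.Random(seed)
    return bytes(b ^ rng.getrandbits(8) for b in data)
```

Since XOR is its own inverse, calling the same function again with the same key or seed restores the original data.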

An encryption method such as RC4 is also relatively easy to implement. More secure methods would then be AES, Serpent, or Twofish.

Attaching an archive file to another file

For some file formats, the information needed to read the file is located not at the beginning but at the end of the file. This is the case with ZIP archives, for example, and common compression programs are also able to find RAR and 7-Zip archives that are preceded by other data. When such a file is opened, the compression program locates the archive on its own and ignores the data in front of it.

If you now attach such an archive, for example, to a JPEG file, which stores the file information at the beginning of the file, the image file can be opened normally. However, the file can also be opened with a compression program that displays the attached archive and ignores the image data before it.

With the console command

copy /b Image.jpg + Archive.rar Image_with_Archive.jpg

the file "Archive.rar" is appended to the file "Image.jpg". The resulting new file is saved as "Image_with_Archive.jpg". The "/b" parameter specifies the binary mode for copying.
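
The same effect can be reproduced in Python. The following sketch uses a ZIP archive instead of the RAR archive from the command above, because the Python standard library can read ZIP files. The short byte sequence used as the image is only a stand-in for a real JPEG file, and all names are chosen for this example.

```python
import io
import zipfile


def append_archive(carrier: bytes, archive: bytes) -> bytes:
    """Equivalent of 'copy /b': the archive simply follows the carrier."""
    return carrier + archive


# Build a small ZIP archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("secret.txt", "hidden text")

# Stand-in for the content of Image.jpg (SOI marker ... EOI marker).
image = b"\xFF\xD8\xFF\xE0" + bytes(16) + b"\xFF\xD9"
combined = append_archive(image, buffer.getvalue())

# The archive can still be opened, although image data precedes it.
with zipfile.ZipFile(io.BytesIO(combined)) as archive:
    names = archive.namelist()
    content = archive.read("secret.txt")
```

The combined data still begins with the image signature, so an image viewer treats it as a picture, while the zipfile module finds the archive from the end of the data.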

Instead of the JPEG file, any other file can be used that stores its file information at the beginning of the file and ignores appended data. These include, for example, GIF, PNG, and BMP files. Video and audio files such as MP4, WEBM, MP3, M4A, and WAV can also be used, as can Microsoft Word and Excel documents (DOC and XLS).

To prevent the data from being accidentally overwritten, which could potentially happen with Word and Excel, the file should be set to "Read-Only". Password protection for the hidden archive is also recommended.

Embed binary data in an HTML file

Binary data can be embedded in an HTML file without being overly conspicuous. For example, images can be embedded in the SRC attribute of the IMG tag. The image itself can, of course, also contain hidden data. Even if the embedded data is not image data at all, it will hardly be noticed if the IMG tag is commented out.

<html>
<head>
<title>Test</title>
</head>
<body>

<p>Page with a hidden embedded image</p>

<!--
<img src="data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEAYABgAAD/4QECRXhpZgAA
TU0AKgAAAAgACwEaAAUAAAABAAAAkgEbAAUAAAABAAAAmgEoAAMAAAABAAIAAAExAAIAAAAQA
...
G39qv/k6H4kf9jTqf/pXLXvX/D5z4of9AHwD/wCAV3/8k18vfEDxpdfEfx5rfiK+jt4r3Xr+f
UbhIFKxJJNI0jBASSFBY4BJOO560229WJJLRH//2Q==" alt="image" border="0" height="50" width="200">
-->

</body>
</html>
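
Creating and reading such a commented-out IMG tag can be sketched in Python as follows. The function names and the regular expression are assumptions for this illustration; the sketch expects the data in the form "data:<type>;base64,<data>" as in the example above.

```python
import base64
import re


def make_hidden_img(data: bytes, mime: str = "image/jpeg") -> str:
    """Wrap arbitrary binary data in a commented-out IMG tag."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'<!--\n<img src="data:{mime};base64,{encoded}" alt="image">\n-->'


def extract_data_uris(html: str) -> list[bytes]:
    """Find all Base64 data URIs in the source and decode them."""
    pattern = r"data:[\w/+.-]+;base64,([A-Za-z0-9+/=\s]+)"
    # b64decode skips the line breaks inside the Base64 data by default.
    return [base64.b64decode(match) for match in re.findall(pattern, html)]
```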

The console command CERTUTIL can be used to encode files as Base64:

certutil -encode picture.jpg base64.txt

After that, all you need to do is remove the first and last lines of the file. It's also possible in the other direction. To do this, add the following lines to the front and back of the Base64 data:

-----BEGIN CERTIFICATE-----
[Base64 data]
-----END CERTIFICATE-----

The file can then be decoded with CERTUTIL:

certutil -decode base64.txt picture.jpg
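
Without CERTUTIL, the same conversion can be done with a few lines of Python. The function names are made up for this example, and the line length of the generated text follows Python's base64 module, so the output is similar to, but not necessarily identical with, the file written by CERTUTIL.

```python
import base64

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def encode_with_markers(data: bytes) -> str:
    """Base64-encode data and add the first and last line."""
    body = base64.encodebytes(data).decode("ascii")
    return f"{BEGIN}\n{body}{END}\n"


def decode_with_markers(text: str) -> bytes:
    """Drop the marker lines, if present, and decode the remaining Base64 data."""
    lines = [line.strip() for line in text.splitlines()]
    body = [line for line in lines if line and not line.startswith("-----")]
    return base64.b64decode("".join(body))
```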

Steganography via XML/HTML tags

The arrangement and formatting of tags and attributes in XML/HTML source code can be used to hide data. However, depending on the number of tags, the amount of data is limited, as only one bit can be stored per tag or attribute.

An empty element can be terminated either with its own end tag or within the tag itself:

<img src="picture.jpg"></img>
<img src="picture.jpg"/>

Depending on the notation, one variant can represent the value 1 and the other the value 0.

The next possibility is spaces at the end of tags. This way, spaces could be considered 1 and no spaces could be considered 0. Thus, the following HTML code would contain the value "01110110".

<h1>Title</h1 >
<div ><b >Sub-Title</b></div >
<p >Text</p>
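
Reading and writing these spaces can be automated, as the following Python sketch shows. The function names are invented for this example. The sketch assumes source code without comments and without self-closing tags, and tags that are not needed for the data are written without a space, so they read as 0.

```python
import re

TAG = re.compile(r"<[^<>]+>")


def decode_bits(html: str) -> str:
    """A space before the closing bracket counts as 1, no space as 0."""
    return "".join("1" if tag.endswith(" >") else "0" for tag in TAG.findall(html))


def encode_bits(html: str, bits: str) -> str:
    """Store one bit per tag, in the order in which the tags appear."""
    if len(bits) > len(TAG.findall(html)):
        raise ValueError("not enough tags for the data")
    remaining = iter(bits)

    def rewrite(match):
        tag = match.group(0)[:-1].rstrip()   # remove '>' and an existing space
        bit = next(remaining, "0")           # unused tags carry a 0
        return tag + (" >" if bit == "1" else ">")

    return TAG.sub(rewrite, html)
```

Applied to the three lines of HTML code above, decode_bits returns "01110110".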

The order of tags and attributes could also be used as a storage method. Depending on the order, the values 0 and 1 can be assigned.

<file><name>...</name><size>...</size></file>
<file><size>...</size><name>...</name></file>

<file name="..." size="..." />
<file size="..." name="..." />

Especially with long lists, such as a customer directory, a product overview or a file list, some information can be stored in this way.

Steganography using the LSB method

A better technique for concealing data is to hide information within the (binary) data itself. This requires a dedicated program or script that handles the storage and extraction of the information. A wide variety of graphic, audio, and video formats can be used as the carrier file that contains the information.

A 24-bit bitmap file, for example, consists of the header and the actual image data. The image data, in turn, consists of three bytes each, which specify the red, green, and blue values. To ensure that changes in the file are not noticeable when viewed, the color values may only be changed slightly.

With the LSB (Least Significant Bit) method, the secret data is inserted only into the least significant bit of each color value. This bit has the lowest value in the byte, and the changes therefore have only a minimal effect and are not noticeable to the human eye.
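
The principle can be sketched in Python on the raw color values. The function names are chosen for this example. The sketch works on the image data only, that is, on the bytes after the header, and it assumes that the length of the hidden data is known when extracting.

```python
def embed_lsb(color_values: bytes, secret: bytes) -> bytes:
    """Write each bit of the secret into the lowest bit of one color value."""
    bits = [(byte >> (7 - i)) & 1 for byte in secret for i in range(8)]
    if len(bits) > len(color_values):
        raise ValueError("carrier is too small for the data")
    out = bytearray(color_values)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit   # clear the lowest bit, then set it
    return bytes(out)


def extract_lsb(color_values: bytes, length: int) -> bytes:
    """Collect the lowest bits again and combine eight of them into a byte."""
    out = bytearray()
    for index in range(length):
        byte = 0
        for value in color_values[index * 8:(index + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)
```

Each color value changes by at most 1, and one byte of hidden data requires eight color values, so a 24-bit image can hold three bits per pixel.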


Gaijin.at