Python Forensics and Virtualization | Hash Functions

In this tutorial, we will learn about forensic science using Python: basic Python forensics applications, hash functions, cracking an encryption, virtualization, naming conventions, Dshell and Scapy, and network forensics, each with a detailed explanation.

Introduction

Collecting and preserving evidence is essential for cyber forensic investigation and analysis of computer devices. Such evidence plays an important role in the courtroom, where it is used against the criminal. Nowadays, technology lets us get information just by typing a query into a browser, but it also attracts cyber crooks. Cyber crooks are people who carry out malicious activity using their systems and the internet. They can obtain all your information while sitting somewhere else.

With its wide range of applications, Python also provides facilities for digital forensics. Using it, we can gather data, extract evidence, and also encrypt passwords. It helps us preserve the reliability of evidence.

Before going further, you should be familiar with Python and its advanced concepts.

Introduction to Computational Forensics

Computational forensics is a field of study used to solve problems in various forensic disciplines. It uses computer-based modeling, analysis, simulation, and recognition. The term Python Forensics is associated with Chet Hosmer, who wrote a book of that name. Computational forensics studies pattern evidence such as fingerprints, shoeprints, toolmarks, and documents, and makes use of procedures, a range of objects, and substances. It also studies physiological and behavioral patterns, DNA, digital evidence, and crime scenes.

We can also use various algorithms for signal and image processing. Algorithms also help with data mining, computer graphics, machine learning, computer vision, data visualization, and statistical pattern recognition.

In short, computational forensics is used to study digital evidence, and it deals with various types of evidence.

Naming Conventions for Python Forensics Application

We must be familiar with the naming conventions and patterns to follow the Python forensics guidelines. Consider the following table.

Naming | Convention | Example
Local variables | camelCase with optional underscores | studentName
Constant | Uppercase, words separated by underscores | STUDENT_NAME
Global variable | Prefix followed by camelCase, with optional underscores | my_studentName
Function | PascalCase with optional underscores; active voice | MyStudentName
Module | Prefix _ followed by camelCase | _studentname
Class | Prefix class_ followed by PascalCase; keep it short | class_MyStudentName
Object | Prefix ob_ followed by camelCase | ob_studentName

A hashing algorithm takes a stream of binary data as input and produces a fixed-size digest. In real-life scenarios, we can hash a password, a file, or any other kind of digital data. The algorithm takes the input and generates a message digest.

Python Hash Function

A hash function maps an arbitrary amount of data to a fixed-size value. The same input always returns the same output. The result is called a hash sum, and it acts as a compact fingerprint of the input. Once data has been mapped to a hash value, the mapping cannot be reverted. That is why we also refer to it as a one-way cryptographic algorithm.

Let's understand the following example -

Example -
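
The following is a minimal sketch of salted password hashing, assuming SHA-256 from hashlib and a random salt from uuid, stored in the form hash:salt. The function names hash_password and check_password are illustrative, and fixed strings replace interactive input. Because the salt is random, the hash values differ on every run.

```python
import hashlib
import hmac
import uuid


def hash_password(password: str) -> str:
    """Return 'sha256-hex:salt' for the given password."""
    salt = uuid.uuid4().hex  # random 32-character hex salt
    digest = hashlib.sha256(salt.encode() + password.encode()).hexdigest()
    return digest + ":" + salt


def check_password(stored: str, candidate: str) -> bool:
    """Re-hash the candidate with the stored salt and compare the digests."""
    digest, salt = stored.split(":")
    recomputed = hashlib.sha256(salt.encode() + candidate.encode()).hexdigest()
    return hmac.compare_digest(digest, recomputed)  # constant-time comparison


stored = hash_password("devansh")
print("The hash string to store in the db is:", stored)
print("Passwords match" if check_password(stored, "devansh") else "Passwords do not match")
print("Passwords match" if check_password(stored, "sharma") else "Passwords do not match")
```

For real password storage, a deliberately slow function such as hashlib.pbkdf2_hmac or hashlib.scrypt is usually preferred over a single SHA-256 pass.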

Output:

Enter your password: sharma
The hash string to store in the db is: 947782bdb0c7a5ad642f1f26179b6aef2d9857427b45a09af4fce3b8f1346e91:8a8371941513482487e5ab8af2ae6466

Now, we will re-enter the password.

Output:

Enter your password devansh 
The hash string to store in the db is: 4762866edd3b49c7736163ef3d981e42629a09a9ca7e081f56d116e137d77b9c:ebbf5b16bd9f4b989505a495bf7ae9b9
Enter new password sharma
Passwords do not match

The hash function has the following properties.

  • It is easy to compute the hash value for any given input.
  • It is infeasible to recover the original input from a given hash value.
  • It is infeasible to modify the input without changing the hash value.

Cracking an Encryption in Python

We must know how to crack encrypted text data that we gather during analysis. First, let us understand basic cryptography.

Generally, secret messages are sent by military personnel to convey their plans without being read by their enemies. These messages are not in a human-readable format. The plain text is encrypted using an encryption algorithm, and the result is called cipher text.

Suppose a commander sends a message to a senior officer and wants to keep the text from the enemy. Here, we shift each plain-text letter four places in the alphabet. Now each A becomes E, each B becomes F, and so on.
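
The four-place shift described above can be sketched as follows. This is a minimal illustration that assumes upper-case English letters only; the helper name shift_text and the sample message are hypothetical. Cracking the cipher then amounts to trying all 25 possible shifts and reading the candidates.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_text(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation unchanged
    return "".join(result)


cipher = shift_text("ATTACK AT DAWN", 4)
print(cipher)

# Brute force: undo every possible shift and inspect the candidates.
candidates = {k: shift_text(cipher, -k) for k in range(1, 26)}
print(candidates[4])
```

Only one of the 25 candidates is readable, which is why a simple shift cipher offers no real protection.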

Let's understand the following example, which cracks the cipher text.

Example -

Output:

Enter message: Yes
~
}
|
{
z
y
x
w
v
u
t
s
r
r~
q
q}
p
p|
o
o{
n
nz
m
my
l
lx
k
kw
j
jv
i
iu
h
ht
g
gs
f
fr
e
eq
d
dp
dp~
c
co
co}
b
bn
bn|
a
am
am{
`
`l
`lz
_
_k
_ky
j
jx
i
iw
h
hv
g
gu
f
ft

Virtualization

Virtualization is the act of emulating IT systems such as workstations, networks, and storage. We create a virtual instance of such a resource. This can be done with the help of a hypervisor.

The virtualization of hardware plays a very important role in computer forensics. Virtualization gives us the following advantages.

  • We can use the workstation in a validated state for each investigation.
  • We can recover deleted data by attaching dd images of a drive to a virtual machine.
  • The virtual machine can be turned into a recovery device that helps gather evidence.

We define the following steps to create a virtual machine using Python.

Step - 1: Suppose we name the virtual machine "dummy". Each virtual machine will have at least 512 MB of memory.

Step - 2: Now, we attach this virtual machine to the default cluster.

Step - 3: Next, boot the virtual machine from the virtual HDD.

Now, we will combine the above steps into a virtual machine parameter object. Let's understand the following example.

Example -
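
The three steps can be sketched as a single parameter object. This sketch calls no real hypervisor SDK: VMParams and add_vm are hypothetical stand-ins, the cluster name "Default" and boot device "hd" are assumed values, and the inventory is only an in-memory dictionary.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class VMParams:
    """Hypothetical stand-in for a hypervisor SDK's VM parameter object."""
    name: str
    memory_bytes: int
    cluster: str
    boot_device: str


def add_vm(inventory: dict, params: VMParams) -> str:
    """Validate the parameters and record the VM in an in-memory inventory."""
    if params.memory_bytes < 512 * MB:
        raise ValueError("each virtual machine needs at least 512 MB of memory")
    inventory[params.name] = params
    return f"Virtual Machine {params.name} added successfully."


inventory = {}
params = VMParams(name="dummy", memory_bytes=512 * MB,
                  cluster="Default", boot_device="hd")
print(add_vm(inventory, params))
```

With a real hypervisor, the same three values would be passed to the vendor's own SDK instead of this dictionary.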

Output:

Virtual Machine dummy added successfully.

Network Forensics in Python

Python also provides facilities for network forensics. In a modern network environment, an investigation can run into many difficulties, such as responding to a breach report, carrying out vulnerability assessments, or validating regulatory compliance. Let's understand the basic terminology of network forensics.

Client - The client side runs on personal computers and workstations.

Server - The server executes the client's request.

Protocols - A protocol is a set of rules that must be followed during data transfer.

WebSockets - WebSocket is a protocol that provides full-duplex communication over a TCP connection. We can send bidirectional messages using WebSockets.

With the help of these protocols, we can validate the information sent or received by third-party users. However, encryption is necessary to secure the channel.

Let's understand the following example of a network program.

Example -
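
A minimal client and server sketch using the standard socket module is shown here. It assumes both ends run on the loopback interface of one machine, with the server in a background thread; the helper name serve_once and the message contents are illustrative.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Accept one client, send its message back in upper case, then close."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


# Bind to an ephemeral loopback port so the sketch does not clash with other services.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
worker.start()

print("The client waits for connection")
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"evidence request")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply.decode())
```

The traffic here is unencrypted; wrapping the sockets with the ssl module is the usual way to secure such a channel.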

Output:

The client waits for connection

Python Scapy and Dshell

Here is a brief introduction to Python Scapy and Dshell.

Python Scapy

Scapy is a Python-based tool for analyzing and manipulating network traffic. With Scapy, we can craft and manipulate packets, and capture and decode packets of a wide range of protocols. Its benefit is that it gives the investigator a detailed report about network traffic. Third-party tools, such as OS fingerprinting applications, can also be used with Scapy. Let's understand the following example.
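
Scapy is a third-party package, so the sketch below does not use it. Instead, it uses only the standard struct and socket modules to show what decoding a packet involves: unpacking the fixed 20-byte IPv4 header and reading the source and destination addresses. The sample header is hand-built, and its addresses come from the reserved documentation ranges.

```python
import socket
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"  # network byte order, 20 bytes in total


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_HEADER, raw[:20])
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hand-built sample header: IPv4, 20-byte header, TTL 64, protocol 6 (TCP).
sample = struct.pack(
    IPV4_HEADER,
    0x45, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.10"),
    socket.inet_aton("198.51.100.7"),
)
header = parse_ipv4_header(sample)
print("source", header["source"], ">> destination", header["destination"])
```

Mapping addresses to countries would need an additional geolocation database, which is outside this sketch.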

Example -

Output:

source INDIA >> destination USA

Python Dshell

Dshell is a Python-based network forensics analysis toolkit. It was developed by the US Army Research Laboratory and released as open source in 2014. It makes forensic investigation much easier. Dshell provides the following decoders.

  • reservedips - It identifies solutions for DNS problems.
  • rip-http - It extracts files from HTTP traffic.
  • large-flows - It lists the net flows.
  • Protocols - It identifies non-standard protocols.
  • dns - It extracts DNS-related queries.

Python Searching

Searching is the most important part of a forensic investigation. A good search depends on the investigator who is examining the evidence. Searching a message for keywords is a pillar of the investigation, and a well-chosen keyword can lead us to strong evidence.

Both experience and knowledge are required to get information from deleted messages.

Python provides various built-in mechanisms to support search operations. The investigator can search using keywords such as "who", "what", "where", "when", "which", etc. Let's understand the following example.

Example -
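
A minimal sketch of keyword searching with the built-in str.find method. The sample text is an assumption, chosen so that the keyword starts at offset 11.

```python
text = "Python has string search built in for forensic keyword lookups"
keyword = "string"

print(text.find(keyword))      # search the whole text
print(text.find(keyword, 10))  # start searching at offset 10
print(text.find(keyword, 40))  # start past the keyword; -1 means not found
```

str.find returns -1 instead of raising an exception, which makes it convenient for scanning many messages in a loop.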

Output:

11
11
-1

Python Indexing

Indexing is a feature that the investigator can use to gather potential evidence from files. The evidence may be contained in a memory snapshot, a disk image, a file, or a network trace. Indexing helps reduce the time needed for time-consuming tasks such as keyword searching. It is also used to locate keywords during the interactive searching phase. The following example explains indexing in Python.

Example -
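
A minimal sketch using the built-in index method of lists and strings, followed by a tiny inverted index that maps each word to its positions. The sample list and message are assumptions chosen for illustration.

```python
terms = ["forensics", "example", "evidence", "indexing"]
print("Index example : ", terms.index("example"))
print("Index for indexing : ", terms.index("indexing"))

message = "Index the keyword in a disk image"
print("Index of the character keyword found is ")
print(message.index("keyword"))

# A tiny inverted index: word -> list of word positions, built once and reused.
positions = {}
for offset, word in enumerate(message.lower().split()):
    positions.setdefault(word, []).append(offset)
print(positions["keyword"])
```

Unlike find, index raises ValueError when the item is missing. The inverted index shows why indexing saves time: after one pass over the data, each keyword lookup is a dictionary access.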

Output:

Index example :  1
Index for indexing :  3
Index of the character keyword found is 
10

Python Image Library

The real purpose of a forensic investigation is to extract valuable information from the available resources. Getting all the relevant information from a resource is essential for the report, and it helps us reach an appropriate conclusion.

Resource data can be either a simple data structure, such as a database, or a complex data structure, such as a JPEG image.

An investigator can easily access information in a simple data structure, but extracting information from a complex data structure is a tedious task.

Python provides an image library known as PIL (Python Imaging Library). It adds image processing capabilities to our Python interpreter. It supports many file formats and provides graphics capabilities and powerful image processing.

The following programming example explains how data is extracted from an image.

Step - 1: Suppose we have an image from which we need to extract details.

Step - 2: An image consists of various pixel values. The PIL library is used to extract the image details in order to gather evidence. Let's understand the following example.

Example -

Output:

[255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255]

The output is returned in the form of a list. Each entry is a pixel value from the RGB combination, which gives a better picture of what data is needed as evidence.

Python Multiprocessing Support

Forensics experts find it difficult to apply digital solutions to the large volume of digital evidence in common crimes. Most digital investigation tools are single-threaded, which means they can execute only one command at a time. Here is a brief introduction to multiprocessing.

Multiprocessing

Multiprocessing is the ability of a system to support more than one process. It enables several programs to run concurrently. There are two types of multiprocessing: symmetric and asymmetric.

Let's understand the following example of multiprocessing.

Example -
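
A minimal sketch using the standard multiprocessing module, assuming the work is to hash a list of evidence chunks. The helper names digest_chunk and process_list are illustrative.

```python
import hashlib
from multiprocessing import Pool


def digest_chunk(chunk: bytes) -> str:
    """CPU-bound work for one item: hash a chunk of evidence data."""
    return hashlib.sha256(chunk).hexdigest()


def process_list(chunks: list, workers: int = 2) -> list:
    """Hash every chunk in a pool of worker processes, preserving order."""
    with Pool(processes=workers) as pool:
        return pool.map(digest_chunk, chunks)
```

To use it, call process_list([b"chunk one", b"chunk two"]) from inside an if __name__ == "__main__": block and print "List processing complete" afterwards. The guard matters on platforms that start worker processes by re-importing the main module.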

Output:

List processing complete

Mobile Forensics in Python

Forensic investigation is not limited to standard computer hardware such as hard disks and CPUs. Techniques also exist to analyze non-standard hardware or transient evidence.

Nowadays, smartphones are widely used in digital investigations, but they are still considered non-standard. With proper research into a smartphone, we can extract photos and messages.

Android smartphones use a PIN or an alphanumeric password. The password can be between 4 and 16 digits or characters.

In the following example, we get past a lock screen to extract data. The smartphone password is generally stored in a file named password.key in /data/system.

Android stores a salted SHA1 hash sum and MD5 hash sum of this password. Let's see the following example.

Example -
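
A minimal sketch of checking candidate PINs against a stored hash. It assumes the stored value is the SHA-1 digest followed by the MD5 digest of the password concatenated with the salt's hexadecimal string, written as upper-case hex. That exact layout is an assumption based on the description here and should be verified for the Android version under examination. The salt and PIN values are hypothetical.

```python
import hashlib
from itertools import product
from typing import Optional


def password_key(password: str, salt_hex: str) -> str:
    """SHA-1 digest followed by MD5 digest of password + salt, as upper-case hex."""
    salted = (password + salt_hex).encode()
    sha1 = hashlib.sha1(salted).hexdigest()
    md5 = hashlib.md5(salted).hexdigest()
    return (sha1 + md5).upper()


def crack_pin(stored_key: str, salt_hex: str, length: int = 4) -> Optional[str]:
    """Try every numeric PIN of the given length against the stored key."""
    target = stored_key.upper()
    for digits in product("0123456789", repeat=length):
        pin = "".join(digits)
        if password_key(pin, salt_hex) == target:
            return pin
    return None


# Hypothetical values for illustration only.
salt = format(0x1A2B3C4D5E6F7081, "x")  # hex string of a 64-bit integer
stored = password_key("4821", salt)
print(crack_pin(stored, salt))
```

A 4-digit PIN has only 10,000 candidates, so the search finishes almost instantly once the salt is known.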

The above code is sample code for cracking a smartphone password. A plain dictionary attack will not work on its own, because the password is hashed together with a salt that is stored separately. The salt is a string containing the hexadecimal representation of a random 64-bit integer. The salt can be accessed on a rooted smartphone or with a JTAG adapter.

Rooted Smartphones

On a rooted smartphone, the file /data/system/password.key can be dumped directly. The salt is stored in the SQLite database settings.db under the key lockscreen.password_salt.

JTAG Adapter

JTAG stands for Joint Test Action Group. A JTAG adapter can be used to access the salt. Similarly, a Riff-Box or a JIG adapter can be used to access the salt. Using the information obtained from the Riff-Box, we can find the position of the encrypted data. The rules are given below.

  • Search for the associated string "password_salt".
  • The byte that follows gives the actual width of the salt, which is its length.
  • This length is what is searched for to obtain the stored password/PIN of the smartphone.

Memory and Forensics

Memory forensics in Python primarily focuses on volatile memory, with the help of Volatility, a Python-based framework.

Volatile Memory

Volatile memory is a type of memory whose contents are erased when the system's power is turned off or interrupted. In simple words, if we are working on a document that has not been saved to the hard disk and the power suddenly goes off, we lose our data.

Volatile memory forensics follows the same pattern as other forensic investigations.

  • Select the target of the investigation.
  • Acquire the forensic data.
  • Perform the forensic analysis.

A RAM dump is used to analyze the data gathered from RAM.

YARA Rules

YARA is a tool used to examine suspected files and directories and to match strings. It is based on pattern matching and plays an important role in forensic analysis.

Example -
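
YARA has its own rule language and a separate Python binding, and neither is used here. The sketch below is only a rough pure-Python approximation of the idea: a named rule holds several byte patterns, and the rule fires when enough of them are found. The rule contents and the helper names scan_bytes and rule_matches are hypothetical.

```python
import re

# Hypothetical rule: a name, byte patterns, and how many must match.
RULE = {
    "name": "suspicious_script",
    "patterns": [rb"eval\(", rb"base64_decode", rb"powershell\s+-enc"],
    "min_matches": 2,
}


def scan_bytes(data: bytes, rule: dict) -> list:
    """Return the patterns from `rule` that occur in `data`."""
    return [p for p in rule["patterns"] if re.search(p, data, re.IGNORECASE)]


def rule_matches(data: bytes, rule: dict) -> bool:
    """A rule fires when at least `min_matches` of its patterns are found."""
    return len(scan_bytes(data, rule)) >= rule["min_matches"]


sample = b"<?php eval(base64_decode($payload)); ?>"
print(RULE["name"], rule_matches(sample, RULE))
print(RULE["name"], rule_matches(b"plain text report", RULE))
```

To scan a file instead of a literal, read it in binary mode and pass the bytes to rule_matches.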





