Malware analysis interview questions with detailed answers (Part 2)

Here are a few more important questions with detailed answers for malware analysis interviews. The topics covered in this part are OS concepts, programming, assembly language and dynamic analysis.


In this part we will look at questions related to a few other topics.

Click Here For Part 1

Assembly Language (Intel x86/64)

If you want to be a malware analyst or researcher, you need experience in reversing software, and to reverse software you must be comfortable reading and debugging assembly instructions. The questions in this topic are of two types: either what some instruction does (for example, the shr instruction) or some Intel x86 assembly concept. Since instruction names mostly give a reasonable hint about what they do, and they are easy to look up on the internet, interviewers mostly focus on the concepts used in Intel x86 or x64.

1) What is the difference between the "mov" and "lea" instructions?

Answer:

Let's understand this with an example.

mov eax, [ebx]

and

lea eax, [ebx]

Suppose the value in ebx is 0x400000. Then mov will go to address 0x400000 and copy the 4 bytes of data stored there into the eax register, whereas lea will copy the address 0x400000 itself into eax. So, after each instruction executes, the value of eax will be the following (assuming the memory at 0x400000 contains 30).

eax = 30         (in case of mov)
eax = 0x400000   (in case of lea)

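By definition, mov copies the data found at the source operand to the destination (mov dest, r/m32), while lea (load effective address) computes the address of its memory operand and copies that address to the destination (lea dest, m) without reading memory.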
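If you think in C rather than assembly, the same distinction is roughly "dereference a pointer" versus "do pointer arithmetic". The sketch below is only an analogy: the function names are made up for illustration, and the scaled form `[ebx + ecx*4]` is an extra addressing mode shown because lea is commonly used for that kind of address arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Roughly what `mov eax, [ebx]` does: read the 4 bytes stored at the address. */
int32_t load_like_mov(const int32_t *ebx) {
    assert(ebx != NULL);   /* mov really touches memory, so the address must be valid */
    return *ebx;
}

/* Roughly what `lea eax, [ebx + ecx*4]` does: compute an address, never read it. */
const int32_t *address_like_lea(const int32_t *ebx, size_t ecx) {
    return ebx + ecx;      /* pointer arithmetic scales by sizeof(int32_t) == 4 */
}
```

With an array whose first element is 30, load_like_mov returns 30, while address_like_lea with index 0 returns the array's own address, matching the eax values listed above.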


2) What are different calling conventions?

Variant: What is the difference between the stdcall and cdecl calling conventions?

Answer:

This is one of the most frequently asked assembly questions.

Types of calling conventions:
stdcall (standard call)
cdecl (C declaration)
Less common ones are:
fastcall
thiscall

In stdcall, the arguments of a function are pushed onto the stack from right to left, and the stack is cleaned up by the callee (the function being called).

In cdecl, the arguments are also pushed from right to left, but the stack is cleaned up by the caller (the function making the call). Because the caller does the cleanup, cdecl can support variadic functions such as printf. Check here for more info on this.
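To see how this looks in source code, here is a small sketch. The CC_STDCALL and CC_CDECL macros and the function names are my own placeholders; they expand to the Microsoft keywords `__stdcall` and `__cdecl` only on 32-bit MSVC builds and to nothing elsewhere, so the file compiles anywhere. The assembly in the comments is the typical 32-bit x86 pattern, not output from a specific compiler run.

```c
#include <assert.h>

#if defined(_MSC_VER) && defined(_M_IX86)
#define CC_STDCALL __stdcall
#define CC_CDECL   __cdecl
#else
#define CC_STDCALL   /* keywords only matter on 32-bit x86 */
#define CC_CDECL
#endif

/* Typical call site:  push 2 / push 1 / call add_stdcall
 * The callee ends with `ret 8`, removing its own 8 bytes of arguments. */
int CC_STDCALL add_stdcall(int a, int b) { return a + b; }

/* Typical call site:  push 2 / push 1 / call add_cdecl / add esp, 8
 * The caller adjusts esp after the call returns. */
int CC_CDECL add_cdecl(int a, int b) { return a + b; }

/* The convention changes who cleans the stack, never the result. */
int call_both(int a, int b) {
    int r = add_stdcall(a, b);
    assert(r == add_cdecl(a, b));
    return r;
}
```

When reversing, a `ret N` at the end of a function hints at stdcall, while an `add esp, N` right after a call hints at cdecl.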


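3) Which calling convention do you generally find in Windows C++ programs?

Answer:

The Win32 API functions use stdcall (on 32-bit x86 the WINAPI macro expands to __stdcall). For your own functions, the Visual C++ compiler defaults to cdecl, and C++ member functions use thiscall. Inside your IDE or compiler options you can change the default convention (for example, /Gz selects stdcall). On x64 there is a single calling convention, so these keywords are ignored.

Changing the calling convention inside Visual Studio settings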

4) What does the xor instruction do? Where is it mostly used in x86/64?

Answer:

xor (exclusive or) outputs 0 if the two bits are the same, otherwise 1, i.e.

1 xor 1 = 0
0 xor 0 = 0
1 xor 0 = 1
0 xor 1 = 1

xor is mostly used in Intel x86/64 to set a register to 0. For example, xor eax, eax will zero the eax register.

xor eax, eax is preferred over mov eax, 0 because it encodes in 2 bytes instead of 5, and modern CPUs recognize it as a zeroing idiom.
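The same behavior can be checked in C with the `^` operator. This is a minimal sketch with made-up function names; it only mirrors the truth table and the self-xor trick described above.

```c
#include <assert.h>
#include <stdint.h>

/* One-bit XOR: 1 when the inputs differ, 0 when they match. */
unsigned xor_bit(unsigned a, unsigned b) {
    assert(a <= 1 && b <= 1);   /* this helper models single bits only */
    return a ^ b;
}

/* C equivalent of `xor eax, eax`: every bit matches itself, so the result is 0. */
uint32_t zero_like_xor(uint32_t eax) {
    return eax ^ eax;
}
```

Whatever value you pass to zero_like_xor, every bit is compared with itself, so the result is always 0.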


5) What is the difference between "cmp" and "test" instructions?

Answer:

cmp performs the comparison by subtracting the second operand from the first operand and then setting the status flags in the same manner as the sub instruction. The result of the subtraction is discarded.

test computes the bitwise logical AND of the first operand (source 1 operand) and the second operand (source 2 operand) and sets the SF, ZF, and PF status flags according to the result. The result of the AND is discarded as well.

In short, cmp does a subtraction while test does a bitwise AND, and both keep only the flags.
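A small model in C can make the difference concrete. This is a sketch under simplifying assumptions: only ZF and SF are modeled, operands are 32 bits wide, and the type and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool zf; bool sf; } flags;   /* only ZF and SF are modeled */

static flags flags_from(uint32_t result) {
    flags f = { result == 0, (result >> 31) != 0 };
    return f;
}

/* cmp a, b: compute a - b, keep only the flags. */
flags cmp_flags(uint32_t a, uint32_t b) { return flags_from(a - b); }

/* test a, b: compute a & b, keep only the flags. */
flags test_flags(uint32_t a, uint32_t b) { return flags_from(a & b); }

/* Usage example: `cmp eax, 0` and `test eax, eax` agree on "is eax zero?". */
void demo_zero_check(uint32_t eax) {
    assert(cmp_flags(eax, 0).zf == test_flags(eax, eax).zf);
}
```

This is why compilers often emit test eax, eax followed by jz/jnz to check a return value for zero: it answers the same question as cmp eax, 0 with a shorter encoding.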


6) What is the x86 function prologue? Is it necessary to have those instructions in a program's functions?

Answer:


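The function prologue is the pair of instructions at the start of a function that sets up its stack frame:

push ebp        // save the caller's base pointer on the stack
mov ebp, esp    // set a new base pointer to create a new stack frame

No, it is not necessary to have a function prologue. Compilers emit it by convention, because a base pointer makes it easy to address locals and arguments and to walk the stack while debugging. Optimized builds often omit it and address everything relative to esp.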


Programming Questions

Since a malware analyst's job involves static code analysis and reverse engineering, you must have a basic understanding of C programming. Experience with other languages may be required depending on the profile you are applying for.

You may face any type of question related to C programming, so you must be ready for that. The range of C questions is wide, but I suggest having a decent understanding of pointers in C. Other than that, I am adding a few questions that I have faced and that I think are relevant to reversing C programs.

1) What is the difference between a structure and a union?

Answer:

You can have different variables inside a structure, and each variable holds its own value. The memory allocated to the structure is the sum of the sizes of all its members, plus any padding the compiler adds for alignment.

struct my_struct
   {
       int a;
       int b;
   };
   struct my_struct s;
   s.a = 10;
   s.b = 12;  // s.a is still 10, since each member has its own memory

In a union, only one member can hold a value at a time. Setting another member replaces the value of the first one, since a single block of memory is shared by the whole union (its size is that of the largest member).

union my_union
   {
       int a;
       int b;
   };
   union my_union s;
   s.a = 10;
   s.b = 12;  // overwrites s.a, so now s.a == 12 and s.b == 12

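It is important to know the difference between a structure and a union, since both are used in lots of places in the PE headers. For example, the Misc field of IMAGE_SECTION_HEADER is a union of PhysicalAddress and VirtualSize.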
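You can confirm the layout difference yourself with sizeof and offsetof. The following is a minimal sketch; the type and function names are placeholders chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

struct pair    { int a; int b; };   /* a and b each get their own storage */
union  overlay { int a; int b; };   /* a and b share the same storage */

/* Write b after a in a struct: a keeps its value. */
int struct_read_a_after_b(int value) {
    struct pair s;
    s.a = 10;
    s.b = value;
    return s.a;
}

/* Write b after a in a union: a now reads back whatever was stored in b. */
int union_read_a_after_b(int value) {
    union overlay u;
    assert(offsetof(union overlay, a) == offsetof(union overlay, b));
    u.a = 10;
    u.b = value;
    return u.a;
}
```

On common platforms sizeof(struct pair) is twice sizeof(int), while sizeof(union overlay) equals sizeof(int), because both union members start at offset 0.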


2) Question related to memory allocation.

You may be asked how much memory an int variable takes, how much int b[10] takes, and so on.
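The safest answer is to reason with sizeof rather than quote fixed numbers, because the size of int depends on the platform (4 bytes on most current compilers). Here is a short sketch with illustrative function names; it also shows the usual follow-up trap, which is that an array passed to a function is seen as a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes occupied by int b[10]: ten times sizeof(int). */
size_t array_bytes(void) {
    int b[10];
    return sizeof b;
}

/* Element count recovered from the sizes. */
size_t array_elements(void) {
    int b[10];
    assert(sizeof b % sizeof b[0] == 0);
    return sizeof b / sizeof b[0];
}

/* Inside a callee the array has decayed to a pointer, so sizeof gives the pointer size. */
size_t pointer_bytes(const int *b) {
    return sizeof b;
}
```

So int b[10] takes 10 * sizeof(int) bytes where it is declared, but a function receiving b only sees sizeof(int *).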


3) Questions related to pointers.


4) How does malloc work?

Variant: How do dynamic memory/heap allocators work?

Answer:

malloc, like any custom memory allocator, provides dynamic memory from the heap. Behind the scenes, these allocators request memory from the operating system: through the brk/sbrk system calls (and mmap for large requests) on Linux, and through the VirtualAlloc API on Windows. Modern memory allocators are wrappers around these primitives. We use malloc rather than calling them directly because it manages memory more efficiently.

When you allocate a chunk through malloc, it uses brk to obtain memory and then stores the chunk's metadata (chunk size, address of the next chunk, etc.) at the start of each chunk. Besides that, it maintains linked lists of free chunks to speed up allocation. Whenever you call free(mymem), the mymem chunk gets added to the free list. If you then call mymem2 = malloc(100) to allocate 100 bytes, malloc checks the free list for a chunk of at least 100 bytes. If it finds one, it gives that memory to mymem2 and removes it from the free list. Otherwise, it uses brk to allocate new memory. You can learn more about this here.
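The steps above can be sketched as a toy allocator. To be clear, this is not how glibc or the Windows heap is implemented: the names toy_malloc and toy_free are invented, a static array stands in for memory obtained with brk, and there is no chunk splitting or merging. It only shows the header-plus-free-list idea.

```c
#include <assert.h>
#include <stddef.h>

typedef struct chunk {
    size_t units;          /* payload size, counted in sizeof(chunk) units */
    struct chunk *next;    /* next free chunk; meaningful only while on the free list */
} chunk;

#define ARENA_UNITS 1024
static chunk arena[ARENA_UNITS];   /* stands in for memory obtained from the OS */
static size_t arena_used = 0;      /* plays the role of the program break */
static chunk *free_list = NULL;

void *toy_malloc(size_t nbytes) {
    if (nbytes > (ARENA_UNITS - 1) * sizeof(chunk)) return NULL;
    size_t units = (nbytes + sizeof(chunk) - 1) / sizeof(chunk);
    if (units == 0) units = 1;

    /* 1. First fit: reuse a freed chunk that is large enough. */
    for (chunk **link = &free_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->units >= units) {
            chunk *c = *link;
            *link = c->next;       /* unlink it from the free list */
            c->next = NULL;
            return c + 1;          /* payload starts right after the header */
        }
    }

    /* 2. Nothing suitable: carve a new chunk, as brk would extend the heap. */
    if (units + 1 > ARENA_UNITS - arena_used) return NULL;
    chunk *c = &arena[arena_used];
    arena_used += units + 1;       /* one extra unit for the header */
    c->units = units;
    c->next = NULL;
    return c + 1;
}

void toy_free(void *p) {
    if (p == NULL) return;
    chunk *c = (chunk *)p - 1;     /* step back to the header */
    assert(c->units > 0);          /* sanity check on the stored metadata */
    c->next = free_list;           /* push onto the free list; payload is untouched */
    free_list = c;
}
```

If you allocate 100 bytes, free them, and allocate 100 bytes again, the second call returns the same address, because the freed chunk is found on the free list before any new memory is carved out.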


5) What happens when we call free(mymem)? Is the data stored in the mymem chunk also deleted?

Answer:

As already mentioned, free(mymem) just adds that chunk to the free list. The data still remains there until the memory is reused and overwritten.

This behavior is also what makes exploit conditions like "use after free" possible.
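Reading freed memory with the real malloc/free is undefined behavior, so the sketch below simulates the effect with a hypothetical one-slot pool instead. The names slot_acquire and slot_release are invented; the point is only that releasing changes bookkeeping, not the bytes, and that a stale pointer ends up aliasing the next allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot "heap", enough to show the effect safely. */
typedef struct {
    bool in_use;
    char data[32];
} slot;

static slot pool;

/* Hand out the slot if it is free, like malloc returning a chunk. */
char *slot_acquire(void) {
    if (pool.in_use) return NULL;
    pool.in_use = true;
    return pool.data;
}

/* Like free(): only the bookkeeping changes; the bytes in data are left alone. */
void slot_release(char *p) {
    assert(p == pool.data);
    pool.in_use = false;
}
```

If a program keeps the old pointer after slot_release and the slot is acquired again, writes through the new pointer are visible through the stale one. That aliasing is the core of a use-after-free bug.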


Operating system concepts (or OS Fundamentals)

Operating system concepts, or questions about how to achieve a specific task in Windows, are frequently asked in interviews. There is a vast number of possible questions in this topic, but I am covering some that I have faced.

1) How much memory is allocated to the Windows kernel?

I know this question doesn't make much sense, but I have been asked it more than twice.

Answer:

The Windows kernel is modular, so the amount of memory it actually uses depends on the current state of the system. What interviewers usually want to hear is how the virtual address space is split. On 32-bit Windows, the 4 GB address space is divided by default into 2 GB for user space and 2 GB for kernel space. With the /3GB boot option, the split becomes 3 GB for user space and 1 GB for kernel space, i.e. a 1:3 ratio of kernel to user space.


2) Explain Virtual memory.

Variant: When a DLL (e.g. ntdll.dll) is used by two different processes, each process can see the DLL at a different address. Why is that?

Answer:

Virtual memory is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory." Source: wikipedia

In modern operating systems such as Windows, applications and many system processes always reference memory by using virtual memory addresses. Virtual memory addresses are automatically translated to real (RAM) addresses by the hardware. Only core parts of the operating system kernel bypass this address translation and use real memory addresses directly. Click here for more info.

When two programs need the same DLL, the DLL is loaded once at a particular physical memory address. It is then mapped into each process's address space, possibly at a different virtual address in each.
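Here is a toy model of that mapping in C. It is a deliberate simplification: real page tables are multi-level structures walked by the hardware, and the mapping type, the translate function and the page numbers used with them are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* One entry of a toy per-process page table. */
typedef struct {
    uint32_t virt_page;    /* virtual page number */
    uint32_t phys_frame;   /* physical frame number it maps to */
} mapping;

/* Translate a virtual address using one process's table.
 * Returns false when the page is not mapped (a page fault in real life). */
bool translate(const mapping *table, size_t entries, uint32_t vaddr, uint32_t *paddr) {
    assert(table != NULL && paddr != NULL);
    uint32_t page = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    for (size_t i = 0; i < entries; i++) {
        if (table[i].virt_page == page) {
            *paddr = table[i].phys_frame * PAGE_SIZE + offset;
            return true;
        }
    }
    return false;
}
```

Give process A a table that maps virtual page 0x77000 to frame 0x1234 and process B a table that maps virtual page 0x65000 to the same frame: the two processes use different virtual addresses, yet both translate to the same physical bytes of the DLL.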


3) How to view Windows registry?

Answer:

In the Start Menu, either in the Run Box or the Search box, type "regedit" and press Enter.


4) How to see running tasks through cmd?

Answer:

You can use the "TASKLIST" command to display a list of currently-running tasks.


Dynamic Analysis

Dynamic analysis is a major part of the malware analysis process, but I was a little shocked to find that interviewers don't ask much about it.

1) When you get a malware sample, how do you proceed with it?

My answer to this question may differ from yours, since every analyst approaches the initial analysis differently. In general, you start with behavior analysis, then dynamic analysis, and finish with static analysis or debugging.

Answer:

First I load the program in CFF Explorer to look at the sections and imports of the image (and some general properties of the image). Section names give an idea of whether the default sections have been touched. New, uncommon sections may also be present, which can point to a packer. Imports give an idea of what the program might be up to. For example, an import of the RegOpenKey function hints that registry modification may be happening.

Then I load the program in the Detect It Easy tool, where I look at the entropy of the different sections. High entropy (above 7) may indicate packed or encrypted content in that section.

Entropy screen in detectiteasy

As part of dynamic analysis, I first start Process Explorer, ProcMon (Process Monitor) and Rohitab API Monitor in the background and then run the malware.

We always run the sample in a virtual environment (such as VirtualBox), never on our main production system.

Process Explorer shows which new processes and services are started by the program. ProcMon gives you real-time logs of all the system activity the malware performs: file system, registry, process, thread and DLL activity. API Monitor shows the sequence of API calls made by the malware.

Then I load the program in IDA Pro for static analysis (looking at the strings present and the assembly code) and in OllyDbg for debugging.

Also, it is better to mention the Sysinternals suite in your answer, because it really is the Swiss army knife for dynamic analysis.


2) What is the entropy of a section? What does high entropy in a section tell you?

Answer:

In general, entropy measures randomness. For a section, the entropy value shows how random the data in that section is, on a scale from 0 to 8 bits per byte.

High entropy suggests that the section is obfuscated, compressed or packed.
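If the interviewer asks how that number is computed, it is the Shannon entropy of the byte values in the section. The sketch below shows the idea; the function names are mine and this is not taken from any particular tool. To keep it dependency-free it uses a small hand-rolled base-2 logarithm, where normally you would just call log2 from <math.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Base-2 logarithm for x >= 1, using the classic repeated-squaring method. */
static double log2_of(double x) {
    assert(x >= 1.0);
    double result = 0.0;
    while (x >= 2.0) { x /= 2.0; result += 1.0; }   /* integer part */
    double bit = 0.5;
    for (int i = 0; i < 40; i++) {                  /* 40 fractional bits */
        x *= x;
        if (x >= 2.0) { x /= 2.0; result += bit; }
        bit /= 2.0;
    }
    return result;
}

/* Shannon entropy in bits per byte: 0 for constant data, 8 for uniformly random data. */
double shannon_entropy(const unsigned char *data, size_t len) {
    size_t counts[256] = {0};
    if (len == 0) return 0.0;
    for (size_t i = 0; i < len; i++) counts[data[i]]++;

    double entropy = 0.0;
    for (int v = 0; v < 256; v++) {
        if (counts[v] == 0) continue;
        double p = (double)counts[v] / (double)len;
        /* -p * log2(p), written as p * log2(1/p) so the argument is >= 1 */
        entropy += p * log2_of((double)len / (double)counts[v]);
    }
    return entropy;
}
```

A buffer filled with one repeated byte scores 0, and a buffer containing every byte value equally often scores 8. Compressed or encrypted data lands close to 8, which is why values above 7 raise suspicion of packing.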


We will look at more questions on this topic and a few other topics in the next part.

Click Here For Part 3

Click Here For Part 4