Chapter 14: Thread Hijacking & APC Injection

The previous chapter explored process injection techniques that create new threads to execute malicious code. While effective, CreateRemoteThread and its variants are heavily monitored by security products. Every major EDR platform watches for cross-process thread creation—it's one of the most reliable indicators of injection activity. This chapter examines alternatives that work with existing threads rather than creating new ones, fundamentally changing the detection characteristics of code injection.

Thread hijacking and Asynchronous Procedure Call (APC) injection represent a different philosophy of code execution. Instead of adding new execution contexts to a process, these techniques redirect or queue work to threads that already exist. The result is code execution that leaves fewer artifacts and is harder to distinguish from normal process behavior.

Thread-Based Execution Strategies

Windows threads provide several mechanisms that can be leveraged for code execution beyond their intended purposes. Understanding these mechanisms reveals why thread-based injection is both powerful and subtle.

                    THREAD-BASED INJECTION TAXONOMY

    ┌─────────────────────────────────────────────────────────────────────┐
    │                      THREAD HIJACKING                                │
    │  ┌────────────────────────────────────────────────────────────────┐ │
    │  │  Concept: Take control of an existing thread's execution       │ │
    │  │                                                                │ │
    │  │  Mechanism:                                                    │ │
    │  │  ├── Suspend the target thread                                │ │
    │  │  ├── Read its current context (registers, instruction ptr)   │ │
    │  │  ├── Modify the instruction pointer to your shellcode         │ │
    │  │  ├── Resume the thread                                        │ │
    │  │  └── Thread now executes shellcode instead of original code   │ │
    │  │                                                                │ │
    │  │  Advantages:                                                   │ │
    │  │  ├── No new thread creation                                   │ │
    │  │  ├── Uses existing thread's context and stack                 │ │
    │  │  └── Harder to detect than CreateRemoteThread                 │ │
    │  └────────────────────────────────────────────────────────────────┘ │
    ├─────────────────────────────────────────────────────────────────────┤
    │                      APC INJECTION                                   │
    │  ┌────────────────────────────────────────────────────────────────┐ │
    │  │  Concept: Queue code to run when a thread enters special state │ │
    │  │                                                                │ │
    │  │  Mechanism:                                                    │ │
    │  │  ├── Write shellcode to target process memory                 │ │
    │  │  ├── Queue APC to target thread(s)                            │ │
    │  │  ├── Wait for thread to enter "alertable" wait state          │ │
    │  │  └── APC executes automatically in thread's context           │ │
    │  │                                                                │ │
    │  │  Alertable states include:                                     │ │
    │  │  ├── SleepEx(ms, TRUE)                                        │ │
    │  │  ├── WaitForSingleObjectEx(..., TRUE)                         │ │
    │  │  ├── WaitForMultipleObjectsEx(..., TRUE)                      │ │
    │  │  └── MsgWaitForMultipleObjectsEx(..., MWMO_ALERTABLE)         │ │
    │  └────────────────────────────────────────────────────────────────┘ │
    ├─────────────────────────────────────────────────────────────────────┤
    │                      EARLY BIRD INJECTION                            │
    │  ┌────────────────────────────────────────────────────────────────┐ │
    │  │  Concept: Inject before EDR hooks are installed                │ │
    │  │                                                                │ │
    │  │  Mechanism:                                                    │ │
    │  │  ├── Create process in suspended state (CREATE_SUSPENDED)     │ │
    │  │  ├── Write shellcode to the new process                       │ │
    │  │  ├── Queue APC to the main thread (before it runs)            │ │
    │  │  ├── Resume the thread                                        │ │
    │  │  └── APC executes before process's original code runs         │ │
    │  │                                                                │ │
    │  │  Result: Shellcode may run before user-mode hooks are in place │ │
    │  └────────────────────────────────────────────────────────────────┘ │
    └─────────────────────────────────────────────────────────────────────┘

These techniques share a common trait: they don't create new threads. This seemingly small difference has profound implications for detection and evasion.

Thread Enumeration

Before hijacking or queuing APCs to threads, you need to find suitable targets. Thread enumeration in Windows follows similar patterns to process enumeration, with options ranging from high-level APIs to native syscalls.

The Toolhelp Approach

The Toolhelp32 API provides a straightforward way to enumerate threads, similar to how we enumerate processes:

                    THREAD ENUMERATION VIA TOOLHELP

    CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0)
    │
    └── Returns snapshot handle containing all threads
        │
        └── Thread32First / Thread32Next
            │
            └── Returns THREADENTRY32 for each thread:
                ├── th32ThreadID: Thread ID
                ├── th32OwnerProcessID: Owning process ID
                ├── tpBasePri: Base priority
                └── tpDeltaPri: Delta priority

    Typical workflow:
    1. Create snapshot of all system threads
    2. Iterate through threads
    3. Filter by owning process ID (target PID)
    4. OpenThread() to get handle for interesting threads
    5. Close snapshot when done

While simple, this approach touches well-known APIs that security products monitor. Each of these functions can be hooked, allowing EDRs to observe thread enumeration activity.
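As a rough illustration of steps 2 and 3 of the workflow, the following Python sketch models the THREADENTRY32 record with ctypes and filters a batch of records by owning process ID. The records are synthetic sample data, and `make_entry` and `threads_of` are hypothetical helper names; on a live system the records would come from Thread32First / Thread32Next rather than from a hard-coded list.

```python
import ctypes


class THREADENTRY32(ctypes.Structure):
    """Field layout of the Toolhelp THREADENTRY32 record (seven 32-bit fields)."""
    _fields_ = [
        ("dwSize", ctypes.c_uint32),
        ("cntUsage", ctypes.c_uint32),
        ("th32ThreadID", ctypes.c_uint32),
        ("th32OwnerProcessID", ctypes.c_uint32),
        ("tpBasePri", ctypes.c_int32),
        ("tpDeltaPri", ctypes.c_int32),
        ("dwFlags", ctypes.c_uint32),
    ]


def make_entry(tid, pid, base_pri=8):
    """Build one synthetic record (hypothetical helper for this sketch)."""
    entry = THREADENTRY32()
    entry.dwSize = ctypes.sizeof(THREADENTRY32)
    entry.th32ThreadID = tid
    entry.th32OwnerProcessID = pid
    entry.tpBasePri = base_pri
    return entry


def threads_of(entries, target_pid):
    """Return the thread IDs whose owning process ID matches target_pid."""
    return [e.th32ThreadID for e in entries if e.th32OwnerProcessID == target_pid]


# Synthetic snapshot: four threads spread across three processes.
SAMPLE = [
    make_entry(100, 4),
    make_entry(2001, 1234),
    make_entry(2002, 1234),
    make_entry(3000, 5678),
]

if __name__ == "__main__":
    print(threads_of(SAMPLE, 1234))
```

The snapshot covers every thread on the system, so the filter on th32OwnerProcessID is what narrows it to one process.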

Native API Enumeration

NtQuerySystemInformation with the SystemProcessInformation class returns process and thread information together. Each process entry includes an array of SYSTEM_THREAD_INFORMATION structures describing its threads:

                    NATIVE THREAD ENUMERATION

    NtQuerySystemInformation(SystemProcessInformation, ...)
    │
    └── Returns buffer of SYSTEM_PROCESS_INFORMATION entries
        │
        └── Each entry contains:
            ├── NumberOfThreads: Count of threads
            ├── ... other process info ...
            │
            └── Followed immediately by NumberOfThreads instances of:
                SYSTEM_THREAD_INFORMATION
                ├── ClientId.UniqueThread: Thread ID
                ├── StartAddress: Thread start address
                ├── ThreadState: Current state (waiting, running, etc.)
                ├── WaitReason: Why thread is waiting (if applicable)
                └── ... timing and priority info ...

    ThreadState values of interest:
    ├── 5 (Waiting): Thread is waiting, may be alertable
    ├── 2 (Running): Thread is currently executing
    └── 3 (Standby): Thread is scheduled to run next

The ThreadState and WaitReason fields are particularly useful for APC injection, as they indicate whether a thread might be in an alertable wait state.
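To make the state values concrete, here is a small Python sketch that names the kernel thread states and picks out waiting threads from a list of records. The `ThreadInfo` tuple and the sample data are assumptions made for illustration, not the real SYSTEM_THREAD_INFORMATION layout. Note that a Waiting state does not prove the wait is alertable, since the structure carries no alertable flag; the state only narrows the candidates.

```python
from enum import IntEnum
from typing import NamedTuple


class ThreadState(IntEnum):
    """Kernel thread states as reported in the ThreadState field."""
    INITIALIZED = 0
    READY = 1
    RUNNING = 2
    STANDBY = 3
    TERMINATED = 4
    WAITING = 5
    TRANSITION = 6
    DEFERRED_READY = 7


class ThreadInfo(NamedTuple):
    """Simplified stand-in for one SYSTEM_THREAD_INFORMATION entry."""
    thread_id: int
    state: int
    wait_reason: int  # raw WaitReason value, only meaningful while waiting


def waiting_threads(threads):
    """Return IDs of threads whose state is Waiting (5)."""
    return [t.thread_id for t in threads if t.state == ThreadState.WAITING]


def describe(thread):
    """Render one record with the symbolic state name."""
    return f"TID {thread.thread_id}: {ThreadState(thread.state).name}"


# Synthetic sample records for illustration only.
SAMPLE = [
    ThreadInfo(thread_id=4100, state=5, wait_reason=6),
    ThreadInfo(thread_id=4104, state=2, wait_reason=0),
    ThreadInfo(thread_id=4108, state=5, wait_reason=4),
    ThreadInfo(thread_id=4112, state=3, wait_reason=0),
]

if __name__ == "__main__":
    for t in SAMPLE:
        print(describe(t))
    print(waiting_threads(SAMPLE))
```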

Target Thread Selection

Not all threads are equally suitable for hijacking or APC injection:

                    THREAD SELECTION CRITERIA

    For Thread Hijacking:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Good Targets:                                                       │
    │  ├── Threads in wait states (less disruption when hijacked)        │
    │  ├── Worker threads (less visible impact if they malfunction)      │
    │  └── Threads with predictable behavior patterns                    │
    │                                                                      │
    │  Poor Targets:                                                       │
    │  ├── UI threads (hijacking causes visible freezing)                │
    │  ├── Main application thread (critical to application)             │
    │  ├── Threads holding locks (may cause deadlock)                    │
    │  └── Security product threads (may have detection)                 │
    └─────────────────────────────────────────────────────────────────────┘

    For APC Injection:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Requirements:                                                       │
    │  ├── Thread must enter alertable wait state                        │
    │  └── Only some threads do; many never make an alertable wait        │
    │                                                                      │
    │  Common alertable patterns:                                          │
    │  ├── MsgWaitForMultipleObjectsEx (message loop threads)            │
    │  ├── SleepEx in worker threads                                     │
    │  └── WaitForMultipleObjectsEx (IO completion patterns)             │
    │                                                                      │
    │  Strategy: Queue APC to ALL threads in target process              │
    │  └── Raises the odds that at least one reaches an alertable wait    │
    └─────────────────────────────────────────────────────────────────────┘

Thread Hijacking

Thread hijacking takes direct control of an existing thread's execution by modifying its instruction pointer. When the thread resumes, it executes from the new location instead of where it was originally headed.

The Thread Context

Every thread has a context—a snapshot of its CPU state including all registers. The key register for hijacking is the instruction pointer (RIP on x64, EIP on x86), which determines what instruction the thread executes next:

                    THE CONTEXT STRUCTURE

    x64 CONTEXT (simplified):
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Rip: Instruction pointer (where thread executes next)             │
    │  Rsp: Stack pointer                                                │
    │  Rbp: Base pointer                                                 │
    │                                                                      │
    │  General purpose registers:                                         │
    │  Rax, Rbx, Rcx, Rdx: Data/argument registers                       │
    │  Rsi, Rdi: Source/destination registers                            │
    │  R8-R15: Additional general purpose                                │
    │                                                                      │
    │  Control flags:                                                      │
    │  EFlags: Processor flags (zero, carry, overflow, etc.)             │
    │                                                                      │
    │  SIMD state (if ContextFlags includes CONTEXT_FLOATING_POINT):     │
    │  XMM0-XMM15: Vector registers                                      │
    └─────────────────────────────────────────────────────────────────────┘

    Hijacking Principle:
    ├── Original Rip points to legitimate code
    ├── We change Rip to point to shellcode
    ├── Thread resumes executing from new Rip
    └── Original execution context is lost (unless we preserve it)

Basic Hijacking Flow

The core hijacking process involves suspending the thread, modifying its context, and resuming:

                    THREAD HIJACKING SEQUENCE

    1. SuspendThread(hThread)
       ┌──────────────────────────────────────────────────────────────────┐
       │  Thread stops executing at its current instruction               │
       │  Returns previous suspend count                                  │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    2. GetThreadContext(hThread, &ctx)
       ┌──────────────────────────────────────────────────────────────────┐
       │  Must set ctx.ContextFlags before calling                        │
       │  CONTEXT_FULL: control, integer, and floating-point state        │
       │  (debug registers need CONTEXT_DEBUG_REGISTERS)                  │
       │  Returns current thread state including Rip                      │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    3. Save original Rip (optional)
       ┌──────────────────────────────────────────────────────────────────┐
       │  If shellcode should return to original code:                    │
       │  └── Store original Rip for later restoration                   │
       │                                                                  │
       │  If shellcode replaces original execution:                       │
       │  └── Original Rip can be discarded                              │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    4. ctx.Rip = (DWORD64)pShellcode
       ┌──────────────────────────────────────────────────────────────────┐
       │  Point instruction pointer to shellcode address                  │
       │  When resumed, thread will execute shellcode                     │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    5. SetThreadContext(hThread, &ctx)
       ┌──────────────────────────────────────────────────────────────────┐
       │  Apply modified context to suspended thread                      │
       │  Context remains applied when thread resumes                     │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    6. ResumeThread(hThread)
       ┌──────────────────────────────────────────────────────────────────┐
       │  Thread begins executing from new Rip                            │
       │  Shellcode runs in thread's context                              │
       └──────────────────────────────────────────────────────────────────┘
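Because ContextFlags decides which register groups GetThreadContext fills in, it helps to see how the x64 flag values compose. The sketch below reproduces the constants from winnt.h in Python and decodes a flag value into group names; `groups_in` is a hypothetical helper written for this illustration.

```python
# x64 CONTEXT flag values as defined in winnt.h.
CONTEXT_AMD64 = 0x00100000

CONTEXT_CONTROL = CONTEXT_AMD64 | 0x01           # SegSs, Rsp, SegCs, Rip, EFlags
CONTEXT_INTEGER = CONTEXT_AMD64 | 0x02           # Rax-Rdx, Rbx, Rbp, Rsi, Rdi, R8-R15
CONTEXT_SEGMENTS = CONTEXT_AMD64 | 0x04          # SegDs, SegEs, SegFs, SegGs
CONTEXT_FLOATING_POINT = CONTEXT_AMD64 | 0x08    # x87 and XMM state
CONTEXT_DEBUG_REGISTERS = CONTEXT_AMD64 | 0x10   # Dr0-Dr3, Dr6, Dr7

CONTEXT_FULL = CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_FLOATING_POINT
CONTEXT_ALL = CONTEXT_FULL | CONTEXT_SEGMENTS | CONTEXT_DEBUG_REGISTERS

# Low-order bit for each register group, in ascending bit order.
GROUP_BITS = {
    "control": 0x01,
    "integer": 0x02,
    "segments": 0x04,
    "floating_point": 0x08,
    "debug_registers": 0x10,
}


def groups_in(flags):
    """Return the names of the register groups selected by a ContextFlags value."""
    return [name for name, bit in GROUP_BITS.items() if flags & bit]


if __name__ == "__main__":
    print(hex(CONTEXT_FULL), groups_in(CONTEXT_FULL))
    print(hex(CONTEXT_ALL), groups_in(CONTEXT_ALL))
```

Despite its name, CONTEXT_FULL leaves out the segment and debug register groups. Rip lives in the control group, so CONTEXT_CONTROL alone is enough to read or change the instruction pointer.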

The Trampoline Pattern

Simple hijacking permanently diverts the thread—it executes shellcode but never returns to its original work. This can crash applications or cause obvious behavioral changes. The trampoline pattern addresses this by having the shellcode return to the original code after executing:

                    TRAMPOLINE HIJACKING

    Before Hijacking:
    Thread executing at:  0x7FF700001000 (legitimate code)

    After Hijacking:
    Thread redirected to: 0x7FF800002000 (trampoline)

    Trampoline Structure:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  ; Save volatile integer registers plus rbx (not flags or XMM)      │
    │  push rax                                                           │
    │  push rcx                                                           │
    │  push rdx                                                           │
    │  push rbx                                                           │
    │  push r8                                                            │
    │  push r9                                                            │
    │  push r10                                                           │
    │  push r11                                                           │
    │  sub rsp, 0x28              ; Shadow space for calls               │
    │                                                                      │
    │  ; Call actual shellcode                                            │
    │  mov rax, SHELLCODE_ADDRESS                                         │
    │  call rax                                                           │
    │                                                                      │
    │  ; Restore the saved registers                                      │
    │  add rsp, 0x28                                                      │
    │  pop r11                                                            │
    │  pop r10                                                            │
    │  pop r9                                                             │
    │  pop r8                                                             │
    │  pop rbx                                                            │
    │  pop rdx                                                            │
    │  pop rcx                                                            │
    │  pop rax                                                            │
    │                                                                      │
    │  ; Return to original code (this overwrites the restored rax)       │
    │  mov rax, ORIGINAL_RIP                                              │
    │  jmp rax                                                            │
    └─────────────────────────────────────────────────────────────────────┘

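    Result: Shellcode executes, then the thread resumes at its original Rip

This pattern is more complex, but it lets the hijacked thread pick up its original work afterward, provided the stub preserves all state the interrupted code depends on (registers, flags, and stack alignment). It also allows the same thread to be hijacked repeatedly.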

Remote Thread Hijacking

Hijacking threads in another process requires cross-process memory access:

                    REMOTE HIJACKING REQUIREMENTS

    Process Handle Requirements:
    ├── PROCESS_VM_OPERATION: VirtualAllocEx
    ├── PROCESS_VM_WRITE: WriteProcessMemory
    └── PROCESS_VM_READ: ReadProcessMemory (optional)

    Thread Handle Requirements:
    ├── THREAD_SUSPEND_RESUME: SuspendThread/ResumeThread
    ├── THREAD_GET_CONTEXT: GetThreadContext
    └── THREAD_SET_CONTEXT: SetThreadContext

    Workflow:
    1. OpenProcess with required access
    2. Enumerate target process's threads
    3. OpenThread with required access
    4. VirtualAllocEx to allocate shellcode memory in target
    5. WriteProcessMemory to copy shellcode
    6. Suspend → Get Context → Modify → Set Context → Resume
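The access rights listed above are single bits in a handle's access mask, which is also how they appear in handle-open telemetry. The following Python sketch uses the standard winnt.h values to combine the rights and to decode a mask back into names; `decode` is a hypothetical helper, and the tables cover only the rights this section mentions.

```python
# Access-right bit values from winnt.h (only the rights discussed here).
PROCESS_RIGHTS = {
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
}

THREAD_RIGHTS = {
    0x0002: "THREAD_SUSPEND_RESUME",
    0x0008: "THREAD_GET_CONTEXT",
    0x0010: "THREAD_SET_CONTEXT",
}


def combine(table):
    """OR together every right in the table to get the full mask."""
    mask = 0
    for bit in table:
        mask |= bit
    return mask


def decode(mask, table):
    """Split a mask into (known right names, leftover bits not in the table)."""
    names = [name for bit, name in sorted(table.items()) if mask & bit]
    leftover = mask & ~combine(table)
    return names, leftover


if __name__ == "__main__":
    print(hex(combine(PROCESS_RIGHTS)), decode(combine(PROCESS_RIGHTS), PROCESS_RIGHTS))
    print(hex(combine(THREAD_RIGHTS)), decode(combine(THREAD_RIGHTS), THREAD_RIGHTS))
```

A thread handle carrying all three thread rights has the mask 0x1A, and the matching process mask is 0x38. Requests for exactly these combinations against another process are the kind of signal the chapter's detection discussion refers to.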

Asynchronous Procedure Calls

APCs provide a mechanism for queuing work to specific threads. Originally designed for I/O completion and other asynchronous operations, APCs can be leveraged to execute arbitrary code when a thread enters an appropriate state.

Understanding APCs

                    APC FUNDAMENTALS

    What is an APC?
    ├── A queued function call that executes in a specific thread's context
    ├── Thread must voluntarily enter "alertable" state for APC to run
    └── Multiple APCs queue up, executing in order when thread is alertable

    Types of APCs:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  User-Mode APCs:                                                     │
    │  ├── Queued with QueueUserAPC / NtQueueApcThread                   │
    │  ├── Execute in user mode when thread enters alertable wait        │
    │  └── This is what we use for injection                             │
    │                                                                      │
    │  Kernel-Mode APCs:                                                   │
    │  ├── Queued by kernel (drivers)                                    │
    │  ├── "Normal" ones run at PASSIVE_LEVEL, "special" at APC_LEVEL     │
    │  ├── Neither kind needs the thread to be in an alertable wait       │
    │  └── Special APCs can interrupt any user-mode code                 │
    └─────────────────────────────────────────────────────────────────────┘

    When do User-Mode APCs Execute?
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Thread calls alertable wait function:                              │
    │  ├── SleepEx(milliseconds, TRUE)        ← TRUE = alertable         │
    │  ├── WaitForSingleObjectEx(..., TRUE)   ← TRUE = alertable         │
    │  ├── WaitForMultipleObjectsEx(..., TRUE)                           │
    │  ├── SignalObjectAndWait(..., TRUE)                                │
    │  └── MsgWaitForMultipleObjectsEx(..., MWMO_ALERTABLE)              │
    │                                                                      │
    │  Thread returns to user mode after system call:                      │
    │  └── APCs checked, executed if pending and thread is alertable     │
    └─────────────────────────────────────────────────────────────────────┘

The Alertable Wait Problem

The main challenge with APC injection is ensuring the APC actually executes. If the target thread never enters an alertable wait state, the APC sits queued indefinitely:

                    THE ALERTABLE WAIT CHALLENGE

    Scenario 1: Thread IS alertable (Success)
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Thread code:                                                        │
    │      while (true) {                                                 │
    │          WaitForSingleObjectEx(hEvent, INFINITE, TRUE);  // ← TRUE │
    │          HandleEvent();                                             │
    │      }                                                              │
    │                                                                      │
    │  When APC queued:                                                    │
    │  └── Next time thread calls WaitForSingleObjectEx, APC executes    │
    └─────────────────────────────────────────────────────────────────────┘

    Scenario 2: Thread is NOT alertable (Failure)
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Thread code:                                                        │
    │      while (true) {                                                 │
    │          WaitForSingleObject(hEvent, INFINITE);  // ← No alertable │
    │          HandleEvent();                                             │
    │      }                                                              │
    │                                                                      │
    │  When APC queued:                                                    │
    │  └── APC sits in queue forever, never executes                     │
    └─────────────────────────────────────────────────────────────────────┘

    Common Alertable Patterns:
    ├── Message loops built on MsgWaitForMultipleObjectsEx(MWMO_ALERTABLE)
    ├── Overlapped I/O with completion routines, which require alertable waits
    ├── .NET managed waits (Thread.Sleep and similar), typically alertable
    └── Services and worker threads that wait with SleepEx(..., TRUE)
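The two scenarios can be captured in a toy model. The Python sketch below is a simulation of the queueing semantics only, not Windows code: `ToyThread` is a hypothetical class whose `wait` method drains its pending APCs in FIFO order when called with `alertable=True` and leaves them untouched otherwise.

```python
from collections import deque


class ToyThread:
    """Simulated thread with a per-thread user APC queue."""

    def __init__(self):
        self._apcs = deque()
        self.log = []

    def queue_apc(self, func, arg):
        """Append an APC; nothing runs until the thread waits alertably."""
        self._apcs.append((func, arg))

    def wait(self, alertable):
        """Model one wait call and return how many APCs were delivered."""
        if not alertable:
            return 0  # non-alertable wait: the queue is left untouched
        delivered = 0
        while self._apcs:
            func, arg = self._apcs.popleft()  # FIFO delivery order
            func(self, arg)
            delivered += 1
        return delivered

    @property
    def pending(self):
        return len(self._apcs)


def record(thread, arg):
    """Benign APC routine: note that it ran."""
    thread.log.append(arg)


def demo():
    t = ToyThread()
    t.queue_apc(record, "first")
    t.queue_apc(record, "second")
    t.wait(alertable=False)          # Scenario 2: nothing is delivered
    stuck = t.pending
    t.wait(alertable=True)           # Scenario 1: queue drains in order
    return stuck, t.log, t.pending


if __name__ == "__main__":
    print(demo())
```

In the model, as on Windows, queueing alone achieves nothing: delivery is driven entirely by what the target thread chooses to do next.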

APC Injection Implementation

The typical approach queues APCs to every thread in the target process, hoping at least one will enter an alertable state:

                    APC INJECTION WORKFLOW

    1. Prepare shellcode in target process
       ┌──────────────────────────────────────────────────────────────────┐
       │  hProcess = OpenProcess(PROCESS_VM_*)                           │
       │  pRemote = VirtualAllocEx(hProcess, PAGE_EXECUTE_READWRITE)     │
       │  WriteProcessMemory(hProcess, pRemote, shellcode)               │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    2. Enumerate all threads in target process
       ┌──────────────────────────────────────────────────────────────────┐
       │  CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD)                    │
       │  Filter threads where th32OwnerProcessID == target PID          │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    3. Queue APC to each thread
       ┌──────────────────────────────────────────────────────────────────┐
       │  for each thread:                                                │
       │      hThread = OpenThread(THREAD_SET_CONTEXT)                   │
       │      QueueUserAPC(pRemote, hThread, 0)                          │
       │      CloseHandle(hThread)                                        │
       └──────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    4. Wait for execution
       ┌──────────────────────────────────────────────────────────────────┐
       │  At least one thread should enter alertable state eventually    │
       │  When it does, APC executes shellcode                           │
       │  Note: APC may execute multiple times if queued to many threads │
       └──────────────────────────────────────────────────────────────────┘

NtQueueApcThread

The native API provides more control than the documented QueueUserAPC:

                    NTQUEUEAPCTHREAD

    Function signature:
    NTSTATUS NtQueueApcThread(
        HANDLE ThreadHandle,          // Target thread
        PVOID ApcRoutine,             // Function to call
        PVOID ApcArgument1,           // First argument
        PVOID ApcArgument2,           // Second argument
        PVOID ApcArgument3            // Third argument
    );

    Key differences from QueueUserAPC:
    ├── Three arguments instead of one
    ├── ntdll syscall stub rather than the kernel32 wrapper
    ├── Returns NTSTATUS (more detailed errors)
    └── Avoids hooks on QueueUserAPC (ntdll-level hooks still apply)

    Usage for injection:
    ├── ApcRoutine = shellcode address
    ├── Arguments can pass context to shellcode
    └── Shellcode must use correct calling convention
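Since the native call reports failures as NTSTATUS values rather than a BOOL, a brief sketch of how those values are interpreted may help. The Python below mirrors the NT_SUCCESS macro and extracts the severity field from the top two bits; the three status constants are standard values, while `nt_success` and `severity` are hypothetical helper names.

```python
# A few well-known NTSTATUS values.
STATUS_SUCCESS = 0x00000000
STATUS_INVALID_HANDLE = 0xC0000008
STATUS_ACCESS_DENIED = 0xC0000022

# The top two bits of an NTSTATUS encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def nt_success(status):
    """Mirror NT_SUCCESS: true when the value is non-negative as a signed 32-bit int."""
    return (status & 0xFFFFFFFF) < 0x80000000


def severity(status):
    """Return the severity name encoded in bits 30-31."""
    return SEVERITY[(status & 0xFFFFFFFF) >> 30]


if __name__ == "__main__":
    for code in (STATUS_SUCCESS, STATUS_INVALID_HANDLE, STATUS_ACCESS_DENIED):
        print(hex(code), nt_success(code), severity(code))
```

A handle opened without THREAD_SET_CONTEXT would typically surface here as STATUS_ACCESS_DENIED, which is more specific than the zero return value of QueueUserAPC.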

Early Bird Injection

Early Bird combines process creation and APC injection to execute code before the target process's main code runs and, in many cases, before security products have installed their user-mode hooks.

The Early Bird Concept

                    EARLY BIRD TIMING

    Normal Process Startup:
    ┌─────────────────────────────────────────────────────────────────────┐
    │                                                                      │
    │  CreateProcess()                                                     │
    │       │                                                              │
    │       ▼                                                              │
    │  Kernel creates process structures                                  │
    │       │                                                              │
    │       ▼                                                              │
    │  Main thread starts                                                 │
    │       │                                                              │
    │       ▼                                                              │
    │  ntdll!LdrInitializeThunk                                           │
    │       │                                                              │
    │       ▼                                                              │
    │  EDR DLL loaded ◀───── Security product initializes here           │
    │  Hooks installed                                                     │
    │       │                                                              │
    │       ▼                                                              │
    │  Application code runs (under EDR monitoring)                       │
    │                                                                      │
    └─────────────────────────────────────────────────────────────────────┘

    Early Bird Injection:
    ┌─────────────────────────────────────────────────────────────────────┐
    │                                                                      │
    │  CreateProcess(CREATE_SUSPENDED)                                    │
    │       │                                                              │
    │       ▼                                                              │
    │  Kernel creates process structures                                  │
    │  Main thread SUSPENDED                                              │
    │       │                                                              │
    │       ▼                                                              │
    │  Attacker allocates + writes shellcode                              │
    │       │                                                              │
    │       ▼                                                              │
    │  Attacker queues APC to main thread                                 │
    │       │                                                              │
    │       ▼                                                              │
    │  ResumeThread()                                                      │
    │       │                                                              │
    │       ▼                                                              │
    │  Loader tests for pending user APCs (initialization)                │
    │       │                                                              │
    │       ▼                                                              │
    │  ★ APC EXECUTES SHELLCODE ★ ◀─── BEFORE EDR hooks!                │
    │       │                                                              │
    │       ▼                                                              │
    │  ntdll!LdrInitializeThunk continues                                 │
    │       │                                                              │
    │       ▼                                                              │
    │  EDR DLL loaded (shellcode already ran!)                            │
    │       │                                                              │
    │       ▼                                                              │
    │  Application code runs                                              │
    │                                                                      │
    └─────────────────────────────────────────────────────────────────────┘

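The key insight is that the loader checks for pending user APCs as part of thread initialization: LdrInitializeThunk tests for alerts (the NtTestAlert path) before control transfers to the thread's start address. An APC queued to a suspended main thread is therefore delivered during this phase, before the process's entry point runs. Whether that precedes a given product's user-mode hooks depends on how and when the product loads its DLL, and kernel-level telemetry such as process-creation callbacks is not affected by this timing.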

Why Early Bird Is Powerful

                    EARLY BIRD ADVANTAGES

    Timing:
    ├── Can execute before user-mode EDR DLLs finish loading
    ├── User-mode hooks may not yet be installed
    ├── Less user-mode monitoring is active during initialization
    └── Caveat: kernel callbacks for process and thread creation still fire

    Process Context:
    ├── Shellcode runs in the context of a legitimate process
    ├── No cross-process memory operations during execution
    ├── Process appears normal after shellcode completes
    └── Can choose any legitimate process to spawn

    Detection Challenges:
    ├── CREATE_SUSPENDED is used by legitimate software (debuggers, etc.)
    ├── APCs during initialization are normal behavior
    ├── No CreateRemoteThread to detect
    └── Memory allocation happens before process fully initializes

Implementation Considerations

                    EARLY BIRD IMPLEMENTATION

    Step-by-Step:

    1. Choose sacrificial process
       ├── Should be a legitimate Windows executable
       ├── Won't be suspicious if seen running
       └── Examples: svchost.exe, notepad.exe, RuntimeBroker.exe

    2. Create suspended
       └── CreateProcessW(path, ..., CREATE_SUSPENDED, ..., &pi)

    3. Allocate memory in new process
       ├── VirtualAllocEx(pi.hProcess, PAGE_EXECUTE_READWRITE)
       └── Runs in the injecting process, so hooks there and kernel telemetry still apply

    4. Write shellcode
       └── WriteProcessMemory(pi.hProcess, pRemote, shellcode)

    5. Queue APC to main thread
       └── QueueUserAPC(pRemote, pi.hThread, 0)

    6. Resume thread
       └── ResumeThread(pi.hThread)

    What happens next:
    ├── Thread resumes, enters LdrInitializeThunk
    ├── Loader enters alertable state
    ├── APC fires, shellcode executes
    ├── Shellcode completes
    ├── Loader continues, initializes process normally
    └── Application runs (process appears normal)
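Seen from the defender's side, the steps above form an ordered pattern that can be matched in telemetry. The following is a minimal sketch under stated assumptions: the event names in `EARLY_BIRD_SEQUENCE`, the `Event` record, and the `max_window` threshold are all hypothetical, and a real sensor would use its own schema and tuning.

```python
from dataclasses import dataclass

# Hypothetical event names; real sensors use their own schemas.
EARLY_BIRD_SEQUENCE = (
    "process_create_suspended",
    "remote_alloc",
    "remote_write",
    "apc_queue",
    "thread_resume",
)


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds
    actor_pid: int    # process performing the operation
    target_pid: int   # process being operated on
    kind: str


def find_early_bird(events, max_window=5.0):
    """Return (actor_pid, target_pid) pairs whose events contain the sequence in order."""
    progress = {}  # (actor, target) -> (next index in sequence, start timestamp)
    hits = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.actor_pid, ev.target_pid)
        if ev.kind == EARLY_BIRD_SEQUENCE[0]:
            progress[key] = (1, ev.timestamp)
            continue
        if key not in progress:
            continue
        idx, start = progress[key]
        if ev.timestamp - start > max_window:
            del progress[key]  # too slow to count as one burst of activity
            continue
        if ev.kind == EARLY_BIRD_SEQUENCE[idx]:
            idx += 1
            if idx == len(EARLY_BIRD_SEQUENCE):
                hits.append(key)
                del progress[key]
            else:
                progress[key] = (idx, start)
    return hits
```

Unrelated events between steps are ignored, so the matcher tolerates noise. A parent that only creates a suspended child and resumes it, as some launchers and debuggers do, never advances past the first step and is not reported.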

Variations and Enhancements

DEBUG_PROCESS Enhancement: Creating the process with the DEBUG_PROCESS flag provides additional control. The attacker becomes the debugger and can handle debug events. Before resuming, DebugActiveProcessStop detaches the debugger, which is intended to leave little for later anti-debug checks to find.

NtQueueApcThreadEx: On modern Windows, NtQueueApcThreadEx allows queuing "special" user APCs that execute immediately rather than waiting for alertable state. This provides more reliable execution but requires careful handling.

Callback-Based Execution Alternatives

Beyond hijacking and APCs, Windows provides numerous legitimate callback mechanisms that can trigger code execution. These provide yet more options for executing shellcode without explicit thread creation.

Timer Callbacks

Windows timer queues call user-specified functions at scheduled times:

                    TIMER-BASED EXECUTION

    CreateTimerQueue / CreateTimerQueueTimer:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  1. Create timer queue (or use default)                             │
    │  2. CreateTimerQueueTimer with:                                     │
    │     ├── Callback = shellcode address                               │
    │     ├── DueTime = 0 (immediate)                                    │
    │     ├── Period = 0 (one-shot)                                      │
    │     └── Flags = WT_EXECUTEINTIMERTHREAD                            │
    │  3. Shellcode executes in timer thread context                     │
    └─────────────────────────────────────────────────────────────────────┘

    Advantages:
    ├── No explicit thread creation by our code
    ├── Timer threads are normal system behavior
    ├── Call stack shows timer infrastructure
    └── Can schedule delayed execution

Enumeration Callbacks

Many Windows APIs take callback functions that are called for each enumerated item. These can be abused to trigger shellcode execution:

                    ENUMERATION CALLBACK TECHNIQUES

    EnumChildWindows:
    ├── Enumerates child windows, calling callback for each
    ├── Set callback = shellcode
    ├── Parameter passed to callback
    └── Executes immediately

    EnumUILanguages / EnumSystemLocales:
    ├── Enumerates locales/languages
    ├── Callback invoked for each
    ├── Less commonly monitored
    └── Execution context looks normal

    EnumFonts / EnumFontFamilies:
    ├── Enumerates fonts
    ├── GDI context
    └── Unusual but functional

    CertEnumSystemStore:
    ├── Enumerates certificate stores
    ├── Crypto API context
    └── Less commonly monitored

    Common Pattern:
    1. Allocate RWX memory
    2. Copy shellcode
    3. Call enumeration function with shellcode as callback
    4. Shellcode executes

Thread Pool Callbacks

The thread pool API provides work items, timers, and wait callbacks:

                    THREAD POOL TECHNIQUES

    Work Items (Immediate execution):
    ├── CreateThreadpoolWork / SubmitThreadpoolWork
    ├── Work callback = shellcode
    └── Executes on pool thread

    Timer Callbacks:
    ├── CreateThreadpoolTimer / SetThreadpoolTimer
    ├── Timer fires, callback executes
    └── Can schedule delayed execution

    Wait Callbacks:
    ├── CreateThreadpoolWait / SetThreadpoolWait
    ├── Callback fires when object signaled
    └── Event-driven execution

    Advantages:
    ├── Runs on pool worker threads (no explicit thread creation by the caller)
    ├── Call stack shows legitimate thread pool code
    ├── Normal Windows behavior
    └── Multiple execution options

Detection and Defense

Thread-based injection techniques present unique detection challenges because they don't create new threads. However, they still leave observable artifacts.

Detection Indicators

                    THREAD INJECTION DETECTION

    Thread Hijacking Indicators:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  API Sequence:                                                       │
    │  ├── OpenThread with THREAD_*_CONTEXT access                       │
    │  ├── SuspendThread on external process thread                      │
    │  ├── GetThreadContext / SetThreadContext pair                      │
    │  └── ResumeThread to continue execution                            │
    │                                                                      │
    │  Context Changes:                                                    │
    │  ├── Instruction pointer changed to non-module memory              │
    │  ├── Stack pointer modifications                                   │
    │  └── Large context changes (many registers modified)               │
    │                                                                      │
    │  ETW Events:                                                         │
    │  ├── Thread context modification events                            │
    │  └── Cross-process thread operations                               │
    └─────────────────────────────────────────────────────────────────────┘

    APC Injection Indicators:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  API Calls:                                                          │
    │  ├── QueueUserAPC to remote process threads                        │
    │  ├── NtQueueApcThread cross-process                                │
    │  └── Multiple APCs queued in rapid succession                      │
    │                                                                      │
    │  Pattern Recognition:                                                │
    │  ├── VirtualAllocEx + WriteProcessMemory + QueueUserAPC           │
    │  ├── APC target address in non-module memory                       │
    │  └── APC to process not created by queuing process                 │
    └─────────────────────────────────────────────────────────────────────┘

    Early Bird Indicators:
    ┌─────────────────────────────────────────────────────────────────────┐
    │  Process Creation:                                                   │
    │  ├── CREATE_SUSPENDED followed by memory operations                │
    │  ├── APC queued before ResumeThread                                │
    │  └── Parent process modifying child before child runs              │
    │                                                                      │
    │  Timing Analysis:                                                    │
    │  ├── Memory allocation in suspended process                        │
    │  ├── Short time between CreateProcess and ResumeThread             │
    │  └── APC queued during suspension                                  │
    └─────────────────────────────────────────────────────────────────────┘
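Several of the indicators above reduce to one question: does a target address (a new instruction pointer or an APC routine) fall inside a loaded module? The sketch below shows one way to answer it, assuming the module list has already been collected as `(base, size, name)` tuples. The function names and the sample addresses in the usage note are hypothetical.

```python
from bisect import bisect_right


def build_module_index(modules):
    """modules: iterable of (base, size, name). Returns sorted data for lookups."""
    ordered = sorted(modules)
    bases = [base for base, _size, _name in ordered]
    return bases, ordered


def owning_module(address, index):
    """Return the name of the module containing address, or None if there is none."""
    bases, ordered = index
    pos = bisect_right(bases, address) - 1  # last module starting at or below address
    if pos < 0:
        return None
    base, size, name = ordered[pos]
    return name if base <= address < base + size else None


def is_suspicious_target(address, index):
    # An address outside every loaded module matches the "non-module memory" indicator.
    return owning_module(address, index) is None
```

For example, with an index built from `[(0x7FF600000000, 0x20000, "app.exe"), (0x7FFB00000000, 0x1F0000, "ntdll.dll")]`, an address inside `ntdll.dll` resolves to that module, while an address in a private allocation is flagged. This check alone would not catch code placed inside a legitimate module's range, so it is one signal among several.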

Evasion Considerations

To evade detection, thread-based injection should:

Use Direct Syscalls: Bypass user-mode hooks on OpenThread, QueueUserAPC, etc.

Minimize Timing Gaps: Reduce time between suspension and resume to avoid time-based detection.

Target Selection: Choose processes where thread manipulation is normal behavior.

Memory Location: Use module stomping or backed memory instead of allocating new regions.

Clean Up: If using trampoline hijacking, restore original context after shellcode completes.

Summary

Thread hijacking and APC injection provide powerful alternatives to thread creation-based injection. By working with existing threads, these techniques reduce the most obvious detection indicators while still achieving code execution across process boundaries.

Technique          Creates Thread       Reliability                    Stealth
Thread Hijacking   No                   Medium (state-dependent)       High
APC Injection      No                   Medium (alertable-dependent)   High
Early Bird         No                   High                           Very High
Timer Callbacks    No (uses existing)   High                           Very High
Enum Callbacks     No                   High                           Very High

Key principles:

Prefer existing threads: Creating new threads is heavily monitored; using existing ones is harder to detect.

Understand alertable states: Standard user-mode APCs run only when the target thread enters an alertable wait state.

Timing matters: Early Bird works because it executes before the process's entry point, when user-mode security instrumentation in the new process may not be fully initialized.

Combine techniques: Layer thread injection with other evasion methods (syscalls, memory techniques) for best results.

Consider cleanup: Trampoline patterns allow continued execution without breaking target functionality.

The next chapter explores anti-analysis techniques—methods for detecting and evading debugging, sandboxes, and automated analysis systems.

References

MITRE ATT&CK: T1055.003 (Thread Execution Hijacking), T1055.004 (Asynchronous Procedure Call)
"Early Bird Injection" Research - CyberBit
Microsoft Documentation: Asynchronous Procedure Calls, Thread Context
Windows Internals: Thread Scheduling and APC Delivery
