Welcome to Reversing: Secrets of Reverse Engineering. This book was written
after years of working on software development projects that repeatedly
required reverse engineering of third-party code for a variety of reasons. At
first this was a fairly tedious process that was only performed when there was
simply no alternative means of getting information. Then all of a sudden, a
certain mental barrier was broken and I found myself rapidly sifting through
undocumented machine code, quickly deciphering its meaning and getting
the answers I wanted regarding the code’s function and purpose. At that point
it dawned on me that this was a remarkably powerful skill, because it meant
that I could fairly easily get answers to any questions I had regarding software
I was working with, even when I had no access to the relevant documentation
or to the source code of the program in question. This book is about providing
knowledge and techniques to allow anyone with a decent understanding of
software to do just that.
The idea is simple: we should develop a solid understanding of low-level
software, and learn techniques that will allow us to easily dig into any program’s
binaries and retrieve information. Not sure why a system behaves the
way it does and no one else has the answers? No problem—dig into it on your
own and find out. Sounds scary and unrealistic? It’s not, and this is the very
purpose of this book: to teach and demonstrate reverse engineering techniques
that can be applied daily to solve a wide variety of problems.
But I’m getting ahead of myself. For those of you who haven’t been exposed
to the concept of software reverse engineering, a little introduction is in order.
Executive Editor: Robert Elliott
Development Editor: Eileen Bien Calabro
Copy Editor: Foxxe Editorial Services
Editorial Manager: Mary Beth Wakefield
Vice President & Executive Group Publisher: Richard Swadley
Vice President and Publisher: Joseph B. Wikert
Project Editor: Pamela Hanley
Project Coordinator: Ryan Steffen
Graphics and Production Specialists: Denny Hager, Jennifer Heleine, Lynsey Osborn, Mary Gillot Virgin
Quality Control Technician: Leeann Harney
Proofreading and Indexing: TECHBOOKS Production Services
Cover Designer: Michael Trent
Contents
Foreword
Acknowledgments
Introduction
Part I Reversing
Chapter 1 Foundations
What Is Reverse Engineering?
Software Reverse Engineering: Reversing
Reversing Applications
Security-Related Reversing
Malicious Software
Reversing Cryptographic Algorithms
Digital Rights Management
Auditing Program Binaries
Reversing in Software Development
Achieving Interoperability with Proprietary Software
Developing Competing Software
Evaluating Software Quality and Robustness
Low-Level Software
Assembly Language
Compilers
Virtual Machines and Bytecodes
Operating Systems
The Reversing Process
System-Level Reversing
Code-Level Reversing
The Tools
System-Monitoring Tools
Disassemblers
Debuggers
Decompilers
Is Reversing Legal?
Interoperability
Competition
Copyright Law
Trade Secrets and Patents
The Digital Millennium Copyright Act
DMCA Cases
License Agreement Considerations
Code Samples & Tools
Conclusion
Chapter 2 Low-Level Software
High-Level Perspectives
Program Structure
Modules
Common Code Constructs
Data Management
Variables
User-Defined Data Structures
Lists
Control Flow
High-Level Languages
C
C++
Java
C#
Low-Level Perspectives
Low-Level Data Management
Registers
The Stack
Heaps
Executable Data Sections
Control Flow
Assembly Language 101
Registers
Flags
Instruction Format
Basic Instructions
Moving Data
Arithmetic
Comparing Operands
Conditional Branches
Function Calls
Examples
A Primer on Compilers and Compilation
Defining a Compiler
Compiler Architecture
Front End
Intermediate Representations
Optimizer
Back End
Listing Files
Specific Compilers
Execution Environments
Software Execution Environments (Virtual Machines)
Bytecodes
Interpreters
Just-in-Time Compilers
Reversing Strategies
Hardware Execution Environments in Modern Processors
Intel NetBurst
μops (Micro-Ops)
Pipelines
Branch Prediction
Conclusion
Chapter 3 Windows Fundamentals
Components and Basic Architecture
Brief History
Features
Supported Hardware
Memory Management
Virtual Memory and Paging
Paging
Page Faults
Working Sets
Kernel Memory and User Memory
The Kernel Memory Space
Section Objects
VAD Trees
User-Mode Allocations
Memory Management APIs
Objects and Handles
Named objects
Processes and Threads
Processes
Threads
Context Switching
Synchronization Objects
Process Initialization Sequence
Application Programming Interfaces
The Win32 API
The Native API
System Calling Mechanism
Executable Formats
Basic Concepts
Image Sections
Section Alignment
Dynamically Linked Libraries
Headers
Imports and Exports
Directories
Input and Output
The I/O System
The Win32 Subsystem
Object Management
Structured Exception Handling
Conclusion
Chapter 4 Reversing Tools
Different Reversing Approaches
Offline Code Analysis (Dead-Listing)
Live Code Analysis
Disassemblers
IDA Pro
ILDasm
Debuggers
User-Mode Debuggers
OllyDbg
User Debugging in WinDbg
IDA Pro
PEBrowse Professional Interactive
Kernel-Mode Debuggers
Kernel Debugging in WinDbg
Numega SoftICE
Kernel Debugging on Virtual Machines
Decompilers
System-Monitoring Tools
Patching Tools
Hex Workshop
Miscellaneous Reversing Tools
Executable-Dumping Tools
DUMPBIN
PEView
PEBrowse Professional
Conclusion
Part II Applied Reversing
Chapter 5 Beyond the Documentation
Reversing and Interoperability
Laying the Ground Rules
Locating Undocumented APIs
What Are We Looking For?
Case Study: The Generic Table API in NTDLL.DLL
RtlInitializeGenericTable
RtlNumberGenericTableElements
RtlIsGenericTableEmpty
RtlGetElementGenericTable
Setup and Initialization
Logic and Structure
Search Loop 1
Search Loop 2
Search Loop 3
Search Loop 4
Reconstructing the Source Code
RtlInsertElementGenericTable
RtlLocateNodeGenericTable
RtlRealInsertElementWorker
Splay Trees
RtlLookupElementGenericTable
RtlDeleteElementGenericTable
Putting the Pieces Together
Conclusion
Chapter 6 Deciphering File Formats
Cryptex
Using Cryptex
Reversing Cryptex
The Password Verification Process
Catching the “Bad Password” Message
The Password Transformation Algorithm
Hashing the Password
The Directory Layout
Analyzing the Directory Processing Code
Analyzing a File Entry
Dumping the Directory Layout
The File Extraction Process
Scanning the File List
Decrypting the File
The Floating-Point Sequence
The Decryption Loop
Verifying the Hash Value
The Big Picture
Digging Deeper
Conclusion
Chapter 7 Auditing Program Binaries
Defining the Problem
Vulnerabilities
Stack Overflows
A Simple Stack Vulnerability
Intrinsic Implementations
Stack Checking
Nonexecutable Memory
Heap Overflows
String Filters
Integer Overflows
Arithmetic Operations on User-Supplied Integers
Type Conversion Errors
Case-Study: The IIS Indexing Service Vulnerability
CVariableSet::AddExtensionControlBlock
DecodeURLEscapes
Conclusion
Chapter 8 Reversing Malware
Types of Malware
Viruses
Worms
Trojan Horses
Backdoors
Mobile Code
Adware/Spyware
Sticky Software
Future Malware
Information-Stealing Worms
BIOS/Firmware Malware
Uses of Malware
Malware Vulnerability
Polymorphism
Metamorphism
Establishing a Secure Environment
The Backdoor.Hacarmy.D
Unpacking the Executable
Initial Impressions
The Initial Installation
Initializing Communications
Connecting to the Server
Joining the Channel
Communicating with the Backdoor
Running SOCKS4 Servers
Clearing the Crime Scene
The Backdoor.Hacarmy.D: A Command Reference
Conclusion
Part III Cracking
Chapter 9 Piracy and Copy Protection
Copyrights in the New World
The Social Aspect
Software Piracy
Defining the Problem
Class Breaks
Requirements
The Theoretically Uncrackable Model
Types of Protection
Media-Based Protections
Serial Numbers
Challenge Response and Online Activations
Hardware-Based Protections
Software as a Service
Advanced Protection Concepts
Crypto-Processors
Digital Rights Management
DRM Models
The Windows Media Rights Manager
Secure Audio Path
Watermarking
Trusted Computing
Attacking Copy Protection Technologies
Conclusion
Chapter 10 Antireversing Techniques
Why Antireversing?
Basic Approaches to Antireversing
Eliminating Symbolic Information
Code Encryption
Active Antidebugger Techniques
Debugger Basics
The IsDebuggerPresent API
SystemKernelDebuggerInformation
Detecting SoftICE Using the Single-Step Interrupt
The Trap Flag
Code Checksums
Confusing Disassemblers
Linear Sweep Disassemblers
Recursive Traversal Disassemblers
Applications
Code Obfuscation
Control Flow Transformations
Opaque Predicates
Confusing Decompilers
Table Interpretation
Inlining and Outlining
Interleaving Code
Ordering Transformations
Data Transformations
Modifying Variable Encoding
Restructuring Arrays
Conclusion
Chapter 11 Breaking Protections
Patching
Keygenning
Ripping Key-Generation Algorithms
Advanced Cracking: Defender
Reversing Defender’s Initialization Routine
Analyzing the Decrypted Code
SoftICE’s Disappearance
Reversing the Secondary Thread
Defeating the “Killer” Thread
Loading KERNEL32.DLL
Reencrypting the Function
Back at the Entry Point
Parsing the Program Parameters
Processing the Username
Validating User Information
Unlocking the Code
Brute-Forcing Your Way through Defender
Protection Technologies in Defender
Localized Function-Level Encryption
Relatively Strong Cipher Block Chaining
Reencrypting
Obfuscated Application/Operating System Interface
Processor Time-Stamp Verification Thread
Runtime Generation of Decryption Keys
Interdependent Keys
User-Input-Based Decryption Keys
Heavy Inlining
Conclusion
Part IV Beyond Disassembly
Chapter 12 Reversing .NET
Ground Rules
.NET Basics
Managed Code
.NET Programming Languages
Common Type System (CTS)
Intermediate Language (IL)
The Evaluation Stack
Activation Records
IL Instructions
IL Code Samples
Counting Items
A Linked List Sample
Decompilers
Obfuscators
Renaming Symbols
Control Flow Obfuscation
Breaking Decompilation and Disassembly
Reversing Obfuscated Code
XenoCode Obfuscator
DotFuscator by Preemptive Solutions
Remotesoft Obfuscator and Linker
Remotesoft Protector
Precompiled Assemblies
Encrypted Assemblies
Conclusion
Chapter 13 Decompilation
Native Code Decompilation: An Unsolvable Problem?
Typical Decompiler Architecture
Intermediate Representations
Expressions and Expression Trees
Control Flow Graphs
The Front End
Semantic Analysis
Generating Control Flow Graphs
Code Analysis
Data-Flow Analysis
Single Static Assignment (SSA)
Data Propagation
Register Variable Identification
Data Type Propagation
Type Analysis
Primitive Data Types
Complex Data Types
Control Flow Analysis
Finding Library Functions
The Back End
Real-World IA-32 Decompilation
Conclusion
Appendix A Deciphering Code Structures
Appendix B Understanding Compiled Arithmetic
Appendix C Deciphering Program Data
Index
Reverse Engineering and Low-Level Software
Before we get into the various topics discussed throughout this book, we
should formally introduce its primary subject: reverse engineering. Reverse
engineering is a process where an engineered artifact (such as a car, a jet
engine, or a software program) is deconstructed in a way that reveals its innermost
details, such as its design and architecture. This is similar to scientific
research that studies natural phenomena, with the difference that no one commonly
refers to scientific research as reverse engineering, simply because no
one knows for sure whether or not nature was ever engineered.
In the software world, reverse engineering boils down to taking an existing
program for which source code or proper documentation is not available and
attempting to recover details regarding its design and implementation. In
some cases source code is available but the original developers who created it
are unavailable. This book deals specifically with what is commonly referred
to as binary reverse engineering. Binary reverse engineering techniques aim at
extracting valuable information from programs for which source code is
unavailable. In some cases it is possible to recover the actual source code (or a
similar high-level representation) from the program binaries, which greatly
simplifies the task because reading code presented in a high-level language is
far easier than reading low-level assembly language code. In other cases we
end up with a fairly cryptic assembly language listing that describes the program.
This book explains this process and why things work this way, while
describing in detail how to decipher the program’s code in a variety of different environments.
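To give a feel for what an "assembly language listing" recovered from a binary looks like, here is a minimal sketch, not taken from this book or from any real tool. It decodes only five IA-32 opcodes. The sample bytes in `CODE` and the function name `disassemble` are hypothetical, chosen for illustration; the bytes are roughly what an unoptimized compiler might emit for a C function that simply returns 42.

```python
import struct

# Hypothetical sample: ten bytes of IA-32 machine code, roughly what an
# unoptimized compiler might emit for `int f(void) { return 42; }`.
CODE = bytes([0x55, 0x89, 0xE5, 0xB8, 0x2A, 0x00, 0x00, 0x00, 0x5D, 0xC3])

# 32-bit general-purpose registers, indexed by their 3-bit encoding.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def disassemble(code):
    """Decode a tiny subset of IA-32 opcodes into assembly text.

    Handles only PUSH r32, POP r32, MOV r32, imm32, register-to-register
    MOV (opcode 0x89), and RET. Anything else raises ValueError.
    """
    listing = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0x50 <= op <= 0x57:      # PUSH r32: register is in the low 3 bits
            listing.append(f"push {REG32[op - 0x50]}")
            i += 1
        elif 0x58 <= op <= 0x5F:    # POP r32
            listing.append(f"pop {REG32[op - 0x58]}")
            i += 1
        elif 0xB8 <= op <= 0xBF:    # MOV r32, imm32 (little-endian immediate)
            (imm,) = struct.unpack_from("<I", code, i + 1)
            listing.append(f"mov {REG32[op - 0xB8]}, {imm:#x}")
            i += 5
        elif op == 0x89:            # MOV r/m32, r32, followed by a ModRM byte
            modrm = code[i + 1]
            if modrm >> 6 != 0b11:
                raise ValueError("memory operands are not handled in this sketch")
            src = REG32[(modrm >> 3) & 7]
            dst = REG32[modrm & 7]
            listing.append(f"mov {dst}, {src}")
            i += 2
        elif op == 0xC3:            # RET
            listing.append("ret")
            i += 1
        else:
            raise ValueError(f"unsupported opcode {op:#04x} at offset {i}")
    return listing


if __name__ == "__main__":
    for line in disassemble(CODE):
        print(line)
```

Run as a script, this prints `push ebp`, `mov ebp, esp`, `mov eax, 0x2a`, `pop ebp`, and `ret`, one per line. Even with the mnemonics recovered, nothing in the listing names the function or states its intent; working that out is the deciphering step described above.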
I’ve decided to name this book “Reversing”, which is the term used by many
online communities to describe reverse engineering. Because the term
reversing can be seen as a nickname for reverse engineering, I will be using the
two terms interchangeably throughout this book.
Most people get a bit anxious when they imagine trying to extract
meaningful information from an executable binary, and I’ve made it the primary
goal of this book to prove that this fear is not justified. Binary reverse
engineering works: it can solve problems that are often incredibly difficult to
solve in any other way, and it is not as difficult as you might think once you
approach it in the right way.
This book focuses on reverse engineering, but it actually teaches a great deal
more than that. Reverse engineering is frequently used in a variety of environments
in the software industry, and one of the primary goals of this book is to
explore many of these fields while teaching reverse engineering.
Here is a brief listing of some of the topics discussed throughout this book:
■■ Assembly language for IA-32 compatible processors and how to read
compiler-generated assembly language code.
■■ Operating system internals and how to reverse engineer an operating
system.
■■ Reverse engineering on the .NET platform, including an introduction to
the .NET development platform and its assembly language: MSIL.
■■ Data reverse engineering: how to decipher an undocumented file format
or network protocol.
■■ The legal aspects of reverse engineering: when is it legal and when is
it not?
■■ Copy protection and digital rights management technologies.
■■ How reverse engineering is applied by crackers to defeat copy protection
technologies.
■■ Techniques for preventing people from reverse engineering code and a
sober attempt at evaluating their effectiveness.
■■ The general principles behind modern-day malicious programs and
how reverse engineering is applied to study and neutralize such
programs.
■■ A live session in which a real-world malicious program is dissected,
revealing how an attacker can communicate with the program
to gain control of infected systems.
■■ The theory and principles behind decompilers, and their effectiveness
on the various low-level languages.