Automating Reverse Engineering with AI/ML, Graphs, and LLM Agents
Instructors: Malachi Jones & Joe Mansour
Dates: June 15–18, 2026
Location: DoubleTree by Hilton Montreal
Capacity: 25
Course Overview
This course shifts reverse engineering from isolated single-binary analysis to system-level reasoning by unifying partial program facts recovered from disassembly into a single graph. That graph grounds LLMs and agents in explicit program structure, enabling scalable, evidence-driven automation.
Students begin with Blackfyre, an open-source framework developed for this course that extracts core program artifacts—functions, basic blocks, control flow, calls, imports, and strings—using both interactive and headless Ghidra workflows. This provides a repeatable foundation for prioritizing, comparing, and automating reverse engineering at scale.
Extracted artifacts are loaded into a Neo4j-backed, BinQL-inspired program graph that supports behavioral binary similarity, malware clustering, firmware ecosystem analysis, and vulnerability triage. NL2GQL translates natural-language analysis questions into executable graph queries, making the graph the interface between analyst intent, LLM reasoning, and agent actions. A full open-source reference implementation of BinQL will be released after the training cycle.
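To give a sense of what NL2GQL targets, the sketch below shows one hand-written translation from a natural-language question ("which functions call risky API X?") into a parameterized Cypher-style graph query. The node labels (Function, Import), the CALLS relationship, and the function name are illustrative assumptions; BinQL's actual schema and query dialect may differ.

```python
# Hypothetical sketch of an NL2GQL translation target. The graph schema
# (Function, Import, CALLS) is assumed for illustration only.

def nl2gql_risky_api(api_name: str):
    """Translate 'which functions call <api>?' into a Cypher-style query."""
    # Parameterize the import name instead of splicing it into the query text,
    # so the generated query is safe to reuse across APIs.
    query = (
        "MATCH (f:Function)-[:CALLS]->(i:Import {name: $api}) "
        "RETURN f.name AS caller"
    )
    return query, {"api": api_name}

query, params = nl2gql_risky_api("strcpy")
```

In practice the query string and parameters would be handed to a Neo4j driver session; here they are only constructed, since the point is the shape of the translation, not database access.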
The second half of the course builds learning and automation directly on top of this graph. Graph-referenced artifacts are transformed into embeddings for similarity, clustering, and retrieval, guided by Basic Block Rank (BBR) to determine which code paths and artifacts matter most. Transformer models and LLMs extend this pipeline, culminating in fine-tuning LLaMA-family models on A6000 GPUs to improve NL2GQL accuracy for BinQL. The course concludes with agent-based workflows using AutoGen and MCP to automate tasks such as patch impact analysis, N-day vulnerability triage, summarization, and YARA rule generation—while remaining grounded in graph evidence.
Topics by Day
Day 1: Introduction to Core Concepts and Techniques
Establishes a shared graph-based representation so reverse-engineering questions can be asked systematically rather than ad hoc.
- Automated RE overview and challenges of manual workflows, motivating representations that support reuse, scalability, and automation
- Fundamentals of binary analysis (IRs, pyvex/angr, Ghidra p-code) and Blackfyre: extracting functions, basic blocks, control flow, calls, imports, and strings into protobuf via interactive and headless workflows
- Graph-based program modeling from the start: loading artifacts into Neo4j and introducing BinQL and NL2GQL to ask structural questions such as reachability, input entry points, risky APIs, nearby evidence, and connecting paths
- Labs: Students extract artifacts with Blackfyre, load them into a program graph, and use BinQL and NL2GQL to answer core vulnerability and malware triage questions, establishing the representation used throughout the course
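The reachability questions above reduce to standard graph traversals once call edges are in the graph. The sketch below answers "can a risky API be reached from this entry point?" with a breadth-first search over a toy call graph; the function names are invented for illustration, and in the labs the edges come from the Neo4j program graph rather than a Python dict.

```python
from collections import deque

# Toy call graph: function -> functions it calls. Names are invented;
# in the course these edges are loaded from Blackfyre output into Neo4j.
CALLS = {
    "main":        ["parse_input", "log_init"],
    "parse_input": ["read_packet"],
    "read_packet": ["strcpy"],   # risky API reachable from an input entry point
    "log_init":    ["fopen"],
}

def reachable(graph, src, dst):
    """Breadth-first search: is dst reachable from src along call edges?"""
    seen, work = {src}, deque([src])
    while work:
        cur = work.popleft()
        if cur == dst:
            return True
        for nxt in graph.get(cur, []):
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return False
```

A graph query engine answers the same question declaratively (a path-existence match), but the traversal above is the operation being performed under the hood.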
Day 2: Graph Workflows & Cross-Binary Analysis
Uses the program graph as an analysis engine for comparing behavior, structure, and risk across binaries.
- Cross-binary reasoning using shared structure, including malware clustering, firmware ecosystem analysis, and tracing reused or shared vulnerable code
- Refining NL2GQL queries to constrain results and inspect returned graph evidence rather than treating model output as authoritative
- Basic Block Rank (BBR): ranking execution-relevant basic blocks so importance propagates to referenced artifacts, revealing which imports, strings, and functions actually matter
- Labs: Students apply graph-based workflows for behavioral binary similarity using import call traces, graph-driven malware analysis for capability mapping and clustering, and graph-based vulnerability analysis to prioritize unsafe APIs, reachable exploit paths, reused code, and complex or rarely exercised regions
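BBR's exact scoring is specific to the course, but the propagation idea can be illustrated with a PageRank-style iteration over a toy control-flow graph: importance flows along edges, so blocks that sit on many paths accumulate higher scores. Everything below (block names, damping value, the use of PageRank itself) is an illustrative assumption, not the course's BBR implementation.

```python
def pagerank(graph, damping=0.85, iters=50):
    """PageRank-style importance over a directed graph (node -> successors).
    Illustrates how a BBR-like score could propagate along control-flow
    edges; the course's actual BBR formulation may differ."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            succs = graph[v]
            if succs:
                share = damping * rank[v] / len(succs)
                for s in succs:
                    nxt[s] += share
            else:
                # Dangling block (no successors): spread its mass uniformly.
                for s in nodes:
                    nxt[s] += damping * rank[v] / n
        rank = nxt
    return rank

# Toy CFG: "bb_hot" lies on every path, so it outranks the entry block.
CFG = {
    "entry":   ["bb_hot"],
    "bb_a":    ["bb_hot"],
    "bb_hot":  ["bb_exit"],
    "bb_exit": [],
}
```

Once blocks are scored, importance can be pushed outward to the imports and strings those blocks reference, which is the step that surfaces "which artifacts actually matter."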
Day 3: Transformers & Neural Approaches for RE
Extends graph-based program analysis with neural representations that support downstream tasks such as similarity, function naming, and binary-level reasoning.
- Practical introduction to embeddings for RE, focusing on how graph-selected artifacts are encoded for comparison, grouping, and reuse across analyses
- Transformer concepts introduced through familiar RE artifacts, including tokenization of binary-derived sequences and long-context handling with sparse attention
- Using Basic Block Rank (BBR) to select and weight program regions and referenced artifacts so embeddings emphasize execution-relevant behavior over noise
- Labs: Students generate embeddings from graph-selected artifacts, aggregate them using BBR, and apply them to downstream tasks including function name prediction, binary similarity, and cross-binary retrieval, writing the results back into the program graph
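The aggregation step above can be sketched in a few lines: given per-block embeddings and per-block importance scores (standing in for BBR), compute a weighted mean to get a function-level embedding, then compare functions with cosine similarity. The vectors and weights here are toy values, not output from any real embedding model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def aggregate(block_embeddings, weights):
    """Weighted mean of basic-block embeddings; weights stand in for
    BBR scores so hot blocks dominate the function-level embedding."""
    total = sum(weights)
    dim = len(block_embeddings[0])
    return [
        sum(w * e[i] for w, e in zip(weights, block_embeddings)) / total
        for i in range(dim)
    ]
```

With real embeddings, the resulting function vectors would be written back into the program graph as node properties so similarity and retrieval queries can run alongside structural ones.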
Day 4: LLMs, Agents & Fine-Tuning
Integrates fine-tuned LLMs and agents to automate analysis while remaining grounded in graph evidence.
- Applying LLMs to RE tasks such as summarization, function labeling, vulnerability reporting, and rule drafting using structured program facts
- Fine-tuning LLaMA-family models on A6000 GPUs to improve NL2GQL accuracy for BinQL, increasing reliability and structural correctness of generated queries
- Agentic pipelines using AutoGen and MCP, where fine-tuned models, graph queries, and embeddings coordinate retrieval, reasoning, and verification
- Labs: Students fine-tune NL2GQL models and deploy agent workflows over the program graph for patch impact analysis, N-day vulnerability triage, graph-grounded summarization, firmware ecosystem analysis, and automated YARA rule generation
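Of the agent tasks listed, YARA rule drafting is the easiest to sketch without an agent framework: collect distinctive strings from the program graph and emit a rule skeleton. The function below is a deliberately minimal sketch with invented names; real generation would filter out common strings, add rule metadata, and validate the result with the yara toolchain.

```python
def yara_rule_from_strings(rule_name, strings, min_match=2):
    """Draft a YARA rule from strings selected out of the program graph.
    Minimal sketch: no metadata section, no string filtering."""
    lines = [f"rule {rule_name}", "{", "    strings:"]
    for i, s in enumerate(strings):
        # Escape backslashes and quotes so the string is a valid YARA literal.
        escaped = s.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}"')
    lines += ["    condition:", f"        {min_match} of them", "}"]
    return "\n".join(lines)

rule = yara_rule_from_strings("suspected_family", ["c2.example.net", "!decrypt_cfg"])
```

In an agentic pipeline, an LLM would choose which graph-ranked strings are distinctive enough to include, and a verification step would test the drafted rule against known samples before it is kept.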
What Should Students Bring?
Students should bring a laptop that meets the following requirements:
- At least 32 GB RAM, 250 GB of free disk space, and a processor with 4 or more cores (comparable to an Intel i7 or better)
- An x86_64 processor, required for compatibility with the course-provided virtual machine (VM) and with VirtualBox 7.1 or later
- Processor support for AVX (Advanced Vector Extensions), required by machine learning frameworks such as TensorFlow and PyTorch
- Internet connectivity, needed to reach the external services used in the large language model (LLM) portions of the course
- VirtualBox pre-installed, to enable participation in the hands-on labs and exercises
Prerequisites
Students should have a solid foundation in reverse engineering and be comfortable with Python object-oriented development. Familiarity with basic ML concepts (e.g., vectors, supervised learning, precision/recall) is helpful but not required; these topics are introduced at the start of the course to establish a common baseline.
Objectives
- Automate malware, firmware, and vulnerability analysis workflows using Blackfyre, BinQL/Neo4j, and embeddings for tasks such as similarity detection, function name prediction, clustering, and vulnerability detection.
- Scale analysis beyond single binaries, applying graphs and embeddings to malware families and firmware systems to surface ecosystem-level insights and accelerate vulnerability discovery.
- Leverage LLMs effectively through RAG, KnowledgeRAG, AutoGen agents, MCP, and fine-tuning with LLaMA-Factory, enabling natural-language queries, accurate summarization, vulnerability reporting, and RE-specific function labeling.
Who Should Take This Course
- Reverse engineers and malware analysts who want to automate manual workflows such as function labeling, similarity detection, and vulnerability tracing.
- Firmware analysts interested in scaling analysis across systems of binaries and identifying ecosystem-level vulnerabilities.
- Security researchers and engineers with Python experience who want to apply AI/ML methods, graphs, and LLMs to binary analysis.
- Practitioners exploring AI/ML for RE who may already know RE fundamentals but want hands-on exposure to embeddings, graph databases, and LLM-driven workflows.
Who Would Not Be a Good Fit for This Course
- Participants with no prior reverse engineering experience — the course assumes familiarity with RE concepts and workflows.
- Those without Python programming skills, since labs require writing and modifying Python code.
- Students looking for a broad introduction to AI/ML — the course emphasizes applying AI/ML techniques in the context of reverse engineering, not general-purpose machine learning training.
Changes from the Previous Offering of the Course
This year's course expands beyond earlier versions by introducing graph-driven workflows, advanced LLM methods, and agentic automation for reverse engineering:
- BinQL with Neo4j: Students gain exclusive early access to BinQL, which structures binaries as graphs of functions, basic blocks, imports, and strings, and enables cross-binary analysis through the GQL query standard.
- Natural Language to GQL (NL2GQL): Reverse engineering questions expressed in everyday RE language (e.g., "list binaries in this firmware image that call vulnerable function X") are translated into precise graph queries.
- Knowledge Graphs for Scaling Analysis: Moving from single-binary workflows to ecosystem-level insights across malware families or firmware systems of binaries.
- KnowledgeRAG: Extending RAG with embeddings and knowledge graphs, grounding LLM reasoning in structured RE data for summarization, vulnerability reporting, and signature generation.
- Agentic LLMs with AutoGen: Introducing autonomous, tool-using LLM agents that can plan, reason, and iteratively interact with RE systems such as disassemblers and graph databases, enabling adaptive automation of workflows like large-scale triage and recursive malware family exploration.
Instructor Bios
Dr. Malachi Jones is a Principal Cybersecurity AI/LLM Researcher and Manager at Microsoft, where he currently leads a team advancing red team agent autonomy within Microsoft Security AI (MSECAI). His present focus is on building autonomous red team agents, while his earlier work centered on fine-tuning large language models (LLMs) for security tasks and developing reverse engineering capabilities in Security Copilot.
With over 15 years in security research, Dr. Jones has contributed to both academia and industry. At MITRE, he advanced ML- and IR-based approaches for automated reverse engineering, and at Booz Allen Dark Labs, he specialized in embedded security and co-authored US Patent 10,133,871.
In addition to his work at Microsoft, Dr. Jones is the founder of Jones Cyber-AI, an organization dedicated to independent research and teaching initiatives. Through Jones Cyber-AI, he has developed and taught his specialized course, Automating Reverse Engineering Processes with AI/ML, NLP, and LLMs, at premier conferences including Black Hat USA (2019, 2021, 2023–2025) and RECON Montreal (2023–2025). His independent research in AI/ML, graphs, and LLM agents ensures his courses remain cutting-edge and aligned with the latest advances in cybersecurity and reverse engineering.
He previously served as an Adjunct Professor at the University of Maryland, College Park, and holds a B.S. in Computer Engineering from the University of Florida, as well as an M.S. and Ph.D. from Georgia Tech, where his research applied game theory to cybersecurity. His expertise continues to drive innovation in AI-driven cybersecurity and automated reverse engineering.
Joe Mansour is a Security Researcher at Microsoft. With a focus on reverse engineering malware, he develops detections to protect customers. His background spans red teaming, vulnerability assessment, and hardware hacking. Joe has contributed to projects involving automated reverse engineering, showcasing his aptitude for binary analysis and for building tools that simplify the complexities of reverse engineering. He holds an M.S. in Computer Science from Johns Hopkins University and a B.S. from the University of Illinois at Urbana-Champaign.
To Register
Click here to register.
