Automating Reverse Engineering with AI/ML, Graphs, and LLM Agents

Instructor: Malachi Jones
Dates: June 15 to 18, 2026
Capacity: 25

This course teaches how to automate reverse engineering (RE) for malware, firmware, and vulnerability analysis using AI/ML, graphs, large language models (LLMs), and agents. Students begin with Blackfyre, an open-source framework that extracts binaries into a Protocol Buffers (protobuf) format for downstream analysis. Hands-on labs guide students in building a lightweight, BinQL-inspired graph analysis system using Blackfyre and Neo4j to support workflows such as malware family clustering, firmware analysis, and vulnerability tracing. A full open-source BinQL reference implementation will be released after the training cycle later this year. Additional hands-on labs cover NL2GQL, which translates natural-language RE questions into graph queries so students can focus on analysis rather than query syntax, along with transformer-based embeddings, LLM techniques including RAG and KnowledgeRAG, and the Model Context Protocol (MCP). The course culminates in applied labs on agentic workflows using fine-tuned LLaMA models and frameworks such as AutoGen for adaptive and automated reverse engineering.

Course Overview

This course teaches how to automate reverse engineering (RE) for malware, firmware, and vulnerability analysis using AI/ML, graph analysis, large language models (LLMs), and agents. Students begin with Blackfyre, a framework developed for this course and released as open source, which structures binaries into Protocol Buffers (protobuf) for downstream analysis. They also use the Blackfyre Ghidra plugin, supporting both interactive and headless execution for integration into RE pipelines. Building on this foundation, hands-on labs guide students in implementing a lightweight, BinQL-inspired graph analysis system that integrates with Neo4j to represent binaries as graphs of functions, basic blocks, imports, and strings, enabling workflows such as malware clustering, firmware ecosystem analysis, and vulnerability tracing. A full open-source BinQL reference implementation will be released after the training cycle later this year. To reduce complexity, students are introduced to NL2GQL, which translates natural-language RE questions into graph queries, allowing them to focus on analysis rather than query syntax.
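
The graph representation described above can be pictured with a small in-memory sketch. This is not BinQL or the Blackfyre API; the `BinaryGraph` class, the node identifiers, and the `HAS_FUNCTION` and `USES_IMPORT` relation names are all hypothetical, chosen only to show how binaries, functions, and imports might be linked and then queried across binaries.

```python
from collections import defaultdict


class BinaryGraph:
    """Tiny in-memory property graph: typed nodes plus labeled edges (illustrative only)."""

    def __init__(self):
        self.nodes = {}                # node id -> {"kind": ..., "name": ...}
        self.edges = defaultdict(set)  # (source id, relation) -> {target ids}

    def add_node(self, node_id, kind, name):
        self.nodes[node_id] = {"kind": kind, "name": name}

    def add_edge(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def binaries_importing(self, import_name):
        """Cross-binary query: which binaries have a function that uses this import?"""
        hits = set()
        for (source, relation), functions in self.edges.items():
            if relation != "HAS_FUNCTION":
                continue
            for func in functions:
                # .get() avoids inserting new keys while iterating over the dict.
                for imp in self.edges.get((func, "USES_IMPORT"), ()):
                    if self.nodes[imp]["name"] == import_name:
                        hits.add(self.nodes[source]["name"])
        return sorted(hits)


# Hypothetical firmware contents: two binaries that both reach strcpy.
graph = BinaryGraph()
graph.add_node("bin:httpd", "Binary", "httpd")
graph.add_node("bin:updater", "Binary", "updater")
graph.add_node("fn:httpd:parse", "Function", "parse_request")
graph.add_node("fn:updater:fetch", "Function", "fetch_update")
graph.add_node("fn:updater:log", "Function", "log_status")
graph.add_node("imp:strcpy", "Import", "strcpy")
graph.add_node("imp:printf", "Import", "printf")
graph.add_edge("bin:httpd", "HAS_FUNCTION", "fn:httpd:parse")
graph.add_edge("bin:updater", "HAS_FUNCTION", "fn:updater:fetch")
graph.add_edge("bin:updater", "HAS_FUNCTION", "fn:updater:log")
graph.add_edge("fn:httpd:parse", "USES_IMPORT", "imp:strcpy")
graph.add_edge("fn:updater:fetch", "USES_IMPORT", "imp:strcpy")
graph.add_edge("fn:updater:log", "USES_IMPORT", "imp:printf")

shared = graph.binaries_importing("strcpy")
```

In a graph database such as Neo4j, the same traversal would be written as a declarative query rather than nested loops, which is what makes it practical across large sets of binaries.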

The second half of the course focuses on embeddings, transformers, and LLM-driven automation. Students learn to convert binary artifacts (functions, strings, imports, and basic blocks) into vector embeddings for similarity detection, clustering, function name prediction, and vulnerability analysis. A central technique is BasicBlockRank (BBR), which uses control-flow and call graphs to rank basic blocks, with referenced artifacts inheriting their importance, improving embedding quality for downstream tasks. These embeddings also serve as the foundation for RAG, KnowledgeRAG, and agent workflows, where they ground retrieval, reasoning, and decision-making. Building on this, the course introduces transformers for function name prediction and binary similarity, and agent pipelines using AutoGen and the Model Context Protocol (MCP). It concludes with fine-tuning LLaMA models via LLaMAFactory to improve RE-specific applications such as function labeling, reporting, and NL2GQL accuracy.
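
As a rough illustration of the ranking idea, the sketch below assumes a PageRank-style iteration over a toy control-flow graph and then lets each string inherit the summed rank of the blocks that reference it. This description does not specify how BBR is actually computed, so the damping factor, the sum-based inheritance rule, and every block and string name here are assumptions for illustration only.

```python
from collections import defaultdict


def rank_blocks(edges, damping=0.85, iterations=50):
    """PageRank-style scores for basic blocks, given (source, target) CFG edges."""
    nodes = sorted({node for edge in edges for node in edge})
    successors = defaultdict(list)
    for source, target in edges:
        successors[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # Blocks without successors spread their rank evenly over all blocks.
            targets = successors[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def inherit_importance(block_rank, references):
    """Each artifact receives the summed rank of the blocks that reference it."""
    scores = defaultdict(float)
    for block, artifacts in references.items():
        for artifact in artifacts:
            scores[artifact] += block_rank[block]
    return dict(scores)


# Hypothetical CFG: a loop between "check" and "decrypt", plus an exit block.
cfg_edges = [
    ("entry", "check"),
    ("check", "decrypt"),
    ("check", "exit"),
    ("decrypt", "check"),
]
# Hypothetical string references per block.
string_refs = {"check": ["xor_key"], "decrypt": ["xor_key"], "exit": ["goodbye"]}

block_rank = rank_blocks(cfg_edges)
artifact_scores = inherit_importance(block_rank, string_refs)
```

In this toy graph the loop header `check` collects the most rank, so the string referenced inside the loop ends up scoring higher than the one referenced only at exit.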

Topics by Day

Day 1: Introduction to Core Concepts and Techniques
- Automated RE overview and challenges of manual workflows
- Fundamentals of binary analysis (IRs, pyvex/angr, Ghidra p-code) and Blackfyre: extraction, protobuf output, and Python library (with Ghidra plugin for interactive and headless modes)
- Introduction to graphs for RE and BinQL: representing binaries as functions, basic blocks, strings, and imports, and querying them to support cross-binary workflows
- Introduction to core AI/ML concepts for RE: basics of NLP and neural networks (vectors, embeddings, and why they matter for binaries)
- Labs: Use the Blackfyre Ghidra plugin in both interactive and headless modes to extract artifacts, then build and query binary graphs with BinQL as the foundation for cross-binary and ML-driven analysis

Day 2: Graph Workflows & Cross-Binary Analysis
- Cross-binary workflows: clustering malware families, analyzing firmware ecosystems, and tracing shared vulnerabilities
- NL2GQL: translating natural-language RE questions into graph queries
- Knowledge graphs for representing and reasoning over sets or systems of binaries
- BasicBlockRank (BBR): a graph-based approach using CFGs and call graphs to rank basic blocks, with artifacts (e.g., strings, imports, functions) inheriting the importance of the blocks that reference them
- Labs: Apply graph analysis to sets of binaries, use BBR to propagate block importance to artifacts, and improve the embedding quality for ecosystem-scale RE tasks

Day 3: Transformers & Neural Approaches for RE
- Core ML and neural network concepts for reverse engineering, including embeddings for functions and strings
- Transformer architectures: token embeddings, self-attention, positional encoding, feed-forward networks, masked language modeling (MLM) as a training objective, and decoding strategies (greedy, beam search)
- Long-context sequence handling with sparse attention, focusing on Longformer (a Transformer variant that reduces attention cost for long sequences) to scale analysis of extended binary and symbolic inputs
- BasicBlockRank (BBR)-weighted embeddings for malware detection, function labeling, and similarity analysis, where artifact importance is inherited from critical basic blocks
- Labs: Generate embeddings with transformers, apply Longformer for long-context binary inputs, and demonstrate how BBR weighting improves RE tasks such as function name prediction and similarity search

Day 4: LLMs, Agents & Fine-Tuning
- LLMs for RE: summarization, function labeling, and vulnerability reporting
- Prompt engineering, RAG, and KnowledgeRAG for RE automation
- Embeddings as the foundation for agent workflows, building on Day 3 methods to power retrieval, grounding, and reasoning over binaries
- Agentic workflows: single-agent and multi-agent pipelines using AutoGen
- Labs: Combine embeddings from Day 3 with LLMs, AutoGen, and the Model Context Protocol (MCP: a standard for connecting LLMs to external tools and data) to build effective agent pipelines for RE automation
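
The link between weighted embeddings (Day 3) and retrieval (Day 4) can be sketched in a few lines. The example below assumes that a function embedding is a weighted average of its basic-block embeddings, with weights standing in for BBR scores, and that retrieval is a nearest-neighbor lookup by cosine similarity. The vectors, weights, and function names are made up for illustration; the course's actual embedding models and retrieval stack are not shown here.

```python
import math


def weighted_embedding(vectors, weights):
    """Weighted average of equally sized vectors; weights need not sum to 1."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(weight * vector[i] for vector, weight in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_match(query, corpus):
    """Name of the corpus entry whose vector is most similar to the query."""
    return max(corpus, key=lambda name: cosine(query, corpus[name]))


# Hypothetical 3-dimensional block embeddings and BBR-style importance weights.
block_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
block_weights = [0.9, 0.1]
function_vector = weighted_embedding(block_vectors, block_weights)

# Hypothetical index of known, already labeled functions.
corpus = {
    "decrypt_payload": [1.0, 0.1, 0.0],
    "print_banner": [0.0, 1.0, 0.2],
}
best = top_match(function_vector, corpus)
```

Because the first block carries most of the weight, the resulting vector sits close to `decrypt_payload`. In a RAG-style pipeline, the retrieved entry and its metadata would then be handed to the LLM as grounding context.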

Hardware/Software Requirements

Students need a laptop with at least 32 GB of RAM, 250 GB of free disk space, and a processor with at least 4 cores (Intel i7 equivalent or better). The processor must use the x86_64 architecture to be compatible with the course-provided virtual machine (VM) and to run VirtualBox 7.1 or later. The processor must also support AVX (Advanced Vector Extensions), which is required to run machine learning frameworks such as TensorFlow and PyTorch. Internet connectivity is required to access the external services used in the LLM components of the course. VirtualBox should be installed before the course begins so that students can participate in the hands-on labs and exercises.
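
One way to check the AVX requirement ahead of time is sketched below. It assumes a Linux system, where CPU feature flags are listed in `/proc/cpuinfo`; the helper names are illustrative, and on Windows or macOS hosts the CPU vendor's specification page is a more reliable source.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text):
    """True when the exact flag 'avx' is present in the flags lines."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # present on Linux only
    if cpuinfo.exists():
        print("AVX supported:", supports_avx(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo found; check the CPU specification instead.")
```

Running the same check inside the course VM is also worthwhile, since a guest does not always see every feature of the host CPU.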

Prerequisites

Students should have a solid foundation in reverse engineering and be comfortable with Python object-oriented development. Familiarity with basic ML concepts (e.g., vectors, supervised learning, precision/recall) is helpful but not required; these topics will be introduced and covered at the start of the course to bring all participants to a common baseline.

Objectives

Automate malware, firmware, and vulnerability analysis workflows using Blackfyre, BinQL/Neo4j, and embeddings for tasks such as similarity detection, function name prediction, clustering, and vulnerability detection.
Scale analysis beyond single binaries, applying graphs and embeddings to malware families and firmware systems to surface ecosystem-level insights and accelerate vulnerability discovery.
Leverage LLMs effectively through RAG, KnowledgeRAG, AutoGen agents, MCP, and fine-tuning with LLaMAFactory, enabling natural-language queries, accurate summarization, vulnerability reporting, and RE-specific function labeling.

Who Should Take This Course

Reverse engineers and malware analysts who want to automate manual workflows such as function labeling, similarity detection, and vulnerability tracing.
Firmware analysts interested in scaling analysis across systems of binaries and identifying ecosystem-level vulnerabilities.
Security researchers and engineers with Python experience who want to apply AI/ML methods, graphs, and LLMs to binary analysis.
Practitioners exploring AI/ML for RE who may already know RE fundamentals but want hands-on exposure to embeddings, graph databases, and LLM-driven workflows.

Who Would Not Be a Good Fit for This Course

Participants with no prior reverse engineering experience — the course assumes familiarity with RE concepts and tools.

Changes from the Previous Offering of the Course

This year's course expands beyond earlier versions by introducing graph-driven workflows, advanced LLM methods, and agentic automation for reverse engineering:

BinQL with Neo4j: Students gain exclusive early access to BinQL, which structures binaries as graphs of functions, basic blocks, imports, and strings, and enables cross-binary analysis through the GQL query standard.
Natural Language to GQL (NL2GQL): Reverse engineering questions expressed in everyday RE language (e.g., "list binaries in this firmware image that call vulnerable function X") are translated into precise graph queries.
Knowledge Graphs for Scaling Analysis: Moving from single-binary workflows to ecosystem-level insights across malware families or firmware systems of binaries.
KnowledgeRAG: Extending RAG with embeddings and knowledge graphs, grounding LLM reasoning in structured RE data for summarization, vulnerability reporting, and signature generation.
Agentic LLMs with AutoGen: Introducing autonomous, tool-using LLM agents that can plan, reason, and iteratively interact with RE systems such as disassemblers and graph databases, enabling adaptive automation of workflows like large-scale triage and recursive malware family exploration.
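
To make the NL2GQL item above concrete, the sketch below shows what the output of such a translation might look like for the example question. It is a rule-based stand-in (a single regular expression), not the LLM-driven NL2GQL taught in the course, and the graph schema in the query (`Firmware`, `Binary`, `Function`, `CONTAINS`, `HAS_FUNCTION`, `CALLS`) is hypothetical. The query text uses Cypher-style syntax with parameters.

```python
import re

# Hypothetical schema:
# (:Firmware)-[:CONTAINS]->(:Binary)-[:HAS_FUNCTION]->(:Function)-[:CALLS]->(:Function)
CALLERS_QUERY = (
    "MATCH (:Firmware {name: $firmware})-[:CONTAINS]->(b:Binary)"
    "-[:HAS_FUNCTION]->(:Function)-[:CALLS]->(:Function {name: $function}) "
    "RETURN DISTINCT b.name AS binary"
)

# One recognized question shape; a real system would not rely on a fixed pattern.
QUESTION_PATTERN = re.compile(
    r"list binaries in (?P<firmware>\S+) that call (?:vulnerable )?function (?P<function>\w+)",
    re.IGNORECASE,
)


def translate(question):
    """Return (query, parameters) for a recognized question, or None otherwise."""
    match = QUESTION_PATTERN.search(question)
    if match is None:
        return None
    parameters = {"firmware": match["firmware"], "function": match["function"]}
    return CALLERS_QUERY, parameters


result = translate("List binaries in router_v2.bin that call vulnerable function strcpy")
```

Keeping the firmware and function names as query parameters, rather than splicing them into the query text, avoids injection problems when the question comes from untrusted input.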

Bio

Dr. Malachi Jones is a Principal Cybersecurity AI/LLM Researcher and Manager at Microsoft, where he currently leads a team advancing red team agent autonomy within Microsoft Security AI (MSECAI). His present focus is on building autonomous red team agents, while his earlier work centered on fine-tuning large language models (LLMs) for security tasks and developing reverse engineering capabilities in Security Copilot.

With over 15 years in security research, Dr. Jones has contributed to both academia and industry. At MITRE, he advanced ML- and IR-based approaches for automated reverse engineering, and at Booz Allen Dark Labs, he specialized in embedded security and co-authored US Patent 10,133,871.

In addition to his work at Microsoft, Dr. Jones is the founder of Jones Cyber-AI, an organization dedicated to independent research and teaching initiatives. Through Jones Cyber-AI, he has developed and taught his specialized course, Automating Reverse Engineering Processes with AI/ML, NLP, and LLMs, at premier conferences including Black Hat USA (2019, 2021, 2023–2025) and RECON Montreal (2023–2025). His independent research in AI/ML, graphs, and LLM agents keeps his courses aligned with the latest advances in cybersecurity and reverse engineering.

He previously served as an Adjunct Professor at the University of Maryland, College Park, and holds a B.S. in Computer Engineering from the University of Florida, as well as an M.S. and Ph.D. from Georgia Tech, where his research applied game theory to cybersecurity. His expertise continues to drive innovation in AI-driven cybersecurity and automated reverse engineering.
