{"id":398,"date":"2025-09-09T13:12:58","date_gmt":"2025-09-09T13:12:58","guid":{"rendered":"https:\/\/netpack.pt\/vaitp\/?page_id=398"},"modified":"2025-09-09T14:59:54","modified_gmt":"2025-09-09T14:59:54","slug":"techniques","status":"publish","type":"page","link":"https:\/\/netpack.pt\/vaitp\/techniques\/","title":{"rendered":"Techniques"},"content":{"rendered":"\n<style>\n    .vaitp-container {\n        \/* Dark theme base styles *\/\n        font-family: -apple-system, BlinkMacSystemFont, \"Segoe UI\", Roboto, Oxygen-Sans, Ubuntu, Cantarell, \"Helvetica Neue\", sans-serif;\n        line-height: 1.6;\n        background-color: #1e1e1e; \/* Dark background *\/\n        color: #e0e0e0; \/* Light grey text for readability *\/\n        padding: 20px;\n        border-radius: 8px;\n    }\n    .vaitp-container h2 {\n        \/* Lighter border and text for headings *\/\n        border-bottom: 2px solid #00a0d2; \/* Brighter blue for accent *\/\n        padding-bottom: 10px;\n        margin-top: 30px;\n        color: #ffffff; \/* White heading text *\/\n    }\n    .vaitp-container h3 {\n        color: #f0f0f0; \/* Slightly off-white for subheadings *\/\n        margin-top: 25px;\n    }\n    .vaitp-container .highlight {\n        \/* A slightly lighter dark shade for highlighted sections *\/\n        background-color: #2c2c2c; \n        border-left: 4px solid #00a0d2; \/* Brighter blue border *\/\n        padding: 15px;\n        margin: 20px 0;\n        border-radius: 4px;\n    }\n    .vaitp-container code {\n        \/* Darker, distinct background for inline code *\/\n        background-color: #3a3a3a;\n        color: #f0f0f0; \/* Light text for code *\/\n        padding: 3px 6px;\n        border-radius: 4px;\n        font-family: \"Courier New\", Courier, monospace;\n    }\n    .vaitp-container ul, .vaitp-container ol {\n        list-style-type: disc;\n        padding-left: 20px;\n    }\n    .vaitp-container li {\n        margin-bottom: 10px;\n    }\n    
.vaitp-container a {\n        color: #00a0d2; \/* Ensure links are visible *\/\n    }\n<\/style>\n\n<div class=\"vaitp-container\">\n    <h2>Inside VAITP: A Deep Dive into the Framework&#8217;s Techniques<\/h2>\n    \n    <p>The <strong>VAITP (Vulnerability Attack and Injection Tool for Python) CLI Framework<\/strong> is an advanced system designed to automatically inject verifiable vulnerabilities into Python code. Its purpose is to create large, realistic datasets of vulnerable code to help security researchers test and develop better defensive tools. This is achieved through a sophisticated, multi-agent architecture and a rigorous verification process.<\/p>\n\n    <div class=\"highlight\">\n        <p><strong>Core Mission:<\/strong> To overcome the bottleneck of manually creating vulnerable code corpora by using Large Language Models (LLMs) to automate the injection of realistic and verifiable security flaws at scale.<\/p>\n    <\/div>\n\n    <h2>The Dual-Agent &#8220;Planner-Coder&#8221; Architecture<\/h2>\n    <p>At the heart of VAITP is a multi-agent system that divides the complex task of vulnerability injection between two specialized AI agents. This separation of concerns is a key design principle that allows for more controlled and precise code modification.<\/p>\n\n    <ul>\n        <li>\n            <strong>The Planner LLM:<\/strong> This high-level agent acts as the &#8220;brains&#8221; of the operation. It analyzes the target source code and, using one of several guidance strategies, creates a detailed plan or &#8220;meta-prompt.&#8221; The model selected for this role was <code>Llama-3.1-8B-Instruct<\/code> because of its superior ability to generate high-quality, actionable instructions.\n        <\/li>\n        <li>\n            <strong>The Coder LLM:<\/strong> This agent is the &#8220;hands.&#8221; Its sole job is to execute the instructions provided by the Planner. 
It takes the original code and the prompt and performs the precise modifications needed to inject the vulnerability. The framework is designed to work with various Coder models to find the most effective combination.\n        <\/li>\n    <\/ul>\n\n    <hr style=\"margin: 40px 0; border-color: #444;\">\n\n    <h2>Guiding LLMs with Meta-Prompting<\/h2>\n    <p>One of the core guidance strategies tested by VAITP is <strong>meta-prompting<\/strong>. In this technique, the Planner LLM generates a highly optimized, task-specific prompt for the Coder LLM to follow. This effectively decomposes the complex task into two distinct stages:<\/p>\n    <ul>\n        <li><strong>A &#8220;reasoning&#8221; stage:<\/strong> The Planner model analyzes the problem, the source code, and the vulnerability goal.<\/li>\n        <li><strong>A &#8220;task execution&#8221; stage:<\/strong> The Coder model receives a precise, structured set of commands from the Planner and executes them.<\/li>\n    <\/ul>\n    <p>In VAITP, the Planner produces a detailed list of instructions that precisely controls the Coder&#8217;s behavior.<\/p>\n\n    <h2>Retrieval-Augmented Generation (RAG)<\/h2>\n    <p>To ensure the injected vulnerabilities mirror real-world security flaws, VAITP relies heavily on <strong>Retrieval-Augmented Generation (RAG)<\/strong>. This technique enriches the LLM prompts with relevant external information, which reduces &#8220;hallucinations&#8221; and grounds the output in real examples.<\/p>\n\n    <h3>How VAITP Implements RAG:<\/h3>\n    <ol>\n        <li><strong>Knowledge Base:<\/strong> The system is built on a comprehensive knowledge base of over 1,200 vulnerable Python files derived from real-world Common Vulnerabilities and Exposures (CVEs).<\/li>\n        <li><strong>Semantic Search:<\/strong> When a task begins, the Planner queries this knowledge base to find the most semantically similar examples of the target vulnerability. 
This is done using a sentence transformer model (<code>all-MiniLM-L6-v2<\/code>) and a FAISS vector index for efficient searching.<\/li>\n        <li><strong>In-Context Learning:<\/strong> The retrieved examples are then fed directly into the prompt for the Coder LLM. This provides concrete patterns for the Coder to imitate, a strategy that proved surprisingly more effective than complex, abstract instructions.<\/li>\n    <\/ol>\n\n    <div class=\"highlight\">\n        <p><strong>The &#8220;Planner Bottleneck&#8221; Discovery:<\/strong> A key finding of our research was that for capable Coder models, a simpler strategy of direct RAG-based imitation (<code>DIRECT_RAG<\/code>) was significantly more effective than the complex meta-prompting approach. The Planner agent, while capable, could inadvertently introduce noise or ambiguity, hindering the Coder&#8217;s performance. Simpler was better.<\/p>\n    <\/div>\n\n    <h2>The Multi-Stage Automated Verification Pipeline<\/h2>\n    <p>Generating code is only half the battle. To ensure the quality and validity of the dataset, every piece of code generated by VAITP goes through a rigorous, multi-stage automated verification pipeline.<\/p>\n\n    <h3>The Three Stages of Verification:<\/h3>\n    <ol>\n        <li>\n            <strong>Syntax Check:<\/strong> The pipeline validates that the generated code is syntactically correct Python by parsing it into an abstract syntax tree (AST). Any code that fails this check is immediately discarded.\n        <\/li>\n        <li>\n            <strong>Static Analysis (SAST):<\/strong> Syntactically valid code is then analyzed by a suite of Static Application Security Testing (SAST) tools: <strong>DeVAIC<\/strong>, <strong>Bandit<\/strong>, and <strong>Semgrep<\/strong>.\n        <\/li>\n        <li>\n            <strong>LLM-based Confirmation:<\/strong> Finally, the raw reports from the SAST tools are aggregated and fed to another LLM agent. 
This &#8220;confirmation agent&#8221; acts as an automated synthesizer, weighing the evidence from all tools to make a final judgment on whether the vulnerability was successfully injected.\n        <\/li>\n    <\/ol>\n\n    <p>This automated, multi-step process allows the framework to claim that each sample in its final dataset is <strong>&#8220;statically-confirmed&#8221;<\/strong>, which provides a high degree of confidence in the corpus&#8217;s quality. At scale, the framework achieves a consistent <strong>26.5% success rate<\/strong> using models that were not fine-tuned and are thus not &#8220;jail-broken.&#8221;<\/p>\n<\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"1080\" src=\"https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method.png\" alt=\"\" class=\"wp-image-410\" srcset=\"https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method.png 1920w, https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method-300x169.png 300w, https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method-1024x576.png 1024w, https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method-768x432.png 768w, https:\/\/netpack.pt\/vaitp\/wp-content\/uploads\/2025\/09\/vaitp_cli_method-1536x864.png 1536w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/><\/figure>\n\n\n\n<details class=\"wp-block-details is-layout-flow wp-block-details-is-layout-flow\"><summary>Image description<\/summary>\n<p>An overview of the VAITP framework pipeline, illustrating the multi-agent process from target code input (1) to the generation of statically-confirmed vulnerable code (6). 
The core of the architecture involves a Planner LLM that leverages RAG (2) to generate a meta-prompt (3) for a Coder LLM (4), with the final output validated by a multi-stage verification pipeline (5).<\/p>\n<\/details>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Inside VAITP: A Deep Dive into the Framework&#8217;s Techniques The VAITP (Vulnerability Attack and Injection Tool for Python) CLI Framework is an advanced system designed to automatically inject verifiable vulnerabilities into Python code. Its purpose is to create large, realistic datasets of vulnerable code to help security researchers test and develop better defensive tools. This [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-398","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/pages\/398","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/comments?post=398"}],"version-history":[{"count":19,"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/pages\/398\/revisions"}],"predecessor-version":[{"id":419,"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/pages\/398\/revisions\/419"}],"wp:attachment":[{"href":"https:\/\/netpack.pt\/vaitp\/wp-json\/wp\/v2\/media?parent=398"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}