Master's Thesis in Computer Science
Håvard Alstadheim
Migrating legacy code to updated APIs is expensive, error-prone, and often deferred indefinitely, leading to an accumulation of technical debt. This thesis investigates whether a large language model (LLM) coding agent can generate abstract syntax tree (AST) transformations to automate JavaScript API migrations. The approach combines API documentation as context, LLM-driven planning and code generation of AST transformation scripts, and automated verification of results. An evaluation on real-world migration tasks compares the precision of the generated AST transformations against direct LLM code rewriting.
Keywords: agentic coding, API migration, AST transformation, large language models, JavaScript
This thesis was written with substantial use of AI tools.
Software systems depend on libraries that evolve over time. When a library introduces a new version of its API, all code that uses the old API must be updated. This process, known as API migration, is a core challenge in software maintenance. It arises across programming languages and domains, from web applications to embedded systems, and affects both individual developers and large organizations that maintain extensive codebases.
Recent advances in large language models (LLMs) offer new possibilities for automating code transformations. LLMs demonstrate strong code understanding and generation capabilities, and coding agents — systems that combine LLMs with tool use in an iterative loop — can plan and execute multi-step tasks. However, direct LLM code rewriting is imprecise: the model may introduce subtle bugs, omit necessary changes, or fail to apply modifications consistently across a large corpus of files. An alternative is to have the LLM generate deterministic programs that manipulate the abstract syntax tree (AST) of the source code, combining the model's semantic understanding with the reliability of programmatic transformations.
When a library deprecates or changes its API, every piece of code that depends on the old API must be identified and updated. In large codebases, this involves thousands of files with diverse usage patterns. Manual migration is labor-intensive, error-prone, and often deferred indefinitely, leading to a gradual accumulation of technical debt, security vulnerabilities, and compatibility issues.
The scale of the problem is amplified by the structure of open-source ecosystems. When a widely used library such as React, Express, or Node.js core deprecates an API, thousands of independent projects are affected simultaneously. Many remain on outdated versions indefinitely because the migration effort is not justified for any single project. A reusable, automatically generated transformation could be applied across all affected projects, but existing automated approaches either require manually authored transformation rules, which do not scale to the variety of deprecation patterns, or rely on direct LLM code rewriting, which lacks the precision needed for reliable application without per-file human review.
RQ1: To what extent can an LLM coding agent identify correct transformation rules from API documentation changes? The first step of the approach is rule identification: given documentation for the old and new versions of an API, the agent must produce a set of concrete specifications describing what patterns must change and how. This question evaluates the agent's ability to extract actionable transformation rules without pre-defined input-output examples. The methodology involves providing the agent with pairs of API documentation versions and measuring the completeness and correctness of the identified rules against a manually curated ground truth.
RQ2: To what extent can an LLM coding agent generate correct AST transformations from identified rules? Once rules are identified, the agent must translate them into executable AST transformation scripts. This question measures the precision of the generated transformations: whether they correctly modify matching code patterns and leave non-matching code unchanged. The evaluation applies the generated transformations to a corpus of scripts and compares the output against expected results.
RQ3: To what extent can an LLM coding agent validate the correctness of its own API migrations? After transformations are applied, the agent must assess whether the migrated code is correct. This question investigates the effectiveness of automated verification strategies, including static analysis, LLM-based code review, and LLM-generated tests. The methodology compares the agent's validation judgments against manual review of the transformed code.
This thesis makes the following contributions:
Section 2 introduces the background concepts and techniques that underpin this work. Section 3 surveys related work on automated API migration and LLM-driven code transformations. Section 4 describes the method and agent architecture. Section 5 presents the implementation. Section 6 reports the experimental results. Section 7 discusses findings, limitations, and threats to validity. Section 8 concludes and suggests directions for future work.
This section introduces the key concepts and techniques used in this thesis: API migration, abstract syntax trees and their transformations, large language models for code, and LLM coding agents.
An application programming interface (API) defines the contract between a library and its consumers. When a library evolves, its API may change: functions are renamed, parameters are added or removed, return types change, and entire modules are deprecated. Libraries signal such changes through semantic versioning (semver.org), where major version increments (for example, 3.2.1 to 4.0.0) indicate potential breaking changes. API migration is the process of updating consumer code to conform to a new version of the API.
Migration involves three main activities: understanding what has changed, performing the code modifications, and verifying that the modifications preserve the intended behavior. Each of these activities can be performed manually or with varying degrees of automation. The cost and error rate of manual migration motivate the development of automated techniques.
An abstract syntax tree (AST) is a tree representation of the syntactic structure of source code. Each node in the tree corresponds to a construct in the language, such as a function declaration, an expression, or a statement. Unlike the raw source text, the AST abstracts away formatting details such as whitespace and comments, providing a structured representation suitable for programmatic analysis and transformation.
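Concretely, most JavaScript tooling exchanges ASTs in the ESTree node format. As a minimal hand-written sketch (not parser output), the expression `task.status` is represented as a small tree of typed nodes:

```javascript
// ESTree-style representation of the expression `task.status`.
// Each node carries a `type` tag plus type-specific children;
// formatting details such as whitespace are not represented.
const memberExpression = {
  type: "MemberExpression",
  object: { type: "Identifier", name: "task" },
  property: { type: "Identifier", name: "status" },
  computed: false, // dot access, i.e. task.status, not task["status"]
};
```

Because the tree is plain structured data, tools can inspect and rewrite it programmatically instead of manipulating source text.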
AST transformation tools allow developers to write programs that traverse the tree, match specific patterns, and rewrite them. Common applications include automated code migration (codemods), linting, and code formatting. In the JavaScript ecosystem, tools such as Babel (babeljs.io), jscodeshift (github.com/facebook/jscodeshift), SWC (swc.rs), and Oxc (oxc.rs) provide APIs for this purpose. AST transformations are deterministic: given the same input, they always produce the same output. This property makes them suitable for large-scale code modifications where consistency is critical.
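The match-and-rewrite structure shared by these tools can be sketched with a toy walker over ESTree-shaped objects; this is an illustration of the principle only, not the API of Babel or jscodeshift:

```javascript
// Minimal sketch of a deterministic AST transformation (not a real
// tool): recursively walk ESTree-shaped nodes and rename calls to
// `assert.equal` into `assert.strictEqual`.
function transform(node) {
  if (node === null || typeof node !== "object") return;
  if (
    node.type === "MemberExpression" &&
    node.object?.name === "assert" &&
    node.property?.name === "equal"
  ) {
    node.property.name = "strictEqual"; // the rewrite
  }
  for (const child of Object.values(node)) transform(child); // recurse
}

// AST for the statement `assert.equal(a, b);`
const ast = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "assert" },
      property: { type: "Identifier", name: "equal" },
      computed: false,
    },
    arguments: [
      { type: "Identifier", name: "a" },
      { type: "Identifier", name: "b" },
    ],
  },
};

transform(ast);
// ast.expression.callee.property.name is now "strictEqual"
```

Running `transform` on the same input always yields the same output, which is the determinism property that production tools rely on for large-scale application.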
AST tools differ in the level of semantic information they provide. Pattern-based tools match syntactic structures using template patterns, analogous to regular expressions over tree structures. They are fast and language-agnostic but cannot distinguish between identically named variables in different scopes. Traversal-based tools with semantic analysis, such as Oxc Traverse, augment the syntax tree with scope and binding information: for each identifier, the tool can determine where it was declared and whether two references point to the same binding. This distinction is central to the thesis, as it determines which transformations can be applied correctly without manual review. For example, consider migrating `obj.status` to `obj.getStatus()`:

```javascript
// Should be transformed (status comes from the API):
let task = api.getTask();
if (task.status === "done") { ... }

// Should NOT be transformed (unrelated object):
let resp = { status: 200 };
if (resp.status === 200) { ... }
```

A pattern-based tool matches both occurrences of `.status` identically. A binding-aware tool can determine that `task` originates from the API while `resp` is a local object literal, and transform only the first.
Large language models (LLMs) are neural networks trained on large corpora of text, including source code. Models such as GPT-5, Claude, and Gemini demonstrate the ability to understand code semantics, generate code from natural language descriptions, and perform code transformations [1]. When applied to code rewriting, LLMs can handle a wide range of patterns without explicit programming, but they may introduce errors that are difficult to detect. In particular, LLMs lack the consistency needed for large-scale application: the same migration pattern may be handled correctly in one file and incorrectly in another, with no guarantee of uniform behavior across a corpus.
An LLM coding agent is a system that operates in an iterative loop: it receives input, reasons about it, uses tools (such as code execution, file editing, or web search), observes the results, and decides on the next action. This loop allows the agent to perform multi-step tasks that go beyond what a single LLM inference can achieve. Techniques such as chain-of-thought prompting [2], which decomposes complex tasks into intermediate reasoning steps, are foundational to how agents plan and recover from errors. Agent architectures differ in their degree of autonomy, available tools, context management strategies, and feedback mechanisms.
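Schematically, this loop can be sketched as follows, where `callModel` and the tool set are hypothetical stand-ins rather than any specific model or agent framework:

```javascript
// Skeletal agent loop (illustrative; callModel and the tools are
// hypothetical stand-ins, not a real LLM or framework).
function runAgent(task, callModel, tools, maxSteps = 10) {
  const history = [{ role: "user", content: task }];
  for (let step = 0; step < maxSteps; step++) {
    const action = callModel(history);                     // reason
    if (action.type === "finish") return action.result;
    const observation = tools[action.tool](action.input);  // act
    history.push({ role: "tool", content: observation });  // observe
  }
  throw new Error("step budget exhausted");
}

// Mock model: first inspects a file, then finishes.
const mockModel = (history) =>
  history.length === 1
    ? { type: "tool", tool: "readFile", input: "a.js" }
    : { type: "finish", result: "done" };

const result = runAgent("migrate a.js", mockModel, {
  readFile: (path) => `contents of ${path}`,
});
// result === "done"
```

The step budget and the accumulated history correspond to the context management and feedback mechanisms that real agent architectures vary in.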
This section surveys the academic literature on automated API migration, LLM-based code transformation, and related techniques. Each subsection groups papers by theme and discusses their contributions and limitations in relation to this thesis.
Cummins et al. [1] introduce the idea of having an LLM generate AST transformation scripts rather than rewriting code directly. Working with Python's built-in ast module, they provide the model with pre-defined input-output pairs and let it iterate on both a verbal rule description and a transformation script. The approach achieves 95% precision on their benchmark, compared to 60% for direct LLM code generation. However, the method requires manually curated input-output examples and targets only simple, isolated patterns. This thesis extends their approach by using API documentation instead of hand-written examples and by targeting JavaScript.
Ramos et al. [3] use LLMs to generate AST transformations expressed in a pattern-based domain-specific language (DSL) for library migrations in Python. Their system, Spell, combines LLM-generated example cases with transformation synthesis, achieving results on popular libraries and real repositories. A limitation is that Spell relies on the model's training-time knowledge of libraries rather than API documentation, and the pattern-based DSL may not express all transformation patterns. This thesis differs by using general-purpose AST manipulation APIs and by explicitly providing API documentation as context.
Dilhara et al. [4] demonstrate that LLMs can generate effective input variations and test cases for code change patterns. Their generated examples improve the generalization of transformation rules produced by a traditional rule synthesizer. The paper shows the value of LLM-generated data at the example level, but does not use LLMs for writing the transformations themselves. This thesis builds on their insight by exploring whether LLM-generated examples and tests can also improve the agent's self-validation.
The paradigm of having an LLM synthesize a deterministic program rather than directly performing a task has emerged independently in several domains beyond code migration. Yang et al. [5] present KNighter, which uses LLMs to synthesize static analysis checkers for the Linux kernel from historical bug patterns. Rather than having the LLM scan millions of lines of kernel code directly, KNighter generates specialized checkers that are validated against original patches and iteratively refined to reduce false positives. The synthesized checkers have discovered 92 new bugs in the Linux kernel, of which 57 have been fixed and 30 assigned CVE numbers. This demonstrates that LLM-synthesized analysis programs can exceed the coverage of human-written analyzers.
Wang et al. [6] apply a similar approach to security vulnerability detection. QLCoder synthesizes CodeQL queries from CVE metadata using an agentic framework with execution feedback. The LLM is embedded in a synthesis loop where generated queries are executed against vulnerable and patched code versions, and the results guide refinement. On 176 CVEs across 111 Java projects, QLCoder synthesizes correct queries for 53.4% of cases, compared to 10% with the LLM alone. The feedback loop — generating a query, executing it, and refining based on results — is structurally analogous to our pipeline's synthesize-compile-test-refine cycle.
Zhang et al. [7] propose a hybrid approach for refactoring Python code to idiomatic patterns. They use LLMs to generate Analytic Rule Interfaces (ARIs) — Python code that performs deterministic AST analysis and rewriting — and combine these with LLM-driven code abstraction and idiomatization. The hybrid approach achieves over 90% accuracy on thirteen Pythonic idioms, outperforming both rule-only and LLM-only baselines. Their work demonstrates the value of combining LLM adaptability with the determinism of generated analysis code, a principle shared by our approach.
The role of compiler feedback in LLM-based code generation has been studied in the context of C-to-Rust translation. Eniser et al. [8] develop Fluorine, a framework where an LLM generates Rust translations that are checked for I/O equivalence with the original C code via differential fuzzing. When translations fail, feedback is provided and the LLM repairs its output. The most successful model translates 47% of real-world benchmarks. Weiss et al. [9] study the same setting and find that feedback loops substantially reduce the performance gap between different LLMs: without feedback, model selection has a large effect on translation success, but with compiler-driven feedback the differences diminish. This finding is directly relevant to our work, where the Rust compiler serves as the primary feedback mechanism for the LLM agent.
API evolution is a long-studied problem: Dig and Johnson [10] show that the majority of API-breaking changes in Java frameworks are refactorings such as renames and signature changes. Automating the client-side response to such changes has motivated a line of code transformation tools. Lawall and Muller [11] describe Coccinelle, a program matching and transformation tool for C that has produced over 6,000 commits to the Linux kernel since 2008. Coccinelle uses SmPL (Semantic Patch Language), a domain-specific language that expresses transformations in a notation resembling the familiar patch syntax. SmPL patterns use metavariables to match arbitrary terms and “...” to follow intraprocedural control-flow paths. Coccinelle performs type inference but no alias analysis or dataflow analysis, and processes files one at a time without interprocedural analysis. These design choices keep the tool fast and predictable but limit it to transformations where the relevant information is apparent within a single file and function. This limitation is representative of the broader class of pattern-based tools: they excel at syntactic matching but cannot track how values flow across scopes or bindings. More recent tools such as ast-grep (ast-grep.github.io) use tree-sitter queries for the same purpose.
Ketkar et al. [12] introduce PolyglotPiranha, a domain-specific language for expressing interdependent multi-language code transformations. The key contribution is extending lightweight match-replace systems with composition, ordering, and flow through a directed edge-labelled rule graph. Each node defines a match-replace rule; edges define the scope (file, class, method) in which subsequent rules are applied. Deployed at Uber, PolyglotPiranha deleted 210,000 lines of stale feature flag code and migrated 20,000 lines across 1,611 pull requests. It is 42.5 times faster than the imperative ErrorProne-based predecessor, 12.3 times faster than Comby for feature flag cleanup, and more concise than imperative alternatives. PolyglotPiranha is the target DSL for Spell [3], making it directly comparable to our work. While its rule graph approximates scope-aware transformations through cascading rules, it remains fundamentally pattern-based: it cannot resolve variable bindings or track data flow, and its soundness depends on the comprehensiveness of the manually or automatically authored rules.
Ramos et al. [13] present MELT, a technique for mining API migration rules directly from library pull requests. MELT identifies breaking changes and deprecations in pull request descriptions, extracts code examples from the associated commits, and uses an LLM to generate additional transition examples. From these examples, it infers transformation rules in Comby [14], a structural search-and-replace language. A generalization procedure increases the number of matches for mined rules by a factor of nine. MELT is a predecessor to Spell, which later adapts its rule inference algorithm as the basis for anti-unification-based synthesis. Its key advantage over earlier migration tools is that it operates on the library's own development history rather than requiring already-migrated client projects. However, the inferred Comby rules are syntactic and context-free, lacking the ability to express cascading or scope-dependent transformations that PolyglotPiranha later adds.
Ni et al. [15] propose SOAR, a technique that combines natural language processing with program synthesis for API refactoring. SOAR uses API documentation to build a matching model that maps source API calls to target API calls, then uses program synthesis to construct the full target call with correct arguments. When a synthesized call produces a runtime error, SOAR uses the error message to generate constraints that prune the search space. Evaluated on TensorFlow-to-PyTorch and dplyr-to-pandas migrations, SOAR successfully migrates 80% of deep learning models and 90% of data wrangling benchmarks. Like our approach, SOAR uses documentation rather than pre-existing migration examples as input. However, it synthesizes each migration instance individually through enumeration rather than generating a reusable transformation program, limiting its scalability to large corpora.
Rolim et al. [16] present ReFazer, a technique for learning program transformations from input-output examples using programming-by-example methodology. ReFazer defines a domain-specific language for AST transformations and uses the PROSE programming-by-example framework to efficiently synthesize transformations that generalize across examples. Evaluated on student programming assignments, it fixes 87% of incorrect submissions; on repetitive code edits from C# projects, it learns the intended transformation in 83% of cases using 2.8 examples on average. ReFazer demonstrates that program transformations can be synthesized from a small number of examples, but requires those examples to be provided upfront. Our approach replaces this requirement with API documentation, using the LLM to bridge the gap between a natural language specification and executable transformation code.
Smirnov et al. [17] present TCInfer, a technique for inferring and applying type change rules from library version histories. TCInfer mines type migration patterns from code changes and generates transformation rules in Comby. It serves as the rule inference engine used by PyEvolve [4] and later adapted by Spell [3] for anti-unification-based synthesis. TCInfer demonstrates that useful transformation rules can be automatically inferred from version histories, though the inferred rules remain pattern-based and cannot capture binding-dependent transformations.
Ziftci et al. [18] describe the largest reported LLM-assisted code migration, conducted at Google over twelve months. The system automates int32-to-int64 migrations by using Kythe, a code indexing system, to discover direct and indirect references to target identifiers up to five hops away. An LLM then generates the necessary code changes, which are validated through automated categorization and regression testing before being sent to developers for review. Across 39 distinct migrations, the system submitted 595 code changes containing 93,574 edits, of which 69.46% were generated by the LLM. The developers reported a 50% reduction in total migration time compared to earlier manual efforts. This work demonstrates that direct LLM rewriting can be effective at scale when supported by strong tooling for reference discovery and validation. However, the approach produces no reusable transformation artifact: each migration instance is generated independently, and the LLM's changes must be individually reviewed.
Li et al. [19] argue that code migration without environment interaction is incomplete. They observe that static analysis alone misses subtle runtime errors caused by version differences, such as internal constraint changes between library versions that preserve the API signature but alter behavior. They propose a multi-agent framework with three collaborating agents: a migration agent for code transformation, an environment agent for autonomous environment construction and validation, and a test suite agent for test generation and regression testing. These agents operate in a closed feedback loop integrated with CI/CD workflows. While their work is a workshop paper without empirical evaluation, the central argument — that automated migration requires iterative feedback from compilation and test execution — directly supports the design of our pipeline, which implements this feedback loop within a single agent using Rust compiler errors and test results as structured signals for refinement.
Islam et al. [20] evaluate three LLMs — Llama 3.1, GPT-4o mini, and GPT-4o — on PyMigBench, a dataset of 321 real-world Python library migrations containing 2,989 migration-related code changes. GPT-4o correctly migrates 94% of individual code changes and fully migrates 57% of complete migrations. When evaluated using unit tests, 64% of GPT-4o's migrations pass the same tests as the developer's migration. All three models struggle with argument transformations that require changing argument values or types. To control for potential data contamination, the authors also evaluate on 10 repositories where the migration never occurred, finding that GPT-4o perfectly migrates 50% of these unseen cases. The study establishes a baseline for direct LLM migration performance: high at the individual code change level, but with significant room for improvement on complete migrations and test passage.
Islam et al. [21] introduce PyMigBench, a dataset of 335 real-world Python library migrations with 3,096 labeled code changes, and PyMigTax, a taxonomy of migration-related code changes. The taxonomy categorizes changes along three dimensions: program elements involved (function calls, attributes, imports), cardinality (one-to-one, one-to-many, many-to-many), and additional properties (name changes, argument transformations). This characterization reveals that library migration is not a monolithic task but involves diverse change types with varying difficulty. PyMigBench provides the evaluation infrastructure used by Islam et al. [20] and other recent migration studies, making it a foundational resource for empirical research on migration automation.
Jimenez et al. [22] introduce SWE-bench, an evaluation framework of 2,294 software engineering problems drawn from real GitHub issues across 12 popular Python repositories. Given a codebase and an issue description, a language model must edit the codebase to resolve the issue. The tasks frequently require coordinating changes across multiple functions, classes, and files. At the time of publication, the best model (Claude 2) resolves only 1.96% of issues. SWE-bench establishes that real-world software engineering tasks remain extremely challenging for LLMs operating without iterative feedback or tool use. Our work operates in a more constrained domain — generating AST transformations from API specifications — where the structured nature of the task and the availability of compiler feedback enable substantially higher success rates.
Yang et al. [23] introduce SWE-agent, a system that provides LLM agents with a custom agent-computer interface (ACI) for software engineering tasks. The ACI includes purpose-built commands for navigating repositories, viewing and editing files, and executing tests, with feedback formatted for LLM consumption rather than human readability. SWE-agent achieves 12.5% on SWE-bench, a significant improvement over non-interactive baselines. The key finding is that interface design matters as much as model capability: the same model performs substantially better with a well-designed ACI than with a standard shell. This insight is directly relevant to our work, where the agent's interface — structured Rust compiler errors, test output diffs, and pattern elimination checks — serves an analogous role in guiding the LLM toward correct transformations.
Wang et al. [24] present OpenHands, an open-source platform for building LLM coding agents that provides sandboxed execution environments, browser interaction, and editor interfaces. The platform supports multiple agent architectures and has been evaluated on SWE-bench, web browsing, and general coding tasks. OpenHands represents the trend toward general-purpose agent platforms, in contrast to our domain-specific agent that is tailored to the AST transformation generation task. The trade-off between generality and specialization is a recurring theme: general agents can address a broader range of tasks but achieve lower success rates on any individual task, while domain-specific agents can exploit task structure for higher performance.
The related work reveals a consistent tension between expressiveness and reliability in automated code migration. Pattern-based tools such as Coccinelle [11] and PolyglotPiranha [12] are deterministic and fast, but cannot express transformations that depend on variable bindings or data flow. Direct LLM rewriting achieves high correctness on individual code changes [18], [20] but produces no reusable artifact and requires per-instance review. LLM-generated transformation approaches, as demonstrated by Cummins et al. [1] and Ramos et al. [3], bridge this gap by producing inspectable, deterministic transformation programs. This paradigm — having the LLM synthesize a program rather than directly performing the task — has proven effective across domains including static analysis [5], vulnerability detection [6], and code translation [8], [9]. However, existing work in code migration targets tools without semantic analysis: Python's ast module and PolyglotPiranha's pattern-based DSL, respectively.
No prior work combines LLM-driven transform generation with a transformation framework that provides full binding and scope resolution. This thesis addresses that gap by using an LLM agent to generate transformations in Oxc Traverse, a Rust-based AST traversal framework with complete semantic analysis for JavaScript and TypeScript. The agent receives API documentation as input — rather than pre-defined examples — and uses compiler feedback from Rust's type system as a structured signal for iterative refinement. This iterative generate-compile-test loop follows the generate-and-validate paradigm identified in the automatic software repair literature [25], but applies it to transform synthesis rather than bug repair. The combination of documentation-driven generation, binding-aware transformations, and compiler-guided iteration is, to the best of our knowledge, novel.
The method consists of three phases: rule identification from API documentation, AST transformation generation and execution, and verification of results. An LLM agent orchestrates all three phases in an iterative loop, using feedback from later phases to refine earlier ones. This section describes the design of each phase and the experimental setup used to evaluate the approach.
The agent receives API documentation for the old and new versions of a library. From this documentation, it identifies concrete transformation rules: specific code patterns that must change and how they should change. Unlike prior work that requires hand-written input-output examples [1], this phase relies solely on documentation as input.
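As an illustration, a rule identified in this phase might be recorded as structured data. The schema below is hypothetical, not the thesis implementation, but the pattern itself is taken from the zod-v4 task in the evaluation:

```javascript
// Hypothetical shape of an identified transformation rule
// (illustrative schema only; the pattern comes from the zod-v4
// migration task: z.string().email() → z.email()).
const rule = {
  id: "zod-v4-email",
  library: "zod",
  from: "z.string().email()",  // deprecated pattern
  to: "z.email()",             // replacement
  scoping: "import",           // level of semantic analysis required
  source: "Zod v4 API documentation",
};
```

Recording rules in a concrete form like this makes them individually checkable against the ground truth used in RQ1.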
Based on the identified rules, the agent writes AST transformation scripts. These scripts parse input JavaScript code, traverse the syntax tree, match the relevant patterns, and apply the necessary rewrites. The agent iterates on each transformation script by executing it against a subset of the corpus and using the results as feedback.
After transformations are applied, the agent verifies the results through multiple checks: whether the code was modified, whether the output passes static analysis (linting), whether the code is executable, and optionally whether it passes LLM-based code review or LLM-generated tests.
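These checks can be pictured as a pipeline that produces a verdict per file. The sketch below is illustrative rather than the thesis implementation, and the individual checks are stubs standing in for real linting, execution, and review:

```javascript
// Illustrative verification pipeline (the individual checks are
// stubs standing in for real linting, execution, and LLM review).
function verify(original, migrated, checks) {
  const report = { modified: original !== migrated, results: {} };
  for (const [name, check] of Object.entries(checks)) {
    report.results[name] = check(migrated);
  }
  report.ok =
    report.modified && Object.values(report.results).every(Boolean);
  return report;
}

const report = verify(
  "assert.equal(a, b);",
  "assert.strictEqual(a, b);",
  {
    lint: (code) => !code.includes("var "),   // stub static check
    executable: (code) => code.endsWith(";"), // stub run check
  }
);
// report.ok === true
```

A per-file report of this kind is what the agent's validation judgments are compared against manual review in RQ3.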
Each migration task is classified along three levels. The first level describes the API-level breaking change that triggers the migration. While Ochoa et al. [26] provide a comprehensive taxonomy for Java (Maven Central), Kong et al. [27] provide a taxonomy specific to the npm ecosystem, including rename, remove, move, change signature, and change behavior, with sub-categories for JavaScript-specific patterns such as removing default exports and switching between callbacks and promises. The second level describes the code change itself, adapted from the PyMigTax taxonomy of Islam et al. [21], which classifies changes by program elements involved (function call, import, attribute, declaration), cardinality (one-to-one, one-to-many, many-to-one), and properties such as name changes or argument transformations. The third level is a novel dimension introduced in this thesis: the scoping requirement, which captures the level of semantic analysis needed for a correct transformation. A task requires no scoping if a purely syntactic pattern match suffices, import-tracking if the tool must verify that the function was imported from a specific library, or binding-tracking if the tool must follow variable aliases and assignments to determine whether an identifier traces back to the target API.
The evaluation uses nine migration tasks drawn from seven open-source JavaScript and TypeScript repositories. Table 1 describes the repositories. Table 2 lists the migration tasks and the number of affected code locations. Tables 3 and 4 classify each task along the three levels described above. Table 3 covers the five tasks where TypeScript declarations exist for both library versions, using automatic classification by a tool that diffs the exported API surface. Table 4 covers the remaining four tasks, classified manually with rationale for why automatic classification was not applicable.
| Repository | Stars | Language | Source files | Tests |
|---|---|---|---|---|
| ljharb/qs | 6.5k | JS | 10 | 872 |
| directus/directus | 29k | TS | 1,023 | 3,264 |
| expressjs/express | 68k | JS | 91 | 1,249 |
| downshift-js/downshift | 12k | JS/TS | 136 | 928 |
| restify/node-restify | 11k | JS | ~200 | 209 |
| jprichardson/node-fs-extra | 9.6k | JS | ~80 | 401 |
| ciscoheat/sveltekit-superforms | 2.7k | TS | 40 | 430 |
| Task | Repository | Migration | Sites |
|---|---|---|---|
| hasown | ljharb/qs | has.call(obj, key) → Object.hasOwn(obj, key) | 11 |
| clonedeep | directus/directus | cloneDeep(x) → structuredClone(x) | 48 |
| assert | expressjs/express | assert.equal() → assert.strictEqual() | 87 |
| proptypes | downshift-js/downshift | Remove PropTypes runtime validation (React 19) | 75 |
| var-let | restify/node-restify | var → let/const (scope-aware) | 558 |
| universalify | jprichardson/node-fs-extra | universalify wrappers → fs/promises | 17 |
| lodash | directus/directus | lodash utilities → native JS equivalents | 688 |
| zod-v4 | ciscoheat/sveltekit-superforms | z.string().email() → z.email() (Zod v4) | 43 |
| express5-send | expressjs/express | res.send(status) → res.sendStatus(status) | ? |
| Task | BC type | Elements | Cardinality | Import change | Scoping |
|---|---|---|---|---|---|
| clonedeep | remove | fn call → fn call | 1 : 1 | remove if only | import |
| proptypes | remove (×18) | fn call, import → ∅ | many : 0 | remove | import |
| universalify | remove (×2) | fn call → fn call | 1 : 1 | replace | import |
| lodash | remove (×7) | fn call → fn call, expr | 1 : 1, 1 : many | remove if only | import |
| zod-v4 | change-signature, remove | method chain → fn call | 1 : 1 | none | import |
| Task | BC type | Elements | Cardinality | Import change | Scoping | Rationale |
|---|---|---|---|---|---|---|
| hasown | rename | fn call → fn call | 1 : 1 | none | binding | New API added in ES2022; old pattern not removed from types |
| assert | rename | fn call → fn call | 1 : 1 | none | none | Both methods exist in same version; best-practice migration |
| var-let | change behavior | declaration → declaration | 1 : 1 | none | binding | Language keyword change; no library types involved |
| express5-send | change behavior | fn call → fn call | 1 : 1 | none | import | Same type signature in Express 4 and 5; behavioral difference |
This section describes the implementation of the agent, including the harness architecture, tool integrations, and key design decisions.
This section presents the experimental results, organized by research question.
The accumulation of technical debt and unpatched security vulnerabilities in legacy code is a systemic problem in software engineering. Automated migration techniques reduce the human cost of keeping code updated and decrease the risk of security incidents caused by outdated dependencies.
The AST transformation approach offers a privacy advantage over direct LLM code rewriting. Because the generated transformations are deterministic and reusable, individual user code does not need to be transmitted to external LLM APIs for migration. The transformation scripts run locally, and only API documentation — not user code — is sent to the model during the generation phase. Updated code may also run more efficiently, reducing resource consumption.