Inspecting Legacy SQL Before Migration
Using parsing and dependency mapping to make legacy SQL easier to review before a Databricks migration.
Details are generalized and sanitized to preserve confidentiality while keeping the engineering lesson accurate.
Context
Legacy SQL often mixes business rules, historical assumptions, and technical debt in one query tree. Syntax translation alone does not expose that structure.
Approach
I used a parse-first workflow: build an AST, extract the CTE blocks, map their dependencies, and generate validation reports. This split long scripts into reviewable units.
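As a rough illustration of the CTE extraction step, here is a minimal sketch in Python. It is not the tooling used in the project: it swaps a real AST parser for a small parenthesis-depth scanner so it runs on the standard library alone. The function name `extract_ctes` and the sample query are hypothetical, and the scanner ignores parentheses inside string literals and comments, which a real parser would handle.

```python
import re


def extract_ctes(sql: str) -> dict[str, str]:
    """Return {cte_name: body_sql} for the leading WITH clause of a query.

    Simplification (assumption): only parenthesis depth is tracked, so
    parentheses inside string literals or comments will confuse it.
    """
    ctes: dict[str, str] = {}
    with_kw = re.match(r"\s*WITH\s+(?:RECURSIVE\s+)?", sql, re.IGNORECASE)
    if not with_kw:
        return ctes

    cte_header = re.compile(r"\s*(\w+)\s+AS\s*\(", re.IGNORECASE)
    comma = re.compile(r"\s*,")
    pos = with_kw.end()

    while True:
        header = cte_header.match(sql, pos)
        if not header:
            break
        name = header.group(1)
        start = header.end()  # first character after the opening "("
        depth, i = 1, start
        while i < len(sql) and depth > 0:
            if sql[i] == "(":
                depth += 1
            elif sql[i] == ")":
                depth -= 1
            i += 1
        if depth != 0:
            raise ValueError(f"unbalanced parentheses in CTE {name!r}")
        ctes[name] = sql[start:i - 1].strip()  # drop the closing ")"
        pos = i
        separator = comma.match(sql, pos)
        if not separator:
            break  # no comma: the main query starts here
        pos = separator.end()
    return ctes


# Hypothetical legacy query, for illustration only.
LEGACY_SQL = """
WITH orders_clean AS (
    SELECT order_id, customer_id, amount
    FROM raw.orders
    WHERE COALESCE(status, 'X') <> 'CANCELLED'
),
customer_totals AS (
    SELECT customer_id, SUM(amount) AS total
    FROM orders_clean
    GROUP BY customer_id
)
SELECT c.name, t.total
FROM dim.customers c
JOIN customer_totals t ON t.customer_id = c.customer_id
"""

ctes = extract_ctes(LEGACY_SQL)
for name, body in ctes.items():
    print(f"{name}: {len(body.splitlines())} lines")
```

Each entry in `ctes` is a unit small enough to review on its own, which is the point of the step. For production scripts a proper SQL parser is the safer choice.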
The output let engineers trace upstream tables, CTE order, and join paths before editing migration SQL, while humans kept final control over rewrites.
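To make the tracing concrete, the sketch below maps each CTE to the other CTEs and upstream tables it reads from, then derives a review order. It assumes the CTE bodies are already available as a name-to-text mapping; `CTE_BODIES`, `SOURCE_REF`, and `map_dependencies` are hypothetical names. The regex over `FROM` and `JOIN` is a rough stand-in for walking table nodes in an AST: it matches names case-sensitively and would misread constructs such as `EXTRACT(YEAR FROM order_date)`.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical input: CTE name -> body text, as an extraction step might produce.
CTE_BODIES = {
    "orders_clean": "SELECT order_id, customer_id, amount FROM raw.orders",
    "customer_totals": (
        "SELECT customer_id, SUM(amount) AS total "
        "FROM orders_clean GROUP BY customer_id"
    ),
    "ranked": (
        "SELECT t.*, c.region FROM customer_totals t "
        "JOIN dim.customers c ON c.customer_id = t.customer_id"
    ),
}

# Identifier that directly follows FROM or JOIN (simplifying assumption).
SOURCE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def map_dependencies(bodies: dict[str, str]):
    """Split each CTE's sources into other CTEs and upstream tables."""
    cte_deps: dict[str, set[str]] = {}
    table_deps: dict[str, list[str]] = {}
    for name, body in bodies.items():
        refs = SOURCE_REF.findall(body)
        cte_deps[name] = {r for r in refs if r in bodies and r != name}
        table_deps[name] = sorted({r for r in refs if r not in bodies})
    return cte_deps, table_deps


cte_deps, table_deps = map_dependencies(CTE_BODIES)

# TopologicalSorter takes {node: predecessors} and yields predecessors first,
# so every CTE appears after the CTEs it depends on.
review_order = list(TopologicalSorter(cte_deps).static_order())

for name in review_order:
    print(f"{name}: ctes={sorted(cte_deps[name])} tables={table_deps[name]}")
```

Reviewing in `review_order` means an engineer reads each CTE only after the ones it depends on, and `table_deps` lists the upstream tables to confirm in the target environment.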
Where AI can help
AI can assist review by summarizing sections, comparing variants, or drafting notes. In this workflow it was optional support, not an autonomous translator.
This pattern can also be implemented as a local-first prototype for SQL structure analysis and migration review.
Reusable pattern
- Parse first, then rewrite.
- Extract CTEs and the dependency graph before making migration edits.
- Run parse validation and report structural issues early.
- Use AI as review support with human control.
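The validation step in the list above could start as small as the following sketch. The two checks here (unbalanced parentheses, and CTEs that nothing references) are illustrative assumptions about what a structural report might contain, not the checks from the original project. `structural_report` and the sample inputs are hypothetical, and the reference detection uses the same simplified `FROM`/`JOIN` regex rather than a real parser.

```python
import re

# Identifier that directly follows FROM or JOIN (simplifying assumption).
SOURCE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def structural_report(cte_bodies: dict[str, str], final_select: str) -> list[str]:
    """Return human-readable structural issues for a set of CTEs."""
    issues: list[str] = []
    referenced = set(SOURCE_REF.findall(final_select))

    for name, body in cte_bodies.items():
        # Check 1: a crude balance test; a real parser would report a position.
        if body.count("(") != body.count(")"):
            issues.append(f"{name}: unbalanced parentheses")
        referenced.update(SOURCE_REF.findall(body))

    # Check 2: CTEs that neither another CTE nor the final query reads from.
    for name in cte_bodies:
        if name not in referenced:
            issues.append(f"{name}: defined but never referenced")
    return issues


# Hypothetical inputs with one broken body and one dead CTE.
CTE_BODIES = {
    "orders_clean": "SELECT order_id, amount FROM raw.orders WHERE (amount > 0",
    "legacy_flags": "SELECT order_id, flag FROM raw.order_flags",
    "totals": "SELECT SUM(amount) AS total FROM orders_clean",
}
FINAL_SELECT = "SELECT total FROM totals"

report = structural_report(CTE_BODIES, FINAL_SELECT)
for line in report:
    print(line)
```

A report like this is meant as input for a human reviewer: it flags where to look, and the decision about how to rewrite stays with the engineer.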