Data Engineer Intern
Place of work
Work from home
Job details
Job description, work day and responsibilities
TensorStax - Building the Next Generation of Autonomous Agents
We are building autonomous agents for data engineering, backed by a $5M seed round. In this role, you will design, build, and optimize the production-grade pipelines that our agents learn from and eventually operate.
The Role
As a Data Engineer, you will be responsible for:
• Designing complex schemas in dbt across hundreds of tables while preserving data integrity
• Building advanced Airflow DAGs with sophisticated dependency and failure handling
• Authoring high-performance Spark jobs for large-scale batch and incremental workloads
• Codifying lineage, testing, and metadata so that agents can reason about pipeline state
• Profiling and tuning query performance across warehouses and lakehouse engines
• Partnering with the agent research team to expose realistic failure modes, data drifts, and SLA violations for RL training
• Containerizing and deploying everything on Kubernetes-backed infrastructure
About You
• 4+ years of experience in data engineering or analytics engineering, with a proven track record of shipping pipelines at scale
• Deep expertise in dbt, including macros, custom tests, and refactoring legacy models
• Track record of building and debugging complex Airflow DAGs (Sensors, TaskGroups, SubDAG patterns)
• Spark power user, comfortable with distributed joins, window functions, and memory tuning
• Solid Python, Git, and CI discipline
• Bonus: experience with Iceberg, Delta, or DataFusion; prior RL or agent work
Why TensorStax
• A tight, senior team that values clean code and measurable impact
• Competitive salary, meaningful equity, and a hardware budget
• Remote-first, with an optional SF office
Offer ID: #1225393
Published: 1 day ago
Company registered: 2 months ago