Worth AI, a leader in the computer software industry, is looking for a talented and experienced Senior Data Engineer to join its innovative team. At Worth AI, we are on a mission to revolutionize decision-making with the power of artificial intelligence while fostering an environment of collaboration and adaptability, aiming to make a meaningful impact in the tech landscape. Our team values include extreme ownership, one team, and creating raving fans for both our employees and customers.
As a Senior Data Engineer, you will lead the design and development of data services and platforms that power our AI-driven products. You'll focus on creating well-structured, validated APIs and event-driven pipelines, enabling scalable, secure, and maintainable data workflows. This is a backend-heavy role ideal for engineers who thrive on clean architecture, automation, and cross-functional delivery.
Responsibilities
- Design and build production-grade **FastAPI services** to serve, validate, and enrich data for ML and analytics use cases (a minimal sketch of this pattern appears after this list)
- Create and maintain **asynchronous event-driven pipelines** using **Apache Kafka**, ensuring reliable and scalable communication across microservices (see the producer sketch after this list)
- Define and enforce structured data contracts using **Pydantic** and OpenAPI standards
- Develop robust, containerized data services with **Docker** and deploy them using modern cloud-native tooling
- Build and optimize analytical models and data flows in **Amazon Redshift** for business-critical reporting and data science consumption
- Collaborate with data scientists, ML engineers, and backend developers to streamline data sourcing, transformation, and model inference
- Own the lifecycle of data services, including monitoring, observability, testing, and deployment pipelines
- Maintain rigorous standards around data privacy, schema governance, and system performance
- Design, build, and maintain large-scale data processing systems and architectures that support AI initiatives.
- Develop and implement data pipelines and ETL processes to ingest, transform, and load data from various sources.
- Design and optimize databases and data storage solutions for high performance and scalability.
- Collaborate with cross-functional teams to understand data requirements and ensure data quality and integrity.
- Implement data governance and data security measures to protect sensitive data.
- Monitor and troubleshoot data infrastructure and pipeline issues in a timely manner.
- Stay up-to-date with the latest trends and technologies in data engineering and recommend improvements to enhance the company's data capabilities.
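For a concrete sense of the first two responsibilities, here is a minimal, hypothetical sketch of a validated FastAPI service with a Pydantic data contract; the endpoint, model names, and enrichment rule are illustrative assumptions, not Worth AI's actual code.

```python
# Hypothetical sketch: a small FastAPI data service with Pydantic contracts.
from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI(title="data-enrichment-service")  # placeholder service name

class EnrichRequest(BaseModel):
    # Structured data contract; FastAPI publishes it via the OpenAPI schema.
    record_id: str
    amount: float = Field(ge=0, description="Transaction amount in USD")

class EnrichResponse(BaseModel):
    record_id: str
    risk_band: str

@app.post("/v1/enrich", response_model=EnrichResponse)
async def enrich_record(req: EnrichRequest) -> EnrichResponse:
    # Placeholder enrichment logic; a real service would call feature
    # stores or downstream models here.
    band = "high" if req.amount > 10_000 else "low"
    return EnrichResponse(record_id=req.record_id, risk_band=band)
```

Because both the request and response models are declared with Pydantic, the service validates payloads at the boundary and documents its contract automatically.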
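The event-driven side of the role could look roughly like the following asynchronous producer, sketched here with the aiokafka client (one of several viable Python Kafka clients); the broker address, topic name, and payload are placeholders.

```python
# Hedged sketch: an async Kafka producer using aiokafka.
import asyncio
import json
from aiokafka import AIOKafkaProducer

async def publish_event(event: dict) -> None:
    producer = AIOKafkaProducer(
        bootstrap_servers="localhost:9092",  # placeholder broker address
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    await producer.start()
    try:
        # send_and_wait returns once the broker acknowledges the write,
        # the simplest route to at-least-once delivery.
        await producer.send_and_wait("records.enriched.v1", event)
    finally:
        await producer.stop()

if __name__ == "__main__":
    asyncio.run(publish_event({"record_id": "abc-123", "risk_band": "low"}))
```

Consumers, retry policies, and schema evolution would be layered on top of this in a production pipeline.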
Requirements
- 7+ years of professional experience in backend-focused data engineering or platform development
- Strong proficiency in **Python**, with hands-on experience using **FastAPI**, **Pydantic**, and asynchronous programming patterns
- Deep understanding of **event-driven architectures** and experience with **Kafka** (producers, consumers, schema evolution, retries, etc.)
- Experience designing and deploying **containerized services** with **Docker** (Kubernetes or Fargate experience is a plus)
- Proficiency in SQL and experience with modern cloud data warehouses, preferably **Amazon Redshift** (a query sketch follows this list)
- Familiarity with cloud services (preferably AWS), including CI/CD, infrastructure-as-code, and observability tooling
- Experience integrating third-party APIs and working with versioned schema contracts
- Strong communication and collaboration skills, especially in cross-functional and agile teams
- Experience working with ML engineers to operationalize models, e.g., batch scoring, online inference, and data validation at model boundaries (a validation sketch follows this list)
- In-depth knowledge of data modeling, data warehousing, and database design principles.
- Strong programming skills in Python, SQL, and other relevant languages.
- Experience with relational and NoSQL databases, such as PostgreSQL, MySQL, or MongoDB.
- Proficiency in data integration and ETL tools, such as Apache Kafka, Apache Airflow, or Informatica.
- Familiarity with big data processing frameworks, such as Hadoop, Spark, or Flink.
- Knowledge of cloud platforms, such as AWS, Azure, or GCP, and experience with data storage and processing services in the cloud.
- Understanding of data governance, data privacy, and data security best practices.
- Strong problem-solving and troubleshooting skills, with a focus on data quality and system performance.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Prior experience collaborating with data scientists or machine learning professionals on sourcing, processing, and scaling both input and output data
- Comfortable reading third-party API documentation and identifying the best approach for integrating API data into broader ETL processes
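As a rough illustration of the Redshift work mentioned above, the following snippet queries a warehouse with the redshift_connector driver; the cluster endpoint, credentials, and table are all placeholder assumptions.

```python
# Hypothetical sketch: querying Redshift from Python with redshift_connector.
import redshift_connector

# All connection details below are placeholders.
conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="analytics",
    user="report_user",
    password="change-me",
)
cur = conn.cursor()
# Hypothetical reporting rollup over an enriched-records table.
cur.execute(
    "SELECT risk_band, COUNT(*) AS n FROM enriched_records GROUP BY risk_band"
)
for band, n in cur.fetchall():
    print(band, n)
cur.close()
conn.close()
```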
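And for the model-boundary validation called out above, here is a hedged sketch of Pydantic guarding a batch-scoring path; the feature names and scoring formula are invented for illustration.

```python
# Illustrative only: Pydantic models guarding a batch-scoring boundary.
from pydantic import BaseModel, ValidationError

class ScoringInput(BaseModel):
    # Hypothetical feature schema for a scoring model.
    record_id: str
    feature_a: float
    feature_b: float

def score_batch(raw_rows: list[dict]) -> list[dict]:
    scored = []
    for row in raw_rows:
        try:
            # Reject malformed rows before they reach inference.
            inp = ScoringInput(**row)
        except ValidationError:
            continue  # in practice: route to a dead-letter queue or log it
        # Stand-in for a real model call.
        score = 0.5 * inp.feature_a + 0.5 * inp.feature_b
        scored.append({"record_id": inp.record_id, "score": score})
    return scored
```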
Benefits
- Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k, IRA)
- Life Insurance
- Unlimited Paid Time Off
- 9 Paid Holidays
- Family Leave
- Work From Home
- Free Food & Snacks (Access to Industrious Co-working Membership!)
- Wellness Resources