Software Development Engineer II - Data
Balbix
Software Engineering
Bengaluru, Karnataka, India
Most boards and executives are currently flying blind when it comes to cyber risk. They are guessing. At Safe, we’ve built an AI-driven engine that finally gives the C-Suite a clear, quantified, and real-time view of their security posture. We don’t just provide data; we provide certainty.
We are a $170M Series C-funded category leader. We don’t play in the mid-market; we operate at the highest levels of global enterprise. Today, we are proud to serve 10% of the Fortune 500, protecting global icons such as Apple, Netflix, AT&T, Verizon, and Victoria’s Secret.
As we scale toward our next chapter, we are looking for high-performers who want to do the best work of their careers at the intersection of AI and Cybersecurity.
The Culture Memo: Our Operating System
Safe is not a typical corporate environment. We are a high-intensity, mission-driven team. We value builders who want to define a category and work alongside people who are equally committed to excellence.
- Extreme Ownership: We don’t do "not my job." We hire people who see a gap and own the solution from start to finish.
- The Elite Standard: We serve the most sophisticated companies on the planet. Our work must be bulletproof. Whether it’s a line of code or a sales deck, we aim for Tier-1 quality every time.
- Methodology & Rigor: We don’t wing it. From Force Management and MEDDICC in sales to data-driven sprints in engineering, we rely on proven frameworks to stay disciplined and predictable.
- Radical Candor: We move too fast for politics or sugar-coating. We value direct, honest feedback that helps us find the right answer quickly.
- The Series C Hustle: We have the stability of a well-funded leader but the heart of a startup.
The Perks & Ownership:
We want our team to feel like owners because they are owners. We trust our people to manage their results and their time.
- Meaningful Equity: Every "Safestar" is a shareholder. You aren’t just an employee; you are a partner in our success.
- Unlimited Leave: We don’t believe in clock-watching. We offer unlimited leave because we trust you to take the time you need to recharge while staying committed to the mission.
- Comprehensive Benefits: We provide top-tier medical insurance and wellness benefits to ensure you and your family are well cared for.
- Career Trajectory: We are growing aggressively. For high-performers, the path to advancement moves at the speed of your ambition.
What You’ll Do:
- Design and Develop: Architect and implement high-scale data pipelines leveraging Apache Spark, Flink, and Airflow to process streaming and batch data efficiently.
- Data Lakehouse and Storage Optimization: Build and maintain data lakes and ingestion frameworks using Snowflake, Apache Iceberg, and Parquet, ensuring scalability, cost efficiency, and optimal query performance.
- Data Modeling and System Design: Design robust, maintainable data models to handle structured and semi-structured datasets for analytical and operational use cases.
- Real-time and Batch Processing: Develop low-latency pipelines using Kafka and Spark Structured Streaming, supporting billions of events per day.
- Workflow Orchestration: Automate and orchestrate end-to-end ELT processes with Airflow, ensuring reliability, observability, and recovery from failures.
- Cloud Infrastructure: Build scalable, secure, and cost-effective data solutions leveraging AWS native services (S3, Lambda, ECS, etc.).
- Monitoring and Optimization: Implement strong observability, data quality checks, and performance tuning to maintain high data reliability and pipeline efficiency.
What We’re Looking For:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in data engineering with a proven track record of designing large-scale, distributed data systems.
- Strong expertise in Snowflake and other distributed analytical data stores.
- Hands-on experience with Apache Spark, Flink, Airflow, and modern data lakehouse formats (Iceberg, Parquet).
- Deep understanding of data modeling, schema design, query optimization, and partitioning strategies at scale.
- Proficiency in Python, SQL, Scala, and Go/Node.js, with strong debugging and performance-tuning skills.
- Experience in streaming architectures, CDC pipelines, and data observability frameworks.
- Proficiency in deploying containerized applications (Docker, Kubernetes, ECS).
- Familiarity with AI coding assistants such as Cursor, Claude Code, or GitHub Copilot.
Preferred Qualifications:
- Exposure to CI/CD pipelines, automated testing, and infrastructure-as-code for data workflows.
- Familiarity with streaming platforms (Kafka, Kinesis, Pulsar) and real-time analytics engines (Druid, Pinot, Rockset).
- Understanding of data governance, lineage tracking, and compliance requirements in a multi-tenant SaaS platform.