Senior AI Engineer

Project overview

For our client, we are expanding our existing development team, which will be responsible for further development of one of our most successful products. The product is an integrated delivery platform that leverages the depth and breadth of experience from real-world engagements. It also enables cross-team collaboration, real-time transparency, and insightful decision-making.

Responsibilities:

· Architect end-to-end LLM-based pipelines using RAG, embeddings, and orchestration layers.

· Design multi-agent systems using LangChain, LangGraph, CrewAI, or AutoGen, emphasizing modularity and scalability.

· Evaluate model options (e.g., OpenAI, Azure OpenAI) based on tradeoffs in performance, cost, and capabilities.

· Define embedding strategies, context windows, and prompt structures for complex reasoning tasks.

· Build chunking and ingestion pipelines for PDFs and unstructured documents using Python SDKs (LangChain, pyPDF, RecursiveCharacterTextSplitter).

· Integrate and deploy vector stores (e.g., Azure AI Search, FAISS, Postgres, ChromaDB) with semantic search and reranking techniques.

· Implement CI/CD pipelines, test coverage, and API integrations (e.g., via Postman, GitHub Actions).

· Build and maintain well-structured, reusable Python classes for LLM tools, pipelines, agents, and evaluation modules.

· Debug and optimize complex, distributed Python systems involving multiple services and third-party APIs.

· Develop and maintain Jupyter Notebooks for prototyping, analysis, and executive storytelling, balancing clarity and depth.

· Create and deploy Docker containers for reproducible development, testing, and production environments.

· Work with CI/CD pipelines (GitHub Actions or similar) to ensure high-quality, testable code.

· Use vibe coding tools (Cursor, GitHub Copilot) to accelerate development while ensuring security, compliance, and reusability.

· Develop and orchestrate autonomous agents with well-defined roles, tools, and memory-sharing strategies.

· Implement observer and fallback agents to enhance system resilience and reduce hallucinations.

· Use telemetry and observability tools (e.g., Datadog, LangFuse) to monitor, debug, and optimize performance.

· Define and track evaluation metrics for GenAI and RAG systems (faithfulness, precision, recall, F1, semantic similarity).

· Use sklearn, NumPy, and Ragas evaluation loops to validate model performance and content grounding.

· Implement output schema validation, prompt constraints, and quality assurance processes for enterprise readiness.

· Build interactive prototypes using Streamlit, LangFlow, or Jupyter Notebooks to demonstrate capabilities and insights.

· Translate complex models into clear, explainable insights for executive stakeholders.

· Lead design sessions and mentor developers on AI best practices and tooling.

Skills Required:

· 8+ years of experience in software engineering, with at least 3 years focused on AI/ML, NLP, or LLM-based applications.

· Expert-level proficiency in Python, with a strong focus on building and debugging modular, class-based code for reusable, production-grade systems.

· Proven experience designing and deploying GenAI solutions using prompt engineering, embedding-based retrieval, and multi-agent orchestration patterns.

· Deep hands-on experience with modern AI/LLM frameworks including LangChain, LangGraph, CrewAI, AutoGen, and Hugging Face.

· Demonstrated ability to ingest, chunk, and semantically index large-scale unstructured data (e.g., PDFs, HTML, JSON) using LangChain, embedding models, and custom pipelines.

· Proficient in deploying and querying vector databases such as Azure AI Search, FAISS, or Postgres with pgvector for semantic search applications.

· Strong working knowledge of Jupyter Notebooks, Docker, GitHub, and RESTful API integration to accelerate development and prototyping workflows.

· Familiarity with DevOps and CI/CD pipelines (e.g., GitHub Actions) to enable rapid iteration, testing, and deployment of AI pipelines.

· Skilled in evaluating LLM and RAG pipeline performance using metrics like semantic similarity, precision, recall, and faithfulness with tools such as sklearn, NumPy, and RAGAS.

· Hands-on experience implementing observability and telemetry using platforms like LangFuse, Datadog, or custom logging/tracing solutions.

· Comfortable working with AI-powered developer tools such as GitHub Copilot, Cursor, and other vibe coding assistants to enhance velocity and maintainability.

Preferred Qualifications

· Experience with multimodal LLMs, vision-language models, or tool-augmented inference.

· Understanding of reasoning models, ReAct-style prompting, and planner-executor agents.

· Experience with agile ceremonies, sprint planning, and cross-functional delivery teams.

· Demonstrated ability to evaluate and compare multiple RAG pipelines using structured and human-in-the-loop evaluation methods.

· Strong communication skills, with the ability to interface with engineering and executive audiences alike.

Start date: ASAP

Contract Duration: 12 months with likelihood of extension

HackerRank Challenge Yes/No: Yes

Remote vs Onsite: Fully remote, with possible occasional in-person team sessions / workshops / gatherings (i.e., once per quarter), likely to take place in Prague

US Hours overlap needed?: Minimum 2-6pm CET, preferred 2
