Description
Hybrid
BMC empowers nearly 80% of the Forbes Global 100 to accelerate business value, faster than humanly possible. Our industry-leading portfolio unlocks human and machine potential to drive business growth, innovation, and sustainable success. BMC does this in a simple and optimized way by connecting people, systems, and data that power the world's largest organizations so they can seize a competitive advantage.
BMC Software runs the systems that the world's largest enterprises depend on: mainframes, automation, and the control plane underneath them. Putting agentic AI into that environment raises the bar: every agent's action must be grounded, auditable, and reversible. The Office of the CTO is working on the AI Foundation that makes this possible across BMC's product lines. At its heart is an Enterprise Agent Gym, the evaluation harness and experimentation loop that turns “the prototype worked” into “the agent is safe to promote to production.”
What you'll do
Work directly with members of the technical staff in the Office of the CTO on the evals and experimentation layer that BMC AI products are built on.
- Design evaluations that catch the failure modes of enterprise agents, including hallucinated tool calls, policy violations, context collapse, and regression under distribution shift.
- Build the Agent Gym — task definitions, graders, reward signals, and trajectory capture — for multi-step agentic workflows.
- Run experimentation sweeps across prompts, models, and scaffolds; quantify trade-offs between accuracy, cost, and latency.
- Turn evaluation results into promotion gates and readiness reports that product teams can act on.
- Contribute to our Responsible AI tooling — grounding checks, policy enforcement, and human-in-the-loop escalation paths.
What you'll take on
Your project will be part of the BMC AI Foundation's active workstreams (Agent Gym): evaluations, grader design, experimentation tooling, dataset curation, or trace / replay infrastructure. The exact scope is matched to your strengths with your technical mentors during onboarding and is sized to be shippable within 12 weeks.
To be successful in this role, you will:
- Be pursuing an MS or PhD in CS, ML, or a closely related field, with coursework or research in LLMs, reinforcement learning, or evaluation methodology.
- Have built non-trivial work on modern LLM and agent stacks, such as multi-step tool-using agents, RAG pipelines, or post-training (SFT, DPO, RLHF, or RLVR), in research, open source, or production.
- Frame a hypothesis, choose the right baselines and ablations, read a learning curve or reward trajectory honestly, and write up what you learned.
- Treat evaluation as a first-class AI and engineering problem.
Especially strong signals
- Publications or public work on LLM evaluation, agent benchmarks, alignment, or RL environments.
- Experience with RLHF / RLVR, reward modeling, or synthetic data generation.
- Contributions to open source eval harnesses, agent scaffolds, or observability tooling.
- Thoughtfulness about AI safety, red-teaming, or the gap between benchmark and deployment.
Our commitment to you!
BMC's culture is built around its people. We have 6000+ brilliant minds working together across the globe. You won't be known just by your employee number, but for your true authentic self. BMC lets you be YOU!
If, after reading the above, you're unsure whether you meet the qualifications for this role but are deeply excited about BMC and this team, we still encourage you to apply! We want to attract talent from diverse backgrounds and experiences so that we face the world together with the best ideas!
BMC is committed to equal opportunity employment regardless of race, age, sex, creed, color, religion, citizenship status, sexual orientation, gender, gender expression, gender identity, national origin, disability, marital status, pregnancy, disabled veteran or status as a protected veteran. If you need a reasonable accommodation for any part of the application and hiring process, visit the accommodation request page.
BMC Software maintains a strict policy of not requesting any form of payment in exchange for employment opportunities, upholding a fair and ethical hiring process.
Min salary: 73,800
Max salary: 123,000