Lukas Hruby

Tech Leader for AI.

  • 15+ years in software, 8.5 years as CTO: designed systems from MVP to global production in >100 countries
  • AI to production: real-time inference, LLM integration, MLOps, evaluation, governance — not just PoC
  • Software architecture & scaling: component design, API contracts, observability, cost/performance trade-offs

Northern & Central Europe • Online

15+ years in tech
8.5 years as CTO

I help when

Problems I solve best — because I've solved them before, at scale.

You need to get AI/LLM into a product that already works — connecting to data, workflow and people. I'll help embed it in your current system so it doesn't slow down the team or customers.

You want AI that delivers results (not just wow effect) — we'll define what "success" means, how to measure it and how to improve it. Reliability, quality control and safe behavior are part of the delivery.

You're dealing with performance, speed and cost — so it's fast for users and makes economic sense. I've optimized infrastructure for 10M+ daily requests with 90% cost savings.

You want to find the best AI opportunities in product, marketing and sales — from ideas to priorities: where AI will increase conversion, improve retention, speed up support or help with sales. I'll help with use cases, ROI and a roadmap you can deliver.

You need technical leadership that connects business and engineering — so decisions are made quickly, clearly and without scope creep. Standards, review, decision process and delivery leadership; I've built a team from 0 to 20 engineers.

You have challenges in computer vision / real-time solutions — when AI has to work "right now" in the real world. I've led a team that processed 1000+ camera streams with sub-second latency and 99%+ accuracy across >100 countries.

Selected Wins

  • Built production ML system processing real-time video from 1000+ cameras with sub-second latency and 99%+ accuracy — used in >100 countries, reduced manual monitoring by 80%
  • Designed reference architecture for AI products — data pipeline → model serving → observability → governance. Systems that run reliably in production, not just in a notebook.
  • Established model evaluation standards — offline eval + online metrics + regression tests. Measurable quality instead of "seems to work".
  • Scaled cloud infrastructure to 10M+ daily API requests with 99.9% uptime — 90% cost savings through GPU workload and architecture optimization
  • Led engineering team growth from 0 to 20 engineers (sped up delivery by 3x, established hiring system, cross-functional processes)
  • Established MLOps practices enabling continuous deployment (reduced time-to-production from weeks to days, 50% faster iteration)
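
To make the evaluation standard above more concrete, here is a minimal sketch of a regression gate that compares a candidate model against the current baseline before release. It is an illustration only, not the pipeline used in the projects above; the `EvalResult` fields, the function name, and both thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Offline metrics for one model version (hypothetical structure)."""
    accuracy: float
    p95_latency_ms: float


def regression_failures(
    baseline: EvalResult,
    candidate: EvalResult,
    max_accuracy_drop: float = 0.005,    # assumed tolerance, tune per product
    max_latency_increase: float = 0.10,  # assumed tolerance: +10% p95 latency
) -> list[str]:
    """Return the reasons a candidate fails the gate; empty list means it passes."""
    failures = []
    if candidate.accuracy < baseline.accuracy - max_accuracy_drop:
        failures.append("accuracy regressed beyond tolerance")
    if candidate.p95_latency_ms > baseline.p95_latency_ms * (1 + max_latency_increase):
        failures.append("p95 latency regressed beyond tolerance")
    return failures


if __name__ == "__main__":
    baseline = EvalResult(accuracy=0.990, p95_latency_ms=400.0)
    candidate = EvalResult(accuracy=0.980, p95_latency_ms=410.0)
    print(regression_failures(baseline, candidate))
```

A gate like this would typically run in CI on a fixed evaluation set, with online metrics checked separately after rollout.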

What clients say

Their software is better than any in the marketplace right now.

— Enterprise client, transportation sector

What I deliver

Transparency: I'm a co-founder/ex-CTO of GoodVision, so I'll always disclose any potential conflict and suggest alternatives when appropriate. If an existing product is a better fit than custom development, I'll say so openly.

Not sure which fits? Book a free 30-min call and we'll figure it out together.

AI assistants in practice

Intake assistant for a law firm

Context: Law firm receiving new inquiries from multiple channels — often incomplete, senior staff spending time on initial evaluation.

Problem: Poorly framed questions or insensitive communication drive clients away. Key information for the initial assessment is missing.

What we delivered: An assistant that identifies case type, asks for missing information (structured but human-like), and prepares a summary and materials for the lawyer. Clear boundaries — the assistant doesn't give legal advice, only collects information.

Result: Up to 80% time savings in evaluating new clients. More consistent intake information, less back-and-forth, better client impression.

Safety: Sensitive data minimization, audit log, role-based access, human escalation on uncertainty
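
The intake flow described above (identify the case type, ask for what is missing, escalate to a person when unsure) can be sketched roughly as follows. This is a simplified illustration, not the delivered system: the case types, required fields, confidence threshold, and the `next_step` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical required fields per case type.
REQUIRED_FIELDS = {
    "employment": ["employer", "contract_type", "incident_date"],
    "tenancy": ["property_address", "lease_start"],
}

# Assumed cut-off below which a human takes over.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Inquiry:
    case_type: str                 # label produced by an upstream classifier
    confidence: float              # classifier confidence in [0, 1]
    answers: dict[str, str] = field(default_factory=dict)


def next_step(inquiry: Inquiry) -> tuple[str, list[str]]:
    """Decide what the assistant does next; it never gives legal advice."""
    unknown_type = inquiry.case_type not in REQUIRED_FIELDS
    if unknown_type or inquiry.confidence < CONFIDENCE_THRESHOLD:
        return ("escalate_to_human", [])
    missing = [
        name for name in REQUIRED_FIELDS[inquiry.case_type]
        if not inquiry.answers.get(name)
    ]
    if missing:
        return ("ask_client", missing)
    return ("prepare_summary", [])


if __name__ == "__main__":
    inquiry = Inquiry("tenancy", 0.95, {"property_address": "Example Street 1"})
    print(next_step(inquiry))
```

Keeping the decision in one small, deterministic function makes the escalation boundary easy to audit, independent of whichever model produces the classification.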

Supplier comparison with assisted process via SMS & WhatsApp

Context: B2B2C model in commodities — many suppliers, many customers, many steps. Users don't want to onboard to another tool.

Problem: Coordination chaos, inconsistent supplier data, communication delays.

What we delivered: A conversational coordinator via SMS/WhatsApp that collects consumer inputs, distributes inquiries to suppliers, tracks process steps, and normalizes responses into a comparable format. Human intervention only where needed.

Result: Significantly faster turnaround from inquiry to supplier selection. Less manual coordination, higher conversion through familiar channels.

Safety: Input validation, clear rules for customer vs. supplier communication, auditable process
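
As an illustration of what "normalizing responses into a comparable format" can mean, the sketch below converts supplier quotes given in different units into a price per kilogram and ranks them. The `Quote` fields, the unit table, and the single-currency assumption are hypothetical and only serve the example.

```python
from dataclasses import dataclass

# Conversion factors to kilograms (assumed set of accepted units).
UNIT_TO_KG = {"kg": 1.0, "t": 1000.0, "lb": 0.45359237}


@dataclass(frozen=True)
class Quote:
    supplier: str
    price: float     # total quoted price, assumed to be in one shared currency
    quantity: float  # quoted quantity, expressed in `unit`
    unit: str


def price_per_kg(quote: Quote) -> float:
    """Normalize a quote to price per kilogram; reject unknown units."""
    if quote.unit not in UNIT_TO_KG:
        raise ValueError(f"unsupported unit: {quote.unit}")
    if quote.quantity <= 0:
        raise ValueError("quantity must be positive")
    return quote.price / (quote.quantity * UNIT_TO_KG[quote.unit])


def rank_quotes(quotes: list[Quote]) -> list[Quote]:
    """Return quotes sorted from cheapest to most expensive per kilogram."""
    return sorted(quotes, key=price_per_kg)


if __name__ == "__main__":
    quotes = [
        Quote("Supplier B", price=60.0, quantity=100.0, unit="kg"),
        Quote("Supplier A", price=500.0, quantity=1.0, unit="t"),
    ]
    for q in rank_quotes(quotes):
        print(q.supplier, round(price_per_kg(q), 3))
```

Rejecting unknown units rather than guessing is one way to apply the input validation mentioned above.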

Advisory layer for employee benefits comparison

Context: Benefits comparison tool — users have many preferences but can't translate them into decisions. A data table isn't enough.

Problem: Users see data but don't know "what it means for me". A personalized perspective is missing.

What we delivered: An advisory component that creates a clear picture from user inputs and suggests topics worth considering. Recommendations framed as "suggested areas", not hard advice. Transparent "why we recommend this" explanations.

Result: Higher user clarity and confidence. Better engagement — more completed comparisons and higher conversion.

Quality: Continuous tuning on real data and feedback, transparent reasoning

Case Studies

Real-time Computer Vision for CCTV (>100 countries)

Context: GoodVision needed real-time video analysis across multiple locations, serving customers in >100 countries.

Problem: Processing 1000+ camera streams with sub-second latency, scaling to handle peak loads, maintaining 99.9% uptime.

What I did: Designed an edge processing architecture running on NVIDIA Jetson devices, built model serving infrastructure achieving 99%+ detection accuracy, implemented an MLOps pipeline for continuous deployment, and optimized for GPU economics and edge deployment.

Result: 99%+ accuracy at sub-second latency, 10M+ daily requests handled reliably, cost per stream reduced by 40%.

Stack: AWS, AWS IoT, Docker, NVIDIA Jetson, JetPack, PyTorch, TensorRT

Cost & performance architecture for GPU workloads

Context: ML workload requiring significant GPU compute with cost and latency constraints.

Problem: Balancing GPU costs, latency requirements, and scalability for variable workloads.

What I did: Architected hybrid cloud solution (on-demand + spot instances), implemented auto-scaling, optimized model inference, established cost monitoring and alerting.

Result: 90% cost reduction while maintaining latency SLAs, automated scaling handled 10x traffic spikes.

Stack: AWS EC2, ECS, CloudWatch, custom cost optimization
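
The economics of mixing on-demand and spot capacity can be sketched with simple arithmetic. The example below is illustrative only: the function names and all instance counts and prices are hypothetical and are not figures from this project. Real savings also depend on interruption handling, utilization, and model-level optimization.

```python
def blended_hourly_cost(
    on_demand_count: int,
    spot_count: int,
    on_demand_price: float,
    spot_price: float,
) -> float:
    """Hourly cost of a fleet that mixes on-demand and spot instances."""
    if on_demand_count < 0 or spot_count < 0:
        raise ValueError("instance counts must be non-negative")
    return on_demand_count * on_demand_price + spot_count * spot_price


def savings_vs_all_on_demand(
    on_demand_count: int,
    spot_count: int,
    on_demand_price: float,
    spot_price: float,
) -> float:
    """Fraction saved compared with running the same fleet fully on-demand."""
    total = on_demand_count + spot_count
    if total == 0:
        raise ValueError("fleet must contain at least one instance")
    baseline = total * on_demand_price
    blended = blended_hourly_cost(
        on_demand_count, spot_count, on_demand_price, spot_price
    )
    return 1 - blended / baseline


if __name__ == "__main__":
    # Hypothetical fleet: 2 on-demand instances as a stable floor, 8 on spot.
    saved = savings_vs_all_on_demand(2, 8, on_demand_price=3.0, spot_price=1.0)
    print(f"{saved:.1%} saved versus all on-demand")
```

Keeping a small on-demand floor is a common way to protect latency SLAs when spot capacity is reclaimed; the right split depends on the workload.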

Engineering team scaling (0 → 20)

Context: The company needed to scale engineering beyond the founding team to support growth across >100 countries.

Problem: Hiring quality engineers, establishing technical culture, building processes for distributed team, maintaining delivery speed.

What I did: Built hiring process and technical interviews, established architecture principles, implemented CI/CD and code review practices, created onboarding system, set up cross-functional collaboration.

Result: Team grew from 0 to 20 engineers across time zones, delivery velocity increased 3x, technical debt managed systematically.

Stack: Hiring processes, technical culture, architecture governance, CI/CD, cross-functional processes

About

AI & LLM to Production

From use-case identification to deployment: architecture, evaluation, monitoring, guardrails. Not just PoC, but systems that work in production.

Software Architecture & Scaling

Component design, API contracts, observability. Infrastructure handling 10M+ daily requests with 90% cost savings.

Computer Vision & Real-time

Production systems processing real-time video from 1000+ cameras with sub-second latency and 99%+ accuracy across >100 countries.

When to look elsewhere

  • You need a full-time full-stack implementer — my biggest value is direction, architecture, and outcomes. I can lead, but you'll need engineers to build.
  • Success requires writing most of the code — I'll lead strategy and architecture, but we should involve a dev team or agency for implementation.
  • You need deep expertise in a specific framework — I'm not a framework specialist, but I can quickly evaluate and choose the right approach for your problem.

I deliver the most value where business, product, and architecture need to be aligned. Still not sure? Book a free 30-min call — no commitment, we'll figure out if it's a fit.

Ready to talk?

Whether you're integrating AI/LLM, designing production architecture, or need a fractional CTO — let's start with a conversation.

I usually respond within 24 hours.