AI inference,
built on light.
A new chip architecture that runs AI inference faster than conventional GPUs, so enterprises can scale without worrying about power consumption or environmental impact.
Inference
Efficient
No electrical bottleneck
No quantisation loss
Electronic silicon
can't keep up.
As AI inference scales across data centres, electronic silicon hits hard limits in power, heat, and bandwidth that compound at every layer of the stack.
Running AI inference at scale demands enormous power budgets, raising operating costs and carbon footprints.
Cooling overhead constrains data centre density and rack capacity.
Co-processors that handle only matrix operations must hand off non-linear functions to separate hardware. This adds significant I/O cost at every step.
Quantisation degrades model accuracy, with compounding penalties across every inference at scale.
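To make the quantisation point concrete, here is a minimal sketch of where the accuracy loss comes from. It assumes a generic symmetric int8 scheme and made-up weight values; it is an illustration of rounding error in general, not a description of any specific model or of how a particular GPU stack quantises.

```python
def quantise_int8(values):
    """Symmetric per-tensor int8 quantisation: map floats to integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantise(codes, scale):
    """Map int8 codes back to floats; the rounding loss is not recoverable."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.8131, -0.4207, 0.0519, 0.3333, -0.9992]

codes, scale = quantise_int8(weights)
restored = dequantise(codes, scale)

# Per-weight rounding error; bounded by half a quantisation step (scale / 2).
errors = [abs(w - r) for w, r in zip(weights, restored)]
max_error = max(errors)

print(f"step size: {scale:.6f}")
print(f"largest rounding error: {max_error:.6f}")
```

Each individual error is small, but every weight carries one, and a model applies those weights on every inference it serves.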
Intelligence
built on light.
Sekkari replaces electrical signal processing with photonic computation, using light instead of electrons to run AI inference. Our optical NAND gates handle non-linear functions, while a separate optical component handles matrix-vector multiplications. Because both steps run in the optical domain, there are fewer conversions between light and electrical signals, which means less power and faster throughput.
Faster inference than conventional GPUs, benchmarked against the Nvidia A100 across transformer workloads.*
Lower power consumption than the Nvidia A100, delivering more inferences per watt at data centre scale.*
Inference performs exactly as the model was trained. No accuracy penalty at scale.
Integrates with PCIe standards to minimise the cost of adoption within existing data centre infrastructure.
* Benchmarked against the Nvidia A100 for inference over a transformer model with multi-head attention.
Meet demand at the speed of light.
For frontier AI labs, inference is the business. Every served query draws power, ties up a GPU, and feeds directly into unit economics that are getting harder to defend as usage scales. Sekkari lets you serve dramatically more inference per chip and per watt, expanding the customer load your existing infrastructure can carry without proportional growth in spend.
1,000× faster inference on transformer workloads benchmarked against the Nvidia A100, dramatically expanding the user load each accelerator can handle.
18× better power efficiency lowers the energy and cooling cost behind every query, directly improving the unit economics of your hosted models.
The same speed advantage compresses evaluation and benchmark cycles, freeing compute back to the training pipeline so you can iterate models faster.
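As a rough illustration of what the efficiency figure could mean for unit economics, the sketch below converts energy per query into an electricity cost. It assumes "18× better power efficiency" means 18× less energy per query; the baseline energy and electricity price are hypothetical placeholders, not measured values, and cooling overhead is left out.

```python
JOULES_PER_KWH = 3.6e6

# Efficiency factor quoted in the text above, read here as 18x less energy per query.
EFFICIENCY_GAIN = 18.0


def energy_cost_per_query(joules_per_query, price_per_kwh):
    """Electricity cost of one query, excluding cooling and other overheads."""
    return joules_per_query / JOULES_PER_KWH * price_per_kwh


# Hypothetical inputs: replace with your own measured figures.
baseline_joules_per_query = 1000.0
price_per_kwh = 0.10

baseline_cost = energy_cost_per_query(baseline_joules_per_query, price_per_kwh)
photonic_cost = energy_cost_per_query(
    baseline_joules_per_query / EFFICIENCY_GAIN, price_per_kwh
)

print(f"baseline energy cost per million queries: {baseline_cost * 1e6:.2f}")
print(f"at 18x efficiency, per million queries:   {photonic_cost * 1e6:.2f}")
```

Swapping in your own energy-per-query measurement and tariff gives a first-order estimate; a fuller model would add cooling, utilisation, and hardware cost.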
Serve more inference for less.
Inference is now the dominant cost in serving AI. Every query consumes power, adds cooling load, and ties up GPU capacity that could be earning revenue elsewhere. Sekkari changes the unit economics: more inferences per chip, fewer watts per inference, and a PCIe-compatible drop-in path so you can deploy without rebuilding your stack.
Run substantially more inference per accelerator, increasing the workload you can serve from the same fleet footprint.
Reduce the energy and cooling cost behind every API call, directly improving margin on hosted AI services.
Designed to slot into existing server architectures, keeping deployment cost low and time-to-revenue short.
More compute per watt, per rack.
AI workloads are pushing rack densities past what conventional cooling and power delivery were built for. Sekkari attacks the problem at the chip level: dramatically less power drawn per unit of inference work, which means less heat to remove and more useful compute per kilowatt delivered to the floor.
Cut the energy required to deliver the same inference workload, reducing both draw and cooling overhead.
Combining faster processing with lower power per chip means more inference output within your existing power and thermal budget.
Compute happens in the optical domain end-to-end on-chip, with PCIe compatibility for integration into standard server architectures.
Supported by world-class
institutions and investors.
Build the future
of AI compute
with us.
We are developing photonic AI inference systems in partnership with organisations running AI inference at scale. If your organisation is interested in pioneering the future of compute, get in touch.
Express Interest →
Frontier AI labs and model providers looking to serve dramatically more inference per chip while improving the unit economics of hosted models.
Hyperscale and specialist cloud platforms seeking to improve inference margins and energy efficiency without rebuilding their existing server infrastructure.
Operators of high-density AI compute facilities exploring photonic solutions to power, cooling, and rack density constraints.
Universities and labs exploring photonic computing, AI acceleration, and next-generation chip architectures.
Hardware and AI companies integrating photonic compute into their next-generation product stacks.
National programmes and agencies requiring sovereign, energy-efficient AI compute capability.
We offer grants for qualifying research institutions and early-stage companies developing applications on photonic AI. Grants cover co-development, hardware access, and joint publication support.
Applications are reviewed on a rolling basis. Reach out with a short description of your use case and organisation.
Apply for a grant →
Stay ahead of the
intelligence curve.
Get updates on Sekkari's technology, partnerships, and availability. We share relevant news for customers, partners, and investors.
We will not share your details. Unsubscribe at any time.