Enterprises are increasingly evaluating edge deployments that reduce reliance on centralized cloud services. A common driver is that operational data is often generated outside the data center—at substations, factories, water facilities, retail locations, vehicles, and city infrastructure. Traditional approaches frequently move that data to a centralized platform (such as a cloud data lake or historian) before running analytics. In environments with intermittent connectivity, privacy and data-sovereignty requirements, or latency-sensitive workflows, centralizing data can increase privacy and operational risk and create substantial, ongoing, and often unpredictable costs for cloud-hosted data services, BI tooling, and network ingress/egress.
This article describes how two complementary open source technologies can be used together in edge architectures to offer cloud data services at the edge:
- IBM Open Horizon (the open-source edge orchestration layer behind IBM Edge Application Manager) automates deployment, updates, and lifecycle management of containerized workloads across large, distributed fleets of edge nodes. It’s designed to push software from a management hub to edge devices or edge clusters, then keep the services current based on policies.
- AnyLog (and its open-source release, EdgeLake) creates an edge data fabric: data remains decentralized where it’s produced, but can be queried and accessed as if it were stored in a single database. Applications connect to any EdgeLake node, issue SQL or a request through EdgeLake’s native MCP server, and the network satisfies queries without centralizing the underlying data.
Used together, they bring enterprise-grade, cloud-like data services to the edge: Open Horizon deploys and manages EdgeLake services across a fleet of edge nodes at any scale, while EdgeLake supports decentralized data ingestion into databases running at the edge and enables applications to query data in place across nodes as if it were hosted in a centralized cloud database.
Why orchestration and decentralized data often need to be addressed together
Edge initiatives commonly encounter two recurring challenges:
- Software lifecycle at scale: Containerizing an edge workload is only part of the challenge. Teams still need repeatable deployment across many sites, version control, safe updates, and resilience when nodes go offline.
- Data locality, latency, and governance: Even with analytics running near devices, many implementations still centralize data for dashboards, reporting, and AI workflows. That can increase bandwidth usage and introduce delays, and it may be incompatible with local data governance expectations.
Open Horizon and EdgeLake address these two problems in complementary ways:
- Open Horizon functions as the software deployment and lifecycle plane for a distributed fleet of edge nodes.
- EdgeLake functions as a decentralized data plane that can store and serve data locally while supporting federated queries across nodes.
How Deployments Work in Practice
Open Horizon deployments are typically defined in container terms and managed through a hub-and-agent model. In a representative setup, services are published and edge nodes run agents that evaluate policies and deploy the workloads they match.
A common approach for deploying EdgeLake with Open Horizon looks like this:
- Package EdgeLake as a containerized edge service
Open Horizon service definitions specify the configuration agents use to run EdgeLake on target nodes. Configuration includes:
- Which image version to run
- Environment variables and mounted configuration (identity, networking, storage paths, credentials/certificates)
- The data services to enable (e.g., Postgres, MySQL, MongoDB, Grafana)
- Any required ports and host integrations
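A minimal service definition covering the items above can be sketched as a Python dict mirroring the Exchange's JSON format. The field layout follows Open Horizon's published service-definition shape, but the organization name, image, ports, and input variables here are illustrative placeholders, not a tested deployment; consult the Open Horizon documentation for the authoritative schema.

```python
import json

# Sketch of an Open Horizon service definition for an EdgeLake container.
# Field names follow the Exchange service-definition shape; the org, image,
# version, storage path, and input variables are placeholders.
edgelake_service = {
    "org": "examples",                  # assumed organization name
    "url": "com.example.edgelake",      # unique service identifier
    "version": "1.0.0",                 # which image version to run
    "arch": "amd64",                    # target CPU architecture
    "sharable": "singleton",
    "userInput": [
        # Environment variables the agent passes into the container
        {"name": "NODE_TYPE", "label": "EdgeLake node role",
         "type": "string", "defaultValue": "operator"},
        {"name": "LEDGER_CONN", "label": "Ledger/master address",
         "type": "string", "defaultValue": ""},
    ],
    "deployment": {
        "services": {
            "edgelake": {
                "image": "example/edgelake:1.0.0",       # placeholder image
                "binds": ["/var/edgelake:/app/data"],    # mounted storage path
                "specific_ports": [{"HostPort": "32049:32049/tcp"}],
            }
        }
    },
}

print(json.dumps(edgelake_service, indent=2))
```

Against a real hub, a definition like this would be saved as JSON and published with `hzn exchange service publish -f service.json`, after which agents can match it.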
- Publish once and deploy via policies
Rather than installing software manually at each location, Open Horizon policies can target nodes by attributes such as CPU architecture, operating system, location tags, or environment classification. This enables a consistent rollout approach across heterogeneous edge nodes.
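Conceptually, this targeting is constraint matching between a deployment policy and each node's advertised properties. A simplified sketch (the property names are invented for illustration, and Open Horizon's actual constraint language supports richer expressions than exact equality):

```python
# Simplified sketch of policy-based node targeting: a deployment policy's
# constraints are checked against each node's advertised properties, and
# agents on matching nodes deploy the workload.
def matches(node_properties: dict, constraints: dict) -> bool:
    """True if every constraint is satisfied by the node's properties."""
    return all(node_properties.get(k) == v for k, v in constraints.items())

# Hypothetical fleet: each node advertises architecture, OS, and environment.
nodes = {
    "substation-7":  {"arch": "arm64", "os": "linux", "env": "prod"},
    "plant-floor-2": {"arch": "amd64", "os": "linux", "env": "dev"},
}

# Deploy EdgeLake only to production Linux nodes.
policy_constraints = {"os": "linux", "env": "prod"}

targets = [name for name, props in nodes.items()
           if matches(props, policy_constraints)]
print(targets)  # ['substation-7']
```

The key operational property is that new nodes joining the fleet with matching attributes receive the workload automatically, with no per-site installation step.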
- Maintain versions and apply updates with controls
With edge nodes, updates often need:
- Staged or progressive rollout approaches
- Version pinning by environment (e.g., dev/test/prod)
- Basic health checks and remediation strategies
- A mechanism to roll forward for security fixes
Open Horizon is designed to handle deployment and update workflows across distributed nodes, which can reduce the amount of site-by-site operational work when managing containerized services.
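Two of the controls above, version pinning by environment and staged rollout, can be illustrated with a small sketch. The stages, versions, and health check are invented for illustration; Open Horizon expresses these controls through its policies rather than application code.

```python
# Sketch of version pinning by environment plus a staged rollout that
# promotes a new version one stage at a time and halts on a failed
# health check. (Stages, versions, and the health check are illustrative.)
PINNED = {"dev": "1.2.0", "test": "1.1.3", "prod": "1.1.3"}

def staged_rollout(stages, new_version, healthy):
    """Promote new_version stage by stage; on the first failed health
    check, that stage and all later stages keep their pinned version."""
    result = {}
    for i, env in enumerate(stages):
        if healthy(env, new_version):
            result[env] = new_version
        else:
            # Health check failed: stop the rollout here.
            for rest in stages[i:]:
                result[rest] = PINNED[rest]
            break
    return result

# Example: 1.2.0 passes health checks in dev but fails in test,
# so test stays pinned and prod never receives the new version.
outcome = staged_rollout(
    ["dev", "test", "prod"], "1.2.0",
    healthy=lambda env, version: env == "dev",
)
print(outcome)  # {'dev': '1.2.0', 'test': '1.1.3', 'prod': '1.1.3'}
```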
- Stream and query data with EdgeLake
Stream data into EdgeLake using:
- REST, gRPC, MQTT, Kafka, EdgeX Foundry, and more
Query data in EdgeLake using:
- SQL
- An LLM through EdgeLake’s built-in MCP server
- Grafana or Postman
- EdgeLake’s Remote GUI interface
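The "query in place" model behind these interfaces can be sketched as a fan-out: each node stores the readings it ingests locally, and a query issued to any one node is forwarded to its peers, with partial results merged before returning. This is a conceptual illustration of the federated pattern only; EdgeLake's real implementation handles node discovery, SQL processing, and transport.

```python
# Conceptual sketch of federated, query-in-place access: data stays on
# the node where it is produced, and a query sent to ANY node fans out
# to peers and merges results, so callers see one logical table.
class EdgeNode:
    def __init__(self, name):
        self.name = name
        self.rows = []    # readings stored locally, never centralized
        self.peers = []   # other nodes in the network

    def ingest(self, reading: dict):
        """Store a reading locally, tagged with its source node."""
        self.rows.append({**reading, "node": self.name})

    def query(self, predicate):
        """Fan the query out to this node and all peers, merge results."""
        results = []
        for node in [self] + self.peers:
            results.extend(r for r in node.rows if predicate(r))
        return results

# Three hypothetical city nodes, each keeping its own sensor data.
a, b, c = EdgeNode("water-plant"), EdgeNode("substation"), EdgeNode("depot")
for node in (a, b, c):
    node.peers = [n for n in (a, b, c) if n is not node]

a.ingest({"sensor": "flow", "value": 12.5})
b.ingest({"sensor": "voltage", "value": 2400})
c.ingest({"sensor": "flow", "value": 9.1})

# Connect to any node and query as if the data were centralized.
flows = b.query(lambda r: r["sensor"] == "flow")
print(sorted(r["node"] for r in flows))  # ['depot', 'water-plant']
```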
Example Real-World Smart City Use Case
The City of Sabetha, Kansas provides a practical example of why this architecture can matter. In an LF Edge case study, Sabetha sought improved monitoring and alerting to reduce operational risk across water, wastewater, and power facilities. Documented needs included monitoring generator feeders, supporting real-time operations, and maintaining historical data for trending and analysis. The deployment used EdgeLake in a way that integrated with existing systems and produced real-time dashboards and alerts derived exclusively from edge data and operations. Notably, after two weeks in production, the city cancelled its AWS contract and continues to operate entirely without cloud dependency.
The case also illustrates an operational model where analytics and reporting run locally, eliminating dependence on cloud services for monitoring and management of city infrastructure. In this pattern, EdgeLake functions as the data layer that keeps data local while still making it queryable for real-time insight. Standardized interfaces (including MCP-style interfaces) make that data accessible to downstream automation or AI systems without requiring a separate, custom integration for each source. For failure prevention scenarios—such as generator incidents—reducing reliance on centralized pipelines can improve responsiveness when minutes matter.
Takeaways
For organizations seeking to simplify edge operations, it can be useful to separate (and then intentionally recombine) two concerns:
- Fleet software operations: how workloads are deployed, updated, and kept consistent across many distributed nodes
- Data access model: how operational data is stored, queried, and governed without centralization
Open Horizon addresses the first by providing mechanisms to manage containerized services across fleets. EdgeLake addresses the second by enabling decentralized data storage with federated query access across nodes. Combined, they form a new edge architecture where software distribution is automated and data can remain local while still being accessible for real-time monitoring, AI-enabled analysis, and automated operations.

