

What is APM?

Application performance management (APM) software helps an organization ensure that critical applications meet established expectations for performance, availability and customer or user experience. It enables organizations to predict and prevent performance issues before they impact users or the business.

APM does this by measuring application performance, alerting administrators when performance baselines aren’t met, providing visibility into root causes of performance issues and automatically resolving many performance issues before they impact users or the business.

APM is also an abbreviation for application performance monitoring. The terms are often used interchangeably, but application performance monitoring is actually a component of application performance management: you must monitor performance in order to manage it.

Increasingly, application performance management solutions are evolving from relying on traditional application performance monitoring tools to incorporating observability, a performance data collection and analysis technology better suited to the complexity of modern, distributed cloud-native applications.


How APM works

APM gathers software application performance data, analyzes it to detect potential performance problems, and either provides information for resolving those problems or accelerates their resolution. Application performance monitoring and observability differ chiefly in how they gather and analyze that data.

Application performance monitoring

In application performance monitoring, agents are deployed throughout the application environment and supporting infrastructure to 'monitor' performance by sampling performance and performance-related metrics (sometimes called telemetry) as frequently as once every minute. Types of monitoring these agents perform include:

In addition to collecting performance data, these agents perform user-defined transaction profiling, tracing each transaction from the user interface or device through every application component or resource involved in the transaction. This information is used to determine application dependencies and to create a topology map: a visualization of the dependencies between application and infrastructure components, ideally across on-premises, private cloud, public cloud (including any software-as-a-service, or SaaS, solutions) and hybrid cloud environments.
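As a rough illustration of how traced transactions can yield a topology map, the sketch below derives a dependency graph from a handful of traces. It assumes each trace is a simple ordered list of component names; real traces are usually trees of spans, and the names `build_topology`, `traces` and the component names are all hypothetical.

```python
from collections import defaultdict


def build_topology(traces):
    """Return {caller: set of callees} observed across ordered traces."""
    topology = defaultdict(set)
    for hops in traces:
        # Each consecutive pair of hops is one observed dependency edge.
        for caller, callee in zip(hops, hops[1:]):
            topology[caller].add(callee)
    return dict(topology)


# Hypothetical traces: the components one transaction touched, in order.
traces = [
    ["browser", "web-frontend", "orders-service", "orders-db"],
    ["browser", "web-frontend", "catalog-service", "catalog-db"],
    ["mobile-app", "api-gateway", "orders-service", "orders-db"],
]

topology = build_topology(traces)
for component in sorted(topology):
    print(component, "->", sorted(topology[component]))
```

Components that only ever appear at the end of a trace, such as the databases here, have no outgoing edges and so show up only as dependencies of other components.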

APM solutions typically provide a controller and centralized dashboard where the collected performance metrics are aggregated, analyzed and compared to established baselines. The dashboard alerts system administrators to deviations from baselines that indicate actual or potential performance issues; it also provides contextual information and actionable insights administrators can use to troubleshoot and resolve the issues.
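A minimal sketch of the baseline comparison described above, assuming the baseline is simply the mean and standard deviation of earlier response-time samples and that a deviation of more than three standard deviations counts as an alert. The function name, the threshold and the sample values are illustrative assumptions, not the behavior of any particular APM product.

```python
from statistics import mean, stdev


def find_deviations(baseline_samples, new_samples, threshold=3.0):
    """Return (index, value) pairs for new samples far from the baseline."""
    center = mean(baseline_samples)
    spread = stdev(baseline_samples)
    alerts = []
    for index, value in enumerate(new_samples):
        # Flag a sample when it lies more than `threshold` standard
        # deviations away from the baseline mean.
        if spread > 0 and abs(value - center) > threshold * spread:
            alerts.append((index, value))
    return alerts


# Hypothetical response times in milliseconds.
baseline_ms = [120, 118, 125, 122, 119, 121, 124, 120]
latest_ms = [123, 119, 310, 121]

alerts = find_deviations(baseline_ms, latest_ms)
print(alerts)
```

With these sample values only the 310 ms reading is flagged; the other readings fall within normal variation around the baseline.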

Observability

Periodic sampling is effective enough for monitoring and troubleshooting monolithic applications or traditional distributed applications, where new code is released periodically and workflows and dependencies between application components, servers and related resources are well-known or easy to trace.

But today, as organizations are adopting modern development practices and cloud-native technologies—Agile and DevOps methodologies, microservices, Docker containers, Kubernetes and serverless functions—they're deploying new application components so often, in so many places, in so many languages and for such widely varying periods of time that the once-a-minute data sampling of traditional monitoring solutions can't keep up.
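To make the sampling gap concrete, the sketch below checks which components are alive at any once-a-minute sampling instant. The component names and lifetimes are invented for illustration, and the model assumes a sample can only see a component that is running at that exact instant.

```python
def sampled_components(lifetimes, interval_s, horizon_s):
    """Return names of components alive at one or more sampling instants."""
    sample_times = range(0, horizon_s + 1, interval_s)
    seen = set()
    for name, (start_s, end_s) in lifetimes.items():
        # A component is observed only if some sample falls inside its lifetime.
        if any(start_s <= t < end_s for t in sample_times):
            seen.add(name)
    return seen


# Hypothetical (start, end) lifetimes in seconds.
lifetimes = {
    "legacy-app-server": (0, 300),
    "batch-container": (70, 110),
    "serverless-function": (125, 127),
}

seen = sampled_components(lifetimes, interval_s=60, horizon_s=300)
missed = set(lifetimes) - seen
print("observed:", sorted(seen))
print("missed:", sorted(missed))
```

Under these assumptions the long-running server is observed, while the container and the function start and finish between two samples and never appear in the collected data.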

Observability replaces traditional monitoring agents with instrumentation that collects performance and contextual data continuously, and uses machine-learning techniques to correlate and analyze the data in real time. With an observability solution, development, IT operations and site reliability engineering (SRE) teams can:

Observability doesn’t replace monitoring; it enables better monitoring, and better APM.

AI and AIOps: The future of APM

Today, APM tools are using observability and AI to varying degrees. Some are combining traditional application performance monitoring with AI to automate the discovery of changing transaction paths and application dependencies. Others are combining observability with AI to automatically determine performance baselines, and to sift signals, or actionable insights, from the 'noise' of IT operations management (ITOM) data. Industry analyst Gartner finds that organizations can realize a "60% noise reduction in ITOM through use of AI-augmented tools."
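The idea of separating signal from noise can be illustrated with something far simpler than the AI-based techniques mentioned above: a rule that collapses repeated alerts for the same component and symptom into one group. This is a stand-in for illustration only; the alert format, the five-minute window and the function name are assumptions.

```python
def group_alerts(alerts, window_s=300):
    """Collapse repeated (component, symptom) alerts raised within window_s
    of the first alert in their group."""
    groups = []
    open_groups = {}  # (component, symptom) -> index of latest group
    for timestamp_s, component, symptom in sorted(alerts):
        key = (component, symptom)
        index = open_groups.get(key)
        if index is not None and timestamp_s - groups[index]["first_seen_s"] <= window_s:
            groups[index]["count"] += 1
        else:
            # Start a new group for a new key or an expired window.
            open_groups[key] = len(groups)
            groups.append({
                "component": component,
                "symptom": symptom,
                "first_seen_s": timestamp_s,
                "count": 1,
            })
    return groups


# Hypothetical alerts: (timestamp in seconds, component, symptom).
alerts = [
    (0, "orders-db", "high latency"),
    (30, "orders-db", "high latency"),
    (45, "web-frontend", "error rate"),
    (90, "orders-db", "high latency"),
    (900, "orders-db", "high latency"),
]

groups = group_alerts(alerts)
for group in groups:
    print(group)
```

Here five raw alerts reduce to three groups, so an operator sees one entry for the burst of database latency alerts instead of three.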

The ultimate goal, and the future of APM and IT operations, is to combine observability with artificial intelligence for IT operations (AIOps) to create self-healing, self-optimizing infrastructure. Together, the steady stream of real-time observability telemetry and AIOps machine learning and automation can predict application performance issues based on system outputs, resolve them before they impact the user experience or operations, and even take actions to optimize application performance, all without management intervention.
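The predict-then-act loop can be sketched in miniature. The example below fits a straight line to recent CPU readings, extrapolates a few steps ahead, and picks an action if the forecast crosses a limit. A least-squares trend line stands in for the machine learning an AIOps platform would use, and the names `forecast`, `plan_action`, the 90% limit and the `scale_out` action are hypothetical.

```python
def forecast(values, steps_ahead):
    """Extrapolate a least-squares trend line through equally spaced values."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


def plan_action(cpu_percent_history, limit=90.0, steps_ahead=5):
    """Choose a remediation before the predicted value reaches the limit."""
    predicted = forecast(cpu_percent_history, steps_ahead)
    if predicted >= limit:
        return "scale_out"  # hypothetical action name
    return "no_action"


# Hypothetical CPU utilization readings, one per sampling step.
history = [50, 55, 60, 65, 70]
action = plan_action(history)
print(action)
```

The readings are still well below the limit, but the upward trend projects past it within five steps, so the sketch chooses to act before users would notice a problem.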
